Relative mutation positions within the gene body — relative_mutation

Given a set of mutations and a gene annotation, calculate the relative position of each mutation within the body of the gene that contains it.

Usage

relative_mutation_positions(
  mutations,
  annotation,
  bin = FALSE,
  by = c(region = "gene_name"),
  threads = 1
)

Arguments

mutations: the tibble output from find_variants. It must have columns seqnames, pos, and a third column specifying the gene ID or gene name. Each mutation must lie within the gene region.
annotation: either the path to the annotation file (GTF/GFF) or a GRanges object of the gene annotation.
bin: logical(1): whether to bin the relative positions into 100 bins.
by: character(1): a named character vector mapping the gene column in the mutations to the matching column in the annotation. E.g. c("region" = "gene_name") matches the `region` column in the mutations with the `gene_name` column in the annotation.
threads: integer(1): number of threads to use.
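
The name/value convention of `by` can be illustrated with a small base R sketch. The toy data frames, gene names, and coordinates below are hypothetical, and the use of merge() is only an analogy for the matching semantics; it is an assumption, not a description of how FLAMES matches the columns internally.

```r
# Toy mutations table with the required columns: seqnames, pos, and a
# third column (here `region`) naming the gene. All values are made up.
mutations <- data.frame(
  seqnames = c("chr1", "chr1", "chr2"),
  pos = c(150, 420, 90),
  region = c("GENE_A", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Toy annotation table; a real annotation would come from a GTF/GFF file.
annotation <- data.frame(
  gene_name = c("GENE_A", "GENE_B"),
  start = c(100, 50),
  end = c(500, 250),
  stringsAsFactors = FALSE
)

# The name is the column in the mutations, the value is the column in
# the annotation.
by <- c(region = "gene_name")

# Join each mutation to its gene using that mapping.
matched <- merge(mutations, annotation, by.x = names(by), by.y = unname(by))

# Every mutation should fall inside the gene it was matched to.
inside <- matched$pos >= matched$start & matched$pos <= matched$end
```

If the gene column in the mutations were named differently, only the name on the left-hand side of `by` would change.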

Value

If bin = FALSE, a list of numeric vectors holding the relative position of each mutation within its gene body. If bin = TRUE, a numeric vector of length 100 giving the number of mutations in each bin.
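
As a rough illustration of the two return shapes, the sketch below computes relative positions for one toy gene and then counts them in 100 equal-width bins. The coordinates are hypothetical, and both the strand handling and the bin boundaries are assumptions made for illustration; the function's actual conventions may differ.

```r
# Toy gene and mutation coordinates (hypothetical values).
gene_start <- 1000
gene_end <- 2000
strand <- "-"
mutation_pos <- c(1000, 1250, 1999, 2000)

# Fraction of the way through the gene body, between 0 and 1.
rel <- (mutation_pos - gene_start) / (gene_end - gene_start)

# Assumption: for minus-strand genes, measure from the other genomic end
# so that 0 always corresponds to the start of the gene.
if (strand == "-") rel <- 1 - rel

# Assign each relative position to one of 100 equal-width bins;
# a position of exactly 1 is placed in the last bin.
bin_index <- pmin(floor(rel * 100) + 1, 100)
bin_counts <- tabulate(bin_index, nbins = 100)
```

Here `rel` corresponds to the kind of vector returned per gene with bin = FALSE, and `bin_counts` to the length-100 vector returned with bin = TRUE.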

Examples

outdir <- tempfile()
dir.create(outdir)
genome_fa <- system.file("extdata", "rps24.fa.gz", package = "FLAMES")
minimap2_align( # align to genome
  config = jsonlite::fromJSON(
    system.file("extdata", "config_sclr_nanopore_3end.json", package = "FLAMES")),
  fa_file = genome_fa,
  fq_in = system.file("extdata", "fastq", "demultiplexed.fq.gz", package = "FLAMES"),
  annot = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"),
  outdir = outdir
)
#> 11:36:09 PM Thu Feb 06 2025 minimap2_align
#>                                                   total mapped primary
#> /tmp/RtmpmujAsI/file252e3b9bc6cc/align2genome.bam    10     10      10
#>                                                   secondary
#> /tmp/RtmpmujAsI/file252e3b9bc6cc/align2genome.bam         0
variants <- find_variants(
  bam_path = file.path(outdir, "align2genome.bam"),
  reference = genome_fa,
  annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"),
  min_nucleotide_depth = 4
)
#> 23:36:10 Reading reference ...
#> 23:36:10 Reading annotation ...
#> Merging overlapping genes ...
#> 0 overlapping regions, their gene_name(s) are merged with `, ` as separator
#> 23:36:10 Adding unannotated gaps ...
#> 23:36:10 Got 1 bam file, parallelizing over each region ...
#> 23:36:11 Merging results ...
#> 23:36:11 Calculating homopolymer percentages ...
positions <- 
 relative_mutation_positions(
   mutations = variants,
   annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES")
 )
#> 23:36:12 Reading annotation ...
#> Warning: Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
#> ℹ Please use one dimensional logical vectors instead.
#> ℹ The deprecated feature was likely used in the FLAMES package.
#>   Please report the issue at <https://github.com/mritchielab/FLAMES/issues>.