Relative mutation positions within the gene body
Source:R/sc_mutations.R
relative_mutation_positions.Rd
Given a set of mutations and gene annotation, calculate the relative position of each mutation within the gene body they are in.
Usage
relative_mutation_positions(
mutations,
annotation,
bin = FALSE,
by = c(region = "gene_name"),
threads = 1
)
Arguments
- mutations
either the tibble output from
find_variants
. It must have columnsseqnames
,pos
, and a third column for specifying the gene id or gene name. The mutation must be within the gene region.- annotation
Either path to the annotation file (GTF/GFF) or a GRanges object of the gene annotation.
- bin
logical(1): whether to bin the relative positions into 100 bins.
- by
character(1): the column name in the annotation to match with the gene annotation. E.g.
c("region" = "gene_name")
to match the `region` column in the mutations with the `gene_name` column in the annotation.- threads
integer(1): number of threads to use.
Value
If bin = FALSE
, a list of numeric vectors of relative positions of each mutation within the gene body.
If bin = TRUE
, a numeric vector of length 100 of the number of mutations in each bin.
Examples
outdir <- tempfile()
dir.create(outdir)
genome_fa <- system.file("extdata", "rps24.fa.gz", package = "FLAMES")
minimap2_align( # align to genome
config = jsonlite::fromJSON(
system.file("extdata", "config_sclr_nanopore_3end.json", package = "FLAMES")),
fa_file = genome_fa,
fq_in = system.file("extdata", "fastq", "demultiplexed.fq.gz", package = "FLAMES"),
annot = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"),
outdir = outdir
)
#> 06:55:04 AM Mon Jan 06 2025 minimap2_align
#> total mapped primary secondary
#> /tmp/RtmpqmWKMU/filecd67e8e7fac/align2genome.bam 10 10 10 0
variants <- find_variants(
bam_path = file.path(outdir, "align2genome.bam"),
reference = genome_fa,
annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"),
min_nucleotide_depth = 4
)
#> 06:55:04 Reading reference ...
#> 06:55:04 Reading annotation ...
#> Merging overlapping genes ...
#> 0 overlapping regions, their gene_name(s) are merged with `, ` as separator
#> 06:55:04 Adding unannotated gaps ...
#> 06:55:04 Got 1 bam file, parallelizing over each region ...
#>
|
| | 0%
|
|======================================================================| 100%
#>
#> 06:55:06 Merging results ...
#> 06:55:06 Calculating homopolymer percentages ...
#>
|
| | 0%
|
|========= | 12%
|
|================== | 25%
|
|========================== | 38%
|
|=================================== | 50%
|
|============================================ | 62%
|
|==================================================== | 75%
|
|============================================================= | 88%
|
|======================================================================| 100%
#>
positions <-
relative_mutation_positions(
mutations = variants,
annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES")
)
#> 06:55:07 Reading annotation ...
#> Warning: Using one column matrices in `filter()` was deprecated in dplyr 1.1.0.
#> ℹ Please use one dimensional logical vectors instead.
#> ℹ The deprecated feature was likely used in the FLAMES package.
#> Please report the issue at <https://github.com/mritchielab/FLAMES/issues>.
#>
|
| | 0%
|
|======================================================================| 100%
#>