Skip to contents

Filter the transcript coverage by applying a filter function to the coverage values.

Usage

filter_coverage(x, filter_fn = convolution_filter)

Arguments

x

The tibble returned by get_coverage, or a BAM file path, or a GAlignments object.

filter_fn

The filter function to apply to the coverage values. The function should take a numeric vector of coverage values and return a logical value (TRUE if the transcript passes the filter, FALSE otherwise). The default filter function is convolution_filter, which filters out transcripts with sharp drops / rises in coverage.

Value

a tibble of the transcript information and coverages, with transcipts that pass the filter

Examples

# Create a BAM file with minimap2_realign
temp_path <- tempfile()
bfc <- BiocFileCache::BiocFileCache(temp_path, ask = FALSE)
file_url <- 'https://raw.githubusercontent.com/OliverVoogd/FLAMESData/master/data'
fastq1 <- bfc[[names(BiocFileCache::bfcadd(bfc, 'Fastq1', paste(file_url, 'fastq/sample1.fastq.gz', sep = '/')))]]
genome_fa <- bfc[[names(BiocFileCache::bfcadd(bfc, 'genome.fa', paste(file_url, 'SIRV_isoforms_multi-fasta_170612a.fasta', sep = '/')))]]
annotation <- bfc[[names(BiocFileCache::bfcadd(bfc, 'annot.gtf', paste(file_url, 'SIRV_isoforms_multi-fasta-annotation_C_170612a.gtf', sep = '/')))]]
outdir <- tempfile()
dir.create(outdir)
fasta <- annotation_to_fasta(annotation, genome_fa, outdir)
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> Warning: The "phase" metadata column contains non-NA values for features of type
#>   exon. This information was ignored.
#> OK
minimap2_realign(
  config = jsonlite::fromJSON(
    system.file("extdata", "config_sclr_nanopore_3end.json", package = "FLAMES")),
  fq_in = fastq1,
  outdir = outdir
)
#> 02:13:16 AM Fri Oct 25 2024 minimap2_realign
#> Sorting by position
#>                                                         total mapped primary
#> /tmp/RtmpoS7Kzz/file1f6d2ac26af4/realign2transcript.bam  4554   3764    2500
#>                                                         secondary
#> /tmp/RtmpoS7Kzz/file1f6d2ac26af4/realign2transcript.bam      2051
x <- get_coverage(file.path(outdir, 'realign2transcript.bam'))
nrow(x)
#> [1] 26
filter_coverage(x) |>
  nrow()
#> 26 transcripts found in the BAM file.
#> 3(11.54%) transcripts failed the filter.
#> Failed transcripts account for 140 reads, out of 1095(12.79%) reads in total.
#> [1] 23