Removes isoform annotations that could produce ambiguous reads, such as isoforms that differ only at the 5' / 3' end. This can be useful when plotting average coverage.

Usage

filter_annotation(annotation, keep = "tss_differ")

Arguments

annotation

path to the GTF annotation file, or the parsed GenomicRanges object.

keep

string, one of 'tss_differ' (only keep isoforms that all differ in transcription start site position), 'tes_differ' (only keep those that differ in transcription end site position), 'both' (only keep those that differ in both the start and end site), or 'single_transcripts' (only keep genes that contain a single transcript).
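
To make the four `keep` options concrete, here is a minimal base-R sketch of the idea on a toy transcript table. It is not the FLAMES implementation. The data frame `tx`, the helper `all_distinct`, and the list `keep_genes` are hypothetical names introduced for illustration. The sketch assumes filtering is decided per gene, and that a gene with a single transcript trivially passes the "all differ" checks. The package may treat that case differently.

```r
# Toy transcript table (hypothetical data, not taken from the FLAMES package).
tx <- data.frame(
  gene   = c("g1", "g1", "g2", "g2", "g3", "g4", "g4"),
  tx     = c("t1", "t2", "t3", "t4", "t5", "t6", "t7"),
  start  = c(100, 150, 200, 200, 300, 400, 450),
  end    = c(900, 900, 800, 950, 700, 600, 650),
  strand = c("+", "+", "+", "+", "-", "+", "+"),
  stringsAsFactors = FALSE
)

# On the '+' strand the TSS is the start coordinate and the TES is the end;
# on the '-' strand the roles are swapped.
tx$tss <- ifelse(tx$strand == "+", tx$start, tx$end)
tx$tes <- ifelse(tx$strand == "+", tx$end, tx$start)

# Per gene: TRUE when no two transcripts share the same value of `site`.
all_distinct <- function(site, gene) {
  tapply(site, gene, function(x) anyDuplicated(x) == 0)
}

tss_ok <- all_distinct(tx$tss, tx$gene)
tes_ok <- all_distinct(tx$tes, tx$gene)
n_tx   <- table(tx$gene)

# Genes retained under each option (assumption: single-transcript genes
# pass the distinctness checks trivially).
keep_genes <- list(
  tss_differ         = names(tss_ok)[as.vector(tss_ok)],
  tes_differ         = names(tes_ok)[as.vector(tes_ok)],
  both               = names(tss_ok)[as.vector(tss_ok & tes_ok)],
  single_transcripts = names(n_tx)[as.vector(n_tx == 1)]
)

print(keep_genes)
```

In this toy table, gene `g1` has two transcripts that share an end site, so it is dropped under 'tes_differ' and 'both'. Gene `g2` has two transcripts that share a start site, so it is dropped under 'tss_differ' and 'both'.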

Value

GenomicRanges of the filtered isoforms

Examples

filtered_annotation <- filter_annotation(
  system.file("extdata", "rps24.gtf.gz", package = 'FLAMES'), keep = 'tes_differ')
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> Warning: The "phase" metadata column contains non-NA values for features of type
#>   stop_codon. This information was ignored.
#> OK
filtered_annotation
#> GRanges object with 6 ranges and 2 metadata columns:
#>       seqnames    ranges strand |     tx_id              tx_name
#>          <Rle> <IRanges>  <Rle> | <integer>          <character>
#>   [1]    chr14   19-5159      + |         1 ENSMUST00000225994.1
#>   [2]    chr14   32-3389      + |         6 ENSMUST00000225117.1
#>   [3]    chr14   68-5124      + |         7 ENSMUST00000224568.1
#>   [4]    chr14   86-1118      + |         8 ENSMUST00000224549.1
#>   [5]    chr14  160-2761      + |         9 ENSMUST00000224569.1
#>   [6]    chr14  450-1290      + |        10 ENSMUST00000224699.1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths