
This function is deprecated. Please use SingleCellPipeline() instead.

Usage

sc_long_pipeline(
  annotation,
  fastq,
  outdir,
  genome_fa,
  minimap2 = NULL,
  barcodes_file = NULL,
  expect_cell_number = NULL,
  config_file = NULL
)

Arguments

annotation

The file path to the annotation file in GFF3 or GTF format.

fastq

The file path to the input FASTQ file.

outdir

The path to the directory in which all output files are stored.

genome_fa

The file path to the genome FASTA file.

minimap2

Path to minimap2, optional.

barcodes_file

The file with expected cell barcodes, with each barcode on a new line.

expect_cell_number

The expected number of cells in the sample. This is used if barcodes_file is not provided. See BLAZE for more details.

config_file

File path to the JSON configuration file.
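
The relationship between barcodes_file and expect_cell_number can be sketched in base R. The helper below, choose_barcode_input(), is hypothetical and not part of FLAMES. It only mirrors the rule described above, namely that an allow-list takes precedence and the expected cell number is used when no allow-list is given. Raising an error when both are missing is an assumption made for this sketch, not documented behaviour, and the barcodes are made-up 16-nt sequences.

```r
# Hypothetical helper (not part of FLAMES): report which barcode input
# would be used, following the precedence described in the argument docs.
choose_barcode_input <- function(barcodes_file = NULL, expect_cell_number = NULL) {
  if (!is.null(barcodes_file)) {
    stopifnot(file.exists(barcodes_file))
    barcodes <- readLines(barcodes_file)
    barcodes <- barcodes[nzchar(barcodes)]  # ignore empty lines
    return(list(mode = "allow_list", n = length(barcodes)))
  }
  if (!is.null(expect_cell_number)) {
    stopifnot(is.numeric(expect_cell_number), expect_cell_number > 0)
    return(list(mode = "expected_cells", n = expect_cell_number))
  }
  # Assumption for this sketch: treat "neither supplied" as an error.
  stop("Provide either barcodes_file or expect_cell_number")
}

# Write a small allow-list with one barcode per line (made-up barcodes).
bc_file <- tempfile(fileext = ".tsv")
writeLines(
  c("AAACCCAAGTAGGGTC", "AAACGAAGTCGTTGGC", "AAAGGATTCGGTTAGT"),
  bc_file
)

res <- choose_barcode_input(barcodes_file = bc_file)
res$mode
res$n
```

A check like this before launching the pipeline can catch an empty or missing allow-list early, before any alignment work starts.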

Value

A SingleCellPipeline object containing the transcript counts.
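
The example log on this page ends with the pipeline being saved to pipeline.rds inside outdir. A minimal sketch of reloading such a file in a later session follows. It assumes the file is an ordinary RDS file readable with readRDS(), which the file name suggests but this page does not state. The helper load_saved_pipeline() is hypothetical, and a plain list stands in for the real pipeline object so that the sketch runs without FLAMES.

```r
# Hypothetical helper (not part of FLAMES): reload a pipeline object
# that was saved as pipeline.rds in an output directory.
load_saved_pipeline <- function(outdir) {
  rds_path <- file.path(outdir, "pipeline.rds")
  if (!file.exists(rds_path)) {
    stop("No pipeline.rds found in ", outdir)
  }
  readRDS(rds_path)
}

# Demonstration with a stand-in object instead of a real pipeline.
demo_dir <- tempfile()
dir.create(demo_dir)
saveRDS(
  list(note = "stand-in for a pipeline object"),
  file.path(demo_dir, "pipeline.rds")
)

restored <- load_saved_pipeline(demo_dir)
restored$note
```

When reloading a real pipeline object, the FLAMES package would likely need to be loaded first so that the class definition is available.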

See also

SingleCellPipeline for the new pipeline interface, BulkPipeline for bulk long-read data, MultiSampleSCPipeline for multi-sample single-cell pipelines.

Examples

outdir <- tempfile()
dir.create(outdir)
bc_allow <- file.path(outdir, "bc_allow.tsv")
genome_fa <- file.path(outdir, "rps24.fa")
R.utils::gunzip(
  filename = system.file("extdata", "bc_allow.tsv.gz", package = "FLAMES"),
  destname = bc_allow, remove = FALSE
)
R.utils::gunzip(
  filename = system.file("extdata", "rps24.fa.gz", package = "FLAMES"),
  destname = genome_fa, remove = FALSE
)
sce <- FLAMES::sc_long_pipeline(
  genome_fa = genome_fa,
  fastq = system.file("extdata", "fastq", "musc_rps24.fastq.gz", package = "FLAMES"),
  annotation = system.file("extdata", "rps24.gtf.gz", package = "FLAMES"),
  outdir = outdir,
  barcodes_file = bc_allow
)
#> No config file provided, creating a default config in /tmp/RtmpgpEV0i/file25bf67c4d00e 
#> Writing configuration parameters to:  /tmp/RtmpgpEV0i/file25bf67c4d00e/config_file_9663.json 
#> Configured steps: 
#> 	barcode_demultiplex: TRUE
#> 	genome_alignment: TRUE
#> 	gene_quantification: TRUE
#> 	isoform_identification: TRUE
#> 	read_realignment: TRUE
#> 	transcript_quantification: TRUE
#> samtools not found, will use Rsamtools package instead
#> Running step: barcode_demultiplex
#> FLEXIPLEX 0.96.2
#> Setting max barcode edit distance to 2
#> Setting max flanking sequence edit distance to 8
#> Setting read IDs to be  replaced
#> Setting number of threads to 8
#> Search pattern: 
#> primer: CTACACGACGCTCTTCCGATCT
#> BC: NNNNNNNNNNNNNNNN
#> UMI: NNNNNNNNNNNN
#> polyT: TTTTTTTTT
#> Setting known barcodes from /tmp/RtmpgpEV0i/file25bf67c4d00e/bc_allow.tsv
#> Number of known barcodes: 143
#> Processing file: /__w/_temp/Library/FLAMES/extdata/fastq/musc_rps24.fastq.gz
#> Searching for barcodes...
#> Number of reads processed: 393
#> Number of reads where at least one barcode was found: 368
#> Number of reads with exactly one barcode match: 364
#> Number of chimera reads: 1
#> All done!
#> Running step: genome_alignment
#> Creating junction bed file from GFF3 annotation.
#> Aligning sample /tmp/RtmpgpEV0i/file25bf67c4d00e/matched_reads.fastq -> /tmp/RtmpgpEV0i/file25bf67c4d00e/align2genome.bam
#> Your fastq file appears to have tags, but you did not provide the -y option to minimap2 to include the tags in the output.
#> Warning: samtools not found, using Rsamtools instead, this could be slower and might fail for large BAM files.
#> Sorting BAM files by genome coordinates with 8 threads...
#> Indexing bam files
#> Running step: gene_quantification
#> 03:23:40 AM Wed May 21 2025 quantify genes 
#> Found genome alignment file(s): 	align2genome.bam
#> Running step: isoform_identification
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> Warning: genome version information is not available for this TxDb object
#> OK
#> Running step: read_realignment
#> Checking for fastq file(s) /__w/_temp/Library/FLAMES/extdata/fastq/musc_rps24.fastq.gz
#> 	files found
#> Checking for fastq file(s) /tmp/RtmpgpEV0i/file25bf67c4d00e/matched_reads.fastq
#> 	files found
#> Checking for fastq file(s) /tmp/RtmpgpEV0i/file25bf67c4d00e/matched_reads_dedup.fastq
#> 	files found
#> Realigning sample /tmp/RtmpgpEV0i/file25bf67c4d00e/matched_reads_dedup.fastq -> /tmp/RtmpgpEV0i/file25bf67c4d00e/realign2transcript.bam
#> Warning: samtools not found, using Rsamtools instead, this could be slower and might fail for large BAM files.
#> Sorting BAM files by 8 with CB threads...
#> Running step: transcript_quantification
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> Warning: genome version information is not available for this TxDb object
#> OK
#> Import genomic features from the file as a GRanges object ... 
#> OK
#> Prepare the 'metadata' data frame ... 
#> OK
#> Make the TxDb object ... 
#> Warning: genome version information is not available for this TxDb object
#> OK
#> Pipeline saved to /tmp/RtmpgpEV0i/file25bf67c4d00e/pipeline.rds
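
The read counts printed by the barcode_demultiplex step above can be turned into rates with a few lines of base R. This is only a worked calculation on the numbers copied from the log. The variable names are chosen for this sketch and do not come from FLAMES.

```r
# Counts copied from the barcode_demultiplex log above.
reads_processed <- 393
reads_with_barcode <- 368   # at least one barcode found
reads_single_match <- 364   # exactly one barcode match
chimera_reads <- 1

# Express each count as a fraction of all processed reads.
demux_summary <- c(
  any_barcode = reads_with_barcode / reads_processed,
  single_match = reads_single_match / reads_processed,
  chimera = chimera_reads / reads_processed
)

# Percentages rounded to one decimal place.
demux_percent <- round(100 * demux_summary, 1)
demux_percent
```

For this small example data set, roughly 93.6% of reads carry at least one barcode and about 92.6% have exactly one match.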