Impute missing transcript counts using a shared nearest neighbor graph

Usage

sc_impute_transcript(combined_sce, dimred = "PCA", ...)

Arguments

combined_sce

A SingleCellExperiment object with gene counts and a "transcript" altExp slot.

dimred

The name of the reduced dimension to use for building the shared nearest neighbor graph.

...

Additional arguments passed to scran::buildSNNGraph, for example k = 30.

Value

A SingleCellExperiment object with an imputed "logcounts" assay in the "transcript" altExp slot.

Details

For cells with NA values in the "transcript" altExp slot, this function imputes the missing values from cells with non-missing values. A shared nearest neighbor graph is built from the reduced dimensions named by dimred, and the imputed value for each cell is the weighted sum of the transcript counts of its neighbors. Imputed values are stored in the "logcounts" assay of the "transcript" altExp slot. The "counts" assay is used to compute the logcounts but is itself left unchanged.
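
The neighbor-weighted imputation described above can be illustrated with a small base R sketch. This is not the package's implementation: the matrices expr and w are hypothetical toy data, the weights are written by hand rather than taken from scran::buildSNNGraph, and normalising the weights so that they sum to 1 for each cell is an assumption made here to keep the imputed values on the same scale as the observed ones.

```r
# Toy transcript expression matrix: 3 transcripts x 4 cells (hypothetical data).
# Cells 1 and 2 have no transcript measurements (all NA).
expr <- matrix(
  c(NA, NA, NA,
    NA, NA, NA,
    2, 4, 6,
    4, 8, 2),
  nrow = 3,
  dimnames = list(paste0("tx", 1:3), paste0("cell", 1:4))
)

# Hypothetical symmetric neighbor weights between cells. In practice such
# weights would come from a shared nearest neighbor graph.
w <- matrix(
  c(0, 1, 3, 1,
    1, 0, 1, 1,
    3, 1, 0, 2,
    1, 1, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(colnames(expr), colnames(expr))
)

observed <- colSums(is.na(expr)) == 0  # cells with measured values
missing <- !observed                   # cells to impute

# Keep only the weights linking observed cells (rows) to missing cells
# (columns), then scale each column to sum to 1 (assumption of this sketch).
w_sub <- w[observed, missing, drop = FALSE]
w_sub <- sweep(w_sub, 2, colSums(w_sub), "/")

# Each missing cell becomes a weighted combination of the observed cells.
imputed <- expr
imputed[, missing] <- expr[, observed, drop = FALSE] %*% w_sub
imputed
```

In this toy example cell1 is weighted 3:1 towards cell3 over cell4, so its imputed values (2.5, 5, 5) sit closer to cell3, while cell2 weights both observed cells equally and receives their plain average (3, 6, 4). Observed cells are left untouched.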

Examples

# Gene-level counts: 5 genes x 10 cells
sce <- SingleCellExperiment::SingleCellExperiment(assays = list(counts = matrix(rpois(50, 5), ncol = 10)))
# Transcript-level counts for the same 10 cells, stored as the "transcript" altExp
long_read <- SingleCellExperiment::SingleCellExperiment(assays = list(counts = matrix(rpois(40, 5), ncol = 10)))
SingleCellExperiment::altExp(sce, "transcript") <- long_read
# Mark the first two cells as lacking transcript counts
SingleCellExperiment::counts(SingleCellExperiment::altExp(sce))[,1:2] <- NA
SingleCellExperiment::counts(SingleCellExperiment::altExp(sce))
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]   NA   NA    4    4    9    7    6    7    5     1
#> [2,]   NA   NA    5    4    9    6    4    6    5     5
#> [3,]   NA   NA    5    5    9    3    7    8    4     2
#> [4,]   NA   NA    5    6    2    6    5    7    7     6
imputed_sce <- sc_impute_transcript(sce, k = 4)
#> Warning: more singular values/vectors requested than available
#> Warning: You're computing too large a percentage of total singular values, use a standard svd instead.
#> Imputing transcript counts ...
SingleCellExperiment::logcounts(SingleCellExperiment::altExp(imputed_sce))
#> 4 x 10 Matrix of class "dgeMatrix"
#>          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]
#> [1,] 2.606167 2.387593 2.479993 2.479993 2.954196 2.985583 2.793234 2.686501
#> [2,] 2.686290 2.757838 2.749252 2.479993 2.954196 2.793234 2.308753 2.500984
#> [3,] 2.567505 2.568373 2.749252 2.749252 2.954196 1.987652 2.985583 2.850857
#> [4,] 2.656247 2.721574 2.749252 2.976074 1.321928 2.793234 2.571236 2.686501
#>          [,9]    [,10]
#> [1,] 2.627273 1.352516
#> [2,] 2.627273 3.132224
#> [3,] 2.362570 2.038135
#> [4,] 3.044394 3.367571