You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi guys, looking for some help on an issue I cannot seem to solve. I have been following the integration pipeline (https://stuartlab.org/signac/articles/merging) for a set of 19 samples, and am having some issues when it comes to processing data following a merge. My individual sample pipeline looks something like this:
Then merging all samples into an object called total_ATAC and running:
total_ATAC <- RunTFIDF(total_ATAC)
total_ATAC <- FindTopFeatures(total_ATAC, min.cutoff = 20)
total_ATAC[['ATAC']]
total_ATAC <- RunSVD(total_ATAC, n = 50)
total_ATAC <- RunUMAP(total_ATAC, dims = 2:50, reduction = 'lsi')
However, when I run 'RunUMAP' I receive the error:
14:33:10 UMAP embedding parameters a = 0.9922 b = 1.112
Error in checkna(X) : Missing values found in 'X'
Which is due to the cell embeddings from LSI being all NA valued:
head(total_ATAC@reductions$lsi@cell.embeddings, 3)
LSI_1 LSI_2 LSI_3 LSI_4 LSI_5 LSI_6 LSI_7 LSI_8 LSI_9 LSI_10 LSI_11 LSI_12 LSI_13
H9A_AAACGGTTCCACTATG-1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
H9A_AAAGCGAAGCAAACAC-1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
H9A_AAAGGCAAGCTTGCTA-1 NaN NaN NaN NaN NaN NaN NaN NaN
This issue has only persisted across merged data, as I can run the SVD and UMAP on individual samples with no problem. I cannot seem to figure out how to troubleshoot this, tried many different things, all to no solution.
If you have any ideas, it would be much appreciated.
Thanks a bunch.
Here's my sessioninfo():
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
Random number generation:
RNG: L'Ecuyer-CMRG
Normal: Inversion
Sample: Rejection
I should add, I attempted the example workflow for merging data and I think I may have isolated the issue. When running RunSVD(combined) I receive the error:
Warning in irlba(A = t(x = object), nv = n, work = irlba.work, tol = tol) :
did not converge--results might be invalid!; try increasing work or maxit
Scaling cell embeddings
And then receive all NA's for cell embeddings across all of the example cells:
combined@reductions$lsi@cell.embeddings
LSI_1 LSI_2 LSI_3 LSI_4 LSI_5 LSI_6 LSI_7 LSI_8 LSI_9 LSI_10 LSI_11 LSI_12 LSI_13
500_AAACTGCAGACTCGGA-1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
500_AAAGATGCACCTATTT-1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
500_AAAGATGCAGATACAA-1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Could this be an error forced by irlba ? How should I circumvent?
Thanks.
Hi guys, looking for some help on an issue I cannot seem to solve. I have been following the integration pipeline (https://stuartlab.org/signac/articles/merging) for a set of 19 samples, and am having some issues when it comes to processing data following a merge. My individual sample pipeline looks something like this:
Reading in peaks
H9A <- Read10X_h5("09A_counts/outs/filtered_peak_bc_matrix.h5", use.names = T)
peaks_H9A <- read.table(
file = "09A_counts/outs/peaks.bed",
col.names = c("chr", "start", "end"))
gr.H9A <- makeGRangesFromDataFrame(peaks_H9A)
(defining combined_peaks using all objects)
Loading metadata
md.H9A <- read.table(
file = "09A_counts/outs/singlecell.csv",
stringsAsFactors = FALSE,
sep = ",",
header = TRUE,
row.names = 1
)[-1, ] # remove the first row
Initial filtering of low-count cells
plot(md.H9A$passed_filters)
md.H9A <- md.H9A[md.H9A$passed_filters > 100, ]
create fragment objects
frags.H9A <- CreateFragmentObject(
path = "09A_counts/outs/fragments.tsv.gz",
cells = rownames(md.H9A)
)
Quantifying peaks
H9A.counts <- FeatureMatrix(
fragments = frags.H9A,
features = combined.peaks,
cells = rownames(md.H9A)
)
Creating objects
H9A_assay <- CreateChromatinAssay(H9A.counts, fragments = frags.H9A)
s.H9A <- CreateSeuratObject(H9A_assay, assay = "ATAC", meta.data=md.H9A, project="H9A")
ncol(s.H9A) #2119
Annotation(s.H9A) <- annotation
Quality control - per sample
grange <- StringToGRanges(rownames(s.H9A), sep = c(":", "-"))
grange.use <- seqnames(grange) %in% standardChromosomes(grange)
s.H9A <- s.H9A[as.vector(grange.use), ]
s.H9A <- NucleosomeSignal(object=s.H9A) #2119 cell barcodes
s.H9A <- TSSEnrichment(object=s.H9A, fast=FALSE)
s.H9A$pct_reads_in_peaks <- s.H9A@meta.data$peak_region_fragments / s.H9A@meta.data$passed_filters * 100
s.H9A$blacklist_ratio <- s.H9A@meta.data$blacklist_region_fragments / s.H9A@meta.data$peak_region_fragments
s.H9A$high.tss <- ifelse(s.H9A$TSS.enrichment > 1, 'High', 'Low')
s.H9A$nucleosome_group <- ifelse(s.H9A$nucleosome_signal > 2, 'NS > 2', 'NS < 2')
s.H9A <- subset(
x = s.H9A,
subset = peak_region_fragments < 250 &
pct_reads_in_peaks > 0.1 &
pct_reads_in_peaks < 2 &
nucleosome_signal < 1.5 &
nucleosome_signal > 0.1 &
TSS.enrichment < 5
)
ncol(s.H9A) #1638 cells
Then merging all samples into an object called total_ATAC and running:
total_ATAC <- RunTFIDF(total_ATAC)
total_ATAC <- FindTopFeatures(total_ATAC, min.cutoff = 20)
total_ATAC[['ATAC']]
total_ATAC <- RunSVD(total_ATAC, n = 50)
total_ATAC <- RunUMAP(total_ATAC, dims = 2:50, reduction = 'lsi')
However, when I run 'RunUMAP' I receive the error:
14:33:10 UMAP embedding parameters a = 0.9922 b = 1.112
Error in checkna(X) : Missing values found in 'X'
Which is due to the cell embeddings from LSI being all NA valued:
This issue has only persisted across merged data, as I can run the SVD and UMAP on individual samples with no problem. I cannot seem to figure out how to troubleshoot this, tried many different things, all to no solution.
If you have any ideas, it would be much appreciated.
Thanks a bunch.
Here's my sessioninfo():
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
Random number generation:
RNG: L'Ecuyer-CMRG
Normal: Inversion
Sample: Rejection
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.28.1 AnnotationFilter_1.28.0
[4] GenomicFeatures_1.56.0 SingleCellExperiment_1.26.0 rhdf5_2.48.0
[7] Rcpp_1.0.13 data.table_1.16.0 plyr_1.8.9
[10] magrittr_2.0.3 gtable_0.3.5 gtools_3.9.5
[13] gridExtra_2.3 ArchR_1.0.2 Signac_1.14.0
[16] SCP_0.5.6 stringr_1.5.1 reprex_2.1.1
[19] cowplot_1.1.3 SCINA_1.2.0 gplots_3.1.3.1
[22] MASS_7.3-61 infercnv_1.20.0 patchwork_1.3.0
[25] Seurat_5.1.0 SeuratObject_5.0.2 sp_2.1-4
[28] Matrix_1.6-5 gprofiler2_0.2.3 scales_1.3.0
[31] NMF_0.28 cluster_2.1.6 rngtools_1.5.2
[34] registry_0.5-1 DOSE_3.28.2 org.Hs.eg.db_3.18.0
[37] AnnotationDbi_1.66.0 clusterProfiler_4.10.1 readxl_1.4.3
[40] rtracklayer_1.64.0 ggpubr_0.6.0 reshape2_1.4.4
[43] tidyr_1.3.1 dplyr_1.1.4 DESeq2_1.44.0
[46] SummarizedExperiment_1.34.0 Biobase_2.64.0 MatrixGenerics_1.16.0
[49] matrixStats_1.4.1 GenomicRanges_1.56.1 GenomeInfoDb_1.40.1
[52] IRanges_2.38.1 S4Vectors_0.42.1 BiocGenerics_0.50.0
[55] tibble_3.2.1 DescTools_0.99.56 caret_6.0-94
[58] lattice_0.22-6 ggrepel_0.9.6 ggplot2_3.5.1
[61] pROC_1.18.5 doMC_1.3.8 iterators_1.0.14
[64] foreach_1.5.2
loaded via a namespace (and not attached):
[1] igraph_2.0.3 ica_1.0-3 plotly_4.10.4 zlibbioc_1.50.0
[5] tidyselect_1.2.1 bit_4.0.5 doParallel_1.0.17 clue_0.3-65
[9] rjson_0.2.23 blob_1.2.4 S4Arrays_1.4.1 png_0.1-8
[13] cli_3.6.3 ggplotify_0.1.2 ProtGenerics_1.36.0 goftest_1.2-3
[17] BiocIO_1.14.0 SCpubr_2.0.2 purrr_1.0.2 uwot_0.2.2
[21] shadowtext_0.1.4 curl_5.2.2 mime_0.12 tidytree_0.4.6
[25] leiden_0.4.3.1 coin_1.4-3 ComplexHeatmap_2.18.0 stringi_1.8.4
[29] backports_1.5.0 rjags_4-16 slingshot_2.10.0 parallelDist_0.2.6
[33] XML_3.99-0.17 Exact_3.3 lubridate_1.9.3 httpuv_1.6.15
[37] rappdirs_0.3.3 splines_4.4.1 RcppRoll_0.3.1 prodlim_2024.06.25
[41] ggraph_2.2.1 sctransform_0.4.1 rootSolve_1.8.2.4 DBI_1.2.3
[45] HDF5Array_1.32.1 withr_3.0.1 class_7.3-22 enrichplot_1.22.0
[49] lmtest_0.9-40 ggnewscale_0.5.0 tidygraph_1.3.1 formatR_1.14
[53] BiocManager_1.30.25 htmlwidgets_1.6.4 fs_1.6.4 biomaRt_2.58.2
[57] princurve_2.1.6 labeling_0.4.3 SparseArray_1.4.8 cellranger_1.1.0
[61] lmom_3.0 reticulate_1.39.0 zoo_1.8-12 XVector_0.44.0
[65] UCSC.utils_1.0.0 timechange_0.3.0 fansi_1.0.6 caTools_1.18.3
[69] timeDate_4032.109 ggtree_3.10.1 R.oo_1.26.0 RSpectra_0.16-2
[73] irlba_2.3.5.1 fastDummies_1.7.4 gridGraphics_0.5-1 lazyeval_0.2.2
[77] yaml_2.3.10 phyclust_0.1-34 survival_3.7-0 scattermore_1.2
[81] crayon_1.5.3 RcppAnnoy_0.0.22 RColorBrewer_1.1-3 progressr_0.14.0
[85] tweenr_2.0.3 later_1.3.2 ggridges_0.5.6 codetools_0.2-20
[89] GlobalOptions_0.1.2 KEGGREST_1.44.1 Rtsne_0.17 shape_1.4.6.1
[93] limma_3.60.4 Rsamtools_2.20.0 filelock_1.0.3 pkgconfig_2.0.3
[97] xml2_1.3.6 spatstat.univar_3.0-1 GenomicAlignments_1.40.0 aplot_0.2.3
[101] spatstat.sparse_3.1-0 ape_5.8 viridisLite_0.4.2 gridBase_0.4-7
[105] xtable_1.8-4 car_3.1-2 fastcluster_1.2.6 httr_1.4.7
[109] tools_4.4.1 globals_0.16.3 hardhat_1.4.0 broom_1.0.6
[113] nlme_3.1-166 futile.logger_1.4.3 lambda.r_1.2.4 HDO.db_0.99.1
[117] dbplyr_2.5.0 assertthat_0.2.1 digest_0.6.37 farver_2.1.2
[121] TrajectoryUtils_1.10.1 ModelMetrics_1.2.2.2 yulab.utils_0.1.7 viridis_0.6.5
[125] rpart_4.1.23 glue_1.7.0 cachem_1.1.0 BiocFileCache_2.10.2
[129] polyclip_1.10-7 proxyC_0.4.1 generics_0.1.3 Biostrings_2.72.1
[133] mvtnorm_1.3-1 parallelly_1.38.0 statmod_1.5.0 R.cache_0.16.0
[137] RcppHNSW_0.6.0 carData_3.0-5 pbapply_1.7-2 spam_2.10-0
[141] gson_0.1.0 utf8_1.2.4 gower_1.0.1 SeuratWrappers_0.3.0
[145] graphlayouts_1.1.1 ggsignif_0.6.4 shiny_1.9.1 lava_1.8.0
[149] GenomeInfoDbData_1.2.12 R.utils_2.12.3 rhdf5filters_1.16.0 RCurl_1.98-1.16
[153] memoise_2.0.1 R.methodsS3_1.8.2 gld_2.6.6 future_1.34.0
[157] RANN_2.6.2 Cairo_1.6-2 spatstat.data_3.1-2 rstudioapi_0.16.0
[161] spatstat.utils_3.1-0 hms_1.1.3 fitdistrplus_1.2-1 munsell_0.5.1
[165] colorspace_2.1-1 rlang_1.1.4 ipred_0.9-15 dotCall64_1.1-1
[169] ggforce_0.4.2 circlize_0.4.16 coda_0.19-4.1 e1071_1.7-16
[173] TH.data_1.1-2 remotes_2.5.0 recipes_1.1.0 modeltools_0.2-23
[177] abind_1.4-8 GOSemSim_2.28.1 libcoin_1.0-10 treeio_1.26.0
[181] Rhdf5lib_1.26.0 futile.options_1.0.1 bitops_1.0-8 promises_1.3.0
[185] scatterpie_0.2.4 RSQLite_2.3.7 qvalue_2.34.0 sandwich_3.1-1
[189] fgsea_1.28.0 DelayedArray_0.30.1 proxy_0.4-27 GO.db_3.18.0
[193] compiler_4.4.1 forcats_1.0.0 prettyunits_1.2.0 boot_1.3-31
[197] listenv_0.9.1 edgeR_4.2.1 tensor_1.5 progress_1.2.3
[201] BiocParallel_1.38.0 spatstat.random_3.3-2 R6_2.5.1 fastmap_1.2.0
[205] multcomp_1.4-26 fastmatch_1.1-4 rstatix_0.7.2 ROCR_1.0-11
[209] rsvd_1.0.5 nnet_7.3-19 KernSmooth_2.23-24 miniUI_0.1.1.1
[213] deldir_2.0-4 htmltools_0.5.8.1 RcppParallel_5.1.9 bit64_4.0.5
[217] spatstat.explore_3.3-2 lifecycle_1.0.4 restfulr_0.0.15 vctrs_0.6.5
[221] spatstat.geom_3.3-3 ggfun_0.1.6 future.apply_1.11.2 pillar_1.9.0
[225] locfit_1.5-9.10 jsonlite_1.8.8 expm_1.0-0 argparse_2.2.3
[229] GetoptLong_1.0.5
The text was updated successfully, but these errors were encountered: