You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am frequently running into this problem where I want to add the gene names, gene biotypes and other features to the differentially accessible peaks in a single-cell ATAC-seq data analysis of two experimental groups.
The next step I do is to find the closest features using the following code -
ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))
This returns me the following error message:
Warning message:
In ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen)) :
The following seqlevels present in query regions are not present
in the supplied gene annotations and will be removed: GL456216.1, JH584295.1, JH584304.1
I get the same error message (as above) after each of the codes when I run the following codes to extract gene_name and gene_biotype
gene <- ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))$gene_name
biotype <- ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))$gene_biotype
Although I get a list of genes and biotypes, the main problem comes when I try to add this gene or biotype to the da_peaks_Spleen object using this code-
da_peaks_Spleen$gene <- gene
Error in $<-.data.frame(*tmp*, gene, value = c("Gm20546", "Gm26917", :
replacement has 2282 rows, data has 2288
I also get the same error when I am trying to add Biotype and other features.
sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Warning message:
In ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen)) :
The following seqlevels present in query regions are not present
in the supplied gene annotations and will be removed: GL456216.1, JH584295.1, JH584304.1
Ok, this is not an error but a warning saying that some seqlevels in the query regions are not present in the gene annotations. You cannot find a closest feature for these since they're on a chromosome or scaffold that you don't have information for
Although I get a list of genes and biotypes, the main problem comes when I try to add this gene or biotype to the da_peaks_Spleen object using this code-
da_peaks_Spleen$gene <- gene
Since some query regions are removed (the warning message above), there are rows missing from the output dataframe, so you can't just try to assign a column from that dataframe directly to the differential accessibility results. They have different length.
I wouldn't say this is a bug per se, but we could change the ClosestFeature function to still return values for regions that are not present in the annotations, putting NA.
@timoast Thank you for the explanation. It is reassuring to know that it is not a bug, but could be modified to make the ClosestFeature returning NA values.
For the time being, I have used tibble to add the gene and biotype to the da_peaks object by merging them.
I am frequently running into this problem where I want to add the gene names, gene biotypes and other features to the differentially accessible peaks in a single-cell ATAC-seq data analysis of two experimental groups.
When I use the following code
da_peaks_Spleen <- FindMarkers( object = data_Spleen, ident.1 = rownames(data_Spleen[[]][data_Spleen$orig.ident == "ATAC_SpleenD1",]), ident.2 = rownames(data_Spleen[[]][data_Spleen$orig.ident == "ATAC_SpleenD7",]), min.pct = 0.05, test.use = 'LR', latent.vars = 'peak_region_fragments' )
I get the object da_peaks_Spleen.
The next step I do is to find the closest features using the following code -
ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))
This returns me the following error message:
Warning message:
In ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen)) :
The following seqlevels present in query regions are not present
in the supplied gene annotations and will be removed: GL456216.1, JH584295.1, JH584304.1
I get the same error message (as above) after each of the codes when I run the following codes to extract gene_name and gene_biotype
gene <- ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))$gene_name
biotype <- ClosestFeature(data_Spleen, regions = rownames(da_peaks_Spleen))$gene_biotype
Although I get a list of genes and biotypes, the main problem comes when I try to add this gene or biotype to the da_peaks_Spleen object using this code-
da_peaks_Spleen$gene <- gene
Error in
$<-.data.frame
(*tmp*
, gene, value = c("Gm20546", "Gm26917", :replacement has 2282 rows, data has 2288
I also get the same error when I am trying to add Biotype and other features.
sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] tibble_3.1.8 EnhancedVolcano_1.16.0 ggrepel_0.9.2
[4] MAST_1.24.1 SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0
[7] MatrixGenerics_1.10.0 matrixStats_0.63.0 patchwork_1.1.2
[10] ggplot2_3.4.0 EnsDb.Mmusculus.v79_2.99.0 ensembldb_2.22.0
[13] AnnotationFilter_1.22.0 GenomicFeatures_1.50.3 AnnotationDbi_1.60.0
[16] Biobase_2.58.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.6
[19] IRanges_2.32.0 S4Vectors_0.36.1 BiocGenerics_0.44.0
[22] hdf5r_1.3.8 Signac_1.9.0 SeuratObject_4.1.3
[25] Seurat_4.3.0
loaded via a namespace (and not attached):
[1] utf8_1.2.2 spatstat.explore_3.0-5 reticulate_1.27
[4] tidyselect_1.2.0 RSQLite_2.2.20 htmlwidgets_1.6.1
[7] grid_4.2.2 BiocParallel_1.32.5 Rtsne_0.16
[10] devtools_2.4.5 munsell_0.5.0 codetools_0.2-18
[13] ragg_1.2.5 ica_1.0-3 interp_1.1-3
[16] future_1.31.0 miniUI_0.1.1.1 withr_2.5.0
[19] spatstat.random_3.1-3 colorspace_2.1-0 progressr_0.13.0
[22] filelock_1.0.2 knitr_1.42 rstudioapi_0.14
[25] ROCR_1.0-11 tensor_1.5 listenv_0.9.0
[28] labeling_0.4.2 GenomeInfoDbData_1.2.9 polyclip_1.10-4
[31] bit64_4.0.5 farver_2.1.1 parallelly_1.34.0
[34] vctrs_0.5.2 generics_0.1.3 xfun_0.37
[37] biovizBase_1.46.0 BiocFileCache_2.6.0 R6_2.5.1
[40] bitops_1.0-7 spatstat.utils_3.0-1 cachem_1.0.6
[43] DelayedArray_0.24.0 assertthat_0.2.1 promises_1.2.0.1
[46] BiocIO_1.8.0 scales_1.2.1 nnet_7.3-18
[49] gtable_0.3.1 globals_0.16.2 processx_3.8.0
[52] goftest_1.2-3 rlang_1.0.6 systemfonts_1.0.4
[55] RcppRoll_0.3.0 splines_4.2.2 rtracklayer_1.58.0
[58] lazyeval_0.2.2 dichromat_2.0-0.1 checkmate_2.1.0
[61] spatstat.geom_3.0-5 BiocManager_1.30.19 yaml_2.3.7
[64] reshape2_1.4.4 abind_1.4-5 backports_1.4.1
[67] httpuv_1.6.8 Hmisc_4.7-2 usethis_2.1.6
[70] tools_4.2.2 ellipsis_0.3.2 RColorBrewer_1.1-3
[73] sessioninfo_1.2.2 ggridges_0.5.4 Rcpp_1.0.10
[76] plyr_1.8.8 base64enc_0.1-3 progress_1.2.2
[79] zlibbioc_1.44.0 purrr_1.0.1 RCurl_1.98-1.9
[82] ps_1.7.2 prettyunits_1.1.1 rpart_4.1.19
[85] deldir_1.0-6 pbapply_1.7-0 urlchecker_1.0.1
[88] cowplot_1.1.1 zoo_1.8-11 cluster_2.1.4
[91] fs_1.6.0 magrittr_2.0.3 data.table_1.14.6
[94] scattermore_0.8 lmtest_0.9-40 RANN_2.6.1
[97] ProtGenerics_1.30.0 fitdistrplus_1.1-8 pkgload_1.3.2
[100] hms_1.1.2 mime_0.12 xtable_1.8-4
[103] XML_3.99-0.13 jpeg_0.1-10 gridExtra_2.3
[106] compiler_4.2.2 biomaRt_2.54.0 KernSmooth_2.23-20
[109] crayon_1.5.2 htmltools_0.5.4 later_1.3.0
[112] Formula_1.2-4 tidyr_1.3.0 DBI_1.1.3
[115] dbplyr_2.3.0 MASS_7.3-58.1 rappdirs_0.3.3
[118] Matrix_1.5-4 cli_3.6.0 parallel_4.2.2
[121] igraph_1.3.5 pkgconfig_2.0.3 GenomicAlignments_1.34.0
[124] foreign_0.8-83 sp_1.6-0 plotly_4.10.1
[127] spatstat.sparse_3.0-0 xml2_1.3.3 XVector_0.38.0
[130] VariantAnnotation_1.44.0 stringr_1.5.0 callr_3.7.3
[133] digest_0.6.31 sctransform_0.3.5 RcppAnnoy_0.0.20
[136] spatstat.data_3.0-0 Biostrings_2.66.0 leiden_0.4.3
[139] fastmatch_1.1-3 htmlTable_2.4.1 uwot_0.1.14
[142] restfulr_0.0.15 curl_5.0.0 shiny_1.7.4
[145] Rsamtools_2.14.0 rjson_0.2.21 lifecycle_1.0.3
[148] nlme_3.1-160 jsonlite_1.8.4 BSgenome_1.66.2
[151] viridisLite_0.4.1 fansi_1.0.4 pillar_1.8.1
[154] lattice_0.20-45 KEGGREST_1.38.0 fastmap_1.1.0
[157] httr_1.4.4 pkgbuild_1.4.0 survival_3.4-0
[160] remotes_2.4.2 glue_1.6.2 png_0.1-8
[163] bit_4.0.5 profvis_0.3.7 stringi_1.7.12
[166] blob_1.2.3 textshaping_0.3.6 latticeExtra_0.6-30
[169] memoise_2.0.1 dplyr_1.0.10 irlba_2.3.5.1
[172] future.apply_1.10.0
The text was updated successfully, but these errors were encountered: