I ran the gseGO function from clusterProfiler for all ontologies together, but I was not able to apply the simplify function to remove redundancy. I tried various options suggested online, but none worked. As a workaround, I ran gseGO for each ontology (BP, MF, CC) separately and then removed redundancy for each ontology using GOsim. Is it correct to combine the results from all ontologies into one file and then combine it with the metadata from one ontology (e.g., the output of gseGO for BP) to create enrichment plots? Or what would be the easiest way to do that? #707

Open
Sidragull57 opened this issue Jul 16, 2024 · 13 comments

Comments

@Sidragull57

Prerequisites

  • Have you read Feedback and followed the guide?
    • make sure you are using the latest release version
    • read the documentation
    • google your question/issue

Describe your issue

  • Make a reproducible example (e.g. 1)
  • your code should contain comments to describe the problem (e.g. what you expected and what actually happened)

Ask in the right place

  • for bugs or feature requests, post here (github issue)
  • for questions, please post to Bioconductor or Biostars with the tag clusterProfiler
@guidohooiveld

Next time please properly format your post!

Yet, I cannot reproduce your problem!

> library(clusterProfiler)
> library(org.Hs.eg.db)
> 
> data(geneList, package="DOSE")
> 
> res <- gseGO(geneList     = geneList,
+              OrgDb        = org.Hs.eg.db,
+              ont          = "ALL",
+              eps          = 0,
+              minGSSize    = 15,
+              maxGSSize    = 500,
+              pvalueCutoff = 0.05)
using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
> 
> res
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     GOALL 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm        
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...833 enriched terms found
'data.frame':   833 obs. of  12 variables:
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0098813" "GO:0007059" "GO:0051276" "GO:0000819" ...
 $ Description    : chr  "nuclear chromosome segregation" "chromosome segregation" "chromosome organization" "sister chromatid segregation" ...
 $ setSize        : int  238 319 473 185 152 327 317 362 224 138 ...
 $ enrichmentScore: num  0.633 0.585 0.52 0.661 0.686 ...
 $ NES            : num  2.88 2.72 2.54 2.93 2.96 ...
 $ pvalue         : num  3.41e-30 3.61e-30 1.75e-30 1.39e-27 1.88e-25 ...
 $ p.adjust       : num  7.11e-27 7.11e-27 7.11e-27 2.06e-24 2.22e-22 ...
 $ qvalue         : num  5.48e-27 5.48e-27 5.48e-27 1.58e-24 1.71e-22 ...
 $ rank           : num  449 1374 1374 449 532 ...
 $ leading_edge   : chr  "tags=23%, list=4%, signal=22%" "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=25%, list=4%, signal=24%" ...
 $ core_enrichment: chr  "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/8"| __truncated__ "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/55355/220134/51203/22974/10460/4751/79019/5583"| __truncated__ "8318/55143/991/9493/1062/4605/10403/7153/23397/9787/11065/55355/220134/51203/22974/10460/4751/55839/983/4085/98"| __truncated__ "55143/991/9493/1062/4605/10403/7153/23397/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/81620/332/383"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141 

> 
> res.simplify <- simplify(res)
> 
> res.simplify
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     GOALL 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm        
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...326 enriched terms found
'data.frame':   326 obs. of  12 variables:
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0098813" "GO:0007059" "GO:0051276" "GO:0000819" ...
 $ Description    : chr  "nuclear chromosome segregation" "chromosome segregation" "chromosome organization" "sister chromatid segregation" ...
 $ setSize        : int  238 319 473 185 327 491 423 104 197 129 ...
 $ enrichmentScore: num  0.633 0.585 0.52 0.661 0.541 ...
 $ NES            : num  2.88 2.72 2.54 2.93 2.54 ...
 $ pvalue         : num  3.41e-30 3.61e-30 1.75e-30 1.39e-27 1.11e-24 ...
 $ p.adjust       : num  7.11e-27 7.11e-27 7.11e-27 2.06e-24 1.09e-21 ...
 $ qvalue         : num  5.48e-27 5.48e-27 5.48e-27 1.58e-24 8.41e-22 ...
 $ rank           : num  449 1374 1374 449 1246 ...
 $ leading_edge   : chr  "tags=23%, list=4%, signal=22%" "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=25%, list=4%, signal=24%" ...
 $ core_enrichment: chr  "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/8"| __truncated__ "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/55355/220134/51203/22974/10460/4751/79019/5583"| __truncated__ "8318/55143/991/9493/1062/4605/10403/7153/23397/9787/11065/55355/220134/51203/22974/10460/4751/55839/983/4085/98"| __truncated__ "55143/991/9493/1062/4605/10403/7153/23397/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/81620/332/383"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141 

> 
> packageVersion("clusterProfiler")
[1] ‘4.13.0’
> 

@Sidragull57
Author

Sorry for the inconvenience. Actually, I wanted to remove redundancy from the GSEA-GO results and then make an enrichment plot, but could not find a way to do that.

I followed your instructions and ran:

res <- gseGO(geneList     = geneList,
             OrgDb        = org.Mm.eg.db,
             ont          = "ALL",
             eps          = 0,
             minGSSize    = 15,
             maxGSSize    = 500,
             pvalueCutoff = 0.05)

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
There were 12 warnings (use warnings() to see them)

res

#
# Gene Set Enrichment Analysis
#
#...@organism    Mus musculus
#...@setType     GOALL
#...@keytype     ENTREZID
#...@geneList    Named num [1:16133] 5.9 4.66 2.89 2.24 2.1 ...
 - attr(*, "names")= chr [1:16133] "26900" "26908" "20592" "20227" ...
#...nPerm
#...pvalues adjusted by 'BH' with cutoff <0.05
#...17 enriched terms found
'data.frame':   17 obs. of  12 variables:
 $ ONTOLOGY       : chr  "MF" "MF" "MF" "BP" ...
 $ ID             : chr  "GO:0019787" "GO:0004842" "GO:0061659" "GO:0043161" ...
 $ Description    : chr  "ubiquitin-like protein transferase activity" "ubiquitin-protein transferase activity" "ubiquitin-like protein ligase activity" "proteasome-mediated ubiquitin-dependent protein catabolic process" ...
 $ setSize        : int  410 387 307 410 294 483 127 202 116 171 ...
 $ enrichmentScore: num  -0.349 -0.346 -0.343 -0.317 -0.335 ...
 $ NES            : num  -1.89 -1.86 -1.8 -1.71 -1.75 ...
 $ pvalue         : num  1.31e-09 4.89e-09 4.59e-07 5.06e-07 2.13e-06 ...
 $ p.adjust       : num  8.37e-06 1.57e-05 8.11e-04 8.11e-04 2.73e-03 ...
 $ qvalue         : num  7.78e-06 1.45e-05 7.53e-04 7.53e-04 2.54e-03 ...
 $ rank           : num  3438 3438 3474 3723 3474 ...
 $ leading_edge   : chr  "tags=34%, list=21%, signal=28%" "tags=34%, list=21%, signal=27%" "tags=36%, list=22%, signal=29%" "tags=34%, list=23%, signal=27%" ...
 $ core_enrichment: chr  "67138/235315/19822/68795/56715/67338/208650/77853/547109/59003/209462/242521/217333/54484/80751/67455/74132/170"| __truncated__ "67138/235315/19822/68795/56715/67338/208650/77853/547109/59003/209462/242521/217333/54484/80751/67455/74132/280"| __truncated__ "78889/67138/19822/68795/56715/67338/208650/77853/547109/59003/209462/54484/80751/74132/28077/22215/53323/672511"| __truncated__ "19173/77891/11651/104318/448987/23805/232566/233040/14198/234684/93687/79043/71765/76375/12387/226144/19822/687"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141

res.simplify <- simplify(res)

Error in as.list.default(X) :
No method to convert this S4 class into a vector

Could you please tell me how to remove redundancy from the GO results and then plot them as an enrichment plot?

@guidohooiveld

guidohooiveld commented Jul 17, 2024

Again, it is working for me....! Does the exact same code from me also work for you? (copy/paste it)

Note; for your specific dataset:

  • which warnings are returned?
  • what is an enrichmentplot? a dotplot? cnetplot? ... ?
  • which version of clusterProfiler are you using? I strongly suspect it is not the latest one...
> library(clusterProfiler)
> library(org.Hs.eg.db)
> 
> data(geneList, package="DOSE")
> 
> res <- gseGO(geneList     = geneList,
+              OrgDb        = org.Hs.eg.db,
+              ont          = "ALL",
+              eps          = 0,
+              minGSSize    = 15,
+              maxGSSize    = 500,
+              pvalueCutoff = 0.05)
using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
> 
> res
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     GOALL 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm        
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...889 enriched terms found
'data.frame':   889 obs. of  12 variables:
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
 $ Description    : chr  "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
 $ setSize        : int  319 473 238 185 152 327 317 138 362 224 ...
 $ enrichmentScore: num  0.585 0.52 0.633 0.661 0.686 ...
 $ NES            : num  2.75 2.52 2.88 2.92 2.94 ...
 $ pvalue         : num  1.43e-31 1.24e-30 2.46e-30 2.26e-27 6.34e-26 ...
 $ p.adjust       : num  8.47e-28 3.67e-27 4.85e-27 3.34e-24 7.50e-23 ...
 $ qvalue         : num  6.39e-28 2.77e-27 3.65e-27 2.52e-24 5.66e-23 ...
 $ rank           : num  1374 1374 449 449 532 ...
 $ leading_edge   : chr  "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
 $ core_enrichment: chr  "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/55355/220134/51203/22974/10460/4751/79019/5583"| __truncated__ "8318/55143/991/9493/1062/4605/10403/7153/23397/9787/11065/55355/220134/51203/22974/10460/4751/55839/983/4085/98"| __truncated__ "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/8"| __truncated__ "55143/991/9493/1062/4605/10403/7153/23397/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/81620/332/383"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141 

> 
> res <- setReadable(res, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
> 
> res.simplify <- simplify(res)
> 
> res.simplify
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     GOALL 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm        
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...345 enriched terms found
'data.frame':   345 obs. of  12 variables:
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
 $ Description    : chr  "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
 $ setSize        : int  319 473 238 185 327 491 423 197 104 129 ...
 $ enrichmentScore: num  0.585 0.52 0.633 0.661 0.541 ...
 $ NES            : num  2.75 2.52 2.88 2.92 2.54 ...
 $ pvalue         : num  1.43e-31 1.24e-30 2.46e-30 2.26e-27 1.20e-24 ...
 $ p.adjust       : num  8.47e-28 3.67e-27 4.85e-27 3.34e-24 1.18e-21 ...
 $ qvalue         : num  6.39e-28 2.77e-27 3.65e-27 2.52e-24 8.89e-22 ...
 $ rank           : num  1374 1374 449 449 1246 ...
 $ leading_edge   : chr  "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
 $ core_enrichment: chr  "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| __truncated__ "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141 

> 
> dotplot(res.simplify)
> 
> cnetplot(res.simplify)
Warning message:
ggrepel: 13 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
> 
> 

dotplot:
image

cnetplot:
image
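The version question raised in this comment can be checked from within R. The following is a minimal base-R sketch, not something posted in the thread: `check_min_version` is a hypothetical helper, and the threshold passed to it is simply the version printed in the working session above (4.13.0), not a documented minimum requirement.

```r
# Report the installed version of a package and whether it meets a minimum.
# `min_version` is supplied by the caller; it is not a documented requirement.
check_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Package is not installed at all
    return(list(package = pkg, installed = NA_character_, ok = FALSE))
  }
  installed <- utils::packageVersion(pkg)
  list(package   = pkg,
       installed = as.character(installed),
       ok        = installed >= package_version(min_version))
}

# Example call; 4.13.0 is the version shown in the working session above.
str(check_min_version("clusterProfiler", "4.13.0"))
```

Comparing through `package_version()` avoids the pitfalls of comparing version strings as plain text (where "4.9.0" would sort after "4.13.0").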

@Sidragull57
Author

Thank you very much.

Now it works for me.

But when I apply it to the compareCluster gseGO output, I get an error message.

ck_TP_VR_GO # output of compare cluster gseaGO

#
# Result of Comparing 4 gene clusters
#
#.. @fun         gseGO
#.. @geneClusters       List of 4
 $ d1 : Named num [1:16225] 1.12 1.08 1.07 1 1 ...
  ..- attr(*, "names")= chr [1:16225] "70325" "14450" "14693" "22351" ...
 $ h12: Named num [1:16139] 1.93 1.93 1.67 1.38 1.32 ...
  ..- attr(*, "names")= chr [1:16139] "14605" "74747" "14229" "73825" ...
 $ d5 : Named num [1:16082] 1.251 1.088 1.052 0.991 0.987 ...
  ..- attr(*, "names")= chr [1:16082] "74477" "14632" "207259" "12724" ...
 $ d10: Named num [1:16127] 1.196 1.025 1.018 0.999 0.996 ...
  ..- attr(*, "names")= chr [1:16127] "229004" "74007" "68527" "228846" ...
#...Result      'data.frame':   287 obs. of  13 variables:
 $ Cluster        : Factor w/ 4 levels "d1","h12","d5",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ ONTOLOGY       : chr  "CC" "BP" "BP" "CC" ...
 $ ID             : chr  "GO:0022626" "GO:0002181" "GO:0006397" "GO:0005681" ...
 $ Description    : chr  "cytosolic ribosome" "cytoplasmic translation" "mRNA processing" "spliceosomal complex" ...
 $ setSize        : int  101 140 449 189 230 405 321 16 218 248 ...
 $ enrichmentScore: num  -0.474 -0.437 -0.296 -0.375 -0.349 ...
 $ NES            : num  -2.21 -2.16 -1.69 -1.93 -1.84 ...
 $ pvalue         : num  3.15e-08 1.70e-08 1.05e-07 2.38e-07 4.63e-07 ...
 $ p.adjust       : num  0.000128 0.000128 0.000286 0.000486 0.000755 ...
 $ qvalue         : num  0.000125 0.000125 0.000279 0.000474 0.000736 ...
 $ rank           : num  5455 3967 4093 2916 2789 ...
 $ leading_edge   : chr  "tags=66%, list=34%, signal=44%" "tags=44%, list=24%, signal=34%" "tags=35%, list=25%, signal=27%" "tags=42%, list=18%, signal=35%" ...
 $ core_enrichment: chr  "20005/19921/20084/19989/78294/20088/54217/319195/19941/19951/11837/76808/267019/22187/100043787/19946/27176/619"| __truncated__ "20055/56040/67427/98221/20068/13629/27370/116905/100503670/67115/20054/208922/433702/19981/16898/27207/75617/67"| __truncated__ "192170/70465/65105/18747/19655/68955/54614/78688/230257/16549/330216/67040/107686/231769/20624/83701/66899/6649"| __truncated__ "67178/66354/19134/192170/19655/54614/107686/20624/66492/74200/76479/68011/24010/19704/230596/20227/68592/192160"| __truncated__ ...
#.. number of enriched terms found for each gene cluster:
#..   d1: 41
#..   h12: 0
#..   d5: 119
#..   d10: 127
#
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

res_cC <- setReadable(ck_TP_VR_GO, OrgDb = org.Mm.eg.db, keyType="ENTREZID")
res.simplify_CC <- simplify(res_cC)

Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”

@guidohooiveld

guidohooiveld commented Jul 17, 2024

Once more, that is working for me as well!

  • Please report back; what did you change so that it works now when analyzing a single dataset?
  • Which version of clusterProfiler are you using?
  • Why are the lengths of your input gene lists not identical? Note that the code as such also works in case lists have different lengths ( I tested that).
> library(clusterProfiler)
> library(enrichplot)
> library(org.Hs.eg.db)
> 
> ## load sample data
> data(geneList, package="DOSE")
> head(geneList)
    4312     8318    10874    55143    55388      991 
4.572613 4.514594 4.418218 4.144075 3.876258 3.677857 
>  
> ## using sample data, create list with 3 comparisons to be used as input for comparCluster
> ## note that 'List3' is the reverse of 'List1' and 'List2'.
> inputList <- list(List1 = geneList, List2 = geneList, List3 = sort(-1*geneList, decreasing = TRUE) )
> 
> ## run gseGO on all input genelists
> res.combined <- compareCluster(geneClusters=inputList,
+                               fun = "gseGO",
+                               OrgDb = org.Hs.eg.db,
+                               keyType = "ENTREZID",
+                               ont = "ALL",
+                               eps = 0,
+                               pvalueCutoff = 0.05,
+                               pAdjustMethod = "BH",
+                               minGSSize = 15,
+                               maxGSSize = 500)
> 
> ## convert entrezids into symbols
> res.combined <- setReadable(res.combined, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
> 
> ## simplify
> res.combined.simplify <- simplify(res.combined)
> res.combined.simplify
#
# Result of Comparing 3 gene clusters 
#
#.. @fun         gseGO 
#.. @geneClusters       List of 3
 $ List1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
  ..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
 $ List2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
  ..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
 $ List3: Named num [1:12495] 4.3 3.95 3.6 3.46 3.42 ...
  ..- attr(*, "names")= chr [1:12495] "4969" "57758" "79901" "79838" ...
#...Result      'data.frame':   1004 obs. of  13 variables:
 $ Cluster        : Factor w/ 3 levels "List1","List2",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0098813" "GO:0007059" "GO:0051276" "GO:0000819" ...
 $ Description    : chr  "nuclear chromosome segregation" "chromosome segregation" "chromosome organization" "sister chromatid segregation" ...
 $ setSize        : int  238 319 473 185 327 491 423 104 197 129 ...
 $ enrichmentScore: num  0.633 0.585 0.52 0.661 0.541 ...
 $ NES            : num  2.91 2.78 2.59 2.95 2.56 ...
 $ pvalue         : num  9.67e-31 7.56e-31 4.48e-31 8.94e-27 8.44e-25 ...
 $ p.adjust       : num  1.91e-27 1.91e-27 1.91e-27 1.32e-23 8.32e-22 ...
 $ qvalue         : num  1.46e-27 1.46e-27 1.46e-27 1.01e-23 6.37e-22 ...
 $ rank           : num  449 1374 1374 449 1246 ...
 $ leading_edge   : chr  "tags=23%, list=4%, signal=22%" "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=25%, list=4%, signal=24%" ...
 $ core_enrichment: chr  "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| __truncated__ "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| __truncated__ ...
#.. number of enriched terms found for each gene cluster:
#..   List1: 342 
#..   List2: 336 
#..   List3: 326 
#
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, 
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu. 
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. 
The Innovation. 2021, 2(3):100141 

> 
> 
> ## dotplot
> dotplot(res.combined.simplify, font.size=8, showCategory=8, title =("GSEA results"), split=".sign") + facet_grid(.~.sign)
> 
> ## cnetplot
> cnetplot(res.combined.simplify)
>

image

image
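For readers preparing their own input lists for compareCluster: as the DOSE sample data printed above shows, each list is a named numeric vector (names are gene IDs) sorted in decreasing order. A minimal base-R sketch of building such a vector follows. `make_ranked_list`, `ids`, and `scores` are hypothetical names introduced here, and dropping missing values and keeping only the first occurrence of a duplicated ID are assumptions of this sketch, not something prescribed in the thread.

```r
# Build a ranked gene list: a named numeric vector sorted in decreasing order.
# `ids` and `scores` are hypothetical inputs (e.g. Entrez IDs and fold changes).
make_ranked_list <- function(ids, scores) {
  # Assumption of this sketch: drop missing values, keep first duplicate only
  keep <- !is.na(ids) & !is.na(scores) & !duplicated(ids)
  ranked <- stats::setNames(scores[keep], as.character(ids[keep]))
  sort(ranked, decreasing = TRUE)
}

toy <- make_ranked_list(ids    = c("4312", "8318", NA, "10874", "8318"),
                        scores = c(1.2, 4.5, 0.3, -2.0, 3.9))
print(toy)
```

Lists built this way can end up with different lengths when different genes are filtered out per comparison, which, as noted in the comment above, the compareCluster code tolerates.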

@Sidragull57
Author

Sidragull57 commented Jul 18, 2024

I am using R version 4.3.3. Initially, I unloaded the clusterProfiler package and then reinstalled it. This approach made the simplify function work for me.

# Unload the clusterProfiler package if it is loaded
if ("package:clusterProfiler" %in% search()) {
  detach("package:clusterProfiler", unload = TRUE)
}

# Update all outdated packages
update.packages(ask = FALSE)

# Install BiocManager if not already installed
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Reinstall the latest version of clusterProfiler
BiocManager::install("clusterProfiler", ask = FALSE)

I got this code from ChatGPT

However, each time before running simplify, I have to unload and reinstall clusterProfiler to get it to work again. I am not sure what is causing this issue.

I re-ran everything:

library(clusterProfiler)
library(org.Mm.eg.db)
library(org.Hs.eg.db)
data(geneList, package="DOSE")

set.seed(1)

res <- gseGO(gene = geneList_1d_fgsea_VR,
             OrgDb = org.Mm.eg.db,
             ont = "All",  # All
             pAdjustMethod = "fdr",
             pvalueCutoff  = 0.05,
             exponent = 1,
             minGSSize = 10,
             maxGSSize = 500,
             eps = 0,
             verbose = TRUE,
             seed = TRUE)

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
There were 12 warnings (use warnings() to see them)

res <- setReadable(res, OrgDb = org.Mm.eg.db, keyType="ENTREZID")
res.simplify <- simplify(res)
res.simplify

#
# Gene Set Enrichment Analysis
#
#...@organism    Mus musculus
#...@setType     GOALL
#...@keytype     ENTREZID
#...@geneList    Named num [1:16225] 1.12 1.08 1.07 1 1 ...
 - attr(*, "names")= chr [1:16225] "70325" "14450" "14693" "22351" ...
#...nPerm
#...pvalues adjusted by 'fdr' with cutoff <0.05
#...31 enriched terms found
'data.frame':   31 obs. of  12 variables:
 $ ONTOLOGY       : chr  "CC" "CC" "CC" "CC" ...
 $ ID             : chr  "GO:0022626" "GO:0005681" "GO:0036464" "GO:0016607" ...
 $ Description    : chr  "cytosolic ribosome" "spliceosomal complex" "cytoplasmic ribonucleoprotein granule" "nuclear speck" ...
 $ setSize        : int  101 189 230 338 362 124 25 95 225 140 ...
 $ enrichmentScore: num  -0.474 -0.375 -0.349 -0.299 0.327 ...
 $ NES            : num  -2.21 -1.93 -1.84 -1.66 1.68 ...
 $ pvalue         : num  3.15e-08 2.38e-07 4.63e-07 2.22e-06 5.47e-06 ...
 $ p.adjust       : num  0.000128 0.000486 0.000755 0.00212 0.002627 ...
 $ qvalue         : num  0.000125 0.000474 0.000736 0.002068 0.002563 ...
 $ rank           : num  5455 2916 2789 3252 3696 ...
 $ leading_edge   : chr  "tags=66%, list=34%, signal=44%" "tags=42%, list=18%, signal=35%" "tags=28%, list=17%, signal=24%" "tags=29%, list=20%, signal=24%" ...
 $ core_enrichment: chr  "Rpl9/Rpl19/Rps18/Rpl7/Rps27a/Rps24/Rpl36/Rpl17/Rpl26/Rpl32/Rplp0/Rpl18a/Rps15a/Ubb/Rpl36a-ps1/Rpl30/Rpl7a/Zcchc"| __truncated__ "Zmat5/Snw1/Prpf4b/Eif4a3/Rbmx/Prpf40b/Snrpd2/Eftud2/Zmat2/Khdc4/Smndc1/Snrpg/Ik/Upf1/Prpf38a/Sart1/Syf2/Casc3/A"| __truncated__ "Larp4/Hnrnpu/Rps4x/Xrn1/Rpl28/Ago2/Snrpb2/Rbm20/Mapt/Eif4e/Tnrc6c/Cnot7/Lsm3/Polr2d/Larp1/Grb7/Sqstm1/Dhx36/Ddx"| __truncated__ "Cxxc1/Tcim/Topors/Syf2/Phf7/Aagab/Hipk1/Api5/Prcc/Sgk1/Prpf19/Trip12/Hdac4/Dnaaf1/Srsf3/Kat6a/Cdk12/Hnrnpu/Srpk"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141

But it did not work for my compareCluster data; I got:

Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”

I have now also run the whole analysis with the data you mentioned:

data(geneList, package="DOSE")

res_Dose <- gseGO(gene = geneList,
                  OrgDb = org.Hs.eg.db,
                  ont = "All",  # All
                  pAdjustMethod = "fdr",
                  pvalueCutoff  = 0.05,
                  exponent = 1,
                  minGSSize = 10,
                  maxGSSize = 500,
                  eps = 0,
                  verbose = TRUE,
                  seed = TRUE)

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
9: In serialize(data, node$con) :
'package:stats' may not be available when loading
10: In serialize(data, node$con) :
'package:stats' may not be available when loading

res_Dose

#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens
#...@setType     GOALL
#...@keytype     ENTREZID
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm
#...pvalues adjusted by 'fdr' with cutoff <0.05
#...868 enriched terms found
'data.frame':   868 obs. of  12 variables:
 $ ONTOLOGY       : chr  "BP" "BP" "BP" "BP" ...
 $ ID             : chr  "GO:0051276" "GO:0007059" "GO:0098813" "GO:0000819" ...
 $ Description    : chr  "chromosome organization" "chromosome segregation" "nuclear chromosome segregation" "sister chromatid segregation" ...
 $ setSize        : int  471 319 249 198 332 308 160 139 367 486 ...
 $ enrichmentScore: num  0.523 0.572 0.609 0.633 0.534 ...
 $ NES            : num  2.57 2.7 2.79 2.84 2.53 ...
 $ pvalue         : num  2.20e-31 1.24e-28 3.49e-27 1.31e-24 1.09e-23 ...
 $ p.adjust       : num  1.68e-27 4.73e-25 8.88e-24 2.50e-21 1.66e-20 ...
 $ qvalue         : num  1.34e-27 3.78e-25 7.09e-24 2.00e-21 1.33e-20 ...
 $ rank           : num  1374 1374 449 449 1246 ...
 $ leading_edge   : chr  "tags=24%, list=11%, signal=22%" "tags=26%, list=11%, signal=24%" "tags=21%, list=4%, signal=21%" "tags=23%, list=4%, signal=22%" ...
 $ core_enrichment: chr  "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1/MAD2"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/MAD2L1/KIF18A/CD"| __truncated__ "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/NUSAP1/TPX2/TACC3/NEK2/MAD2L1/KIF18A/CDT1/BIRC5/KI"| __truncated__ ...
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141

res_Dose <- setReadable(res_Dose, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res.Dose.simplify <- simplify(res_Dose)
Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”

I get the same error when I run compareCluster with this data:

res.combined <- compareCluster(geneClusters = inputList,
                               fun = "gseGO",
                               OrgDb = org.Hs.eg.db,
                               keyType = "ENTREZID",
                               ont = "ALL",
                               eps = 0,
                               pvalueCutoff = 0.05,
                               pAdjustMethod = "BH",
                               minGSSize = 15,
                               maxGSSize = 500)

There were 30 warnings (use warnings() to see them)

res.combined <- setReadable(res.combined, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res.combined.simplify <- simplify(res.combined)
Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”

I am extremely sorry, but I do not know what is wrong.
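The error text itself shows that the failing call is `match.arg(ont, c("BP", "CC", "MF"))`, that is, some function in the call chain accepts only a single ontology. The base-R sketch below reproduces that mechanism in isolation. `pick_ontology` is a hypothetical stand-in written for illustration, not clusterProfiler's actual code.

```r
# Hypothetical stand-in for a function that accepts one ontology at a time.
pick_ontology <- function(ont) {
  match.arg(ont, c("BP", "CC", "MF"))
}

# A single ontology is accepted and returned unchanged.
print(pick_ontology("BP"))

# "ALL" is not among the choices, so match.arg() signals an error;
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(pick_ontology("ALL"),
                error = function(e) conditionMessage(e))
print(msg)
```

This is consistent with the observation that the same object simplifies without error in one installation and not in another: which code path receives `ont = "ALL"` would depend on the installed package code.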

@guidohooiveld

guidohooiveld commented Jul 18, 2024

Some thoughts:

Since you get the error when analyzing the included human dataset, and I do not, it points to an issue with your R/Bioconductor installation.

Also, are you running it on a laptop/PC with R-studio, or on a computer cluster? Reason I am asking is because of the warning regarding the stats package.
Pointing to R-studio because of https://forum.posit.co/t/error-when-running-parallelized-process-warning-in-serialize-package-stats-may-not-be-available-when-loading/110573

Therefore:

Also:

  • if using R-Studio, does the example code work directly in R?
  • what happens if you explicitly refer to the simplify function of clusterProfiler? Thus: res.Dose.simplify <- clusterProfiler::simplify(res_Dose)
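Calling `clusterProfiler::simplify()` explicitly, as suggested above, guards against another attached package supplying its own `simplify` (igraph, for example, exports a function with that name). Whether masking plays any role in this issue is not established in the thread; the base-R sketch below only shows how one might check. `where_defined` is a hypothetical helper introduced here.

```r
# List every attached environment that defines `name`, in search-path order.
# An unqualified call such as simplify(x) normally resolves to the first hit.
where_defined <- function(name) {
  hits <- vapply(search(), function(env) {
    exists(name, envir = as.environment(env), inherits = FALSE)
  }, logical(1))
  names(hits)[hits]
}

# Returns character(0) if no attached package defines the name.
print(where_defined("simplify"))
```

If more than one package is listed, the namespace-qualified form removes any ambiguity about which function is called.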

@Sidragull57
Author

Sidragull57 commented Jul 18, 2024

I updated R from version 4.3 to 4.4.1 and updated all the packages, but the problem persists for ontology = "ALL". If I change to only one specific ontology like "BP", it works. Below, you can see the details.
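Since the single-ontology runs work in this installation, one conceivable workaround is to simplify each ontology separately and stack the resulting tables. The base-R sketch below covers only the stacking step and runs on toy data. `combine_ontologies` is a hypothetical helper, and it assumes each per-ontology table has a `p.adjust` column and no `ONTOLOGY` column of its own. In practice the tables might come from something like `as.data.frame(simplify(gseGO(..., ont = "BP")))`, which is not executed here.

```r
# Stack per-ontology result tables into one data frame with an ONTOLOGY column.
# `tables` is a named list (names such as "BP", "CC", "MF") of data frames.
combine_ontologies <- function(tables) {
  tagged <- lapply(names(tables), function(ont) {
    df <- tables[[ont]]
    if (nrow(df) == 0) return(NULL)          # skip ontologies without hits
    cbind(ONTOLOGY = ont, df, stringsAsFactors = FALSE)
  })
  tagged <- Filter(Negate(is.null), tagged)
  if (length(tagged) == 0) return(data.frame())
  combined <- do.call(rbind, tagged)
  combined <- combined[order(combined$p.adjust), ]
  rownames(combined) <- NULL
  combined
}

# Toy tables standing in for simplified per-ontology results.
toy_tables <- list(
  BP = data.frame(ID = c("GO:0007059", "GO:0051276"), NES = c(2.7, 2.5),
                  p.adjust = c(1e-5, 3e-4)),
  CC = data.frame(ID = "GO:0022626", NES = -2.2, p.adjust = 1e-4),
  MF = data.frame(ID = character(0), NES = numeric(0), p.adjust = numeric(0))
)
combined_toy <- combine_ontologies(toy_tables)
print(combined_toy)
```

One caveat with this route: adjusted p-values from separate runs are corrected within each ontology only, so they are not identical to those from a joint `ont = "ALL"` run. The result is also a plain data frame rather than a gseaResult object, so plotting functions that expect the latter would not accept it directly.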

BiocManager::valid()
'getOption("repos")' replaces Bioconductor standard repositories, see 'help("repositories", package =
"BiocManager")' for details.
Replacement repositories:
CRAN: https://cran.rstudio.com/

sessionInfo()

R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8 LC_MONETARY=German_Germany.utf8
[4] LC_NUMERIC=C LC_TIME=German_Germany.utf8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] org.Hs.eg.db_3.19.1 msigdbr_7.5.1.9001 org.Mm.eg.db_3.19.1 AnnotationDbi_1.66.0
[5] IRanges_2.38.1 S4Vectors_0.42.1 Biobase_2.64.0 BiocGenerics_0.50.0
[9] clusterProfiler_4.12.0 dplyr_1.1.4 igraph_2.0.3

loaded via a namespace (and not attached):
[1] DBI_1.2.3 shadowtext_0.1.4 gson_0.1.0 gridExtra_2.3
[5] remotes_2.5.0 rlang_1.1.4 magrittr_2.0.3 DOSE_3.30.1
[9] compiler_4.4.1 RSQLite_2.3.7 png_0.1-8 vctrs_0.6.5
[13] reshape2_1.4.4 stringr_1.5.1 pkgconfig_2.0.3 crayon_1.5.3
[17] fastmap_1.2.0 XVector_0.44.0 ggraph_2.2.1 utf8_1.2.4
[21] HDO.db_0.99.1 enrichplot_1.24.0 UCSC.utils_1.0.0 purrr_1.0.2
[25] bit_4.0.5 zlibbioc_1.50.0 cachem_1.1.0 aplot_0.2.3
[29] GenomeInfoDb_1.40.1 jsonlite_1.8.8 blob_1.2.4 BiocParallel_1.38.0
[33] tweenr_2.0.3 parallel_4.4.1 R6_2.5.1 stringi_1.8.4
[37] RColorBrewer_1.1-3 GOSemSim_2.30.0 Rcpp_1.0.12 snow_0.4-4
[41] Matrix_1.7-0 splines_4.4.1 tidyselect_1.2.1 qvalue_2.36.0
[45] viridis_0.6.5 codetools_0.2-20 curl_5.2.1 lattice_0.22-6
[49] tibble_3.2.1 plyr_1.8.9 treeio_1.28.0 withr_3.0.0
[53] KEGGREST_1.44.1 gridGraphics_0.5-1 scatterpie_0.2.3 polyclip_1.10-6
[57] Biostrings_2.72.1 BiocManager_1.30.23 pillar_1.9.0 ggtree_3.12.0
[61] ggfun_0.1.5 generics_0.1.3 ggplot2_3.5.1 munsell_0.5.1
[65] scales_1.3.0 tidytree_0.4.6 glue_1.7.0 lazyeval_0.2.2
[69] tools_4.4.1 ggnewscale_0.4.10 data.table_1.15.4 fgsea_1.30.0
[73] babelgene_22.9 fs_1.6.4 graphlayouts_1.1.1 fastmatch_1.1-4
[77] tidygraph_1.3.1 cowplot_1.1.3 grid_4.4.1 tidyr_1.3.1
[81] ape_5.8 colorspace_2.1-0 nlme_3.1-165 GenomeInfoDbData_1.2.12
[85] patchwork_1.2.0 ggforce_0.4.2 cli_3.6.3 fansi_1.0.6
[89] viridisLite_0.4.2 gtable_0.3.5 yulab.utils_0.1.4 digest_0.6.36
[93] ggrepel_0.9.5 ggplotify_0.1.2 farver_2.1.2 memoise_2.0.1
[97] lifecycle_1.0.4 httr_1.4.7 GO.db_3.19.1 bit64_4.0.5
[101] MASS_7.3-61

Bioconductor version '3.19'

  • 1 packages out-of-date
  • 1 packages too new

create a valid installation with

BiocManager::install(c(
"msigdbr", "Rcpp"
), update = TRUE, ask = FALSE, force = TRUE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

Warning message:
1 packages out-of-date; 1 packages too new

library(clusterProfiler)
library(org.Hs.eg.db)
data(geneList, package="DOSE")
res <- gseGO(geneList     = geneList,
             OrgDb        = org.Hs.eg.db,
             ont          = "ALL",
             eps          = 0,
             minGSSize    = 15,
             maxGSSize    = 500,
             pvalueCutoff = 0.05)

using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
9: In serialize(data, node$con) :
'package:stats' may not be available when loading
10: In serialize(data, node$con) :
'package:stats' may not be available when loading

res <- setReadable(res, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res.simplify <- simplify(res)
Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”
res_BP <- gseGO(geneList     = geneList,
                OrgDb        = org.Hs.eg.db,
                ont          = "BP",
                eps          = 0,
                minGSSize    = 15,
                maxGSSize    = 500,
                pvalueCutoff = 0.05)

using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
9: In serialize(data, node$con) :
'package:stats' may not be available when loading
10: In serialize(data, node$con) :
'package:stats' may not be available when loading

res_BP <- setReadable(res_BP, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res_BP.simplify <- simplify(res_BP)
res_BP

Gene Set Enrichment Analysis

#...@organism Homo sapiens
#...@setType BP
#...@keytype ENTREZID
#...@GeneList Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...

 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm
#...pvalues adjusted by 'BH' with cutoff <0.05
#...681 enriched terms found
'data.frame': 681 obs. of 11 variables:
    $ ID : chr "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
    $ Description : chr "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
    $ setSize : int 319 473 238 185 152 327 224 362 491 423 ...
    $ enrichmentScore: num 0.585 0.52 0.633 0.661 0.686 ...
    $ NES : num 2.76 2.58 2.91 2.94 2.97 ...
    $ pvalue : num 4.55e-31 4.52e-31 8.49e-30 2.12e-27 1.59e-25 ...
    $ p.adjust : num 1.04e-27 1.04e-27 1.30e-26 2.43e-24 1.46e-22 ...
    $ qvalue : num 7.90e-28 7.90e-28 9.83e-27 1.84e-24 1.10e-22 ...
    $ rank : num 1374 1374 449 449 532 ...
    $ leading_edge : chr "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
    $ core_enrichment: chr "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| truncated "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| truncated ...
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

res_BP.simplify

Gene Set Enrichment Analysis

#...@organism Homo sapiens
#...@setType BP
#...@keytype ENTREZID
#...@GeneList Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...

 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm
#...pvalues adjusted by 'BH' with cutoff <0.05
#...249 enriched terms found
'data.frame': 249 obs. of 11 variables:
    $ ID : chr "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
    $ Description : chr "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
    $ setSize : int 319 473 238 185 327 491 423 197 104 129 ...
    $ enrichmentScore: num 0.585 0.52 0.633 0.661 0.541 ...
    $ NES : num 2.76 2.58 2.91 2.94 2.56 ...
    $ pvalue : num 4.55e-31 4.52e-31 8.49e-30 2.12e-27 2.75e-24 ...
    $ p.adjust : num 1.04e-27 1.04e-27 1.30e-26 2.43e-24 2.10e-21 ...
    $ qvalue : num 7.90e-28 7.90e-28 9.83e-27 1.84e-24 1.59e-21 ...
    $ rank : num 1374 1374 449 449 1246 ...
    $ leading_edge : chr "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
    $ core_enrichment: chr "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| truncated "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| truncated ...
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

inputList <- list(List1 = geneList, List2 = geneList, List3 = sort(-1*geneList, decreasing = TRUE))
res.combined <- compareCluster(geneClusters  = inputList,
                               fun           = "gseGO",
                               OrgDb         = org.Hs.eg.db,
                               keyType       = "ENTREZID",
                               ont           = "ALL",
                               eps           = 0,
                               pvalueCutoff  = 0.05,
                               pAdjustMethod = "BH",
                               minGSSize     = 15,
                               maxGSSize     = 500)

There were 30 warnings (use warnings() to see them)

res.combined

Result of Comparing 3 gene clusters

#.. @fun gseGO
#.. @geneClusters List of 3
$ List1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List3: Named num [1:12495] 4.3 3.95 3.6 3.46 3.42 ...
..- attr(*, "names")= chr [1:12495] "4969" "57758" "79901" "79838" ...
#...Result 'data.frame': 2657 obs. of 13 variables:
$ Cluster : Factor w/ 3 levels "List1","List2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ONTOLOGY : chr "BP" "BP" "BP" "BP" ...
$ ID : chr "GO:0051276" "GO:0007059" "GO:0098813" "GO:0000819" ...
$ Description : chr "chromosome organization" "chromosome segregation" "nuclear chromosome segregation" "sister chromatid segregation" ...
$ setSize : int 473 319 238 185 152 327 317 138 224 362 ...
$ enrichmentScore: num 0.52 0.585 0.633 0.661 0.686 ...
$ NES : num 2.56 2.76 2.9 2.96 2.93 ...
$ pvalue : num 3.42e-31 1.05e-30 5.32e-30 5.45e-27 7.08e-26 ...
$ p.adjust : num 2.02e-27 3.09e-27 1.05e-26 8.06e-24 8.37e-23 ...
$ qvalue : num 1.55e-27 2.37e-27 8.03e-27 6.17e-24 6.40e-23 ...
$ rank : num 1374 1374 449 449 532 ...
$ leading_edge : chr "tags=24%, list=11%, signal=22%" "tags=27%, list=11%, signal=25%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
$ core_enrichment: chr "8318/55143/991/9493/1062/4605/10403/7153/23397/9787/11065/55355/220134/51203/22974/10460/4751/55839/983/4085/98"| truncated "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/55355/220134/51203/22974/10460/4751/79019/5583"| truncated "55143/991/9493/1062/4605/9133/10403/7153/23397/259266/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/8"| truncated "55143/991/9493/1062/4605/10403/7153/23397/9787/11065/220134/51203/22974/10460/4751/983/4085/81930/81620/332/383"| truncated ...
#.. number of enriched terms found for each gene cluster:
#.. List1: 869
#.. List2: 846
#.. List3: 942

#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

res.combined <- setReadable(res.combined, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res.combined.simplify <- simplify(res.combined)
Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”
res_BP.combined <- compareCluster(geneClusters  = inputList,
                                  fun           = "gseGO",
                                  OrgDb         = org.Hs.eg.db,
                                  keyType       = "ENTREZID",
                                  ont           = "BP",
                                  eps           = 0,
                                  pvalueCutoff  = 0.05,
                                  pAdjustMethod = "BH",
                                  minGSSize     = 15,
                                  maxGSSize     = 500)

There were 30 warnings (use warnings() to see them)

res_BP.combined <- setReadable(res_BP.combined, OrgDb = org.Hs.eg.db, keyType="ENTREZID")
res_BP.combined

Result of Comparing 3 gene clusters

#.. @fun gseGO
#.. @geneClusters List of 3
$ List1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List3: Named num [1:12495] 4.3 3.95 3.6 3.46 3.42 ...
..- attr(*, "names")= chr [1:12495] "4969" "57758" "79901" "79838" ...
#...Result 'data.frame': 2275 obs. of 12 variables:
$ Cluster : Factor w/ 3 levels "List1","List2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : chr "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
$ Description : chr "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
$ setSize : int 319 473 238 185 152 327 224 362 491 423 ...
$ enrichmentScore: num 0.585 0.52 0.633 0.661 0.686 ...
$ NES : num 2.75 2.53 2.91 2.97 3.01 ...
$ pvalue : num 4.55e-31 7.56e-31 1.38e-30 1.60e-27 6.91e-26 ...
$ p.adjust : num 1.73e-27 1.73e-27 2.11e-27 1.83e-24 6.33e-23 ...
$ qvalue : num 1.29e-27 1.29e-27 1.57e-27 1.37e-24 4.73e-23 ...
$ rank : num 1374 1374 449 449 532 ...
$ leading_edge : chr "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
$ core_enrichment: chr "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| truncated "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| truncated ...
#.. number of enriched terms found for each gene cluster:
#.. List1: 790
#.. List2: 760
#.. List3: 725

#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

res_BP.combined.simplify <- simplify(res_BP.combined)
res_BP.combined.simplify

Result of Comparing 3 gene clusters

#.. @fun gseGO
#.. @geneClusters List of 3
$ List1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List3: Named num [1:12495] 4.3 3.95 3.6 3.46 3.42 ...
..- attr(*, "names")= chr [1:12495] "4969" "57758" "79901" "79838" ...
#...Result 'data.frame': 820 obs. of 12 variables:
$ Cluster : Factor w/ 3 levels "List1","List2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : chr "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
$ Description : chr "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
$ setSize : int 319 473 238 185 327 491 423 197 104 129 ...
$ enrichmentScore: num 0.585 0.52 0.633 0.661 0.541 ...
$ NES : num 2.75 2.53 2.91 2.97 2.55 ...
$ pvalue : num 4.55e-31 7.56e-31 1.38e-30 1.60e-27 1.53e-24 ...
$ p.adjust : num 1.73e-27 1.73e-27 2.11e-27 1.83e-24 1.16e-21 ...
$ qvalue : num 1.29e-27 1.29e-27 1.57e-27 1.37e-24 8.70e-22 ...
$ rank : num 1374 1374 449 449 1246 ...
$ leading_edge : chr "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
$ core_enrichment: chr "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| truncated "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| truncated ...
#.. number of enriched terms found for each gene cluster:
#.. List1: 282
#.. List2: 274
#.. List3: 264

#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

res.combined.simplify <- clusterProfiler::simplify(res.combined)
Error in match.arg(ont, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”
res_BP.combined.simplify <- clusterProfiler::simplify(res_BP.combined)
res_BP.combined.simplify

Result of Comparing 3 gene clusters

#.. @fun gseGO
#.. @geneClusters List of 3
$ List1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ List3: Named num [1:12495] 4.3 3.95 3.6 3.46 3.42 ...
..- attr(*, "names")= chr [1:12495] "4969" "57758" "79901" "79838" ...
#...Result 'data.frame': 820 obs. of 12 variables:
$ Cluster : Factor w/ 3 levels "List1","List2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : chr "GO:0007059" "GO:0051276" "GO:0098813" "GO:0000819" ...
$ Description : chr "chromosome segregation" "chromosome organization" "nuclear chromosome segregation" "sister chromatid segregation" ...
$ setSize : int 319 473 238 185 327 491 423 197 104 129 ...
$ enrichmentScore: num 0.585 0.52 0.633 0.661 0.541 ...
$ NES : num 2.75 2.53 2.91 2.97 2.55 ...
$ pvalue : num 4.55e-31 7.56e-31 1.38e-30 1.60e-27 1.53e-24 ...
$ p.adjust : num 1.73e-27 1.73e-27 2.11e-27 1.83e-24 1.16e-21 ...
$ qvalue : num 1.29e-27 1.29e-27 1.57e-27 1.37e-24 8.70e-22 ...
$ rank : num 1374 1374 449 449 1246 ...
$ leading_edge : chr "tags=27%, list=11%, signal=25%" "tags=24%, list=11%, signal=22%" "tags=23%, list=4%, signal=22%" "tags=25%, list=4%, signal=24%" ...
$ core_enrichment: chr "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPM"| truncated "CDC45/CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/HJURP/SKA1/NUSAP1/TPX2/TACC3/NEK2/CENPN/CDK1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/ASPM/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1"| truncated "CDCA8/CDC20/KIF23/CENPE/MYBL2/NDC80/TOP2A/NCAPH/DLGAP5/UBE2C/SKA1/NUSAP1/TPX2/TACC3/NEK2/CDK1/MAD2L1/KIF18A/CDT"| truncated ...
#.. number of enriched terms found for each gene cluster:
#.. List1: 282
#.. List2: 274
#.. List3: 264

#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141

@guidohooiveld

guidohooiveld commented Jul 18, 2024

I am lost... I have the same version of clusterProfiler as you have installed, and it works fine for me...

> packageVersion("clusterProfiler")
[1] ‘4.12.0’
> 

Yet, I am still puzzled why you get the warnings, but I don't...

Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading

Also, when I check the source code of simplify I do see that GOALL is supported, but you get the error...

GOALL was indeed initially not supported, but since v4.2.0 it is.... check this commit (from 27 Oct 2021):
b75f09a

Below is what I see when I check the source code in R. Note the explicit mention of GOALL.
What does your installation show?

> library(clusterProfiler)
> selectMethod(simplify, signature="gseaResult")
Method Definition:

function (x, ...) 
{
    .local <- function (x, cutoff = 0.7, by = "p.adjust", select_fun = min, 
        measure = "Wang", semData = NULL) 
    {
        if (!x@setType %in% c("BP", "MF", "CC", "GOALL")) 
            stop("simplify only applied to output from gseGO and enrichGO...")
        res <- as.data.frame(x)
        if (x@setType == "GOALL") {
            x@result <- simplify_ALL(res = res, cutoff = cutoff, 
                by = by, select_fun = select_fun, measure = measure, 
                semData = semData)
        }
        else {
            x@result <- simplify_internal(res = res, cutoff = cutoff, 
                by = by, select_fun = select_fun, measure = measure, 
                ontology = x@setType, semData = semData)
        }
        return(x)
    }
    .local(x, ...)
}
<bytecode: 0x000002d8d342b808>
<environment: namespace:clusterProfiler>

Signatures:
        x           
target  "gseaResult"
defined "gseaResult"
> 
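The method above branches on x@setType, so it may be worth checking what value a given result object actually carries. A small sketch follows; `get_set_type` is a hypothetical helper, and the expectation that gseGO with ont = "ALL" yields "GOALL" is only inferred from the method source above, not verified here.

```r
# Hypothetical helper: return the setType recorded in an S4 result object,
# or NA if the object has no such slot (for example a plain data frame).
get_set_type <- function(x) {
  if (isS4(x) && "setType" %in% methods::slotNames(x)) {
    x@setType
  } else {
    NA_character_
  }
}

# Intended use in the session from the thread (objects assumed to exist):
# get_set_type(res)      # result of gseGO(..., ont = "ALL")
# get_set_type(res_BP)   # result of gseGO(..., ont = "BP")
```

A value outside "BP", "MF", "CC" and "GOALL" would trigger the stop() in the method, which is a different message from the match.arg error reported in this thread.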

@Sidragull57
Author

I checked and get the same as you:

> library(clusterProfiler)
> selectMethod(simplify, signature="gseaResult")
Method Definition:

function (x, ...) 
{
    .local <- function (x, cutoff = 0.7, by = "p.adjust", select_fun = min, 
        measure = "Wang", semData = NULL) 
    {
        if (!x@setType %in% c("BP", "MF", "CC", "GOALL")) 
            stop("simplify only applied to output from gseGO and enrichGO...")
        res <- as.data.frame(x)
        if (x@setType == "GOALL") {
            x@result <- simplify_ALL(res = res, cutoff = cutoff, 
                by = by, select_fun = select_fun, measure = measure, 
                semData = semData)
        }
        else {
            x@result <- simplify_internal(res = res, cutoff = cutoff, 
                by = by, select_fun = select_fun, measure = measure, 
                ontology = x@setType, semData = semData)
        }
        return(x)
    }
    .local(x, ...)
}

Signatures:
        x           
target  "gseaResult"
defined "gseaResult"

@guidohooiveld

guidohooiveld commented Jul 19, 2024

So:

  • You have the latest versions of R, Bioconductor and clusterProfiler installed. [Just to be sure: start a fresh R session as administrator and run this line once more: BiocManager::install(version = "3.19")]
  • The analysis with the identical sample dataset works on my system, but not on yours.
  • Yet checking the simplify function in R on your system shows that GOALL is supported.

--> I am out of suggestions... Sorry! The last thing to suggest is to use another PC or laptop...
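One diagnostic not yet tried in this thread is to look at the call stack behind the match.arg error, since it shows which function passes the unexpected ont value. In an interactive session, calling traceback() right after the error is usually enough. The sketch below does the same programmatically in base R; `capture_stack` is a hypothetical helper, not part of clusterProfiler.

```r
# Hypothetical helper: evaluate `expr`; on error, return the error message
# together with the calls that were active when the error was raised.
capture_stack <- function(expr) {
  stack <- NULL
  tryCatch(
    withCallingHandlers(
      { expr; NULL },                                  # NULL when no error occurs
      error = function(e) stack <<- sys.calls()        # record the stack first
    ),
    error = function(e) list(
      message = conditionMessage(e),
      calls   = vapply(as.list(stack), function(cl) deparse(cl)[1], character(1))
    )
  )
}

# Intended use in the session from the thread (`res` assumed to exist):
# out <- capture_stack(clusterProfiler::simplify(res))
# print(out$calls)
```

The entries just before the match.arg call name the function that fails, which would make a report to the maintainers more specific.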

@Sidragull57
Author

Thank you very much for your support. I am now at least able to remove redundancy from the GSEA output for a specific ontology.
Your help is greatly appreciated.
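The per-ontology workaround can be sketched as follows. This is a minimal sketch rather than an endorsed workflow: `run_per_ontology` and `combine_tables` are hypothetical helper names, while gseGO and simplify are called with the arguments used earlier in this thread.

```r
# Hypothetical helper: run gseGO once per ontology and simplify each result.
# Extra arguments (eps, minGSSize, ...) are passed through to gseGO.
run_per_ontology <- function(geneList, OrgDb, onts = c("BP", "MF", "CC"), ...) {
  results <- lapply(onts, function(ont) {
    res <- clusterProfiler::gseGO(geneList = geneList, OrgDb = OrgDb,
                                  ont = ont, ...)
    clusterProfiler::simplify(res)
  })
  names(results) <- onts
  results
}

# Hypothetical helper: stack the per-ontology tables into one data frame,
# labelling every row with the ontology it came from.
combine_tables <- function(tables) {
  labelled <- lapply(names(tables), function(ont) {
    df <- as.data.frame(tables[[ont]])
    data.frame(ONTOLOGY = rep(ont, nrow(df)), df,
               check.names = FALSE, stringsAsFactors = FALSE)
  })
  do.call(rbind, labelled)
}

# Intended use (needs clusterProfiler, org.Hs.eg.db and the DOSE geneList):
# simplified <- run_per_ontology(geneList, org.Hs.eg.db, eps = 0,
#                                minGSSize = 15, maxGSSize = 500,
#                                pvalueCutoff = 0.05)
# combined <- combine_tables(simplified)
```

Two caveats apply. Each run presumably adjusts p-values within its own ontology, so the values may differ from a single ont = "ALL" run. The combined table is also a plain data frame, so it suits export or custom plotting rather than functions that expect a gseaResult object; for those, the per-ontology objects in `simplified` remain available.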

@guidohooiveld

One last remark:

Again, be sure to also run the code in plain R (and not through RStudio)!

Another user just reported that this solved their problem, although that problem was not at all related to your issue... but you never know...

#708 (comment)
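To confirm which front end a session is actually running in, a check along these lines may help. The helper name `in_rstudio` is hypothetical; it assumes RStudio sets the RSTUDIO environment variable to "1" and reports "RStudio" in .Platform$GUI.

```r
# Hypothetical helper: TRUE when the current session appears to run inside
# RStudio, FALSE for plain R (terminal, Rgui, Rscript).
in_rstudio <- function() {
  identical(Sys.getenv("RSTUDIO"), "1") ||
    identical(.Platform$GUI, "RStudio")
}

print(in_rstudio())
```

If this prints FALSE and the serialize warnings and the simplify error still appear, RStudio can be ruled out as a factor.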
