Only getting 1 core hash in pangenome #20

Hello, 

I only get a very small number of core hashes (1-5) for any pangenome I build.
I have tried raising and lowering the thresholds in the script, but this does not help much.

All sketches are here: `/group/ctbrowngrp2/amhorst/2025-pangenome/results/pangenome/l_amylovorus/test_sourmash_param`

I downloaded all GTDB reference strains for one species and sketched each one individually (scaled=1, k=21,31).
Then I concatenated all signatures for that species:
`sourmash sig cat *.zip -o ../l_amylovorus.zip`

Downsample, because `pangenome_merge` ignores the scaled value (and would otherwise make the pangenome sketch at scaled=1):
`sourmash sig downsample l_amylovorus.zip -k 21 --scaled 100 -o l_amylovorus.21.100.zip`

Pangenome merge:
`sourmash scripts pangenome_merge l_amylovorus.21.100.zip -k 21 -o l_amylovorus.pang.zip --scaled 100`

Make the ranktable:
```
sourmash scripts pangenome_ranktable l_amylovorus.pang.zip -o l_amylovorus.pang.original_script.csv \
  -k 21 --scaled 100
```
I have one ranktable made with the original script, which gives 1 core hash.
When I set `central_core_threshold = 0.50`, I get 1 core hash.
When I set `central_core_threshold = 0.15`, I get about 9600 core hashes.
However, in the original script the threshold for shell is also 0.15, and when I count the shell hashes in the ranktable made with the original script, I get about 15000 hashes classified as shell.

How are these categories defined? They are based on the number of genomes a hash is found in, right? Any ideas?
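
To make the question concrete, here is a minimal sketch of how I assume the classification works. This is my own reading, not the plugin's actual code: the function name `classify_hashes`, the `cloud` label, and the toy hash values are made up for illustration, and only the two threshold values (0.50 and 0.15) come from my tests above.

```python
from collections import Counter


def classify_hashes(genome_hashes, core_threshold=0.50, shell_threshold=0.15):
    """Assign each hash a category from the fraction of genomes containing it.

    Assumption: a hash is "core" if it occurs in at least `core_threshold`
    of the genomes, "shell" if it occurs in at least `shell_threshold`,
    and "cloud" otherwise. The real script may define this differently.
    """
    n_genomes = len(genome_hashes)
    counts = Counter()
    for hashes in genome_hashes:
        # Count each hash at most once per genome.
        counts.update(set(hashes))

    ranks = {}
    for hashval, n_present in counts.items():
        fraction = n_present / n_genomes
        if fraction >= core_threshold:
            ranks[hashval] = "core"
        elif fraction >= shell_threshold:
            ranks[hashval] = "shell"
        else:
            ranks[hashval] = "cloud"
    return ranks


# Toy example: four genomes, hashes written as small integers.
genomes = [{1, 2, 3}, {1, 2, 4}, {1, 5, 6}, {1, 2, 7}]
ranks = classify_hashes(genomes)
lowered = classify_hashes(genomes, core_threshold=0.15)
```

If this reading were right, lowering the core threshold to 0.15 should pull in every hash that was previously shell or core, so I would expect at least as many core hashes as the roughly 15000 shell hashes from the original script, not about 9600. That mismatch is what confuses me.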
