Singularity run failed on large dataset #459

Hi, Andrea,

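I used the latest version of PGGB (e25486b) with Singularity to construct a genome graph for 67 potato genomes. I partitioned the graph by chromosome and ran each job on SLURM with 2000 GB of memory. PGGB ran successfully on smaller partitions (e.g., the 834 MB chr01.67.fa.gz.c325321.community.2.fa) but failed on larger ones (e.g., the 1.9 GB chr01.67.fa.gz.c325321.community.0.fa). In the failing run, the log ends after the wfmash alignment step and the error file reports `/dev/fd/63: No such file or directory` at line 608 of the pggb script. The version, job script, error file, and log are included below.

Thank you in advance.
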
Lin

```
singularity exec $sif_dir/pggb_latest.sif pggb --version
INFO:    Convert SIF file to sandbox...
pggb e25486b
INFO:    Cleaning up image...


#!/bin/bash
#SBATCH --job-name=chr01.1
#SBATCH --partition=smp01,smp02
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --error=chr01.1.err
#SBATCH --output=chr01.1.out
#SBATCH --mem=2000g

chr='$chr'
num='$num'
sif_dir=/home/softwares/sif_dir
singularity run -B ${PWD}/01_genome:/01_genome $sif_dir/pggb_latest.sif pggb \
-i /01_genome/$chr/$chr.67.fa.gz.c325321.community.$num.fa \
-o /01_genome/$chr/$chr.67.fa.gz.c325321.community.$num.fa.out \
-s 10000 -l 50000 -p 90 -c 1 -K 19 -F 0.001 -g 30 \
-k 47 -f 0 -B 10M \
-n 67 -j 0 -e 0 -G 700,1100 -P 1,4,6,2,26,1 -O 0.001 -d 100 -Q Consensus_ \
-Y "#" --skip-viz --threads 20 --poa-threads 20


```
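A note on the `chr='$chr'` and `num='$num'` lines in the script above: inside single quotes bash keeps `$chr` as literal text, so these lines read like template placeholders that are filled in before submission (the log shows `chr01` and community `0`). The sketch below only illustrates the quoting difference and how the `-i` path resolves once both values are set. The function name `community_fasta` is hypothetical and is not part of PGGB or of the original script.

```bash
#!/bin/bash
# Hypothetical helper (not part of PGGB): build the container-side input path
# from a chromosome name and a community number, following the pattern of the
# -i argument in the job script.
community_fasta() {
    local chr="$1" num="$2"
    printf '/01_genome/%s/%s.67.fa.gz.c325321.community.%s.fa\n' "$chr" "$chr" "$num"
}

# Single quotes keep the dollar sign literal; double quotes expand the variable.
chr=chr01
literal='$chr'
expanded="$chr"
echo "single-quoted: $literal"
echo "double-quoted: $expanded"

# Input path for the failing partition reported in the log.
community_fasta chr01 0
```

If the script were submitted with the single-quoted lines unchanged, the input path would contain a literal `$chr`; the log shows a resolved path, so the substitution evidently happened before the job ran.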
Here is the error file:
```
INFO:    Convert SIF file to sandbox...
/usr/local/bin/pggb: line 608: /dev/fd/63: No such file or directory
INFO:    Cleaning up image...
```
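For context, a path of the form `/dev/fd/63` is what bash typically generates for process substitution (`<(...)` or `>(...)`). Whether line 608 of the pggb script uses process substitution is an assumption here, not something the error file confirms. The following minimal sketch, which is not taken from pggb, shows the path bash produces and checks that `/dev/fd` is usable in the current environment.

```bash
#!/bin/bash
# Process substitution makes bash pass a path such as /dev/fd/63 to the command.
# Echoing the substitution shows the path bash generated on this system.
fd_path=$(echo <(true))
echo "process substitution expands to: $fd_path"

# Check that the /dev/fd directory exists in this environment.
if [ -d /dev/fd ]; then
    echo "/dev/fd is present"
else
    echo "/dev/fd is missing"
fi

# Check that data written into a substitution can actually be read back.
readback=$(cat <(echo ok))
echo "read through process substitution: $readback"
```

Running the same checks inside the container (for example via `singularity exec $sif_dir/pggb_latest.sif bash`) on the node where the job failed might show whether `/dev/fd` behaves differently there.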
Here is the log file:
```
Starting pggb on 05-24-2025_030037

Command: /usr/local/bin/pggb -i /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa -o /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out -s 10000 -l 50000 -p 90 -c 1 -K 19 -F 0.001 -g 30 -k 47 -f 0 -B 1000M -n 67 -j 0 -e 0 -G 700,1100 -P 1,4,6,2,26,1 -O 0.001 -d 100 -Q Consensus_ -Y # --skip-viz --threads 20 --poa-threads 20

PARAMETERS

general:
  input-fasta:        /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa
  output-dir:         /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out
  temp-dir:           /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out
  resume:             false
  compress:           false
  threads:            20
  poa_threads:        20
pggb:
  version:            e25486b
wfmash:
  version:            v0.13.1-0-g042386f0
  segment-length:     10000
  block-length:       50000
  map-pct-id:         90
  n-mappings:         1
  no-splits:          false
  sparse-map:         false
  mash-kmer:          19
  mash-kmer-thres:    0.001
  hg-filter-ani-diff: 30
  exclude-delim:      #
  no-merge-segments:  false
seqwish:
  version:            v0.7.11-0-g0eb6468
  min-match-len:      47
  sparse-factor:      0
  transclose-batch:   1000000000
smoothxg:
  version:            v0.8.2-0-g6a2193d
  skip-normalization: false
  n-haplotypes:       67
  path-jump-max:      0
  edge-jump-max:      0
  poa-length-target:  700,1100
  poa-params:         1,4,6,2,26,1
  poa_padding:        0.001
  run_abpoa:          false
  run_global_poa:     false
  pad-max-depth:      100
  write-maf:          false
  consensus-spec:     false
  consensus-prefix:   Consensus_
  block-id-min:       .9000
  block-ratio-min:    0
odgi:
  version:            v0.9.2-0-gbe6a0202
  viz:                false
  layout:             false
  stats:              false
gfaffix:
  version:            v0.2.1
  reduce-redundancy:  true
vg:
  version:            v1.62.0
  deconstruct:        false
reporting:
  version:            v1.22.2
  multiqc:            false

Running pggb

[mashmap] Skipping self mappings for single file all-vs-all mapping.
[mashmap] MashMap v3.1.1
[mashmap] Reference = [/01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa]
[mashmap] Query = [/01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa]
[mashmap] Kmer size = 19
[mashmap] Sketch size = 598
[mashmap] Segment length = 10000 (read split allowed)
[mashmap] Block length min = 50000
[mashmap] Chaining gap max = 20000
[mashmap] Mappings per segment = 1
[mashmap] Percentage identity threshold = 90%
[mashmap] Skip self mappings
[mashmap] Skipping sequences containing the same prefix based on the delimiter "#"
[mashmap] Hypergeometric filter w/ delta = 0.3 and confidence 0.999
[mashmap] Mapping output file = /dev/stdout
[mashmap] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[mashmap] Execution threads  = 20
[mashmap::skch::Sketch::build] minmer windows picked from reference = 226145112
[mashmap::skch::Sketch::index] unique minmers = 28484299
[mashmap::skch::Sketch::computeFreqHist] Frequency histogram of minmer interval points = (2, 10253840) ... (44768, 1)
[mashmap::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minmers occurring >= 9814 times during lookup.
[wfmash::map] time spent computing the reference index: 124.06 sec
[mashmap::skch::Map::mapQuery] mapped  0.00% @ 0.00e+00 bp/s elapsed: 00:00:00:0
[mashmap::skch::Map::mapQuery] mapped  0.02% @ 9.59e+05 bp/s elapsed:
......
[mashmap::skch::Map::mapQuery] mapped 100.00% @ 2.97e+06 bp/s elapsed: 00:00:10:55 remain: 00:00:00:00
[mashmap::skch::Map::mapQuery] count of mapped reads = 412, reads qualified for mapping = 413, total input reads = 413, total input bp = 1948262810
[wfmash::map] time spent mapping the query: 6.56e+02 sec
[wfmash::map] mapping results saved in: /dev/stdout
wfmash -s 10000 -l 50000 -p 90 -n 1 -k 19 -H 0.001 -Y # -t 20 --tmp-base /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa --lower-triangular --hg-filter-ani-diff 30 --approx-map
8105.08s user 31.75s system 1040% cpu 782.27s total 24776372Kb max memory
[mashmap] Skipping self mappings for single file all-vs-all mapping.
[wfmash::align] Reference = [/01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa]
[wfmash::align] Query = [/01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa]
[wfmash::align] Mapping file = /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out/wfmash-bTNmfy
[wfmash::align] Alignment identity cutoff = 72.00%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent loading the reference index: 0.04 sec
[wfmash::align::computeAlignments] aligned  0.00% @ 0.00e+00 bp/s elapsed: 00:00
......
[wfmash::align::computeAlignments] aligned 100.00% @ 1.70e+05 bp/s elapsed: 00:21:21:04 remain: 00:00:00:00
[wfmash::align::computeAlignments] count of mapped reads = 413, total aligned bp = 13057406303
[wfmash::align] time spent computing the alignment: 7.69e+04 sec
[wfmash::align] alignment results saved in: /dev/stdout
wfmash -s 10000 -l 50000 -p 90 -n 1 -k 19 -H 0.001 -Y # -t 20 --tmp-base /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa --lower-triangular --hg-filter-ani-diff 30 -i /01_genome/chr01/chr01.67.fa.gz.c325321.community.0.fa.out/chr01.67.fa.gz.c325321.community.0.fa.c325321.mappings.wfmash.paf --invert-filtering
1530585.02s user 5883.30s system 1998% cpu 76865.35s total 4905340Kb max memory
```
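To put the two timing lines in the log into more readable units, here is a small conversion sketch. It assumes the `Kb` figure means units of 1024 bytes; the helper names `kb_to_gib` and `sec_to_hours` are hypothetical and exist only for this illustration.

```bash
#!/bin/bash
# Convert a "max memory" value reported in Kb to GiB (assumes 1 Kb = 1024 bytes).
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1048576 }'
}

# Convert a wall-clock time in seconds to hours.
sec_to_hours() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s / 3600 }'
}

# Values copied from the two timing lines in the log above.
echo "wfmash mapping peak memory:   $(kb_to_gib 24776372) GiB"
echo "wfmash alignment peak memory: $(kb_to_gib 4905340) GiB"
echo "wfmash alignment wall time:   $(sec_to_hours 76865.35) hours"
```

Under that assumption the peaks are roughly 23.63 GiB and 4.68 GiB, far below the 2000 GB requested, and the alignment took about 21.35 hours. This suggests, without proving it, that the wfmash steps themselves did not run out of memory; the log does not show what happened in the stages that follow.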



