
Build suggestions etc #24

@miraep8

Hello!

I believe I just successfully built the medi database (after a lot of attempts on my part!). Since you said it would be useful to have some job build parameters, I thought I would pass along the params I used for the build, as well as some other tweaks I made to improve the build process:

Downloading:

  • 72:00 time requested
  • 50 cores
  • 8 GB per core (so 400 GB in total)

Building:

  • 120:00 time requested
  • 60 cores
  • 32 GB per core (1920 GB in total)
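
For anyone translating the build numbers into a Nextflow config, a rough sketch might look like the block below. It assumes the pipeline runs with the local executor inside a single scheduler allocation, which is not stated here, and the file name `build.config` is made up for illustration.

```groovy
// build.config (hypothetical file name), passed via `nextflow run ... -c build.config`
// Assumption: local executor inside one scheduler job sized as in the "Building" list.
executor {
    name   = 'local'
    cpus   = 60          // cores requested for the build
    memory = '1920 GB'   // 60 cores x 32 GB per core
}
```

The wall-time request is left out on purpose, since it belongs to the scheduler job rather than to the local executor.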

No claims that the above are the absolute minimum required for this build! (But they were sufficient; memory in the build step in particular was problematic.)

In addition, I ran into some issues using the conda version of kraken recommended in the env setup (specifically, it would hang at the self_classify stage). I found that manually installing kraken (and of course bracken) into the env worked better than relying on the conda version.

I also think it is possible to speed things up by parallelizing the self_classify step!

Lastly, just wanted to point out that (as of right now) it's essential that you build the medi db inside the medi repo. This is because the publishDir in database.nf switches back and forth between "$baseDir/data/" and "${launchDir}/data/", so unless baseDir = launchDir, sad times. Would be an easy fix though!
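
To illustrate the kind of fix I mean, here is a minimal sketch of a process that publishes relative to one base directory only. The process name, output file, and script body are placeholders rather than anything from database.nf, and choosing `launchDir` over `baseDir` is my assumption; the point is just to use the same one everywhere.

```groovy
nextflow.enable.dsl = 2

// Placeholder process; not an actual step from database.nf.
process example_step {
    // Use a single base variable consistently so every output lands in the same data/ folder
    publishDir "${launchDir}/data", mode: 'copy'

    output:
    path 'example.txt'

    script:
    """
    echo "placeholder" > example.txt
    """
}

workflow {
    example_step()
}
```

With every publishDir written this way, the outputs should end up under data/ in whatever directory the run is launched from, so the build would no longer need to happen inside the repo.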

Best + excited to get started trying out Medi! 😃

Mirae

Labels: documentation