Validity of run-time comparisons  #52

@rsuchecki

Description

⚠️

Run times may not be comparable between tools/runs unless we ensure that the underlying computational conditions are comparable. For that, the evaluation would probably have to be executed on a dedicated server, with each task having exclusive access to that machine and input/output files placed on local storage (e.g. using Nextflow's `scratch true`).
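A minimal sketch of what this could look like at the process level (process name and label are hypothetical; `scratch true` stages work in node-local storage as per the Nextflow directive):

```groovy
// Hypothetical benchmarked process: run in local scratch space so that
// shared-filesystem contention does not distort measured run times.
process MAPPING {
    scratch true          // stage inputs/outputs on node-local storage
    cpus 8                // fixed, explicit resource request
    memory '16 GB'

    input:
    path reads

    script:
    """
    mapper --threads ${task.cpus} ${reads} > out.bam
    """
}
```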

  1. Cluster execution could be acceptable if we can ensure:
  • homogeneity of the nodes (explicit partition spec?)
  • exclusive use of nodes
  • use of local scratch space
  2. Cloud (awsbatch) execution could be acceptable if we can ensure:
  • homogeneity of the nodes
  • exclusive use of nodes
  • use of local scratch space

In addition, we must capture more of the task information via

```groovy
trace.fields = 'task_id,name,status,exit,realtime,%cpu,rss'
```
  • which should also include the requested resources `cpus`, `memory`, `time` - more here
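Putting the two together, the trace block in `nextflow.config` might look like this (field names follow Nextflow's trace report documentation; the exact selection is an assumption):

```groovy
// nextflow.config sketch: extend the trace report with the requested
// resources so observed run times can be checked against allocations.
trace {
    enabled = true
    fields = 'task_id,name,status,exit,realtime,%cpu,rss,cpus,memory,time'
}
```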

The CPU details can easily be picked up in the mapping process,
e.g. `beforeScript 'cat /proc/cpuinfo > cpuinfo'`, which can be parsed downstream.
This is of limited value on its own for serious speed benchmarking,
but may be useful for the indicative use of speed in reports.
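The downstream parse could be as simple as pulling the CPU model and core count out of the captured file, e.g. to flag heterogeneous hardware in a report. A sketch (the fabricated `cpuinfo` stands in for the file produced by the `beforeScript` above):

```shell
# For illustration, fabricate a two-core cpuinfo; in the pipeline this file
# comes from the beforeScript 'cat /proc/cpuinfo > cpuinfo' capture.
printf 'processor\t: 0\nmodel name\t: Example CPU @ 2.40GHz\nprocessor\t: 1\nmodel name\t: Example CPU @ 2.40GHz\n' > cpuinfo

# Extract the first CPU model string and count the logical cores.
model=$(grep -m1 'model name' cpuinfo | cut -d: -f2- | sed 's/^[[:space:]]*//')
cores=$(grep -c '^processor' cpuinfo)
echo "model=$model cores=$cores"
```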
