Eliminate config options for approximate input file size

Asking the user to specify the input size is error-prone and inconvenient.

Wherever we download a file to local disk in order to upload it to the job store, we should switch to Toil's import functionality instead. It streams the data rather than staging it on local disk, which eliminates the need to estimate a disk requirement for the import job. As imports are currently implemented in Toil, this approach might be less reliable and slower than s3am, but we can address those issues in Toil if and when they occur.

What do we do in cases where files are processed immediately after being downloaded from an external location and the job store upload is skipped? One option is not to skip the upload. Another is to determine the file size up front. For HTTP this can be done with a HEAD request; for S3 there is an API call, probably also a HEAD request under the hood.
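
A minimal sketch of the HTTP case, using only the Python standard library. The function name `remote_file_size` and the `timeout` parameter are made up for illustration and are not part of Toil or of this repository. It assumes the server answers HEAD requests and reports a `Content-Length` header, which is not guaranteed (for example with chunked responses), so callers would still need a fallback.

```python
import urllib.request
from typing import Optional


def remote_file_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the size in bytes that the server reports for `url`.

    Hypothetical helper: sends a HEAD request, so no body is transferred,
    and returns None when the server does not send Content-Length.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

The returned value could then feed the disk requirement of the job that downloads and processes the file, instead of a user-supplied estimate. When it is `None`, falling back to the job store upload path is probably the safer choice.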

Eliminate config options for approximate input file size #427