Writing GATK Tools that use Python

Under construction.

General Guidelines

Some GATK tools depend on the use of Python for machine learning tasks. Such tools must have a Java front-end that:

uses standard GATK arguments
handles reading/writing of user inputs and final outputs to ensure GCS support/consistent authentication
handles temporary file and resource management
uses Python only when necessary, as a computational kernel
documents all dependencies
minimizes the amount of code written in Python

Additionally, tool authors should:

declare all dependencies in the Conda environment definition file gatkcondaenv.yml
not depend on package versions that have Linux- or Mac-specific dependencies
prefer single line commands embedded in Java over multiple, serial commands
write Python errors to stderr
raise exceptions in Python for error conditions
ensure that program correctness does not rely on consumption of Python stdout
logging: TBD
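
The Python-side guidelines above can be sketched as a small kernel script. This is a minimal illustration, not code from GATK: the function names (compute_scores, main), the input format (one number per line), and the command-line arguments are all hypothetical. It shows results being written to a file supplied by the caller rather than to stdout, error conditions raised as exceptions, and error messages written to stderr.

```python
import sys


def compute_scores(values):
    """Hypothetical kernel: scale each value by the largest absolute value."""
    if not values:
        # Raise an exception for error conditions instead of printing and continuing.
        raise ValueError("no input values were provided")
    peak = max(abs(v) for v in values)
    if peak == 0:
        raise ValueError("all input values are zero; cannot normalize")
    return [v / peak for v in values]


def main(argv):
    """Read numbers from argv[1] and write normalized scores to argv[2]."""
    if len(argv) != 3:
        print("usage: kernel.py INPUT OUTPUT", file=sys.stderr)
        return 2
    input_path, output_path = argv[1], argv[2]
    try:
        with open(input_path) as handle:
            values = [float(line) for line in handle if line.strip()]
        scores = compute_scores(values)
        # Results go to a file chosen by the caller, never to stdout,
        # so correctness does not depend on anyone consuming stdout.
        with open(output_path, "w") as handle:
            for score in scores:
                handle.write(f"{score}\n")
    except (OSError, ValueError) as error:
        # Report the failure on stderr and signal it with a nonzero exit code.
        print(f"error: {error}", file=sys.stderr)
        return 1
    return 0
```

A script built this way would end by calling sys.exit(main(sys.argv)), so that the Java front end can detect failure from the exit code and surface the stderr text to the user.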

Conda Environment

GATK relies on a Conda environment to establish the correct version of Python and underlying required dependencies. This environment is defined declaratively in the file gatkcondaenv.yml, and shared by all GATK Python tools and peripheral code. Removing or changing the version of a dependency in this file should be done with care, and by consensus with all teams that are dependent on that package.
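
As one way to guard against accidental drift from the shared environment, a Python component could check at startup that the packages it needs are installed at the expected versions. The sketch below is an illustration under stated assumptions, not part of GATK: the EXPECTED pins are hypothetical placeholders rather than the contents of gatkcondaenv.yml, and the helper name find_mismatches is invented here. It relies only on the standard-library importlib.metadata module.

```python
from importlib import metadata

# Hypothetical pins for illustration only; the real pins live in gatkcondaenv.yml.
EXPECTED = {"example-package": "1.2.3"}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # Package is not installed in the active environment.
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

An empty result from find_mismatches(EXPECTED) would indicate that the active environment matches the pins; anything else could be raised as an exception, in keeping with the guidelines above.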

Executors

There are two methods for integrating Python with a Java front end: PythonScriptExecutor and StreamingPythonScriptExecutor. PythonScriptExecutor is an easy-to-use method for synchronously executing a single Python command, script, or module. StreamingPythonScriptExecutor employs a more complex keep-alive model that allows execution of multiple commands, asynchronous commands, and data transfer through named pipes.
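
To make the distinction concrete, the following plain-Python sketch contrasts the two execution models using the standard subprocess module. It illustrates the concepts only and is not the GATK executors or their API: run_once, KeepAliveSession, and the ack/nack line protocol are hypothetical names invented here, and the sketch uses ordinary stdin/stdout pipes where the streaming executor is described as using named pipes.

```python
import subprocess
import sys

# Code run by the long-lived child: execute one statement per input line
# in a shared namespace and acknowledge each one.
SERVER = """
import sys
ack = sys.stdout
sys.stdout = sys.stderr  # keep command output off the acknowledgement channel
namespace = {}
for line in iter(sys.stdin.readline, ""):
    try:
        exec(line, namespace)
        print("ack", file=ack, flush=True)
    except Exception as error:
        print(f"nack {error}", file=ack, flush=True)
"""


def run_once(code):
    """One-shot model: start Python, run a single command, wait for it to exit."""
    result = subprocess.run(
        [sys.executable, "-c", code], stderr=subprocess.PIPE, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(f"Python command failed: {result.stderr.strip()}")


class KeepAliveSession:
    """Keep-alive model: one Python process that serves many commands in turn."""

    def __init__(self):
        self.process = subprocess.Popen(
            [sys.executable, "-c", SERVER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def send(self, command):
        """Send a single-line statement and block until it is acknowledged."""
        self.process.stdin.write(command + "\n")
        self.process.stdin.flush()
        reply = self.process.stdout.readline().strip()
        if reply != "ack":
            raise RuntimeError(f"Python command failed: {reply}")

    def close(self):
        """Closing stdin ends the child's read loop; then wait for it to exit."""
        self.process.stdin.close()
        self.process.stdout.close()
        self.process.wait()
```

In the one-shot model every call pays interpreter start-up cost and shares no state with the next call. In the keep-alive model, a variable assigned by one send call is still available to the next, which is what makes multiple dependent commands practical.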

PythonScriptExecutor

Under construction.

StreamingPythonScriptExecutor

Under construction.
