Null PID reference #290

@kevinkovalchik

I believe this is related to #201.

I am running Bpipe 0.9.11 in Apptainer. @hh1985 was using Docker in #201, so this may be related to containerization, though I am not sure.

I don't know if it is related, but I am running multiple instances of Bpipe concurrently. I have tried to isolate them by temporarily setting $HOME to a unique temporary directory for each instance (since use of $HOME is hardcoded into Bpipe in at least one place, if I recall correctly).
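For reference, the per-instance $HOME isolation described above might look roughly like the following Python sketch. The helper name `isolated_env` and the directory prefix are made up for illustration; nothing here is taken from Bpipe itself.

```python
import os
import tempfile


def isolated_env(prefix: str = "bpipe-home-") -> dict:
    """Return a copy of the current environment with HOME set to a fresh temp dir.

    Hypothetical helper: the prefix is arbitrary and only illustrates the idea
    of giving each concurrent instance its own HOME.
    """
    env = dict(os.environ)
    # mkdtemp creates a new, uniquely named directory on every call
    env["HOME"] = tempfile.mkdtemp(prefix=prefix)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching each instance.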

It seems that $BPIPE_PID sometimes ends up null or an empty string. I don't know whether this is just an I/O error reading the temporary PID file or whether there is another issue behind it. When that happens, every path that should point to a file named $BPIPE_PID instead points to the parent directory, which would produce the error seen in #201 (along with whatever other problems a null PID causes).
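To illustrate why an empty PID collapses these paths, here is a minimal Python sketch. It only mimics generic path joining and is not Bpipe's actual code; the helper `pid_scoped_path` and the directory `.bpipe/example` are hypothetical, while `.bpipe/logs` and the `.bpipe.log` suffix come from the log file names in this report.

```python
import posixpath


def pid_scoped_path(parent_dir: str, pid: str, suffix: str = "") -> str:
    """Join a parent directory with a file name derived from the PID.

    Hypothetical helper for illustration only; not Bpipe's implementation.
    """
    return posixpath.normpath(posixpath.join(parent_dir, pid + suffix))


# With a real PID the log path is unique to the run
good_log = pid_scoped_path(".bpipe/logs", "3648408", ".bpipe.log")

# With an empty PID the file name degenerates to just the suffix
bad_log = pid_scoped_path(".bpipe/logs", "", ".bpipe.log")

# A file named only by the PID collapses to its parent directory
collapsed = pid_scoped_path(".bpipe/example", "")
```

In this sketch `bad_log` is `.bpipe/logs/.bpipe.log` and `collapsed` is `.bpipe/example`, so anything that expects a regular file at that location would find a directory instead.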

Below is the head of a log file of a job which suffered from this issue. Note that the filename of the log is .bpipe/logs/.bpipe.log. It should be .bpipe/logs/$BPIPE_PID.bpipe.log, so the PID is null. This is also supported by the contents of the log:

bpipe.Runner	[1]	INFO	|11:08:23 Starting 
bpipe.Runner	[1]	INFO	|11:08:24 OS: Linux (5.15.0-75-generic) Java: 11.0.23 Vendor: Debian 
bpipe.Runner	[1]	INFO	|11:08:24 Initializing plugins ... 
bpipe.Config	[1]	INFO	|11:08:24 No plugins directory found: /output/.bpipe/plugins 
bpipe.Runner	[1]	INFO	|11:08:26 =================== GUID=b35ee14be2f543845b570c1fa5de6d85742cbe76 PID= () ==============

There is no PID in the log, and the whole job ends up failing.

When the failed job is rerun, it (usually) gets a PID and proceeds as expected. Below is the head of the log after restarting, .bpipe/logs/3648408.bpipe.log:

bpipe.Runner	[1]	INFO	|11:10:11 Starting 
bpipe.Runner	[1]	INFO	|11:10:11 OS: Linux (5.15.0-75-generic) Java: 11.0.23 Vendor: Debian 
bpipe.Runner	[1]	INFO	|11:10:11 Initializing plugins ... 
bpipe.Config	[1]	INFO	|11:10:11 No plugins directory found: /output/.bpipe/plugins 
bpipe.Runner	[1]	INFO	|11:10:11 =================== GUID=fa5fa6ae3cbc25e0c44b5d8850817a28d18900cf PID=3648408 (3648408) ============== 

This time there is a PID.

My solution thus far has been to retry each job several times.
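The retry workaround can be sketched in Python as below. This is a rough illustration rather than my exact script: it assumes the failed run exits with a non-zero status, and the function name `run_with_retries`, the attempt count, and the delay are all arbitrary choices.

```python
import subprocess
import time
from typing import Sequence


def run_with_retries(cmd: Sequence[str], attempts: int = 3, delay_s: float = 1.0) -> int:
    """Run cmd until it exits with status 0; return the attempt number that succeeded.

    Assumption: a run that lost its PID fails with a non-zero exit code.
    Raises RuntimeError if every attempt fails.
    """
    last_code = None
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        last_code = result.returncode
        if attempt < attempts:
            # Brief pause before trying again
            time.sleep(delay_s)
    raise RuntimeError(
        f"command failed after {attempts} attempts (last exit code {last_code})"
    )
```

It would be called with the pipeline command as a list, for example `run_with_retries(["bpipe", "run", "pipeline.groovy"])`, where `pipeline.groovy` is a placeholder file name.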
