Unknown or uninitialized column: worker after halting a branched target #436

@sigmafelix

Description

@kyle-messier

Symptom

In most containerized pipeline runs, when a user halts or suspends a branched target, all branched targets fail with the identical error message "unknown or uninitialised column: worker". The message suggests that the specified worker (resource) is not defined in the controller group, which is not true: _targets.R includes the correct worker names, as used in the pipeline, in the crew controller group.


Findings

  • The failure is observed only in branched targets, regardless of which resource is set for them. All non-branched targets with a single object completed as expected.
  • The error is reproduced both in SLURM job submissions and in interactive runs on a local workstation, so we could focus on solutions at the targets or crew level.
  • The controller group suddenly lost all its workers after the suspension. When the pipeline is rerun (partially or entirely) with a target that retrieves the controller names, that target stores NULL:
# In a target:
cntrlrs <- targets::tar_option_get("controller")
mycntrlrs <- names(cntrlrs$private$.controllers)
mycntrlrs
# NULL
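
One cause worth ruling out is the accessor itself. Private members of an R6 object are generally not reachable through `$private` from outside the object, so `names(obj$private$.controllers)` can return NULL even when the controllers are still present. The sketch below shows this with a toy R6 class. `ToyGroup` and its fields are hypothetical stand-ins, not crew's own class, and the R6 package is assumed to be installed.

```r
library(R6)

# Hypothetical stand-in for a controller group; this is NOT crew's class.
ToyGroup <- R6::R6Class(
  "ToyGroup",
  public = list(
    initialize = function(controllers) {
      private$.controllers <- controllers
    }
  ),
  active = list(
    # Read-only public accessor for the private list.
    controllers = function() private$.controllers
  ),
  private = list(.controllers = NULL)
)

grp <- ToyGroup$new(list(cpu = "a", gpu = "b"))

# From outside the object, `$private` is not a binding, so this is NULL
# even though the object still holds two controllers.
from_private <- names(grp$private$.controllers)

# The public accessor returns the names.
from_public <- names(grp$controllers)

print(from_private)
print(from_public)
```

If the installed crew version exposes a public `controllers` field on controller groups (an assumption to verify against its documentation), comparing `names(cntrlrs$controllers)` with the result above would show whether the workers are really lost or only unreachable through `$private`.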

What do we need to do?

First, I will check whether all targets settings are lost after the first suspension or failure in branched targets. Afterwards, once we have ruled out other potential causes of this error, we might consider reporting the issue to the crew and targets developers.
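
The first check could be sketched as a small helper that records, for a few option names, whether the value is NULL and what class it has. Calling it in a target before and after a suspension would show which settings change. `snapshot_options` and its `getter` argument are hypothetical names introduced here; the default getter assumes the targets package is installed and that the listed option names are the ones of interest.

```r
# Hypothetical helper: summarise selected targets options so that two
# snapshots (before and after a suspension) can be compared.
snapshot_options <- function(
  option_names = c("controller", "resources", "error"),
  getter = targets::tar_option_get  # evaluated only if no getter is supplied
) {
  out <- lapply(option_names, function(nm) {
    value <- getter(nm)
    list(is_null = is.null(value), class = class(value))
  })
  stats::setNames(out, option_names)
}

# Compare two snapshots and return the names of options that differ.
changed_options <- function(before, after) {
  nms <- union(names(before), names(after))
  nms[!vapply(nms, function(nm) identical(before[[nm]], after[[nm]]), logical(1))]
}
```

Inside a target, `snapshot_options()` with no arguments would read the live options; the two snapshots can then be passed to `changed_options()` to list what was lost.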
