Getting "[ERROR] failed to connect to the bespoke executor" on some systems #416

@j-wags

Description

@j-wags

Jeff, you may not have succeeded in reproducing Josh's failure mode, but you are getting the same message that I see when bespoke fails for me on HPC systems with busy networks. That is: "[ERROR] failed to connect to the bespoke executor - please make sure one is running and your connection settings are correct". This error appeared suddenly for me as our IT organization has been shutting down an old HPC cluster and migrating more users to the one I am using. I get failures of the same sort, caused by an inability to reach the executor through the gateway interface from a node other than the one where the executor is running. It behaves as though the gateway has been shut down. The same bespoke application runs fine for me on less busy clusters. The way you trigger this error should give a clue as to how the settings could be adjusted for more robust behavior on clusters with busy networks. Based on your understanding of the code, could you suggest settings I might try, to see whether they fix my problem?
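
To help tell a gateway that is really down apart from one that is just slow to answer on a busy network, here is a minimal sketch of a connectivity probe that could be run from a node other than the one hosting the executor. It uses only the Python standard library and does not call any bespoke API. The function name `probe_gateway`, its parameters, and the host and port in the usage note are all hypothetical and would need to be replaced with whatever the executor was actually started with.

```python
import socket
import time


def probe_gateway(host, port, attempts=5, timeout=5.0,
                  initial_delay=1.0, backoff=2.0):
    """Try to open a TCP connection to host:port, retrying with backoff.

    Returns the 1-based number of the attempt that succeeded,
    or None if every attempt failed.
    NOTE: hypothetical helper, not part of bespoke.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            # create_connection raises OSError (including timeouts) on failure
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError as exc:
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
                delay *= backoff  # wait longer before each retry
    return None
```

For example, `probe_gateway("node-running-executor", 8000)` (both values are placeholders) returning a number greater than 1 would suggest the gateway is up but slow to accept connections, whereas `None` would suggest it is unreachable from that node.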

Originally posted by @BillSwope in #414 (comment)
