[edison-jobs] Inconsistencies in job locks #367

@mgeissen

We found some inconsistencies in the lock handling of edison-jobs.

Sometimes we receive the following log messages:

  • Clear Lock of Job JobName. Job stopped already
  • Clear Lock of Job JobName. JobID does not exist

Both messages come from this method in the JobService:

  /**
   * Checks all run locks and releases the lock, if the job is stopped.
   *
   * TODO: This method should never do something, otherwise the is a bug in the lock handling.
   * TODO: Check Log files + Remove
   */
  private void clearRunLocks() {
      jobMetaService.runningJobs().forEach((RunningJob runningJob) -> {
          final Optional<JobInfo> jobInfoOptional = jobRepository.findOne(runningJob.jobId);
          if (jobInfoOptional.isPresent() && jobInfoOptional.get().isStopped()) {
              jobMetaService.releaseRunLock(runningJob.jobType);
              LOG.error("Clear Lock of Job {}. Job stopped already.", runningJob.jobType);
          } else if (!jobInfoOptional.isPresent()){
              jobMetaService.releaseRunLock(runningJob.jobType);
              LOG.error("Clear Lock of Job {}. JobID does not exist", runningJob.jobType);
          }
      });
  }

The method's TODO comment says it should never have to do anything, because any lock it clears points to a bug in the lock handling. We currently have no idea how this can happen. We have observed it with both the DynamoDB and the MongoDB implementations.
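
To make the two cases concrete, here is a minimal, self-contained sketch of the check that `clearRunLocks()` performs, using in-memory maps in place of the real `JobMetaService` and `JobRepository`. The class `StaleLockSketch`, the `LockState` enum, and the job types and IDs are hypothetical names chosen for illustration; they are not part of edison-jobs. The sketch only shows which states trigger each message. It does not explain how a lock ends up in that state.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the check in clearRunLocks(); not edison-jobs code.
final class StaleLockSketch {

    /** Outcome of checking one run lock, mirroring the branches of clearRunLocks(). */
    enum LockState { CONSISTENT, JOB_STOPPED_ALREADY, JOB_ID_DOES_NOT_EXIST }

    /**
     * Classifies a single run lock.
     *
     * @param jobStopped empty if no job exists for the locked job ID,
     *                   otherwise whether that job is already stopped
     */
    static LockState classify(Optional<Boolean> jobStopped) {
        if (!jobStopped.isPresent()) {
            // Corresponds to "Clear Lock of Job {}. JobID does not exist"
            return LockState.JOB_ID_DOES_NOT_EXIST;
        }
        // A stopped job that still holds a lock corresponds to
        // "Clear Lock of Job {}. Job stopped already."
        return jobStopped.get() ? LockState.JOB_STOPPED_ALREADY : LockState.CONSISTENT;
    }

    /**
     * Checks every run lock against the known jobs.
     *
     * @param runLocks job type -> job ID currently holding the run lock
     * @param jobs     job ID -> true if the job is stopped, false if still running
     */
    static Map<String, LockState> checkLocks(Map<String, String> runLocks,
                                             Map<String, Boolean> jobs) {
        Map<String, LockState> result = new LinkedHashMap<>();
        runLocks.forEach((jobType, jobId) ->
                result.put(jobType, classify(Optional.ofNullable(jobs.get(jobId)))));
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> runLocks = new LinkedHashMap<>();
        runLocks.put("ImportJob", "id-1");   // lock held by a running job: fine
        runLocks.put("ExportJob", "id-2");   // lock held by a stopped job: stale
        runLocks.put("CleanupJob", "id-3");  // lock held by an unknown job ID: stale

        Map<String, Boolean> jobs = new LinkedHashMap<>();
        jobs.put("id-1", false);
        jobs.put("id-2", true);

        checkLocks(runLocks, jobs).forEach((jobType, state) ->
                System.out.println(jobType + " -> " + state));
    }
}
```

Any result other than `CONSISTENT` means the run lock outlived the job it belongs to, which is exactly the situation the TODO says should never occur.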
