
[Fixes #13331] Celery-related error handling using structured error data instead of exceptions during harvesting #13332


Merged
merged 8 commits into from
Jul 8, 2025


Gpetrak
Contributor

@Gpetrak Gpetrak commented Jul 3, 2025

This PR proposes a different approach to error handling in the harvesting app (tasks.py). Rather than relying solely on the on_error method in the finalizer, we can handle errors directly within the _harvest_resource task and return structured error information, for example:

return {
    "resource_id": harvestable_resource_id,
    "status": "failed",
    "error": str(exc),
}

in case of an error, or otherwise:

return {
    "resource_id": harvestable_resource_id,
    "status": "success",
    "details": harvestable_resource.last_harvesting_message,
}
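
As a rough illustration of how a task body could produce these two shapes, here is a minimal sketch. It is not the code from this PR: `harvest_resource_safely` and the `harvest_one` callable are hypothetical stand-ins for the real `_harvest_resource` Celery task and its harvesting logic, and Celery itself is left out so the snippet runs on its own.

```python
from typing import Any, Callable, Dict


def harvest_resource_safely(
    harvestable_resource_id: int,
    harvest_one: Callable[[int], str],
) -> Dict[str, Any]:
    """Run one harvesting step and always return a result dict.

    `harvest_one` is a hypothetical stand-in for the real harvesting logic:
    it returns a status message on success and raises on failure.
    """
    try:
        message = harvest_one(harvestable_resource_id)
    except Exception as exc:  # broad on purpose: nothing should escape the task
        return {
            "resource_id": harvestable_resource_id,
            "status": "failed",
            "error": str(exc),
        }
    return {
        "resource_id": harvestable_resource_id,
        "status": "success",
        "details": message,
    }
```

Because the function catches the exception and returns normally, a task runner would see a completed task in both cases; the outcome is carried only by the `status` field.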

With this approach, no exceptions are raised from Celery’s perspective. Instead, each task returns informative result data, which the finalizer evaluates to determine overall success or failure. This makes error handling clearer and more predictable, and makes the harvesting process more transparent. With this method the on_error method is no longer strictly necessary, but we can leave it as it is for extra safety.
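
To make the finalizer side concrete, the following sketch shows one way a list of such result dicts could be evaluated. The helper name `summarize_results` and the shape of the summary are assumptions for illustration only; the actual finalizer in tasks.py may aggregate results differently.

```python
from typing import Any, Dict, List


def summarize_results(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Split per-resource results into successes and failures.

    Hypothetical helper: any result whose status is not "success"
    (including a malformed dict with no status) is counted as failed.
    """
    failed = [r for r in results if r.get("status") != "success"]
    return {
        "total": len(results),
        "succeeded": len(results) - len(failed),
        "failed_ids": [r.get("resource_id") for r in failed],
        "overall": "failed" if failed else "success",
    }
```

Treating every non-"success" status as a failure is a deliberately conservative choice in this sketch, so that an unexpected result shape cannot be mistaken for a successful harvest.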

Related issue: #13331

Checklist

Reviewing is a process done by project maintainers, mostly on a volunteer basis. We try to keep the overhead as small as possible and appreciate it if you help us do so by completing the following items. Feel free to ask in a comment if you have trouble with any of them.

For all pull requests:

  • Confirm you have read the contribution guidelines
  • You have sent a Contribution Licence Agreement (CLA) as necessary (not required for small changes, e.g., fixing typos in the documentation)
  • Make sure the first PR targets the master branch; any backports will be managed later. This can be ignored if the PR is fixing an issue that only happens in a specific branch, but not in newer ones.

The following are required only for core and extension modules (they are welcomed, but not required, for contrib modules):

  • There is a ticket in https://github.com/GeoNode/geonode/issues describing the issue/improvement/feature (a notable exception is changes not visible to end users)
  • The issue connected to the PR must have Labels and Milestone assigned
  • PR for bug fixes and small new features are presented as a single commit
  • Commit message must be in the form "[Fixes #<issue_number>] Title of the Issue"
  • PR title must be in the form "[Fixes #<issue_number>] Title of the PR"
  • New unit tests have been added covering the changes, unless there is an explanation on why the tests are not necessary/implemented
  • This PR passes all existing unit tests (test results will be reported by travis-ci after opening this PR)
  • This PR passes the QA checks: black geonode && flake8 geonode
  • Commits changing the settings, UI, existing user workflows, or adding new functionality, need to include documentation updates
  • Commits adding new texts do use gettext and have updated .po / .mo files (without location infos)

Submitting the PR does not require you to check all items, but by the time it gets merged, they should be either satisfied or inapplicable.

@cla-bot cla-bot bot added the cla-signed CLA Bot: community license agreement signed label Jul 3, 2025
@Gpetrak Gpetrak self-assigned this Jul 3, 2025
@Gpetrak Gpetrak added this to the 5.0.0 milestone Jul 3, 2025
@Gpetrak Gpetrak linked an issue Jul 3, 2025 that may be closed by this pull request

codecov bot commented Jul 7, 2025

Codecov Report

Attention: Patch coverage is 85.20179% with 33 lines in your changes missing coverage. Please review.

Project coverage is 72.97%. Comparing base (5c6672f) to head (b3442f6).
Report is 6 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13332      +/-   ##
==========================================
+ Coverage   72.92%   72.97%   +0.05%     
==========================================
  Files         916      917       +1     
  Lines       52373    52539     +166     
  Branches     6001     6005       +4     
==========================================
+ Hits        38193    38341     +148     
- Misses      12622    12639      +17     
- Partials     1558     1559       +1     

@Gpetrak Gpetrak marked this pull request as ready for review July 7, 2025 10:09
Contributor

giohappy commented Jul 7, 2025

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The code changes introduce a different approach in error handling in the harvesting app. Instead of relying solely on the on_error method in the finalizer, the code handles errors directly within the _harvest_resource task and returns structured error information. The review includes suggestions to improve the robustness of the error handling logic and enhance code clarity.

@giohappy giohappy merged commit ebacb5c into master Jul 8, 2025
17 checks passed
@giohappy giohappy deleted the ISSUE_13331 branch July 8, 2025 13:58
Contributor

github-actions bot commented Jul 8, 2025

The backport to 4.4.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-4.4.x 4.4.x
# Navigate to the new working tree
cd .worktrees/backport-4.4.x
# Create a new branch
git switch --create backport-13332-to-4.4.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ebacb5cd18857f7eb0f6b8ef0d171e4729ab142f
# Push it to GitHub
git push --set-upstream origin backport-13332-to-4.4.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-4.4.x

Then, create a pull request where the base branch is 4.4.x and the compare/head branch is backport-13332-to-4.4.x.

Labels
backport 4.4.x cla-signed CLA Bot: community license agreement signed enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recover harvester sessions status when some tasks fail
3 participants