@Alxiice Alxiice commented Oct 13, 2025

Description

The goal of this PR is to change the chunk system so that it works with graphs whose nodes have no specific size set.

Changes

Compute system

  • New computation levels have been added:
    • extreme, which targets higher specs than intensive does
    • script, a mode for simple single-process jobs. It targets specific machines on the farm that can run multiple jobs in parallel, so simple jobs should be processed faster
  • Delayed chunk creation
    • node
      • A _chunksCreated param has been added to check if the chunks have been correctly initialized
      • By default, resetChunks is now used when loading Meshroom. It creates a list with a single chunk, and chunks can then be created at any moment with _createChunks.
      • A new NodeStatusData has been added. It tracks a nodeStatus file, which is similar to the chunk status files but is specific to the node. This node status is used to get the cached range parametrization of the node, so that it can be retrieved when possible with _createChunksFromCache.
    • taskManager
      • The TaskThread and TaskManager have been modified to launch the chunk creation when the node compute starts
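
The delayed creation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Meshroom's actual code: `SketchNode`, `Range`, the JSON layout of the status file and all method signatures are hypothetical; only the names `_chunksCreated`, `resetChunks`, `_createChunks` and `_createChunksFromCache` come from the description.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class Range:
    """Hypothetical chunk range: block number `iteration` of `blockSize` items."""
    iteration: int = 0
    blockSize: int = 0


@dataclass
class SketchNode:
    """Illustrative stand-in for a node; NOT Meshroom's actual Node class."""
    statusFile: str  # hypothetical path of the node status file
    chunks: list = field(default_factory=list)
    _chunksCreated: bool = False

    def resetChunks(self):
        # On load the size may be unknown: keep a single placeholder chunk.
        self.chunks = [Range()]
        self._chunksCreated = False

    def _createChunks(self, size, blockSize):
        # Called once the size is known, e.g. when the node compute starts.
        nbBlocks = max(1, -(-size // blockSize))  # ceiling division
        self.chunks = [Range(i, blockSize) for i in range(nbBlocks)]
        self._chunksCreated = True
        # Cache the range parametrization in the node status file.
        with open(self.statusFile, "w") as f:
            json.dump({"nbBlocks": nbBlocks, "blockSize": blockSize}, f)

    def _createChunksFromCache(self):
        # Restore the chunks from the cached parametrization, when available.
        if not os.path.exists(self.statusFile):
            return False
        with open(self.statusFile) as f:
            cached = json.load(f)
        self.chunks = [Range(i, cached["blockSize"])
                       for i in range(cached["nbBlocks"])]
        self._chunksCreated = True
        return True


statusPath = os.path.join(tempfile.mkdtemp(), "nodeStatus")
node = SketchNode(statusPath)
node.resetChunks()                         # single placeholder chunk
node._createChunks(size=25, blockSize=10)  # size now known: 3 chunks

reloaded = SketchNode(statusPath)          # e.g. the same node after a reload
reloaded.resetChunks()
reloaded._createChunksFromCache()          # chunks restored from the status file
```

The point of the design is that the chunk list is always valid (one placeholder chunk) before the size is known, and the cached parametrization avoids recomputing it on reload.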

Submitters

  • Submitters have been moved to another package (TBD where...)
  • The BaseSubmitter API has been updated to allow more flexibility. A new BaseSubmittedJob exists and is used to track jobs that have been created. The goal is to use it as an interface to call actions on the job (stop/pause/restart...)
  • A new bin/meshroom_createChunks script has been created. It handles the chunk creation and the spooling of additional chunk tasks:
    • Calls the chunk creation
    • Checks if additional tasks can be spooled
    • If so, queues tasks that will compute the chunks
    • If not, executes the chunks serially in the current process
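
The four steps of the script can be sketched as below. This is only an illustration of the control flow: the function name, its parameters and the way the "can we spool?" check is modelled (a spooling callback being provided or not) are assumptions, not the actual bin/meshroom_createChunks implementation.

```python
from typing import Callable, List, Optional


def createChunksAndCompute(
    createChunks: Callable[[], List[int]],
    computeChunk: Callable[[int], None],
    spoolTask: Optional[Callable[[int], None]] = None,
) -> str:
    """Hypothetical sketch of the flow; every name here is an assumption."""
    chunks = createChunks()        # 1. call the chunk creation
    if spoolTask is not None:      # 2. can additional tasks be spooled?
        for chunk in chunks:       # 3. if so, queue one task per chunk
            spoolTask(chunk)
        return "spooled"
    for chunk in chunks:           # 4. if not, compute serially in this process
        computeChunk(chunk)
    return "serial"


spooled, computed = [], []
# A spooling callback is available: chunks are queued, not computed here.
modeFarm = createChunksAndCompute(lambda: [0, 1, 2], computed.append, spooled.append)
# No spooling callback: chunks are computed one after the other locally.
modeLocal = createChunksAndCompute(lambda: [0, 1], computed.append)
```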

Changes to the Tractor API have been implemented in meshroomHub/mrSubmitters#1.

Examples

Peek 2025-10-09 19-05

ezgif-1a582893f45983

@Alxiice Alxiice self-assigned this Oct 13, 2025
@Alxiice Alxiice added the feature new feature (proposed as PR or issue planned by dev) label Oct 13, 2025
@Alxiice Alxiice added this to the Meshroom 2026.1.0 milestone Oct 13, 2025
codecov bot commented Oct 13, 2025

Codecov Report

❌ Patch coverage is 52.99145% with 275 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.12%. Comparing base (4585522) to head (dd45238).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| meshroom/core/node.py | 58.04% | 159 Missing ⚠️ |
| meshroom/core/submitter.py | 41.13% | 83 Missing ⚠️ |
| meshroom/core/graph.py | 42.85% | 16 Missing ⚠️ |
| meshroom/core/desc/node.py | 40.00% | 15 Missing ⚠️ |
| meshroom/core/__init__.py | 0.00% | 1 Missing ⚠️ |
| meshroom/core/desc/computation.py | 83.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2918      +/-   ##
===========================================
- Coverage    80.88%   79.12%   -1.76%     
===========================================
  Files           59       59              
  Lines         7857     8292     +435     
===========================================
+ Hits          6355     6561     +206     
- Misses        1502     1731     +229     

@Alxiice Alxiice force-pushed the dev/delayChunkEvaluation branch 2 times, most recently from 02bae2f to 15df47f Compare October 21, 2025 09:56
@Alxiice Alxiice changed the base branch from develop to dev/remove_submitters October 21, 2025 09:56
@cbentejac cbentejac force-pushed the dev/remove_submitters branch from 4b47850 to 51cfd91 Compare November 4, 2025 14:10
Base automatically changed from dev/remove_submitters to develop November 4, 2025 16:24
@cbentejac cbentejac force-pushed the dev/delayChunkEvaluation branch from 15b7db5 to ecbcbed Compare November 4, 2025 16:36

@cbentejac cbentejac left a comment

Additional notes:

  • Let's say I compute a graph (for example, the photogrammetry template) in a given location. All my nodes have been successfully computed, and their statuses are all set to "SUCCESS". Now let's say I do a "save as" on the graph I just computed and save it to a new location. The cache changes, so we expect all the nodes' statuses to be reset. For nodes that have dynamic chunks, the status display is not refreshed, and we end up with nodes that appear as computed although they have lost all their status files.
    Below is a screenshot of the photogrammetry graph that was computed and saved in a new location (the first branch was computed, the second was added from the template for the sake of comparison):
Image
  • There are some unpredictable behaviours when performing actions that are allowed but probably shouldn't be. An example is clicking on "stop task" for a task that has already finished computing and is in the "SUCCESS" state, while nodes further down the graph are still being computed. The UI allows it: an info message says that the task has been successfully stopped, and the node's status is updated to "STOPPED" in the task manager and the Graph Editor, but the "resume job" button is not enabled. This seems to cause the job on the farm to finish computing the current task and then pause the rest of the job (to be verified).

  • I have noticed on several occasions that when clicking on buttons from the "JOB" tab, the info message that is sent contains the name of a node that is not part of the job at all (if all my submitted tasks are for NodeName_2, I may get a message about NodeName_1, which is not being computed at all).
