Skip to content

Conversation

@MicahGale
Copy link
Collaborator

@MicahGale MicahGale commented Apr 10, 2025

Pull Request Checklist for MontePy

Description

This PR adds concurrent parsing using multiprocessing via concurrent.futures. This allows large models to be parsed faster, though it is not perfectly scalable. This option is off by default due to windows requiring a if __name__ == "__main__": guard.

I am merging this onto alpha-test-dev because this needs to be a minor release. I am not sure when we want to do a minor release yet. Probably after jit_parsing is implemented as well.

Implementation.

In general the process of parsing can be broken into three main tasks:

  1. Chunking: the process of splitting the file into individual inputs.
  2. Parsing the inputs into their full syntax trees, and MontePy objects.
  3. Post-parsing processing of linking objects together, and appending the objects to the problem.

How this works in the multi-processing mode is best viewed through the processes.

  • Main Process. The main process handles chunking, and post-parsing processing. It iterates through the file and dispatches the inputs as an iterator that fills a queue (handled by concurrent.futures). Once it is ready it reads the resulting objects out from another queue, and appends then to the problem. Once all objects appended the sub processes are terminated and the objects are linked together
  • Child processes. The children take inputs from a queue parses them and then returns the instantiated objects back through the queue.

performance

I haven't effectively collected data for this yet. This can be done at a later date. From the preliminary data I have gotten there appears to now be a floor on load times of about a second due to the added overhead.

Fixes #742, #745


General Checklist

  • I have performed a self-review of my own code.
  • The code follows the standards outlined in the development documentation.
  • I have formatted my code with black version 25.
  • I have added tests that prove my fix is effective or that my feature works (if applicable).

Documentation Checklist

  • I have documented all added classes and methods.
  • For infrastructure updates, I have updated the developer's guide.
  • For significant new features, I have added a section to the getting started guide.

First-Time Contributor Checklist

  • If this is your first contribution, add yourself to pyproject.toml if you wish to do so.

Additional Notes for Reviewers

Ensure that:

  • The submitted code is consistent with the merge checklist outlined here.
  • The PR covers all relevant aspects according to the development guidelines.
  • 100% coverage of the patch is achieved, or justification for a variance is given.

📚 Documentation preview 📚: https://montepy--743.org.readthedocs.build/en/743/

@MicahGale
Copy link
Collaborator Author

So final test that is failing is due to not having check mode supported.

@MicahGale MicahGale self-assigned this Apr 13, 2025
@MicahGale MicahGale added the performance 🐌 Issues related to speed and memory label Apr 13, 2025
@MicahGale
Copy link
Collaborator Author

I have narrowed the issue down to unpickling ParsingError. This is wild. Pickle is trying to run __init__?

I have boiled it down:

import montepy
import pickle
error = montepy.errors.ParsingError("", "", [])
pickle.loads(pickle.dumps(error))

This causes:

Traceback (most recent call last):
  File "/home/mgale/mambaforge/lib/python3.13/site-packages/IPython/core/interactiveshell.py", line 3667, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<ipython-input-23-8c9e5f955a19>", line 1, in <module>
    pickle.loads(pickled)
    ~~~~~~~~~~~~^^^^^^^^^
TypeError: ParsingError.__init__() missing 2 required positional arguments: 'message' and 'error_queue'

@MicahGale
Copy link
Collaborator Author

Maybe this is our solution? https://bugs.python.org/issue32696

@MicahGale
Copy link
Collaborator Author

MicahGale commented Apr 14, 2025

It seems to be due to the fact that we are inheriting from ValueError and not Exception?

Update: It doesn't seem like ValueError inherits from Exception, but only Exception, which is a problem:

exception Exception
All built-in, non-system-exiting exceptions are derived from this class. All user-defined exceptions should also be derived from this class.

@MicahGale
Copy link
Collaborator Author

The issues with windows are independent of serial/multi-process mode. This is wild. These issues should be abstracted away by python.

@MicahGale MicahGale changed the base branch from develop to alpha-test-dev July 11, 2025 22:04
@MicahGale MicahGale requested a review from tjlaboss July 12, 2025 04:25
@MicahGale
Copy link
Collaborator Author

@tjlaboss I think this is finally ready.

@MicahGale MicahGale marked this pull request as ready for review July 14, 2025 15:41
@MicahGale
Copy link
Collaborator Author

I have gotten some preliminary data from @dodu94, that this actually slows down the parsing of large models. I think we need to pause this until we can figure out that issue.

@MicahGale MicahGale marked this pull request as draft July 15, 2025 19:33
@MicahGale
Copy link
Collaborator Author

TIL: multiprocessing.pool.imap maybe better suited for this use case. Also May need to multi-process the input chunking process, or seriously increase it's efficiency.

@MicahGale
Copy link
Collaborator Author

I believe the main bottle-neck now is actually the update_pointers. I am thinking I am eliminating the need for this in jit_parse. So I think I'll finalize that and merge it here and get some performance increase.

@MicahGale MicahGale changed the base branch from alpha-test-dev to minor-rel-dev October 4, 2025 15:57
@MicahGale MicahGale closed this Oct 9, 2025
@MicahGale MicahGale reopened this Oct 9, 2025
@MicahGale MicahGale closed this Oct 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance 🐌 Issues related to speed and memory

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement parallel parsing

3 participants