Skip to content

Use VLLM_WORKER_MULTIPROC_METHOD=spawn instead of --forked for tests #268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 9, 2025

Conversation

tjohnson31415
Copy link
Collaborator

@tjohnson31415 tjohnson31415 commented Jun 27, 2025

This PR lets us remove the requirement of --forked from our pytest tests.

The hang that is observed without --forked is due to the known issue with libgomp and threading (see this gcc bug report that is a "won't fix"). It is a common problem in Python dueo to the usage of native libraries behind the scenes. If a process is forked after an OpenMP thread pool has been created, then the child will not have a threadpool and the code hangs the next time code enters a parallel context.

Where this comes up in our tests is actually because we use transformers to compare the generation results. vLLM and PyTorch delay initializing the thread pool until it is needed. When just using vLLM in V1, this does not happen in the frontend process, so it is ok to use fork(), but using transformer's model.generate in the main process during the tests initializes the thread pool and causes the next attempt to create a vllm.LLM to hang in the forked worker process.

With spawn, the new process is created from scratch and creates a new OpenMP thread pool. But there are trade-offs here too, eg. using spawn in offline mode requires particular handling in a script. vLLM docs have a good summary of the trade-offs of this setting in REF. Because of that, I just set in in the test environment. Using the vllm cli to run the code actually defaults to spawn anyways (REF).

FIX #146

Copy link

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@prashantgupta24
Copy link
Collaborator

bot:test
MARKERS="spyre"

@prashantgupta24
Copy link
Collaborator

bot:test
MARKERS="spyre and not cb"

1 similar comment
@prashantgupta24
Copy link
Collaborator

bot:test
MARKERS="spyre and not cb"

@prashantgupta24
Copy link
Collaborator

bot:test
MARKERS="spyre and not quantized and not multi and not embedding and not cb"

@tjohnson31415 tjohnson31415 changed the title [DO NOT MERGE] Try using OMP_NUM_THREADS=1 to allow non-forked tests Use VLLM_WORKER_MULTIPROC_METHOD=spawn instead of --forked for tests Jul 3, 2025
@tjohnson31415 tjohnson31415 marked this pull request as ready for review July 3, 2025 19:54
Copy link
Collaborator

@joerunde joerunde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lpgtm

@prashantgupta24 prashantgupta24 merged commit 697e3ba into main Jul 9, 2025
18 checks passed
@prashantgupta24 prashantgupta24 deleted the try-tests-no-fork branch July 9, 2025 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix pytest cleanup for V1 tests
3 participants