
Importing QuPathProject taking up >100GB memory #69

Open
hwarden162 opened this issue Apr 20, 2022 · 8 comments
Labels: bug (Something isn't working)

@hwarden162

I am trying to use paquo on a Linux cluster. I log in to a node with 200 GB of memory.

Once on the node, I use tmux to open two terminals on the same node. In one of the terminals I activate a conda environment, open a Python interpreter, and run

import os
os.getpid()

I then go to my other window and run

top -p <my_pid>

which, as I understand it, allows me to see how much memory I am using. At this point it shows 134060 under VIRT.

In the python window I then run

from paquo.projects import QuPathProject

and then it shows 102.7g under VIRT. As I understand it, this means the QuPathProject import is now using 102.7 GB of memory. This then gives me errors further down my pipeline, where I try to do much smaller operations but don't have any memory left.

Here is my conda environment
environment(1).txt

@ap-- (Collaborator) commented Apr 21, 2022

Hello @hwarden162,

Virtual memory is not a good proxy for real memory usage. Since paquo starts the JVM with a default setting that allows it to use up to 50% of the system's RAM (see: https://github.com/bayer-science-for-a-better-life/paquo/blob/51102306e7fc7d144807656641a2589233f57b11/paquo/.paquo.defaults.toml#L28 ), the value you report is a pretty good match.
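
For illustration, here is a minimal sketch (Linux-only, Python standard library only; the memory_kb helper is a hypothetical name) that reads both numbers for the current process from /proc/self/status, so you can compare virtual (VmSize) against resident (VmRSS) memory:

def memory_kb(field):
    # Read a field such as "VmSize" (virtual) or "VmRSS" (resident)
    # from /proc/self/status; the kernel reports these values in kB.
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

print("virtual  (VmSize):", memory_kb("VmSize"), "kB")
print("resident (VmRSS): ", memory_kb("VmRSS"), "kB")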

I believe the memory issue you mention is probably due to something else. Could you provide the code snippet that reproduces the memory error? And provide the traceback or error message that's displayed when it occurs?

Cheers,
Andreas

@hwarden162 (Author)

Hi @ap--,

This is the code I run:

from tifffile import imread
from csbdeep.utils import normalize
from stardist.models import StarDist2D
from shapely.geometry import Polygon
from paquo.projects import QuPathProject
from paquo.images import QuPathImageType

X = imread("/exports/igmm/eddie/boulter-lab/Hugh/StarDist/region.ndpi")
model = StarDist2D.from_pretrained('2D_versatile_he')
img = normalize(X, axis = (0,1))
labels, details = model.predict_instances(img, axes='YXC', n_tiles=(10,10,1), verbose = True)

<Some more code>

It is the img = normalize(X, axis=(0,1)) line where the memory error is thrown:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/csbdeep/utils/utils.py", line 54, in normalize
    mi = np.percentile(x,pmin,axis=axis,keepdims=True)
  File "<__array_function__ internals>", line 5, in percentile
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/lib/function_base.py", line 3867, in percentile
    return _quantile_unchecked(
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/lib/function_base.py", line 3986, in _quantile_unchecked
    r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/lib/function_base.py", line 3564, in _ureduce
    r = func(a, **kwargs)
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/lib/function_base.py", line 4109, in _quantile_ureduce_func
    x_below = take(ap, indices_below, axis=0)
  File "<__array_function__ internals>", line 5, in take
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 190, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "/exports/igmm/eddie/kudla-lab/Hugh/Conda/conda_envs/paquo-env/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 7.22 GiB for an array with shape (2583691264, 3) and data type uint8

However, if I run the same code with from paquo.projects import QuPathProject removed, the script runs perfectly with no memory errors (up to the point where I need to use paquo). Obviously the error itself isn't raised by paquo, but removing the QuPathProject import does fix the problem.

Thanks,
Hugh

@ap-- (Collaborator) commented Apr 21, 2022

Thanks for the traceback and the additional explanation!

Looks like I was wrong and we really do reserve 50% of RAM. The good thing is that you can easily override the setting, but we should change the default to a reasonable upper limit.

To fix your issue, follow the instructions here: https://paquo.readthedocs.io/en/latest/configuration.html to create a custom paquo configuration file on your cluster, replacing the java_opts setting -XX:MaxRAMPercentage=50 with -Xmx8g (I think 8 GB should be reasonable).

Let me know if anything is unclear and if this solves your issue.

Cheers,
Andreas

@hwarden162 (Author) commented Apr 21, 2022

I ran the python -m paquo config --search-tree command and chose one of the directories.

I then ran python -m paquo config --list --default > /path/to/.paquo.toml and changed

java_opts = [
"-XX:MaxRAMPercentage=50",
]

to

java_opts = [
"-Xmx8g",
]

Now, python -m paquo config --list reads

# current paquo configuration
# ===========================
# format: TOML
qupath_dir = ""
qupath_search_dirs = [ "/opt", "/Applications", "c:/Program Files", "/usr/local", "~/Applications", "~/AppData/Local", "~",]
qupath_search_dir_regex = "(?i)qupath.*"
qupath_search_conda = true
qupath_prefer_conda = true
java_opts = [ "-Xmx8g",]
safe_truncate = true
mock_backend = false
cli_force_log_level_error = true
warn_microsoft_store_python = true

so it looks like I successfully altered .paquo.toml, but importing QuPathProject still takes up half of my memory (measured as before) and I still get the same error when normalising my image (going over my memory allowance). I have tried completely exiting conda and my terminal session to see whether a fresh start would help, but I still get the error.

Thanks,
Hugh

EDIT1: Putting backslash before hashtags to fix markdown formatting
EDIT2: More markdown fixes to properly show config output

@ap-- (Collaborator) commented Apr 22, 2022

Hmmm, just to be on the safe side: are you running your script from the same working directory that you used when checking that python -m paquo config --list returns the correct java_opts setting?

I'd now try going back to -XX:MaxRAMPercentage and setting a very small value like 1, to see if the setting takes effect at all. Also: what's the peak memory usage of your script without importing paquo? And what are the dimensions of region.ndpi?
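
One way to measure that peak, as a minimal sketch using only the Python standard library (on Linux, ru_maxrss is reported in kilobytes; this is an illustration, not part of paquo):

import resource

# Peak resident set size of this process so far; call at the end of the script.
peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS: {peak_kb / 1024**2:.1f} GB")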

@hwarden162 (Author)

I am definitely running python -m paquo config --list in the same directory that I run the code from.

I changed the .paquo.toml, putting in -XX:MaxRAMPercentage=4. This showed up when I listed the config, but when I imported paquo it still seemed to take half of my memory and my pipeline failed.

Without importing paquo, the peak memory usage of my script appears to be ~38.2 GB.

ap-- added the bug label Apr 25, 2022
@ap-- (Collaborator) commented Apr 25, 2022

Thanks for checking. Something definitely seems to be off, considering your script requires ~40 GB and you still run out of memory on a 200 GB machine. I'll try to create a test case to reproduce the OOM crash in a more controlled environment, and then we can hopefully improve paquo's default behavior.

In the meantime, my recommendation would be to lazy-import paquo only when you need it, after the bigger computations are done, or to factor the paquo part out into a separate script (see the sketch below).
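
A minimal sketch of that lazy-import pattern, based on the script earlier in this thread (run_segmentation, export_to_qupath, and the paths are hypothetical names; the QuPathProject usage follows the paquo documentation):

def run_segmentation(image_path):
    # Heavy numpy/StarDist work -- paquo is not imported yet,
    # so the JVM has not reserved any memory.
    from tifffile import imread
    from csbdeep.utils import normalize
    from stardist.models import StarDist2D
    X = imread(image_path)
    model = StarDist2D.from_pretrained('2D_versatile_he')
    img = normalize(X, axis=(0, 1))
    return model.predict_instances(img, axes='YXC', n_tiles=(10, 10, 1))

def export_to_qupath(project_dir, image_path, labels, details):
    # The JVM starts only here, after the big intermediate arrays
    # from the segmentation step can be garbage collected.
    from paquo.projects import QuPathProject
    from paquo.images import QuPathImageType
    with QuPathProject(project_dir, mode='x') as qp:
        qp.add_image(image_path, image_type=QuPathImageType.BRIGHTFIELD_H_E)
        ...  # convert the StarDist polygons into QuPath annotations here

labels, details = run_segmentation("region.ndpi")
export_to_qupath("lazy_project", "region.ndpi", labels, details)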

I'll update this issue when I start working on this.

Cheers,
Andreas 😃

@hwarden162 (Author)

Thanks very much!

Paquo is a really useful bit of kit that is giving me a lot more flexibility in my workflows. Please let me know if I can help with any more information about the bug.

Best
Hugh
