
Improved support for tigger lsm's #110

Open
landmanbester opened this issue Aug 27, 2021 · 6 comments

Comments

@landmanbester
Collaborator

I am trying to calibrate while providing the pks1934-638 sky model in caracal. I ran tigger-convert on it to get an lsm.html model and pointed input_model.recipe at it. It falls over with a long error message, part of which reads:

 File "/home/bester/venvs/qcal/lib/python3.7/site-packages/numba/core/dispatcher.py", line 141, in _get_implementation
    impl = self.py_func(*args, **kws)
           │    │        │       └ {}
           │    │        └ (array(float64, 2d, C), array(pyobject, 3d, C), array(float64, 1d, C), array(float64, 1d, C), int64)
           │    └ <function spectral_model at 0x7f18f909a440>
           └ <numba.core.dispatcher._GeneratedFunctionCompiler object at 0x7f18f909b4d0>
  File "/home/bester/venvs/qcal/lib/python3.7/site-packages/africanus/model/spectral/spec_model.py", line 99, in spectral_model
    in (stokes, spi, ref_freq, frequency))
        │       │    │         └ array(float64, 1d, C)
        │       │    └ array(float64, 1d, C)
        │       └ array(pyobject, 3d, C)
        └ array(float64, 2d, C)
  File "/home/bester/venvs/qcal/lib/python3.7/site-packages/africanus/model/spectral/spec_model.py", line 98, in <genexpr>
    arg_dtypes = tuple(np.dtype(a.dtype.name) for a
                       │  │     │ │     │         └ array(pyobject, 3d, C)
                       │  │     │ │     └ 'pyobject'
                       │  │     │ └ pyobject
                       │  │     └ array(pyobject, 3d, C)
                       │  └ <class 'numpy.dtype'>
                       └ <module 'numpy' from '/home/bester/venvs/qcal/lib/python3.7/site-packages/numpy/__init__.py'>

TypeError: data type 'pyobject' not understood

There is a telling little warning earlier on which does not end up in the log file, viz.

/home/bester/venvs/qcal/lib/python3.7/site-packages/dask/array/core.py:3150: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.

Not sure if it's related. Full log can be found on oates at /home/bester/projects/ESO137/outputs.qc
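The deprecation warning does look related: if the sky model mixes sources with different numbers of spectral-index coefficients, building a single array from those ragged lists yields dtype=object, which numba cannot type and which surfaces as the `data type 'pyobject' not understood` error. A minimal sketch (the spi lists below are illustrative, not taken from the actual model):

```python
import numpy as np

# Sources with differing numbers of spectral-index coefficients form a
# ragged list; NumPy can only store it as an object array.
spi_per_source = [[0.7], [0.7, -0.1]]
ragged = np.array(spi_per_source, dtype=object)
assert ragged.dtype == np.dtype(object)  # numba cannot compile against this

# Padding the coefficient lists to a common length gives a clean numeric
# array instead, which numba can type.
max_len = max(len(s) for s in spi_per_source)
padded = np.array([s + [0.0] * (max_len - len(s)) for s in spi_per_source])
assert padded.dtype == np.float64
```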

@JSKenyon
Collaborator

I believe this is related to predicting a sky model with a polynomial spectrum. This is supported in Codex Africanus but not in QuartiCal. I believe it should be relatively easy to fix.

@JSKenyon
Collaborator

@landmanbester There is now a PR that aims to address this in #112. Currently, it assumes the first case from the Codex docs. Could you please give it a test drive?

@landmanbester
Collaborator Author

Thanks, will do

@landmanbester
Collaborator Author

Ok, it's not falling over anymore, but the memory footprint is crazy. I was 400GB into swap space on a machine with 500GB of RAM. I'm guessing I can improve this by chunking the problem up further, but this is a pretty small MS; the memory footprint when running with identical chunking but using a MODEL_DATA column is only about 120GB.

@landmanbester landmanbester changed the title falls over when trying to predict from a tigger lsm Improved support for tigger lsm's Sep 1, 2021
@landmanbester
Collaborator Author

Also, as discussed here, the first parametrisation does not match that of wsclean.
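To illustrate the kind of mismatch being described, here is a hedged sketch of two polynomial spectral conventions. Convention A is only an illustrative stand-in for "the first case" (consult the Codex docs for the exact form); Convention B is wsclean's non-logarithmic polynomial as I understand it. The coefficients are made up:

```python
def poly_in_freq_ratio(coeffs, freq, ref_freq):
    # Convention A (illustrative only, not necessarily the Codex form):
    # S(nu) = sum_k c_k * (nu/nu0)**k, with c_0 standing in for the flux.
    x = freq / ref_freq
    return sum(c * x**k for k, c in enumerate(coeffs))

def poly_wsclean(coeffs, freq, ref_freq):
    # Convention B: wsclean's non-logarithmic polynomial,
    # S(nu) = c_0 + sum_{k>=1} c_k * (nu/nu0 - 1)**k,
    # so c_0 is exactly the flux at the reference frequency.
    x = freq / ref_freq - 1.0
    return sum(c * x**k for k, c in enumerate(coeffs))

coeffs = [1.0, 0.5]  # hypothetical [flux, first-order term]
# The two conventions disagree even at the reference frequency, so the
# same coefficient list predicts different spectra under each.
print(poly_in_freq_ratio(coeffs, 1.0e9, 1.0e9))  # 1.5
print(poly_wsclean(coeffs, 1.0e9, 1.0e9))        # 1.0
```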

@JSKenyon
Collaborator

JSKenyon commented Sep 1, 2021

As discussed, this is a limitation of the codex predict, which creates arrays of shape (source_chunk, row_chunk, chan_chunk, n_corr). You can naively think of each chunk of the predict as holding an array source_chunk times the size of a data chunk. If you have many threads, this can increase the memory footprint by an order of magnitude or two. You can partially mitigate this by setting input_model.source_chunks to one.

Edit: This is precisely why we are holding out for @sjperkins' new implementation. It should avoid allocating these huge arrays.
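A back-of-envelope check of that scaling (all sizes below are hypothetical placeholders, not taken from this MS):

```python
# Rough per-chunk footprint of a predict that materialises a
# (source, row, chan, corr) complex visibility array.
# All sizes here are hypothetical.
n_source, n_row, n_chan, n_corr = 100, 100_000, 64, 4
bytes_per_vis = 16  # complex128

chunk_bytes = n_source * n_row * n_chan * n_corr * bytes_per_vis
print(f"{chunk_bytes / 2**30:.1f} GiB per in-flight chunk")  # 38.1 GiB

# With T worker threads, roughly T such chunks can be resident at once.
# Shrinking the source chunking (e.g. input_model.source_chunks = 1)
# divides this per-chunk figure by n_source.
```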
