
Conversation

@vaishakp

Summary

On newer CPUs (e.g. Intel 258v), the `getconf` utility shipped with glibc on Linux fails to fetch the correct cache configuration.
It returns incorrect sizes for all caches, and returns nothing for cache associativities.

This PR changes the way the cache configuration is read by PyCBC.

Details

The incorrect cache reporting by getconf caused PyCBC to raise the following error:

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/__init__.py:201
199 # Check for MKL capability
200 try:
--> 201 import pycbc.fft.mkl
202 HAVE_MKL=True
203 except (ImportError, OSError):

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/__init__.py:17
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 14 # with this program; if not, write to the Free Software Foundation, Inc.,
15 # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
---> 17 from .parser_support import insert_fft_option_group, verify_fft_options, from_cli
18 from .func_api import fft, ifft
19 from .class_api import FFT, IFFT

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/parser_support.py:29
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 22 # =============================================================================
23 #
24 """
25 This package provides a front-end to various fast Fourier transform
26 implementations within PyCBC.
27 """
---> 29 from .backend_support import get_backend_modules, get_backend_names
30 from .backend_support import set_backend, get_backend
32 # Next we add all of the machinery to set backends and their options
33 # from the command line.

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/backend_support.py:77
75 for scheme_name in ["cpu", "mkl", "cuda", "cupy"]:
76 try:
77 mod = __import__('pycbc.fft.backend_' + scheme_name, fromlist = ['_alist', '_adict'])
78 _alist = getattr(mod, "_alist")
79 _adict = getattr(mod, "_adict")

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/backend_cpu.py:18
1 # Copyright (C) 2014 Josh Willis
2 #
3 # This program is free software; you can redistribute it and/or modify
(...) 15 # Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
16 # MA 02111-1307 USA
---> 18 from .core import _list_available
20 _backend_dict = {'fftw' : 'fftw',
21 'mkl' : 'mkl',
22 'numpy' : 'npfft'}
23 _backend_list = ['mkl', 'fftw', 'numpy']

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/core.py:28
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 22 # =============================================================================
23 #
24 """
25 This package provides a front-end to various fast Fourier transform
26 implementations within PyCBC.
27 """
---> 28 from pycbc.types import Array as _Array
29 from pycbc.types import TimeSeries as _TimeSeries
30 from pycbc.types import FrequencySeries as _FrequencySeries

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/types/__init__.py:1
----> 1 from .array import *
2 from .timeseries import *
3 from .frequencyseries import *

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/types/array.py:43
41 import pycbc.scheme as _scheme
42 from pycbc.scheme import schemed, cpuonly
---> 43 from pycbc.opt import LimitedSizeDict
45 #! FIXME: the uint32 datatype has not been fully tested,
46 # we should restrict any functions that do not allow an
47 # array of uint32 integers
48 _ALLOWED_DTYPES = [_numpy.float32, _numpy.float64, _numpy.complex64,
49 _numpy.complex128, _numpy.uint32, _numpy.int32, int]

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/opt.py:60
58 LEVEL2_CACHE_ASSOC = getconf('LEVEL2_CACHE_ASSOC')
59 LEVEL2_CACHE_LINESIZE = getconf('LEVEL2_CACHE_LINESIZE')
---> 60 LEVEL3_CACHE_SIZE = getconf('LEVEL3_CACHE_SIZE')
61 LEVEL3_CACHE_ASSOC = getconf('LEVEL3_CACHE_ASSOC')
62 LEVEL3_CACHE_LINESIZE = getconf('LEVEL3_CACHE_LINESIZE')

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/opt.py:48, in getconf(confvar)
47 def getconf(confvar):
---> 48 return int(subprocess.check_output(['getconf', confvar]))

ValueError: invalid literal for int() with base 10: b'undefined\n'

lscpu (provided by util-linux) gives accurate information about the caches on Linux; on macOS, equivalent information is available via sysctl.


Previous plan:

  1. If macOS or Windows, try to read the L2 size from an environment variable. If not available, ignore.
  2. If Linux, check for and use getconf.

New plan:

  1. If OS == win, only read the L2 size from an environment variable.
  2. If OS == linux, use lscpu for the cache configuration (size + associativity).
  3. If OS == darwin, read the L2 line size from sysctl. Ignore the others.
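The Linux branch of the plan could be sketched roughly as below. This is a hypothetical illustration, not the PR's actual code: it assumes a util-linux version whose `lscpu --caches` supports column selection, and the sample output in the test is made up for demonstration.

```python
import subprocess

def parse_lscpu_caches(text):
    # Parse the output of `lscpu -B --caches=NAME,ONE-SIZE,WAYS`
    # into {cache_name: {'size': bytes_per_cache, 'ways': associativity}}.
    lines = text.strip().splitlines()
    caches = {}
    for line in lines[1:]:                  # skip the header row
        name, size, ways = line.split()
        caches[name] = {'size': int(size), 'ways': int(ways)}
    return caches

def read_linux_caches():
    # -B forces sizes to be printed in plain bytes rather than "2M" etc.
    # Returns {} if lscpu is unavailable or fails, so callers can fall
    # back to defaults (mirroring the PR's intent, not its exact code).
    try:
        out = subprocess.check_output(
            ['lscpu', '-B', '--caches=NAME,ONE-SIZE,WAYS'], text=True)
    except (OSError, subprocess.CalledProcessError):
        return {}
    return parse_lscpu_caches(out)
```

On darwin, the analogous call would be something like `subprocess.check_output(['sysctl', '-n', 'hw.cachelinesize'])` for the line size.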

Bugfixes

  1. Read the correct cache configuration on newer Linux machines

Improvements

  1. macOS can now read the L2 line size

@josh-willis
Contributor

Unfortunately, I am about to start an extended leave on Friday, and there is really no way I will be able to review this before then.

Can someone comment on where these constants are actually used? I did not immediately find it via git grep, and my recollection was that a lot of this was for the weave code that preceded cython. I think we include a lot less of that CPU specific stuff now, but my memory may not be entirely reliable.

Contributor

@titodalcanton left a comment


No time to look at the functionality change here, but please revert the changes to the logging messages in pycbc/opt.py, and do not add commented-out code.

@spxiwh
Contributor

spxiwh commented Oct 3, 2025

Thanks for pointing this out @vaishakp . @rahuldhurkunde also encountered a similar issue building a singularity image recently, so I think we do need to resolve this. As @josh-willis says though, a lot of this is legacy code that can probably be removed.

So, the only places where these constants (and, I note, only LEVEL2_CACHE_SIZE) still seem to be used are here:

pow2 = int(_np.log(opt.LEVEL2_CACHE_SIZE/3.0)/_np.log(2.0))

and here:

default_segsize = opt.LEVEL2_CACHE_SIZE / numpy.dtype('complex64').itemsize

Both of these provide segmentation settings that are passed on to the SIMD "threshold and cluster" and "correlate" C code that @josh-willis wrote. Now, the code Josh wrote for this is beyond me, but it's a lot faster than the alternatives I tried to implement (including scipy/numpy functions for the same operations) when we moved from weave to Cython. That is why these functions were ported "in place", with me adding a Cython interface, rather than being replaced with Cython code as I did almost everywhere else.
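For a concrete sense of what those two expressions produce, the arithmetic below assumes (purely for illustration) a 2 MiB L2 cache; complex64 has an 8-byte itemsize:

```python
import math

LEVEL2_CACHE_SIZE = 2 * 1024 * 1024   # assumed 2 MiB L2, illustration only
COMPLEX64_ITEMSIZE = 8                # == numpy.dtype('complex64').itemsize

# Segment-size exponent used by the SIMD threshold/cluster code:
pow2 = int(math.log(LEVEL2_CACHE_SIZE / 3.0) / math.log(2.0))

# Default segment length used by the correlate code:
default_segsize = LEVEL2_CACHE_SIZE / COMPLEX64_ITEMSIZE

print(pow2, default_segsize)   # 19 262144.0
```

So a wrong or missing cache size only shifts these segmentation defaults; it does not affect correctness, which is consistent with the suggestion below to fall back to fixed defaults.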

There are some issues with this though: when this was written, with weave, the C code was compiled "on the fly" and so was compiled against the "right" processor. Since weave was removed and we moved to Cython, the Cython/C code is compiled when building wheels, which are then shipped to various machines. So the "CACHE_SIZE" being used is that of whatever GitHub builder the wheel-building job lands on, rather than anything related to the machine actually running the job. You can, of course, compile PyCBC from source, but that is a minority use case at this point.

So I think I'm in favour of removing these values from PyCBC entirely and just using the default values (which are provided in both cases). @ahnitz, thoughts?
