
Conversation

@vaishakp

Summary

On newer CPUs (e.g. Intel 258v), the `getconf` utility shipped with glibc on Linux fails to fetch the correct cache configuration.
It returns incorrect sizes for all caches, and returns nothing for cache associativities.

This PR changes the way the cache configuration is read by PyCBC.

Details

The incorrect cache reporting by getconf caused PyCBC to raise the following error:

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/__init__.py:201
199 # Check for MKL capability
200 try:
--> 201 import pycbc.fft.mkl
202 HAVE_MKL=True
203 except (ImportError, OSError):

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/__init__.py:17
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 14 # with this program; if not, write to the Free Software Foundation, Inc.,
15 # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
---> 17 from .parser_support import insert_fft_option_group, verify_fft_options, from_cli
18 from .func_api import fft, ifft
19 from .class_api import FFT, IFFT

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/parser_support.py:29
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 22 # =============================================================================
23 #
24 """
25 This package provides a front-end to various fast Fourier transform
26 implementations within PyCBC.
27 """
---> 29 from .backend_support import get_backend_modules, get_backend_names
30 from .backend_support import set_backend, get_backend
32 # Next we add all of the machinery to set backends and their options
33 # from the command line.

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/backend_support.py:77
75 for scheme_name in ["cpu", "mkl", "cuda", "cupy"]:
76 try:
77 mod = __import__('pycbc.fft.backend_' + scheme_name, fromlist = ['_alist', '_adict'])
78 _alist = getattr(mod, "_alist")
79 _adict = getattr(mod, "_adict")

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/backend_cpu.py:18
1 # Copyright (C) 2014 Josh Willis
2 #
3 # This program is free software; you can redistribute it and/or modify
(...) 15 # Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
16 # MA 02111-1307 USA
---> 18 from .core import _list_available
20 _backend_dict = {'fftw' : 'fftw',
21 'mkl' : 'mkl',
22 'numpy' : 'npfft'}
23 _backend_list = ['mkl', 'fftw', 'numpy']

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/fft/core.py:28
1 # Copyright (C) 2012 Josh Willis, Andrew Miller
2 #
3 # This program is free software; you can redistribute it and/or modify it
(...) 22 # =============================================================================
23 #
24 """
25 This package provides a front-end to various fast Fourier transform
26 implementations within PyCBC.
27 """
---> 28 from pycbc.types import Array as _Array
29 from pycbc.types import TimeSeries as _TimeSeries
30 from pycbc.types import FrequencySeries as _FrequencySeries

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/types/__init__.py:1
----> 1 from .array import *
2 from .timeseries import *
3 from .frequencyseries import *

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/types/array.py:43
41 import pycbc.scheme as _scheme
42 from pycbc.scheme import schemed, cpuonly
---> 43 from pycbc.opt import LimitedSizeDict
45 #! FIXME: the uint32 datatype has not been fully tested,
46 # we should restrict any functions that do not allow an
47 # array of uint32 integers
48 _ALLOWED_DTYPES = [_numpy.float32, _numpy.float64, _numpy.complex64,
49 _numpy.complex128, _numpy.uint32, _numpy.int32, int]

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/opt.py:60
58 LEVEL2_CACHE_ASSOC = getconf('LEVEL2_CACHE_ASSOC')
59 LEVEL2_CACHE_LINESIZE = getconf('LEVEL2_CACHE_LINESIZE')
---> 60 LEVEL3_CACHE_SIZE = getconf('LEVEL3_CACHE_SIZE')
61 LEVEL3_CACHE_ASSOC = getconf('LEVEL3_CACHE_ASSOC')
62 LEVEL3_CACHE_LINESIZE = getconf('LEVEL3_CACHE_LINESIZE')

File ~/soft/anaconda/envs/gw/lib/python3.11/site-packages/pycbc/opt.py:48, in getconf(confvar)
47 def getconf(confvar):
---> 48 return int(subprocess.check_output(['getconf', confvar]))

ValueError: invalid literal for int() with base 10: b'undefined\n'

lscpu (provided by util-linux) gives accurate information about the caches on Linux; on macOS, equivalent information is available via sysctl.


Previous plan:

  1. If macOS or Windows, try to read the L2 size from an environment variable. If not available, ignore.
  2. If Linux, check for and use getconf.

New plan:

  1. If OS == win, only read the L2 size from an environment variable.
  2. If OS == linux, use lscpu for the cache configuration (size + associativity).
  3. If OS == darwin, read the L2 line size from sysctl. Ignore the others.
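The Linux branch of the plan could be sketched roughly as below. This is a hypothetical illustration, not the PR's actual code: it assumes a util-linux version whose `lscpu --caches` supports column selection, and the sample output in the test is made up for demonstration.

```python
import subprocess

def parse_lscpu_caches(text):
    # Parse the output of `lscpu -B --caches=NAME,ONE-SIZE,WAYS`
    # into {cache_name: {'size': bytes_per_cache, 'ways': associativity}}.
    lines = text.strip().splitlines()
    caches = {}
    for line in lines[1:]:                  # skip the header row
        name, size, ways = line.split()
        caches[name] = {'size': int(size), 'ways': int(ways)}
    return caches

def read_linux_caches():
    # -B forces sizes to be printed in plain bytes rather than "2M" etc.
    # Returns {} if lscpu is unavailable or fails, so callers can fall
    # back to defaults (mirroring the PR's intent, not its exact code).
    try:
        out = subprocess.check_output(
            ['lscpu', '-B', '--caches=NAME,ONE-SIZE,WAYS'], text=True)
    except (OSError, subprocess.CalledProcessError):
        return {}
    return parse_lscpu_caches(out)
```

On darwin, the analogous call would be something like `subprocess.check_output(['sysctl', '-n', 'hw.cachelinesize'])` for the line size.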

Bugfixes

  1. Read the correct cache configuration on newer Linux machines

Improvements

  1. macOS can now read the L2 line size

@josh-willis
Contributor

Unfortunately, I am about to start an extended leave on Friday, and there is really no way I will be able to review this before then.

Can someone comment on where these constants are actually used? I did not immediately find it via git grep, and my recollection was that a lot of this was for the weave code that preceded cython. I think we include a lot less of that CPU specific stuff now, but my memory may not be entirely reliable.

Contributor

@titodalcanton left a comment


No time to look at the functionality change here, but please revert the changes to the logging messages in pycbc/opt.py, and do not add commented-out code.

@spxiwh
Contributor

spxiwh commented Oct 3, 2025

Thanks for pointing this out @vaishakp . @rahuldhurkunde also encountered a similar issue building a singularity image recently, so I think we do need to resolve this. As @josh-willis says though, a lot of this is legacy code that can probably be removed.

So, the only places where these constants (and, I note, only LEVEL2_CACHE_SIZE) still seem to be used are here:

pow2 = int(_np.log(opt.LEVEL2_CACHE_SIZE/3.0)/_np.log(2.0))

and here:

default_segsize = opt.LEVEL2_CACHE_SIZE / numpy.dtype('complex64').itemsize

Both of these provide segmentation settings that are passed on to the SIMD "threshold and cluster" and "correlate" C code that @josh-willis wrote. Now, the code Josh wrote for this is beyond me, but it's a lot faster than the alternatives I tried to implement (including scipy/numpy functions for the same operations) when we moved from weave to Cython. That is why these functions were ported "in place", with me adding a Cython interface, rather than being replaced with Cython code as I did almost everywhere else.
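For a concrete sense of what those two expressions produce, the arithmetic below assumes (purely for illustration) a 2 MiB L2 cache; complex64 has an 8-byte itemsize:

```python
import math

LEVEL2_CACHE_SIZE = 2 * 1024 * 1024   # assumed 2 MiB L2, illustration only
COMPLEX64_ITEMSIZE = 8                # == numpy.dtype('complex64').itemsize

# Segment-size exponent used by the SIMD threshold/cluster code:
pow2 = int(math.log(LEVEL2_CACHE_SIZE / 3.0) / math.log(2.0))

# Default segment length used by the correlate code:
default_segsize = LEVEL2_CACHE_SIZE / COMPLEX64_ITEMSIZE

print(pow2, default_segsize)   # 19 262144.0
```

So a wrong or missing cache size only shifts these segmentation defaults; it does not affect correctness, which is consistent with the suggestion below to fall back to fixed defaults.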

There are some issues with this though: when this was written, with weave, the C code was compiled "on the fly" and so was compiled against the "right" processor. Since weave was removed and we moved to Cython, the Cython/C code is compiled when building wheels, which are then shipped to various machines. So the "CACHE_SIZE" being used is that of whatever GitHub builder the wheel-building job lands on, rather than anything related to the machine actually running the job. You can, of course, compile PyCBC from source, but that is a minority use case at this point.

So I think I'm in favour of removing these values from PyCBC entirely and just using the default values (which are provided in both cases). @ahnitz, thoughts?
