-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fallback md5 is used, when trying hard to only use OpenSSL #118224
Comments
… algorithms When OpenSSL is configured to only load "base+fips" providers into the Null library context, md5 might not be available at all. In such cases currently CPython fallsback to internal hashlib implementation is there is one - as there might not be if one compiles python with --with-builtin-hashlib-hashes=blake2. With this change "default" provider is attempted to be loaded to access nonsecurity hashes.
… algorithms When OpenSSL is configured to only load "base+fips" providers into the Null library context, md5 might not be available at all. In such cases currently CPython fallsback to internal hashlib implementation is there is one - as there might not be if one compiles python with --with-builtin-hashlib-hashes=blake2. With this change "default" provider is attempted to be loaded to access nonsecurity hashes.
Compile python without hashlib fallbacks for NIST algorithms, as they are slower and allow to bypass OpenSSL implementations in certain conditions. Fix OpenSSL base and fips provider usage for unapproved algorithms, such that one can use MD5 for non-security. And vice versa. Fixes: python/cpython#118224 Pathes are backports linked in the above upstream issue. Signed-off-by: Dimitri John Ledkov <[email protected]>
Compile python without hashlib fallbacks for NIST algorithms, as they are slower and allow to bypass OpenSSL implementations in certain conditions. Fix OpenSSL base and fips provider usage for unapproved algorithms, such that one can use MD5 for non-security. And vice versa. Fixes: python/cpython#118224 Patches are backports linked in the above upstream issue. Signed-off-by: Dimitri John Ledkov <[email protected]>
…algorithms When OpenSSL is configured to only load "base+fips" providers into the Null library context, md5 might not be available at all. In such cases currently CPython fallsback to internal hashlib implementation is there is one - as there might not be if one compiles python with --with-builtin-hashlib-hashes=blake2. With this change "default" provider is attempted to be loaded to access nonsecurity hashes.
By I don't see why should Python should not honor those decisions. If you want md5, why don't you enable one of the things that provide it? |
It is example of how to trigger the bug. The compliance bypass is that python silently fallsback to internal md5 (when compiled --with-builtin-hashlib-hashes=md5, aka default) when openssl one is incorrectly deemed unavailable (due to provider selection). When in fact, for non security purposes it is there and accessible (aka usage with indicator). Note, openssl here is not compiled with no-md5. |
If you believe that this is as security issue in python, please contact the security team -- see the security page (as linked from the Security policy links on GitHub). But, I don't see the “compliance bypass”. CPython is not FIPS compliant; there isn't anything to bypass. There are some tools/options that make building a FIPS-certifiable CPython easier, but we're not really in a position to test that they do give you a compliant build. Note that building with See also recent FIPS discussion. |
The CI tests are already there for the FIPS mode. If desired I can add the extra matrix test case to address this issue as an explicit check. I'm talking about honoring settings for the NIST algorithms, and using accelerated code-paths by default. And ensuring that existing functionality
FIPS is consent based. Blocking usage of unapproved algorithms is relevant within a cryptographic module only. And those that want to certify cpython they can choose themselves if they want to block blake2 usage out right, or if they want to implement a runtime FIPS indicator w.r.t. usage of blake2 (such that apps that use blake2 are indicated as running in unapproved mode) which is a possibility under FIPS certification.
That discussion takes a narrow view of FIPS certification. Hashes that are not used for security purposes do not need to be FIPS certified (ie. siphash, xxhash64, and so on are perfectly fine to use for pyc hashing; and for key-value dictionary implementations). As long as said things are not used to protect data at rest (e.g. file encryption) / in transit (e.g. TLS) / used for authentication (e.g. HTTP basic-auth). |
it's not a CVE in a classic sense of it. As compliance is frankly arbitrary and conflicting. (i.e. FIPS wants to use FIPS algorithms, and somebody else wants to use Cammelia, and someone else wants to use GOST). My patch here, is scoped to ensure that MD5 (and in the future SHA1) are treated fairly and correctly: honoring system configuration, using accelerated (openssl) codepath, and have I'm not trying to exert any opinion of how one should configure the cpython builds; nor how one should configure their openssl; or how one should configure their FIPS modules. And ensure that |
I'll need to leave this to security experts. Hopefully the discussion is useful. One thing I don't get: why don't you load “default” in the OpenSSL config? |
Please read carefully https://github.com/openssl/openssl/blob/master/README-PROVIDERS.md and especially the note w.r.t. FIPS provider under base provider. if no providers are loaded - a fallback one is loaded, in vanilla upstream it is the default; I find it sad that "base" provider is a subset of "default" provider, as that is very confusing duplication of code at runtime. Loding default+fips provider instead of base+fips provider has 3.5MB runtime memory usage penalty in most common cases, due to loadinging/init/iterating all the algorithms as available; which are then later filtered out by the "fips=yes" query. The cpython CI tests as implemented right now, also do not currently setup openssl.cnf in a way that is most comonly deployed by openssl-fips distributors. I.e. when compared with "canonical openssl fips" or "redhat openssl fips" or even "openssl project fips - as per recommended docs" |
I've read the documentation, and I don't see any hint of languages (or frameworks/libraries) being expected to silently load the default provider when asked for md5. From Python's point of view, the user has made two explicit choices:
I do not think it's Python's place to silently override these choices. The “user” here might be a redistributor or OS author. I get that this sucks for the end user, but Python is not at fault for md5 not being available. You should complain to whoever disabled it. Feel free to post on Discourse to seek a second opinion. I would consider a PR that exposes |
Ok fine. But what about this case:
Surely, cpython should not fallback to builtin md5 when openssl is configured in fips mode without md5. And only do so when usedforsecurity=False? I.e. I don't see cpython today checking what the default context query string is / default intention. It is checking if "md5" is resolved or not, but it is not checking why it was not resolved. |
That sounds more reasonable, though it might be harder to implement. |
True, that blocks using md5 for secure purposes (encryption / authentication / signing / kdf / etc). But will not enable using md5 for non-security purposes (e.g. identifying all court documents that are referenced by md5) as Separately, I find it odd that "import hashlib" tries to import and initialize "hashlib.md5". I wonder if it is time to start to require "hashlib.md5" and "hashlib.sha1" as imports.... or at least lazy inits. |
Relevant discussion - https://discuss.python.org/t/python-and-openssl-fips-mode/51389 |
is it expected for hashlib.md5 to be always available? why is it always loaded upon hashlib import? Can it be made a lazy import? The documentation has some strong words about shipping python without hashlib.md5. |
That's a good question. I guess CPython can go two ways:
|
Let me try to make this prettier, as the current traceback is very confusing and doesn't explain what is happening (it looks like python internals are broken, rather than runtime expectations do not match the ones that were assumed at build time - i.e. w.r.t. availability of md5). |
For FIPS, it's essential that |
That's really up to the organization that seeks FIPS certification for the system that contains the module. AFAIK, things like “FIPS modes” are just ways to make it easier to create and audit a system that meets FIPS requirements. It's hard to give guidance for a specific module without knowing the rest of the system. Anyway, I doubt that |
This is going to be a shameless plug - each distributions choose their own way. In Chainguard Python FIPS Image , I chose to implement this runtime behavior: $ docker run -ti --entrypoint python cgr.dev/chainguard-private/python-fips:latest-dev
Python 3.12.4 (tags/v3.12.4-dirty:8e8a4ba, Jun 26 2024, 12:07:40) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> hashlib.md5
<built-in function openssl_md5>
>>> hashlib.md5()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
_hashlib.UnsupportedDigestmodError: [digital envelope routines] unsupported
>>> hashlib.md5(usedforsecurity=False)
<md5 _hashlib.HASH object @ 0x7b3e6ba96930>
>>> |
That's pretty much what RHEL does too. |
If that is the consensus among most Linux distributions shipping FIPS and python..... could we get this upstream into cpython? Because vendor specific behaviour, even when all of them agree in FIPS mode, is counter productive. |
Chime in here: https://discuss.python.org/t/python-and-openssl-fips-mode/51389/7 |
In many instances systemcrypto backend is tested for support of a particular algorithm. If it is supported, it is used, otherwise gocrypto fallback code path is used. This means effectively the toolchain allows to use MD5 DES TripleDES Ed25519 RC4 HKDF TLS1PF implemented with gocrypto, when FIPS module blocks these algorithms or doesn't even support them (i.e. when only base+fips providers are loaded). In some cases this might be due to incorrect check and/or incorrect runtime configuration of OpenSSL. It is very common to accidentaly activate "default" and "fips" providers in OpenSSL at the same time - which then exhibits odd properties. Specifically "default+fips" providers will list that RC4 and MD5 are supported without any property query strings. But fail at runtime when attempted to be used with property query string set to "fips=yes". If on the other hand "base" and "fips" providers loaded alone, RC4 and MD5 will not be listed as runtime available, and gocrypto fallback path may be taken by the toolchain. A similar issue is currently also present in cpython please see python/cpython#118224. Note that recommended way to configure OpenSSL in fips only mode is with base+fips providers alone - see https://github.com/openssl/openssl/blob/master/README-PROVIDERS.md such that default & legacy providers algorithms are not exposed at runtime. And this is how OpenSSL is configured in FIPS mode on Ubuntu Pro FIPS and Chainguard FIPS Images, and recommended by upstream. Please note internally md5 is used by go coverage, meaning in requirefips case coverage may fail to generate unless some additional APIs are introduced to allow insecure usage of md5 (equivalent to python's usedforsecurity=True|False flag). Or coverage ported to use SHA256. For a local reproducer use base+fips providers only, for example with following openssl.cnf: ``` config_diagnostics = 1 openssl_conf = openssl_init .include /etc/ssl/fipsmodule.cnf [openssl_init] providers = provider_sect alg_section = algorithm_sect [provider_sect] fips = fips_sect base = base_sect [base_sect] activate = 1 [algorithm_sect] default_properties = fips=yes ``` Or compile openssl without RC4 support by using 'no-rc4' configuration option. Signed-off-by: Dimitri John Ledkov <[email protected]>
The solution might be to support a new compiler option to not fallback to a default MD5 implementation when passing |
@xnox if you can please join the discourse discussion, I'd love some more feedback there to potentially figure out a solution for the FIPS related issues if that's feasible. |
Ok! I had more thoughts on how to do things, and also learning approaches other systems are doing. So I think better implementation can be done which is more intuitive. |
Bug report
Bug description:
When
--with-builtin-hashlib-hashes=blake2
to exclude fallback implementation of MD5upon importing hashlib fails to create MD5 construct.
Reference implementation is upstream openssl 3.3.0, with enable-fips, fipsinstall completed and openssl.cnf set to
In essence, things work well only when "default + fips" providers are loaded, as then MD5 functions in OpenSSL are detected as available and are used at runtime and correctly get blocked.
When only "base + fips" providers are loaded, ValueError is raised by OpenSSL constructor, and instead fallback implementation used from _md5 module if it was compiled in.
It seems like the above configuration was not tested, however it can be made to work. CPython should try to load the "default" OpenSSL provider, to guarantee access to non-fips hashes.
Security concerns
This is FedRAMP/FIPS compliance by-pass. This issue may allow using md5 without specifying "usedforsecurity=False" on systems otherwise configured to be in FIPS-mode only. And is the primary reason why documentation mentions that certain distributors of python remove md5 module altogether.
CPython versions tested on:
3.10, 3.11, 3.12
Operating systems tested on:
Linux
Linked PRs
The text was updated successfully, but these errors were encountered: