Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback md5 is used, when trying hard to only use OpenSSL #118224

Open
xnox opened this issue Apr 24, 2024 · 24 comments · Fixed by wolfi-dev/os#17570
Open

Fallback md5 is used, when trying hard to only use OpenSSL #118224

xnox opened this issue Apr 24, 2024 · 24 comments · Fixed by wolfi-dev/os#17570
Labels
type-bug An unexpected behavior, bug, or error

Comments

@xnox
Copy link

xnox commented Apr 24, 2024

Bug report

Bug description:

When

  • OpenSSL is configured in FIPS mode
  • recommended config is used to only load "base + fips" providers
  • without the default provider
  • CPython is compiled with --with-builtin-hashlib-hashes=blake2 to exclude fallback implementation of MD5

upon importing hashlib fails to create MD5 construct.

# python3.10 -c 'import hashlib'
ERROR:root:code for hash md5 was not found.
Traceback (most recent call last):
  File "/usr/lib/python3.10/hashlib.py", line 137, in __get_openssl_constructor
    f(usedforsecurity=False)
ValueError: [digital envelope routines] unsupported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/hashlib.py", line 261, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "/usr/lib/python3.10/hashlib.py", line 141, in __get_openssl_constructor
    return __get_builtin_constructor(name)
  File "/usr/lib/python3.10/hashlib.py", line 123, in __get_builtin_constructor
    raise ValueError('unsupported hash type ' + name)
ValueError: unsupported hash type md5

Reference implementation is upstream openssl 3.3.0, with enable-fips, fipsinstall completed and openssl.cnf set to

# cat /etc/ssl/openssl.cnf 
config_diagnostics = 1
openssl_conf = openssl_init

.include /etc/ssl/fipsmodule.cnf

[openssl_init]
providers = provider_sect
alg_section = algorithm_sect

[provider_sect]
fips = fips_sect
base = base_sect

[base_sect]
activate = 1

[algorithm_sect]
default_properties = fips=yes

In essence, things work well only when "default + fips" providers are loaded, as then MD5 functions in OpenSSL are detected as available and are used at runtime and correctly get blocked.

When only "base + fips" providers are loaded, ValueError is raised by OpenSSL constructor, and instead fallback implementation used from _md5 module if it was compiled in.

It seems like the above configuration was not tested, however it can be made to work. CPython should try to load the "default" OpenSSL provider, to guarantee access to non-fips hashes.

Security concerns

This is FedRAMP/FIPS compliance by-pass. This issue may allow using md5 without specifying "usedforsecurity=False" on systems otherwise configured to be in FIPS-mode only. And is the primary reason why documentation mentions that certain distributors of python remove md5 module altogether.

CPython versions tested on:

3.10, 3.11, 3.12

Operating systems tested on:

Linux

Linked PRs

@xnox xnox added the type-bug An unexpected behavior, bug, or error label Apr 24, 2024
xnox added a commit to xnox/cpython that referenced this issue Apr 24, 2024
… algorithms

When OpenSSL is configured to only load "base+fips" providers into the
Null library context, md5 might not be available at all. In such cases
currently CPython fallsback to internal hashlib implementation is
there is one - as there might not be if one compiles python with
--with-builtin-hashlib-hashes=blake2. With this change "default"
provider is attempted to be loaded to access nonsecurity hashes.
@xnox xnox changed the title Fallback md5 is used, when trying hard not to allow it Fallback md5 is used, when trying hard to only use OpenSSL Apr 24, 2024
xnox added a commit to xnox/cpython that referenced this issue Apr 24, 2024
… algorithms

When OpenSSL is configured to only load "base+fips" providers into the
Null library context, md5 might not be available at all. In such cases
currently CPython fallsback to internal hashlib implementation is
there is one - as there might not be if one compiles python with
--with-builtin-hashlib-hashes=blake2. With this change "default"
provider is attempted to be loaded to access nonsecurity hashes.
xnox added a commit to xnox/os that referenced this issue Apr 24, 2024
Compile python without hashlib fallbacks for NIST algorithms, as they
are slower and allow to bypass OpenSSL implementations in certain
conditions.

Fix OpenSSL base and fips provider usage for unapproved algorithms,
such that one can use MD5 for non-security. And vice versa.

Fixes: python/cpython#118224

Pathes are backports linked in the above upstream issue.

Signed-off-by: Dimitri John Ledkov <[email protected]>
xnox added a commit to xnox/os that referenced this issue Apr 24, 2024
Compile python without hashlib fallbacks for NIST algorithms, as they
are slower and allow to bypass OpenSSL implementations in certain
conditions.

Fix OpenSSL base and fips provider usage for unapproved algorithms,
such that one can use MD5 for non-security. And vice versa.

Fixes: python/cpython#118224

Patches are backports linked in the above upstream issue.

Signed-off-by: Dimitri John Ledkov <[email protected]>
xnox added a commit to xnox/cpython that referenced this issue Apr 25, 2024
…algorithms

When OpenSSL is configured to only load "base+fips" providers into the
Null library context, md5 might not be available at all. In such cases
currently CPython fallsback to internal hashlib implementation is
there is one - as there might not be if one compiles python with
--with-builtin-hashlib-hashes=blake2. With this change "default"
provider is attempted to be loaded to access nonsecurity hashes.
@xnox xnox reopened this Apr 26, 2024
@encukou
Copy link
Member

encukou commented May 7, 2024

By --with-builtin-hashlib-hashes=blake2, you are explicitly saying that you do not want Python to provide md5. By setting base & fips in openssl.cnf, you're saying that you don't want OpenSSL to provide md5. Taken together, you've disabled md5.

I don't see why should Python should not honor those decisions. If you want md5, why don't you enable one of the things that provide it?

@xnox
Copy link
Author

xnox commented May 7, 2024

By --with-builtin-hashlib-hashes=blake2, you are explicitly saying that you do not want Python to provide md5. By setting base & fips in openssl.cnf, you're saying that you don't want OpenSSL to provide md5. Taken together, you've disabled md5.

I don't see why should Python should not honor those decisions. If you want md5, why don't you enable one of the things that provide it?

It is example of how to trigger the bug. The compliance bypass is that python silently fallsback to internal md5 (when compiled --with-builtin-hashlib-hashes=md5, aka default) when openssl one is incorrectly deemed unavailable (due to provider selection). When in fact, for non security purposes it is there and accessible (aka usage with indicator).

Note, openssl here is not compiled with no-md5.

@encukou
Copy link
Member

encukou commented May 7, 2024

If you believe that this is as security issue in python, please contact the security team -- see the security page (as linked from the Security policy links on GitHub).

But, I don't see the “compliance bypass”. CPython is not FIPS compliant; there isn't anything to bypass. There are some tools/options that make building a FIPS-certifiable CPython easier, but we're not really in a position to test that they do give you a compliant build.
Asking CPython to bypass system OpenSSL configuration, in order to support a build we don't really test in CI, is... a big thing to ask.

Note that building with --with-builtin-hashlib-hashes=blake2 could also be a compliance bypass. (Or are you planning to get that implementation FIPS-certified?)

cc @SethMichaelLarson

See also recent FIPS discussion.

@xnox
Copy link
Author

xnox commented May 7, 2024

If you believe that this is as security issue in python, please contact the security team -- see the security page (as linked from the Security policy links on GitHub).

But, I don't see the “compliance bypass”. CPython is not FIPS compliant; there isn't anything to bypass. There are some tools/options that make building a FIPS-certifiable CPython easier, but we're not really in a position to test that they do give you a compliant build. Asking CPython to bypass system OpenSSL configuration, in order to support a build we don't really test in CI, is... a big thing to ask.

The CI tests are already there for the FIPS mode. If desired I can add the extra matrix test case to address this issue as an explicit check.

I'm talking about honoring settings for the NIST algorithms, and using accelerated code-paths by default. And ensuring that existing functionality usedforsecurity=True/False api of hashlib.md5 actually works correctly as documented. Without changing any other behaviour of any other algorithms. When used with openssl as a cryptographic module, or when used with generic one.

Note that building with --with-builtin-hashlib-hashes=blake2 could also be a compliance bypass. (Or are you planning to get that implementation FIPS-certified?)

FIPS is consent based. Blocking usage of unapproved algorithms is relevant within a cryptographic module only. And those that want to certify cpython they can choose themselves if they want to block blake2 usage out right, or if they want to implement a runtime FIPS indicator w.r.t. usage of blake2 (such that apps that use blake2 are indicated as running in unapproved mode) which is a possibility under FIPS certification.

cc @SethMichaelLarson

See also recent FIPS discussion.

That discussion takes a narrow view of FIPS certification. Hashes that are not used for security purposes do not need to be FIPS certified (ie. siphash, xxhash64, and so on are perfectly fine to use for pyc hashing; and for key-value dictionary implementations). As long as said things are not used to protect data at rest (e.g. file encryption) / in transit (e.g. TLS) / used for authentication (e.g. HTTP basic-auth).

@xnox
Copy link
Author

xnox commented May 7, 2024

If you believe that this is as security issue in python

it's not a CVE in a classic sense of it. As compliance is frankly arbitrary and conflicting. (i.e. FIPS wants to use FIPS algorithms, and somebody else wants to use Cammelia, and someone else wants to use GOST).

My patch here, is scoped to ensure that MD5 (and in the future SHA1) are treated fairly and correctly: honoring system configuration, using accelerated (openssl) codepath, and have usedforsecurity=True/False operate as documented.

I'm not trying to exert any opinion of how one should configure the cpython builds; nor how one should configure their openssl; or how one should configure their FIPS modules. And ensure that usedforsecurity=False works correctly in more combinations.

@encukou
Copy link
Member

encukou commented May 7, 2024

I'll need to leave this to security experts. Hopefully the discussion is useful.

One thing I don't get: why don't you load “default” in the OpenSSL config?
AFAIK, md5 works (with usedforsecurity=False only) when you use the “default+fips” providers rather than “base+fips”. I don't see a reason to use the latter, other than not wanting to use “default”.

@xnox
Copy link
Author

xnox commented May 7, 2024

I'll need to leave this to security experts. Hopefully the discussion is useful.

One thing I don't get: why don't you load “default” in the OpenSSL config? AFAIK, md5 works (with usedforsecurity=False only) when you use the “default+fips” providers rather than “base+fips”. I don't see a reason to use the latter, other than not wanting to use “default”.

Please read carefully https://github.com/openssl/openssl/blob/master/README-PROVIDERS.md and especially the note w.r.t. FIPS provider under base provider.

if no providers are loaded - a fallback one is loaded, in vanilla upstream it is the default;
if explicit providers are provided (i.e. null) only those are loaded, as per fips provider documentation it is recommended to use fips+base;
many fips distributors of openssl actually make base+fips the fallback, rather than default - such that without any config default provider is not loaded, but fips+base is instead; lots of people misconfigure this and needlessly load base+default or default+fips.

I find it sad that "base" provider is a subset of "default" provider, as that is very confusing duplication of code at runtime.

Loding default+fips provider instead of base+fips provider has 3.5MB runtime memory usage penalty in most common cases, due to loadinging/init/iterating all the algorithms as available; which are then later filtered out by the "fips=yes" query.

The cpython CI tests as implemented right now, also do not currently setup openssl.cnf in a way that is most comonly deployed by openssl-fips distributors. I.e. when compared with "canonical openssl fips" or "redhat openssl fips" or even "openssl project fips - as per recommended docs"

@encukou
Copy link
Member

encukou commented May 8, 2024

I've read the documentation, and I don't see any hint of languages (or frameworks/libraries) being expected to silently load the default provider when asked for md5.

From Python's point of view, the user has made two explicit choices:

  • Python is configured to disable md5.
  • OpenSSL is configured to disable md5.

I do not think it's Python's place to silently override these choices.

The “user” here might be a redistributor or OS author. I get that this sucks for the end user, but Python is not at fault for md5 not being available. You should complain to whoever disabled it.

Feel free to post on Discourse to seek a second opinion.

I would consider a PR that exposes OSSL_PROVIDER_load -- that is, allow the user to override system & build-time defaults with another explicit choice.

@xnox
Copy link
Author

xnox commented May 8, 2024

I've read the documentation, and I don't see any hint of languages (or frameworks/libraries) being expected to silently load the default provider when asked for md5.

From Python's point of view, the user has made two explicit choices:

  • Python is configured to disable md5.
  • OpenSSL is configured to disable md5.

I do not think it's Python's place to silently override these choices.

The “user” here might be a redistributor or OS author. I get that this sucks for the end user, but Python is not at fault for md5 not being available. You should complain to whoever disabled it.

Feel free to post on Discourse to seek a second opinion.

I would consider a PR that exposes OSSL_PROVIDER_load -- that is, allow the user to override system & build-time defaults with another explicit choice.

Ok fine.

But what about this case:

  • openssl configured in fips mode without md5 using base+fips providers only
  • python built with builtin md5 fallback
  • python today, skips openssl codepath and uses builtin md5 with usedforsecurity=True

Surely, cpython should not fallback to builtin md5 when openssl is configured in fips mode without md5. And only do so when usedforsecurity=False?

I.e. I don't see cpython today checking what the default context query string is / default intention. It is checking if "md5" is resolved or not, but it is not checking why it was not resolved.

@encukou
Copy link
Member

encukou commented May 9, 2024

That sounds more reasonable, though it might be harder to implement.
IMO, it'd be best to document to build Python without md5, and require OpenSSL, if you plan to enable FIPS mode.

@xnox
Copy link
Author

xnox commented May 9, 2024

That sounds more reasonable, though it might be harder to implement. IMO, it'd be best to document to build Python without md5, and require OpenSSL, if you plan to enable FIPS mode.

True, that blocks using md5 for secure purposes (encryption / authentication / signing / kdf / etc). But will not enable using md5 for non-security purposes (e.g. identifying all court documents that are referenced by md5) as usedforsecurity=False will be broken, unless things are done the way I am proposing it.

Separately, I find it odd that "import hashlib" tries to import and initialize "hashlib.md5". I wonder if it is time to start to require "hashlib.md5" and "hashlib.sha1" as imports.... or at least lazy inits.

@gpshead
Copy link
Member

gpshead commented May 13, 2024

@xnox
Copy link
Author

xnox commented May 13, 2024

I do not think it's Python's place to silently override these choices.

is it expected for hashlib.md5 to be always available? why is it always loaded upon hashlib import? Can it be made a lazy import? The documentation has some strong words about shipping python without hashlib.md5.

@encukou
Copy link
Member

encukou commented May 29, 2024

That's a good question. I guess CPython can go two ways:

  • Deprecate hashlib.algorithms_guaranteed, as it's definitely possible to build Python without some of the algorithms, or
  • Add strong language to --with-builtin-hashlib-hashes, saying that if you break hashlib.algorithms_guaranteed or algorithms_available, you're on the hook for explaining it to your users.

@xnox
Copy link
Author

xnox commented May 29, 2024

you're on the hook for explaining it to your users

Let me try to make this prettier, as the current traceback is very confusing and doesn't explain what is happening (it looks like python internals are broken, rather than runtime expectations do not match the ones that were assumed at build time - i.e. w.r.t. availability of md5).

@shashankram
Copy link

For FIPS, it's essential that hashlib.md5 doesn't get initialized upon import hashlib. What's the guidance to get around this and prevent this import error?
How do you get around the fact that modules may be directly consuming md5? E.g.
https://github.com/encode/httpx/blob/7c0cda153d301bde9a011e1dd7157d7e2b20889d/httpx/_auth.py#L177

@encukou
Copy link
Member

encukou commented Aug 1, 2024

That's really up to the organization that seeks FIPS certification for the system that contains the module.

AFAIK, things like “FIPS modes” are just ways to make it easier to create and audit a system that meets FIPS requirements. It's hard to give guidance for a specific module without knowing the rest of the system.

Anyway, I doubt that hashlib.md5 can't get initialized: you can still use md5 with usedforsecurity=False (but you need to audit all such usage).

@xnox
Copy link
Author

xnox commented Aug 1, 2024

For FIPS, it's essential that hashlib.md5 doesn't get initialized upon import hashlib. What's the guidance to get around this and prevent this import error? How do you get around the fact that modules may be directly consuming md5? E.g. https://github.com/encode/httpx/blob/7c0cda153d301bde9a011e1dd7157d7e2b20889d/httpx/_auth.py#L177

This is going to be a shameless plug - each distributions choose their own way. In Chainguard Python FIPS Image , I chose to implement this runtime behavior:

$ docker run -ti --entrypoint python cgr.dev/chainguard-private/python-fips:latest-dev
Python 3.12.4 (tags/v3.12.4-dirty:8e8a4ba, Jun 26 2024, 12:07:40) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> hashlib.md5
<built-in function openssl_md5>
>>> hashlib.md5()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_hashlib.UnsupportedDigestmodError: [digital envelope routines] unsupported
>>> hashlib.md5(usedforsecurity=False)
<md5 _hashlib.HASH object @ 0x7b3e6ba96930>
>>> 

@encukou
Copy link
Member

encukou commented Aug 12, 2024

That's pretty much what RHEL does too.

@xnox
Copy link
Author

xnox commented Aug 12, 2024

If that is the consensus among most Linux distributions shipping FIPS and python..... could we get this upstream into cpython? Because vendor specific behaviour, even when all of them agree in FIPS mode, is counter productive.

@encukou
Copy link
Member

encukou commented Aug 19, 2024

Chime in here: https://discuss.python.org/t/python-and-openssl-fips-mode/51389/7
Yes, I imagine that having more than one vendor work on this would make it more acceptable.
Currently it's hard to find the boundary between stuff all FIPS distros need and platform-specific quirks. (On top of the problems of a feature that benefits US government suppliers but is at least inconvenient for almost everyone else.)

xnox added a commit to xnox/microsoft-go that referenced this issue Sep 19, 2024
In many instances systemcrypto backend is tested for support of a
particular algorithm. If it is supported, it is used, otherwise
gocrypto fallback code path is used. This means effectively the
toolchain allows to use MD5 DES TripleDES Ed25519 RC4 HKDF TLS1PF
implemented with gocrypto, when FIPS module blocks these algorithms or
doesn't even support them (i.e. when only base+fips providers are
loaded).

In some cases this might be due to incorrect check and/or incorrect
runtime configuration of OpenSSL. It is very common to accidentaly
activate "default" and "fips" providers in OpenSSL at the same time -
which then exhibits odd properties. Specifically "default+fips"
providers will list that RC4 and MD5 are supported without any
property query strings. But fail at runtime when attempted to be used
with property query string set to "fips=yes".

If on the other hand "base" and "fips" providers loaded alone, RC4 and
MD5 will not be listed as runtime available, and gocrypto fallback
path may be taken by the toolchain.

A similar issue is currently also present in cpython please see
python/cpython#118224.

Note that recommended way to configure OpenSSL in fips only mode is
with base+fips providers alone - see
https://github.com/openssl/openssl/blob/master/README-PROVIDERS.md
such that default & legacy providers algorithms are not exposed at
runtime. And this is how OpenSSL is configured in FIPS mode on Ubuntu
Pro FIPS and Chainguard FIPS Images, and recommended by upstream.

Please note internally md5 is used by go coverage, meaning in
requirefips case coverage may fail to generate unless some additional
APIs are introduced to allow insecure usage of md5 (equivalent to
python's usedforsecurity=True|False flag). Or coverage ported to use
SHA256.

For a local reproducer use base+fips providers only, for example with following openssl.cnf:

```
config_diagnostics = 1
openssl_conf = openssl_init

.include /etc/ssl/fipsmodule.cnf

[openssl_init]
providers = provider_sect
alg_section = algorithm_sect

[provider_sect]
fips = fips_sect
base = base_sect

[base_sect]
activate = 1

[algorithm_sect]
default_properties = fips=yes
```

Or compile openssl without RC4 support by using 'no-rc4' configuration
option.

Signed-off-by: Dimitri John Ledkov <[email protected]>
@lVlayhem
Copy link

The solution might be to support a new compiler option to not fallback to a default MD5 implementation when passing hashlib.md5(usedforsecurity=True)

@stratakis
Copy link
Contributor

@xnox if you can please join the discourse discussion, I'd love some more feedback there to potentially figure out a solution for the FIPS related issues if that's feasible.

@xnox
Copy link
Author

xnox commented Oct 11, 2024

Ok! I had more thoughts on how to do things, and also learning approaches other systems are doing. So I think better implementation can be done which is more intuitive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants