-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Add C++ modules support #2291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add C++ modules support #2291
Conversation
|
Great work. I also thought about this. The approach is very similiar to what was done to https://github.com/nlohmann/json with nlohmann/json#4799. Then, I noticed ... Is this temporary approach with a file with This approach is not using modules natively but rather as an interface to the original way. Does this method work without disadvantages? |
|
This is the best way to my knowledge to support modules on top of a header-only or header/source library, allowing continued support for older versions while providing newer features as an option. I'm not aware of any disadvantages to it besides a being additional translation unit to compile, but if I am wrong please correct me. The only glaring difference in API is that detail symbols are hidden as they are not exported, but in my opinion that's probably better not to expose detail symbols and flood IDE suggestions with implementation details. |
|
What about compiled libraries? Is it possible to have the traditional method and modules installed in parallel? I am thinking of repositories that ship compiled This is relevant here, https://aur.archlinux.org/packages/cpp-httplib-compiled . |
|
Yes I believe it's possible to use shared/static libraries with modules, all of my modular projects compiled to shared libraries that an executable consumes |
|
@mikomikotaishi, thanks for the fine pull request! It's fantastic, but my concern is that someone needs to update @sum01 @jimmy-park @Tachi107 do you have any thought about this pull request? |
|
I could create a Python script, or some other kinds of automated means of updating, which you could run every time it is updated. Until then I would be OK with maintaining this file, as it is a simple process. Such a script would probably comb through the file and add any symbols not part of a detail or internal namespace, or prefixed with an underscore, etc. However, I am curious why it is not feasible to update the file manually. In case it isn't clear how, one can update the file by adding a |
|
I have also seen some repositories use bots to push some commits too. Potentially one such bot could be set up to automatically populate the module with new changes each time there is a mismatch. I don't know anything about how to set this up, but I have seen this before and it could potentially be a solution (but I think the simplest one is just to run a Python script each time any update to the library happens). |
|
Anyway, I think this could be one such way of automatically updating the module. |
|
@mikomikotaishi thanks for the additional explanation. I am ok with the following your suggestion.
We could automatically generate |
|
OK, that makes sense to me. (I don't know anything about how to run GitHub Actions or write scripts for it however, so I'm afraid the most I can do is create a script for this.) |
|
I'm not sure why there were failing workflows as I didn't change anything in the core library |
|
Never mind, it seems the failing CI is happening upstream too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this GitHub Action. But after further consideration, I now think this process should be done in CMakeLists.txt. You could generate modules/httplib.cppm under this line
Line 216 in 87c2b4e
| if(HTTPLIB_COMPILE) |
We actually do the similar to generate httplib.h and httplib.cc with split.py and put them in a distribution package. By doing this, we no longer need to keep modules/httplib.cppm in this repository. It will be created on the fly only when necessary.
@sum01 @jimmy-park Is my comment above correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be difficult to parse an especially large file without syntax analysis or parsing library. I don't have a lot of experience with this sort of thing, and because the header contains things like method definitions outside of the class it seems to be a very complex task. Then including a C++ parser would require a dependency even for a source generation script which would be additional bloat.
I guess this could be done by attaching to the header file that split.py generates, but I haven't actually seen what that looks like.
As for dynamically generating the module, this was raised before on the Dear ImGui library which currently does something like this. I think this is a bad choice for module API (having a static file is obviously best for consumers to be able to just read it from a distance without additional interaction, i.e. a "what-you-see-is-what-you-get" sort of API), but ultimately as you are in charge I'll look into this if you prefer it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking more into this and I think if we want to allow CMake to compile the module if it's generated into an "out" directory, we have to force the out directory to be named "out" so that it can be written into the CMakeLists.txt script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway @yhirose do you have any opinion on the output directory having a hard-coded name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sum01 @jimmy-park @Tachi107 could you please answer this question?
|
If the file can be run during the build process (and if the output consists of machine-generated files, it should *only* run during build time), then the destination directory should be configurable (maybe defaulting to the the current working directory). Even better, if the output is a single file, the script should allow the user to specify the output file itself (full path).
This is because downstreams (like Debian, which is what I maintain the meson build scripts for) may have some requirements on where build products should be stored.
|
|
@Tachi107 CMake needs to know what the output directory is ahead of time to compile the module. How do you propose to solve this? |
|
@yhirose I think there is one possible solution to allow both the directory to be user-specified while still supporting CMake module building, which is probably just to have the Python script generate the CMake file too. I don't know if this is too convoluted or awkward of a design though, so please do tell me your thoughts. |
|
Hi Miko,
On Tue Dec 9, 2025 at 10:46 PM CET, Miko wrote:
@Tachi107 CMake needs to know what the output directory is ahead of time to compile the module. How do you propose to solve this?
You should use CMake's add_custom_command() function to invoke the
script and pass it the output file path
https://cmake.org/cmake/help/latest/command/add_custom_command.html
Meson has a similar function, but I can do that myself after this gets
merged.
|
Tachi107
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some feedback now that I have been able to look at the whole code :)
| target_sources(httplib_module | ||
| PUBLIC | ||
| FILE_SET CXX_MODULES FILES | ||
| httplib.cppm | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires CMake 3.28, but this project only requires CMake 3.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C++ modules require a later version of CMake, so activating modules should require the user has the correct version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, the proper minimum required version and/or policy behaviour should be declared in the relevant portions of the code.
I don’t use or maintain the CMake code of this project, so I’ll defer to @sum01’s feedback.
| option(HTTPLIB_USE_NON_BLOCKING_GETADDRINFO "Enables the non-blocking alternatives for getaddrinfo." ON) | ||
| option(HTTPLIB_REQUIRE_ZSTD "Requires ZSTD to be found & linked, or fails build." OFF) | ||
| option(HTTPLIB_USE_ZSTD_IF_AVAILABLE "Uses ZSTD (if available) to enable zstd support." ON) | ||
| option(HTTPLIB_BUILD_MODULES "Build httplib modules (requires HTTPLIB_COMPILE to be ON)." OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this need to be an option? Does it add any other dependency? Cannot it be always be built, without an option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to leave it as opt-in because C++ module support is compiler dependent. Most newer compiler versions offer it, but older ones do not. The build could break if we force it on older toolchains.
update_modules.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which Python version does this script require? Is there a reason why, for example, Optional[str] is used instead of the built-in str | None introduced by PEP 604?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just used to writing Optional[T], if T | None is better I can replace it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway I believe this requires a minimum of Python 3.5 now
update_modules.py
Outdated
| def get_git_diff(file_path: str, base_ref: str = "HEAD") -> Optional[str]: | ||
| """ | ||
| Get the git diff for a specific file. | ||
| @param file_path Path to the file to diff | ||
| @param base_ref Git reference to compare against (default: HEAD) | ||
| @return The git diff output, or None if error | ||
| """ | ||
| try: | ||
| result: CompletedProcess = subprocess.run( | ||
| ["git", "diff", base_ref, "--", file_path], | ||
| capture_output=True, | ||
| text=True, | ||
| check=True | ||
| ) | ||
| return result.stdout | ||
| except CalledProcessError as e: | ||
| print(f"Error getting git diff: {e}", file=sys.stderr) | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requiring a Git repository to built the modules is something many users would prefer to avoid. For example, users building from the release tarballs produced by GitHub will not have any Git history, and this would not work.
A way to resolve this would be to always generate the modules file from scratch, so that no diffing operations are required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the suggestion @yhirose gave, and while I personally would prefer to have a static module for the purpose of having an obvious or clear module API (rather than behind a script) this is the approach I'm taking because the author does not maintain the module file or build system scripts.
That being said, this Python script is only supposed to be run by a GitHub Action script, so it does rely on the assumption of Git setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikomikotaishi you probably misunderstood what I mentioned in the following comment. I said exactly the same thing as @Tachi107 mentioned above: "Could you please simply regenerate httplib.cppm from scratch with the latest httplib.h? It will make this script much simpler".
#2291 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is what I was saying, we would be generating it on demand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some problems I am having with implementing this, for example having to insert the #ifdefs to add to the generated module all of the SSL classes. Do you have any suggestions for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion... What I meant by 'from scratch' was to create httplib.cppm entirely, rather than just updating a portion of it. So we would like you not to use get_git_diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I meant that I would be regenerating it entirely. My only problem is how to generate the symbols with the aforementioned.
|
@mikomikotaishi Thanks for your willingness. However, I am now thinking of closing this pull request for the maintenance reason I commented on before. I would like to confirm that you will commit to updating modules/httplib.cppm whenever new external symbols are added or existing symbols are removed/renamed by contributors. I am not going to wait for this update to happen. This update must be done by you before I release a new version. Otherwise, httplib.h will be shipped with an incompatible, out-of-date httplib.cppm. If you accept this responsibility from now on, I will reconsider it. |
|
I will commit to this until I get a working script. (Seeing as you do ping contributors to request reviews, I hope it's not a steep ask for you to also ping me for symbol additions as well.) I'm sure it is possible - I have just had trouble with it. I can probably make a draft PR for that for the time being. I am curious if you have any objection to me using |
|
Actually, we could probably do this: module;
// include all included headers seen in <httplib.h>,
// to prevent them from being re-included in the export block
#include <...>
export module httplib;
export {
#include "../httplib.h"
}This way is far simpler, only problem is that this is sort of a hacky way to do it and also ends up exposing the |
|
Until then, this is probably the best way to do it with no maintenance overhead, so I've gone ahead and pushed that, instead. @yhirose could you re-review the current state and tell me your thoughts? |
|
On Mon Dec 29, 2025 at 4:40 AM CET, Miko wrote:
I am curious if you have any objection to me using `libclang` for
this? It does introduce a new dependency but it can be made
conditional. @Tachi107 seeing as you wrote this script originally, do
you have any objections to that?
I did not wrote that script originally :)
But, for the record, adding one level of indentation for the whole
split.py file just to have the code in a "main" function, with it being
the only one, doesn't look like a cleanup ;)
Don't fall in the trap of following best pracises just for the sake of
it.
On Mon Dec 29, 2025 at 4:46 AM CET, Miko wrote:
Actually, we could probably do this:
```cpp
module;
// include all headers seen in <httplib.h>,
// to prevent them from being re-included
Is this needed? Don't they have include guards already?
export module httplib;
export {
#include <httplib.h>
}
```
This way is far simpler, only problem is that this is sort of a hacky
way to do it and also ends up exposing the `httplib::detail::*`
symbols, which is preferably not exposed.
Couldn't you include the header produced by split.py, which contains
less stuff?
Still, this long thread leaves me with a question: is all this mess
worth it? How much does using C++20 modules instead of the header
produced by split.py improve things? Because if the improvement is
there it may make sense to pursue this. But if not, why bother?
|
|
Yes, we could probably use that header instead, and it probably is better to do that if we want to install the modules component from a package. As for why I am even proposing adding modules, it is because it is cheaper to use them by building once and then importing, than including multiple times and then rebuilding the contents each time (even after splitting the contents the header still has a couple thousand lines I believe). As another person mentioned, one could use precompiled headers, but they are not an official language feature. Also, because those headers have include guards I am including them first in the global module fragment before including httplib in the export block, to ensure that module httplib doesn't export a symbols like std::* symbols or structs from C POSIX headers, etc. |
|
On Mon Dec 29, 2025 at 1:17 PM CET, Miko wrote:
As for why I am even proposing adding modules, it is because it is
cheaper to use them by building once and then importing, than
including multiple times and then rebuilding the contents each time.
I'm aware of that. It is generally true. But in practise, how much does
this improve the building time for this library?
|
|
I have never benchmarked it. But I can say confidently that expecting a compilation time improvement is reasonable in importing it 100 times over including it 100 times. Even then, I don't think the current solution is a huge mess. I haven't changed the library itself or introduced peculiar artefacts into the header, the most extent this PR would have is
|
|
If the following simple approach works, I’m fine with it. I don’t mind exposing the module;
// include all included headers seen in <httplib.h>,
// to prevent them from being re-included in the export block
#include <...>
export module httplib;
export {
#include "../httplib.h"
}Also, please don’t forget to remove any files that are no longer needed, revert the changes in README.md, and add any necessary comments only to the CMake files. (I intentionally avoid mentioning any build systems in the README, as they are optional and not officially supported by cpp-httplib itself, but are kindly maintained by excellent contributors. Thanks for your understanding.) |
|
I did make this change, but we have to re-open the PR for it to appear here. I will update the README as requested. |
5c0e60e to
b3963ad
Compare
|
I've removed the changes to the README, but I did notice that the README mentions how to use the script to split the program into |
|
What if we put all detail namespaces inside an
namespace httplib {
// ...
#ifndef HTTPLIB_MODULE_HIDE_DETAILS
namespace detail {
// details here
}
#endif
// ...
}
module;
#define HTTPLIB_MODULE_HIDE_DETAILS
// all httplib includes
#include <...>
export module httplib;
export {
// Because HTTPLIB_MODULE_HIDE_DETAILS is defined, httplib::detail::* symbols aren't exported
#include "../httplib.h"
}@yhirose do let me know what you think, I think this wouldn't any more difficult to maintain moving forward, we would just have to ensure that |
|
Thanks, will review tomorrow |
|
@mikomikotaishi Please don't change httplib.h or README.md at all. Instead, just focus on creating the simpler httplib.cppm, making the smallest possible changes and adding only the minimum necessary usage to the CMake-related files. Thank you. |
|
Then the PR is basically in a finished state. |
I think the build script itself is fine. Meson has |
| install(TARGETS httplib_module | ||
| EXPORT ${PROJECT_NAME}Targets | ||
| LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR} | ||
| ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR} | ||
| RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} | ||
| FILE_SET CXX_MODULES DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/httplib/modules | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused: what are you installing here? The httplib.cppm file, or the built module interface?
In any case, creating a new httplib_module target feels wrong. The module should be added to the already-existing target, so that consumers of httplib::httplib automatically get the modules without having to chose one target or the others.
I know I asked already, but I guess this installes the .cppm file, and not the built module. It'd be nice to also install the built module interface, via the CXX_MODULES_BMI argument of the install(TARGETS) function.
Also, since this code will only run on recent CMake versions, all the various ARCHIVE/LIBRARY/... DESTINATION ceremony is unnecessary; they all have reasonable default values (and already use GNUInstallDirs).
Lastly, it might be easier to do all this in the top-level CMakeLists.txt file. Since pretty much all CMake code is already there (apart from tests), these 20-ish lines of code might as well be put there too.
| * Include all headers in httplib.h, to prevent them from being re-exported | ||
| * in the export block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will easily get out-of-sync with the main header file. Can't these includes and defines be sourced and generated from the httplib.h file directly, during the configure or build step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do this when I work on using the splitting script to use the reduced header.
| * but it will have to do until we figure out how to use a script to do it. | ||
| */ | ||
| export { | ||
| #include "../httplib.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned earlier, would it make sense to change this to the split .h, with less implementation details exposed, or does it require the full header?
In case the smaller header can be used instead, it might make sense to make this file a template httplib.cppm.in and configure it from CMake by including the header produced my execute_process() (which should really be changed to add_custom_command(OUTPUT), but that's a separate issue) via @VAR@ substitutions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to do this at some point with the smaller header, seeing as the the library has to be compiled anyway if the module has to be compiled.
969ea3e to
6da7f0c
Compare
This pull request adds support for C++20 modules through CMake. It is enabled by the
HTTPLIB_BUILD_MODULESoption (which requiresHTTPLIB_COMPILEto be enabled, though it probably doesn't have to - I only forced this requirement because it seems to make the most sense to force the library to compile if modules are to be compiled).