@@ -1320,6 +1320,280 @@ indirectly imported internal partition units are not reachable.
13201320The suggested approach for using an internal partition unit in Clang is
13211321to only import them in the implementation unit.
13221322
1323+ Using Clang Module Map to Avoid mixing #include and import problems
1324+ -------------------------------------------------------------------
1325+
1326+ .. note::
1327+ Discussion in this section is experimental.
1328+
1329+ Problems Background
1330+ ~~~~~~~~~~~~~~~~~~~
1331+
1332+ As discussed before, redeclaring the same entity in different TUs is one of the major problems
1333+ of using modules from the perspective of the compiler. The redeclaration pattern
1334+ is a major trigger of compiler bugs. Even when the compiler handles the redeclaration
1335+ pattern as expected, compilation performance still suffers.
1336+
1337+ For example:
1338+
1339+ .. code-block:: c++
1340+
1341+ // a.h
1342+ #pragma once
1343+ class A { ... };
1344+
1345+ // a.cppm
1346+ module;
1347+ #include "a.h"
1348+ export module a;
1349+ export using ::A;
1350+
1351+ // a.cc
1352+ import a;
1353+ #include "a.h"
1354+ A a;
1355+
1356+ Here in ``a.cc``, ``A`` is declared twice: once through ``a.cppm`` and once in ``a.cc``
1357+ itself.
1358+
1359+ To avoid the redeclaration pattern, the previous section suggested that users comment
1360+ out third-party headers manually.
1361+
1362+ Here we introduce another approach to avoid the redeclaration pattern: using a
1363+ Clang module map.
1364+
1365+ Clang Module Map Background
1366+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1367+
1368+ Clang Module Map is a feature of Clang Header Modules. See `Clang Module <Modules.html>`_
1369+ for a full introduction to Clang Header Modules. Here we introduce Clang Header
1370+ Modules only briefly, to keep this document self-contained.
1371+
1372+ Clang Implicit Header Modules
1373+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1374+
1375+ In Clang Implicit Header Module mode, Clang reads the module map, compiles the
1376+ headers listed in the module map into module files, and uses the module files automatically.
1377+ This sounds very nice, but due to its complexity it does not work so well in practice.
1378+ To stay correct, Clang conservatively compiles the same header in different preprocessor
1379+ contexts into different module files. This may in turn trigger the problem of
1380+ redeclarations in different TUs, so users of implicit header modules
1381+ have to design their module system carefully, bottom up. In addition, Clang implicit header modules
1382+ `have many issues with soundness and performance due to tradeoffs made for module
1383+ reuse and filesystem contention
1384+ <https://discourse.llvm.org/t/clang-modules-build-daemon-build-system-agnostic-support-for-explicitly-built-modules>`_.
1385+
1386+ Clang Explicit Header Modules
1387+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1388+
1389+ Clang explicit header modules offload the job of creating and managing module files
1390+ to the build system. Since C++20 modules and Clang header modules share the
1391+ same underlying implementation, it is possible to reuse the Clang module
1392+ map interface for C++20 named modules.
1393+
1394+ Technically, Clang Explicit Header Modules may be able to solve the redeclaration problem.
1395+ Recall the example above:
1396+
1397+
1398+
1399+ .. code-block:: c++
1400+
1401+ // a.h
1402+ #pragma once
1403+ class A { ... };
1404+
1405+ // a.cppm
1406+ module;
1407+ #include "a.h"
1408+ export module a;
1409+ export using ::A;
1410+
1411+ // a.cc
1412+ import a;
1413+ #include "a.h"
1414+ A a;
1415+
1416+ The build system can build the header into a module file and use it in both ``a.cppm`` and ``a.cc``.
1417+ Then there is no redeclaration in the example: all declarations of ``class A`` come from the
1418+ synthesized TU ``a.h``.
1419+
1420+ But there are problems: (1) the build system needs to support Clang explicit modules.
1421+ (2) The interaction between Clang named modules and Clang header modules is theoretically fine but
1422+ not verified in practice. Also, since this document is about standard C++ modules, we won't
1423+ expand on it here.
1424+
1425+ Examples
1426+ ~~~~~~~~
1427+
1428+ To use Clang Module Map for C++20 Named Modules, end users have to wait for support
1429+ from build systems. Here we leave build systems aside to help users understand the
1430+ mechanism.
1431+
1432+ Here is an example of using a Clang module map to replace the inclusion of a header with an import of a module.
1433+
1434+ .. code-block:: c++
1435+
1436+ // a.h
1437+ #pragma once
1438+ static_assert(false, "don't include a.h");
1439+
1440+ // main.cpp
1441+ #include "a.h"
1442+ int main() {
1443+ return 0;
1444+ }
1445+
1446+ // a.cppm
1447+ module;
1448+ #include <iostream>
1449+ export module a;
1450+ struct Init {
1451+ Init() {
1452+ std::cout << "Module 'a' got imported" << std::endl;
1453+ }
1454+ };
1455+ Init a;
1456+
1457+ // a.cppm.modulemap
1458+ module a {
1459+ header "a.h"
1460+ }
1461+
1462+ Then invoke Clang with:
1463+
1464+ .. code-block:: console
1465+
1466+ $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -o a.o
1467+ $ clang++ -std=c++20 main.cpp -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm a.o -o main
1468+ $ ./main
1469+ Module 'a' got imported
1470+
1471+ We can see that the header file ``a.h`` is not actually included (otherwise the compilation would fail due to the static assert).
1472+ Instead, the module ``a`` is imported and the variable in module ``a`` gets initialized.
1473+
1474+ The secret is the flag ``-fmodule-map-file=a.cppm.modulemap``. The content of ``a.cppm.modulemap`` says:
1475+ map the ``#include`` of ``a.h`` to an import of module ``a``. So when the compiler sees ``#include "a.h"``, it
1476+ does not actually include ``a.h`` but tries to import the module ``a`` instead. From the command line option ``-fmodule-file=a=a.pcm``,
1477+ the compiler gets the module file of module ``a``, so that module file is imported and the inclusion of ``a.h``
1478+ is skipped.
1479+
1480+ We can then use this mechanism to avoid the redeclaration pattern for modules that wrap headers.
1481+
1482+ .. code-block:: c++
1483+
1484+ // a.h
1485+ #pragma once
1486+ class A { ... };
1487+
1488+ // a.cppm
1489+ module;
1490+ #include "a.h"
1491+ export module a;
1492+ export using ::A;
1493+
1494+ // a.cc
1495+ import a;
1496+ #include "a.h"
1497+ A a;
1498+
1499+ // a.cppm.modulemap
1500+ module a {
1501+ header "a.h"
1502+ }
1503+
1504+ Similarly, if we add the flag ``-fmodule-map-file=a.cppm.modulemap`` when compiling ``a.cc``, the compiler
1505+ maps the inclusion of ``a.h`` to an import of module ``a``, which is already imported.
1506+ So we avoid the redeclaration of class ``A`` in ``a.cc``.
1507+
1508+ A foreseeable problem with this approach may be hidden inclusions. For example:
1509+
1510+ .. code-block:: c++
1511+
1512+ // b.h
1513+ #pragma once
1514+ struct B {};
1515+
1516+ // a.h
1517+ #pragma once
1518+ #include "b.h"
1519+ struct A { B b; };
1520+
1521+ // b.cppm
1522+ export module b;
1523+ export extern "C++" struct B { };
1524+
1525+ // a.cppm
1526+ export module a;
1527+ import b;
1528+ export extern "C++" struct A { B b; };
1529+
1530+ // test.cc
1531+ import a;
1532+ #include "a.h"
1533+ A a;
1534+ B b;
1535+
1536+ // a.cppm.modulemap
1537+ module a {
1538+ header "a.h"
1539+ }
1540+
1541+ // b.cppm.modulemap
1542+ module b {
1543+ header "b.h"
1544+ }
1545+
1546+ The example is valid if we don't use the module map:
1547+
1548+ .. code-block:: console
1549+
1550+ $ clang++ -std=c++20 b.cppm -c -fmodule-output=b.pcm -o b.o
1551+ $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -fmodule-file=b=b.pcm -o a.o
1552+ $ clang++ -std=c++20 test.cc -fmodule-file=a=a.pcm -fmodule-file=b=b.pcm -fsyntax-only
1553+
1554+ But if we enable the module map, the example is invalid:
1555+
1556+ .. code-block:: console
1557+
1558+ $ clang++ -std=c++20 test.cc -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm -fmodule-map-file=b.cppm.modulemap -fmodule-file=b=b.pcm -fsyntax-only
1559+ test.cc:4:1: error: declaration of 'B' must be imported from module 'b' before it is required
1560+ 4 | B b;
1561+ | ^
1562+ b.cppm:2:28: note: declaration here is not visible
1563+ 2 | export extern "C++" struct B { };
1564+ | ^
1565+ 1 error generated.
1566+
1567+ A suggested convention for end users and build systems
1568+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1569+
1570+ As mentioned above, the build system plays a vital role in this strategy.
1571+ However, it is not easy for build systems to support Clang explicit header modules or
1572+ to support module maps with C++20 named modules in general. The complexity for build systems
1573+ would be no less than that of supporting C++20 named modules.
1574+
1575+ So here we suggest a convention between end users and build systems to ease the implementation
1576+ burden on build systems and to help end users avoid the redeclaration problem caused by mixing ``#include``
1577+ and ``import``.
1578+
1579+ For end users who are authors of header-based libraries offering named module wrappers: the header's interface
1580+ should be a subset of the module interface, excluding user-facing macros.
1581+
1582+ * Extract all user facing headers into a single header file. Since C++20 named modules
1583+ * For each named module interface, provide a module map file to map the interface headers to the named module. The name of the module map should be the name of the module interface unit plus ``.modulemap``.
1584+
1585+ The number of module maps should not be large since this is still a
1586+ header-based library.
1587+
1588+ For build systems:
1589+
1590+ * For each translation unit (TU), if the unit doesn't import any named modules, stop: such a TU is not relevant here.
1591+ * If the TU imports named modules, for each imported named module unit, look for a module map file in the same directory as the imported module unit, named after the module unit plus ``.modulemap``. For example, if the name of the module unit is ``a.cppm``, look for ``a.cppm.modulemap``.
1592+ * For each module map found, pass ``-fmodule-map-file=<module_map_file_path>`` to the Clang compiler.
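
The build-system steps above can be sketched roughly as follows. This is only an
illustration, not part of Clang or of any existing build system: the helper name
``moduleMapFlags``, the ``FileExists`` callback, and the assumption that the build
system already knows the paths of the module units imported by a TU are all
hypothetical.

.. code-block:: c++

  #include <cassert>
  #include <filesystem>
  #include <functional>
  #include <string>
  #include <vector>

  namespace fs = std::filesystem;

  // Hypothetical callback so that the file lookup can be replaced in tests.
  using FileExists = std::function<bool(const fs::path &)>;

  // Hypothetical helper: given the module units imported by one TU, return
  // the extra ``-fmodule-map-file=`` flags suggested by the convention.
  std::vector<std::string> moduleMapFlags(
      const std::vector<fs::path> &importedUnits,
      const FileExists &fileExists = [](const fs::path &p) {
        return fs::is_regular_file(p);
      }) {
    std::vector<std::string> flags;
    // A TU that imports no named modules yields an empty flag list.
    for (const fs::path &unit : importedUnits) {
      assert(!unit.empty() && "imported module unit path must not be empty");
      // Look next to the module unit for "<unit file name>.modulemap",
      // e.g. "a.cppm" -> "a.cppm.modulemap".
      fs::path candidate = unit;
      candidate += ".modulemap";
      // Only module maps that actually exist produce a flag.
      if (fileExists(candidate))
        flags.push_back("-fmodule-map-file=" + candidate.string());
    }
    return flags;
  }

Under these assumptions, the build system would append the returned flags to the
Clang command line of the TU, in addition to the ``-fmodule-file=`` flags it
already passes for the imported named modules.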
1593+
1594+ The point of this approach is that the build system can reuse the results of C++20 named modules to manage dependencies, so
1595+ the implementation burden on build systems is largely reduced.
1596+
13231597Known Issues
13241598------------
13251599