Skip to content

Conversation

@florianschmidt1994
Copy link

@florianschmidt1994 florianschmidt1994 commented Jan 8, 2025

This PR adds a config option called COB_LOAD_GLOBAL which allows the flags of dlopen used in call.c to be changed between RTLD_GLOBAL and RTLD_LOCAL (see https://linux.die.net/man/3/dlopen)

Motivation

This flag is motivated by an ambition to be able to write a multi-threaded c program, that then loads libcob with dlmopen in one namespace per thread, effectively allowing us to call the same libcob / Cobol module from multiple threads in a "thread-safe" way. The image below shows the a very simplified intended call sequence

CanvasAsImage

Because there is a "bug" in dlopen that stops us from calling dlopen("module", RTLD_GLOBAL), this PR introduces an option to be able to change this flag to RTLD_LOCAL at runtime.

As at glibc 2.24, specifying the RTLD_GLOBAL flag when calling dlmopen() generates an error. Furthermore, specifying RTLD_GLOBAL when calling dlopen() results in a program crash (SIGSEGV) if the call is made from any object loaded in a namespace other than the initial namespace.

Docs

# Environment name:  COB_LOAD_GLOBAL
#   Parameter name:  load_global
#          Purpose:  tell the system loader to provide symbols in CALLed
#                    programs globally, allowing symbols to be found for linked
#                    libraries later
#             Type:  boolean
#             Note:  COBOL CALLs will always find symbols in already CALLed or
#                    pre-modules; this setting is mostly an advise to the system,
#                    not all systems are capable of loading libraries global/local
#          Default:  false
#          Example:  load_global true

Implementation

  • Change lt_dlopen from a macro definition to an actual function, which internally calls dlopen with the flags based on the configuration
  • Add cob_load_global / COB_LOAD_GLOBAL as a configuration
  • Add documentation to runtime.cfg about COB_LOAD_GLOBAL

@GitMensch
Copy link
Collaborator

Apart from the other PR's review to that branch, the change here makes the current scenario even more confusing.
I'd like to fix this in this PR (or up-front upstream).

After the changes:

  • instead of calling lt_dlopen we call a new static function cob_dlopen
  • this has the current code of this PR in for the case of dlopen usage
  • for the WIN32 variant it uses what is currently used there
  • for the libtool variant it uses lt_dlopenadvise() with an advise matching the runtime setting

@florianschmidt1994 can you please inspect the libtool variant (to test this you can simply undefine USE_LIBDL in config.h)?
I think I'll work on the cob_dlopen stuff late-evening.

@florianschmidt1994
Copy link
Author

florianschmidt1994 commented Jan 8, 2025

Yup, I'll check it out. Basically something like this, right?

static void* cob_dlopen(const char* filename) {
	#ifdef _WIN32
	if (x == NULL) {
		return GetModuleHandle (NULL);
	}
	return LoadLibrary(x);	
	#elif	defined(USE_LIBDL)
	int flags = cobsetptr->cob_load_global 
		? RTLD_LAZY | RTLD_GLOBAL
		: RTLD_LAZY | RTLD_LOCAL;

	return dlopen(filename, flags);
	# else
	// TODO: initialise lt_dladvise
	lt_dlopenadvise(filename, /*advise here*/);
}

@florianschmidt1994
Copy link
Author

@GitMensch I've tried out your suggestion in the latest commit and temporarily defined lt_dlopen as cob_dlopen to try it out. I've also added some indentation to the macros, does that conform with your style or should I remove them again?

Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice update, we may be able to push that soon

Note: we still need the testcase addition - does not need to be something that benefit from this as this is hard to make portable - just do a simple CALL and CANCEL. We should have an easy one for COB_LOAD_CASE

@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from f6aa350 to ab26e16 Compare January 9, 2025 09:31
@florianschmidt1994
Copy link
Author

florianschmidt1994 commented Jan 13, 2025

When developing, I've noticed some test failures, where some of them might be related to our change. On the latest commit of this repositories "main branch" (gcos4gnucobol-3.x), there are some test failures when the build is configured with --without-dl. These failures persist also with our changes applied.

Commit Hash Loader COB_LOAD_GLOBAL Test Failures
731b81a --without-dl not applicable 47 49 50 784 failed
731b81a not applicable no test failure

In the first version of this PR (bd80e8c), there is an additional test failure (781) when using COB_LOAD_GLOBAL=false, i.e. the default, and preloading where symbols are overwritten in an unexpected manner.

Commit Hash Loader COB_LOAD_GLOBAL Test Failures
bd80e8c false 781 failed
bd80e8c true no test failure
bd80e8c --without-dl false 47 49 50 781 failed
bd80e8c --without-dl true 47 49 50 failed

In history up to the latest commit (62de625), there exists a fix (see 1f4d607), the same that is already applied in WIN32 environments, to fix this.

Commit Hash Loader COB_LOAD_GLOBAL Test Failures
62de625 false no test failure
62de625 true no test failure
62de625 --without-dl false 47 49 50 failed
62de625 --without-dl true 47 49 50 failed

@florianschmidt1994 florianschmidt1994 changed the title WIP: Add a COB_LOAD_GLOBAL config option to modify dlopen behavior Add a COB_LOAD_GLOBAL config option to modify dlopen behavior Jan 13, 2025
@florianschmidt1994 florianschmidt1994 marked this pull request as ready for review January 13, 2025 10:24
@codecov-commenter
Copy link

codecov-commenter commented Feb 11, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 70.58824% with 5 lines in your changes missing coverage. Please review.

Project coverage is 67.58%. Comparing base (16e84a5) to head (3234374).
Report is 44 commits behind head on gcos4gnucobol-3.x.

Files with missing lines Patch % Lines
libcob/call.c 70.58% 2 Missing and 3 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           gcos4gnucobol-3.x     #209      +/-   ##
=====================================================
- Coverage              67.85%   67.58%   -0.28%     
=====================================================
  Files                     33       33              
  Lines                  60458    60827     +369     
  Branches               15821    15896      +75     
=====================================================
+ Hits                   41026    41109      +83     
- Misses                 13567    13847     +280     
- Partials                5865     5871       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from 8871e45 to 3234374 Compare March 18, 2025 09:36
Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope to reinspect that PR soon, ideally after looking over CANCEL in general.

Can you please do the minor and cosmetic changes outlined in this review and rebase afterwards?

$b for executable basename, $d for date in YYYYMMDD format, $t for time
in HHMMSS format (before, only $$ was available for pid)

** introduction of the COB_LOAD_GLOBAL boolean flag, which determines whether
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a text block with sentences, which needs to start upper-case; but maybe go with a short entry as text, then a double-colon and newline, then more explanatory text (this could stay as sentences) afterwards, in any case the full-stops should be outside of the quotes :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move the full stop. As for the other parts, I can capitalize the first word, but that would break with most other blocks in NEWS, which all start with lowercase letters. Are we sure this is how we want to do it here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only sentences should start with an upper-case and be ended with a period, but you can add a short summary of that change in all lower without a period to start it (kind of a headline)

@GitMensch
Copy link
Collaborator

Friendly ping @florianschmidt1994

@GitMensch
Copy link
Collaborator

@florianschmidt1994 thanks for the updates.

Please use the CI generated tarball, to check if your original setup that needed that change (a java starter running the same process more than once) still works. Deep-link to the CI generated source distribution: https://github.com/OCamlPro/gnucobol/actions/runs/16163870452/artifacts/3493326773

Also: can you please create a minimal runner in C that we can add to the testsuite that mimics your java starter?
Currently we only verified that the new setting works with both options, but we don't have a test case that only passes with one of those...

@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from 6d37deb to 23415a2 Compare August 14, 2025 12:53
@GitMensch
Copy link
Collaborator

The "coverage" titled test fails for C89 compat reasons:


+caller.c:5:22: warning: extra tokens at end of #include directive
+    5 | #include <pthread.h> // For POSIX threads
+      |                      ^
+caller.c:6:22: warning: extra tokens at end of #include directive
+    6 | #include <dlfcn.h>   // For dynamic library loading
+      |                      ^
+caller.c:13:1: error: C++ style comments are not allowed in ISO C90
+   13 | // Function to call COBOL shared library functions
+      | ^
+caller.c:13:1: note: (this will be reported only once per input file)
+caller.c: In function 'callCobol':
+caller.c:34:30: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+   34 |     RESOLVE_SYMBOL(cob_init, cob_init_func);
+      |                              ^~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+   25 |     type symbol = (type) dlsym(libCobHandle, #symbol); \
+      |     ^~~~
+caller.c:35:33: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+   35 |     RESOLVE_SYMBOL(cob_resolve, cob_resolve_func);
+      |                                 ^~~~~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+   25 |     type symbol = (type) dlsym(libCobHandle, #symbol); \
+      |     ^~~~
+caller.c:36:30: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+   36 |     RESOLVE_SYMBOL(cob_tidy, cob_tidy_func);
+      |                              ^~~~~~~~~~~~~
+caller.c:25:5: note: in definition of macro 'RESOLVE_SYMBOL'
+   25 |     type symbol = (type) dlsym(libCobHandle, #symbol); \
+      |     ^~~~
+caller.c:38:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
+   38 |     hello_world_func hello_world = cob_resolve("prog");
+      |     ^~~~~~~~~~~~~~~~
+caller.c: In function 'main':
+caller.c:60:5: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
+   60 |     for (long i = 0; i < numThreads; i++) {
+      |     ^~~
+caller.c:60:5: note: use option '-std=c99', '-std=gnu99', '-std=c11' or '-std=gnu11' to compile your code
+caller.c:67:14: error: conflicting types for 'i'; have 'int'
+   67 |     for (int i = 0; i < numThreads; i++) {
+      |              ^
+caller.c:60:15: note: previous definition of 'i' with type 'long int'
+   60 |     for (long i = 0; i < numThreads; i++) {
+      |               ^
+caller.c:67:5: error: 'for' loop initial declarations are only allowed in C99 or C11 mode
+   67 |     for (int i = 0; i < numThreads; i++) {
+      |     ^~~
+cc1: some warnings being treated as errors

To reproduce that locally compile it with gcc -std=c89 -Werror=declaration-after-statement - in general you want variables to start at the begin of a block and use */ block comments */ .

@GitMensch
Copy link
Collaborator

and the MSYS1 entries error because of missing pthread support:

+caller.c:5:10: fatal error: pthread.h: No such file or directory
+    5 | #include <pthread.h> // For POSIX threads
+      |          ^~~~~~~~~~~

It is fine to skip this test on MSYS1. To do so add

# skip test for MSYS1 - we don't have pthread there
# FIXME: make that test more portable
AT_SKIP_IF([test "$OSTYPE" = "msys"])

at the start of the test.

@florianschmidt1994
Copy link
Author

Thanks for the hints @GitMensch. I've also disabled the tests on macOS, as there is no dlmopen there.

Test runs on Windows (e.g. https://github.com/OCamlPro/gnucobol/actions/runs/17068765301/job/48392123249?pr=209) say caller.c not found when trying to execute.

AT_CHECK([$COB_CC $COB_CFLAGS caller.c -ldl $COB_LIBS -o caller], [0], [], [])

Do you know if we might need to call the compiler arguments differently to make it more portable across platforms?

Also there's an issue with help2man in the Windows MSYS2 prepare workflow (https://github.com/OCamlPro/gnucobol/actions/runs/17069516851/job/48394605655?pr=209) that I don't quite understand, yet. Any ideas?

@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch 2 times, most recently from 22ff3e9 to eedf0d8 Compare August 19, 2025 13:49
@GitMensch
Copy link
Collaborator

GitMensch commented Aug 19, 2025

Thanks for the hints @GitMensch. I've also disabled the tests on macOS, as there is no dlmopen there.

Test runs on Windows (e.g. https://github.com/OCamlPro/gnucobol/actions/runs/17068765301/job/48392123249?pr=209) say caller.c not found when trying to execute.

AT_CHECK([$COB_CC $COB_CFLAGS caller.c -ldl $COB_LIBS -o caller], [0], [], [])

Use AT_CHECK([$COBC -x -ldl -lpthread -caller.c], [0], [], []) there (drop pthread if you think that's really not necessary).
The Windows tests likely haven't setup those vars correctly for some MSYS2 translation reason.

@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from 4743ed6 to b93afd1 Compare August 27, 2025 14:49
@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from 0881a6a to 7ecb2a9 Compare November 6, 2025 10:42
@GitMensch GitMensch changed the base branch from gcos4gnucobol-3.x to gitside-gnucobol-3.x November 27, 2025 09:17
@GitMensch
Copy link
Collaborator

@florianschmidt1994 Can you please rebase?

florianschmidt1994 and others added 27 commits December 8, 2025 09:51
using the more important part "default changed" first, more explanation
…pen test if dependencies do not exist

Signed-off-by: Oguzcan Kirmemis <[email protected]>
…BAL dlmopen test if dependencies do not exist"

This reverts commit 9a53df2.
@florianschmidt1994 florianschmidt1994 force-pushed the feature/add-dlopen-config-flag branch from 4f13276 to 49a99a8 Compare December 8, 2025 08:51
test cleanup
@oguzcankirmemis
Copy link

just adding myself to the thread to get notifications, @GitMensch you can ofc ping me if we need any further changes here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants