@johroj johroj commented Jan 14, 2026

Purpose

Currently, Base.PCRE uses ccall((sym, "libpcre2-8"),...). On Windows, this only works when using Julia.exe or on systems where libpcre2-8 is available on PATH (for example, when Julia is installed). Embedded Julia and JuliaC DLLs on Windows, however, do not detect the bundled libpcre2-8. I reported this issue with respect to JuliaC in #60496, #60406 and JuliaLang/JuliaC.jl#83 as I came to understand the root cause better. #52007, #52205, MilesCranmer/PySR#566 and MilesCranmer/PySR#636 are likely caused by the same core issue.

To solve this, the goal is to use BundledLazyLibraryPath in Base.PCRE. However, Base.PCRE is used for parsing regular expressions, which is necessary for building the loading path used in Libdl. So currently Base.PCRE must be able to load libpcre2-8 before BundledLazyLibraryPath can be used. This catch-22 is discussed in #60496, where it was concluded that revising path.jl was the best way forward.

This pull request contains a rework of path.jl to avoid regular expressions, so that all path operations are available before libdl.jl is loaded. With this in place, some additional reorganization of the loading order in Base.jl is provided. I'm open to proposals on better ways to organize these changes, but the current state is sufficient for solving the main problem: Windows DLLs built with JuliaC now work on systems without the Julia bin folder on PATH.

Description of Changes

Loading order in Base.jl

* `libc.jl` loaded before `regex.jl`.
* Add `path.jl` earlier, instead of being included in `filesystem.jl`. `module Filesystem` is now defined in `path.jl` and the rest is added in `filesystem.jl`. In the end, the content of `Filesystem` should be unaffected.
* Similarly, extract `iswindows`, `isunix` etc. into a new file `osinfo.jl` which also defines `module Sys` and is loaded early. These operations are used frequently in `path.jl` and `libdl.jl`. The rest of `Sys` is added later.
* Some minor changes to other files.

Update: This PR will only cover the updated path code.

Path.jl modifications

All regular expressions in path.jl are replaced with corresponding string operations, with the goal of not altering any behavior whatsoever. The differences between Windows and Unix are now captured by the function isseparator (and by the different joinpath implementations, which I did not modify).
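To illustrate the general idea of replacing a regex with a separator predicate, here is a minimal sketch. It is not the code from this PR: the names `is_sep`, `last_sep_index`, `starts_with_sep` and `basename_sketch` are hypothetical, and the platform is passed as an explicit `windows` flag (an assumption made so both behaviors can be exercised on one machine).

```julia
# Hypothetical helpers for illustration only; not the implementation in path.jl.

# Platform-dependent separator test: '/' everywhere, plus '\\' on Windows.
is_sep(c::Char, windows::Bool) = c == '/' || (windows && c == '\\')

# Index of the last separator in `path`, or 0 if there is none.
# Iterating with eachindex keeps this safe for multi-byte characters.
function last_sep_index(path::AbstractString, windows::Bool)
    idx = 0
    for i in eachindex(path)
        is_sep(path[i], windows) && (idx = i)
    end
    return idx
end

# True if the path begins with a separator (no regex needed).
starts_with_sep(path::AbstractString, windows::Bool) =
    !isempty(path) && is_sep(first(path), windows)

# Simplified basename: everything after the last separator.
function basename_sketch(path::AbstractString, windows::Bool)
    i = last_sep_index(path, windows)
    return i == 0 ? String(path) : String(path[nextind(path, i):end])
end
```

With this shape, the only platform-specific piece is the predicate, so the scanning logic can be shared and compared against the regex versions string by string.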

I would like to stress that I have (NB: on Windows) tested all algorithms exhaustively using sufficiently long strings comprising applicable chars like ('a', '\alpha', ':', '.', '/', '\n') to ensure consistency with the regex implementations. It was indeed tempting to fix some obvious bugs, but I only added test cases that capture the current behavior, with comments noting where that behavior is questionable.

Performance should not be degraded. With simple tests like @btime joinpath("foo", "bar"), performance is in fact improved by at least a factor of 2 everywhere. As extreme examples, splitdrive("foo/bar") goes from 250 to 12 ns and isabspath("foo/bar") from 100 to 8 ns (these were also only measured on Windows).
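For anyone who wants a rough sanity check without installing BenchmarkTools (which provides the `@btime` used above), a Base-only timing loop might look like the sketch below. The helper name `ns_per_call` is hypothetical, and the numbers it gives are much noisier than `@btime`; very cheap calls may also be partly optimized by the compiler, so treat the output as indicative only.

```julia
# Rough per-call timing using only Base; a sketch, not a replacement for @btime.
function ns_per_call(f, args...; n::Int = 100_000)
    f(args...)                       # warm-up call so compilation is not timed
    elapsed = @elapsed for _ in 1:n
        f(args...)
    end
    return elapsed / n * 1e9         # seconds per call converted to nanoseconds
end

println("joinpath:   ", ns_per_call(joinpath, "foo", "bar"), " ns")
println("splitdrive: ", ns_per_call(splitdrive, "foo/bar"), " ns")
println("isabspath:  ", ns_per_call(isabspath, "foo/bar"), " ns")
```

Running the same script on a build with and without the change gives a before/after comparison on that machine.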

To do

Trimming

I have verified that this PR, so far, solves the issue when using JuliaC with trim = "no". However, I cannot find a way to make dlopen(PCRE_LIB) precompile when trimming; I consistently run into unresolved calls. If dlopen is not called in precompilation, there are runtime errors instead. I do not understand the mechanisms involved in statically compiling a type-unstable LazyLibrary well enough to troubleshoot this further. Can I ask for some help here? #60496 (comment).

Other changes

As already mentioned, I am very open to proposals on how to improve this. The code has already served its purpose by proving that these changes were sufficient for fixing the JuliaC issue.

Should we use only some of the changed code in path.jl? Is there a better way to handle the file structure and loading order changes in Base.jl?

@johroj changed the title from "Libdl in pcre2" to "Load libpcre2-8 with Libdl to solve windows embedding issues" Jan 14, 2026
@topolarity
Member

topolarity commented Jan 14, 2026

Thanks again for taking up this initiative.

> I would like to stress that I have (NB: on windows) tested all algorithms exhaustively using sufficiently long strings comprising applicable chars like ('a', '\alpha', ':', '.', '/', '\n') to ensure consistency with the regex implementations

Was testing exhaustive or random? Do you have a test script that I can run as well as part of my review?

Is it possible to do the libpcre2-8 + bootstrap changes (to use BundledLibraryPath) as a separate follow-up PR? I think it'd be best only to handle the path.jl changes in this one.

base/libc.jl Outdated
throw(ArgumentError("invalid arguments"))
end
@static if Sys.isapple()
function callmktime(s::String)
Member

This function doesn't call mktime, so it could probably benefit from a rename.

@johroj
Author

johroj commented Jan 14, 2026

> Was testing exhaustive or random? Do you have a test script that I can run as well as part of my review?

Exhaustive, so it did not feel sensible to add it to test/. If you can also test on Unix, that would be a good addition. The tests are currently in a messy Pluto notebook that I can clean up into a script. For example, splitpath is tested as follows:

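using Test

# `UUT` is the module under test; it must be loaded before running this.
@testset begin
	nerrors = 0
	for N = 1:8
		# Every string of length N over the chosen alphabet
		for chrs in Iterators.product(fill(['a','/', '.', ':', '😺', '\n'], N)...)
			nerrors > 20 && break
			str = String(collect(chrs))

			if splitpath(str) == UUT.Filesystem.splitpath(str)
				@test true
			else
				# Report the first mismatches instead of failing on each one
				nerrors += 1
				@show str
				@show splitpath(str)
				@show UUT.Filesystem.splitpath(str)
			end
		end
	end
end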
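The same enumeration can be wrapped in a reusable helper so that any pair of implementations can be compared. The sketch below is an assumption about how such a harness might be structured, not the script attached to this PR; `exhaustive_mismatches` and its `limit` keyword are hypothetical names. With the six-character alphabet above and lengths 1 to 8, the loop visits 6 + 6^2 + ... + 6^8 = 2,015,538 strings.

```julia
# Compare `reference` and `candidate` on every string of length 1..maxlen
# built from `alphabet`. Returns the mismatching inputs (at most `limit`).
function exhaustive_mismatches(reference, candidate, alphabet::Vector{Char},
                               maxlen::Int; limit::Int = 20)
    mismatches = String[]
    for n in 1:maxlen
        # Cartesian product of n copies of the alphabet
        for chars in Iterators.product(ntuple(_ -> alphabet, n)...)
            str = String(collect(chars))
            if reference(str) != candidate(str)
                push!(mismatches, str)
                length(mismatches) >= limit && return mismatches
            end
        end
    end
    return mismatches
end

# Example: comparing a function with itself should report nothing.
alphabet = ['a', '/', '.', ':', '😺', '\n']
@show exhaustive_mismatches(splitpath, splitpath, alphabet, 4)
```

Passing the regex-based function as `reference` and the ported one as `candidate` yields the list of inputs on which they disagree, which is easier to inspect than interleaved `@show` output.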

> Is it possible to do the libpcre2-8 + bootstrap changes (to use BundledLibraryPath) as a separate follow-up PR? I think it'd be best only to handle the path.jl changes in this one.

Makes sense.

@topolarity
Member

> Exhaustive, so it did not feel sensible to add it to test/.

Agreed. It is still valuable testing, but not needed in CI.

> If you can also test on unix, that would be a good addition

Sure thing, happy to donate the machine time to test this on unix.

@johroj
Author

johroj commented Jan 15, 2026

By the way, what would be the scope of versions to support? Since the embedding issues with PySR etc. exist today, backporting could make sense. But fixing it in LTS could be less straightforward since BundledLazyLibrary does not exist. Between 1.12 and 1.13, 88c0e10 happened, so 1.12 may require some attention. Backporting to 1.13 should be straightforward.

@johroj
Author

johroj commented Jan 15, 2026

Here are the tests I used while porting: pathjl_porting_tests.zip
