Consider manual parsing and handling of regex #4
Labels: enhancement (new feature or request); nix incompatibility (some function works differently or doesn't work in the Nix target)
The `regex` module currently just forwards all regex strings to Nix's built-in regex matching and splitting functions. However, Nix uses POSIX ERE syntax for regex, which is more limited than the syntax available in other Gleam targets. For example, `\s` for whitespace doesn't work and must be replaced by `[[:space:]]`. Similarly, `\d` needs to be written as `[0-9]`.

We could solve this by manually parsing the regex on `compile` and transforming it into a format that Nix accepts. In particular, we'd look into tackling the following incompatibilities:

1. `\s` and `\d`, which are widely used across Gleam packages;
2. `\n`;
3. case-insensitive matching;
4. multi-line matching;
5. `(?!x)` (negative lookahead);
6. `(?:x)` (ignored group).

Items 1 and 2 would require parsing and replacing depending on the context (inside or outside character classes). A simple global replacement isn't enough, since the `[ ]` in `[0-9]` has to be dropped when already inside a character class.

Item 3 would basically consist of converting letters such as `a` or `A` into `[aA]`, and would require parsing in the same manner.

Item 4 could perhaps be done by splitting the string into lines first and joining the matches on each line.

Item 5 could perhaps be done by replacing `(?!x)` with `(x)?` and storing the capture group number; later on, when using match functions, matches where the group is present would be ignored (additionally, the group would be removed from submatches).

Item 6 could be done by replacing `(?:x)` with `(x)` and then removing the group from submatches later, by storing the group's number after parsing.

Now, parsing with Nix could be inefficient, but we expect `compile` to be a "slower" operation anyway. This would allow proper compatibility with the ecosystem.

We could also first test whether parsing is needed at all, and otherwise just pass the regex through.
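To make the context-dependent rewriting more concrete, here is a rough sketch of the single-pass scan that the `\s`/`\d` and `(?:x)` cases would need. It is written in Python purely for illustration; an actual implementation would have to live on the Nix side, and the names `translate`, `SHORTHANDS` and `hidden_groups` are hypothetical. It assumes simple input: it does not handle lookaheads, a `]` placed first in a character class, or POSIX classes already nested inside brackets.

```python
# Hypothetical sketch, not the library's implementation.
# Shorthand bodies are stored WITHOUT the outer [ ] so they can be
# spliced into an existing character class as-is.
SHORTHANDS = {"s": "[:space:]", "d": "0-9"}


def translate(pattern: str) -> tuple[str, list[int]]:
    r"""Rewrite \s, \d and (?:x) into POSIX ERE.

    Returns the rewritten pattern and the 1-based numbers of capture
    groups that came from (?:...) and should be dropped from submatches.
    """
    out: list[str] = []
    hidden_groups: list[int] = []
    group_count = 0
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in SHORTHANDS:
                body = SHORTHANDS[nxt]
                # Inside [...] the surrounding brackets must be dropped.
                out.append(body if in_class else "[" + body + "]")
            else:
                out.append(ch + nxt)  # other escapes pass through untouched
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
            out.append(ch)
        elif ch == "[":
            in_class = True
            out.append(ch)
        elif ch == "(":
            group_count += 1
            if pattern.startswith("(?:", i):
                # Becomes a normal group; remember its number for later.
                hidden_groups.append(group_count)
                i += 2  # skip the "?:"
            out.append("(")
        else:
            out.append(ch)
        i += 1
    return "".join(out), hidden_groups


print(translate(r"\d+[\s\d]"))  # ('[0-9]+[[:space:]0-9]', [])
print(translate(r"(a)(?:b)"))   # ('(a)(b)', [2])
```

The second return value is what the match functions would later consult to remove the extra group from submatches. The same scan could probably be extended to record lookahead groups and to expand letters for case-insensitive matching, though that is not shown here.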