In Python's implementation of re the $ does not only match the end of the line, but also before any line break, even without multiline mode.
There's an article by the OpenSSF about this issue that suggests using \Z instead, or preferring the fullmatch function.
In Python's implementation of re the $ does not only match the end of the line, but also before any line break, even without multiline mode.
This is incorrect. In non-multiline mode, $ matches either the end of the entire input or the position right before the trailing newline (only \n, not \r\n). It is similar to PCRE's \Z, whereas Python's \Z resembles PCRE's \z:
ANCHORS AND SIMPLE ASSERTIONS
\b word boundary
\B not a word boundary
^ start of subject
also after an internal newline in multiline mode
(after any newline if PCRE2_ALT_CIRCUMFLEX is set)
\A start of subject
$ end of subject
also before newline at end of subject
also before internal newline in multiline mode
\Z end of subject
also before newline at end of subject
\z end of subject
\G first matching position in subject
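The non-multiline behavior described above can be checked directly in Python. This is a small sketch with made-up inputs; the variable names are only illustrative.

```python
import re

# Without re.MULTILINE, `$` does not match before an internal newline.
internal = re.search(r"foo$", "foo\nbar")

# It does match right before a single trailing "\n".
trailing_lf = re.search(r"foo$", "foo\n")

# A trailing "\r\n" is not treated as a unit: the "\r" sits between
# "foo" and the position where `$` can match, so this search fails.
trailing_crlf = re.search(r"foo$", "foo\r\n")

# With re.MULTILINE, `$` also matches before internal newlines.
multiline = re.search(r"foo$", "foo\nbar", re.MULTILINE)

print(internal)       # None
print(trailing_lf)    # a match with span (0, 3)
print(trailing_crlf)  # None
print(multiline)      # a match with span (0, 3)
```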
Thank you for the clarification.
Based on your example, here's a series of calls to highlight the specific behavior of $:
>>> import re
>>> re.search(r'$', 'foo\nbar')
<re.Match object; span=(7, 7), match=''>
>>> re.search(r'$', 'foo\nbar\n')
<re.Match object; span=(7, 7), match=''> # still matches at pos 7, before the final newline
>>> re.search(r'$', 'foo\nbar\n\n')
<re.Match object; span=(8, 8), match=''> # matches at pos 8, before the final newline
>>> re.search(r'\Z', 'foo\nbar')
<re.Match object; span=(7, 7), match=''>
>>> re.search(r'\Z', 'foo\nbar\n')
<re.Match object; span=(8, 8), match=''>
>>> re.search(r'\Z', 'foo\nbar\n\n')
<re.Match object; span=(9, 9), match=''>
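Since re.search only reports the first match, it may also help to list every position where each anchor can match. The sketch below uses re.finditer on one of the inputs from the calls above; the variable names are illustrative.

```python
import re

text = "foo\nbar\n"  # length 8, ends with a single newline

# `$` can match twice: before the trailing newline and at the very end.
dollar_positions = [m.start() for m in re.finditer(r"$", text)]

# `\Z` can only match at the very end of the string.
end_positions = [m.start() for m in re.finditer(r"\Z", text)]

print(dollar_positions)  # [7, 8]
print(end_positions)     # [8]
```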
An example of the issue:
Using \Z fixes this:
Or using fullmatch:
Should Ruff warn about using $ with search?
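A minimal sketch of the kind of check where this matters, assuming a hypothetical digits-only validation rule. The pattern, function names, and inputs below are illustrative assumptions and are not taken from the issue.

```python
import re

# Hypothetical validation rule: the input must consist of ASCII digits only.
DIGITS = r"[0-9]+"


def is_digits_dollar(value: str) -> bool:
    # `$` also matches right before a trailing "\n", so "123\n" passes.
    return re.search(r"^" + DIGITS + r"$", value) is not None


def is_digits_end_anchor(value: str) -> bool:
    # `\Z` matches only at the very end of the string.
    return re.search(r"^" + DIGITS + r"\Z", value) is not None


def is_digits_fullmatch(value: str) -> bool:
    # fullmatch requires the whole string to match; no anchors needed.
    return re.fullmatch(DIGITS, value) is not None


for candidate in ("123", "123\n"):
    print(
        repr(candidate),
        is_digits_dollar(candidate),
        is_digits_end_anchor(candidate),
        is_digits_fullmatch(candidate),
    )
```

With these inputs, only the `$` variant accepts the value that ends in a newline; the `\Z` and fullmatch variants reject it.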