Contributor

@jkeenan commented Aug 8, 2025

Adapt Tux's rt70652.pl in #9972 (comment).

Fixes GH #9972.


  • This set of changes does not require a perldelta entry.

 # down where the failure is, and supply your new names as a patch.
 # (Just-in-time test naming)
-plan tests => 510 + 6 * 2;
+plan tests => 512 + 6 * 2;


The commit message is misleading.

The title is "Proper handling of UTF-8 character in bitwise xor when using $1", but this commit doesn't fix anything.
It adds tests for an issue that was already fixed.

Ideally it would also link to the commit that fixed the issue. Looking at #9972, though, that commit might not be easy to find because the issue was fixed in separate steps (#9972 (comment)), so it might not be worth the effort.

At the very least, it would be good to add something to the summary based on #9972 (comment), so that there is a reference to when the issue was fixed (from a quick glance at the ticket: partially between 5.8 and 5.12, and fully between 5.12 and 5.14).

Comment on lines +775 to +777
my $got = [@t];
my $exp = [1, 1, "", "", "", "", "", "", ""];
ok( eq_array($got, $exp), "GH 9972: no malformed UTF-8 character in bitwise xor");


Personally, I'm not a fan of this construct.
I think it would be much nicer to replace (for example):

    # Now we take 8 Bytes of a normal string with m/(.{8})/
    push @t, utf8::is_utf8 ($normalstring);

with:

    # Now we take 8 Bytes of a normal string with m/(.{8})/
    is(utf8::is_utf8 ($normalstring), 1, "\$normalstring has the UTF-8 flag set");

As it stands, everything ends up in one big eq_array at the end, which makes it difficult to trace a failure back to what actually happened.

Contributor Author


@Tux, since this p.r. essentially takes some code you wrote years ago and adapts it into a regression test, would you care to comment?

Comment on lines +746 to +751
# $1 is assigned but not yet unicode: UTF8-Flag ($1)
push @t, utf8::is_utf8 ($1);
# After we copy $1 the Flag is on: UTF8-Flag ($1)
my $copy = $1;
push @t, utf8::is_utf8 ($1);


This is confusing and contradictory.

These are the first two items in @t and should match the first two items in the $exp arrayref.
The $exp arrayref starts with: [1, 1, ...]
So both of the utf8::is_utf8 calls return 1?
Based on the comments in the block, I would have expected these to be [0, 1, ...]
The first comment says "but not yet unicode" while the second says "the Flag is on", which to me implies a difference.

@jkeenan changed the title from "Proper handling of UTF-8 character in bitwise xor when using $1" to "Add tests for proper handling of UTF-8 character in bitwise xor when using $1" on Oct 1, 2025