Add tests for proper handling of UTF-8 character in bitwise xor when using $1 #23552

jkeenan · 2025-08-08T20:00:27Z

Adapt Tux's rt70652.pl in #9972 (comment).

Fixes GH #9972.

This set of changes does not require a perldelta entry.
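
For context, the scenario under test can be sketched roughly as follows. This is an illustration with made-up data, not the contents of rt70652.pl: it captures a UTF-8-flagged string into $1 and XORs the capture directly with a mask. The sample string, the mask, and all variable names are assumptions. Every character is kept at or below 0xFF, because recent perls reject string bitwise operations on code points above 0xFF.

```perl
use strict;
use warnings;

# Made-up sample data (an assumption for illustration only).
my $word = "caf\xE9";
utf8::upgrade($word);            # same characters, UTF-8 flag now on

my $mask = "\x20" x 4;           # XOR with 0x20 flips the ASCII case bit

my $result = '';
if ( $word =~ m/(.{4})/ ) {
    $result = $1 ^ $mask;        # string XOR applied directly to the capture
}

# Show the code points of the result in hex.
print join( ' ', map { sprintf '%02X', ord } split //, $result ), "\n";
```

The regression tests in this PR exercise the same kind of expression, with the aim of confirming that no "Malformed UTF-8 character" warning is raised.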


bram-perl · 2025-08-09T11:07:04Z

t/op/bop.t

 # down where the failure is, and supply your new names as a patch.
 # (Just-in-time test naming)
-plan tests => 510 + 6 * 2;
+plan tests => 512 + 6 * 2;


The commit message is misleading.

The title is "Proper handling of UTF-8 character in bitwise xor when using $1", but this commit doesn't fix anything.
It adds tests for an issue that was already fixed.

Ideally it would also link to the commit that fixed the issue, but looking at #9972 that might not be easy to find because it was fixed in separate steps (#9972 (comment)), so it might not be worth the effort.

At the very least, adding something to the summary based on #9972 (comment) would be good, so that there is a reference to when it was fixed (from a quick glance at the ticket: partially between 5.8 and 5.12, and fully between 5.12 and 5.14).

bram-perl · 2025-08-09T11:12:54Z

t/op/bop.t

+    my $got = [@t];
+    my $exp = [1, 1, "", "", "", "", "", "", ""];
+    ok( eq_array($got, $exp), "GH 9972: no malformed UTF-8 character in bitwise xor");


Personally I'm not a fan of this construct.
I think it would be much nicer to replace (for example):

    # Now we take 8 Bytes of a normal string with m/(.{8})/
    push @t, utf8::is_utf8 ($normalstring);

with:

    # Now we take 8 Bytes of a normal string with m/(.{8})/
    is(utf8::is_utf8 ($normalstring), 1, "\$normalstring has the UTF-8 flag set");

As it stands, everything ends up in one big eq_array at the end, which makes it difficult to trace back to what is happening.
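
A minimal sketch of the per-check style suggested above, assuming made-up strings rather than the data in the PR. It uses Test::More so that it runs standalone; core test files typically load t/test.pl instead, so the exact harness calls in t/op/bop.t may differ. The variable names and test descriptions are illustrative only.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical sample strings, not taken from the PR.
my $plain = "abcdefgh";            # plain ASCII literal
my $wide  = "\x{263A}abcdefg";     # holds a code point above 0xFF

# One assertion per check, each with its own description, so a failure
# points at the exact step instead of at one combined array comparison.
ok( !utf8::is_utf8($plain), 'plain ASCII string has no UTF-8 flag' );
ok(  utf8::is_utf8($wide),  'string with a wide character has the UTF-8 flag' );

my $upgraded = $plain;
utf8::upgrade($upgraded);
ok( utf8::is_utf8($upgraded), 'utf8::upgrade turns the UTF-8 flag on' );
```

With this layout a failing check is reported under its own name and line number, which is the traceability the comment asks for.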

jkeenan · 2025-10-01

@Tux, since this p.r. essentially takes some code you wrote years ago and adapts it into a regression test, would you care to comment?

bram-perl · 2025-08-09T11:17:30Z

t/op/bop.t

+    # $1 is assigned but not yet unicode: UTF8-Flag ($1)
+    push @t, utf8::is_utf8 ($1);
+
+    # After we copy $1 the Flag is on: UTF8-Flag ($1)
+    my $copy = $1;
+    push @t, utf8::is_utf8 ($1);


This is confusing / contradictory.

These are the first two items in @t and should match the first two items in the $exp arrayref.
The $exp arrayref starts with: [1, 1, ...]
So both of the utf8::is_utf8 calls return 1?
Based on the comments in the block I would have expected these to be [0, 1, ...].
The first comment says "but not yet unicode" while the second says "the Flag is on", which to me implies a difference.
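
One way to settle the question is to print the flag at each step on the perl being tested. The sketch below does that with a hypothetical input string (not the data from rt70652.pl). It deliberately states no expected values for the two $1 checks, since those are what is in question. The only thing it takes for granted is that a copy holding a character above 0xFF carries the UTF-8 flag.

```perl
use strict;
use warnings;

# Hypothetical input: a string with a character above 0xFF, which Perl
# stores internally with the UTF-8 flag on.
my $text = "\x{263A}1234567";

my @flags;
if ( $text =~ m/(.{8})/ ) {
    push @flags, utf8::is_utf8($1) ? 1 : 0;       # $1 before it is copied
    my $copy = $1;
    push @flags, utf8::is_utf8($1) ? 1 : 0;       # $1 after it is copied
    push @flags, utf8::is_utf8($copy) ? 1 : 0;    # the copy itself
}

print "before copy: $flags[0], after copy: $flags[1], copy: $flags[2]\n";
```

If the first two values match on current perls, the comments in the test block could be reworded to describe the historical behaviour rather than the present one.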

Proper handling of UTF-8 character in bitwise xor when using $1

680d525

Adapt Tux's rt70652.pl in #9972 (comment). Fixes GH #9972.

jkeenan requested review from Tux and khwilliamson August 8, 2025 20:00

jkeenan mentioned this pull request Aug 8, 2025

Malformed UTF-8 character in bitwise xor when using $1 #9972

Open

bram-perl reviewed Aug 9, 2025


jkeenan changed the title ~~Proper handling of UTF-8 character in bitwise xor when using $1~~ Add tests for proper handling of UTF-8 character in bitwise xor when using $1 Oct 1, 2025
