When I run the following code in the debugger, I see inconsistent behavior concerning the handling of utf-8 literals vs. variables:
Code test.pl
use strict;
use warnings;
use utf8;
my $var = 'Здравствуйте';
print $var;
Running this in a utf-8 terminal using the debugger:
perl -d test.pl
Loading DB routines from perl5db.pl version 1.81
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(test.pl:7): my $var = 'Здравствуйте';
DB<1> n
main::(test.pl:9): print $var;
DB<1> p $var
Wide character in print at (eval 10)[/home/chris/perl5/perlbrew/perls/perl-5.41.4/lib/5.41.4/perl5db.pl:742] line 2.
at (eval 10)[/home/chris/perl5/perlbrew/perls/perl-5.41.4/lib/5.41.4/perl5db.pl:742] line 2.
eval 'no strict; ($@, $!, $^E, $,, $/, $\\, $^W) = @DB::saved;package main; $^D = $^D | $DB::db_stop;
print {$DB::OUT} $var;
' called at /home/chris/perl5/perlbrew/perls/perl-5.41.4/lib/5.41.4/perl5db.pl line 742
DB::eval called at /home/chris/perl5/perlbrew/perls/perl-5.41.4/lib/5.41.4/perl5db.pl line 3427
DB::DB called at test.pl line 9
Здравствуйте
DB<2> binmode $DB::OUT, ':utf8'
DB<3> p $var
Здравствуйте
DB<4> l 7
7: my $var = 'ÐдÑавÑÑвÑйÑе';
DB<5> q
In my UTF-8 terminal, without changing the output layer of $DB::OUT, line 7 is displayed readably and without a warning (which is surprising), while the p $var debugger command emits a wide character warning (which is expected).
Changing the output layer with binmode $DB::OUT, ':utf8' makes the p $var command print correctly, but the l 7 command then prints garbage.
The reason for this behavior is that the utf8 pragma causes the value of $var to be decoded correctly, whereas the literal source line that is passed to the debugger is not decoded.
I think the lines in the line array passed to the debugger need to be decoded when the use utf8 pragma is in effect.
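The garbage from l 7 looks like double encoding. The following is a minimal sketch of that effect, assuming (as described above) that the debugger's line array holds the undecoded UTF-8 bytes of the source line. The names $chars, $bytes, $out_chars and $out_bytes are only illustrative, and Encode::encode stands in for what a :utf8 output layer does on print:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The word from test.pl as 12 decoded characters (what $var holds under "use utf8").
my $chars = "\x{417}\x{434}\x{440}\x{430}\x{432}\x{441}\x{442}\x{432}\x{443}\x{439}\x{442}\x{435}";

# The same word as undecoded UTF-8 bytes (assumption: this is what the
# debugger's source-line array contains).
my $bytes = encode('UTF-8', $chars);

# Stand-in for a :utf8 output layer: encode whatever string is printed.
my $out_chars = encode('UTF-8', $chars);   # decoded input: encoded once
my $out_bytes = encode('UTF-8', $bytes);   # byte input: every byte is encoded again

# Decoding the bytes first gives back the character string.
my $fixed = decode('UTF-8', $bytes);

printf "decoded string: %d characters, %d bytes on output\n",
    length($chars), length($out_chars);
printf "byte string:    %d bytes, %d bytes on output\n",
    length($bytes), length($out_bytes);
printf "decode() restores the characters: %s\n",
    ($fixed eq $chars ? 'yes' : 'no');
```

The byte string doubles in size on output because each byte above 0x7F is treated as a Latin-1 character and encoded as two bytes, which a UTF-8 terminal then renders as the kind of garbage shown for line 7.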
If I apply the following patch to toke.c, I get the correct behavior for the case where use utf8 is in effect. Unfortunately, I don't know the perl internals well enough to check whether utf8 is in effect for the current line and apply the SV_CATUTF8 flag conditionally, but maybe it gets someone started.
Running the example with the above patch:
perl-dev git:(blead) ✗ ./perl -I './lib' -d test.pl
Loading DB routines from perl5db.pl version 1.81
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
Wide character in print at lib/perl5db.pl line 6244.
at lib/perl5db.pl line 6244.
DB::print_lineinfo("main::(test.pl:7):\x{9}my \$var = '\x{417}\x{434}\x{440}\x{430}\x{432}\x{441}\x{442}\x{432}\x{443}\x{439}\x{442}\x{435}';\x{a}") called at lib/perl5db.pl line 4624
DB::depth_print_lineinfo(1, "main::(test.pl:7):\x{9}my \$var = '\x{417}\x{434}\x{440}\x{430}\x{432}\x{441}\x{442}\x{432}\x{443}\x{439}\x{442}\x{435}';\x{a}") called at lib/perl5db.pl line 3593
DB::Obj::_my_print_lineinfo(DB::Obj=HASH(0x5637dd8ee088), 7, "main::(test.pl:7):\x{9}my \$var = '\x{417}\x{434}\x{440}\x{430}\x{432}\x{441}\x{442}\x{432}\x{443}\x{439}\x{442}\x{435}';\x{a}") called at lib/perl5db.pl line 3680
DB::Obj::_DB__grab_control(DB::Obj=HASH(0x5637dd8ee088)) called at lib/perl5db.pl line 2981
DB::DB called at test.pl line 7
main::(test.pl:7): my $var = 'Здравствуйте';
DB<1> n
main::(test.pl:9): print $var;
DB<1> p $var
Wide character in print at (eval 9)[lib/perl5db.pl:742] line 2.
at (eval 9)[lib/perl5db.pl:742] line 2.
eval 'no strict; ($@, $!, $^E, $,, $/, $\\, $^W) = @DB::saved;package main; $^D = $^D | $DB::db_stop;
print {$DB::OUT} $var;
' called at lib/perl5db.pl line 742
DB::eval called at lib/perl5db.pl line 3427
DB::DB called at test.pl line 9
Здравствуйте
DB<2> binmode $DB::OUT, ':utf8'
DB<3> p $var
Здравствуйте
DB<4> l 7
7: my $var = 'Здравствуйте';
DB<5> q
You can see that now, without changing the output layer, both the p $var and the l 7 commands print wide character warnings. Switching the output layer to :utf8 fixes both of them.
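The warning behavior itself can be reproduced outside the debugger. This is a small sketch, assuming in-memory filehandles behave like $DB::OUT in this respect; the handle and variable names are made up for illustration:

```perl
use strict;
use warnings;

# The decoded 12-character word from test.pl.
my $chars = "\x{417}\x{434}\x{440}\x{430}\x{432}\x{441}\x{442}\x{432}\x{443}\x{439}\x{442}\x{435}";

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# A handle without an encoding layer, comparable to $DB::OUT before binmode.
open my $raw, '>', \my $raw_buf or die "open failed: $!";
print {$raw} $chars;
close $raw;
my $raw_warnings = grep { /Wide character/ } @warnings;

# The same print through a :utf8 layer, comparable to $DB::OUT after binmode.
@warnings = ();
open my $enc, '>:utf8', \my $enc_buf or die "open failed: $!";
print {$enc} $chars;
close $enc;
my $enc_warnings = grep { /Wide character/ } @warnings;

printf "no layer: %d warning(s), :utf8 layer: %d warning(s)\n",
    $raw_warnings, $enc_warnings;
```

With decoded strings, the handle's layer alone decides whether a warning appears, which is why a single binmode call is enough once both the variable and the source line are decoded.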
perl -v
This is perl 5, version 41, subversion 4 (v5.41.4) built for x86_64-linux
Copyright 1987-2024, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at https://www.perl.org/, the Perl Home Page.