richardleach
Contributor

Previously, only else {} branches would have the OPf_PARAMS flag set.

Perl_op_scope uses this flag to determine whether its optree argument (o)
should be wrapped in an ENTER/LEAVE pair or only get a SCOPE OP, which
is typically optimized away (nulled out) before runtime.

This has at least two consequences visible to Perl users:

  1. Differing lifetimes for things depending upon whether they occur in an
    if block or an else block. This could cause bugs that cannot be
    understood from Perl source code alone.

    For example, consider a Foo class that has a DESTROY sub. In the
    following code, $object2 goes out of scope at the completion of the
    else {} block and the DESTROY sub fires. In contrast, $object1
    does NOT go out of scope at the completion of the if {} block -
    because there is no scope - and the DESTROY sub won't fire until
    some later time.

    if ($_) {
        my $object1 = Foo->new();
    } else {
        my $object2 = Foo->new();
    }

  2. The NEXTSTATE OP immediately following a SCOPE OP is typically
    nulled out before runtime, but the first NEXTSTATE after an
    ENTER OP is not.

    NEXTSTATE OPs update the interpreter with the line number associated
    with the currently executing statement (via PL_curcop). The interpreter
    outputs this in warnings or fatal error messages. Not having the first
    NEXTSTATE present in if blocks means that error messages triggered
    by the first line of code will typically report an incorrect line
    number.
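
The lifetime difference in the first point can be observed with a small logging script. This is only a sketch: the `run_branch` helper, the `@events` log and the object names are made up for illustration, and the order of the log lines depends on whether the perl running it gives the `if` block its own scope.

```perl
use strict;
use warnings;

our @events;    # ordered log of what happened

package Foo {
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @main::events, "DESTROY $self->{name}" }
}

# Hypothetical helper: runs one branch, then logs a marker after the if/else.
sub run_branch {
    my ($take_if) = @_;
    if ($take_if) {
        my $object1 = Foo->new('object1');
    } else {
        my $object2 = Foo->new('object2');
    }
    push @events, 'after if/else';
    return;
}

run_branch(1);
run_branch(0);
push @events, 'after both calls';
print "$_\n" for @events;
```

If the `if` block has no scope of its own, "DESTROY object1" should be logged after the first "after if/else" marker (when `run_branch` returns), whereas "DESTROY object2" should be logged before the second marker.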

This PR addresses the above two concerns, but with the downside
that if/elsif/unless blocks now have the same OP overhead as
else blocks: the ENTER, first NEXTSTATE, and LEAVE OPs.
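
One way to see which OPs a branch ends up with is to render the optree with B::Concise from inside a script. The sketch below assumes a hypothetical `sample` sub. The counts are a rough text match on the rendering, so they also include nulled ("ex-") entries, and the numbers will differ between perls with and without this change.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical sample sub containing both branch kinds.
sub sample {
    my ($flag) = @_;
    if ($flag) {
        my $x = 1;
    } else {
        my $y = 2;
    }
    return;
}

# Render the optree of sample() into a string instead of STDOUT.
B::Concise::walk_output(\my $rendering);
my $walker = B::Concise::compile('-basic', \&sample);
$walker->();

# Count how often each op name of interest appears in the rendering.
my %count;
for my $name (qw(enter leave scope)) {
    $count{$name} = () = $rendering =~ /\b\Q$name\E\b/g;
}

print $rendering;
printf "%-5s %d\n", $_, $count{$_} for qw(enter leave scope);
```

Comparing the output before and after the change should show whether the `if` branch gained an enter/leave pair in place of a scope op.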


  • This set of changes requires a perldelta entry, and I need help writing it.

@richardleach
Contributor Author

Line number problems could also be fixed by not nulling out NEXTSTATE kids of SCOPE OPs, which would also fix additional "wrong line number" issues, such as #8216. But doing only that, without this PR, wouldn't fix the discrepancy in DESTROY behaviour, which I still think is a bug that should be fixed.

Something that we could do if this PR is merged and we fix the above is to teach Perl_op_scope to check whether the optree argument really needs an ENTER/LEAVE pair and to emit a SCOPE if not. Strategies might include:

  • Perl_op_scope scans a certain number of OPs in the optree (limited to prevent a noticeable slowdown in compilation) to see if there's anything that warrants ENTER/LEAVE.
  • Toggle an OP flag when adding an OP that needs an ENTER/LEAVE to a LINESEQ. Perl_op_scope might then just be able to read that flag and make its decision based on that. (This sounds like a nicer approach, but I have no idea how plausible it is!)

@richardleach
Contributor Author

> teach Perl_op_scope to check whether the optree argument really needs an ENTER/LEAVE pair and to emit a SCOPE if not

It might be easier than I suggested above. 😆 Will look into that sometime this month.

@richardleach richardleach marked this pull request as draft October 15, 2025 11:56
@richardleach
Contributor Author

Hmmm, a better fix for the if/else scoping discrepancy might be:

```diff
diff --git a/perly.y b/perly.y
index 53d4279b98..64eb3a63b3 100644
--- a/perly.y
+++ b/perly.y
@@ -965,7 +965,6 @@ else
        :       empty
        |       KW_ELSE mblock
                        {
-                         ($mblock)->op_flags |= OPf_PARENS;
                          $$ = op_scope($mblock);
                        }
        |       KW_ELSIF PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE mblock else[else.recurse]
diff --git a/toke.c b/toke.c
index 3a3b1aa9e5..31c45a262b 100644
--- a/toke.c
+++ b/toke.c
@@ -8445,6 +8445,7 @@ yyl_word_or_keyword(pTHX_ char *s, STRLEN len, I32 key, I32 orig_keyword, struct
     case KEY_our:
     case KEY_my:
     case KEY_state:
+        PL_hints |= HINT_BLOCK_SCOPE;
         return yyl_my(aTHX_ s, key);
 
     case KEY_next:
```

That does nothing for the line number warnings though. It actually means that line numbers for code at the start of else blocks are more likely to be reported incorrectly! (That is arguably a distinct problem though.)
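
To check how a given perl reports line numbers for the first statement of each branch, something like the following could be used. It is a sketch with made-up names (`first_statement_warns`, `@reported`): it triggers an "uninitialized value" warning in the first statement of the block, records the real line with `__LINE__` from a second statement on that same line, and prints both so they can be compared.

```perl
use strict;
use warnings;

my @reported;    # line numbers extracted from captured warnings
$SIG{__WARN__} = sub {
    my ($line) = $_[0] =~ / line (\d+)\.$/;
    push @reported, $line;
};

# Hypothetical helper: the first statement of each block warns, and a
# second statement on the same source line records the true line number.
sub first_statement_warns {
    my ($flag) = @_;
    my ($undef, $actual);
    if ($flag)
    {
        my $sum = $undef + 1; $actual = __LINE__;
    }
    else
    {
        my $sum = $undef + 1; $actual = __LINE__;
    }
    return $actual;
}

for my $flag (1, 0) {
    my $actual   = first_statement_warns($flag);
    my $reported = pop @reported;
    printf "%-4s branch: warning reported line %s, statement is on line %d\n",
        ($flag ? 'if' : 'else'), $reported // 'unknown', $actual;
}
```

The opening brace is deliberately on its own line so that the line of the condition and the line of the first statement differ. Whether the two numbers match for a given branch depends on the perl being tested.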

@tonycoz
Contributor

tonycoz commented Oct 15, 2025

"Force OPf_PARAMS..."

in subject, commit message.

but you force OPf_PARENS (OPf_PARAMS isn't a thing)

@richardleach
Contributor Author

> "Force OPf_PARAMS..."
>
> in subject, commit message.
>
> but you force OPf_PARENS (OPf_PARAMS isn't a thing)

D'oh. Too much thinking about parameters recently. Thanks for spotting it. I'm going to see if the alternative route works without breaking B, before continuing with this PR.

@tonycoz
Contributor

tonycoz commented Oct 15, 2025

I like fixes, but I am worried it will silently change lifetimes and break downstream code.

Though it should really only affect code that depends on destruction side effects; code that depends on the object itself really needs its own reference, which will prevent any early destruction.

@richardleach
Contributor Author

> I like fixes, but I am worried it will silently change lifetimes and break downstream code.

Yeah, that's definitely a potential downside.

Any such code is already brittle; any minor refactoring that moves logic out of an if and into an else block, or vice-versa, will see a change to time-of-destruction. I wonder if any such code has already been made safe / had guardrails added as a result of the developer encountering this bug.
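
As an illustration of what such a guardrail might look like (a sketch only; the `Guard` class and `process` sub are invented here), code that cares about when a destructor runs can release the object explicitly, so the timing no longer depends on which branch it sits in or on how the block is scoped:

```perl
use strict;
use warnings;

our @events;    # ordered log of what happened

package Guard {
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { push @main::events, "DESTROY $_[0]{name}" }
}

sub process {
    my ($flag) = @_;
    if ($flag) {
        my $lock = Guard->new('if-branch');
        push @events, 'work in if';
        undef $lock;    # release explicitly; do not rely on block exit
    } else {
        my $lock = Guard->new('else-branch');
        push @events, 'work in else';
        undef $lock;    # same explicit release in the else branch
    }
    push @events, 'after if/else';
    return;
}

process(1);
process(0);
print "$_\n" for @events;
```

Because `undef $lock` drops the only reference, each DESTROY is logged before the "after if/else" marker in both branches.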

@richardleach richardleach changed the title Force OPf_PARAMS on "if/elsif/unless" optree branches Force OPf_PARENS on "if/elsif/unless" optree branches Oct 16, 2025