@stonechoe stonechoe commented Jul 21, 2025

This pull request depends on #298.

It introduces the -extract:warn-action option, which resolves issue #289 by flagging any newly added “Yet” phrases (for steps and types) in the GitHub Actions workflow of tc39/ecma262.

The option accepts a JSON array of line numbers through standard input and emits a warning for each matching phrase on those lines. (We plan to write documentation explaining the consequences and impact of newly introduced “Yet” phrases on esmeta tycheck.)

Example — Check for lines 1, 2, and 3:

$ esmeta extract -extract:warn-action <<< "[1,2,3]"
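The same invocation can be driven from a script. The following is a minimal sketch, assuming `esmeta` is on the `PATH`; the helper names `warn_action_payload` and `run_warn_action` are hypothetical, and the sorting and de-duplication are a convenience of this sketch, not a documented requirement of the option.

```python
import json
import subprocess


def warn_action_payload(lines):
    """Serialize line numbers as the JSON array that is read from standard input."""
    # Sorting and de-duplicating is an assumption of this sketch, not a requirement.
    return json.dumps(sorted(set(lines)), separators=(",", ":"))


def run_warn_action(lines, esmeta="esmeta"):
    """Run the command from the example above, feeding the payload on stdin."""
    return subprocess.run(
        [esmeta, "extract", "-extract:warn-action"],
        input=warn_action_payload(lines),
        text=True,
        capture_output=True,
        check=False,
    )


print(warn_action_payload([3, 1, 2, 2]))  # [1,2,3]
```

Calling `run_warn_action([1, 2, 3])` would correspond to the shell example above.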

To enable automatic warnings in a pull-request check, pass the lines added in the diff.
See the YAML snippet below for an example configuration. To see how the warning messages appear, check out this example PR.

name: ESMeta New 'Yet' Phrases Detection

on: [pull_request]

jobs:
  esmeta-new-phrases:
    runs-on: ubuntu-latest

    env:
      ESMETA_HOME: vendor/esmeta

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Setup JDK
        uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: 17
      - name: Setup SBT
        uses: sbt/setup-sbt@v1
      - name: download esmeta
        run: |
          mkdir -p "${ESMETA_HOME}"
          cd "${ESMETA_HOME}"
          git init
          git remote add origin https://github.com/es-meta/esmeta.git
          git fetch --depth 1 origin a9b543cc0ae8d467f9b67d03a48aeb2494cb57bf ;
          git checkout FETCH_HEAD
      - name: build esmeta
        run: |
          cd "${ESMETA_HOME}"
          sbt assembly
      - name: link
        run: |
          rmdir "${ESMETA_HOME}"/ecma262 \
            && ln -s "$(pwd)" "${ESMETA_HOME}"/ecma262

      - id: collect
        name: List added line numbers
        shell: bash
        env:
          FILE_PATH: spec.html
          BASE_SHA: ${{ github.event.pull_request.base.sha }}
          HEAD_SHA: ${{ github.sha }}
        run: |
          # zero-context diff, limited to the file we care about
          added_lines=$(git diff -U0 --no-color "${BASE_SHA}" "${HEAD_SHA}" -- "${FILE_PATH}" |
          # keep only the hunk headers (the @@ lines)
          awk '
            # Example header: @@ -158,0 +159,2 @@
            /^@@/ {
              # Extract the “+<start>[,<count>]” field; split() keeps this portable,
              # since the 3-argument match() is a gawk extension
              split($3, a, ",")
              start = substr(a[1], 2) + 0
              count = (a[2] == "" ? 1 : a[2] + 0)
              # Print every line number in the added range
              for (i = 0; i < count; i++) print start + i
            }
          ')

          # Join line numbers with comma or space (whichever format you need)
          added_joined=$(echo "$added_lines" | paste -sd "," -)
          added_lines_json="[$added_joined]"

          # Set it as an output value
          echo "added_lines=$added_lines_json" >> "$GITHUB_OUTPUT"

      - name: check newly introduced phrases
        run: |
          "${ESMETA_HOME}"/bin/esmeta extract \
            -status \
            -extract:warn-action <<< ${{ steps.collect.outputs.added_lines }}
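
The hunk-header arithmetic in the `collect` step can be checked in isolation. The sketch below mirrors that logic in Python on a hand-written sample diff; the names `HUNK_HEADER` and `added_line_numbers` are illustrative and not part of esmeta, and combined (merge) diffs are not handled.

```python
import re

# Matches "@@ -<old>[,<n>] +<start>[,<count>] @@" at the start of a line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@", re.MULTILINE)


def added_line_numbers(diff_text):
    """Return every new-file line number covered by the hunks of a -U0 diff."""
    numbers = []
    for m in HUNK_HEADER.finditer(diff_text):
        start = int(m.group(1))
        # An omitted count means exactly one line; a count of 0 is a pure deletion.
        count = 1 if m.group(2) is None else int(m.group(2))
        numbers.extend(range(start, start + count))
    return numbers


sample = """\
@@ -158,0 +159,2 @@
+new line a
+new line b
@@ -200 +202 @@
-old
+new
@@ -300,3 +301,0 @@
-gone
"""
print(added_line_numbers(sample))  # [159, 160, 202]
```

The third hunk only deletes lines, so it contributes no line numbers.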

@stonechoe stonechoe marked this pull request as draft July 21, 2025 10:50
@stonechoe stonechoe marked this pull request as ready for review July 21, 2025 11:20
@michaelficarra

In the message about a newly-introduced unknown phrase, it says that type checking the body of the AO will not be performed after that line. But this may be misleading if there's another line preceding it that already would cause the type checking to stop. Should we only warn when the newly-introduced unknown phrase does not already follow an existing unknown phrase? I will ask the other editors what they think.

@stonechoe
Member Author

> In the message about a newly-introduced unknown phrase, it says that type checking the body of the AO will not be performed after that line. But this may be misleading if there's another line preceding it that already would cause the type checking to stop. Should we only warn when the newly-introduced unknown phrase does not already follow an existing unknown phrase? I will ask the other editors what they think.

@michaelficarra Thank you for bringing this up. Since there hasn’t been much feedback yet, I’d really appreciate hearing what the other editors think as well. From our side, we’d also be happy to adjust the wording to a more neutral tone if that would be preferable.

@michaelficarra

@stonechoe After discussing with @syg at editor call, we'd like to make sure this won't be too noisy before integrating it. Can you show how many warnings would have been generated in the last 50-or-so commits? Something kind of like this:

#!/bin/bash

mkdir -p added_lines

commits=$(git rev-list --reverse -n 50 HEAD)

for SHA in $commits; do
  short_sha=$(git rev-parse --short=10 $SHA)
  msg=$(git log -1 --pretty=format:'%s' $SHA)
  author=$(git log -1 --pretty=format:'%an' $SHA)
  date=$(git log -1 --pretty=format:'%ad' $SHA)
  
  echo "- $short_sha $msg $author $date"
  
  added_lines_file="added_lines/${short_sha}.json"
  if [ ! -f "$added_lines_file" ]; then
    added_lines=$(git diff -U0 --no-color "${SHA}^" "${SHA}" -- spec.html |
      # keep only the hunk headers (the @@ lines)
      awk '
        # Example header: @@ -158,0 +159,2 @@
        /^@@/ {
          # Extract the “+<start>[,<count>]” field; split() keeps this portable,
          # since the 3-argument match() is a gawk extension
          split($3, a, ",")
          start = substr(a[1], 2) + 0
          count = (a[2] == "" ? 1 : a[2] + 0)
          # Print every line number in the added range
          for (i = 0; i < count; i++) print start + i
        }
      ')

    # Join line numbers with commas and wrap them in a JSON array
    added_joined=$(echo "$added_lines" | paste -sd "," -)
    added_lines_json="[$added_joined]"

    # Cache the JSON array for this commit
    echo "$added_lines_json" > "$added_lines_file"
  fi

  git checkout $SHA
  esmeta ... "$added_lines_file"
  echo
done

@michaelficarra

Oh, also, on the question of whether to do the "smart" thing and avoid "dead" steps, we decided that we would rather do the "dumb" thing and warn about any new or changed steps that are not understood, regardless of context.
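
To make the two policies concrete, here is a small sketch that models a single algorithm body as a list of steps. The `Step` type and both function names are hypothetical illustrations and do not reflect how esmeta implements the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    line: int      # line number of the step in spec.html
    unknown: bool  # True when the step is not understood (a "Yet" phrase)


def warn_regardless_of_context(steps, added):
    """The "dumb" policy: warn on every added or changed step that is unknown."""
    return [s.line for s in steps if s.unknown and s.line in added]


def warn_only_if_not_dead(steps, added):
    """The "smart" alternative: stay silent once an existing unknown step precedes."""
    warned = []
    preceded_by_existing_unknown = False
    for s in sorted(steps, key=lambda s: s.line):
        if not s.unknown:
            continue
        if s.line in added:
            if not preceded_by_existing_unknown:
                warned.append(s.line)
        else:
            preceded_by_existing_unknown = True
    return warned


# Line 11 is an existing unknown step; line 13 is a newly added unknown step.
body = [Step(10, False), Step(11, True), Step(12, False), Step(13, True)]
added = {13}
print(warn_regardless_of_context(body, added))  # [13]
print(warn_only_if_not_dead(body, added))       # []
```

Under the "dumb" policy the new step on line 13 is reported even though type checking would already have stopped at line 11.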

@stonechoe
Member Author

stonechoe commented Sep 15, 2025

@michaelficarra I ran an experiment on the latest 50 and 100 commits on the main branch of ecma262. The results are as follows:

  • Last 50 commits: an average of 1.18 warnings, with a maximum of 16 (the commit with the most warnings was tc39/ecma262@f764049fe0).
  • Last 100 commits: an average of 0.85 warnings, with a maximum of 16 (the commit with the most warnings was tc39/ecma262@f764049fe0).

It seems that commits associated with certain large editorial PRs tend to generate more warnings.
For example, all commits related to tc39/ecma262#2952 showed a noticeably high number of warnings.

These are the top 5 commits that generated the most warnings:

| commit hash | author | date | warnings | commit message |
| --- | --- | --- | --- | --- |
| tc39/ecma262@f764049fe0 | tontonialberto | 2025-07-24T10:23:22-05:00 | 16 | Editorial: unify various forms of if-otherwise in single-line conditionals (#3653) |
| tc39/ecma262@411357131 | Michael Dyck | 2025-08-18T21:05:45-07:00 | 13 | Editorial: Eliminate monkey-patching from "Block-Level Function Declarations..." (#2952) |
| tc39/ecma262@9815f3a5a2 | Michael Dyck | 2025-08-18T21:05:49-07:00 | 8 | Editorial: Eliminate monkey-patching from "VariableStatements in Catch Blocks" (#2952) |
| tc39/ecma262@d430aceebe | Kevin Gibbons | 2025-02-26T22:42:52-08:00 | 7 | Normative: add Float16Array and Math.f16round (#3532) |
| tc39/ecma262@c828627e97 | Nicolò Ribaudo | 2025-03-26T15:29:31-07:00 | 6 | Editorial: Put module concrete methods in their own emu-clauses (#3541) |

The attached CSV contains the detailed results, sorted from oldest to newest commit.

CSV File
short_sha,author,date_iso,warn_count,subject
ff129b14ca,Mathieu Hofman,2025-02-26T16:01:40-08:00,0,Normative: Close sync iterator when async wrapper yields rejection (#2600)
ed75310080,Shu-yu Guo,2025-02-26T16:13:48-08:00,0,Normative: Remove [[VarNames]] from the global (#3226)
e2da759885,Jordan Harband,2025-02-26T21:59:39-08:00,5,Normative: add `RegExp.escape` (#3382)
9552f29892,Kevin Gibbons,2025-02-26T22:27:34-08:00,0,Normative: iterator-producing helpers close receiver on argument validation failure (#3467)
f2bad0095a,Kevin Gibbons,2025-02-26T22:27:41-08:00,0,Normative: consuming helpers close receiver on argument validation failure (#3467)
d430aceebe,Kevin Gibbons,2025-02-26T22:42:52-08:00,7,Normative: add Float16Array and Math.f16round (#3532)
d005528c72,Richard Gibson,2025-02-27T09:43:03-08:00,0,Editorial: Document initial internal slot values at their point of creation (#3537)
d06f3a265a,Michael Dyck,2025-02-28T08:30:35-08:00,1,Editorial: Change remaining `_c_` to `_cp_` in EncodeForRegExpEscape (#3546)
8d92bad94b,Aki 🌹,2025-03-06T07:38:58-08:00,0,Markup: improvements for printing (#3352)
547772d617,Kevin Gibbons,2025-03-06T07:39:02-08:00,0,Meta: bump ecmarkup to v21.0.0 (#3352)
030dcd6c88,Nicolò Ribaudo,2025-03-06T07:47:03-08:00,1,Editorial: Explicitly track async evaluation order of pending modules (#3353)
acf67f9088,Michael Ficarra,2025-03-06T10:28:15-08:00,0,Editorial: rename `v` alias to `view` in DataView.prototype methods (#3547)
ddae0a9eb0,Michael Ficarra,2025-03-06T10:29:23-08:00,3,Editorial: unify NaN phrasing (#3547)
53a42e69a4,Nicolò Ribaudo,2025-03-12T15:30:17-07:00,1,Editorial: Extract JSON parsing into its own AO (#3540)
e42d11da77,Kevin Gibbons,2025-03-19T14:55:29-07:00,0,Markup: consistent spacing for method signatures in tables (#3549)
c828627e97,Nicolò Ribaudo,2025-03-26T15:29:31-07:00,6,Editorial: Put module concrete methods in their own emu-clauses (#3541)
4172303c29,Aki 🌹,2025-03-31T07:53:40-05:00,0,Markup: Add print styles for slashed corner cell (#3550)
0ebab55c31,Kevin Gibbons,2025-03-31T08:01:12-05:00,0,Meta: bump ecmarkup to v21.2.0 (#3553)
654c3f53fc,Kevin Gibbons,2025-03-31T08:08:24-05:00,0,Markup: more slashed corner cells (#3554)
5117d4f48a,Kevin Gibbons,2025-03-31T08:42:58-05:00,0,Editorial: Describe changes in ES2025 (#3552)
ab26103581,Jordan Harband,2025-03-31T22:44:17-05:00,0,Editorial: bump editor edition numbers from #3552
dd4300da71,Kevin Gibbons,2025-03-31T22:46:49-05:00,0,Editorial: main is now ES2026
6ccb804013,André Bargull,2025-04-02T14:46:46-05:00,0,Editorial: Remove unnecessary `undefined` type from property descriptor tests (#3402)
f0b28b6e1e,Michael Ficarra,2025-04-09T21:29:15-07:00,0,Editorial: address circular definition of Iterator.prototype (#3558)
de62e8dd5b,André Bargull,2025-04-09T21:38:47-07:00,2,Normative: Revert ArrayIterator and RegExpStringIterator to manual iterators (#3559)
b8e178d7b4,Shu-yu Guo,2025-04-09T21:48:10-07:00,0,Meta: remove Twitter link for Shu (#3560)
5854e1f9c4,Michael Ficarra,2025-04-09T21:57:41-07:00,0,Meta: replace Twitter link with Bluesky for me (#3561)
b6b9fbead2,Kevin Gibbons,2025-04-09T22:08:36-07:00,0,Meta: remove Twitter link for bakkot (#3562)
2733fc67c9,Michael Dyck,2025-04-23T14:45:34-07:00,0,"Editorial: 'Promote' two ""Properties of Instances"" sections (#3566)"
3c8eb86b89,Michael Dyck,2025-04-23T14:45:45-07:00,0,"Editorial: Add a ""Type"" column to two ""Internal Slots"" tables (#3566)"
d0e5ade6ed,Michael Dyck,2025-04-23T14:45:53-07:00,0,Editorial: Add underscores to alias (#3566)
6e63fb08af,Michael Dyck,2025-04-23T16:26:16-07:00,0,Editorial: Remove spurious '!' (#3070)
7177c073b5,Michael Dyck,2025-04-23T16:26:22-07:00,0,Editorial: Change some '?' to '!' (#3070)
6d71ca0e2d,Jihyeok Park,2025-04-25T13:23:24-07:00,0,Meta: Upgrade ESMeta to v0.6.0 (#3573)
620e926898,Richard Gibson,2025-05-02T23:08:05-07:00,0,Editorial: Explain the nature of TypedArray property handling (#3571)
e0bae9895b,Richard Gibson,2025-05-02T23:08:11-07:00,0,"Editorial: Replace ""{String Number} value"" with just ""{String Number}"" (#3571)"
df0552818a,Richard Gibson,2025-05-02T23:08:15-07:00,0,Editorial: Clarify that TypedArray properties privilege in-band integer index keys (#3571)
1a5472c517,Richard Gibson,2025-05-02T23:08:20-07:00,0,Editorial: Move the canonical numeric string observation to a TypedArray note (#3571)
5d3cb839ca,Kevin Gibbons,2025-05-03T10:37:42-07:00,0,Editorial: fix a couple typos in the sample module executions (#3579)
0fb1859688,Nicolò Ribaudo,2025-05-03T20:46:32-07:00,0,Normative: Mark sync module evaluation promise as handled (#3535)
a562082b03,Michael Ficarra,2025-05-03T21:54:33-07:00,0,Editorial: add a missing space in the colophon (#3577)
4a3b6b4496,Shu-yu Guo,2025-05-08T21:02:00-07:00,0,Editorial: Remove optional waitable parameter (#3591)
9d1085a538,Kevin Gibbons,2025-05-14T15:54:59-07:00,0,Meta: clarify CONTRIBUTING.md (#3575)
560fabf9d4,Jordan Harband,2025-05-14T16:05:14-07:00,0,Meta: IPR: add rbuckton’s commits
58f11725c1,Jordan Harband,2025-05-14T16:12:49-07:00,0,Meta: IPR: fix an exception commit sha
affcec0752,Michael Ficarra,2025-05-14T16:12:59-07:00,0,Editorial: mark generatorBody() as a completion record in GeneratorStart (#3597)
62121b11ef,Kevin Gibbons,2025-05-17T13:55:28-07:00,0,Meta: bump ecmarkup to v21.3.0 (#3576)
2529e75378,Michael Ficarra,2025-05-17T14:08:31-07:00,0,Editorial: mark asyncBody() as a completion record in AsyncBlockStart (#3600)
d146702242,Michael Ficarra,2025-05-18T20:51:44-07:00,0,"Editorial: fix spelling of ""requirements"" (#3610)"
b758ddcf5f,Jon Kuperman,2025-06-03T13:56:33-07:00,0,Meta: Add WebAssembly as a downstream dependency (#3616)
caa0482e4f,Jordan Harband,2025-06-04T14:34:21-07:00,0,Normative: add `Error.isError` (#3507)
bdfd596ffa,André Bargull,2025-06-15T21:34:57-07:00,0,Editorial: Mention AsyncGeneratorFunction constructor in note (#3624)
b5aa26bc57,LegionMammal978,2025-06-25T15:28:41-07:00,0,Editorial: List all TypedArray internal slots (#3429)
1c543976cd,Nicolò Ribaudo,2025-06-26T22:29:41-07:00,0,Editorial: Fix [[AsyncEvaluationOrder]] examples for evaluated modules (#3580)
c8fe69b10d,Kevin Gibbons,2025-06-26T23:00:16-07:00,0,Editorial: mark a callsite of GetBindingValue as not throwing (#3599)
17b5581732,Michael Ficarra,2025-06-26T23:08:54-07:00,0,Editorial: mark the conversion operation in NumericToRawBytes as infallible (#3603)
48cab84557,Kevin Gibbons,2025-06-26T23:18:45-07:00,0,Editorial: FunctionDeclarationInstantiation cannot return ReturnCompletion (#3607)
dd2c5c3b94,Minseok Choe,2025-06-26T23:51:04-07:00,0,Meta: Upgrade ESMeta to v0.6.1 (#3614)
905660db97,Nicolò Ribaudo,2025-06-27T08:26:48-07:00,0,Editorial: Replace [[DFSIndex]] with a local variable (#3625)
7af34ed6e2,Kevin Gibbons,2025-06-27T13:53:29-07:00,0,Editorial: fix missed reordering of TA slots in intro
ca045a01b6,Richard Gibson,2025-06-30T14:25:54-07:00,0,Editorial: Consolidate TypedArrayCreate* slot checks (#3632)
427ef8ae2b,Richard Gibson,2025-06-30T14:26:01-07:00,1,Editorial: Co-locate TypedArrayCreateSameType and TypedArraySpeciesCreate (#3632)
60c4df0b65,Richard Gibson,2025-06-30T14:26:05-07:00,1,Editorial: Define SetTypedArrayFromArrayLike before SetTypedArrayFromTypedArray (#3632)
fb272affa3,Jordan Harband,2025-07-14T13:23:20-07:00,0,Meta: ipr exceptions: add a deactivated user
4f92e3ed7e,Jordan Harband,2025-07-14T13:13:18-07:00,0,Meta: update github action workflows
9b6a4a43ca,Ross Kirsling,2025-07-14T16:01:19-07:00,2,"Normative: Add ""Late Errors for Function Call Assignment Targets"" to Annex B (#3568)"
ac44bdf6b6,Michael Ficarra,2025-07-14T16:11:32-07:00,0,Editorial: wrap AC return values in completion records where needed (#3601)
1cad710b99,André Bargull,2025-07-14T16:34:15-07:00,0,Normative: WeakRef.prototype.constructor should be writable (#3638)
3881d95ddf,André Bargull,2025-07-14T18:30:19-07:00,0,Editorial: Refer to [[Call]] as an internal method (#3641)
bc27c7d894,André Bargull,2025-07-14T19:51:23-07:00,0,"Editorial: Include ""object"" in dfn for WeakRef and FinalizationRegistry prototype object (#3642 / #3097)"
7f97b40067,André Bargull,2025-07-14T19:51:24-07:00,0,"Editorial: Add ""object"" to allow linking to the prototype object (#3642)"
b324a0dbcb,tontonialberto,2025-07-14T20:02:14-07:00,1,"Editorial: Remove unnecessary ""the value of"" when referencing properties (#3643)"
3d77f663db,Michael Ficarra,2025-07-14T20:18:58-07:00,0,Editorial: consistently order undefined before null in comparisons (#3647)
b3f6a0ae5f,Jihyeok Park,2025-07-16T20:42:04-07:00,0,Meta: Upgrade ESMeta to v0.6.2 (#3650)
b6f76eab39,Michael Ficarra,2025-07-17T15:17:50-07:00,0,Meta: make esmeta log files available for download after CI run (#3651)
f065126986,Jihyeok Park,2025-07-24T10:13:53-05:00,0,Meta: Upgrade ESMeta to v0.6.4 (#3656)
c0c7445db4,tontonialberto,2025-07-24T10:23:20-05:00,0,Editorial: Use semicolon and lowercase 'otherwise' in single-step conditionals (#3653)
f764049fe0,tontonialberto,2025-07-24T10:23:22-05:00,16,Editorial: unify various forms of if-otherwise in single-line conditionals (#3653)
6f61b741a7,Shu-yu Guo,2025-07-27T22:13:53-07:00,0,Meta: Update my email (#3661)
a839b2b63e,Michael Ficarra,2025-08-13T10:44:13-07:00,0,Editorial: allow Symbols and Private Names to have internals slots (#3644)
390280763c,tontonialberto,2025-08-13T10:44:25-07:00,0,Editorial: Replace genitive with default notation when referencing properties (#3644)
1439803692,tontonialberto,2025-08-13T12:49:09-07:00,5,Editorial: replace tables in AOs with algorithm steps (#3666)
a7c67f6a2d,YAMAMOTO Yuji,2025-08-13T12:58:38-07:00,0,Editorial: Enum value should be SANS-SERIF (#3671)
654e339887,Michael Dyck,2025-08-18T21:05:31-07:00,0,Editorial: Link to alt FunctionDeclaration semantics (#2952)
8693c9940f,Michael Dyck,2025-08-18T21:05:38-07:00,0,"Editorial: Tweak the defn of ""Normative Optional"" (#2952)"
766b80c263,Michael Dyck,2025-08-18T21:05:41-07:00,0,"Editorial: Eliminate monkey-patching from ""Labelled Function Declarations"" (#2952)"
411357131,Michael Dyck,2025-08-18T21:05:45-07:00,13,"Editorial: Eliminate monkey-patching from ""Block-Level Function Declarations..."" (#2952)"
9815f3a5a2,Michael Dyck,2025-08-18T21:05:49-07:00,8,"Editorial: Eliminate monkey-patching from ""VariableStatements in Catch Blocks"" (#2952)"
f6017b2590,Michael Dyck,2025-08-18T21:05:53-07:00,3,"Editorial: Eliminate monkey-patching from ""The [[IsHTMLDDA]] Internal Slot"" (#2952)"
f828db8539,Michael Dyck,2025-08-18T21:05:56-07:00,5,"Editorial: Change ""Let"" to ""Set"" in inlined steps in EvalDeclarationInstantiation (#2952)"
27172c843b,Kevin Gibbons,2025-08-18T21:35:43-07:00,0,Normative: add Math.sumPrecise (#3654)
b507d6e94e,Michael Ficarra,2025-08-18T21:59:33-07:00,0,Editorial: add oldids for IDs removed by #3666 (#3674)
097963e9cb,Michael Ficarra,2025-08-18T22:09:22-07:00,0,Editorial: add note to SetFunctionName clarifying optionality (#3675)
948f702479,Kevin Gibbons,2025-08-20T09:25:25-07:00,0,Meta: remove unused pagedjs dependencies (#3678)
52187c8e72,senocular,2025-08-21T10:33:07-07:00,0,Editorial: remove unnecessary pluralization (#3681)
858b1df4f4,Gus Caplan,2025-08-25T15:35:25-07:00,4,Normative: Set [[SourceText]] on classes before static blocks evaluate (#2840)
3864092cab,Aapo Alasuutari,2025-08-25T16:33:20-07:00,0,Editorial: TypedArrayCreateSameType take length as parameter (#3570)
11d9dc66a7,Aki 🌹,2025-08-25T22:17:48-07:00,0,Markup: rendering edits for 2025 edition & beyond (#3623)
1da362c7cf,Aapo Alasuutari,2025-08-25T22:26:20-07:00,0,Editorial: Get/Set 'lastIndex' on RegExp object cannot trigger user-code (#3636)
f92c1ae625,Michael Ficarra,2025-08-25T22:30:54-07:00,0,Editorial: make RequireObjectCoercible return `~unused~` (#3676)
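
The averages and maxima quoted above can be recomputed from this CSV. The following is a minimal sketch, assuming the rows are ordered oldest to newest as stated and that the header matches the one shown; the function name `summarize` is illustrative, and the inline sample reuses four rows from the data above.

```python
import csv
import io


def summarize(csv_text, last_n):
    """Return (average, maximum) of warn_count over the newest last_n rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = [int(r["warn_count"]) for r in rows[-last_n:]]
    return sum(counts) / len(counts), max(counts)


sample = """\
short_sha,author,date_iso,warn_count,subject
f065126986,Jihyeok Park,2025-07-24T10:13:53-05:00,0,Meta: Upgrade ESMeta to v0.6.4 (#3656)
c0c7445db4,tontonialberto,2025-07-24T10:23:20-05:00,0,Editorial: Use semicolon and lowercase 'otherwise' in single-step conditionals (#3653)
f764049fe0,tontonialberto,2025-07-24T10:23:22-05:00,16,Editorial: unify various forms of if-otherwise in single-line conditionals (#3653)
6f61b741a7,Shu-yu Guo,2025-07-27T22:13:53-07:00,0,Meta: Update my email (#3661)
"""
print(summarize(sample, 2))  # (8.0, 16)
print(summarize(sample, 4))  # (4.0, 16)
```

Passing the full CSV text with `last_n=50` or `last_n=100` gives a way to cross-check the reported figures.
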
Check Script
  • Place the Python script check.py in $ESMETA_HOME/ecma262 and run it as follows:
$ python3 check.py spec.html --limit 104 --esmeta-home "$ESMETA_HOME" \
  --debug \
  --save-raw-dir out/esmeta

#!/usr/bin/env python3
import argparse, csv, json, os, re, subprocess, sys, time

HUNK_RE = re.compile(r"\@\@.*\+(\d+)(?:,(\d+))?\s\@\@")
DEFAULT_WARN_RE = re.compile(r"(?i)\bwarn(ing)?\b|\bnew(?:ly)?\b|\bphrase\b")  # adjust to your esmeta output

# --------------------------- git helpers ---------------------------

def sh_capture(*args, allow_fail=False):
    try:
        out = subprocess.check_output(args, stderr=subprocess.STDOUT)
        return out.decode("utf-8", "replace")
    except subprocess.CalledProcessError as e:
        if allow_fail:
            return e.output.decode("utf-8", "replace")
        raise

def list_commits(limit: int):
    out = sh_capture("git","rev-list","--first-parent","--reverse","-n",str(limit),"HEAD")
    return [l for l in out.splitlines() if l.strip()]

def commit_meta(sha, fmt):
    return sh_capture("git","log","-1",f"--pretty=format:{fmt}",sha).strip()

def first_parent(sha):
    parts = sh_capture("git","rev-list","--parents","-n","1",sha).split()
    return parts[1] if len(parts) > 1 else None

def added_lines_for_commit(parent, sha, target):
    if not parent:
        content = sh_capture("git","show",f"{sha}:{target}", allow_fail=True)
        if not content:
            return []
        n = content.count("\n") + (0 if content.endswith("\n") else 1)
        return list(range(1, n+1))
    diff = sh_capture("git","diff","-U0","--no-color","--find-renames",parent,sha,"--",target, allow_fail=True)
    if not diff:
        return []
    out = []
    for m in HUNK_RE.finditer(diff):
        start = int(m.group(1)); cnt = int(m.group(2) or "1")
        out.extend(range(start, start+cnt))
    return out

# --------------------------- esmeta via shell ---------------------------

def run_esmeta_via_shell(cmd_str, added_lines_json, sha,
                         shell_path="/bin/zsh", debug=False, extra_env=None):
    """
    Execute esmeta through a shell so scripts without shebang still work.
    sha: commit SHA, passed as -extract:target=<sha>
    """
    env = os.environ.copy()
    if extra_env:
        env.update(extra_env)

    full_cmd = f"{cmd_str} -extract:target={sha}"

    t0 = time.perf_counter()
    proc = subprocess.run(
        full_cmd,
        input=added_lines_json, text=True,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        shell=True, executable=shell_path, env=env, check=False
    )
    dt = time.perf_counter() - t0
    if debug:
        print(f"[DEBUG] shell: {shell_path}", file=sys.stderr)
        print(f"[DEBUG] cmd: {full_cmd}", file=sys.stderr)
        print(f"[DEBUG] elapsed: {dt:.3f}s", file=sys.stderr)
    return proc.stdout or ""

# --------------------------- main ---------------------------

def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("target", nargs="?", default="spec.html", help="file to inspect (default: spec.html)")
    ap.add_argument("--limit", type=int, default=50, help="number of commits to scan (default: 50)")
    ap.add_argument("--outdir", default="added_lines", help="dir to cache per-commit added line json (default: added_lines)")
    ap.add_argument("--csv", default="warnings_summary.csv", help="output CSV path (default: warnings_summary.csv)")
    ap.add_argument("--warn-re", default=None, help="Python regex to match warning lines in esmeta output")
    ap.add_argument("--esmeta-home", default=os.environ.get("ESMETA_HOME"),
                    help="path to esmeta (for $ESMETA_HOME in command env)")
    ap.add_argument("--esmeta-cmd", default=None,
                    help="full shell command line for esmeta; if omitted, uses '\"$ESMETA_HOME/bin/esmeta\" extract -status -extract:warn-action'")
    ap.add_argument("--esmeta-arg", action="append", default=[],
                    help="extra args appended (space-joined) to esmeta-cmd; repeatable")
    ap.add_argument("--shell", default="/bin/zsh", help="shell to execute command (default: /bin/zsh; use /bin/sh if preferred)")
    ap.add_argument("--no-esmeta", action="store_true", help="skip esmeta invocation (structure only)")
    ap.add_argument("--debug", action="store_true", help="print command, sizes, timing to stderr")
    ap.add_argument("--save-raw-dir", default=None,
                    help="directory to save per-commit raw esmeta stdout (.out) and stdin json (.in.json)")
    args = ap.parse_args()

    os.makedirs(args.outdir, exist_ok=True)
    if args.save_raw_dir:
        os.makedirs(args.save_raw_dir, exist_ok=True)

    warn_re = re.compile(args.warn_re) if args.warn_re else DEFAULT_WARN_RE

    # Default esmeta command if not provided
    if not args.no_esmeta:
        if not args.esmeta_cmd:
            if not args.esmeta_home:
                print("ERROR: provide --esmeta-home or --esmeta-cmd", file=sys.stderr)
                sys.exit(2)
            args.esmeta_cmd = f'"{os.path.join(args.esmeta_home, "bin", "esmeta")}" extract -status -extract:warn-action'
        # Append extra args if any
        if args.esmeta_arg:
            args.esmeta_cmd = args.esmeta_cmd + " " + " ".join(args.esmeta_arg)

    total_warns = 0
    with open(args.csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["short_sha","author","date_iso","warn_count","one_example","subject"])

        for sha in list_commits(args.limit):
            ssha    = sh_capture("git","rev-parse","--short=10",sha).strip()
            subject = commit_meta(sha, "%s").replace(",", " ")
            author  = commit_meta(sha, "%an").replace(",", " ")
            dateiso = commit_meta(sha, "%cI")

            # cache added lines json per commit
            cache_json = os.path.join(args.outdir, f"{ssha}.json")
            if not os.path.exists(cache_json):
                parent = first_parent(sha)
                nums = added_lines_for_commit(parent, sha, args.target)
                with open(cache_json, "w", encoding="utf-8") as jf:
                    json.dump(nums, jf, ensure_ascii=False)

            warn_count, example = 0, "-"
            if not args.no_esmeta:
                with open(cache_json, "r", encoding="utf-8") as jf:
                    added_json = jf.read()
                output = run_esmeta_via_shell(
                    cmd_str=args.esmeta_cmd,
                    added_lines_json=added_json,
                    sha=sha,                                # <--- pass the commit SHA here
                    shell_path=args.shell,
                    debug=args.debug,
                    extra_env={"ESMETA_HOME": args.esmeta_home} if args.esmeta_home else None,
                )

                # optional raw save
                if args.save_raw_dir:
                    with open(os.path.join(args.save_raw_dir, f"{ssha}.out"), "w", encoding="utf-8") as fo:
                        fo.write(output)
                    with open(os.path.join(args.save_raw_dir, f"{ssha}.in.json"), "w", encoding="utf-8") as fi:
                        fi.write(added_json)

                # summarize warnings: count the "::warning" lines in the esmeta output
                lines = [l for l in output.splitlines() if l.startswith("::warning")]
                warn_count = len(lines)
                total_warns += warn_count
                example = lines[0] if lines else "-"

            print(f"- {ssha} {author} {dateiso}  warnings={warn_count}")
            writer.writerow([ssha, author, dateiso, warn_count, example, subject])

    print(f"\nTOTAL warnings across last {args.limit} commits: {total_warns}")
    print(f"Summary written to: {args.csv}")

if __name__ == "__main__":
    main()

@stonechoe
Member Author

@michaelficarra We’re considering merging this PR and then opening a corresponding PR in tc39/ecma262. If editors have already decided not to proceed with this feature, please let me know. Otherwise, I think opening one on ecma262 would help move things along more quickly.

@michaelficarra

@stonechoe I'm sorry I forgot to get back to you earlier! We talked about the results you posted in our editor call. The other editors felt that it was a bit too noisy at the moment to integrate, but we believe there is a good amount of low-hanging fruit that pretty much just needs parser support. Take a look at some samples of not-yet-understood steps and types I took from an ECMA-262 CI run: https://gist.github.com/michaelficarra/8ed4c5d1243dc1465f14e90623e1c8c9. We think if there's parser support added for the steps which are effectively just aliases of other currently-understood steps, the signal-to-noise ratio would be acceptable enough to integrate into our CI.

So feel free to open the PR whenever you like, but we'll need to work to reduce the noise a little before we can merge it.

@jhnaldo
Contributor

jhnaldo commented Oct 31, 2025

@michaelficarra Thanks for sharing your concerns. We recently refactored our metalanguage parser (#316). After this revision, the recent results are as follows for the recent commits in ecma262:

| PR | Author | Name | Added/Modified Lines | Unknown Step | Unknown Type |
| --- | --- | --- | --- | --- | --- |
| #3655 | Kevin Gibbons | test1 | 406 | 6 | 4 |
| #3660 | mobius29 | test2 | 1 | 0 | 0 |
| #3691 | Kevin Gibbons | test3 | 1 | 0 | 0 |
| #3513 | Richard Gibson | test4 | 14 | 2 | 1 |
| #3595 | Aapo Alasuutari | test5 | 3 | 0 | 0 |
| #3608 | Michael Ficarra | test6 | 1 | 0 | 0 |
| #3613 | Nicolò Ribaudo | test7 | 2 | 0 | 0 |
| #3613 | Jordan Harband | test8 | 1 | 0 | 0 |
| #3619 | Richard Gibson | test9 | 7 | 0 | 0 |
| #3620 | André Bargull | test10 | 326 | 0 | 0 |
| #3664 | mobius29 | test11 | 2 | 0 | 0 |
| #3687 | Nicolò Ribaudo | test12 | 5 | 0 | 0 |
| #3694 | YAMAMOTO Yuji | test13 | 1 | 0 | 0 |
You can see a preview of our warning messages by clicking each testXX for the corresponding PRs. Could you provide us with your opinion on the results? If they are still verbose, we will apply this feature after covering more steps.

@michaelficarra

@jhnaldo We talked about this at the last editor call and we are willing to give it a try now with the improved parser. Thanks for doing that work!

@jhnaldo
Contributor

jhnaldo commented Nov 11, 2025

@michaelficarra Thanks for the update! That's excellent news. I will create a PR on the ecma262 repository soon.

@jhnaldo jhnaldo merged commit 36089f3 into dev Nov 11, 2025
6 checks passed
@jhnaldo jhnaldo deleted the feat-warn-ci branch November 11, 2025 04:21

4 participants