New sca_match_kind and CallTransitiveReachable RPC v0 #342

aryx · 2025-02-05T15:12:25Z

test plan:
see related PR in semgrep

I ran make setup && make to update the generated code after editing a .atd file (TODO: have a CI check)
I made sure we're still backward compatible with old versions of the CLI.
For example, the Semgrep backend still needs to be able to consume data
generated by Semgrep 1.50.0.
See https://atd.readthedocs.io/en/latest/atdgen-tutorial.html#smooth-protocol-upgrades
Note that the types related to the semgrep-core JSON output or the
semgrep-core RPC do not need to be backward compatible!
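To make the backward-compatibility item concrete, here is a minimal sketch in plain Python of a reader that accepts both an old payload and a new one. It is not the generated atdgen code: the `ScaMatch` class, the `reachable` and `kind` field names, and `parse_sca_match` are hypothetical, chosen only to show a newly added field being treated as optional so that data from an older CLI still parses.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaMatch:
    # Hypothetical field that old and new producers both emit.
    reachable: bool
    # Hypothetical field added later; absent in payloads from older CLIs.
    kind: Optional[str] = None


def parse_sca_match(raw: str) -> ScaMatch:
    """Parse a payload, defaulting the newer field when it is missing."""
    data = json.loads(raw)
    return ScaMatch(reachable=data["reachable"], kind=data.get("kind"))


old_payload = '{"reachable": false}'
new_payload = '{"reachable": true, "kind": "TransitiveReachable"}'
old_match = parse_sca_match(old_payload)
new_match = parse_sca_match(new_payload)
```

Making the new field required instead would reject `old_payload`, which is the kind of break the compatibility check is meant to catch.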

github-actions · 2025-02-05T15:13:31Z

Backwards compatibility summary:

Checking backward compatibility of semgrep_output_v1.atd against past version v1.100.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.101.0
Skipping v1.102.0 because commit 1c82453e89e0b569630e48ddde015e201df0e5f9 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.103.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.104.0
Skipping v1.106.0 because commit 5e0c767ec323f3f2356d3bf8dbdf7c7836497d8a has already been checked
Skipping v1.107.0 because commit 5e0c767ec323f3f2356d3bf8dbdf7c7836497d8a has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.49.0
Skipping v1.50.0 because commit 857682f41eb09e0b330a247ff1adf3bfeaf9d9ca has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.52.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.53.0
Skipping v1.54.0 because commit 3b72d494260258497e796d094b1a4916501a6df1 has already been checked
Skipping v1.54.1 because commit 3b72d494260258497e796d094b1a4916501a6df1 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.54.2
Skipping v1.54.3 because commit 9f1c50383a9a9969e2fe7a5f9bff9ca0a7c837bb has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.55.0
Skipping v1.55.1 because commit 6dffeaa692153fd33b4f154fddaefde1f2f1ae27 has already been checked
Skipping v1.55.2 because commit 6dffeaa692153fd33b4f154fddaefde1f2f1ae27 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.56.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.57.0
Skipping v1.58.0 because commit 4cc11b00d411c02fc611aa8c78a336520438fb48 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.59.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.59.1
Checking backward compatibility of semgrep_output_v1.atd against past version v1.60.0
Skipping v1.60.1 because commit eed58a091fd7d19e402a6d4cf2d287e137215d03 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.61.0
Skipping v1.61.1 because commit bbfd1c5b91bd411bceffc3de73f5f0b37f04433d has already been checked
Skipping v1.62.0 because commit bbfd1c5b91bd411bceffc3de73f5f0b37f04433d has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.63.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.64.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.65.0
Skipping v1.66.0 because commit 3e7bbafa2b7e722d893303a7fb90a83dab6737a7 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.66.1
Skipping v1.66.2 because commit 215a54782174de84f97188632b4a37e35ba0f827 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.67.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.68.0
Skipping v1.69.0 because commit d5b91fa4f6a03240db31e9bbbc5376a99bc8eeea has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.70.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.71.0
Skipping v1.72.0 because commit 75abf193687b84ab341d8267d865ad68d81a89c9 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.73.0
Skipping v1.74.0 because commit 9f38254957c50c68ea402eebae0f7aa40dd01cbf has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.75.0
Skipping v1.76.0 because commit 9102031608aa4154e1c37f557550ec4eabc8780c has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.77.0
Skipping v1.78.0 because commit dcb5d77b420ddee61f58aadd3c2c7aef38778154 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.79.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.80.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.81.0
Skipping v1.82.0 because commit 9e0f3bec26b07b4fb6753a32cb75277f45f2572c has already been checked
Skipping v1.83.0 because commit 9e0f3bec26b07b4fb6753a32cb75277f45f2572c has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.84.0
Skipping v1.84.1 because commit 3daef49297ada205359cc1d2996354c94b628b0d has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.85.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.86.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.87.0
Skipping v1.88.0 because commit 512c0bd97db59c48a5705b2741662a338776e438 has already been checked
Skipping v1.89.0 because commit 512c0bd97db59c48a5705b2741662a338776e438 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.90.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.91.0
Skipping v1.92.0 because commit 2351c5e528cb7430422208dc66707894c066b508 has already been checked
Checking backward compatibility of semgrep_output_v1.atd against past version v1.93.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.94.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.95.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.96.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.97.0
Checking backward compatibility of semgrep_output_v1.atd against past version v1.98.0
Skipping v1.99.0 because commit 60809032a2e39742f42910d46b3e5dd305b8b8cf has already been checked

semgrep_output_v1.atd

bkettle

This spurred a bunch of questions in my head which I then talked to @mmcqd about. We sometimes have dependencies that are used both transitively and directly. Currently, we just treat these as direct dependencies and declare findings to be unreachable if we cannot find a match in first-party code. We should stop doing so, since we want to be more rigorous about our handling of transitive risk: even if a vulnerable dependency is unreachable directly, it may still be reachable via some third-party dependency that also uses it.

I saw two ways of handling this:

1. We can try to encompass both direct and transitive cases in a single finding. This is nice in some cases (e.g. if it is neither directly nor transitively reachable, it is nice to have a single unreachable finding) but it is much harder to make the analysis that we performed clear (we would need to build complicated UX to allow customers to see that, for example, a finding was directly unreachable but transitively reachable). This is close to what we currently do, but some customers have complained that the behavior of Semgrep is not clear when a dependency is both direct and transitive.
2. We can create two findings when a dependency is both direct and transitive, one for the transitive usage and one for the direct usage. We would then apply only one type of analysis to each finding. The direct finding would be one of DirectReachable, DirectUnreachable, or LockfileOnlyMatch Direct, and the transitive finding would be one of TransitiveReachable, TransitiveUnreachable, TransitiveUnknown, or LockfileOnlyMatch Transitive.

In either case, I think we will need to add another value to the Transitivity sum type in FoundDependency that allows a dependency to be TransitiveAndDirect.

Matthew and I thought that it made more sense to go with approach number 2, which splits findings in two when a dependency is both direct and transitive. This way, each sca_match_kind corresponds to one type of analysis. In option one, the sca_match_kind would need to encompass multiple types of analysis (direct and transitive) in a way that would likely get awkward, especially when there are multiple direct code matches that produce multiple findings. I would love to hear what you think of this, though, and am happy to chat about it in the morning.
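As a rough illustration of approach 2, the sketch below splits a dependency that is both direct and transitive into two findings, each tied to exactly one type of analysis. It is written in Python rather than ATD and is not the actual schema: `Transitivity.TRANSITIVE_AND_DIRECT`, `Analysis`, `Finding`, and `findings_for` are hypothetical names. Only the match-kind names in the two sets come from the discussion above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Transitivity(Enum):
    DIRECT = "direct"
    TRANSITIVE = "transitive"
    # Hypothetical extra value for dependencies used both ways.
    TRANSITIVE_AND_DIRECT = "transitive_and_direct"


class Analysis(Enum):
    DIRECT = "direct"
    TRANSITIVE = "transitive"


# Match kinds each analysis may produce, as listed in approach 2.
ALLOWED_KINDS = {
    Analysis.DIRECT: {
        "DirectReachable",
        "DirectUnreachable",
        "LockfileOnlyMatch Direct",
    },
    Analysis.TRANSITIVE: {
        "TransitiveReachable",
        "TransitiveUnreachable",
        "TransitiveUnknown",
        "LockfileOnlyMatch Transitive",
    },
}


@dataclass(frozen=True)
class Finding:
    package: str
    analysis: Analysis


def findings_for(package: str, transitivity: Transitivity) -> List[Finding]:
    """Return one finding per type of analysis that applies to the dependency."""
    if transitivity is Transitivity.DIRECT:
        return [Finding(package, Analysis.DIRECT)]
    if transitivity is Transitivity.TRANSITIVE:
        return [Finding(package, Analysis.TRANSITIVE)]
    # Both direct and transitive: split into two findings.
    return [
        Finding(package, Analysis.DIRECT),
        Finding(package, Analysis.TRANSITIVE),
    ]


split = findings_for("example-package", Transitivity.TRANSITIVE_AND_DIRECT)
```

Because every finding carries a single analysis, its match kind can be checked against one entry of `ALLOWED_KINDS` instead of a combined direct-plus-transitive state.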

semgrep_output_v1.atd

bkettle · 2025-02-05T23:30:08Z

semgrep_output_v1.atd

+   * (reachable as originally defined by Semgrep Inc.)
+   * the match location will be in some target code.
+   *)
+  | Reachable


We probably need an Unreachable option as well, right? Unreachable feels like a fundamentally different type of match because we ran the pattern but didn't find any results. Like TransitiveUnreachable, direct Unreachable is also a positive finding.

But if we didn't find any results in first-party code, there will be no finding.
Or actually we might at first just generate a LockfileOnlyMatch Transitive, which hopefully we can then
transform into a TransitiveUnreachable.
But let's keep this Unreachable for now; we'll see.
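The two-step idea in this reply (emit a lockfile-only match first, then upgrade it once transitive analysis has run) could look roughly like the Python sketch below. The `refine` function and its `reachable` parameter are hypothetical, and leaving the kind unchanged when no analysis result is available is an assumption, not something decided in this thread.

```python
from typing import Optional

LOCKFILE_ONLY_TRANSITIVE = "LockfileOnlyMatch Transitive"


def refine(kind: str, reachable: Optional[bool]) -> str:
    """Upgrade a lockfile-only transitive match using an analysis result.

    reachable is True or False when transitive analysis produced a verdict,
    and None when it did not run (assumption: the kind is then left as-is).
    """
    if kind != LOCKFILE_ONLY_TRANSITIVE or reachable is None:
        return kind
    return "TransitiveReachable" if reachable else "TransitiveUnreachable"


upgraded = refine(LOCKFILE_ONLY_TRANSITIVE, False)
```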

semgrep_output_v1.atd

bkettle · 2025-02-06T02:41:10Z

I committed some changes that might make sense if we go with option 2 above

aryx · 2025-02-06T08:24:57Z

I like option 2.
Let's merge this.

New sca_match_kind and CallTransitiveReachable RPC v0

263beaa

test plan: see related PR in semgrep

aryx requested review from a team, brandonspark and bkettle and removed request for a team and brandonspark February 5, 2025 15:12

bkettle approved these changes Feb 6, 2025

bkettle and others added 2 commits February 5, 2025 18:56

proposed changes

ddba919

Merge branch 'main' into rpc_tr_step1

3727543

aryx merged commit 2ec9015 into main Feb 6, 2025
3 checks passed

aryx deleted the rpc_tr_step1 branch February 6, 2025 16:42
