Conversation

grantseltzer (Member) commented:

What does this PR do?

Creates the dyninst/exprlang package, which lays the foundation for supporting the live debugger expression language. The package parses the DSL JSON message received from remote config, validates that it is supported, and provides utilities for working with it, such as collecting all variables referenced in the AST.

Motivation

Support for the live debugger expression language.

Describe how you validated your changes

Added tests

Additional Notes

@grantseltzer grantseltzer requested a review from a team as a code owner October 10, 2025 15:54
@grantseltzer grantseltzer added changelog/no-changelog team/dynamic-instrumentation Dynamic Instrumentation qa/done QA done before merge and regressions are covered by tests labels Oct 10, 2025
@github-actions github-actions bot added the medium review PR review might take time label Oct 10, 2025

cit-pr-commenter bot commented Oct 10, 2025

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: 9408d9b3-7969-486b-a15d-4bf7c63f470e

Baseline: 466f1b0
Comparison: 54cca2b
Diff

Optimization Goals: ❌ Regression(s) detected

| perf experiment | goal | Δ mean % | Δ mean % CI | trials | links |
| --- | --- | --- | --- | --- | --- |
| docker_containers_memory | memory utilization | +10.07 | [+9.58, +10.55] | 1 | Logs |

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

| perf experiment | goal | Δ mean % | Δ mean % CI | trials | links |
| --- | --- | --- | --- | --- | --- |
| docker_containers_cpu | % cpu utilization | +13.59 | [+11.82, +15.36] | 1 | Logs |

Fine details of change detection per experiment

| perf experiment | goal | Δ mean % | Δ mean % CI | trials | links |
| --- | --- | --- | --- | --- | --- |
| docker_containers_cpu | % cpu utilization | +13.59 | [+11.82, +15.36] | 1 | Logs |
| docker_containers_memory | memory utilization | +10.07 | [+9.58, +10.55] | 1 | Logs |
| quality_gate_logs | % cpu utilization | +3.49 | [+0.65, +6.33] | 1 | Logs, bounds checks dashboard |
| ddot_metrics | memory utilization | +0.39 | [+0.22, +0.55] | 1 | Logs |
| ddot_metrics_sum_cumulative | memory utilization | +0.36 | [+0.24, +0.48] | 1 | Logs |
| otlp_ingest_logs | memory utilization | +0.33 | [+0.19, +0.46] | 1 | Logs |
| tcp_syslog_to_blackhole | ingress throughput | +0.21 | [+0.14, +0.28] | 1 | Logs |
| quality_gate_idle_all_features | memory utilization | +0.12 | [+0.08, +0.16] | 1 | Logs, bounds checks dashboard |
| ddot_metrics_sum_delta | memory utilization | +0.11 | [-0.04, +0.26] | 1 | Logs |
| uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | +0.07 | [+0.03, +0.11] | 1 | Logs |
| file_to_blackhole_1000ms_latency | egress throughput | +0.06 | [-0.55, +0.66] | 1 | Logs |
| file_to_blackhole_0ms_latency | egress throughput | +0.04 | [-0.56, +0.64] | 1 | Logs |
| uds_dogstatsd_to_api | ingress throughput | +0.02 | [-0.21, +0.24] | 1 | Logs |
| tcp_dd_logs_filter_exclude | ingress throughput | +0.00 | [-0.01, +0.01] | 1 | Logs |
| file_to_blackhole_500ms_latency | egress throughput | -0.03 | [-0.63, +0.57] | 1 | Logs |
| ddot_metrics_sum_cumulativetodelta_exporter | memory utilization | -0.04 | [-0.24, +0.15] | 1 | Logs |
| file_to_blackhole_100ms_latency | egress throughput | -0.06 | [-0.66, +0.53] | 1 | Logs |
| otlp_ingest_metrics | memory utilization | -0.08 | [-0.21, +0.05] | 1 | Logs |
| quality_gate_idle | memory utilization | -0.12 | [-0.16, -0.08] | 1 | Logs, bounds checks dashboard |
| file_tree | memory utilization | -0.26 | [-0.31, -0.21] | 1 | Logs |
| ddot_logs | memory utilization | -0.33 | [-0.39, -0.27] | 1 | Logs |
| quality_gate_metrics_logs | memory utilization | -0.69 | [-0.90, -0.48] | 1 | Logs, bounds checks dashboard |

Bounds Checks: ❌ Failed

| perf experiment | bounds_check_name | replicates_passed | links |
| --- | --- | --- | --- |
| docker_containers_cpu | simple_check_run | 10/10 | |
| docker_containers_memory | memory_usage | 9/10 | |
| docker_containers_memory | simple_check_run | 10/10 | |
| file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
| file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
| file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
| file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
| file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
| file_to_blackhole_500ms_latency | lost_bytes | 10/10 | |
| file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
| quality_gate_idle | intake_connections | 10/10 | bounds checks dashboard |
| quality_gate_idle | memory_usage | 10/10 | bounds checks dashboard |
| quality_gate_idle_all_features | intake_connections | 10/10 | bounds checks dashboard |
| quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
| quality_gate_logs | intake_connections | 10/10 | bounds checks dashboard |
| quality_gate_logs | lost_bytes | 10/10 | bounds checks dashboard |
| quality_gate_logs | memory_usage | 10/10 | bounds checks dashboard |
| quality_gate_metrics_logs | cpu_usage | 10/10 | bounds checks dashboard |
| quality_gate_metrics_logs | intake_connections | 10/10 | bounds checks dashboard |
| quality_gate_metrics_logs | lost_bytes | 10/10 | bounds checks dashboard |
| quality_gate_metrics_logs | memory_usage | 10/10 | bounds checks dashboard |

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

CI Pass/Fail Decision

Passed. All Quality Gates passed.

  • quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.


agent-platform-auto-pr bot commented Oct 10, 2025

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor 466f1b0

Successful checks

Info

| Quality gate | Delta (disk) | On disk size (MiB) | Delta (wire) | On wire size (MiB) |
| --- | --- | --- | --- | --- |
| agent_deb_amd64 | 0 | 676.04 < 702.15 | +0.01 | 164.98 < 177.79 |
| agent_deb_amd64_fips | 0 | 670.55 < 696.65 | +0.03 | 164.23 < 176.53 |
| agent_heroku_amd64 | 0 | 336.47 < 340.18 | +0 | 89.82 < 91.08 |
| agent_msi | -0 | 1012.48 < 1015.38 | +0 | 149.12 < 150.78 |
| agent_rpm_amd64 | 0 | 676.03 < 702.14 | +0.01 | 167.26 < 180.53 |
| agent_rpm_amd64_fips | 0 | 670.54 < 696.64 | -0.01 | 165.85 < 178.79 |
| agent_rpm_arm64 | 0 | 665.47 < 686.31 | -0.01 | 153.43 < 161.22 |
| agent_rpm_arm64_fips | 0 | 660.99 < 681.84 | +0.03 | 153.1 < 160.18 |
| agent_suse_amd64 | 0 | 676.03 < 702.14 | +0.01 | 167.26 < 180.53 |
| agent_suse_amd64_fips | 0 | 670.54 < 696.64 | -0.01 | 165.85 < 178.79 |
| agent_suse_arm64 | 0 | 665.47 < 686.31 | -0.01 | 153.43 < 161.22 |
| agent_suse_arm64_fips | 0 | 660.99 < 681.84 | +0.03 | 153.1 < 160.18 |
| docker_agent_amd64 | -0 | 747.47 < 773.59 | +0 | 252.31 < 266.06 |
| docker_agent_arm64 | -0 | 760.86 < 781.7 | -0 | 242.91 < 251.85 |
| docker_agent_jmx_amd64 | -0 | 938.33 < 964.45 | -0 | 320.93 < 334.68 |
| docker_agent_jmx_arm64 | +0 | 940.33 < 961.17 | -0 | 307.52 < 316.44 |
| docker_cluster_agent_amd64 | +0 | 212.95 < 213.74 | +0 | 72.25 < 73.14 |
| docker_cluster_agent_arm64 | +0 | 228.86 < 229.68 | +0 | 68.52 < 69.41 |
| docker_cws_instrumentation_amd64 | 0 | 7.07 < 7.12 | -0 | 2.95 < 3.29 |
| docker_cws_instrumentation_arm64 | 0 | 6.69 < 6.92 | 0 | 2.7 < 3.07 |
| docker_dogstatsd_amd64 | +0 | 38.4 < 39.3 | +0 | 14.83 < 15.76 |
| docker_dogstatsd_arm64 | +0 | 37.06 < 37.94 | -0 | 14.27 < 14.83 |
| dogstatsd_deb_amd64 | 0 | 29.63 < 30.53 | +0 | 7.81 < 8.75 |
| dogstatsd_deb_arm64 | 0 | 28.21 < 29.11 | +0 | 6.77 < 7.71 |
| dogstatsd_rpm_amd64 | 0 | 29.63 < 30.53 | -0 | 7.82 < 8.76 |
| dogstatsd_suse_amd64 | 0 | 29.63 < 30.53 | -0 | 7.82 < 8.76 |
| iot_agent_deb_amd64 | 0 | 41.81 < 54.97 | +0 | 10.88 < 14.45 |
| iot_agent_deb_arm64 | 0 | 39.64 < 51.9 | -0 | 9.39 < 12.63 |
| iot_agent_deb_armhf | 0 | 39.51 < 51.84 | +0 | 9.48 < 12.74 |
| iot_agent_rpm_amd64 | 0 | 41.81 < 54.97 | -0 | 10.9 < 14.47 |
| iot_agent_suse_amd64 | 0 | 41.81 < 54.97 | -0 | 10.9 < 14.47 |

// in the expression language of the dynamic instrumentation and live debugger products.
//
// This package focuses solely on DSL parsing and validation, without any dependencies
// on IR program structure. The parsed expressions can then be validated against specific
Contributor:

uber-nit: you use the word "validate" twice. I think that the first use here is okay, but sort of redundant (it really is just parsing).

The second sentence is somewhat less enlightening. I also don't think that there needs to be justification of what depends on what. This parsing is something of a leaf concept -- I don't think it's confusing to future folks that it is its own package.

Contributor:

I think of this post: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

This really is parsing


// ParseError represents an error that occurred during DSL parsing.
type ParseError struct {
	Message string
Contributor:

why export either of these fields? Also, is the structured error here motivated by something?

Member Author:

Not really, I'm just going to get rid of it.

}

// decoderPool is a sync.Pool for jsontext.Decoder instances to optimize allocations.
var decoderPool = sync.Pool{
Contributor:

I'm all for this pool, but I feel like it begs for a benchmark. On some level, if you're going to pool the decoders but not the bytes.Reader, is it worth it? A benchmark will be instructive.

Member Author:

I'm pushing benchmarks for with/without the pool; it saves ~62% of allocated bytes per op (and roughly halves the allocation count).

pkg: github.com/DataDog/datadog-agent/pkg/dyninst/exprlang
BenchmarkParse
BenchmarkParse/simple_ref/WithPool
BenchmarkParse/simple_ref/WithPool-4         	 6630799	       175.8 ns/op	     259 B/op	       5 allocs/op
BenchmarkParse/simple_ref/WithoutPool
BenchmarkParse/simple_ref/WithoutPool-4      	 3770197	       328.1 ns/op	     680 B/op	      11 allocs/op
BenchmarkParse/long_ref/WithPool
BenchmarkParse/long_ref/WithPool-4           	 4677120	       265.7 ns/op	     323 B/op	       6 allocs/op
BenchmarkParse/long_ref/WithoutPool
BenchmarkParse/long_ref/WithoutPool-4        	 2973976	       457.0 ns/op	     744 B/op	      12 allocs/op
BenchmarkParse/unsupported_string/WithPool
BenchmarkParse/unsupported_string/WithPool-4 	 5854837	       204.6 ns/op	     307 B/op	       6 allocs/op
BenchmarkParse/unsupported_string/WithoutPool
BenchmarkParse/unsupported_string/WithoutPool-4         	 3486308	       353.0 ns/op	     728 B/op	      12 allocs/op
BenchmarkParse/unsupported_number/WithPool
BenchmarkParse/unsupported_number/WithPool-4            	 6117717	       196.7 ns/op	     304 B/op	       6 allocs/op
BenchmarkParse/unsupported_number/WithoutPool
BenchmarkParse/unsupported_number/WithoutPool-4         	 3653155	       384.1 ns/op	     720 B/op	      12 allocs/op
BenchmarkParseParallel
BenchmarkParseParallel/simple_ref/WithPool
BenchmarkParseParallel/simple_ref/WithPool-4            	10879426	       108.5 ns/op	     259 B/op	       5 allocs/op
BenchmarkParseParallel/simple_ref/WithoutPool
BenchmarkParseParallel/simple_ref/WithoutPool-4         	 5154585	       212.1 ns/op	     680 B/op	      11 allocs/op
BenchmarkParseParallel/unsupported_string/WithPool
BenchmarkParseParallel/unsupported_string/WithPool-4    	 9989060	       108.0 ns/op	     307 B/op	       6 allocs/op
BenchmarkParseParallel/unsupported_string/WithoutPool
BenchmarkParseParallel/unsupported_string/WithoutPool-4 	 5393654	       222.1 ns/op	     728 B/op	      12 allocs/op


// Get a decoder from the pool and reset it
dec := decoderPool.Get().(*jsontext.Decoder)
defer decoderPool.Put(dec)
Contributor:

you should reset the decoder before putting it back, otherwise it'll hold the reader internally in the pool.
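The retention pitfall is independent of jsontext: any pooled object that wraps an io.Reader pins that reader until the pool entry is reused or collected. A minimal stdlib sketch of the reset-before-Put pattern, using bufio.Reader (which, like jsontext.Decoder, has a Reset method); firstLine is a hypothetical helper for illustration, not code from this PR:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"sync"
)

// readerPool pools bufio.Reader instances across calls.
var readerPool = sync.Pool{
	New: func() any { return bufio.NewReader(nil) },
}

// firstLine reads the first newline-terminated line of s using a pooled reader.
func firstLine(s string) (string, error) {
	br := readerPool.Get().(*bufio.Reader)
	br.Reset(strings.NewReader(s))
	defer func() {
		// Reset with a nil reader BEFORE Put, so the pooled object does
		// not keep the caller's reader (and its data) alive in the pool.
		br.Reset(nil)
		readerPool.Put(br)
	}()
	line, err := br.ReadString('\n')
	return strings.TrimSuffix(line, "\n"), err
}

func main() {
	line, _ := firstLine("hello\nworld\n")
	fmt.Println(line)
}
```

The same shape applies to the decoder here: call dec.Reset with an empty reader before decoderPool.Put(dec), rather than in the Get path only.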

return nil, &ParseError{Message: fmt.Sprintf("malformed DSL: got token %v (%v), expected {", objStart, kind)}
}

// Read the instruction key
Contributor:

nit: is Instruction a term we want to commit to? One of Op / Operator / Operation are a tad more evocative for me.

Contributor:

There's a chance as we add more that we're going to want to have it more like:

type BinaryExpr /* or maybe DyadicExpr */ struct {
	Op    Op // some enum of the ops
	Left  Expr
	Right Expr
}

type UnaryExpr /* or maybe MonadicExpr */ struct {
	// ...
}

type RefExpr struct { /* ... */ }

type UnsupportedExpr struct { /* ... */ }

type LiteralExpr struct { /* I don't know yet what goes here */ }

We don't need to think about this now. I just want to plant the seeds.

Member Author:

There's so much overloaded terminology in this project/product :-). I like Operation and have changed it accordingly.

"github.com/stretchr/testify/require"
)

func TestParse(t *testing.T) {
Contributor:

At this layer it'd be cool to have a bunch of examples from remote config or something of our actual expression language for a bunch of features. Alternatively / additionally, grab all the examples we use in system tests. It'll be instructive. To be explicit: I'm not trying to change the scope of what you support parsing, we should just test and understand what the parsing behavior looks like in those cases in terms of errors and the shape of UnsupportedExpr.

Member:

The system-tests don't have a lot of real-world examples, dd-trace-py has a decent amount of cases, but it would probably be best to curate a large number of examples and expectations that all the tracers could run (on top of their language-specific gotchas).

Member Author:

I've added more examples using the python tests.

Comment on lines 185 to 190
// For unsupported expressions, try to extract variable names if possible
if e.Instruction == "ref" {
if refStr, ok := e.Argument.(string); ok && refStr != "" {
return []string{refStr}
}
}
Contributor:

why? how does this even happen?

}
}

// CollectVariableReferences extracts all variable names referenced in an expression.
Contributor:

I find this function surprising. Let's not include it in this PR

Member Author (grantseltzer, Oct 10, 2025):

More relevant to follow ups, yea. Just for clarity though, do you find it surprising that it's included as part of this PR or feel that the functionality of it shouldn't go in this package at all?

Contributor:

Just for clarity though, do you find it surprising that it's included as part of this PR or feel that the functionality of it shouldn't go in this package at all?

I think I find it surprising in this package at all. The idea of VariableReferences isn't perfectly captured by the "ref" expressions, because full dotted paths will show up here. I'd expect that iterating over all the *RefExpr nodes would be the primitive you'd want to export from this package.
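The "iterate over all the *RefExpr nodes" primitive can be sketched as a small tree walk. All the type names below are hypothetical, loosely following the shapes proposed earlier in this thread, not the PR's actual AST:

```go
package main

import "fmt"

// Hypothetical AST shapes, loosely following the review discussion.
type Expr interface{ isExpr() }

type RefExpr struct{ Ref string }

type BinaryExpr struct {
	Op          string
	Left, Right Expr
}

func (RefExpr) isExpr()    {}
func (BinaryExpr) isExpr() {}

// walkRefs visits every RefExpr in the tree: the primitive a caller can use
// to collect variable references without this package hard-coding the policy.
func walkRefs(e Expr, visit func(*RefExpr)) {
	switch e := e.(type) {
	case *RefExpr:
		visit(e)
	case *BinaryExpr:
		walkRefs(e.Left, visit)
		walkRefs(e.Right, visit)
	}
}

func main() {
	tree := &BinaryExpr{
		Op:    "eq",
		Left:  &RefExpr{Ref: "hits"},
		Right: &RefExpr{Ref: "payload"},
	}
	var refs []string
	walkRefs(tree, func(r *RefExpr) { refs = append(refs, r.Ref) })
	fmt.Println(refs)
}
```

Collecting names then becomes a two-line caller-side helper rather than an exported CollectVariableReferences.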

// UnsupportedExpr represents an expression type that is not yet supported.
type UnsupportedExpr struct {
	Instruction string
	Argument any
Contributor:

This any doesn't seem quite right to me. Why are we even trying? It's highly likely to itself be an expression. The code as you wrote it is going to then make this into the string { or [.

For this pass, how about we just use jsontext.Value for argument and move along with our lives. I don't see the value in attempting to parse further from there.

In some future, I could also see us trying to recursively parse. I feel like the issue is that the DSL AST can have nodes that hold literal values, or they can hold nested expressions:

type UnsupportedExpr struct {
    Instruction string
    Argument Expr
}
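The "just use jsontext.Value for the argument" suggestion has a direct stdlib analogue, json.RawMessage, which likewise keeps the argument as raw JSON bytes instead of a lossy any. The type and helper below are hypothetical illustrations of that shape, not the PR's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unsupportedExpr keeps the operation's argument as raw JSON, deferring any
// further interpretation to later passes (or a future recursive parser).
type unsupportedExpr struct {
	Operation string
	Argument  json.RawMessage // raw bytes of the value, escapes preserved
}

// parseUnsupported parses a single-key {"op": value} object.
func parseUnsupported(input []byte) (unsupportedExpr, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(input, &obj); err != nil {
		return unsupportedExpr{}, err
	}
	for op, arg := range obj {
		return unsupportedExpr{Operation: op, Argument: arg}, nil
	}
	return unsupportedExpr{}, fmt.Errorf("malformed DSL: empty object")
}

func main() {
	e, _ := parseUnsupported([]byte(`{"len": {"ref": "payload"}}`))
	fmt.Printf("%s %s\n", e.Operation, e.Argument)
}
```

Because the argument is raw JSON, a nested expression like `{"ref": "payload"}` survives intact instead of being flattened into the string `{`.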

@github-actions github-actions bot added long review PR is complex, plan time to review it and removed medium review PR review might take time labels Oct 10, 2025
@grantseltzer grantseltzer force-pushed the grantseltzer/template-ast-validation branch 7 times, most recently from 3cd7aae to aff7f6f on October 14, 2025 19:28
Benchmark comparison:

goos: linux
goarch: arm64
pkg: github.com/DataDog/datadog-agent/pkg/dyninst/exprlang
                                   │   old.txt    │               new.txt                │
                                   │    sec/op    │    sec/op     vs base                │
Parse/simple_ref-4                    331.8n ± 6%   164.4n ±  8%  -50.44% (p=0.000 n=10)
Parse/long_ref-4                      419.2n ± 3%   243.5n ± 13%  -41.91% (p=0.000 n=10)
Parse/unsupported_string-4            347.5n ± 1%   193.7n ±  2%  -44.26% (p=0.000 n=10)
Parse/unsupported_number-4            335.3n ± 3%   181.8n ±  5%  -45.76% (p=0.000 n=10)
ParseParallel/simple_ref-4           210.15n ± 3%   83.94n ±  9%  -60.05% (p=0.000 n=10)
ParseParallel/unsupported_string-4   218.95n ± 6%   96.34n ± 38%  -56.00% (p=0.000 n=10)
geomean                               301.1n        150.0n        -50.18%

                                   │  old.txt   │              new.txt               │
                                   │    B/op    │    B/op     vs base                │
Parse/simple_ref-4                   680.0 ± 0%   211.0 ± 0%  -68.97% (p=0.000 n=10)
Parse/long_ref-4                     744.0 ± 0%   275.0 ± 0%  -63.04% (p=0.000 n=10)
Parse/unsupported_string-4           728.0 ± 0%   259.0 ± 0%  -64.42% (p=0.000 n=10)
Parse/unsupported_number-4           720.0 ± 0%   256.0 ± 0%  -64.44% (p=0.000 n=10)
ParseParallel/simple_ref-4           680.0 ± 0%   211.0 ± 0%  -68.97% (p=0.000 n=10)
ParseParallel/unsupported_string-4   728.0 ± 0%   259.0 ± 0%  -64.42% (p=0.000 n=10)
geomean                              712.9        243.9       -65.79%

                                   │   old.txt   │              new.txt               │
                                   │  allocs/op  │ allocs/op   vs base                │
Parse/simple_ref-4                   11.000 ± 0%   4.000 ± 0%  -63.64% (p=0.000 n=10)
Parse/long_ref-4                     12.000 ± 0%   5.000 ± 0%  -58.33% (p=0.000 n=10)
Parse/unsupported_string-4           12.000 ± 0%   5.000 ± 0%  -58.33% (p=0.000 n=10)
Parse/unsupported_number-4           12.000 ± 0%   5.000 ± 0%  -58.33% (p=0.000 n=10)
ParseParallel/simple_ref-4           11.000 ± 0%   4.000 ± 0%  -63.64% (p=0.000 n=10)
ParseParallel/unsupported_string-4   12.000 ± 0%   5.000 ± 0%  -58.33% (p=0.000 n=10)
geomean                               11.66        4.642       -60.18%


Summary:

  • ~2x faster execution on average
  • ~3x less memory usage
  • ~2.5x fewer allocations
@grantseltzer grantseltzer force-pushed the grantseltzer/template-ast-validation branch from aff7f6f to e70747f on October 14, 2025 19:28
Comment on lines 74 to 88
// Extract the argument value before reading more tokens
var argument jsontext.Value
switch arg.Kind() {
case '"':
	argument = jsontext.Value(arg.String())
case 'n':
	argument = jsontext.Value(nil)
case 't', 'f':
	argument = jsontext.Value(arg.String())
case '0':
	// For numbers in unsupported expressions, store as string
	argument = jsontext.Value(arg.String())
default:
	argument = jsontext.Value(arg.String())
}
Contributor:

I don't understand this switch. What's the point?

Member Author:

There is none; I think it's left over from the earlier change with structured errors, which wasn't needed either.

}
}

func parseRefExpr(dec *jsontext.Decoder) (*RefExpr, error) {
Contributor:

I feel like it's a little bit surprising that this thing expects a decoder in a state where it's expecting an object value and then an }. It's really very coupled to the logic inside Parse. It feels to me like you should instead fold this back into Parse underneath the relevant case of the switch, and then share the logic to expect a closing }.

Member Author:

That makes sense, I'll inline this logic.

name: "ref with non-string value",
input: `{"ref": 123}`,
wantError: true,
},
Contributor:

still waiting on a more complete catalog of tests

dec := jsontext.NewDecoder(bytes.NewReader(dslJSON))
pooled := getPooledDecoder(dslJSON)
defer pooled.put()
decoder := pooled.decoder
Contributor:

nit: no need to rename this from dec to decoder in this commit.

return &RefExpr{Ref: refValue}, nil
default:
// Read the argument for unsupported operations
arg, err := dec.ReadToken()
Contributor:

this is not correct, you don't want to read a token, you want to read a value: https://pkg.go.dev/github.com/go-json-experiment/json/jsontext#Decoder.ReadValue.

This pattern of calling String() on a token and then casting it to jsontext.Value is roughly never going to be correct (note that for string values this gives you the unescaped string).

The more complex example tests I asked for would have clarified this.

Contributor:

Another thing that I think will be cool: we don't actually need to do a slices.Clone of the returned value, because we can use InputOffset before and after ReadValue and then subslice the input to avoid copying out any data.

Member Author:

Can you clarify what you mean by getting rid of slices.Clone? Where is that used now?

Contributor:

sorry, I got ahead of myself. If you just use ReadValue it's only valid until the next call on the decoder, so to store it you'd need to copy out the bytes.

Comment on lines 132 to 139
// Read the closing brace
endObj, err := dec.ReadToken()
if err != nil {
	return nil, fmt.Errorf("parse error: failed to read closing brace: %w", err)
}
if kind := endObj.Kind(); kind != '}' {
	return nil, fmt.Errorf("parse error: malformed DSL: got token %v (%v), expected }", endObj, kind)
}
Contributor:

this logic can be shared between the two cases. Maybe extract a closure?
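One possible shape for that shared closure, sketched with the standard library's json.Decoder instead of jsontext (parseRef and expectDelim are hypothetical names, not the PR's code):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseRef parses {"ref": <string>}, using one expectDelim closure for the
// repeated read-a-token-and-check-its-kind logic in both brace positions.
func parseRef(input string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	expectDelim := func(d json.Delim) error {
		tok, err := dec.Token()
		if err != nil {
			return fmt.Errorf("parse error: %w", err)
		}
		if got, ok := tok.(json.Delim); !ok || got != d {
			return fmt.Errorf("parse error: got token %v, expected %v", tok, d)
		}
		return nil
	}
	if err := expectDelim('{'); err != nil {
		return "", err
	}
	if key, _ := dec.Token(); key != "ref" {
		return "", fmt.Errorf("parse error: expected ref key, got %v", key)
	}
	val, err := dec.Token()
	if err != nil {
		return "", err
	}
	ref, ok := val.(string)
	if !ok {
		return "", fmt.Errorf("parse error: ref argument must be a string")
	}
	return ref, expectDelim('}')
}

func main() {
	ref, err := parseRef(`{"ref": "payload"}`)
	fmt.Println(ref, err)
}
```

Since the closure captures dec, each parse case can call it without threading the decoder and error formatting through a separate helper.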

Comment on lines 237 to 273
// Nested/complex structures (will fail to parse with current simple parser)
// The current parser only handles single-level {"operation": value} structures
{name: "len of ref", input: `{"len": {"ref": "payload"}}`, wantError: true},
{name: "len of getmember", input: `{"len": {"getmember": [{"ref": "self"}, "collectionField"]}}`, wantError: true},
{name: "getmember", input: `{"getmember": [{"ref": "self"}, "name"]}`, wantError: true},
{name: "nested getmember", input: `{"getmember": [{"getmember": [{"ref": "self"}, "field1"]}, "name"]}`, wantError: true},
{name: "index array", input: `{"index": [{"ref": "arr"}, 1]}`, wantError: true},
{name: "index dict", input: `{"index": [{"ref": "dict"}, "world"]}`, wantError: true},
{name: "contains", input: `{"contains": [{"ref": "payload"}, "hello"]}`, wantError: true},
{name: "eq with bool", input: `{"eq": [{"ref": "hits"}, true]}`, wantError: true},
{name: "eq with null", input: `{"eq": [{"ref": "hits"}, null]}`, wantError: true},
{name: "substring", input: `{"substring": [{"ref": "payload"}, 4, 7]}`, wantError: true},
{name: "any with isEmpty", input: `{"any": [{"ref": "collection"}, {"isEmpty": {"ref": "@it"}}]}`, wantError: true},
{name: "any with @value", input: `{"any": [{"ref": "coll"}, {"isEmpty": {"ref": "@value"}}]}`, wantError: true},
{name: "any with @key", input: `{"any": [{"ref": "coll"}, {"isEmpty": {"ref": "@key"}}]}`, wantError: true},
{name: "startsWith", input: `{"startsWith": [{"ref": "local_string"}, "hello"]}`, wantError: true},
{name: "filter", input: `{"filter": [{"ref": "collection"}, {"not": {"isEmpty": {"ref": "@it"}}}]}`, wantError: true},
{name: "matches", input: `{"matches": [{"ref": "payload"}, "[0-9]+"]}`, wantError: true},
{name: "or", input: `{"or": [{"ref": "bar"}, {"ref": "foo"}]}`, wantError: true},
{name: "and", input: `{"and": [{"ref": "bar"}, {"ref": "foo"}]}`, wantError: true},
{name: "instanceof", input: `{"instanceof": [{"ref": "bar"}, "int"]}`, wantError: true},
{name: "isEmpty", input: `{"isEmpty": {"ref": "empty_str"}}`, wantError: true},
{name: "ne", input: `{"ne": [1, 2]}`, wantError: true},
{name: "gt", input: `{"gt": [2, 1]}`, wantError: true},
{name: "ge", input: `{"ge": [2, 1]}`, wantError: true},
{name: "lt", input: `{"lt": [1, 2]}`, wantError: true},
{name: "le", input: `{"le": [1, 2]}`, wantError: true},
{name: "all", input: `{"all": [{"ref": "collection"}, {"not": {"isEmpty": {"ref": "@it"}}}]}`, wantError: true},
{name: "endsWith", input: `{"endsWith": [{"ref": "local_string"}, "world!"]}`, wantError: true},
{name: "len of filter", input: `{"len": {"filter": [{"ref": "collection"}, {"gt": [{"ref": "@it"}, 1]}]}}`, wantError: true},
{name: "deeply nested getmember", input: `{"getmember": [{"getmember": [{"getmember": [{"ref": "self"}, "field1"]}, "field2"]}, "name"]}`, wantError: true},
{name: "any with nested ops", input: `{"any": [{"getmember": [{"ref": "self"}, "collectionField"]}, {"startsWith": [{"getmember": [{"ref": "@it"}, "name"]}, "foo"]}]}`, wantError: true},
{name: "and with eq and gt", input: `{"and": [{"eq": [{"ref": "hits"}, 42]}, {"gt": [{"len": {"ref": "payload"}}, 5]}]}`, wantError: true},
{name: "index of filter", input: `{"index": [{"filter": [{"ref": "collection"}, {"gt": [{"ref": "@it"}, 2]}]}, 0]}`, wantError: true},
{name: "count", input: `{"count": {"ref": "payload"}}`, wantError: true},
{name: "substring negative", input: `{"substring": [{"ref": "s"}, -5, -1]}`, wantError: true},
{name: "nested filter with any", input: `{"len": {"filter": [{"ref": "collection"}, {"any": [{"ref": "@it"}, {"eq": [{"ref": "@it"}, 1]}]}]}}`, wantError: true},
Contributor:

it's trivial to make this not return an error. I really don't think these should be returning an error. Also, if we do return an error, let's assert something about the structure of the error.

@grantseltzer grantseltzer force-pushed the grantseltzer/template-ast-validation branch 2 times, most recently from 3ed52bd to f8e576a on October 15, 2025 14:37
@grantseltzer grantseltzer force-pushed the grantseltzer/template-ast-validation branch from f8e576a to 54cca2b on October 15, 2025 14:47

Labels

changelog/no-changelog long review PR is complex, plan time to review it qa/done QA done before merge and regressions are covered by tests team/dynamic-instrumentation Dynamic Instrumentation

3 participants