refactor(profiling): store memalloc samples as native objects #15372
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 250 ± 4 ms.
The average import time from base is: 254 ± 4 ms.
The import time difference between this PR and base is: -4.1 ± 0.2 ms.
Import time breakdown
The following import paths have shrunk:
Performance SLOsComparing candidate dsn/traceback-sample (e6fdc3a) with baseline main (bff4abd) 📈 Performance Regressions (3 suites)📈 iastaspects - 118/118✅ add_aspectTime: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +0.4% Memory: ✅ 38.433MB (SLO: <41.500MB -7.4%) vs baseline: +4.9% ✅ add_inplace_aspectTime: ✅ 0.384µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -0.4% Memory: ✅ 38.559MB (SLO: <41.500MB -7.1%) vs baseline: +5.2% ✅ add_inplace_noaspectTime: ✅ 0.285µs (SLO: <10.000µs 📉 -97.1%) vs baseline: -1.3% Memory: ✅ 38.343MB (SLO: <41.500MB -7.6%) vs baseline: +4.4% ✅ add_noaspectTime: ✅ 0.357µs (SLO: <10.000µs 📉 -96.4%) vs baseline: -0.3% Memory: ✅ 38.515MB (SLO: <41.500MB -7.2%) vs baseline: +5.1% ✅ bytearray_aspectTime: ✅ 1.273µs (SLO: <10.000µs 📉 -87.3%) vs baseline: -2.8% Memory: ✅ 38.496MB (SLO: <41.500MB -7.2%) vs baseline: +4.7% ✅ bytearray_extend_aspectTime: ✅ 1.565µs (SLO: <10.000µs 📉 -84.4%) vs baseline: +6.4% Memory: ✅ 38.523MB (SLO: <41.500MB -7.2%) vs baseline: +4.6% ✅ bytearray_extend_noaspectTime: ✅ 0.615µs (SLO: <10.000µs 📉 -93.9%) vs baseline: -0.9% Memory: ✅ 38.529MB (SLO: <41.500MB -7.2%) vs baseline: +5.3% ✅ bytearray_noaspectTime: ✅ 0.489µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +1.0% Memory: ✅ 38.628MB (SLO: <41.500MB -6.9%) vs baseline: +5.7% ✅ bytes_aspectTime: ✅ 1.263µs (SLO: <10.000µs 📉 -87.4%) vs baseline: -0.9% Memory: ✅ 38.494MB (SLO: <41.500MB -7.2%) vs baseline: +5.0% ✅ bytes_noaspectTime: ✅ 0.496µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.4% Memory: ✅ 38.566MB (SLO: <41.500MB -7.1%) vs baseline: +4.8% ✅ bytesio_aspectTime: ✅ 1.319µs (SLO: <10.000µs 📉 -86.8%) vs baseline: +0.6% Memory: ✅ 38.629MB (SLO: <41.500MB -6.9%) vs baseline: +5.3% ✅ bytesio_noaspectTime: ✅ 0.503µs (SLO: <10.000µs 📉 -95.0%) vs baseline: ~same Memory: ✅ 38.401MB (SLO: <41.500MB -7.5%) vs baseline: +4.6% ✅ capitalize_aspectTime: ✅ 0.738µs (SLO: <10.000µs 📉 -92.6%) vs baseline: -0.2% Memory: ✅ 38.433MB (SLO: <41.500MB -7.4%) vs baseline: +4.7% ✅ 
capitalize_noaspectTime: ✅ 0.439µs (SLO: <10.000µs 📉 -95.6%) vs baseline: -0.2% Memory: ✅ 38.542MB (SLO: <41.500MB -7.1%) vs baseline: +5.2% ✅ casefold_aspectTime: ✅ 0.736µs (SLO: <10.000µs 📉 -92.6%) vs baseline: -0.4% Memory: ✅ 38.433MB (SLO: <41.500MB -7.4%) vs baseline: +4.2% ✅ casefold_noaspectTime: ✅ 0.371µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.6% Memory: ✅ 38.393MB (SLO: <41.500MB -7.5%) vs baseline: +4.8% ✅ decode_aspectTime: ✅ 0.728µs (SLO: <10.000µs 📉 -92.7%) vs baseline: +0.2% Memory: ✅ 38.481MB (SLO: <41.500MB -7.3%) vs baseline: +5.0% ✅ decode_noaspectTime: ✅ 0.424µs (SLO: <10.000µs 📉 -95.8%) vs baseline: -0.5% Memory: ✅ 38.506MB (SLO: <41.500MB -7.2%) vs baseline: +4.7% ✅ encode_aspectTime: ✅ 0.716µs (SLO: <10.000µs 📉 -92.8%) vs baseline: +1.0% Memory: ✅ 38.728MB (SLO: <41.500MB -6.7%) vs baseline: +4.9% ✅ encode_noaspectTime: ✅ 0.404µs (SLO: <10.000µs 📉 -96.0%) vs baseline: ~same Memory: ✅ 38.661MB (SLO: <41.500MB -6.8%) vs baseline: +4.9% ✅ format_aspectTime: ✅ 3.418µs (SLO: <10.000µs 📉 -65.8%) vs baseline: +1.5% Memory: ✅ 38.535MB (SLO: <41.500MB -7.1%) vs baseline: +4.7% ✅ format_map_aspectTime: ✅ 3.677µs (SLO: <10.000µs 📉 -63.2%) vs baseline: +1.2% Memory: ✅ 38.501MB (SLO: <41.500MB -7.2%) vs baseline: +4.6% ✅ format_map_noaspectTime: ✅ 0.820µs (SLO: <10.000µs 📉 -91.8%) vs baseline: ~same Memory: ✅ 38.604MB (SLO: <41.500MB -7.0%) vs baseline: +5.0% ✅ format_noaspectTime: ✅ 0.599µs (SLO: <10.000µs 📉 -94.0%) vs baseline: +1.0% Memory: ✅ 38.453MB (SLO: <41.500MB -7.3%) vs baseline: +4.6% ✅ index_aspectTime: ✅ 0.345µs (SLO: <10.000µs 📉 -96.5%) vs baseline: +0.3% Memory: ✅ 38.534MB (SLO: <41.500MB -7.1%) vs baseline: +5.5% ✅ index_noaspectTime: ✅ 0.312µs (SLO: <10.000µs 📉 -96.9%) vs baseline: -2.2% Memory: ✅ 38.374MB (SLO: <41.500MB -7.5%) vs baseline: +4.2% ✅ join_aspectTime: ✅ 1.278µs (SLO: <10.000µs 📉 -87.2%) vs baseline: -2.4% Memory: ✅ 38.507MB (SLO: <41.500MB -7.2%) vs baseline: +5.2% ✅ join_noaspectTime: ✅ 0.534µs (SLO: <10.000µs 📉 -94.7%) 
vs baseline: +0.2% Memory: ✅ 38.497MB (SLO: <41.500MB -7.2%) vs baseline: +4.8% ✅ ljust_aspectTime: ✅ 2.584µs (SLO: <20.000µs 📉 -87.1%) vs baseline: +1.8% Memory: ✅ 38.742MB (SLO: <41.500MB -6.6%) vs baseline: +5.7% ✅ ljust_noaspectTime: ✅ 0.409µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.5% Memory: ✅ 38.451MB (SLO: <41.500MB -7.3%) vs baseline: +4.3% ✅ lower_aspectTime: ✅ 2.237µs (SLO: <10.000µs 📉 -77.6%) vs baseline: +0.5% Memory: ✅ 38.584MB (SLO: <41.500MB -7.0%) vs baseline: +4.7% ✅ lower_noaspectTime: ✅ 0.371µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.4% Memory: ✅ 38.557MB (SLO: <41.500MB -7.1%) vs baseline: +4.9% ✅ lstrip_aspectTime: ✅ 2.197µs (SLO: <20.000µs 📉 -89.0%) vs baseline: +1.2% Memory: ✅ 38.522MB (SLO: <41.500MB -7.2%) vs baseline: +4.6% ✅ lstrip_noaspectTime: ✅ 0.383µs (SLO: <10.000µs 📉 -96.2%) vs baseline: -1.2% Memory: ✅ 38.475MB (SLO: <41.500MB -7.3%) vs baseline: +4.6% ✅ modulo_aspectTime: ✅ 0.973µs (SLO: <10.000µs 📉 -90.3%) vs baseline: ~same Memory: ✅ 38.625MB (SLO: <41.500MB -6.9%) vs baseline: +4.3% ✅ modulo_aspect_for_bytearray_bytearrayTime: ✅ 1.485µs (SLO: <10.000µs 📉 -85.1%) vs baseline: +0.6% Memory: ✅ 38.495MB (SLO: <41.500MB -7.2%) vs baseline: +4.9% ✅ modulo_aspect_for_bytesTime: ✅ 0.955µs (SLO: <10.000µs 📉 -90.4%) vs baseline: +0.8% Memory: ✅ 38.421MB (SLO: <41.500MB -7.4%) vs baseline: +4.8% ✅ modulo_aspect_for_bytes_bytearrayTime: ✅ 1.169µs (SLO: <10.000µs 📉 -88.3%) vs baseline: -0.1% Memory: ✅ 38.500MB (SLO: <41.500MB -7.2%) vs baseline: +5.0% ✅ modulo_noaspectTime: ✅ 0.669µs (SLO: <10.000µs 📉 -93.3%) vs baseline: ~same Memory: ✅ 38.791MB (SLO: <41.500MB -6.5%) vs baseline: +5.6% ✅ replace_aspectTime: ✅ 5.019µs (SLO: <10.000µs 📉 -49.8%) vs baseline: +1.8% Memory: ✅ 38.536MB (SLO: <41.500MB -7.1%) vs baseline: +5.1% ✅ replace_noaspectTime: ✅ 0.464µs (SLO: <10.000µs 📉 -95.4%) vs baseline: +0.2% Memory: ✅ 38.504MB (SLO: <41.500MB -7.2%) vs baseline: +4.4% ✅ repr_aspectTime: ✅ 0.953µs (SLO: <10.000µs 📉 -90.5%) vs baseline: -0.1% 
Memory: ✅ 38.565MB (SLO: <41.500MB -7.1%) vs baseline: +5.7% ✅ repr_noaspectTime: ✅ 0.455µs (SLO: <10.000µs 📉 -95.4%) vs baseline: -0.6% Memory: ✅ 38.821MB (SLO: <41.500MB -6.5%) vs baseline: +5.8% ✅ rstrip_aspectTime: ✅ 1.847µs (SLO: <20.000µs 📉 -90.8%) vs baseline: +0.2% Memory: ✅ 38.543MB (SLO: <41.500MB -7.1%) vs baseline: +4.6% ✅ rstrip_noaspectTime: ✅ 0.386µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +1.0% Memory: ✅ 38.554MB (SLO: <41.500MB -7.1%) vs baseline: +5.2% ✅ slice_aspectTime: ✅ 0.489µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.7% Memory: ✅ 38.586MB (SLO: <41.500MB -7.0%) vs baseline: +5.4% ✅ slice_noaspectTime: ✅ 0.455µs (SLO: <10.000µs 📉 -95.4%) vs baseline: ~same Memory: ✅ 38.357MB (SLO: <41.500MB -7.6%) vs baseline: +4.2% ✅ stringio_aspectTime: ✅ 1.698µs (SLO: <10.000µs 📉 -83.0%) vs baseline: +1.2% Memory: ✅ 38.534MB (SLO: <41.500MB -7.1%) vs baseline: +4.7% ✅ stringio_noaspectTime: ✅ 0.921µs (SLO: <10.000µs 📉 -90.8%) vs baseline: +0.5% Memory: ✅ 38.426MB (SLO: <41.500MB -7.4%) vs baseline: +4.3% ✅ strip_aspectTime: ✅ 2.424µs (SLO: <20.000µs 📉 -87.9%) vs baseline: 📈 +12.8% Memory: ✅ 38.682MB (SLO: <41.500MB -6.8%) vs baseline: +5.2% ✅ strip_noaspectTime: ✅ 0.389µs (SLO: <10.000µs 📉 -96.1%) vs baseline: +0.4% Memory: ✅ 38.553MB (SLO: <41.500MB -7.1%) vs baseline: +4.8% ✅ swapcase_aspectTime: ✅ 2.429µs (SLO: <10.000µs 📉 -75.7%) vs baseline: -0.5% Memory: ✅ 38.398MB (SLO: <41.500MB -7.5%) vs baseline: +4.3% ✅ swapcase_noaspectTime: ✅ 0.541µs (SLO: <10.000µs 📉 -94.6%) vs baseline: -0.2% Memory: ✅ 38.424MB (SLO: <41.500MB -7.4%) vs baseline: +4.8% ✅ title_aspectTime: ✅ 2.356µs (SLO: <10.000µs 📉 -76.4%) vs baseline: -0.4% Memory: ✅ 38.632MB (SLO: <41.500MB -6.9%) vs baseline: +5.5% ✅ title_noaspectTime: ✅ 0.502µs (SLO: <10.000µs 📉 -95.0%) vs baseline: -0.9% Memory: ✅ 38.367MB (SLO: <41.500MB -7.5%) vs baseline: +4.7% ✅ translate_aspectTime: ✅ 3.260µs (SLO: <10.000µs 📉 -67.4%) vs baseline: ~same Memory: ✅ 38.575MB (SLO: <41.500MB -7.0%) vs baseline: +5.0% 
✅ translate_noaspectTime: ✅ 1.048µs (SLO: <10.000µs 📉 -89.5%) vs baseline: +0.5% Memory: ✅ 38.462MB (SLO: <41.500MB -7.3%) vs baseline: +5.3% ✅ upper_aspectTime: ✅ 2.224µs (SLO: <10.000µs 📉 -77.8%) vs baseline: -1.5% Memory: ✅ 38.629MB (SLO: <41.500MB -6.9%) vs baseline: +5.0% ✅ upper_noaspectTime: ✅ 0.371µs (SLO: <10.000µs 📉 -96.3%) vs baseline: -0.9% Memory: ✅ 38.320MB (SLO: <41.500MB -7.7%) vs baseline: +4.6% 📈 iastaspectsospath - 24/24✅ ospathbasename_aspectTime: ✅ 5.169µs (SLO: <10.000µs 📉 -48.3%) vs baseline: 📈 +22.6% Memory: ✅ 38.614MB (SLO: <41.000MB -5.8%) vs baseline: +4.9% ✅ ospathbasename_noaspectTime: ✅ 1.085µs (SLO: <10.000µs 📉 -89.1%) vs baseline: -0.4% Memory: ✅ 38.516MB (SLO: <41.000MB -6.1%) vs baseline: +4.8% ✅ ospathjoin_aspectTime: ✅ 6.141µs (SLO: <10.000µs 📉 -38.6%) vs baseline: +1.8% Memory: ✅ 38.574MB (SLO: <41.000MB -5.9%) vs baseline: +5.1% ✅ ospathjoin_noaspectTime: ✅ 2.282µs (SLO: <10.000µs 📉 -77.2%) vs baseline: -0.4% Memory: ✅ 38.614MB (SLO: <41.000MB -5.8%) vs baseline: +4.8% ✅ ospathnormcase_aspectTime: ✅ 3.425µs (SLO: <10.000µs 📉 -65.8%) vs baseline: -2.5% Memory: ✅ 38.594MB (SLO: <41.000MB -5.9%) vs baseline: +4.6% ✅ ospathnormcase_noaspectTime: ✅ 0.573µs (SLO: <10.000µs 📉 -94.3%) vs baseline: -0.4% Memory: ✅ 38.594MB (SLO: <41.000MB -5.9%) vs baseline: +5.1% ✅ ospathsplit_aspectTime: ✅ 4.855µs (SLO: <10.000µs 📉 -51.4%) vs baseline: -0.1% Memory: ✅ 38.574MB (SLO: <41.000MB -5.9%) vs baseline: +4.7% ✅ ospathsplit_noaspectTime: ✅ 1.589µs (SLO: <10.000µs 📉 -84.1%) vs baseline: -0.3% Memory: ✅ 38.633MB (SLO: <41.000MB -5.8%) vs baseline: +5.0% ✅ ospathsplitdrive_aspectTime: ✅ 3.680µs (SLO: <10.000µs 📉 -63.2%) vs baseline: -0.2% Memory: ✅ 38.457MB (SLO: <41.000MB -6.2%) vs baseline: +4.9% ✅ ospathsplitdrive_noaspectTime: ✅ 0.704µs (SLO: <10.000µs 📉 -93.0%) vs baseline: +0.9% Memory: ✅ 38.398MB (SLO: <41.000MB -6.3%) vs baseline: +4.5% ✅ ospathsplitext_aspectTime: ✅ 4.555µs (SLO: <10.000µs 📉 -54.5%) vs baseline: -0.8% Memory: ✅ 38.594MB 
(SLO: <41.000MB -5.9%) vs baseline: +4.8% ✅ ospathsplitext_noaspectTime: ✅ 1.381µs (SLO: <10.000µs 📉 -86.2%) vs baseline: ~same Memory: ✅ 38.594MB (SLO: <41.000MB -5.9%) vs baseline: +5.0% 📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.548µs (SLO: <20.000µs 📉 -82.3%) vs baseline: 📈 +15.7% Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.9% ✅ 1-count-metrics-100-timesTime: ✅ 208.331µs (SLO: <220.000µs -5.3%) vs baseline: -0.5% Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.6% ✅ 1-distribution-metric-1-timesTime: ✅ 3.434µs (SLO: <20.000µs 📉 -82.8%) vs baseline: +0.4% Memory: ✅ 34.996MB (SLO: <35.500MB 🟡 -1.4%) vs baseline: +5.2% ✅ 1-distribution-metrics-100-timesTime: ✅ 220.853µs (SLO: <230.000µs -4.0%) vs baseline: +0.2% Memory: ✅ 34.878MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +5.1% ✅ 1-gauge-metric-1-timesTime: ✅ 2.215µs (SLO: <20.000µs 📉 -88.9%) vs baseline: -0.8% Memory: ✅ 34.839MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +5.0% ✅ 1-gauge-metrics-100-timesTime: ✅ 138.303µs (SLO: <150.000µs -7.8%) vs baseline: +0.1% Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +4.7% ✅ 1-rate-metric-1-timesTime: ✅ 3.257µs (SLO: <20.000µs 📉 -83.7%) vs baseline: ~same Memory: ✅ 34.878MB (SLO: <35.500MB 🟡 -1.8%) vs baseline: +5.3% ✅ 1-rate-metrics-100-timesTime: ✅ 224.244µs (SLO: <250.000µs 📉 -10.3%) vs baseline: +0.7% Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +5.1% ✅ 100-count-metrics-100-timesTime: ✅ 20.886ms (SLO: <22.000ms -5.1%) vs baseline: -0.7% Memory: ✅ 34.898MB (SLO: <35.500MB 🟡 -1.7%) vs baseline: +5.2% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.287ms (SLO: <2.550ms 📉 -10.3%) vs baseline: -1.5% Memory: ✅ 34.819MB (SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.7% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.423ms (SLO: <1.550ms -8.2%) vs baseline: +0.2% Memory: ✅ 34.800MB (SLO: <35.500MB 🟡 -2.0%) vs baseline: +4.6% ✅ 100-rate-metrics-100-timesTime: ✅ 2.287ms (SLO: <2.550ms 📉 -10.3%) vs baseline: +1.3% Memory: ✅ 34.819MB 
(SLO: <35.500MB 🟡 -1.9%) vs baseline: +4.7% ✅ flush-1-metricTime: ✅ 4.646µs (SLO: <20.000µs 📉 -76.8%) vs baseline: +0.1% Memory: ✅ 35.114MB (SLO: <35.500MB 🟡 -1.1%) vs baseline: +4.9% ✅ flush-100-metricsTime: ✅ 173.582µs (SLO: <250.000µs 📉 -30.6%) vs baseline: ~same Memory: ✅ 35.095MB (SLO: <35.500MB 🟡 -1.1%) vs baseline: +4.6% ✅ flush-1000-metricsTime: ✅ 2.173ms (SLO: <2.500ms 📉 -13.1%) vs baseline: -0.7% Memory: ✅ 35.960MB (SLO: <36.500MB 🟡 -1.5%) vs baseline: +4.8% 🟡 Near SLO Breach (17 suites)🟡 coreapiscenario - 10/10 (1 unstable)
taegyunkim
left a comment
Do you mind fixing the failing tests first?
11ab17b to e454ae5
- Comment out test subdirectory reference in CMakeLists.txt (directory was removed during merge)
- Apply C++ formatting fixes
e454ae5 to 9708602
The destructor only needs to destroy the Sample member, which happens automatically. Using =default makes this explicit and cleaner.
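A minimal sketch of the point made in this commit message, using hypothetical names (`Sample` here is a stand-in with an instance counter, and `TracebackHolder` is an invented owner class, neither is taken from the PR's code). It shows that a defaulted destructor still destroys the `Sample` member automatically:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Counter of live Sample instances, so destruction can be observed.
inline int& live_samples() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for the profiler's Sample type.
struct Sample {
    std::string label;
    explicit Sample(std::string l) : label(std::move(l)) { ++live_samples(); }
    Sample(const Sample& other) : label(other.label) { ++live_samples(); }
    ~Sample() { --live_samples(); }
};

// Hypothetical owner class with a single Sample member.
class TracebackHolder {
public:
    explicit TracebackHolder(std::string label) : sample_(std::move(label)) {}

    // No hand-written body needed: the compiler-generated destructor
    // destroys sample_ automatically.
    ~TracebackHolder() = default;

    const Sample& sample() const { return sample_; }

private:
    Sample sample_;
};

// Creates a holder in an inner scope and reports how many Sample
// instances are still alive once that scope has ended.
int live_samples_after_scope() {
    {
        TracebackHolder holder("alloc");
        assert(live_samples() == 1);
    }
    return live_samples();
}
```

One trade-off worth knowing: a user-declared destructor, even a defaulted one, suppresses the implicitly generated move operations, so omitting the declaration entirely is also an option when moves matter.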
KowalskiThomas
left a comment
Overall this makes sense to me, as far as I can tell. I don't have much context on memory profiling, so I'd recommend waiting for another review 😅
Description
Previously, we kept tracebacks in Python objects, which meant that working with them required careful handling of the GIL.
Replacing them with native storage makes the code much cleaner.
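To illustrate the idea, here is a minimal sketch of what native traceback storage could look like. The names `NativeFrame`, `NativeTraceback`, and `make_example_traceback` are hypothetical and are not the types used in this PR. The assumption is that frame data is copied into plain C++ values once, at capture time, so that later reads, moves, and destruction touch no Python objects and therefore need no GIL:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical native frame record: plain C++ data, no PyObject* inside.
struct NativeFrame {
    std::string function;
    std::string filename;
    int line;
};

// Hypothetical native traceback: owns its frames by value, so copying,
// moving, and destroying it involves no Python reference counting.
class NativeTraceback {
public:
    void push(std::string function, std::string filename, int line) {
        frames_.push_back(NativeFrame{std::move(function), std::move(filename), line});
    }

    std::size_t depth() const { return frames_.size(); }

    // Bounds-checked access to the i-th frame (innermost first in this sketch).
    const NativeFrame& frame(std::size_t i) const { return frames_.at(i); }

private:
    std::vector<NativeFrame> frames_;
};

// Builds a small example traceback; in a real profiler the values would be
// copied out of interpreter frames while the GIL is held (assumption).
NativeTraceback make_example_traceback() {
    NativeTraceback tb;
    tb.push("allocate", "app.py", 42);
    tb.push("main", "app.py", 7);
    return tb;
}
```

With storage like this, the only step that needs the GIL is the initial copy out of the interpreter; everything afterwards is ordinary C++ value handling.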
Testing
Existing tests
Risks
This is a substantial change to how the profiler works, and should be looked at carefully.
Additional Notes