-
Notifications
You must be signed in to change notification settings - Fork 469
feat(openai): handle file and image prompt variables types #15359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
|
(If it's approved by tmrw, don't merge it yet; I want to test it in my sandbox, which I'll do tomorrow morning) |
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 226 ± 3 ms. The average import time from base is: 228 ± 4 ms. The import time difference between this PR and base is: -2.2 ± 0.2 ms. Import time breakdownThe following import paths have shrunk:
|
Performance SLOsComparing candidate alex/MLOB-4411_handle_image_and_file_variables (f2517fb) with baseline main (38155b3) 📈 Performance Regressions (1 suite)📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.257µs (SLO: <20.000µs 📉 -83.7%) vs baseline: 📈 +10.8% Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.8% ✅ 1-count-metrics-100-timesTime: ✅ 201.016µs (SLO: <220.000µs -8.6%) vs baseline: +1.0% Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +4.8% ✅ 1-distribution-metric-1-timesTime: ✅ 3.287µs (SLO: <20.000µs 📉 -83.6%) vs baseline: -1.0% Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.9% ✅ 1-distribution-metrics-100-timesTime: ✅ 215.548µs (SLO: <230.000µs -6.3%) vs baseline: -1.3% Memory: ✅ 34.623MB (SLO: <35.500MB -2.5%) vs baseline: +5.0% ✅ 1-gauge-metric-1-timesTime: ✅ 2.601µs (SLO: <20.000µs 📉 -87.0%) vs baseline: 📈 +19.3% Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +4.8% ✅ 1-gauge-metrics-100-timesTime: ✅ 136.107µs (SLO: <150.000µs -9.3%) vs baseline: ~same Memory: ✅ 34.642MB (SLO: <35.500MB -2.4%) vs baseline: +4.9% ✅ 1-rate-metric-1-timesTime: ✅ 3.150µs (SLO: <20.000µs 📉 -84.3%) vs baseline: +2.5% Memory: ✅ 34.564MB (SLO: <35.500MB -2.6%) vs baseline: +4.9% ✅ 1-rate-metrics-100-timesTime: ✅ 215.093µs (SLO: <250.000µs 📉 -14.0%) vs baseline: +0.6% Memory: ✅ 34.583MB (SLO: <35.500MB -2.6%) vs baseline: +4.9% ✅ 100-count-metrics-100-timesTime: ✅ 20.631ms (SLO: <22.000ms -6.2%) vs baseline: +3.3% Memory: ✅ 34.603MB (SLO: <35.500MB -2.5%) vs baseline: +4.9% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.229ms (SLO: <2.300ms -3.1%) vs baseline: -1.2% Memory: ✅ 34.662MB (SLO: <35.500MB -2.4%) vs baseline: +5.1% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.421ms (SLO: <1.550ms -8.3%) vs baseline: +0.9% Memory: ✅ 34.564MB (SLO: <35.500MB -2.6%) vs baseline: +4.8% ✅ 100-rate-metrics-100-timesTime: ✅ 2.232ms (SLO: <2.550ms 📉 -12.5%) vs baseline: +1.9% Memory: ✅ 34.524MB (SLO: <35.500MB -2.7%) vs baseline: +4.6% ✅ flush-1-metricTime: ✅ 4.534µs (SLO: <20.000µs 📉 -77.3%) vs baseline: -1.4% Memory: ✅ 34.544MB (SLO: <35.500MB -2.7%) vs baseline: +4.7% ✅ flush-100-metricsTime: ✅ 176.149µs (SLO: <250.000µs 📉 -29.5%) vs baseline: +1.2% Memory: ✅ 34.642MB (SLO: <35.500MB -2.4%) vs baseline: +4.9% ✅ flush-1000-metricsTime: ✅ 2.125ms (SLO: <2.500ms 📉 -15.0%) vs baseline: -0.2% Memory: ✅ 35.370MB (SLO: <36.500MB -3.1%) vs baseline: +4.8% 🟡 Near SLO Breach (2 suites)🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.598ms (SLO: <4.750ms -3.2%) vs baseline: ~same Memory: ✅ 63.995MB (SLO: <66.500MB -3.8%) vs baseline: +4.8% ✅ appsec-postTime: ✅ 6.618ms (SLO: <6.750ms 🟡 -1.9%) vs baseline: ~same Memory: ✅ 64.348MB (SLO: <66.500MB -3.2%) vs baseline: +4.7% ✅ appsec-telemetryTime: ✅ 4.587ms (SLO: <4.750ms -3.4%) vs baseline: ~same Memory: ✅ 64.149MB (SLO: <66.500MB -3.5%) vs baseline: +5.0% ✅ debuggerTime: ✅ 1.855ms (SLO: <2.000ms -7.2%) vs baseline: ~same Memory: ✅ 47.974MB (SLO: <49.500MB -3.1%) vs baseline: +5.3% ✅ iast-getTime: ✅ 1.856ms (SLO: <2.000ms -7.2%) vs baseline: +0.2% Memory: ✅ 44.726MB (SLO: <49.000MB -8.7%) vs baseline: +4.6% ✅ profilerTime: ✅ 1.927ms (SLO: <2.100ms -8.2%) vs baseline: -0.4% Memory: ✅ 48.382MB (SLO: <50.000MB -3.2%) vs baseline: +4.6% ✅ resource-renamingTime: ✅ 3.375ms (SLO: <3.650ms -7.5%) vs baseline: +0.2% Memory: ✅ 54.683MB (SLO: <56.000MB -2.4%) vs baseline: +4.9% ✅ tracerTime: ✅ 3.368ms (SLO: <3.650ms -7.7%) vs baseline: +0.5% Memory: ✅ 54.679MB (SLO: <56.500MB -3.2%) vs baseline: +4.7% ✅ tracer-nativeTime: ✅ 3.367ms (SLO: <3.650ms -7.8%) vs baseline: +0.5% Memory: ✅ 54.695MB (SLO: <60.000MB -8.8%) vs baseline: +4.8% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 38.500ms (SLO: <47.150ms 📉 -18.3%) vs baseline: -0.8% Memory: ✅ 39.125MB (SLO: <47.000MB 📉 -16.8%) vs baseline: +5.0% ✅ add-metricsTime: ✅ 258.207ms (SLO: <344.800ms 📉 -25.1%) vs baseline: +0.2% Memory: ✅ 43.414MB (SLO: <47.500MB -8.6%) vs baseline: +4.9% ✅ add-tagsTime: ✅ 315.449ms (SLO: <321.000ms 🟡 -1.7%) vs baseline: +0.3% Memory: ✅ 43.355MB (SLO: <47.500MB -8.7%) vs baseline: +4.8% ✅ get-contextTime: ✅ 78.311ms (SLO: <92.350ms 📉 -15.2%) vs baseline: -0.2% Memory: ✅ 39.481MB (SLO: <46.500MB 📉 -15.1%) vs baseline: +5.2% ✅ is-recordingTime: ✅ 36.137ms (SLO: <44.500ms 📉 -18.8%) vs baseline: +0.4% Memory: ✅ 39.014MB (SLO: <47.500MB 📉 -17.9%) vs baseline: +5.3% ✅ record-exceptionTime: ✅ 57.184ms (SLO: <67.650ms 📉 -15.5%) vs baseline: +0.4% Memory: ✅ 39.458MB (SLO: <47.000MB 📉 -16.0%) vs baseline: +4.4% ✅ set-statusTime: ✅ 42.638ms (SLO: <50.400ms 📉 -15.4%) vs baseline: +0.1% Memory: ✅ 38.912MB (SLO: <47.000MB 📉 -17.2%) vs baseline: +4.6% ✅ startTime: ✅ 35.160ms (SLO: <43.450ms 📉 -19.1%) vs baseline: -0.4% Memory: ✅ 38.995MB (SLO: <47.000MB 📉 -17.0%) vs baseline: +4.9% ✅ start-finishTime: ✅ 82.265ms (SLO: <88.000ms -6.5%) vs baseline: +0.3% Memory: ✅ 36.628MB (SLO: <46.500MB 📉 -21.2%) vs baseline: +4.9% ✅ start-finish-telemetryTime: ✅ 83.548ms (SLO: <89.000ms -6.1%) vs baseline: +0.2% Memory: ✅ 36.726MB (SLO: <46.500MB 📉 -21.0%) vs baseline: +5.0% ✅ update-nameTime: ✅ 36.999ms (SLO: <45.150ms 📉 -18.1%) vs baseline: ~same Memory: ✅ 39.086MB (SLO: <47.000MB 📉 -16.8%) vs baseline: +4.9%
|
- Updated `_openai_parse_input_response_messages` to support both `image_url` and `file_id` for images. - Modified `_extract_chat_template_from_instructions` to exclude fallback markers for missing data. - Adjusted tests to reflect changes in prompt structure and added new test cases for file inputs. - Updated cassettes to capture the new response format with multiple image references.
Description
Adds normalization (
_normalize_prompt_variables()) for OpenAI reusable prompt variables to extract meaningful values from all OpenAI SDK's response objects (ResponseInputText, ResponseInputImage, ResponseInputFile) before storing them in LLMObs span metadata.This was overlooked in the initial prompt tracking implementation, which only handled
ResponseInputTextobjects.And also, technically, we can have a reusable prompt... without variable. Hence why I removed
variablesfrom theif instructions and variables:conditional check.Testing
test_normalize_prompt_variables()to verify normalization handles all response object types correctlyRisks
🤷
Additional Notes
Changed the code a bit so it is easier to read and maintain + this way it's more similar to the upcoming nodejs implementation (DataDog/dd-trace-js#6941)