Skip to content

Add media constructors: baml.audio, baml.image, baml.pdf, baml.video#3170

Open
hellovai wants to merge 6 commits intocanaryfrom
hellovai/media-constructors
Open

Add media constructors: baml.audio, baml.image, baml.pdf, baml.video#3170
hellovai wants to merge 6 commits intocanaryfrom
hellovai/media-constructors

Conversation

@hellovai
Copy link
Contributor

@hellovai hellovai commented Feb 24, 2026

Summary

Adds BAML builtin media constructor modules baml.audio, baml.image, baml.pdf, and baml.video as thin wrappers around baml.Media.from_* with a fixed kind.

Changes

  • Builtins: New audio.baml, image.baml, pdf.baml, video.baml in baml_builtins/baml/, each with from_url, from_base64, and from_file (delegating to baml.Media.from_*).
  • Compiler: Builtins macros (parse, collect, codegen_native, util), HIR path_resolve, TIR builtins/normalize, and MIR lower updated to support the new types.
  • Runtime: New builtins registered in baml_builtins lib and bex_vm native.
  • Tests: New baml_std snapshots for audio/image/pdf/video (lexer, parser, formatter); existing HIR/MIR/TIR/codegen snapshots updated.

Clippy

  • baml_compiler_tir/src/builtins.rs: use .map(substitute_unknown) instead of .map(|p| substitute_unknown(p)).
  • bex_vm/src/native.rs: use mime_type.map(ToString::to_string) instead of mime_type.map(|s| s.to_string()).

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Added constructors for audio, video, image, and PDF media types that can be created from URLs, base64-encoded data, or files with optional MIME type specification.
  • Improvements

    • Enhanced type system and resolution logic for improved media type handling and validation.

hellovai and others added 5 commits February 23, 2026 23:02
- Add builtin modules audio.baml, image.baml, pdf.baml, video.baml with
  from_url, from_base64, from_file (thin wrappers around baml.Media.from_*).
- Extend builtins macros (parse, collect, codegen_native, util) and
  path_resolve, TIR builtins/normalize, MIR lower for new types.
- Register new builtins in baml_builtins lib and bex_vm native.
- Include new baml_std test snapshots and update existing snapshots.

Co-authored-by: Cursor <[email protected]>
@vercel
Copy link

vercel bot commented Feb 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
beps Ready Ready Preview, Comment Feb 25, 2026 0:22am
promptfiddle Ready Ready Preview, Comment Feb 25, 2026 0:22am

Request Review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Walkthrough

This PR introduces media type constructors for audio, image, PDF, and video as BAML builtin wrappers around existing Media.from_* functions. It extends the builtin type system with DSL-aware type patterns (StringLiteral, Union), updates the builtin macro infrastructure to support these new patterns, adds path resolution for nested BAML namespaces, and implements native VM constructors for media creation from URLs, base64, and files.

Changes

Cohort / File(s) Summary
New Media Builtin Modules
baml_language/crates/baml_builtins/baml/audio.baml, image.baml, pdf.baml, video.baml
Added three wrapper functions (from_url, from_base64, from_file) for each media type, delegating to baml.Media.from_* with kind preset to the respective media type.
Core Type Pattern System
baml_language/crates/baml_builtins/src/lib.rs, baml_language/crates/baml_compiler_tir/src/builtins.rs
Added TypePattern variants StringLiteral and Union; extended Media builtin methods with from_url, from_base64, from_file overloads; updated pattern matching and substitution logic to handle new variants.
Builtin Macro Infrastructure
baml_language/crates/baml_builtins_macros/src/parse.rs, collect.rs, util.rs, codegen_native.rs
Introduced DslType abstraction (Syn, StringLiteral, Union) for parameter and return types; added DSL-aware helper functions; refactored parameter/return type processing to use DSL variants; added Media result conversion in codegen.
Path Resolution & Type Inference
baml_language/crates/baml_compiler_hir/src/path_resolve.rs, baml_language/crates/baml_compiler_tir/src/lib.rs
Enhanced path resolution to support deeper BAML namespace traversal and fallback prefixing; extended builtin path lookups during expression inference; improved method/function resolution for multi-segment paths.
Runtime & Subtyping
baml_language/crates/bex_vm/src/native.rs, baml_language/crates/baml_compiler_tir/src/normalize.rs
Added native functions for media creation (baml_media_from_url, baml_media_from_base64, baml_media_from_file) with kind mapping; added Media-to-Media subtyping rule.
Test & Build Support
baml_language/crates/baml_tests/build.rs
Prefixed builtin BAML file paths with `/baml` to align test path resolution and prevent duplicate function errors.
Minor Changes
baml_language/crates/baml_compiler_mir/src/lower.rs, baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.proto
Removed comment in ResolvedValue handler; reformatted enum comment in proto file.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

rust, feature (small)

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding four new media constructor modules (audio, image, pdf, video) to the BAML builtins library.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hellovai/media-constructors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 24, 2026

Merging this PR will degrade performance by 13.12%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 2 regressed benchmarks
✅ 13 untouched benchmarks
⏩ 91 skipped benchmarks1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
WallTime bench_parse_only_simple 70.1 µs 80.7 µs -13.12%
WallTime bench_lexer_only_simple 44.8 µs 50.2 µs -10.81%

Comparing hellovai/media-constructors (274f0b5) with canary (7c71c3e)

Open in CodSpeed

Footnotes

  1. 91 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Copy link

github-actions bot commented Feb 24, 2026

Binary size checks passed

7 passed

Artifact Platform Gzip Baseline Delta Status
bridge_cffi Linux 3.9 MB 4.0 MB -120.4 KB (-3.0%) OK
bridge_cffi-stripped Linux 2.3 MB 2.1 MB +199.4 KB (+9.3%) OK
bridge_cffi macOS 3.2 MB 3.3 MB -132.5 KB (-4.0%) OK
bridge_cffi-stripped macOS 1.9 MB 1.7 MB +160.0 KB (+9.3%) OK
bridge_cffi Windows 3.1 MB 3.3 MB -132.1 KB (-4.1%) OK
bridge_cffi-stripped Windows 1.9 MB 1.8 MB +172.3 KB (+9.8%) OK
bridge_wasm WASM 1.8 MB 1.8 MB +1.9 KB (+0.1%) OK

Generated by cargo size-gate · workflow run

baml.Media.from_url("pdf", url, mime_type)
}

function from_base64(base64: string, mime_type: string) -> pdf {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for mime type for pdf


// Infer receiver type (could be single var or nested field access)
let receiver_ty = if receiver_segments.len() == 1 {
// Simple receiver: `baz.method()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was neat

Base automatically changed from feat/cffi-handle-table-wire-serialization to canary February 25, 2026 00:39
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
baml_language/crates/baml_compiler_tir/src/lib.rs (1)

2104-2213: ⚠️ Potential issue | 🟠 Major

Short-path builtin lookup runs before local receiver resolution.

Trying baml.{full_path} before receiver-chain handling can misresolve local calls like image.from_url(...) as global builtins.

💡 Proposed fix
                 Expr::Path(segments) if segments.len() >= 2 => {
                     let full_path = segments
                         .iter()
                         .map(smol_str::SmolStr::as_str)
                         .collect::<Vec<_>>()
                         .join(".");
+                    let first_segment_is_local = ctx.lookup(&segments[0]).is_some();
+
-                    let prefixed_path = if !full_path.starts_with("baml.") {
+                    let prefixed_path = if !first_segment_is_local && !full_path.starts_with("baml.") {
                         Some(format!("baml.{full_path}"))
                     } else {
                         None
                     };
-                    if let Some(def) = builtins::lookup_builtin_by_path(&full_path).or_else(|| {
+                    if !first_segment_is_local && let Some(def) = builtins::lookup_builtin_by_path(&full_path).or_else(|| {
                         prefixed_path
                             .as_deref()
                             .and_then(builtins::lookup_builtin_by_path)
                     }) {
                         // Builtin function via Path ...

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c71c3e and 274f0b5.

⛔ Files ignored due to path filters (18)
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____01_lexer__audio.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____01_lexer__image.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____01_lexer__pdf.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____01_lexer__video.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____02_parser__audio.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____02_parser__image.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____02_parser__pdf.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____02_parser__video.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____03_hir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_5_mir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_tir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____10_formatter__audio.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____10_formatter__image.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____10_formatter__pdf.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____10_formatter__video.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/src/snapshots/baml_tests__codegen__tests__bytecode_display_expanded.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/src/snapshots/baml_tests__codegen__tests__bytecode_display_expanded_unoptimized.snap is excluded by !**/*.snap
📒 Files selected for processing (17)
  • baml_language/crates/baml_builtins/baml/audio.baml
  • baml_language/crates/baml_builtins/baml/image.baml
  • baml_language/crates/baml_builtins/baml/pdf.baml
  • baml_language/crates/baml_builtins/baml/video.baml
  • baml_language/crates/baml_builtins/src/lib.rs
  • baml_language/crates/baml_builtins_macros/src/codegen_native.rs
  • baml_language/crates/baml_builtins_macros/src/collect.rs
  • baml_language/crates/baml_builtins_macros/src/parse.rs
  • baml_language/crates/baml_builtins_macros/src/util.rs
  • baml_language/crates/baml_compiler_hir/src/path_resolve.rs
  • baml_language/crates/baml_compiler_mir/src/lower.rs
  • baml_language/crates/baml_compiler_tir/src/builtins.rs
  • baml_language/crates/baml_compiler_tir/src/lib.rs
  • baml_language/crates/baml_compiler_tir/src/normalize.rs
  • baml_language/crates/baml_tests/build.rs
  • baml_language/crates/bex_vm/src/native.rs
  • baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.proto
💤 Files with no reviewable changes (1)
  • baml_language/crates/baml_compiler_mir/src/lower.rs

Comment on lines +14 to +22
#[derive(Clone)]
pub(crate) enum DslType {
/// A standard Rust type parsed by `syn`.
Syn(Type),
/// A string literal type (e.g., `"image"`).
StringLiteral(String),
/// A union of types (e.g., `"image" | "video" | "audio"`).
Union(Vec<DslType>),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

DslType is missing a Debug derive.

The enum is used as a proc-macro intermediate type. Without Debug, any eprintln!("{:?}", dsl_ty) or syn Error::new_spanned(…) reflection that touches DslType will fail to compile, making debugging proc-macro panics harder.

✏️ Proposed fix
-#[derive(Clone)]
+#[derive(Clone, Debug)]
 pub(crate) enum DslType {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[derive(Clone)]
pub(crate) enum DslType {
/// A standard Rust type parsed by `syn`.
Syn(Type),
/// A string literal type (e.g., `"image"`).
StringLiteral(String),
/// A union of types (e.g., `"image" | "video" | "audio"`).
Union(Vec<DslType>),
}
#[derive(Clone, Debug)]
pub(crate) enum DslType {
/// A standard Rust type parsed by `syn`.
Syn(Type),
/// A string literal type (e.g., `"image"`).
StringLiteral(String),
/// A union of types (e.g., `"image" | "video" | "audio"`).
Union(Vec<DslType>),
}

Comment on lines +280 to +342
pub(crate) fn dsl_type_to_pattern(
dsl_ty: &DslType,
generic_params: &[String],
builtin_types: &HashMap<String, String>,
builtin_enums: &HashMap<String, String>,
) -> TokenStream2 {
match dsl_ty {
DslType::Syn(ty) => type_to_pattern(ty, generic_params, builtin_types, builtin_enums),
DslType::StringLiteral(s) => {
quote!(TypePattern::StringLiteral(#s))
}
DslType::Union(variants) => {
let patterns: Vec<TokenStream2> = variants
.iter()
.map(|v| dsl_type_to_pattern(v, generic_params, builtin_types, builtin_enums))
.collect();
quote!(TypePattern::Union(vec![#(#patterns),*]))
}
}
}

/// Get the simple type name from a `DslType` (for native fn generation).
pub(crate) fn dsl_type_to_simple_name(dsl_ty: &DslType) -> String {
match dsl_ty {
DslType::Syn(ty) => type_to_simple_name(ty),
DslType::StringLiteral(_) => "String".to_string(),
DslType::Union(variants) => {
if variants
.iter()
.all(|v| matches!(v, DslType::StringLiteral(_)))
{
"String".to_string()
} else {
dsl_type_to_simple_name(variants.first().expect("empty union"))
}
}
}
}

/// Check if a `DslType` is a generic type parameter.
pub(crate) fn dsl_is_generic_type(dsl_ty: &DslType, generic_params: &[String]) -> bool {
match dsl_ty {
DslType::Syn(ty) => is_generic_type(ty, generic_params),
DslType::StringLiteral(_) | DslType::Union(_) => false,
}
}

/// Check if a `DslType` is `Result<T>` and return the inner type if so.
///
/// String literal and union types are never `Result`.
/// Returns owned values since the inner type may need wrapping.
pub(crate) fn dsl_unwrap_result_type(dsl_ty: &DslType) -> (DslType, bool) {
match dsl_ty {
DslType::Syn(ty) => {
let (inner, is_result) = unwrap_result_type(ty);
if is_result {
(DslType::Syn(inner.clone()), true)
} else {
(dsl_ty.clone(), false)
}
}
DslType::StringLiteral(_) | DslType::Union(_) => (dsl_ty.clone(), false),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

No unit tests for the four new DSL helpers.

dsl_type_to_pattern, dsl_type_to_simple_name, dsl_is_generic_type, and dsl_unwrap_result_type are non-trivial dispatch functions with branching logic for StringLiteral, Union, and Syn variants. None have unit tests.

As per coding guidelines, prefer Rust unit tests. Please add #[cfg(test)] coverage — at minimum for the StringLiteral and Union branches that are not exercised by the existing type_* tests.

Comment on lines +306 to +316
DslType::Union(variants) => {
if variants
.iter()
.all(|v| matches!(v, DslType::StringLiteral(_)))
{
"String".to_string()
} else {
dsl_type_to_simple_name(variants.first().expect("empty union"))
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

dsl_type_to_simple_name for mixed Union is order-dependent and fragile.

For a Union where not all variants are StringLiteral, the function silently returns the simple name of the first variant. This means "audio" | String and String | "audio" produce different names. Since this name drives native codegen (argument extraction/conversion), a reordering of union variants in a DSL declaration could silently change generated code.

Additionally, variants.first().expect("empty union") will panic inside the proc-macro with an unhelpful message if a DslType::Union(vec![]) is ever constructed.

Consider either: (a) asserting the invariant at construction time in parse.rs, or (b) making the mixed-Union case an explicit compile-time error here.

🛡️ Suggested hardening
 DslType::Union(variants) => {
     if variants
         .iter()
         .all(|v| matches!(v, DslType::StringLiteral(_)))
     {
         "String".to_string()
     } else {
-        dsl_type_to_simple_name(variants.first().expect("empty union"))
+        // Mixed unions (non-all-StringLiteral) should not reach here.
+        // Union(vec![]) is invalid; Syn variants in a union already have
+        // a fixed canonical name from their first non-literal element.
+        match variants.iter().find(|v| !matches!(v, DslType::StringLiteral(_))) {
+            Some(v) => dsl_type_to_simple_name(v),
+            None => panic!("dsl_type_to_simple_name: empty union reached"),
+        }
     }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
DslType::Union(variants) => {
if variants
.iter()
.all(|v| matches!(v, DslType::StringLiteral(_)))
{
"String".to_string()
} else {
dsl_type_to_simple_name(variants.first().expect("empty union"))
}
}
}
DslType::Union(variants) => {
if variants
.iter()
.all(|v| matches!(v, DslType::StringLiteral(_)))
{
"String".to_string()
} else {
// Mixed unions (non-all-StringLiteral) should not reach here.
// Union(vec![]) is invalid; Syn variants in a union already have
// a fixed canonical name from their first non-literal element.
match variants.iter().find(|v| !matches!(v, DslType::StringLiteral(_))) {
Some(v) => dsl_type_to_simple_name(v),
None => panic!("dsl_type_to_simple_name: empty union reached"),
}
}
}

Comment on lines +224 to +230
// Fallback: try with "baml" prefix for short-name builtin access.
// e.g., image.from_url -> baml.image.from_url
if first.as_str() != "baml" {
let mut prefixed = vec![Name::new("baml")];
prefixed.extend(segments.iter().cloned());
return resolve_path(db, project, &prefixed);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid unconditional short-path baml prefix fallback in HIR resolver.

This resolver is scope-agnostic. Prefixing unresolved paths (e.g., image.from_url) can incorrectly steal local receiver chains by resolving them as baml.image.from_url.

💡 Proposed fix
-    // Fallback: try with "baml" prefix for short-name builtin access.
-    // e.g., image.from_url -> baml.image.from_url
-    if first.as_str() != "baml" {
-        let mut prefixed = vec![Name::new("baml")];
-        prefixed.extend(segments.iter().cloned());
-        return resolve_path(db, project, &prefixed);
-    }
-
     None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Fallback: try with "baml" prefix for short-name builtin access.
// e.g., image.from_url -> baml.image.from_url
if first.as_str() != "baml" {
let mut prefixed = vec![Name::new("baml")];
prefixed.extend(segments.iter().cloned());
return resolve_path(db, project, &prefixed);
}
None

Comment on lines +138 to +140
(TypePattern::Union(patterns), ty) => patterns
.iter()
.any(|p| match_pattern_inner(p, ty, bindings)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Isolate union-branch bindings to avoid cross-branch contamination.

patterns.iter().any(...) reuses the same mutable bindings map for every arm. A failed arm can leave partial bindings that incorrectly constrain later arms.

💡 Proposed fix
-        (TypePattern::Union(patterns), ty) => patterns
-            .iter()
-            .any(|p| match_pattern_inner(p, ty, bindings)),
+        (TypePattern::Union(patterns), ty) => patterns.iter().any(|p| {
+            let mut branch_bindings = bindings.clone();
+            if match_pattern_inner(p, ty, &mut branch_bindings) {
+                *bindings = branch_bindings;
+                true
+            } else {
+                false
+            }
+        }),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(TypePattern::Union(patterns), ty) => patterns
.iter()
.any(|p| match_pattern_inner(p, ty, bindings)),
(TypePattern::Union(patterns), ty) => patterns.iter().any(|p| {
let mut branch_bindings = bindings.clone();
if match_pattern_inner(p, ty, &mut branch_bindings) {
*bindings = branch_bindings;
true
} else {
false
}
}),

Comment on lines +99 to +103
TypePattern::StringLiteral(val) => Ty::Literal(LiteralValue::String(val.to_string())),
TypePattern::Union(patterns) => substitute_with_fallback(
patterns.first().expect("empty union in TypePattern"),
bindings,
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Union substitution currently drops all but the first member.

Using patterns.first() narrows union patterns incorrectly and can produce wrong inferred parameter/return types.

💡 Proposed fix
-        TypePattern::Union(patterns) => substitute_with_fallback(
-            patterns.first().expect("empty union in TypePattern"),
-            bindings,
-        ),
+        TypePattern::Union(patterns) => {
+            if patterns.is_empty() {
+                Ty::Unknown
+            } else {
+                Ty::Union(
+                    patterns
+                        .iter()
+                        .map(|p| substitute_with_fallback(p, bindings))
+                        .collect(),
+                )
+            }
+        }

Comment on lines +168 to +171
// Media(Generic) is a subtype of any specific media kind.
// At runtime, the kind is just metadata; the wrapper functions
// (e.g., image.from_url) ensure the correct kind is set.
(StructuralTy::Media(_), StructuralTy::Media(_)) => true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The subtyping rule comment and code disagree — and a unit test is missing.

The comment says "Media(Generic) is a subtype of any specific media kind" (one-directional), but the arm (StructuralTy::Media(_), StructuralTy::Media(_)) => true is fully symmetric: it also makes Media(Image) <: Media(Audio). A function declared to accept audio would therefore silently accept an image value at the type-check level.

If cross-kind assignability is intentional (all media kinds are interchangeable), the comment should say so explicitly. If only Media(Generic) <: Media(Specific) was intended, the arm should be tightened:

🛡️ Option A — clarify intent (all-media-compatible, current behaviour)
-// Media(Generic) is a subtype of any specific media kind.
-// At runtime, the kind is just metadata; the wrapper functions
-// (e.g., image.from_url) ensure the correct kind is set.
-(StructuralTy::Media(_), StructuralTy::Media(_)) => true,
+// All Media kinds are mutually assignable. The kind tag is metadata
+// set by constructors (e.g., baml.image.from_url), not a structural
+// restriction; the runtime does not enforce kind at call boundaries.
+(StructuralTy::Media(_), StructuralTy::Media(_)) => true,
🛡️ Option B — tighten to Generic-only (if cross-kind should be rejected)
-(StructuralTy::Media(_), StructuralTy::Media(_)) => true,
+// Media(Generic) <: Media(AnyKind): a generic media value may be used
+// where a specific media kind is expected.
+(StructuralTy::Media(baml_base::MediaKind::Generic), StructuralTy::Media(_)) => true,

Additionally, the test module has no coverage for this rule. Please add a test:

#[test]
fn test_media_subtyping() {
    let aliases = HashMap::new();
    // cross-kind (verify current intent)
    assert!(is_subtype_of(
        &Ty::Media(baml_base::MediaKind::Image),
        &Ty::Media(baml_base::MediaKind::Audio),
        &aliases,
    ));
}

As per coding guidelines, prefer Rust unit tests over integration tests.

Comment on lines +241 to +248
fn media_kind_from_str(s: &str) -> MediaKind {
match s {
"image" => MediaKind::Image,
"video" => MediaKind::Video,
"audio" => MediaKind::Audio,
"pdf" => MediaKind::Pdf,
_ => MediaKind::Generic,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail fast on invalid media kind instead of silently coercing to Generic.

The _ => MediaKind::Generic branch masks invalid kinds and can silently produce incorrect media classification.

💡 Proposed fix
 fn media_kind_from_str(s: &str) -> MediaKind {
     match s {
         "image" => MediaKind::Image,
         "video" => MediaKind::Video,
         "audio" => MediaKind::Audio,
         "pdf" => MediaKind::Pdf,
-        _ => MediaKind::Generic,
+        "media" => MediaKind::Generic,
+        other => unreachable!("invalid media kind literal: {other}"),
     }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn media_kind_from_str(s: &str) -> MediaKind {
match s {
"image" => MediaKind::Image,
"video" => MediaKind::Video,
"audio" => MediaKind::Audio,
"pdf" => MediaKind::Pdf,
_ => MediaKind::Generic,
}
fn media_kind_from_str(s: &str) -> MediaKind {
match s {
"image" => MediaKind::Image,
"video" => MediaKind::Video,
"audio" => MediaKind::Audio,
"pdf" => MediaKind::Pdf,
"media" => MediaKind::Generic,
other => unreachable!("invalid media kind literal: {other}"),
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants