@ruolin59 commented Nov 6, 2025

What changes are proposed in this pull request, and why are they necessary?

This PR fixes a bug in ToLowercaseSchemaVisitor: field names inside complex default values were not lowercased. The default value therefore no longer matched the lowercased schema, and creating the field failed schema validation.

Given an input schema with a record field that has a default value:

{
  "name": "Struct_Field",
  "type": {
    "type": "record",
    "name": "NestedRecord",
    "fields": [
      {"name": "firstName", "type": "string"},
      {"name": "Age", "type": "int"}
    ]
  },
  "default": {
    "firstName": "John",
    "Age": 30
  }
}

Before the fix:

{
  "name": "struct_field",
  "type": {
    "type": "record",
    "name": "nestedrecord",
    "fields": [
      {"name": "firstname", "type": "string"},
      {"name": "age", "type": "int"}
    ]
  },
  "default": {
    "firstName": "John",  // Mismatched casing
    "Age": 30             // Mismatched casing
  }
}

Result: AvroTypeException: Invalid default for field struct_field

After the fix:

{
  "name": "struct_field",
  "type": {
    "type": "record",
    "name": "nestedrecord",
    "fields": [
      {"name": "firstname", "type": "string"},
      {"name": "age", "type": "int"}
    ]
  },
  "default": {
    "firstname": "John",  // Correctly lowercased
    "age": 30             // Correctly lowercased
  }
}

The Solution:
Added a lowercaseDefaultValue() method that recursively transforms default values according to the schema type:

  • RECORD types: Lowercases field names in record default values
  • MAP types: Lowercases all keys in map default values
  • ARRAY types: Recursively processes each array element
  • Primitives: Returned unchanged

The implementation handles both GenericData.Record and Map-based default value representations.
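
For reviewers who want the shape of the recursion without opening the diff, here is a minimal, dependency-free sketch of the per-type dispatch described above. It is not the code in this PR: `SimpleSchema` and `Kind` are hypothetical stand-ins for Avro's `Schema` and `Schema.Type`, only the Map-based default representation is covered (not GenericData.Record), and recursing into map values is an assumption, since the description only states that map keys are lowercased.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: SimpleSchema and Kind are hypothetical stand-ins
// for Avro's Schema and Schema.Type, not the classes used in the PR.
final class LowercaseDefaultSketch {

  enum Kind { RECORD, MAP, ARRAY, PRIMITIVE }

  static final class SimpleSchema {
    final Kind kind;
    final Map<String, SimpleSchema> fields; // RECORD only: original field name -> schema
    final SimpleSchema element;             // MAP value schema or ARRAY item schema

    private SimpleSchema(Kind kind, Map<String, SimpleSchema> fields, SimpleSchema element) {
      this.kind = kind;
      this.fields = fields;
      this.element = element;
    }

    static SimpleSchema primitive() { return new SimpleSchema(Kind.PRIMITIVE, null, null); }
    static SimpleSchema record(Map<String, SimpleSchema> fields) { return new SimpleSchema(Kind.RECORD, fields, null); }
    static SimpleSchema map(SimpleSchema values) { return new SimpleSchema(Kind.MAP, null, values); }
    static SimpleSchema array(SimpleSchema items) { return new SimpleSchema(Kind.ARRAY, null, items); }
  }

  // Returns a copy of the default value with record field names and map keys lowercased.
  static Object lowercaseDefaultValue(Object value, SimpleSchema schema) {
    if (value == null) {
      return null;
    }
    switch (schema.kind) {
      case RECORD: {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
          String name = String.valueOf(entry.getKey());
          // Look up the field schema by its original (not yet lowercased) name.
          SimpleSchema fieldSchema = schema.fields.get(name);
          Object fieldValue = fieldSchema == null
              ? entry.getValue()
              : lowercaseDefaultValue(entry.getValue(), fieldSchema);
          out.put(name.toLowerCase(Locale.ROOT), fieldValue);
        }
        return out;
      }
      case MAP: {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
          // Assumption: map values are processed recursively with the value schema.
          out.put(String.valueOf(entry.getKey()).toLowerCase(Locale.ROOT),
              lowercaseDefaultValue(entry.getValue(), schema.element));
        }
        return out;
      }
      case ARRAY: {
        List<Object> out = new ArrayList<>();
        for (Object item : (List<?>) value) {
          out.add(lowercaseDefaultValue(item, schema.element));
        }
        return out;
      }
      default:
        return value; // primitives are returned unchanged
    }
  }

  public static void main(String[] args) {
    Map<String, SimpleSchema> fields = new LinkedHashMap<>();
    fields.put("firstName", SimpleSchema.primitive());
    fields.put("Age", SimpleSchema.primitive());

    Map<String, Object> defaultValue = new LinkedHashMap<>();
    defaultValue.put("firstName", "John");
    defaultValue.put("Age", 30);

    System.out.println(lowercaseDefaultValue(defaultValue, SimpleSchema.record(fields)));
    // prints {firstname=John, age=30}
  }
}
```

The sketch builds new containers rather than mutating the input, so the original default value stays intact if the caller still needs it.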

How was this patch tested?

Added a new test testLowercaseSchemaWithComplexDefaultValues() in SchemaUtilitiesTests that verifies the fix handles:

  1. Simple primitive default values (no change needed)
  2. Nested record default values with mixed-case field names
  3. Map default values with mixed-case keys
  4. Array default values containing records with mixed-case field names

The test uses input/expected schema files:

  • testLowercaseSchemaWithDefaultValues-input.avsc - Contains fields with various default values using mixed case
  • testLowercaseSchemaWithDefaultValues-expected.avsc - Contains the fully lowercased expected output

All existing tests continue to pass, confirming backward compatibility.

@ruolin59 changed the title from "Ruolin/fix default value lowercasing" to "[Coral-Schema] default value lowercasing" on Nov 7, 2025