Description
Registry Version: 3.1.6
Persistence type: postgres
When storing base64 encoded protobuf descriptors into schema registry, there are two things that don't work:
- Issue 1: If schema validation is enabled (i.e. Validity rule is FULL), then registering protobuf descriptors with references doesn't work. This is because the parsing logic fails at multiple places. 1 2 3 4 5
- Issue 2: Apicurio SerDes libraries don't work with base64 encoded protobuf schemas, again because of parsing issues. 1
Environment
Running Apicurio Schema Registry locally in a Docker container, using the Apicurio-provided UI (3.1.6) and SerDes libraries (3.1.6) to interact with the registry.
Validity Rule : FULL
Compatibility Rule : BACKWARD_TRANSITIVE
Integrity Rule : FULL
Steps to Reproduce : reproducer.txt
Issue 1: Schema syntax validation fails for schemas with references
dep.proto
syntax = "proto3";
package test;
message Dep {
  string name = 1;
}

root.proto
syntax = "proto3";
package test;
import "dep.proto";
message Root {
  Dep d = 1;
}

Execute the curl commands in the attached file in the order in which they appear.
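Because root.proto imports dep.proto, the dependency presumably has to be registered before the schema that references it. As a rough illustration only (this is not the contents of reproducer.txt), the ordering can be derived from the `import` statements. The helper name `registration_order` is hypothetical, and the regex is a simplification that ignores comments and other edge cases of the proto grammar.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# Simplified matcher for `import "x.proto";` lines (also accepts the
# `public` / `weak` modifiers). Not a full proto parser.
IMPORT_RE = re.compile(
    r'^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;', re.MULTILINE
)


def registration_order(files):
    """Return file names so that every import precedes its importer.

    `files` maps a file name to its .proto source text. Imports that are
    not part of `files` are ignored.
    """
    known = set(files)
    graph = {
        name: set(IMPORT_RE.findall(text)) & known
        for name, text in files.items()
    }
    # TopologicalSorter expects node -> predecessors, so dependencies come first.
    return list(TopologicalSorter(graph).static_order())


files = {
    "root.proto": (
        'syntax = "proto3";\npackage test;\nimport "dep.proto";\n'
        "message Root {\n  Dep d = 1;\n}\n"
    ),
    "dep.proto": (
        'syntax = "proto3";\npackage test;\n'
        "message Dep {\n  string name = 1;\n}\n"
    ),
}
print(registration_order(files))
```

For the two files above this puts dep.proto ahead of root.proto, which is the order in which the artifacts would be created.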
Issue 2:
Register the schemas in the registry using the curl commands in the attached file, generate the Java stubs from the .proto files, and use them in a producer and a consumer built on the Apicurio SerDes libraries.
Expected vs Actual Behaviour
Expected
Issue 1 : Schema validation must pass for schemas with references when a base64 encoded protobuf schema descriptor is uploaded
Issue 2 : SerDes libraries should work with base64 encoded protobuf schema descriptors
Actual
Issue 1 : Schema validation fails for schemas with references when a base64 encoded protobuf schema descriptor is uploaded
Issue 2 : SerDes libraries don't work with base64 encoded protobuf schema descriptors
Notes:
Background / Context
We are planning to use Apicurio Schema Registry for managing event schemas in our platform. Our ecosystem includes both Java and Go services, all of which interact with the registry via language-specific SerDes libraries.
I’m sharing some background on how we arrived at the issue described above.
Initial Approach
Initially, we stored text .proto files directly in the registry. Schemas were registered using the following flow:
GitHub repository (text .proto files)
→ middleware to fetch schemas and detect updates
→ sync to Apicurio Schema Registry
However, we encountered issues with the Apicurio SerDes libraries where compatibility checks for MESSAGE field types began to fail. (I can provide additional details if helpful.)
Descriptor-Based Approach
To work around this, we moved away from storing text .proto files and introduced an intermediate descriptor-based step:
GitHub repository (text .proto files)
→ generate protobuf descriptors
→ upload descriptors to S3
→ middleware fetches descriptors
→ convert descriptors back to text .proto using square-wireschema
→ register schemas in Apicurio
This resolved the compatibility issues and worked correctly for both Java and Go services initially.
Issue with map Types in Go
During further testing with a wider variety of schemas, we discovered that Go services failed for schemas containing map fields.
This turned out to be caused by:
- How square-wireschema converts descriptors back into text .proto
- Limitations in the Go-side parser we use (jhump/protoreflect), which was unable to parse the generated .proto files correctly. This same parser is used in confluent-kafka-go as well.
Current Direction
As a result, we decided to stop storing text .proto files altogether and instead store base64-encoded protobuf descriptors directly in the registry.
At this point, we started encountering the issues described above.
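To make the two content forms concrete, here is a minimal sketch of how a consumer of registry content might tell a base64-encoded descriptor apart from a text .proto file. It is a heuristic written for illustration, not how the registry or the SerDes libraries currently behave; the function names are hypothetical, and the descriptor bytes are treated as opaque (the sample is a hand-built placeholder, not output from protoc).

```python
import base64
from typing import Optional


def encode_descriptor(descriptor_bytes: bytes) -> str:
    """Base64-encode serialized descriptor bytes for upload as artifact content."""
    return base64.b64encode(descriptor_bytes).decode("ascii")


def try_decode_descriptor(content: str) -> Optional[bytes]:
    """Return the decoded bytes if `content` is strictly valid base64, else None.

    Text .proto sources contain characters such as quotes and semicolons
    that are outside the base64 alphabet, so strict decoding rejects them.
    """
    compact = "".join(content.split())  # tolerate line-wrapped base64
    try:
        return base64.b64decode(compact, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


# Placeholder bytes: a FileDescriptorProto carrying only name = "dep.proto"
# (field 1, length-delimited). Real descriptors would come from protoc.
sample = b"\x0a\x09dep.proto"
encoded = encode_descriptor(sample)

print(try_decode_descriptor(encoded) == sample)
print(try_decode_descriptor('syntax = "proto3";\npackage test;'))
```

A check like this only establishes that the content is base64; the decoded bytes would still need to be parsed as a descriptor before validation or compatibility rules can run on them.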
Additional Context
I’ve reviewed the following PR and discussion:
Suggestion by @EricWittmann to convert descriptors to text .proto (e.g., via square-wireschema) before storing them
While this approach appears to work well for Java services, based on our experience there is a risk of breaking other language implementations, such as Go, due to parser differences.
Next Steps
I’m happy to contribute a PR to address this, potentially following a similar approach to:
Before proceeding, I’d appreciate guidance on the preferred direction and whether this aligns with the project’s expectations.