Skip to content

Conversation

lzaeh
Copy link
Contributor

@lzaeh lzaeh commented Oct 13, 2025

Why?

Previously, my understanding of the type system was insufficient, which led to confusion between array and slice usage. As a result, the deserialization step of the slice serializer did not work properly in cross-language scenarios due to missing alignment rules (for example, when the types were structurally identical and should have taken the fast path, the previous version failed to correctly interpret the byte stream).

What does this PR do?

Serializer-related changes:

  • Adjusted the slice serializer to better align serialization and deserialization rules.
  • Standardized the type system to ensure that the slice serializer is consistently used where appropriate, eliminating mixed usage with array serializers.
  • Moved the serializer for primitive type arrays into primitive_array.go. If you think it’s not appropriate, I can revert the change.

Test-related changes:

  • Re-enabled previously commented-out test cases in commonSlice/Array.
  • Moved the [n]String test out of commonArray. Although it can be correctly deserialized into []String, the current test logic does not treat [n]String and []String as equivalent.
  • Updated several primitive array tests in TestXLangSerializer to use array types instead of slices.
  • Modified the check function in TestXLangSerializer to deserialize using the precise reflected type instead of interface{}.
    Rationale:
    • For example, []interface{}{"a", "b", "c"} and []string{"a", "b", "c"} are considered the same type and written via the fast path, resulting in identical byte sequences.
    • However, during deserialization in Go, the default behavior is to use []interface{} unless the target type is explicitly provided.(Just my opinion)
    • This means that if the test expects []string{}, it would fail. While we could force construction of a []T{} when the same type is detected, if the test’s starting point is []interface{}, it would still fail. Therefore, the deserialization target type is now explicitly matched via reflection.

Known Issues

  • After pulling the latest code (which includes modifications related to field order hashing in structs), the complex struct tests that passed 10 mins ago now fail due to Python-side deserialization errors.

    • A. Direct testing shows identical hashes, but deserialization in Python triggers an “invalid ID (id = 0)” error in the listSerializer—possibly related to changes in this PR.
    • B. Simplifying the struct on both the Python and Go sides (by removing fields) caused the hash values to differ.
      I will follow up on this issue to determine whether the cause lies in the list serializer or incomplete handling of struct field order consistency.
  • Cross-language list serialization tests (serializing in one language, deserializing in another—Python ↔ Go, Java ↔ Go) revealed that Go’s int and float64 types cannot be correctly deserialized in the other languages.

    • All other basic types work fine, and Go can correctly deserialize bytes from both Python and Java.
    • When encountering floating-point values, Go must use float32 to deserialize successfully.
  • The slice serializer currently mimics Python’s fast-path logic, but the matching rules for integer and floating-point types are somewhat ambiguous. My understanding here may be incomplete, and this part likely needs further refinement.

Related issues

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

@lzaeh lzaeh requested a review from chaokunyang as a code owner October 13, 2025 10:58
@pandalee99 pandalee99 self-requested a review October 13, 2025 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants