Better Support for Naive Datetime Serialization #872

@dimahc

Description

Summary

Currently, msgspec serializes naive datetime objects as strings, which can be challenging to deserialize in other languages (especially Go). This creates interoperability issues when building cross-language systems that need to exchange datetime data via MessagePack.

Problem Description

When msgspec encodes a naive datetime object to MessagePack, it is serialized as an ISO 8601 string (timezone-aware datetimes, by contrast, are encoded with the MessagePack timestamp extension):

import msgspec
from datetime import datetime

# Naive datetime (no timezone info)
dt = datetime(2025, 12, 1, 0, 0, 0)
encoded = msgspec.msgpack.encode(dt)
# Encoded as a MessagePack str ("2025-12-01T00:00:00"), not a timestamp
# extension, so a Go consumer has to parse the string itself

This approach has several drawbacks:

  1. Cross-language compatibility issues: Languages like Go expect timestamps to be in a standardized binary format (MessagePack timestamp extension or Unix timestamp)
  2. Performance overhead: String parsing is slower than binary timestamp decoding
  3. Timezone ambiguity: A naive datetime string carries no UTC offset, so the receiver has to guess which timezone it refers to
  4. Inconsistent behavior: Different libraries handle naive datetimes differently
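
To make the ambiguity in point 3 concrete, here is a minimal standard-library sketch. The `+02:00` offset is an arbitrary choice for illustration; the point is only that the same naive string maps to different instants depending on what the receiver assumes.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2025, 12, 1, 0, 0, 0)
text = naive.isoformat()  # no offset in the string: '2025-12-01T00:00:00'

# Two receivers making different assumptions about the missing offset
as_utc = naive.replace(tzinfo=timezone.utc)
as_plus2 = naive.replace(tzinfo=timezone(timedelta(hours=2)))  # illustrative offset

utc_seconds = int(as_utc.timestamp())
plus2_seconds = int(as_plus2.timestamp())

# The two interpretations disagree by exactly the assumed offset
print(utc_seconds - plus2_seconds)  # 7200
```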

Current Workaround

Other libraries like ormsgpack provide flags to handle this properly:

import ormsgpack
from datetime import datetime

dt = datetime(2025, 12, 1, 0, 0, 0)  # naive datetime
encoded = ormsgpack.packb(
    dt,
    option=ormsgpack.OPT_DATETIME_AS_TIMESTAMP_EXT | ormsgpack.OPT_NAIVE_UTC
)
# This produces a MessagePack timestamp extension that Go can easily decode

Proposed Solutions

Option 1: Add Configuration Flags (Preferred)

Add configuration options similar to ormsgpack:

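import msgspec
from datetime import datetime

# Proposed API: these keyword arguments do not exist in msgspec today.
# Configure encoder to treat naive datetimes as UTC timestamps
encoder = msgspec.msgpack.Encoder(
    naive_datetime_as_utc=True,
    datetime_as_timestamp_ext=True,
)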

dt = datetime(2025, 12, 1, 0, 0, 0)
encoded = encoder.encode(dt)
# Results in MessagePack timestamp extension format

Option 2: Enhanced Custom Hook Support

Allow custom hooks to intercept and modify datetime encoding behavior:

import msgspec
from datetime import datetime, timezone

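def datetime_hook(obj):
    if isinstance(obj, datetime):
        if obj.tzinfo is None:
            # Treat naive datetime as UTC
            obj = obj.replace(tzinfo=timezone.utc)
        # Proposed API: TimestampExt does not exist in msgspec today. It would
        # wrap a timestamp that msgspec encodes as the timestamp extension.
        return msgspec.msgpack.TimestampExt(obj.timestamp())
    # Returning obj unchanged would send it straight back to the hook;
    # an enc_hook signals unsupported types by raising NotImplementedError.
    raise NotImplementedError(f"Objects of type {type(obj)} are not supported")

# Proposed behavior: today enc_hook is only called for types msgspec does not
# support natively, so datetime objects never reach it.
encoder = msgspec.msgpack.Encoder(enc_hook=datetime_hook)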

Option 3: Global Configuration

Provide module-level configuration:

import msgspec

# Proposed API: msgspec.msgpack.configure does not exist in msgspec today.
# Configure globally
msgspec.msgpack.configure(
    naive_datetime_treatment='utc',
    datetime_format='timestamp_ext',
)

Use Case

We're building a system where:

  • Python services use msgspec for performance
  • Go services need to consume the same MessagePack data
  • Datetime fields are common in our data structures
  • Cross-language compatibility is critical

Currently, we have to use different libraries (ormsgpack for encoding, msgspec for other operations), which creates inconsistency and complexity.

Expected Behavior

Naive datetimes should be encodable as MessagePack timestamp extensions, treating them as UTC timestamps, similar to how ormsgpack handles this with the OPT_NAIVE_UTC flag.
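
For reference, the bytes such an encoder would be expected to emit can be sketched with the standard library alone. This follows the timestamp extension layout in the MessagePack specification (type -1, with 32-, 64-, and 96-bit forms); `encode_timestamp_ext` is a hypothetical name for illustration, not msgspec's implementation.

```python
import struct
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_timestamp_ext(dt: datetime) -> bytes:
    """Hypothetical sketch: encode a datetime as a MessagePack timestamp extension."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    delta = dt - EPOCH
    # timedelta keeps seconds and microseconds non-negative, so this floors correctly
    seconds = delta.days * 86400 + delta.seconds
    nanos = delta.microseconds * 1000
    if nanos == 0 and 0 <= seconds < 2**32:
        # timestamp 32: fixext 4, type -1, uint32 seconds
        return b"\xd6\xff" + struct.pack(">I", seconds)
    if 0 <= seconds < 2**34:
        # timestamp 64: fixext 8, type -1, 30-bit nanoseconds + 34-bit seconds
        return b"\xd7\xff" + struct.pack(">Q", (nanos << 34) | seconds)
    # timestamp 96: ext 8 with length 12, type -1, uint32 nanoseconds + int64 seconds
    return b"\xc7\x0c\xff" + struct.pack(">Iq", nanos, seconds)


encoded = encode_timestamp_ext(datetime(2025, 12, 1, 0, 0, 0))
print(encoded.hex())
```

A whole-second value in the 32-bit range takes 6 bytes on the wire, compared with 20 bytes for the equivalent ISO string with its MessagePack header.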

Additional Context

  • msgspec version: 0.19.0
  • Python version: 3.11+
  • Cross-language interoperability is increasingly important for modern distributed systems
  • MessagePack timestamp extensions are well-supported across languages
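
As an illustration of why the extension is easy for consumers to support, here is a minimal standard-library sketch of decoding a timestamp extension payload (the bytes that follow the extension header). The function name is hypothetical, and sub-microsecond precision is truncated because Python's datetime cannot represent it.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timestamp_payload(payload: bytes) -> datetime:
    """Hypothetical sketch: decode a MessagePack timestamp payload to an aware datetime."""
    if len(payload) == 4:
        # timestamp 32: uint32 seconds
        (seconds,) = struct.unpack(">I", payload)
        nanos = 0
    elif len(payload) == 8:
        # timestamp 64: upper 30 bits nanoseconds, lower 34 bits seconds
        (packed,) = struct.unpack(">Q", payload)
        nanos = packed >> 34
        seconds = packed & ((1 << 34) - 1)
    elif len(payload) == 12:
        # timestamp 96: uint32 nanoseconds, int64 seconds
        nanos, seconds = struct.unpack(">Iq", payload)
    else:
        raise ValueError(f"invalid timestamp payload length: {len(payload)}")
    return EPOCH + timedelta(seconds=seconds, microseconds=nanos // 1000)


decoded = decode_timestamp_payload(struct.pack(">I", 1764547200))
print(decoded.isoformat())  # 2025-12-01T00:00:00+00:00
```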

Benefits

  1. Better cross-language compatibility: The standard timestamp extension works across MessagePack implementations that support it
  2. Performance: Binary timestamp encoding/decoding is faster than string parsing
  3. Consistency: Unified approach to datetime handling
  4. Developer experience: Less cognitive overhead when dealing with mixed timezone/naive datetimes

Would the maintainers be open to implementing one of these approaches? I'm happy to contribute a PR if there's interest in this feature.
