feat(bigquery/storage/managedwriter): allow overriding proto conversion mapping #12579

Open: wants to merge 7 commits into base: main

alvarowolfx (Contributor)

Currently, using bigquery.InferSchema together with adapt.StorageSchemaToProto2Descriptor to generate proto descriptors for tables with TIMESTAMP fields can cause compatibility issues: the original struct has a time.Time field, but in the proto descriptor that field becomes an INT64.

This PR adds an option to map those fields to Google's Timestamp Well Known Type (WKT), which the Storage Write API also accepts. I don't think we can make it the default, because some customers may already rely on unmarshalling JSON data with timestamps in the INT64 format (Unix timestamp) rather than as RFC 3339 formatted timestamp strings.

We also discussed adding options to the StorageSchemaToProto2Descriptor method earlier, in the improvement issue about CDC helpers: #10721

Naming is hard, and I'm not sure what the best name for the WithTimestampWellKnownType method is. Open to suggestions.

Fixes #12569
Supersedes #12578


// WithTimestampWellKnownType defines that table fields of type Timestamp, are mapped
// as Google's WKT timestamppb.Timestamp.
func WithTimestampWellKnownType(useTimestampWellKnownType bool) Option {
Contributor

Do we want this option to be more configurable? For a given input schema type, should the user be able to specify the desired output proto type? Almost all of the schema types in https://cloud.google.com/bigquery/docs/supported-data-types list multiple possible proto representations.

Contributor Author

Excellent point, I'll see how to make it more configurable.
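For illustration only, one shape such a configurable mapping could take is a functional option that overrides the proto type chosen for a given schema field type. Everything in this sketch (FieldType, ProtoType, WithProtoMapping, resolve, and the default table) is hypothetical and is not the API that the adapt package exposes or that this PR ends up with.

```go
package main

import "fmt"

// FieldType and ProtoType are hypothetical stand-ins for the schema
// field type and the proto type it maps to; they are not adapt types.
type FieldType string
type ProtoType string

const (
	TimestampField FieldType = "TIMESTAMP"

	Int64Proto   ProtoType = "int64"
	TimestampWKT ProtoType = "google.protobuf.Timestamp"
)

// config collects per-call overrides of the default mapping.
type config struct {
	overrides map[FieldType]ProtoType
}

// Option mutates the conversion config (functional options pattern).
type Option func(*config)

// WithProtoMapping is a hypothetical option name: it asks the
// conversion to emit pt for every field of schema type ft.
func WithProtoMapping(ft FieldType, pt ProtoType) Option {
	return func(c *config) { c.overrides[ft] = pt }
}

// defaults models the existing behavior described in the PR:
// TIMESTAMP columns become INT64 unless overridden.
var defaults = map[FieldType]ProtoType{
	TimestampField: Int64Proto,
}

// resolve returns the proto type for ft, preferring any override.
func resolve(ft FieldType, opts ...Option) ProtoType {
	c := &config{overrides: map[FieldType]ProtoType{}}
	for _, opt := range opts {
		opt(c)
	}
	if pt, ok := c.overrides[ft]; ok {
		return pt
	}
	return defaults[ft]
}

func main() {
	fmt.Println(resolve(TimestampField))
	fmt.Println(resolve(TimestampField, WithProtoMapping(TimestampField, TimestampWKT)))
}
```

Keeping the default table untouched and layering overrides on top preserves the current INT64 behavior for existing callers, which matches the concern in the PR description about not changing the default.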

@alvarowolfx changed the title from "feat(bigquery/storage/managedwriter): optionally use timestamp wkt" to "feat(bigquery/storage/managedwriter): allow overriding proto conversion mapping" on Jul 21, 2025
@alvarowolfx requested a review from shollyman on July 22, 2025 14:54
@alvarowolfx added the owlbot:run label and removed the breaking change label on Jul 22, 2025
@gcf-owl-bot removed the owlbot:run label on Jul 22, 2025
Labels: api: bigquery (Issues related to the BigQuery API.)
Linked issue (may be closed when this pull request merges): bigquery/storage/managedwriter/adapt: Schema to protobuf descriptors weird for TIMESTAMP columns
2 participants