-
Notifications
You must be signed in to change notification settings - Fork 1.5k
feat(bigquery/storage/managedwriter): allow overriding proto conversion mapping #12579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… proto description conversion
|
||
// WithTimestampWellKnownType defines that table fields of type Timestamp, are mapped | ||
// as Google's WKT timestamppb.Timestamp. | ||
func WithTimestampWellKnownType(useTimestampWellKnownType bool) Option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be option more configurable? for a given input schema type, should the user be able to specify the desired output proto type? Almost all of the schema types in https://cloud.google.com/bigquery/docs/supported-data-types list multiple possible proto representations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent point, I'll see how to make it more configurable.
Currently the usage of
bigquery.InferSchema
andadapt. StorageSchemaToProto2Descriptor
methods to generate proto descriptors with tables that have fields of typeTIMESTAMP
can have compatibility issues, since the original struct has atime.Time
field and in the proto descriptor it becomes anINT64
.This PR adds an option to convert to Google's Timestamp Well Known Type (WKT), which is also accepted by the Storage Write API. I think we can't make it the default because some customer might be relying on unmarshalling JSON data with timestamps in the
INT64
format ( unix timestamp ) instead of a RFC3339 formatted timestamp string already.Also we discussed adding options to the
StorageSchemaToProto2Descriptor
method before, on the improvement issue related to CDC helpers: #10721Naming is hard, but not sure what is the best name for the
WithTimestampWellKnownType
method. Open to suggestions.Fixes #12569
Supersedes #12578