Releases: GoogleCloudPlatform/DataflowTemplates

Dataflow Templates 2022-10-25-00_RC00

26 Oct 21:17

Release Week of 2022-10-25

Note: This release is in the process of rolling out. It may not be in your region yet.

Improvements

[Project Structure] The structure of the project has changed to simplify contributing.

  • Move classic templates to their own subdirectory, v1/, consistent with flex templates being housed in the v2/ subdirectory.
  • Remove unified-templates.xml and rely solely on the root pom.xml for building the project.

[JdbcToBigQuery Template] Add an optional pipeline option that controls whether loading truncates the destination BigQuery table or appends to it.
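
As a rough illustration of the truncate/append toggle, a boolean option can be mapped to one of BigQuery's two relevant write dispositions, WRITE_TRUNCATE and WRITE_APPEND. The function name and the `truncate` parameter below are hypothetical; the release note does not name the actual pipeline option or say how the template wires it up.

```python
def write_disposition(truncate: bool) -> str:
    """Map a hypothetical boolean option to a BigQuery write disposition.

    WRITE_TRUNCATE replaces the table contents on load; WRITE_APPEND adds
    the loaded rows to whatever the table already holds.
    """
    return "WRITE_TRUNCATE" if truncate else "WRITE_APPEND"
```

Which behavior the template uses when the option is unset is not stated in this note.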

Contributors

Pablo Estrada @pabloem
Suddhasatwa Bhaumik @suddhasatwabhaumik
Bruno Volpato @bvolpato

Dataflow Templates 2022-10-18-00_RC00

25 Oct 15:20

Release Week of 2022-10-18

Improvements

[Bigtable Templates] Added Bigtable Resource Manager for integration testing
[BigQuery Templates] Added BigQuery Resource Manager for integration testing
[Splunk Template] Enable batching by default (batch count of 10) on the Pub/Sub to Splunk template

Bug Fixes

[Elasticsearch Templates] Fix cases in which maxBatchSizeBytes could be exceeded
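
The note does not describe the fix itself. As a generic sketch of how a byte-bounded batcher can avoid overshooting its limit, the size check can run before a document is added to the batch rather than after. The function and parameter names here are illustrative and are not taken from the template code.

```python
from typing import Iterable, Iterator, List


def batch_by_bytes(documents: Iterable[str], max_batch_size_bytes: int) -> Iterator[List[str]]:
    """Group documents so each batch stays within max_batch_size_bytes.

    The check happens before appending: if the next document would push the
    batch over the limit, the current batch is flushed first. A single
    document larger than the limit is still emitted, alone in its own batch.
    """
    batch: List[str] = []
    batch_bytes = 0
    for doc in documents:
        doc_bytes = len(doc.encode("utf-8"))
        if batch and batch_bytes + doc_bytes > max_batch_size_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch
```

For example, with a limit of 8 bytes, the inputs "aaaa", "bbbb", "cc" are grouped as ["aaaa", "bbbb"] followed by ["cc"].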

Contributors

Bruno Volpato @bvolpato
Jeffrey Kinard @Polber
Mark Pevec @ggprod
olegsa @oleg-semenov

Dataflow Templates 2022-09-26-01_RC00

03 Oct 16:03

Release Week of 2022-09-26

Improvements

[Spanner Template] Remove the CAST-to-string statement when reading NUMERIC columns from Spanner.
[All templates] Do not apply formatting/spotless to Apache Beam code

Contributors

@bvolpato
@darshan-sj

Dataflow Templates 2022-09-13-00_RC00

20 Sep 02:33

Release Week of 2022-09-12

New Templates

  • JDBC to BigQuery Flex template added. Same functionality as the existing classic template, but the new template also supports BigQuery Storage Write API.

Improvements

  • [BigQueryToParquet Template] Now supports row restrictions of the BigQuery Storage Read API.
  • [DataStreamToSpanner Template] Better handling of incoming change events based on the schema mappings in the session file.
  • Prevent WindowedFilenamePolicy from changing bucket names. WindowedFilenamePolicy replaces date patterns in the output directory so that the output location changes with the window end time. This caused errors when the bucket name itself contained a date pattern. With this change, the bucket name is always left unchanged.
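
The bucket protection described in the last item can be sketched as follows: split the output directory into bucket and object path, then substitute date patterns in the path only. This is a minimal Python illustration, not the template's Java implementation. The function name and the token set (YYYY, MM, DD, HH, mm) are assumptions made for the example.

```python
from datetime import datetime, timezone

# Assumed placeholder tokens for this sketch; the template's real set may differ.
DATE_TOKENS = {
    "YYYY": "%Y",
    "MM": "%m",
    "DD": "%d",
    "HH": "%H",
    "mm": "%M",
}


def resolve_output_directory(output_directory: str, window_end: datetime) -> str:
    """Replace date patterns in the object path while leaving the bucket untouched."""
    scheme = "gs://"
    if not output_directory.startswith(scheme):
        raise ValueError("expected a gs:// path")
    # Everything up to the first "/" after the scheme is the bucket name.
    bucket, sep, path = output_directory[len(scheme):].partition("/")
    for token, fmt in DATE_TOKENS.items():
        path = path.replace(token, window_end.strftime(fmt))
    return f"{scheme}{bucket}{sep}{path}"
```

With a window ending on 2022-09-13, `gs://logs-YYYY-MM/out/YYYY/MM/DD/` resolves to `gs://logs-YYYY-MM/out/2022/09/13/`: the pattern inside the bucket name is preserved.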

Bug Fixes

  • Temporary workarounds for SpannerIO: LocalSpannerIO and LocalReadSpannerSchema classes now cast JSONB records as VARCHAR.

Minor changes

  • Removed the explicit Bigtable client version from the pom.xml files to use the transitive version from Beam. This keeps the client library up to date and matches the version expected by Beam.
  • Updated maven-dependency-plugin version.

Contributors

@pranavbhandari24
@oleg-semenov
@Deep1998
@shubhamswe
@bvolpato

Full Changelog: 2022-09-05-00_RC00...2022-09-13-00_RC00

Dataflow Templates 2022-09-19-00_RC00

19 Sep 13:21

Release Week of 2022-09-19

Improvements

  • Updated to Beam 2.41
  • [Pub/Sub] Added framework for integration tests
  • WindowedFilenamePolicy bucket protections
  • BigQueryToParquet template supports row restrictions
  • Updated maven dependency plugin

Contributors

@oleg-semenov
@pranavbhandari24
@bvolpato

Dataflow Templates 2022-09-05-00_RC00

06 Sep 15:02

Release Week of 2022-09-05

Improvements

[Pub/Sub Proto to BigQuery] Improve documentation to give the correct include_imports flag
[DataStream To Spanner] Add transformations supported by HarbourBridge
[Spanner Change Streams to BigQuery] Remove unnecessary logging statements

Contributors

@zhoufek
@bvolpato
Deep Chowdhury

Dataflow Templates 2022-08-29-00_RC00

30 Aug 18:20

Release Week of 2022-08-29

Improvements

[Spanner Change Stream] Allow setting autoscaling parameters
[Pub/Sub to Cloud Storage] Allow configuration of windowDuration parameter

Bug Fixes

[Spanner Change Stream to BigQuery] Fix parameter name from spannerRpcAuthority to rpcAuthority
[Spanner] Use Spanner version 6.23.3 to fix null pointer exceptions
[CDC] Better error handling when merge info can't be fetched from BigQuery
[BigQuery] Replace hyphen (-) characters with underscores (_) in BigQuery dataset names
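
BigQuery dataset IDs may contain only letters, digits, and underscores, so a hyphen in a derived dataset name is invalid. The replacement described in the last item can be sketched as below. The function name is hypothetical, and the template may perform the substitution differently.

```python
def sanitize_dataset_name(name: str) -> str:
    """Replace hyphens with underscores so the name is a valid dataset ID.

    Only the hyphen case named in the release note is handled; other
    invalid characters are left as they are.
    """
    return name.replace("-", "_")
```

For example, `my-cdc-dataset` becomes `my_cdc_dataset`.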

Contributors

@bvolpato
@marengaz
Internal contributors

Dataflow Templates 2022-08-15-00_RC02

16 Aug 14:15

Release Week of 2022-08-15

New Templates

N/A

Improvements

[ElasticsearchIO] ElasticsearchIO template improvements
[MongoDB] MongoDB batch template - Removed bucketing to avoid aggregation at read time
[Spanner] Created Spanner Resource Manager

Bug Fixes

[Spanner to BigQuery] Debug template options

Contributors

Mark Pevec
Venkatesh Shanbhag
ike-albert

Dataflow Templates 2022-08-08-00_RC01

10 Aug 18:50

Release Week of 2022-08-08

Improvements

[Dataplex Templates] Minor changes in the updateDataplexMetadata parameter description.

Bug Fixes

Fixed an issue with null error messages in BulkDecompressor (#433).

Contributors

@zhoufek
@an2x

Dataflow Templates 2022-08-01-00_RC04

05 Aug 14:16

Release Week of 2022-08-01 (second candidate)

Bug Fixes

[MongoDB] Fix classpath in spec files

Contributors

@an2x