-
Notifications
You must be signed in to change notification settings - Fork 20
Open
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers
Description
Sometimes it's useful to run an EL pipeline with zero rows in order to create the target tables but not yet load data into them.
In the SDK for taps, we recently added --test=schema
to emit only the SCHEMA messages without emitting any RECORD messages.
For non-SDK taps, it would be helpful to have a similar option that excludes all rows or all rows past the first n
records.
Note:
- For safety, we should probably also not pass along any
STATE
messages when running in this mode. Since we'd be dropping records intentionally, we would not want to pass a bookmark that implied records had been written which were not actually passed. - To the extent that the tap is still having to process records, there's still a performance hit from having to read all the records from source. However, most pipelines are constrained on target performance, so by skipping the write process, there should still be significant gains for many/most use cases.
teej, edgarrmondragon and simonpai
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomers