Description
When the source of a record stream has a natural key ordering, it'd be nice to optionally retain that order. Both recs-fromcsv and recs-fromsplit support the --header option, which could preserve the field order as found in the first line of the input. With the natural order retained, stream output operations can prefer it when no specific fields are given, e.g. a bare ... | recs-tocsv or ... | recs-totable could use it, but ... | recs-tocsv -k foo,bar wouldn't. This feature would make general filtering of data sets easier, since you'd no longer need to track, outside the pipeline, which fields you got at input to ensure they're output again.
Ideally it'd be general enough to be applied to any input operation and also be possible to add to records ad-hoc via recs-xform or similar operations for use later in the pipeline.
Since there's no stream-level metadata, we're limited to stashing this ordering information on each record, perhaps under a key like __field_order or __fields. Output operations can examine the first record for the stashed order. It's not the prettiest solution technically, so I'd love it if someone had a better idea.
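To make the idea concrete, here's a rough sketch in Python rather than the actual recs code. The names from_csv, to_csv, and ORDER_KEY are made up for illustration, and the sorted-key fallback is only an assumption about what a bare output operation would do when no order is stashed.

```python
import csv
import io

# Hypothetical key name; "__fields" is one of the two candidates floated above.
ORDER_KEY = "__fields"


def from_csv(text):
    """Yield one dict per data row, stashing the header order on each record."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    for row in rows:
        record = dict(zip(header, row))
        record[ORDER_KEY] = list(header)
        yield record


def to_csv(records, keys=None):
    """Render records as CSV; explicitly requested keys win over the stashed order."""
    records = list(records)
    if not records:
        return ""
    if keys is None:
        first = records[0]
        # Only the first record is examined. With no stashed order, fall back
        # to sorted keys (an assumption made for this sketch).
        keys = first.get(ORDER_KEY) or sorted(k for k in first if k != ORDER_KEY)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(keys)
    for record in records:
        writer.writerow([record.get(k, "") for k in keys])
    return out.getvalue()
```

Here a bare to_csv(from_csv(text)) reproduces the input column order, while to_csv(records, keys=["alpha", "zeta"]) ignores the stashed order, mirroring the -k foo,bar case. The obvious cost is that every record carries a copy of the field list.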
Does this seem reasonable?