Miscellaneous backlog tickets for Rivulet workstream (multimodal dataset format)
No due date•1/3 issues closed
Create DeltaCAT V2 APIs, including (1) a native DeltaCAT Catalog implementation, (2) a native DeltaCAT CLI and corresponding Linux-FS-like APIs, and (3) Ray/Daft data source/sink adapters (to enable local and distributed reads and writes of DeltaCAT catalogs). The DeltaCAT Catalog implementation should also include all capabilities in `deltacat/storage/rivulet/dataset.py` (mostly at the table version level), including:
1. Manage (multiple) schemas on a dataset
2. Import data (e.g. from_csv)
3. Export data (e.g. to webdataset)
4. Read and write methods (currently, the DeltaCAT catalog has somewhat different read/write methods from Rivulet)
No due date•6/8 issues closed
Use the DeltaCAT Metastore format in Rivulet. Be able to express Rivulet concepts (e.g. multiple schemas) in the DeltaCAT Metastore. Clean up internal classes in Rivulet that will no longer be needed.
No due date•1/7 issues closed
Implement all required DeltaCAT storage APIs and make any changes required to integrate LSM-based CDC on Ray with Iceberg! Proposal doc: https://docs.google.com/document/d/1kyyJp4masbd1FrIKUHF1ED_z1hTARL8bNoKCgb7fhSQ/edit.
No due date•9/14 issues closed