Hi @Jay-ju, thanks for writing this up. I think this approach makes sense. As a first step, could you share some sample workloads that would benefit from this? I'd like to set up some baseline benchmarks that we can work towards improving.
@kevinzwang I wrote up a simple design here; please take a look: https://docs.google.com/document/d/1dFTjB3DEkBFwEKuPJthuJ7QfIMLzIPfLGGcG1RkyfQ4/edit?tab=t.0
A demo implementation is available in #5081. It is not the final version and is not ready for review yet, but it roughly conveys the intended functionality.
Technical Proposal for Daft-Lance Integration
Current Implementation
Daft uses URL-based lazy processing to perform joins, aggregations, and complex operations (e.g., window functions) in multi-modal workflows. This paradigm defers downloading content until it is actually used, which offers significant flexibility.
Identified Storage-Side Challenges
Proposed Integration Approach
To harness the complementary strengths of Daft and Lance, we suggest two changes:
Replace URL-based download interfaces with row_id-based point queries via lance_take().
Substitute upload interfaces with Lance's atomic update() operations.
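The access pattern behind these two changes can be sketched roughly as follows. This is a toy, in-memory illustration only: `ToyBlobStore` and its `scan_metadata`, `take`, and `update` methods are hypothetical stand-ins written for this example, not Daft's or Lance's actual APIs, and the plain Python `update` here does not model real atomicity. The point is just that the plan carries row ids through the cheap metadata steps and fetches payloads by row id at the end, then writes results back to the same rows.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlobStore:
    """In-memory stand-in for a dataset addressed by row id (hypothetical, not Lance's API)."""

    rows: dict[int, dict] = field(default_factory=dict)
    payload_reads: int = 0  # counts how many large payloads were fetched

    def scan_metadata(self):
        """Yield (row_id, label) pairs without touching the payload column."""
        for row_id, row in self.rows.items():
            yield row_id, row["label"]

    def take(self, row_ids):
        """Point query: fetch payloads only for the requested row ids."""
        self.payload_reads += len(row_ids)
        return [self.rows[r]["payload"] for r in row_ids]

    def update(self, row_id, **values):
        """Overwrite columns of one row (stands in for an update on the storage side)."""
        self.rows[row_id] = {**self.rows[row_id], **values}


store = ToyBlobStore({
    0: {"label": "cat", "payload": b"cat-bytes"},
    1: {"label": "dog", "payload": b"dog-bytes"},
    2: {"label": "cat", "payload": b"cat2-bytes"},
})

# Filter on cheap metadata first, carrying only row ids through the plan.
cat_ids = [row_id for row_id, label in store.scan_metadata() if label == "cat"]

# Materialize payloads late, and only for rows that survived the filter.
payloads = store.take(cat_ids)

# Write a derived value back to the same row instead of uploading to a new URL.
store.update(cat_ids[0], payload=payloads[0].upper())
```

In this sketch only two of the three payloads are ever read, which is the same deferral that URL-based processing gives today, but keyed on row ids instead of URLs.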
Anticipated Advantages
@jaychia @kevinzwang @universalmind303 WDYT?