Description
Willingness to contribute
No. I cannot contribute at this time.
Feature Request Proposal
The Da Vinci Record Transformer (DVRT) is an API that allows users to hook into the lifecycle of the Da Vinci Client.
This feature was built by the open-source community to give more leverage to our users. To showcase the power of DVRT, we would like to build integrations with it that can be used by others or serve as examples of what can be done with it.
One existing example is the DuckDB integration. It allows users to run SQL OLAP queries against Venice datasets, which was not previously possible due to Venice's key-value access pattern.
Motivation
Since this is a brand new API that is directly exposed to users, we would like to showcase what is possible with it to encourage our users to onboard to it. Doing this exercise also helps identify any gaps in the DVRT abstract class.
Details
Integrations will be built on top of the DaVinciRecordTransformer abstract class. Please use the DuckDB integration as a frame of reference when developing.
The existing integration pairs Venice, a key-value database, with DuckDB, a SQL OLAP database. We would like further integrations with other types of databases, for example graph databases or search engines. Please keep performance and community usage in mind when selecting a database to integrate with.
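To make the shape of such an integration concrete, here is a minimal sketch of a search-engine-style integration. It deliberately does not use the real DaVinciRecordTransformer signatures: `RecordLifecycleHook`, `InvertedIndexHook`, `processPut`, `processDelete`, and `search` are hypothetical names chosen for illustration, and the "search engine" is only an in-memory inverted index. The assumption is that an integration receives a callback for every record written or deleted and mirrors that change into the target system.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a record-lifecycle hook.
// This is NOT the real DaVinciRecordTransformer API; the method names are assumptions.
abstract class RecordLifecycleHook<K, V> {
    abstract void processPut(K key, V value);
    abstract void processDelete(K key);
}

// Toy "search engine" integration: keeps an in-memory inverted index
// (token -> keys) in sync with record puts and deletes.
class InvertedIndexHook extends RecordLifecycleHook<String, String> {
    private final Map<String, Set<String>> keysByToken = new HashMap<>();
    private final Map<String, String> lastValueByKey = new HashMap<>();

    @Override
    void processPut(String key, String value) {
        // Remove tokens of any previous value so the index does not go stale.
        processDelete(key);
        lastValueByKey.put(key, value);
        for (String token : tokenize(value)) {
            keysByToken.computeIfAbsent(token, t -> new HashSet<>()).add(key);
        }
    }

    @Override
    void processDelete(String key) {
        String oldValue = lastValueByKey.remove(key);
        if (oldValue == null) {
            return; // nothing was indexed for this key
        }
        for (String token : tokenize(oldValue)) {
            Set<String> keys = keysByToken.get(token);
            if (keys != null) {
                keys.remove(key);
                if (keys.isEmpty()) {
                    keysByToken.remove(token);
                }
            }
        }
    }

    // Returns a copy of the set of keys whose current value contains the token.
    Set<String> search(String token) {
        Set<String> keys = keysByToken.getOrDefault(token.toLowerCase(), Collections.emptySet());
        return new HashSet<>(keys);
    }

    // Lowercases and splits on whitespace; an empty value yields no tokens.
    private static String[] tokenize(String text) {
        String normalized = text.trim().toLowerCase();
        return normalized.isEmpty() ? new String[0] : normalized.split("\\s+");
    }
}
```

A real integration would replace the two maps with calls into the chosen database and would need to handle serialization, schema mapping, and restarts, none of which this sketch attempts.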
What component(s) does this affect?
- Controller: This is the control-plane for Venice. Used to create/update/query stores and their metadata.
- Router: This is the stateless query-routing layer for serving read requests.
- Server: This is the component that persists all the store data.
- VenicePushJob: This is the component that pushes derived data from Hadoop to Venice backend.
- VenicePulsarSink: This is a Sink connector for Apache Pulsar that pushes data from Pulsar into Venice.
- Thin Client: This is a stateless client users use to query Venice Router for reading store data.
- Fast Client: This is a stateful client users use to query Venice Server for reading store data.
- Da Vinci Client: This is an embedded, stateful client that materializes store data locally.
- Samza: This is the library users use to make nearline updates to store data.
- Admin Tool: This is the stand-alone client used for ad-hoc operations on Venice.