Add fenic-datasets integration #1936
Conversation
Just made the PR for the image too: https://huggingface.co/datasets/huggingface/documentation-images/discussions/548
I'm just discovering fenic, the API looks great :) btw, is there a way to use Hugging Face Inference Providers for the semantic/generative operations? It's a unified API for many providers serving models on HF; you can find more info at https://huggingface.co/docs/inference-providers/en/index
Thank you so much for the kind words. It's great to hear that the API looks great! Regarding HF Inference Providers, we don't currently support them, but we will definitely add support, along with writing back to Datasets. Right now we only support reading from HF Datasets, but the goal is full support. The functionality HF Datasets offers is really important for the experience we want to deliver and the features we are working on, e.g. hydrating MCP servers with precomputed datasets stored on HF. For an example of that, check this: https://huggingface.co/datasets/typedef-ai/fenic-0.4.0-codebase
Looks very nice! Added a few small questions/suggestions.
I tried to address all the questions/suggestions; check my latest commits as well. Let me know if there's anything else I can do to make this better. I really appreciate the time you've spent on this!
thanks for working on it! I'll see if @lhoestq has any final comments, but otherwise I think we can merge (we can update later with Inference Providers examples!)
This looks good! The write-to-HF operation is not available because it's not yet implemented in DuckDB, right? (link to duckdb issue here)
Once this is merged/deployed, you should share it online with the community :) Many people will like the API, which is clean and convenient. Feel free to let us know here about your posts so we can amplify and re-share!
Also looking forward to seeing the integration with HF Inference Providers. Have you already started looking into it, by any chance?
Yes, but we are most likely going to integrate the Datasets SDK directly into fenic. This will give us maximum flexibility for the things we want to do. As soon as the work is at a stage where it can be shared, I'll let you know so you can take a look.
100%! I'm also planning to write some content about this. I believe Datasets has tremendous potential, and I'd like to share some of the stuff we find really neat by combining a processing engine with the SDK and the infrastructure HF provides.
We haven't yet, but we will soon. I'll keep you posted! Thank you so much for all the support, everyone!
Hey everyone,
fenic now has native support for reading datasets directly from the Hugging Face Hub using the `hf://` protocol, documented at docs.fenic.ai. This PR adds the corresponding documentation to the Hugging Face docs.

Changes
- Adds `docs/hub/datasets-fenic.md`
- Updates `_toctree.yml`

Features documented
- Reading datasets via the `hf://` protocol

Image
I'll add the image in a separate PR; are there any specific instructions I should follow for this?
Happy to answer any questions and of course accommodate any changes required.
Thank you for the amazing work you've been doing with Hugging Face Datasets.
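For context on what the `hf://` protocol refers to: a path of the form `hf://datasets/<org>/<name>/<file>` corresponds to a plain HTTPS "resolve" URL on the Hub, which is how tools like DuckDB's httpfs and the `huggingface_hub` filesystem fetch files under the hood. The sketch below illustrates that mapping only; the helper name `resolve_hf_url` and the `revision="main"` default are assumptions for illustration, not part of fenic's or the Hub's API:

```python
# Illustrative sketch: how an hf:// dataset path maps to a Hub HTTPS URL.
# The function name `resolve_hf_url` is hypothetical; real tooling performs
# this resolution internally when you pass an hf:// path to a reader.

def resolve_hf_url(path: str, revision: str = "main") -> str:
    """Map hf://datasets/<org>/<name>/<file> to a resolve URL on the Hub."""
    prefix = "hf://datasets/"
    if not path.startswith(prefix):
        raise ValueError(f"not an hf:// dataset path: {path}")
    org, name, *parts = path[len(prefix):].split("/")
    if not parts:
        raise ValueError("expected a file path inside the dataset repo")
    file_path = "/".join(parts)
    return (
        f"https://huggingface.co/datasets/{org}/{name}"
        f"/resolve/{revision}/{file_path}"
    )

print(resolve_hf_url("hf://datasets/typedef-ai/fenic-0.4.0-codebase/README.md"))
```

One nice property of this scheme is that a revision (branch, tag, or commit hash) can be pinned in the resolve URL, which is what makes Hub-hosted datasets reproducible to read from an engine.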