Async support? #279

Open
ion-elgreco opened this issue Apr 30, 2024 · 4 comments
Labels
triage: Requires initial review (is duplicate, reproduce bug, severity or priority)

Comments

@ion-elgreco

What is the motivation and/or use case?

Is async supported in lakeFS-spec?

How can we implement this feature?

No response

@ion-elgreco added the triage label (Requires initial review: is duplicate, reproduce bug, severity or priority) on Apr 30, 2024
@AdrianoKF
Contributor

Thanks for raising this issue, @ion-elgreco!

Currently, lakeFS-spec has no async support, because the underlying lakeFS client does not offer it either (except for import operations, which are outside the scope of lakeFS-spec).
We might consider this feature in the future, depending on overall demand.

Could you elaborate a bit on your use case?

@ion-elgreco
Author

ion-elgreco commented May 2, 2024

@AdrianoKF I need to read a couple of hundred thousand text files and parse them with regex. Doing this asynchronously is much faster.

I have currently written it with s3fs and the lakeFS S3 gateway, and I am seeing a 15-20x performance increase versus sequential reads with lakefs_spec.
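
For illustration, the read-and-parse pattern described here could look roughly like the sketch below. It is not the actual code from this use case: `read_text` is a hypothetical stand-in that simulates network latency instead of calling s3fs or the lakeFS S3 gateway, and the regex, the path layout, and the `max_concurrency` value are made up.

```python
import asyncio
import re

# Hypothetical pattern; the real regex depends on the file contents.
PATTERN = re.compile(r"id=(\d+)")


async def read_text(path: str) -> str:
    """Hypothetical stand-in for a remote read (assumption, not a real client call)."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"id=42 source={path}"


async def parse_one(path: str, limit: asyncio.Semaphore):
    # The semaphore caps how many reads are in flight at once.
    async with limit:
        text = await read_text(path)
    match = PATTERN.search(text)
    return int(match.group(1)) if match else None


async def parse_all(paths, max_concurrency: int = 64):
    limit = asyncio.Semaphore(max_concurrency)
    # gather schedules all reads concurrently and preserves input order.
    return await asyncio.gather(*(parse_one(p, limit) for p in paths))


def run_demo(n: int = 200):
    # Made-up object paths, for illustration only.
    paths = [f"repo/main/logs/{i}.txt" for i in range(n)]
    return asyncio.run(parse_all(paths))
```

Because the waits overlap, total time is driven by latency and the concurrency cap rather than by the number of files, which is where a speedup over sequential reads would come from.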

@leonpawelzik
Contributor

Hey @ion-elgreco,

would you be willing to test out our WIP solution: #283?

@ion-elgreco
Author

@leonpawelzik I'm currently on holiday, and afterwards I won't be using lakeFS anytime soon.

It might be an idea to create a small benchmark that compares the async performance of s3fs and lakefs-spec.
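
A minimal sketch of what such a benchmark harness might look like, under stated assumptions: `sequential_reader` and `concurrent_reader` are hypothetical placeholders that only simulate latency, and a real comparison would swap in callables backed by s3fs and lakefs-spec. No actual measurements are implied.

```python
import asyncio
import time
from typing import Callable, Dict, Sequence


def benchmark(
    readers: Dict[str, Callable[[Sequence[str]], object]],
    paths: Sequence[str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each named reader."""
    results = {}
    for name, read in readers.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            read(paths)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def sequential_reader(paths):
    # Placeholder: one simulated blocking read per path.
    for _ in paths:
        time.sleep(0.005)


def concurrent_reader(paths):
    # Placeholder: all simulated reads overlap on one event loop.
    async def read_all():
        await asyncio.gather(*(asyncio.sleep(0.005) for _ in paths))

    asyncio.run(read_all())


def run_benchmark_demo(n: int = 20) -> Dict[str, float]:
    paths = [f"file-{i}.txt" for i in range(n)]
    readers = {"sequential": sequential_reader, "concurrent": concurrent_reader}
    return benchmark(readers, paths)
```

Taking the best of several repeats reduces noise from warm-up and transient network effects; a real benchmark would also want to vary file count and file size.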
