Speed up Parquet utf8 validation #6667

Open
Dandandan opened this issue Oct 31, 2024 · 1 comment · May be fixed by #6668
Labels
enhancement: Any new improvement worthy of an entry in the changelog

Comments

@Dandandan
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
UTF-8 validation shows up in profiles when reading Parquet.

Describe the solution you'd like

We could use simdutf8 (https://docs.rs/simdutf8/latest/simdutf8/) to speed up UTF-8 validation.
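
As a rough sketch of where such a change could slot in, the snippet below wraps validation in a single helper. The name `validate_utf8` and the error message are hypothetical and not taken from the Parquet reader; the body uses only the standard library so it compiles without extra dependencies. Assuming `simdutf8` is added as a dependency, the `std::str::from_utf8` call could be swapped for `simdutf8::compat::from_utf8`, which is meant as a drop-in replacement and also reports `valid_up_to()`.

```rust
/// Hypothetical helper: checks that a byte buffer (for example, string data
/// decoded from a Parquet page) is valid UTF-8 and returns it as `&str`.
///
/// Assumption: with the `simdutf8` crate available, the call to
/// `std::str::from_utf8` below could be replaced by
/// `simdutf8::compat::from_utf8(bytes)` without changing the rest.
fn validate_utf8(bytes: &[u8]) -> Result<&str, String> {
    std::str::from_utf8(bytes).map_err(|e| {
        // `valid_up_to` is the length of the longest valid prefix.
        format!("invalid UTF-8 starting at byte offset {}", e.valid_up_to())
    })
}
```

Keeping validation behind one function like this would make it easy to benchmark the two implementations against each other before committing to the new dependency.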

Describe alternatives you've considered

Additional context

Dandandan added the enhancement label (Any new improvement worthy of an entry in the changelog) Oct 31, 2024
Dandandan linked a pull request Oct 31, 2024 that will close this issue
@alamb
Contributor

alamb commented Nov 7, 2024

Here is another idea: #6701
