Format: Include small sample of the data to the HATS metadata #481

Open
3 tasks done
hombit opened this issue Mar 28, 2025 · 0 comments
Labels
enhancement New feature or request

hombit commented Mar 28, 2025

Feature request

It would be really helpful to include a small sample of the data in the catalog metadata. The main reasons are:

  1. Data exploration: users would get a better idea of the data schema and units, especially when column descriptions are missing.
  2. Technical tasks: for example, it would help with meta derivation for user-defined functions.

This data sample could be built in different ways, for example as a catalog "head" or as a single row from each partition. It could be stored in a new metadata file, such as sample.parquet, or it could replace the _common_metadata file.
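
To make the two use cases concrete, here is a minimal sketch of the "single row from each partition" option and of how such a sample could drive meta derivation. It uses plain pandas frames as stand-ins for catalog partitions. The names `sample_from_partitions`, `derive_meta`, and `add_flux`, as well as the `ra`/`dec`/`mag` columns, are hypothetical and not part of HATS; this is an illustration under those assumptions, not a proposed implementation.

```python
import pandas as pd


def sample_from_partitions(partitions, rows_per_partition=1):
    """Take the first few rows of each partition and stack them into one small frame."""
    heads = [part.head(rows_per_partition) for part in partitions]
    return pd.concat(heads, ignore_index=True)


def derive_meta(udf, sample):
    """Run a user-defined function on the sample and keep only the output schema.

    The result is an empty frame that carries the column names and dtypes
    the function would produce on real data.
    """
    result = udf(sample)
    return result.iloc[:0]


# Hypothetical partitions standing in for the catalog's data files.
partitions = [
    pd.DataFrame({"ra": [10.0, 10.5], "dec": [-5.0, -5.2], "mag": [18.1, 19.3]}),
    pd.DataFrame({"ra": [200.0], "dec": [45.0], "mag": [17.4]}),
]


def add_flux(df):
    """Example user-defined function: add a relative flux column."""
    out = df.copy()
    out["flux"] = 10 ** (-0.4 * out["mag"])
    return out


sample = sample_from_partitions(partitions)  # one row per partition
meta = derive_meta(add_flux, sample)         # empty frame with the output schema
```

In a real catalog the sample would presumably be written once at import time (for instance to the sample.parquet file suggested above) and read back whenever a schema-only dry run of a user function is needed.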

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.
Projects
Status: Suggested Todo