Add pivot function to handle modeledyears files #27
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #27      +/-   ##
==========================================
- Coverage   97.94%   97.76%   -0.19%
==========================================
  Files          19       19
  Lines        1024     1029       +5
==========================================
+ Hits         1003     1006       +3
- Misses         21       23       +2
Pull Request Overview
This pull request adds pivot functionality for modeledyears files by introducing a new transformation step in the tabular data processing pipeline.
- Added pl_pivot_on function to perform DataFrame pivoting based on configuration
- Integrated the pivot function into the existing transformation pipeline
- The function unpivots all columns and selects only the specified pivot column
src/r2x_core/processors.py
Outdated
    if not data_file.pivot_on:
        return df

    all_columns = df.collect_schema().names()
Copilot AI, Oct 13, 2025
Calling collect_schema() forces schema collection, which can be expensive for large lazy frames. Consider using df.schema.names() or df.columns if available in the Polars version being used.
- all_columns = df.collect_schema().names()
+ all_columns = df.columns
Copilot uses AI. Check for mistakes.
@mcllerena, can you add notes in NumPy style on why we do this? Also, I think the pipeline docstring describes the process; can you update that docstring as well?