Description
We are going to have a number of functions that work on `self.data` and require specific columns or row entries of the `data` dataframe.
As long as we only have functions that operate on `Technologies`, this is not a problem.
I'm thinking that we could have a decorator to check whether the required columns (or row entries) exist and warn/fail if they don't.
This would solve two challenges that I anticipate, and it has a documentation benefit:
- If we have other types of objects that expose `self.data` with transformations, we can test that the `data` passed to the function is compatible with the function
- If someone wants to use our methods on their own data or starts modifying `self.data`, then our functions may fail if the required columns are not present
- It helps us to document and make clearer which function requires which columns/fields.
Example
E.g.
- the decorator could be called `@requires_columns`
- we wrap `adjust_scale(...)` as

      @requires_columns(on_dataframe=self.data, columns=["value", "unit", "scale", "scale_unit"])
      def adjust_scale(...)

- the decorator then checks, before `adjust_scale(...)` is called, whether the `columns` exist in the dataframe `on_dataframe`
  - Fail if they don't exist
  - Pass to the function and execute normally if they exist
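
A minimal sketch of how such a decorator might look, not a settled design. One caveat with the call shown above: decorator arguments are evaluated when the class body is defined, so `self` does not exist yet and `on_dataframe=self.data` cannot be passed directly. The sketch therefore looks the dataframe up by attribute name at call time. The `attr` and `on_missing` parameters and the `required_columns` attribute are assumptions made for illustration, and the `Technologies` class here is only a stand-in with a placeholder method body.

```python
import functools
import warnings


def requires_columns(columns, attr="data", on_missing="raise"):
    """Check that getattr(self, attr) has all `columns` before calling the method.

    `attr` and `on_missing` are hypothetical parameters for this sketch.
    """
    required = list(columns)

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Resolve the dataframe at call time, when `self` is available.
            frame = getattr(self, attr)
            missing = [col for col in required if col not in frame.columns]
            if missing:
                message = f"{method.__name__} requires missing columns: {missing}"
                if on_missing == "raise":
                    raise KeyError(message)
                # Any other setting only warns and still calls the method.
                warnings.warn(message)
            return method(self, *args, **kwargs)

        # Expose the requirement so it can be used in docs or tests.
        wrapper.required_columns = tuple(required)
        return wrapper

    return decorator


class Technologies:
    """Stand-in for the real class; only `self.data` matters here."""

    def __init__(self, data):
        self.data = data

    @requires_columns(["value", "unit", "scale", "scale_unit"])
    def adjust_scale(self, new_scale):
        # Placeholder body; the real scaling logic is not part of this sketch.
        return self.data
```

With this shape, calling `adjust_scale` on an object whose `data` lacks `scale` or `scale_unit` raises a `KeyError` naming the missing columns, and `Technologies.adjust_scale.required_columns` documents the requirement.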
Notes, ideas, potential issues
- I mentioned row entries because I think this could also be implemented with a similar decorator for rows. E.g. every combination of [source, technology, year, region] would need one entry with a certain [parameter] for the method to work properly. Technologies / row groups that do not fulfill this requirement would be excluded from the function automatically. But let's keep it simple for now and not implement this.
- This could already be useful for e.g. the currency conversion functions and the archiving functions?
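
For reference only, the row-entry idea from the first note could be sketched with pandas `groupby(...).filter(...)`. This assumes the dataframe has columns literally named `source`, `technology`, `year`, `region` and `parameter`; the function name `groups_with_parameters` is hypothetical.

```python
import pandas as pd


def groups_with_parameters(frame, required,
                           keys=("source", "technology", "year", "region")):
    """Keep only the row groups that contain every required parameter.

    Hypothetical helper: groups rows by `keys` and drops any group whose
    `parameter` column does not cover all entries in `required`.
    """
    required = set(required)
    return frame.groupby(list(keys)).filter(
        lambda group: required.issubset(group["parameter"])
    )


example = pd.DataFrame(
    {
        "source": ["A", "A", "A"],
        "technology": ["solar", "solar", "wind"],
        "year": [2020, 2020, 2020],
        "region": ["EU", "EU", "EU"],
        "parameter": ["capex", "lifetime", "capex"],
        "value": [1.0, 25.0, 2.0],
    }
)

# The wind group has no "lifetime" entry, so it is excluded.
complete = groups_with_parameters(example, ["capex", "lifetime"])
```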
Happy to hear your thoughts @finozzifa