Description
Feature type
- Add new functionality
Story
As a pycytominer developer, I would like to have descriptive dataclasses to use when working on pycytominer's main functions. Instead of repeatedly passing necessary attributes such as "feature_cols" and "metadata_cols" from function to function, I could pass a single descriptive dataclass with all the information about the dataframe needed to operate on it. This would allow me to write more modular, more easily tested code.
General description of the proposed functionality
As a first step toward reducing the quantity of redundant code in pycytominer, it would be good to create a ProfilesData class. This class could contain methods that provide shared functionality used by all or most of the core pycytominer functions, such as how the data should be read from a file and how to determine which columns are features and which are metadata.
Example pseudo-code
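import pandas as pd

class ProfilesData:
    def __init__(self, profiles, feature_cols, meta_cols):
        # profiles: path to a CSV file containing the profiles
        self.profiles_df = pd.read_csv(profiles)
        self.feature_cols = feature_cols
        self.meta_cols = meta_cols
        self.features_df = self.profiles_df[feature_cols]

    def aggregate_data(self, aggregate_on):
        return self.profiles_df.groupby(aggregate_on)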
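A slightly fuller version of this idea, written as an actual dataclass, might look like the sketch below. It is only an illustration under stated assumptions: the from_csv constructor, the "Metadata_" prefix rule for detecting metadata columns, and the operation parameter are hypothetical choices made for this example, not existing pycytominer API.

```python
from dataclasses import dataclass
from typing import List

import pandas as pd


@dataclass(eq=False)  # eq=False avoids element-wise DataFrame comparison in __eq__
class ProfilesData:
    """Hypothetical container bundling a profiles dataframe with its column roles."""

    profiles_df: pd.DataFrame
    feature_cols: List[str]
    meta_cols: List[str]

    @classmethod
    def from_csv(cls, path_or_buffer, meta_prefix="Metadata_", feature_cols=None):
        """Read profiles from a CSV and split columns into metadata and features.

        Assumption: metadata columns share a common prefix; every other column
        is treated as a feature unless feature_cols is given explicitly.
        """
        df = pd.read_csv(path_or_buffer)
        meta_cols = [c for c in df.columns if c.startswith(meta_prefix)]
        if feature_cols is None:
            feature_cols = [c for c in df.columns if c not in meta_cols]
        return cls(df, list(feature_cols), meta_cols)

    @property
    def features_df(self) -> pd.DataFrame:
        """View of the feature columns only."""
        return self.profiles_df[self.feature_cols]

    def aggregate_data(self, aggregate_on, operation="median") -> pd.DataFrame:
        """Group by the given column(s) and aggregate the feature columns."""
        grouped = self.profiles_df.groupby(aggregate_on)[self.feature_cols]
        return grouped.agg(operation).reset_index()
```

With this shape, a caller would build the object once, for example with ProfilesData.from_csv("profiles.csv"), and pass it around instead of threading feature_cols and meta_cols through every function call.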
Additional information
This class should initially be provided as separate functionality, but could gradually be integrated into the core functions (aggregate, normalize, annotate, etc.). Ideally, this could be accomplished without changing the behavior that users expect from those functions.
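One way the gradual integration could work is for a core function to accept either a plain dataframe (the current calling style) or the new class. The sketch below is only a possibility: aggregate_profiles and its parameters are hypothetical stand-ins written for this example and do not reproduce the signature of pycytominer's real aggregate function.

```python
from dataclasses import dataclass
from typing import List

import pandas as pd


@dataclass(eq=False)
class ProfilesData:
    """Minimal hypothetical container, repeated here so the example runs alone."""

    profiles_df: pd.DataFrame
    feature_cols: List[str]


def aggregate_profiles(profiles, strata, features=None, operation="median"):
    """Hypothetical core function that accepts a DataFrame or a ProfilesData."""
    if isinstance(profiles, ProfilesData):
        # New style: column roles travel with the data.
        df = profiles.profiles_df
        if features is None:
            features = profiles.feature_cols
    else:
        # Existing style: the caller passes a dataframe and names the features.
        df = profiles
        if features is None:
            raise ValueError("features is required when passing a plain DataFrame")
    return df.groupby(strata)[list(features)].agg(operation).reset_index()
```

Because the existing dataframe path is untouched, current callers would see the same results, while new code could pass a ProfilesData object and omit the features argument.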