Brainstorming: API for meta-data/side-information on the features

This follows a discussion on Slack and @ablaom's suggestion to open an issue to brainstorm ideas.

**The problem**
Consider a machine learning model with p-dimensional features X. Now assume that for each feature j, the analyst has access to external information Z(j), and that machine learning models could use this information to improve predictive performance. A concrete example is a categorical Z(j), which induces a partition of the features into groups of related features.
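
To make the categorical case concrete, here is a minimal sketch of how a vector of labels Z(1), ..., Z(p) induces a partition of the feature indices. It is written in plain Python only for illustration; the function name `groups_from_labels` and the example labels are hypothetical and not part of MLJ or any existing package. Indices are 0-based here.

```python
from collections import defaultdict

def groups_from_labels(z):
    """Map each distinct label in z to the list of feature indices carrying it."""
    groups = defaultdict(list)
    for j, label in enumerate(z):
        groups[label].append(j)
    return dict(groups)

# Hypothetical side information for p = 5 features (one label per feature).
z = ["gene", "gene", "protein", "gene", "protein"]
groups = groups_from_labels(z)
print(groups)  # {'gene': [0, 1, 3], 'protein': [2, 4]}
```

Every feature index appears in exactly one group, which is what "partition" means here.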

How could the MLJ API account for that?


**Example use cases for grouping structure**

* Such information is frequently used in supervised methods with linear predictors, the most common being the (sparse) Group Lasso. This is also the use case I am interested in.
* It comes up less often in the supervised non-linear setting. One exception is https://link.springer.com/article/10.1186/s12859-017-1993-1, which studies Random Forests: the probability that a feature is selected into the candidate set for a split is modulated by the group it belongs to (so features of some groups are prioritized).
* Grouping structure comes up frequently in the unsupervised setting; e.g., https://www.embopress.org/doi/full/10.15252/msb.20178124 is a method popular in the computational biology community. In the machine learning community, such problems sometimes appear under the name multi-view learning (e.g., https://arxiv.org/pdf/1604.04939.pdf).
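
For the first use case, the sketch below shows how a model would consume the grouping: it evaluates the usual Group Lasso penalty, lam * sum over groups g of sqrt(p_g) * ||beta_g||_2, where p_g is the size of group g. This assumes the common sqrt(p_g) group weighting and leaves out the extra L1 term of the sparse variant. The function name, argument names, and numbers are hypothetical, and this is plain Python for illustration, not a proposal for the MLJ interface.

```python
import math

def group_lasso_penalty(beta, groups, lam=1.0):
    """Return lam * sum_g sqrt(p_g) * ||beta_g||_2 for a partition of feature indices."""
    total = 0.0
    for idx in groups.values():
        # Euclidean norm of the coefficients belonging to this group.
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        # Weight by sqrt(group size) so that larger groups are not favoured.
        total += math.sqrt(len(idx)) * norm
    return lam * total

# Hypothetical coefficients and grouping (0-based feature indices).
beta = [3.0, 4.0, 0.0, 0.0, 0.0]
groups = {"a": [0, 1], "b": [2, 3, 4]}
penalty = group_lasso_penalty(beta, groups)
print(penalty)
```

The point for the API discussion is that the penalty cannot be computed from X and y alone: the model needs the feature-to-group map as an additional input.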

