Brand new version of skweak, including both a number of bug fixes and some new functionalities:
- skweak now uses the latest version of hmmlearn, which fixes a number of errors caused by a mismatch between method names.
- We now have a clearer split between aggregation models for sequence labelling and for text classification. Possible aggregators for sequence labelling are SequentialMajorityVoter and HMM (preferred), while the aggregators for non-sequential text classification are MajorityVoter and NaiveBayes.
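To make the difference between the two families concrete, here is a minimal, library-independent sketch of majority voting at the document level versus the token level. It does not use skweak itself: the function names (`majority_vote`, `aggregate_document`, `aggregate_sequence`) and the list-based input format are assumptions made for illustration, and the real SequentialMajorityVoter and MajorityVoter classes work on annotated documents rather than plain lists.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent vote, ignoring abstentions (None).

    Ties are broken alphabetically so that the result is deterministic.
    Returns None if every labelling function abstained.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    # sorted() fixes the iteration order; max() keeps the first maximum.
    return max(sorted(counts), key=counts.get)


def aggregate_document(lf_outputs):
    """Text classification: each labelling function casts one vote
    for the whole document."""
    return majority_vote(lf_outputs)


def aggregate_sequence(lf_outputs):
    """Sequence labelling: each labelling function emits one label per
    token, and the vote is taken independently at every position."""
    return [majority_vote(token_votes) for token_votes in zip(*lf_outputs)]


# Three hypothetical labelling functions voting on one document.
doc_label = aggregate_document(["POS", "NEG", "POS"])

# The same three functions labelling a three-token sentence.
token_labels = aggregate_sequence([
    ["PER", "O", "LOC"],
    ["PER", "O", "O"],
    ["O", "O", "LOC"],
])
```

A sequential model such as an HMM goes further than this per-token vote by also modelling transitions between neighbouring labels, which is presumably why it is the preferred choice for sequence labelling.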
We also introduce a brand new functionality: multi-label classification! Aggregation no longer has to assume that labels are mutually exclusive: you can now combine the results of labelling functions when several labels may be correct at once. This multi-label scheme is available for both sequence labelling (see MultilabelSequentialMajorityVoter and MultilabelHMM) and text classification (see MultilabelMajorityVoter and MultilabelNaiveBayes).
By default, all labels can be simultaneously true for a given data point, but you can enforce exclusivity relations between labels through the method set_exclusive_labels. If all labels are set to be mutually exclusive, the aggregation is equivalent to a standard multi-class setup. Internally, this functionality is implemented by constructing and fitting a separate aggregation model for each label.
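The per-label decomposition can be illustrated with a small, self-contained sketch. It is not skweak's implementation, and it does not reproduce the signature of set_exclusive_labels: the function `multilabel_vote`, its arguments, and the rule that a vote for an exclusive label counts as evidence against the current one are all assumptions chosen to show the idea with a simple voting scheme.

```python
def multilabel_vote(votes, labels, exclusive=frozenset()):
    """Decide each label separately with a binary majority vote.

    votes:     one set of labels per labelling function (empty set = abstain)
    labels:    all labels under consideration
    exclusive: frozenset pairs {a, b} of labels that cannot both be true
               (hypothetical representation of exclusivity relations)
    """
    accepted = set()
    for label in labels:
        pro = con = 0
        for vote in votes:
            if label in vote:
                pro += 1
            elif any(frozenset((label, other)) in exclusive for other in vote):
                # A vote for an exclusive label counts against this one.
                con += 1
            # Votes for compatible labels say nothing about this label.
        if pro > con:
            accepted.add(label)
    return accepted


labels = ["sports", "politics", "urgent"]
votes = [{"sports", "urgent"}, {"sports"}, {"politics"}]

# Default: every label can be true at the same time.
independent = multilabel_vote(votes, labels)

# "sports" and "politics" declared mutually exclusive.
constrained = multilabel_vote(
    votes, labels, exclusive={frozenset(("sports", "politics"))}
)
```

With no exclusivity relations, every label that receives at least one vote is accepted. Once "sports" and "politics" are marked as exclusive, the single vote for "politics" is outweighed by the two votes for "sports", which mirrors how a fully exclusive label set reduces to an ordinary multi-class decision.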
The code for the aggregation models has also been heavily refactored, which should make it easier to create new aggregation models.