Skip to content

skweak v0.3.1

Latest
Compare
Choose a tag to compare
@plison plison released this 25 Mar 13:29
· 37 commits to main since this release

Brand new version of skweak, including both a number of bug fixes and some new functionalities:

  1. skweak is now using the latest version of hmmlearn, thereby fixing a number errors due to a mismatch between method names
  2. We now have a clearer split between aggregation models for sequence labelling and for text classification. Possible aggregators for sequence labelling are SequentialMajorityVoter and HMM (preferred), while the aggregators for non-sequential text classification are MajorityVoter and NaiveBayes.

We also introduce a brand new functionality: multi-label classification! Instead of assuming that all labels are mutually exclusive, you can now aggregate the results of labelling functions without assuming that only one label is correct. This multi-label scheme is available for both sequence labelling (see MultilabelSequentialMajorityVoter and MultilabelHMM) and text classification (see MultilabelMajorityVoter and MultilabelNaiveBayes).

By default, all labels can be simultaneously true for a given data point, but you can enforce exclusivity relations between labels through the method set_exclusive_labels. If all labels are set to be mutually exclusive, the aggregation is equivalent to a standard multi-class setup. Internally, this functionality is implemented by constructing and fitting separate aggregation models for each label.

The code for the aggregation models has also been heavily refactored, making it hopefully easier to create new aggregation models.