SparseSpacyFeaturizer

If you look at [all the attributes](https://spacy.io/api/token#attributes) that spaCy generates for its tokens, you can see that several of them could be useful features in machine learning pipelines. To name a few:

- `is_oov`: is the token out of vocabulary, i.e. does it lack a vector?
- `is_stop`: is the token a stop word?
- `lemma_`: what is the lemma of the token?
- `pos`/`tag`: coarse/fine-grained part-of-speech information
- [morphological features](https://spacy.io/usage/linguistic-features#morphology)
- [grammatical dependency](https://spacy.io/usage/linguistic-features#navigating)

Each of these has a discrete representation, so they could be encoded as sparse features and added to a Rasa pipeline.
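As a rough illustration of what "discrete representation" could mean here, the sketch below maps the attributes listed above to feature names and then to sparse column indices. It is not a Rasa component: the helper names `token_features`, `build_index` and `encode` are hypothetical, and the stand-in token is a plain `SimpleNamespace` so the snippet runs without a spaCy model. The attribute names it reads (`is_oov`, `is_stop`, `lemma_`, `pos_`, `tag_`, `dep_`, `morph`) are the ones spaCy exposes on a `Token`, so a real token should work in its place.

```python
from types import SimpleNamespace


def token_features(token):
    """Map one token to a set of discrete feature names (hypothetical helper)."""
    feats = {
        f"lemma={token.lemma_}",
        f"pos={token.pos_}",
        f"tag={token.tag_}",
        f"dep={token.dep_}",
    }
    # Boolean flags only produce a feature when they are set.
    if token.is_oov:
        feats.add("is_oov")
    if token.is_stop:
        feats.add("is_stop")
    # In spaCy v3, str(token.morph) looks like "Number=Sing|Person=3".
    morph = str(token.morph)
    if morph:
        feats.update(f"morph:{m}" for m in morph.split("|"))
    return feats


def build_index(tokens):
    """Assign a column index to every feature name seen in the tokens."""
    names = sorted(set().union(*(token_features(t) for t in tokens)))
    return {name: i for i, name in enumerate(names)}


def encode(token, index):
    """Return the sorted column indices that are active for this token."""
    return sorted(index[f] for f in token_features(token) if f in index)


# Stand-in for a spaCy token; the values are made up for illustration.
demo = SimpleNamespace(
    lemma_="be", pos_="AUX", tag_="VBZ", dep_="ROOT",
    is_oov=False, is_stop=True, morph="Number=Sing|Person=3",
)
index = build_index([demo])
row = encode(demo, index)
```

The list of active indices in `row` is one sparse row; stacking such rows per token (or summing them per message) would give the kind of matrix a sparse featurizer could emit. Features unseen when the index was built are silently dropped, which is one possible design choice among several.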
