Open
Description
If you look at all the attributes that spaCy generates for its tokens, you can imagine that some of them would be useful as features in machine learning pipelines. To name a few:
- is_oov: is the token out of vocabulary, i.e. does it lack a vector?
- is_stop: is the token a stopword?
- lemma_: what is the lemma of the token?
- pos/tag: coarse/fine-grained part-of-speech information
- morphological features
- grammatical dependency
These can all have a discrete representation and could be added in general to a Rasa pipeline.
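To make "discrete representation" concrete, here is a minimal sketch of how such attributes could be turned into a sparse feature dict per token. The function name `token_features`, the feature-key format, and the choice of attributes are all hypothetical; only the attribute names (`is_oov`, `is_stop`, `lemma_`, `pos_`, `tag_`, `dep_`, `morph`) come from spaCy's token API. The example uses a stand-in object instead of a real spaCy token so it runs without a model, and it does not show any Rasa integration.

```python
from types import SimpleNamespace

# Attribute names follow spaCy's Token API; which ones to include is an assumption.
BOOLEAN_ATTRS = ("is_oov", "is_stop")
CATEGORICAL_ATTRS = ("lemma_", "pos_", "tag_", "dep_")


def token_features(token):
    """Map one token-like object to a sparse dict of discrete features (hypothetical helper)."""
    features = {}
    # Boolean flags become 0/1 features that are always present.
    for name in BOOLEAN_ATTRS:
        features[name] = 1.0 if getattr(token, name, False) else 0.0
    # Categorical attributes become one-hot style "name=value" keys.
    for name in CATEGORICAL_ATTRS:
        value = getattr(token, name, "")
        if value:
            features[f"{name}={value}"] = 1.0
    # Morphology is read via str(), e.g. "Number=Plur|Person=3",
    # and split into one key per feature/value pair.
    morph = str(getattr(token, "morph", ""))
    for pair in filter(None, morph.split("|")):
        features[f"morph:{pair}"] = 1.0
    return features


# Stand-in token so the sketch runs without spaCy or a model installed.
example = SimpleNamespace(
    is_oov=False, is_stop=False, lemma_="dog", pos_="NOUN",
    tag_="NNS", dep_="nsubj", morph="Number=Plur",
)
example_features = token_features(example)
```

With spaCy installed, the same function should work on real tokens, for example `[token_features(t) for t in nlp("some text")]`. Dicts of this shape can be vectorized with something like scikit-learn's `DictVectorizer`; how best to expose them inside a Rasa pipeline is left open here.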