Create an overview of cleaning taggers #207

@KennethEnevoldsen

Agreed with @peterbjorgensen that it would be a great idea to create an overview of which taggers might be relevant for cleaning.

Outlining

  • Create a .md table with relevant taggers + a short description of each
  • Check which filters existing cleaning strategies used and at least try to match them (see here)
  • Potentially add a speed estimate per tagger (time to process the Wikipedia section of Danish Gigaword, ~55M tokens)
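
A rough sketch of how the speed estimate and the .md table could be produced together. Everything here is an assumption for illustration: the tagger signature (`str -> dict`), the two toy taggers, and the helper names `benchmark` and `to_markdown_table` are hypothetical and not taken from any existing tagger library. Throughput is measured in whitespace tokens per second, which would then need to be scaled to the ~55M tokens mentioned above.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical tagger signature (assumption): text in, dict of attributes out.
Tagger = Callable[[str], Dict[str, float]]


def char_length_tagger(text: str) -> Dict[str, float]:
    """Toy stand-in tagger: number of characters in the document."""
    return {"n_chars": float(len(text))}


def whitespace_token_tagger(text: str) -> Dict[str, float]:
    """Toy stand-in tagger: number of whitespace-separated tokens."""
    return {"n_tokens": float(len(text.split()))}


def benchmark(tagger: Tagger, docs: List[str]) -> float:
    """Return throughput of `tagger` in whitespace tokens per second."""
    n_tokens = sum(len(doc.split()) for doc in docs)
    start = time.perf_counter()
    for doc in docs:
        tagger(doc)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return n_tokens / elapsed if elapsed > 0 else float("inf")


def to_markdown_table(rows: List[Tuple[str, str, float]]) -> str:
    """Render (name, description, tokens/s) rows as a markdown table."""
    lines = ["| Tagger | Description | Tokens/s |", "| --- | --- | --- |"]
    for name, description, tokens_per_s in rows:
        lines.append(f"| {name} | {description} | {tokens_per_s:,.0f} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder documents; a real run would stream the Wikipedia section.
    sample_docs = ["Dette er en lille test."] * 10_000
    taggers = [
        ("char_length", "Character count per document", char_length_tagger),
        ("whitespace_tokens", "Whitespace token count", whitespace_token_tagger),
    ]
    rows = [(name, desc, benchmark(fn, sample_docs)) for name, desc, fn in taggers]
    print(to_markdown_table(rows))
```

Dividing 55M by the measured tokens/s gives an approximate wall-clock estimate for the full section, assuming throughput stays roughly constant across documents.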
