Open
Labels
good first issue (Good for newcomers)
Description
The current implementation of SimilarTermIndexGenerator is rather naive. It merges all equivalence classes in a connected component, where connectivity is determined by pairwise similarity between equivalence classes. Because similarity is not transitive, this approach can merge dissimilar equivalence classes: A may be similar to B and B similar to C without A being similar to C.
One improvement would be to pick some equivalence classes as strong seeds and merge each seed with all other equivalence classes that are similar to it. This could still merge dissimilar equivalence classes, but every merged class is guaranteed to at least satisfy the similarity threshold with the seed equivalence class.
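A minimal sketch of the seed-based idea, in Python for illustration only. It is not taken from the existing SimilarTermIndexGenerator code: the names `merge_by_seed` and `toy_similarity` are hypothetical, equivalence classes are stood in for by plain integers, and the seed is simply the first class not yet assigned (how to choose a "strong" seed is left open in this issue).

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def merge_by_seed(
    classes: Sequence[T],
    similarity: Callable[[T, T], float],
    threshold: float,
) -> List[List[T]]:
    """Group classes around seeds (hypothetical helper, not the project's API).

    Assumption: the seed is the first unassigned class in input order.
    Every member of a group meets the threshold with the group's seed,
    but members are not guaranteed to be similar to each other.
    """
    unassigned = list(range(len(classes)))
    groups: List[List[T]] = []
    while unassigned:
        seed = unassigned[0]
        group = [seed]
        rest = []
        for i in unassigned[1:]:
            # Compare against the seed only, never against other members,
            # so similarity is not chained transitively.
            if similarity(classes[seed], classes[i]) >= threshold:
                group.append(i)
            else:
                rest.append(i)
        groups.append([classes[i] for i in group])
        unassigned = rest
    return groups


def toy_similarity(a: int, b: int) -> int:
    # Toy measure for the example: closer integers are more similar.
    return 10 - abs(a - b)


# 0~4 and 4~8 pass the threshold of 5, but 0~8 does not.
# A connected-component merge would put all three in one group.
print(merge_by_seed([0, 4, 8], toy_similarity, 5))  # [[0, 4], [8]]
# With 4 as the seed, 0 and 8 still end up together, but both are similar to 4.
print(merge_by_seed([4, 0, 8], toy_similarity, 5))  # [[4, 0, 8]]
```

The second call shows the caveat mentioned above: the result depends on which classes become seeds, and two dissimilar classes can still share a group as long as each meets the threshold with the seed.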