Suggestions for new chunkers #4

@do-me

Hey, super useful tool!

There's been some development in the chunking community. If you'd like to keep your app up to date, here are a few suggestions. Also, considering that all of the current options struggle to identify sentence boundaries correctly (I quickly tested them with some texts) and tend to chop off parts of sentences, it would be nice to have more choice.

Python

JS

Another idea would be to add an option that accepts any user-supplied regex, like we did in SemanticFinder. I tried to come up with a good regex for sentence boundaries, but it's incredibly hard.
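
To make the regex idea concrete, here is a minimal Python sketch of a chunker that splits on whatever pattern the user passes in. The function name `split_by_regex` and the pattern `NAIVE_SENTENCE_BOUNDARY` are made up for illustration; this is not how SemanticFinder or this app implements it. The sample also shows why a simple sentence-boundary regex falls short: it wrongly splits after "Dr.".

```python
import re


def split_by_regex(text: str, pattern: str) -> list[str]:
    """Split `text` wherever `pattern` matches and drop empty chunks.

    Hypothetical helper for illustration only.
    """
    parts = re.split(pattern, text)
    # re.split can yield None or empty strings (e.g. with capture groups),
    # so filter those out and trim surrounding whitespace.
    return [p.strip() for p in parts if p and p.strip()]


# Naive pattern (an assumption, not a recommended solution): split on
# whitespace that follows ., ! or ? and precedes an uppercase letter.
NAIVE_SENTENCE_BOUNDARY = r"(?<=[.!?])\s+(?=[A-Z])"

sample = "Dr. Smith arrived at 5 p.m. sharp. He was late! Was anyone surprised?"
chunks = split_by_regex(sample, NAIVE_SENTENCE_BOUNDARY)

for chunk in chunks:
    print(repr(chunk))
# 'Dr.'                              <- wrong split caused by the abbreviation
# 'Smith arrived at 5 p.m. sharp.'
# 'He was late!'
# 'Was anyone surprised?'
```

Because the pattern is just a parameter, the same function also covers simpler strategies, for example `split_by_regex(text, r"\n\s*\n")` to chunk by blank-line-separated paragraphs.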
