[Feature Request]: Metadata Cleanup #19162

Open
@strawgate

Feature Description

When using Docling, I end up with 25 KB documents. Docling pushes a lot of junk into the metadata, and each RelatedNodeInfo object pulls the related node's metadata into the node, so every node ends up carrying 3-4 copies of this metadata.

I have a metadata cleanup transformer that I use to strip this out.

Is this something worth contributing?
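For illustration only, here is a minimal sketch of what such a cleanup step might look like. It is not the transformer mentioned above. `Node`, `RelatedInfo`, `clean_metadata`, and `keep_keys` are hypothetical stand-ins rather than the library's real classes, and the metadata keys are made up for the example. The idea is to keep an allow-list of keys on each node and drop the metadata copies carried by its related-node references.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Set


@dataclass
class RelatedInfo:
    """Hypothetical stand-in for a related-node reference that carries a metadata copy."""
    node_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Node:
    """Hypothetical stand-in for an ingested node."""
    node_id: str
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    relationships: Dict[str, RelatedInfo] = field(default_factory=dict)


def clean_metadata(nodes: Iterable[Node], keep_keys: Set[str]) -> List[Node]:
    """Keep only allow-listed metadata keys and empty the copies on related nodes.

    Nodes are modified in place and also returned as a list.
    """
    cleaned = []
    for node in nodes:
        # Drop every metadata key that is not explicitly allow-listed.
        node.metadata = {k: v for k, v in node.metadata.items() if k in keep_keys}
        # Related-node references only need the ID, so discard their metadata copy.
        for related in node.relationships.values():
            related.metadata = {}
        cleaned.append(node)
    return cleaned


# Usage with made-up metadata keys (assumptions, not actual parser output).
bulky = {
    "file_name": "report.pdf",
    "doc_items": [{"ref": "#/texts/0"}] * 50,
    "origin": {"mimetype": "application/pdf"},
}
previous = RelatedInfo("n0", dict(bulky))
node = Node("n1", "Some chunk text", dict(bulky), {"previous": previous})

cleaned = clean_metadata([node], keep_keys={"file_name"})
print(cleaned[0].metadata)
print(cleaned[0].relationships["previous"].metadata)
```

An allow-list is used here instead of a deny-list so that new keys added by the parser do not silently bloat the nodes again. A real contribution would presumably make the key list configurable.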

Reason

No response

Value of Feature

Much faster search and ingestion, since documents are much smaller.

Labels: enhancement, triage
