Question
I have a simple script that uses docling to parse this webpage:
https://ramzinex.com/help/register-in-ramzinex
The issue is that the parsed document also contains the contents of the `ol` tag and the footer. How can I exclude them?
The `ol` tag info:
صرافی رمزینکس (Ramzinex Exchange)
راهنما (Help)
ثبت نام و احراز هویت (Registration and identity verification)
The footer info:

This whole section is included in the final document.
Also, when I use the hybrid chunker to chunk the document, these sections are still included.
Is there any config to exclude redundant content like this, whether the source is a link, a PDF, or anything else?
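
One workaround that might work in the meantime is to strip the unwanted elements from the raw HTML before handing it to docling. The sketch below is not a docling option; it uses only Python's standard `html.parser`. The names `ElementStripper` and `strip_elements` are hypothetical, and it assumes the unwanted parts are exactly the `ol` and `footer` elements, which I have not verified against the page's real markup.

```python
from html.parser import HTMLParser


class ElementStripper(HTMLParser):
    """Re-emit HTML while dropping the given elements and everything inside them."""

    def __init__(self, drop_tags):
        # convert_charrefs=False so entities are passed through unchanged
        super().__init__(convert_charrefs=False)
        self.drop_tags = set(drop_tags)
        self.depth = 0  # > 0 while inside a dropped element
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.drop_tags:
            self.depth += 1
        elif self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth
        if self.depth == 0 and tag not in self.drop_tags:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.drop_tags:
            self.depth = max(0, self.depth - 1)
        elif self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")


def strip_elements(html, drop_tags=("ol", "footer")):
    """Return `html` without the elements named in `drop_tags` (hypothetical helper)."""
    parser = ElementStripper(drop_tags)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Made-up sample markup, not the real page
sample = (
    "<html><body>"
    "<ol><li>Ramzinex</li><li>Help</li></ol>"
    "<main><p>Step 1: open the sign-up page.</p></main>"
    "<footer><p>Contact us</p></footer>"
    "</body></html>"
)
cleaned = strip_elements(sample)
print(cleaned)
```

The cleaned string could then be saved to a local `.html` file and passed to docling instead of the URL. Note that this drops every `ol` on the page, including any numbered steps in the article body, so a real version would probably need to match the breadcrumb by class or position instead. It also does nothing for PDFs.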