Exporting to markdown when table cell contains "|" #92

Open
isakcodes opened this issue Dec 3, 2024 · 2 comments
Labels
duplicate This issue or pull request already exists

Comments

@isakcodes

I'm using docling to load documents and then call export_to_markdown. My data has many tables, and I've discovered that some cells contain the "|" character, which breaks the table structure in the exported markdown. I would expect docling to escape those characters when exporting, or at least to offer an option to do so (if not by default).

I see there is another open issue mentioning this.

I considered contributing, but I'm unsure where a change like that would fit in. Perhaps a boolean escape_breaking parameter could be added here, with any occurrences of "|" then replaced using a regex. Perhaps that is too crude. Could a docling-core developer point me in the right direction?
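To make the proposal concrete, here is a minimal sketch of what such an option could look like. It is not docling-core's code and does not go through tabulate. `escape_breaking` is the parameter name suggested above, while `escape_cell` and `to_markdown_table` are hypothetical helpers invented for illustration. It relies on the GitHub Flavored Markdown rule that a pipe inside a table cell can be written as `\|`.

```python
import re

# Matches a pipe that is not directly preceded by a backslash, so cells
# that are already escaped are not escaped a second time.
_UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")


def escape_cell(text: str) -> str:
    """Hypothetical helper: backslash-escape pipes and flatten newlines."""
    return _UNESCAPED_PIPE.sub(r"\\|", text).replace("\n", " ")


def to_markdown_table(headers, rows, escape_breaking=True):
    """Hypothetical helper: render a pipe table, optionally escaping cells."""
    fmt = escape_cell if escape_breaking else str

    def line(cells):
        return "| " + " | ".join(fmt(str(c)) for c in cells) + " |"

    separator = "| " + " | ".join("---" for _ in headers) + " |"
    return "\n".join([line(headers), separator, *(line(r) for r in rows)])


if __name__ == "__main__":
    print(to_markdown_table(["expr", "meaning"], [["a|b", "a or b"]]))
```

With `escape_breaking=False` the cell `a|b` is emitted unchanged and a markdown renderer would read it as two cells. The lookbehind is a crude guard: a cell whose text ends in a literal backslash right before a pipe would be left as-is, so a real implementation would likely need a more careful rule.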

Kind regards,

@ceberam
Collaborator

ceberam commented Dec 4, 2024

@isakcodes Thanks for reporting this issue and for your willingness to contribute!
Indeed, the problem you reported is being addressed in #61.
Also note that we rely on the tabulate library, which has an open issue along the same lines: astanin/python-tabulate#241

ceberam added the duplicate label Dec 4, 2024
@ceberam
Collaborator

ceberam commented Dec 4, 2024

Duplicate of #61

ceberam marked this as a duplicate of #61 Dec 4, 2024