Option to export markdown with references to images instead of embedding them as base64 #211

Open
uninstall-your-browser opened this issue Nov 3, 2024 · 4 comments

@uninstall-your-browser

It would be nice to have an option to export markdown files with images as references to files instead of embedding them in the document as base64. This might be similar to how the "figure export" example extracts and stores files.

@Tendo33

Tendo33 commented Nov 4, 2024

I also noticed that after using result.document.export_to_markdown(), the images are completely removed from the Markdown.
I hope the images can be extracted and saved in a folder called image.

@cenit

cenit commented Nov 4, 2024

This is a feature I am really interested in too.
I had a similar experience to @Tendo33: after pdf-to-md conversion, images are lost and replaced with the HTML comment <!-- image -->. Where are you seeing base64 images? If they are there, I can easily add another step afterwards to remove them from the md and put them in separate files...
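
For illustration, such an extra step might look roughly like the sketch below. It uses only the Python standard library and assumes the exported Markdown embeds pictures as `![alt](data:image/<type>;base64,...)`. The function name `externalize_images`, the `image_NNN` file naming, and the default `image` folder are hypothetical choices made for this sketch, not part of any library.

```python
import base64
import re
from pathlib import Path

# Matches Markdown images whose target is a base64 data URI, e.g.
# ![alt](data:image/png;base64,iVBORw0...)
DATA_URI_IMAGE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]"
    r"\(data:image/(?P<ext>[a-zA-Z0-9.+-]+);base64,(?P<data>[A-Za-z0-9+/=\s]+)\)"
)

# Map MIME subtypes to conventional file extensions where they differ.
EXTENSION_ALIASES = {"jpeg": "jpg", "svg+xml": "svg"}


def externalize_images(markdown: str, out_dir: Path, folder: str = "image") -> str:
    """Write each embedded base64 image to out_dir/folder and return
    Markdown that references the written files by relative path."""
    image_dir = out_dir / folder
    image_dir.mkdir(parents=True, exist_ok=True)
    counter = 0

    def replace(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        subtype = match.group("ext").lower()
        ext = EXTENSION_ALIASES.get(subtype, subtype)
        name = f"image_{counter:03d}.{ext}"
        # Decode the payload and store it next to the Markdown file.
        (image_dir / name).write_bytes(base64.b64decode(match.group("data")))
        return f"![{match.group('alt')}]({folder}/{name})"

    return DATA_URI_IMAGE.sub(replace, markdown)
```

The returned string can then be written to `out_dir` so that the relative `image/...` links resolve. Images are numbered in the order they appear in the document.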

@cau-git
Contributor

cau-git commented Nov 4, 2024

@Tendo33 @cenit @uninstall-your-browser This feature is already supported: you can configure the pipeline to extract pictures and change the arguments passed to the export_to_markdown() method. Please refer to this post.
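
A minimal sketch of what that configuration might look like. It assumes the `generate_picture_images` and `images_scale` fields on `PdfPipelineOptions` and the `ImageRefMode` enum from `docling_core`, as used in the figure export example; these may differ between versions, so check them against the installed release. The wrapper function name is hypothetical.

```python
from pathlib import Path


def export_markdown_with_embedded_images(pdf_path: Path) -> str:
    """Convert a PDF and return Markdown with pictures embedded as base64."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling_core.types.doc import ImageRefMode

    # Ask the PDF pipeline to keep picture images (assumed option names).
    pipeline_options = PdfPipelineOptions()
    pipeline_options.generate_picture_images = True
    pipeline_options.images_scale = 2.0  # render at 2x the default resolution

    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    result = converter.convert(pdf_path)

    # Request embedded images explicitly; the <!-- image --> comments reported
    # above are what the export produces when no image mode is requested.
    return result.document.export_to_markdown(image_mode=ImageRefMode.EMBEDDED)
```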

@uninstall-your-browser
Author

I was interested in them being references to files rather than embedded base64 blobs.
