Skip to content

fix: only save applicable page images#226

Merged
vagenas merged 1 commit intodocling-project:mainfrom
ClemDoum:fix(document)/picture-refs
Jul 15, 2025
Merged

fix: only save applicable page images#226
vagenas merged 1 commit intodocling-project:mainfrom
ClemDoum:fix(document)/picture-refs

Conversation

@ClemDoum
Copy link
Contributor

@ClemDoum ClemDoum commented Apr 1, 2025

Bug description

fixes: #179

When saving a markdown page by page (potentially to keep track of page size in characters for instance) images of the whole documents were save. This had 2 majors drawbacks:

  • saving too many images when we want to dump a single page
  • saving all images, n times for a n pages doc when saving it page by page

Fix

Changed

  • updated DoclingDocument._with_pictures_refs to take the page_no and forward it to iter_items()

@mergify
Copy link

mergify bot commented Apr 1, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewer for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

@vagenas
Copy link
Member

vagenas commented Jul 14, 2025

@ClemDoum this looks good, but the CI build appears frozen.
Can you please best update your branch with the latest main, so a fresh CI gets triggered?

Signed-off-by: Clément Doumouro <clement.doumouro@gmail.com>
@vagenas vagenas force-pushed the fix(document)/picture-refs branch from 4d82723 to 8ef0bd4 Compare July 15, 2025 07:45
@github-actions
Copy link
Contributor

DCO Check Passed

Thanks @ClemDoum, all your commits are properly signed off. 🎉

@mergify
Copy link

mergify bot commented Jul 15, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

🟢 Require two reviewer for test updates

Wonderful, this rule succeeded.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

@codecov
Copy link

codecov bot commented Jul 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

📢 Thoughts on this report? Let us know!

@vagenas vagenas changed the title fix(document): save document images page by page fix: only save applicable page images Jul 15, 2025
@vagenas vagenas merged commit ebd9147 into docling-project:main Jul 15, 2025
11 checks passed
@vagenas
Copy link
Member

vagenas commented Jul 15, 2025

@ClemDoum you can ignore my previous message, the latest changes have been incorporated & this is now merged.
Thank you for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

document.save_as_markdown(page_no=page_number) creates an artefacts folder with all the images, instead the images of the page

3 participants