@lesyk lesyk commented Jan 8, 2026

Fix #68

This pull request improves how PDF-to-Markdown conversion handles MasterFormat-style partial numbering (e.g., .1, .2): each partial number is now merged with its associated text instead of being split onto a separate line or into a separate table column. It also adds tests that verify this behavior and bumps the package version.
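
For readers who want a concrete picture of the merging described above, here is a minimal sketch. It is not the code from this PR: the pattern `PARTIAL_NUMBER` and the helper `merge_partial_numbering` are hypothetical names, and the sketch assumes the extracted text arrives as a list of lines in which a partial number sits alone on its own line.

```python
import re

# Hypothetical pattern (an assumption, not taken from the PR): a line that
# consists only of a dot followed by digits, such as ".1" or ".12".
PARTIAL_NUMBER = re.compile(r"^\.\d+$")


def merge_partial_numbering(lines):
    """Join each standalone partial number with the next non-empty line."""
    merged = []
    i = 0
    while i < len(lines):
        current = lines[i].strip()
        if PARTIAL_NUMBER.match(current):
            # Look ahead past blank lines for the text this number labels.
            j = i + 1
            while j < len(lines) and not lines[j].strip():
                j += 1
            # Merge only if there is following text that is not itself a number.
            if j < len(lines) and not PARTIAL_NUMBER.match(lines[j].strip()):
                merged.append(f"{current} {lines[j].strip()}")
                i = j + 1
                continue
        # Anything else (or a number with nothing to attach to) is kept as-is.
        merged.append(lines[i])
        i += 1
    return merged


extracted = [
    "Section heading",
    ".1",
    "The intent of this section",
    "",
    ".2",
    "",
    "Related requirements",
]
print(merge_partial_numbering(extracted))
```

In this sketch a trailing partial number with no following text is left untouched, so no content is dropped; whether the merged change handles that case the same way is not stated here.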

Testing and validation:

  • Added test_pdf_masterformat.py with extensive tests to verify correct regex matching, merging logic, and content preservation for MasterFormat-style partial numbering in converted documents.

Version update:

  • Bumped the package version from 0.1.4 to 0.1.5 in __about__.py to reflect the new functionality and fixes.

@lesyk lesyk marked this pull request as ready for review January 8, 2026 18:47
@afourney afourney merged commit 7fdaefb into microsoft:main Jan 8, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

PDF parsing doesn't support partially numbered lists

2 participants