page_range not working with .pptx documents #1946

Open
@JOSHMT0744

### Bug

When the `page_range` parameter of the `convert` function is set, the document converter ignores it and still parses the whole document.

### Steps to reproduce

```python
from pathlib import Path

doc = "example.pptx"
output_dir = Path("out")

# create_doc_converter() is the reporter's own helper (not shown in this issue)
doc_converter = create_doc_converter()
conv_result = doc_converter.convert(doc, page_range=(1, 2))

# Export Markdown format without image referencing
with (output_dir / f"{conv_result.input.file.stem}.md").open("w", encoding="utf-8") as fp:
    fp.write(conv_result.document.export_to_markdown())
```

Given that `example.pptx` has more than 2 pages (slides), the exported Markdown still contains the whole document instead of only pages 1 and 2.
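
One way to confirm this without reading the Markdown by eye might be to compare the page numbers present in the result against the requested range. The helper below is only a sketch: `pages_outside_range` is a hypothetical name, and it assumes that `page_range` is 1-based and inclusive and that `conv_result.document.pages` is a mapping keyed by page number.

```python
from typing import Iterable, List, Tuple


def pages_outside_range(
    page_numbers: Iterable[int], page_range: Tuple[int, int]
) -> List[int]:
    """Return the page numbers that fall outside the inclusive page_range."""
    first, last = page_range
    return sorted(p for p in page_numbers if not first <= p <= last)


# Hypothetical usage with the result from the snippet above, assuming
# `conv_result.document.pages` is a dict keyed by page (slide) number:
#   extra = pages_outside_range(conv_result.document.pages.keys(), (1, 2))
#   print(extra)  # a non-empty list would mean pages beyond the range were parsed
```

If the range were respected, the returned list should be empty for `page_range=(1, 2)`.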

### Docling version
Docling version: 2.41.0
Docling Core version: 2.42.0
Docling IBM Models version: 3.8.1
Docling Parse version: 4.1.0

### Python version
Python: cpython-39 (3.9.21)
