Commit 8a9a2de

fix: use inference config (#456)
> [!NOTE]
> Aligns PDF rendering DPI with centralized config and bumps version.
>
> - Add `PDF_RENDER_DPI` to `InferenceConfig` (default `350`)
> - `convert_pdf_to_image` now uses `inference_config.PDF_RENDER_DPI` when `dpi` is unset (replacing env var lookup)
> - Bump version to `1.1.6` and update `CHANGELOG.md`
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 8bf6d27.
1 parent 72ba3d8 commit 8a9a2de

4 files changed, +12 -2 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+## 1.1.6
+
+- Use inference_config to set default rendering DPI
+
 ## 1.1.5
 
 - Render PDF to image using PyPDFium instead of pdf2image, due to much improved performance for certain docs
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "1.1.5" # pragma: no cover
+__version__ = "1.1.6" # pragma: no cover

unstructured_inference/config.py

Lines changed: 5 additions & 0 deletions
@@ -116,5 +116,10 @@ def IMG_PROCESSOR_SHORTEST_EDGE(self) -> int:
         """configuration for DetrImageProcessor to scale images"""
         return self._get_int("IMG_PROCESSOR_SHORTEST_EDGE", 800)
 
+    @property
+    def PDF_RENDER_DPI(self) -> int:
+        """DPI to render PDF pages to images"""
+        return self._get_int("PDF_RENDER_DPI", 350)
+
 
 inference_config = InferenceConfig()
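
The new property follows the same pattern as the other settings in this file: a named integer with a hard-coded fallback. A minimal sketch of how such a property might resolve its value is shown below. `SketchInferenceConfig` and its `_get_int` body are hypothetical stand-ins written for illustration; the diff does not show the real `_get_int`, so the environment-variable lookup here is an assumption.

```python
import os


class SketchInferenceConfig:
    """Hypothetical stand-in for InferenceConfig; not the library's code."""

    def _get_int(self, var: str, default: int) -> int:
        # Assumed behaviour: consult the environment on every access and
        # fall back to the default when the variable is unset or empty.
        value = os.environ.get(var, "")
        return int(value) if value else default

    @property
    def PDF_RENDER_DPI(self) -> int:
        """DPI to render PDF pages to images"""
        return self._get_int("PDF_RENDER_DPI", 350)


# Module-level singleton, mirroring the `inference_config` object in the diff.
sketch_config = SketchInferenceConfig()
```

Under that assumption, exporting `PDF_RENDER_DPI=200` before the property is read would yield `200`, and leaving it unset would yield `350`.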

unstructured_inference/inference/layout.py

Lines changed: 2 additions & 1 deletion
@@ -10,6 +10,7 @@
 import pypdfium2 as pdfium
 from PIL import Image, ImageSequence
 
+from unstructured_inference.config import inference_config
 from unstructured_inference.inference.elements import (
     TextRegion,
 )
@@ -422,7 +423,7 @@ def convert_pdf_to_image(
     try:
         images: dict[int, Image.Image] = {}
         if dpi is None:
-            dpi = int(os.environ.get("PDF_RENDER_DPI", 400))
+            dpi = inference_config.PDF_RENDER_DPI
         scale = dpi / 72.0
         for i, page in enumerate(pdf, start=1):
             try:
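
Besides moving the lookup into `inference_config`, this hunk changes the fallback DPI from `400` to `350`. The sketch below illustrates what that means for the `scale = dpi / 72.0` step, since PDF page dimensions are expressed in points (72 per inch). `resolve_scale` and `rendered_size` are hypothetical helpers, not functions from `layout.py`, and the rounding is illustrative; the renderer's exact pixel rounding may differ.

```python
from typing import Optional, Tuple

POINTS_PER_INCH = 72.0  # PDF page dimensions are measured in points


def resolve_scale(dpi: Optional[int], default_dpi: int = 350) -> float:
    """Return the render scale, using default_dpi when dpi is unset."""
    if dpi is None:
        dpi = default_dpi
    return dpi / POINTS_PER_INCH


def rendered_size(width_pt: float, height_pt: float, scale: float) -> Tuple[int, int]:
    """Approximate pixel size of a page rendered at the given scale."""
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 612 x 792 points (8.5 x 11 inches).
new_default = rendered_size(612, 792, resolve_scale(None))  # 350 DPI fallback
old_default = rendered_size(612, 792, resolve_scale(400))   # previous fallback
```

For a US Letter page this gives roughly 2975 x 3850 pixels at 350 DPI versus 3400 x 4400 at 400 DPI, so callers that relied on the old fallback would get smaller images unless they pass `dpi` explicitly.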
