Commit 8de7679

Committed Feb 12, 2025
Update docling versions
Signed-off-by: Aakanksha Duggal <aduggal@redhat.com>
1 parent 6790918 commit 8de7679

2 files changed, +4 -4 lines changed

requirements.txt

+3 -3
@@ -1,9 +1,9 @@
 # SPDX-License-Identifier: Apache-2.0
 click>=8.1.7,<9.0.0
 datasets>=2.18.0,<3.0.0
-docling[tesserocr]>=2.4.2,<=2.8.3; sys_platform != 'darwin'
-docling>=2.4.2,<=2.8.3; sys_platform == 'darwin'
-docling-parse>=2.0.0,<3.0.0
+docling[tesserocr]>=2.9.0; sys_platform != 'darwin'
+docling>=2.9.0; sys_platform == 'darwin'
+docling-parse>=3.3.0
 GitPython>=3.1.42,<4.0.0
 gguf>=0.6.0
 httpx>=0.25.0,<1.0.0
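
To illustrate what the relaxed docling pins accept, here is a minimal sketch that checks a few version strings against the old and new ranges. It compares plain dotted versions as integer tuples, which is a simplification of PEP 440 and not how pip resolves requirements. The helper names (`parse`, `in_range`, `old_ok`, `new_ok`) are hypothetical, and the candidate versions are illustrative rather than a list of actual releases.

```python
def parse(version: str) -> tuple:
    """Turn a plain dotted version such as '2.8.3' into (2, 8, 3)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper=None) -> bool:
    """True if lower <= version, and version <= upper when an upper bound is given."""
    v = parse(version)
    if v < parse(lower):
        return False
    return upper is None or v <= parse(upper)


def old_ok(version: str) -> bool:
    # Old constraint from the diff: docling>=2.4.2,<=2.8.3
    return in_range(version, "2.4.2", "2.8.3")


def new_ok(version: str) -> bool:
    # New constraint from the diff: docling>=2.9.0 (no upper bound)
    return in_range(version, "2.9.0")


for candidate in ("2.8.3", "2.9.0", "3.0.0"):
    print(candidate, "old:", old_ok(candidate), "new:", new_ok(candidate))
```

The two ranges do not overlap, so an environment that satisfied the old pins will need docling upgraded to satisfy the new ones.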

src/instructlab/sdg/utils/taxonomy.py

+1 -1
@@ -151,7 +151,7 @@ def extract_text_from_pdf(file_path: str) -> str:
             )
             page_text = "\n".join(text_lines)
             pdf_text += page_text + "\n"
-        except Exception as e:
+        except Exception as e:  # pylint: disable=broad-exception-caught
             logger.warning(
                 f"Error extracting text from page {page_no} of '{file_path}': {e}"
             )
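
The hunk above keeps a deliberately broad `except` around per-page extraction and only silences the pylint warning. A minimal sketch of that pattern follows, assuming a hypothetical `extract_pages` helper and a list of callables standing in for the real PDF backend (neither appears in the commit). Page numbering from 1 is also an assumption.

```python
import logging

logger = logging.getLogger(__name__)


def extract_pages(pages, file_path):
    """Join text from each page, skipping pages whose extraction fails.

    `pages` is a hypothetical list of zero-argument callables, each returning
    the text lines of one page; it stands in for the real PDF backend.
    """
    pdf_text = ""
    for page_no, get_lines in enumerate(pages, start=1):
        try:
            page_text = "\n".join(get_lines())
            pdf_text += page_text + "\n"
        except Exception as e:  # pylint: disable=broad-exception-caught
            # Log and continue so one bad page does not discard the whole document.
            logger.warning(
                f"Error extracting text from page {page_no} of '{file_path}': {e}"
            )
    return pdf_text


def bad_page():
    # Simulates a page that the backend cannot read.
    raise ValueError("corrupt page")


text = extract_pages([lambda: ["a", "b"], bad_page, lambda: ["c"]], "example.pdf")
```

Catching `Exception` here trades precision for resilience: the failing page is logged and skipped, while text from the remaining pages is still returned.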
