Extracting Table titles/descriptions #97
-
Hello, I'm trying to extract the text/title that comes just before an extracted table. How can I do that? It seems the text that comes right before a table is just labeled as text, with no information correlating it to the table it refers to. For example:
Description: Table showing last year's rain data.
Table: rain-data (heading (h1/h2/h3 etc.), then the data columns repeat)
So when the table is extracted by docling, it doesn't include the description that comes right before the table. I need to extract that information somehow. Any pointers will be helpful. Thanks!
Replies: 2 comments
-
@dolfim-ibm Thank you for the answer and for the PR that made the export of the tables possible! I was indeed looking for the text for the table, not so much the surrounding text. I will try to export it out. Should've looked at the schema a little closer 😅
At the moment, the table object will contain a `text` field which maps to the caption of the table (if any). I think you are referring more to the other paragraphs around the table, right?
In this case, one could iterate through the `main_text` elements with the index +/- a given "window size".
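
The windowing idea could look roughly like the sketch below. It works on plain dictionaries rather than on docling objects: the `"type"` and `"text"` keys, the `"table"` type value, and the helper name `text_around_tables` are assumptions made for illustration, so adjust them to whatever your exported `main_text` actually contains.

```python
from typing import Any


def text_around_tables(
    main_text: list[dict[str, Any]], window: int = 1
) -> list[dict[str, Any]]:
    """Collect the text of elements within `window` positions of each table.

    Assumes every element is a dict with a "type" key and an optional
    "text" key (hypothetical shape, not a confirmed docling schema).
    """
    results = []
    for i, item in enumerate(main_text):
        if item.get("type") != "table":
            continue
        # Clamp the window to the bounds of the list.
        start = max(0, i - window)
        stop = min(len(main_text), i + window + 1)
        # Keep only non-table neighbours, split into before/after the table.
        before = [
            main_text[j].get("text", "")
            for j in range(start, i)
            if main_text[j].get("type") != "table"
        ]
        after = [
            main_text[j].get("text", "")
            for j in range(i + 1, stop)
            if main_text[j].get("type") != "table"
        ]
        results.append(
            {
                "index": i,
                "caption": item.get("text", ""),  # the table's own text field
                "before": before,
                "after": after,
            }
        )
    return results


# Made-up sample data mirroring the example in the question.
sample = [
    {"type": "paragraph", "text": "Intro."},
    {"type": "paragraph", "text": "Description: Table showing last year's rain data."},
    {"type": "table", "text": "rain-data"},
    {"type": "paragraph", "text": "Closing remarks."},
]

for hit in text_around_tables(sample, window=1):
    print(hit["index"], hit["before"], hit["after"])
```

A larger `window` picks up more context at the cost of pulling in unrelated paragraphs, so starting with 1 and widening only if the description sits further away seems like a reasonable default.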