I want to use nlm-ingestor from C#, so I was translating layout_reader.py to create the code. While doing so, I found a serious bug in the level handling. It's hard for me to explain, but I'll do my best and provide the test data I'm using.
Given the following structure:
When the reader gets to the last para, the if block['tag'] == 'para': branch doesn't take the level change into account when choosing the para's parent.
Since my knowledge of Python is limited, I'll paste some visual debugging information to make it easier to understand:
In this image, you can see a Level 0 header with another Level 0 header as its child:
And in this image, you can see a Level 0 paragraph inside a Level 2 header:
This is the translated code with the bug:
Here's the fixed method, handling the level change within the rest of the cases. I know it's not pretty, I didn't refactor it yet:
Here's the first case fixed. Notice that the Level 0 header is no longer nested and that there are more Level 1 items:
In the second case, the header now has no children, as it should:
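To make the expected behavior concrete, here is a rough Python sketch of level-aware parenting. It is not the code from layout_reader.py and not my C# translation. The function name build_tree, the block shape (tag, level, text) and the rule "a block belongs to the nearest preceding header with a strictly lower level" are all assumptions made for illustration.

```python
def build_tree(blocks):
    """Nest blocks under the nearest preceding header with a strictly lower level.

    Hypothetical helper: the block shape {'tag', 'level', 'text'} is an
    assumption for illustration, not nlm-ingestor's actual data model.
    """
    root = {"tag": "root", "level": -1, "text": "", "children": []}
    stack = [root]  # currently open headers, innermost last

    for block in blocks:
        node = {**block, "children": []}
        # Close every open header that is not shallower than this block.
        # This runs for paragraphs too, not only for headers.
        while len(stack) > 1 and stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        # Only headers can become parents of later blocks.
        if node["tag"] == "header":
            stack.append(node)
    return root


# Mirrors the two cases described above: a Level 0 para after a Level 2
# header, and a Level 0 header after another Level 0 header.
sample = [
    {"tag": "header", "level": 0, "text": "A"},
    {"tag": "header", "level": 2, "text": "B"},
    {"tag": "para", "level": 0, "text": "p"},
    {"tag": "header", "level": 0, "text": "C"},
]
tree = build_tree(sample)
print([child["text"] for child in tree["children"]])
```

With this rule, the Level 0 para does not end up inside the Level 2 header, and the second Level 0 header becomes a sibling of the first one instead of its child.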
Here's the JSON I used for testing:
https://gist.github.com/solvingproblemswithtechnology/a1d79d1892284375855a7205d5eb4ea5
P.S.: Thanks for your understanding. I'm currently not knowledgeable enough in Python to fix this issue with confidence, but I tried to add all the context.