new-figure-table-extraction - Extract figures from SVG #1297

Draft
wants to merge 27 commits into
base: master
from

Conversation

lfoppiano
Collaborator

@lfoppiano lfoppiano commented May 30, 2025

@kermitt2 I've started working on the new figure and table extraction. Since I rebased your initial branch on master, I created a new branch so that the original stays intact in case I break something.

From now on I'll work in incremental branches. This first PR implements the SVG parsing and extraction.

There may still be problems with images that extend beyond their actual zone, which I have not found a way to exclude (checking for transparency and similar tricks did not help in these edge cases; any idea is welcome 👍):

image

(For reference, this is the SVG of Figure 1):

image
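
One possible direction for the overflow case above, offered only as a sketch: measure how much of an image's bounding box falls outside its zone, then either clip the box or flag it for review. The class `ZoneOverflow` and its methods are hypothetical and not part of this PR. It assumes both the image and the zone are available as axis-aligned rectangles in the same coordinate space, which the PR does not state.

```java
import java.awt.geom.Rectangle2D;

/** Hypothetical helper (not from the PR): compares an image box with its zone. */
final class ZoneOverflow {

    /** Fraction of the image area lying outside the zone (0.0 = fully inside, 1.0 = fully outside). */
    static double overflowRatio(Rectangle2D image, Rectangle2D zone) {
        double imageArea = image.getWidth() * image.getHeight();
        if (imageArea <= 0) {
            return 0.0; // degenerate image box: nothing can overflow
        }
        if (!image.intersects(zone)) {
            return 1.0; // no overlap at all
        }
        Rectangle2D inside = image.createIntersection(zone);
        double insideArea = inside.getWidth() * inside.getHeight();
        return 1.0 - insideArea / imageArea;
    }

    /** Image box clipped to the zone; an empty rectangle when they do not overlap. */
    static Rectangle2D clipToZone(Rectangle2D image, Rectangle2D zone) {
        return image.intersects(zone)
                ? image.createIntersection(zone)
                : new Rectangle2D.Double();
    }

    public static void main(String[] args) {
        Rectangle2D zone = new Rectangle2D.Double(0, 0, 50, 100);
        Rectangle2D image = new Rectangle2D.Double(0, 0, 100, 100);
        System.out.println("overflow ratio: " + overflowRatio(image, zone));
        System.out.println("clipped box: " + clipToZone(image, zone));
    }
}
```

A threshold on the ratio (its value would need tuning against real documents) could decide whether an image is clipped, kept as is, or dropped. This only helps when the zone itself is trustworthy.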

I'll try to post a few benchmarks for each new implementation so that we can track the progress.

Here are other images (of correctly identified figures) 😄 :

image image image

@lfoppiano lfoppiano changed the base branch from master to new-figure-table-models May 30, 2025 14:52
@lfoppiano lfoppiano changed the base branch from new-figure-table-models to master May 30, 2025 14:52
@lfoppiano lfoppiano marked this pull request as draft May 30, 2025 14:53
@lfoppiano lfoppiano force-pushed the new-figure-table-models2 branch from 45da2f2 to d23c92f Compare May 31, 2025 05:25
    @POST
    public Response getFiguresAndTables(
            @FormDataParam(INPUT) InputStream inputStream) throws Exception {
        return restProcessFiles.getFigures(inputStream);

Check warning

Code scanning / CodeQL

Information exposure through an error message Medium

Error information can be exposed to an external user.
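
A common way to address this class of warning is to catch the exception at the service boundary, log the details on the server, and return only a generic message to the client. The sketch below illustrates that pattern in plain Java so it can run on its own. `SafeEndpoint` and `Result` are hypothetical names and are not GROBID or JAX-RS types. In a JAX-RS service the same idea would usually be expressed with a `try`/`catch` that builds a `Response`, or with an `ExceptionMapper`; how this PR should handle it is left to the maintainers.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper (not from the PR): keeps exception details out of client responses. */
final class SafeEndpoint {
    private static final Logger LOGGER = Logger.getLogger(SafeEndpoint.class.getName());

    /** Minimal stand-in for an HTTP response: a status code and a body. */
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Runs the handler; on failure, logs the cause server-side and returns a generic 500. */
    static Result call(Callable<String> handler) {
        try {
            return new Result(200, handler.call());
        } catch (Exception e) {
            // The message and stack trace go to the server log only, never to the client.
            LOGGER.log(Level.SEVERE, "Request processing failed", e);
            return new Result(500, "Internal error while processing the request.");
        }
    }

    public static void main(String[] args) {
        Result ok = call(() -> "<figures/>");
        Result failed = call(() -> {
            throw new IllegalStateException("/internal/path/model.bin missing");
        });
        System.out.println(ok.status + " " + ok.body);
        System.out.println(failed.status + " " + failed.body);
    }
}
```

The point of the design is that the client-facing body is a fixed string, so nothing derived from the exception can leak through it.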