`ami` now carries out most of the required task and it's my intention to prototype and test the full functionality in the next few days.
The result of running `normami` will be a large `CTree` and a set of `html` and `csv` files that can be re-used. The missing functionality includes:
- develop `TableExtractor` to identify table structure
  - `TableExtractor` should unify `hocr` and `gocr` output to a canonical table format.
  - `TableExtractor` will attempt to unify the cell content, according to a schema.
  - `TableExtractor` will apply simple heuristics to detect errors and add @class-based annotation
  - `TableExtractor` will emit CSV files or HTML for the various components of a plot (i.e. possibly several files)
- develop `GraphExtractor` to extract `SVGLine`s from `body.graph`s
- develop `ScaleExtractor` to extract numeric scales
- apply the results of `GraphExtractor` and `ScaleExtractor` to convert to a CSV with user coordinates.
- synchronise tables and graphs to determine consistency of horizontal content lines
- provide an aggregate view of `gocr`, `hocr` and `graph` values.
- extract and parse summary data in tables (e.g. Overall P values).
- allow parameterisation of `hocr` and `gocr` as far as I understand it (e.g. to prepare argument lists with whitelists). However, both programs are very poorly documented and fragile, and I shall not research this. I may open Issues showing the possible tasks.
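The `GraphExtractor` / `ScaleExtractor` step in the list above amounts to mapping pixel coordinates onto axis (user) coordinates. A minimal sketch in Python, assuming linear axes and two recognised tick marks per axis. The names `AxisScale`, `to_user` and `points_to_csv` are hypothetical and are not part of `ami` or `normami`:

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisScale:
    """Linear axis calibrated from two ticks (pixel position, user value)."""
    pixel0: float
    value0: float
    pixel1: float
    value1: float

    def to_user(self, pixel: float) -> float:
        # Linear interpolation/extrapolation between the two calibration ticks.
        if self.pixel1 == self.pixel0:
            raise ValueError("calibration ticks must be at different pixels")
        slope = (self.value1 - self.value0) / (self.pixel1 - self.pixel0)
        return self.value0 + slope * (pixel - self.pixel0)


def points_to_csv(points, x_scale, y_scale):
    """Convert (x_pixel, y_pixel) pairs to user coordinates and return CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["x", "y"])
    for x_pixel, y_pixel in points:
        writer.writerow([x_scale.to_user(x_pixel), y_scale.to_user(y_pixel)])
    return buffer.getvalue()


# Hypothetical calibration (assumed values, not real data):
# x ticks at pixels 100 and 500 read 0 and 10;
# y ticks at pixels 400 and 200 read 0 and 1 (pixel y grows downwards).
x_scale = AxisScale(100.0, 0.0, 500.0, 10.0)
y_scale = AxisScale(400.0, 0.0, 200.0, 1.0)
csv_text = points_to_csv([(100.0, 400.0), (300.0, 300.0)], x_scale, y_scale)
```

Logarithmic or broken axes would need a different `to_user`; the sketch only covers the linear case.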
This data should then be sufficient for repurposing for clients.
PMR output will be CSV and HTML that try to replicate what is visible on the screen, with some indications of reliability.
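One possible way to attach an indication of reliability is to compare the two OCR readings cell by cell. A rough sketch in Python, assuming the output of both engines has already been aligned to the same grid of cells. The function name `merge_cells` and the labels `agree`, `single` and `conflict` are illustrative only and are not taken from `ami`:

```python
def merge_cells(hocr_rows, gocr_rows):
    """Merge two aligned grids of cell strings into (value, reliability) pairs.

    Both arguments are lists of rows with identical shape; an empty string
    means the engine read nothing for that cell.
    """
    if len(hocr_rows) != len(gocr_rows):
        raise ValueError("grids must have the same number of rows")
    merged = []
    for hocr_row, gocr_row in zip(hocr_rows, gocr_rows):
        if len(hocr_row) != len(gocr_row):
            raise ValueError("rows must have the same number of cells")
        merged_row = []
        for hocr_cell, gocr_cell in zip(hocr_row, gocr_row):
            h, g = hocr_cell.strip(), gocr_cell.strip()
            if h == g:
                # Identical readings (including two empty cells).
                merged_row.append((h, "agree"))
            elif not h or not g:
                # Only one engine produced text: keep it but mark it.
                merged_row.append((h or g, "single"))
            else:
                # Readings differ: keep both so a reviewer can decide.
                merged_row.append((f"{h}|{g}", "conflict"))
        merged.append(merged_row)
    return merged


# Made-up example cells, not real OCR output.
merged = merge_cells([["0.05", "12", ""]], [["0.05", "1Z", "7"]])
```

The reliability label could then be written as an extra CSV column or as a class attribute on the HTML cell.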
== What PMR will not currently do ==
- domain-specific analysis of results.
- customisation of use
- client-facing documentation
- refinement of image analysis parameters
- creation of corpora
- develop JS, containers, servers for this project
- implement software on client site.
- respond to alternative corpora.
- write a `clean` facility for `normami` (there is a lot of potential output from a run, especially when different parameters are being used.)
== What PMR will do ==
- attempt to fix runtime bugs
- mentor CG and MD on how to run programs