finalising ami output for forest plots #12

@petermr

ami now carries out most of the required task, and it's my intention to prototype and test the full functionality in the next few days.
The result of running normami will be a large CTree and a set of HTML and CSV files that can be re-used. The missing functionality includes:

  • develop TableExtractor to identify table structure
  • TableExtractor should unify hocr and gocr output to a canonical table format.
  • TableExtractor will attempt to unify the cell content, according to a schema.
  • TableExtractor will apply simple heuristics to detect errors and add @class-based annotation
  • TableExtractor will emit CSV or HTML files for the various components of a plot (i.e. possibly several files)
  • Develop GraphExtractor to extract SVGLines from body.graphs
  • Develop ScaleExtractor to extract numeric scales
  • apply the results of GraphExtractor and ScaleExtractor to convert to a CSV with user coordinates.
  • synchronise tables and graphs to determine consistency of horizontal content lines
  • provide an aggregate view of gocr, hocr and graph values.
  • extract and parse summary data in tables (e.g. Overall P values).
  • allow parameterisation of hocr and gocr as far as I understand it (e.g. to prepare argument lists with whitelists). However, both programs are very poorly documented and fragile, and I shall not research this further. I may open Issues showing the possible tasks.
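
The GraphExtractor / ScaleExtractor step above (converting extracted lines to a CSV in user coordinates) could be sketched roughly as follows. This is a minimal Python illustration, not ami's code: the function names, the `(x1, x2, y)` line tuples and the CSV column names are all hypothetical, and it assumes the scale has already been reduced to two ticks with known pixel positions and values. Forest-plot axes are sometimes logarithmic, so a `log_scale` switch is included as an assumption.

```python
import csv
import io
import math


def make_pixel_to_user(tick_a, tick_b, log_scale=False):
    """Build a pixel-x -> user-value mapping from two scale ticks.

    tick_a, tick_b: (pixel_x, user_value) pairs (hypothetical input format).
    """
    (px_a, val_a), (px_b, val_b) = tick_a, tick_b
    if px_a == px_b:
        raise ValueError("ticks must have distinct pixel positions")
    if log_scale:
        # Interpolate in log10 space for logarithmic axes.
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    slope = (val_b - val_a) / (px_b - px_a)

    def to_user(px):
        value = val_a + slope * (px - px_a)
        return 10 ** value if log_scale else value

    return to_user


def lines_to_rows(lines, to_user):
    """Convert horizontal lines given as (x1, x2, y) pixel tuples to rows.

    Rows are ordered top to bottom (by y) so they can later be matched
    against table rows.
    """
    rows = []
    for x1, x2, y in sorted(lines, key=lambda line: line[2]):
        lower, upper = sorted((to_user(x1), to_user(x2)))
        rows.append({"y_pixel": y,
                     "lower": round(lower, 3),
                     "upper": round(upper, 3)})
    return rows


def rows_to_csv(rows):
    """Serialise rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer,
                            fieldnames=["y_pixel", "lower", "upper"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Two ticks: pixel 100 is 0.0 and pixel 300 is 2.0 (made-up values).
    to_user = make_pixel_to_user((100, 0.0), (300, 2.0))
    print(rows_to_csv(lines_to_rows([(250, 150, 40), (120, 180, 20)], to_user)))
```

Keeping the pixel `y` in each row is a deliberate choice in this sketch: it is what would allow the later synchronisation of graph lines with table rows.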

This data should then be sufficient for repurposing for clients.

PMR output will be CSV and HTML that try to replicate what is visible on the screen, with some indications of reliability.
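
One possible form for such reliability indications is to compare the hocr and gocr readings of the same cell and record the outcome as an HTML class. The sketch below is only an illustration under assumptions: the class names (`agree`, `agree-numeric`, `hocr-only`, `gocr-only`, `conflict`), the function names and the rule of preferring the hocr text are hypothetical and not taken from ami.

```python
import html
import re

# Plain decimal numbers only; anything else is treated as non-numeric.
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_number(text):
    """Return the float value of text, or None if it is not a plain number."""
    text = text.strip()
    return float(text) if _NUMBER.fullmatch(text) else None


def reconcile_cell(hocr_text, gocr_text):
    """Pick a cell value and a reliability class from two OCR readings."""
    h, g = hocr_text.strip(), gocr_text.strip()
    if h == g:
        return h, "agree"
    h_num, g_num = parse_number(h), parse_number(g)
    if h_num is not None and g_num is not None:
        # Both numeric: same value in different notation, or a real conflict.
        return (h, "agree-numeric") if h_num == g_num else (h, "conflict")
    if h_num is not None:
        return h, "hocr-only"   # only hocr produced a parsable number
    if g_num is not None:
        return g, "gocr-only"   # only gocr produced a parsable number
    return h, "conflict"


def to_td(value, css_class):
    """Render a table cell carrying the reliability class."""
    return '<td class="{}">{}</td>'.format(css_class, html.escape(value))


if __name__ == "__main__":
    for pair in [("1.20", "1.2"), ("0.85", "O.85"), ("1.5", "1.6")]:
        print(to_td(*reconcile_cell(*pair)))
```

The same class string could equally be written to an extra CSV column, so that both output formats carry the same indication.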

== What PMR will not currently do ==

  • domain-specific analysis of results.
  • customisation of use
  • client-facing documentation
  • refinement of image analysis parameters
  • creation of corpora
  • develop JS, containers, servers for this project
  • implement software on client site.
  • respond to alternative corpora.
  • write a clean-up facility for normami (a run can produce a lot of output, especially when different parameters are being used)

== What PMR will do ==

  • attempt to fix runtime bugs
  • mentor CG and MD on how to run programs
