Mine For Specimens
####Discover and Dynamically Create Reference Links Between Publications and Catalogued Specimens
##Background
- Currently there is no easy way to link AMNH science publications to specimens in our research collections. We would like a solution that dynamically creates a bibliography matching publication references to specimen numbers in our collection database (KE EMu - http://kesoftware.com/).
- We have scanned specimen catalog cards, but the jpegs are not named by specimen or catalog number, so there is no way to find the card that pertains to a specific specimen. We would like to be able to search the cards for a specimen name or number and, ultimately, attach the jpeg of the card to the corresponding specimen record in the collection database (KE EMu - http://kesoftware.com/).
Scientific staff, researchers, interested public
- Extract specimen numbers/names from digitized Scientific Publications (http://digitallibrary.amnh.org/handle/2246/5), match them to corresponding numbers/names in collections databases, and output a bibliography in XML format that could be imported into KE EMu.
- Extract jpegs labeled with specimen numbers/names from digitized Scientific Publications (http://digitallibrary.amnh.org/handle/2246/5)
- Extract the scientific name and the catalog number (if available) from jpegs of specimen catalog cards and insert this data for each card / catalog page into a spreadsheet, along with the name of the file in which it appears (e.g. p0005_74101.jpg)
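The first challenge above (extract, match, output XML) could be sketched roughly as follows. This is a minimal illustration, not a working KE EMu importer: the specimen number pattern (`AMNH`, an optional department code, then digits), the publication handles, and the XML element names are all assumptions made for the example. The real import schema and the real number formats would need to come from the collections staff.

```python
import re
import xml.etree.ElementTree as ET

# Assumed format: "AMNH", an optional 1-4 letter department code, then digits.
# Real formats differ across departments and must be confirmed.
SPECIMEN_RE = re.compile(r"\bAMNH\s+(?:([A-Z]{1,4})[\s-]+)?(\d{1,7})\b")


def extract_specimen_ids(text):
    """Return normalized specimen IDs in order of first appearance, no duplicates."""
    found = []
    for dept, number in SPECIMEN_RE.findall(text):
        normalized = f"AMNH {dept} {number}" if dept else f"AMNH {number}"
        if normalized not in found:
            found.append(normalized)
    return found


def build_bibliography(publications, catalog_ids):
    """Build an XML string linking each publication to the catalog IDs it cites.

    publications: dict mapping a publication identifier to its full text.
    catalog_ids: set of normalized IDs known to the collection database.
    Element and attribute names here are hypothetical, not the KE EMu schema.
    """
    root = ET.Element("bibliography")
    for handle, text in publications.items():
        matched = [i for i in extract_specimen_ids(text) if i in catalog_ids]
        if not matched:
            continue  # skip publications that cite no known specimen
        pub = ET.SubElement(root, "publication", handle=handle)
        for specimen_id in matched:
            ET.SubElement(pub, "specimen").text = specimen_id
    return ET.tostring(root, encoding="unicode")
```

Checking extracted IDs against a set of known catalog numbers filters out strings that merely look like specimen numbers, at the cost of missing specimens that are cited but not yet catalogued.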
Challenge 2 - typewritten cards: use OCR to read the number off each card, then create a spreadsheet that lists the image name and the relevant number
Challenge 3 - OCR machine readable line numbers in handwritten cards to create
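For the typewritten-card challenge, the spreadsheet step might look something like the sketch below. It assumes the OCR itself has already been run by some external tool and that its plain-text output is available per image file. The rule used to pick out the catalog number (the first standalone run of 3 to 7 digits) is a placeholder assumption, and the column names are hypothetical.

```python
import csv
import re

# Placeholder rule: treat the first standalone run of 3-7 digits as the number.
CATALOG_NO_RE = re.compile(r"\b\d{3,7}\b")


def card_row(filename, ocr_text):
    """Build one spreadsheet row from an image filename and its OCR text."""
    match = CATALOG_NO_RE.search(ocr_text)
    return {"file": filename, "catalog_number": match.group(0) if match else ""}


def write_card_sheet(ocr_by_file, out):
    """Write a CSV with one row per card image to the open text file `out`.

    ocr_by_file: dict mapping image filename to the OCR text of that card.
    """
    writer = csv.DictWriter(out, fieldnames=["file", "catalog_number"])
    writer.writeheader()
    for filename in sorted(ocr_by_file):
        writer.writerow(card_row(filename, ocr_by_file[filename]))
```

Cards where no number is found still get a row with an empty `catalog_number`, so they can be reviewed by hand rather than silently dropped.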
Comprehensive association and linking of different data to become the ultimate record of life on Earth
Identify Latin words in publications
OCR the cards without rescanning them
###AMNH Vertebrate Zoology EMu site
EMu is a comprehensive and flexible collections management system, and is used for multiple divisions of vertebrate zoology at AMNH including Herpetology, Ichthyology, Mammalogy, and Ornithology. It's the best place to begin a search as it crosses formats and location. Downside: for collections-based material, the descriptions may be too general and relevant content may not come up in a search.
- Deep Dive, Stanford
- Get a dump of all specimen names that exist in the catalogs now (dictionary)
- Extract the specimen numbers. Formatting is different across departments
- Do we have the formats? Spreadsheet?
- Getting lists of all specimen numbers?
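The dictionary idea above could be tried along these lines: find candidate two-word Latin names in publication text with a simple pattern, then keep only those that appear in the dump of catalogued names. This is a rough sketch under stated assumptions. The pattern (a capitalized word followed by a lowercase word) is a heuristic that also matches ordinary phrases such as "The frog", which is why the dictionary check is needed. The function name and inputs are hypothetical.

```python
import re

# Heuristic: capitalized genus followed by a lowercase species epithet.
BINOMIAL_RE = re.compile(r"\b([A-Z][a-z]+) ([a-z]{2,})\b")


def find_known_names(text, known_names):
    """Return catalogued scientific names mentioned in text, without duplicates.

    known_names: iterable of names taken from the catalog dump (the dictionary).
    """
    known = {name.lower() for name in known_names}
    found = []
    for genus, epithet in BINOMIAL_RE.findall(text):
        candidate = f"{genus} {epithet}"
        if candidate.lower() in known and candidate not in found:
            found.append(candidate)
    return found
```

This does not handle abbreviated genera (such as "R. pipiens"), subspecies, or OCR errors in the names. Those would need fuzzy matching or additional patterns.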
Possibly use the Sapling Detector codebase
- Text Mining Museum Specimen IDs - http://rossmounce.co.uk/2015/05/19/text-mining-for-museum-specimen-identifiers/
