Nlmatics extracts data from large document sets using retrieval-augmented generation (RAG). It can also be used for RAG search over knowledge bases. It comes with an extensive UI for search, data extraction and PDF viewing. It ingests documents using the llmsherpa/nlm-ingestor backend and indexes them in Elasticsearch, from which they are retrieved using a hybrid search approach.
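This description does not say how the keyword and vector results are combined. As a minimal sketch of one common way to do hybrid retrieval, the example below merges a BM25 ranking and a vector-similarity ranking with reciprocal rank fusion (RRF). RRF is an assumption chosen for illustration, not necessarily the method Nlmatics uses; the function name, the constant `k=60`, and the document ids are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best hit first (illustrative ids only).
bm25_hits = ["doc_a", "doc_b", "doc_c"]    # keyword (BM25) ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # embedding-similarity ranking

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it needs no normalization between BM25 scores and vector similarities, which live on different scales.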
Nlmatics was founded by Ambika Sukla and Bulent Yener.
Nlmatics developed an early RAG-like question answering, semantic search and data extraction pipeline using layout-aware chunking, vector + BM25 indexing and language models. The open-source codebase was developed from 2020 to 2023 by Yi Zhang, Ambika Sukla, Kiran Panicker, Niranjan Borawake, Suhail Kandanur, Wonjun Kang, Reshav Abraham, Nima Sheikholeslami, Lora Johns, Jasmin Omanovic, Karen Reeves, Sonia Joseph, Evan Li, Batya Stein, Cheyenne Zhang, Ashlan Ahmed, Nicholas Greenspan, Connie Xu, Shivangi Jha and others, with product management support from Pooja Reddy, Ambika Sukla and Jan Choy.
Nlmatics is thankful to have worked with prominent early adopters in financial services, legal services and life sciences who recognized and leveraged our technology well before the current wave of generative AI.
Nlmatics raised seed funding from Felix Anthony, Silvertech Ventures, World Trade Ventures and ERS Ventures.