Named entity recognition (NER) is a long-standing task in the NLP community. The goal is to identify and categorize "entities" in text so that they can be used for downstream processing such as argument attachment or event extraction.
Text: The pilot, John Doe, flew over the United States in his airplane.
Named entities: John Doe (PERSON), the United States (GPE)
- Text will be no longer than 500 words.
- Required named entity types are PERSON, GPE (Geopolitical Entity), LOC (Location), ORG (Organization) - you can add more if you choose
- If no named entities are found, return a "No entities found" message.
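
The constraints above can be sketched as a pair of small helpers. This is only an illustration, not a required design: it assumes your model returns entities as `(text, label)` pairs, and the names `check_length`, `summarize_entities`, `MAX_WORDS`, and `REQUIRED_TYPES` are hypothetical and not part of the provided templates.

```python
# Hypothetical helpers; none of these names come from the provided templates.
MAX_WORDS = 500
REQUIRED_TYPES = {"PERSON", "GPE", "LOC", "ORG"}
NO_ENTITIES_MESSAGE = "No entities found"


def check_length(text):
    """Raise ValueError if the text exceeds the 500-word limit.

    Words are counted by splitting on whitespace, which is an assumption;
    the task description does not define how words are counted.
    """
    word_count = len(text.split())
    if word_count > MAX_WORDS:
        raise ValueError(
            f"Text has {word_count} words; the limit is {MAX_WORDS}."
        )


def summarize_entities(entities, allowed_types=REQUIRED_TYPES):
    """Format (text, label) pairs as 'John Doe (PERSON), ...'.

    Entities whose label is not in allowed_types are dropped. If nothing
    remains, the "No entities found" message is returned instead.
    """
    kept = [(text, label) for text, label in entities if label in allowed_types]
    if not kept:
        return NO_ENTITIES_MESSAGE
    return ", ".join(f"{text} ({label})" for text, label in kept)
```

For the example sentence above, `summarize_entities([("John Doe", "PERSON"), ("the United States", "GPE")])` returns `John Doe (PERSON), the United States (GPE)`. If you support additional entity types, pass a larger set as `allowed_types`.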
Because we don't aim to test you on project setup, we have provided templates that you may use if you wish. For the frontend, we've given you templates in Vue.js, React.js, and Angular. For the backend, we have provided a template in Flask.
- Make sure you have Node.js and npm installed.
  - In case you don't have Node.js or npm installed, refer to the NodeSource blog posts Installing Node.js Tutorial: Using nvm (macOS and Ubuntu) or Installing Node.js Tutorial: Windows (Windows) for instructions.
- We also recommend setting up a virtual environment for the Python dependencies. A good one is Miniconda, which you can then use in a manner similar to the following code snippet:
conda create --name web-ner python=3.7
- Run
make install FRONTEND=react-frontend BACKEND=flask-backend
- Run
make start FRONTEND=react-frontend BACKEND=flask-backend
- Code quality - We want to know that you are capable of writing production-level code involving machine learning material.
- Usability - The interface should be intuitive to use for the reviewer.
- Accuracy of model - The model you choose to use should be able to cover the very basics, like recognizing the United States as a GPE. We just want to know that whatever model you choose or implement works.
- Creativity - This is a catch-all category for whatever else you want to incorporate to show off your skills. Some examples could be implementing more entity types or other linguistic features, creating a more visually appealing interface, creating an option to use a different language, or adding better error handling. This is your time to shine.
If you have any questions/comments while working on this, please reach out to your contact at ISI.