Search Engine
Overview
This is an implementation of a search engine for a collection of 418 documents on environmental news from Kaggle. Link to dataset: https://www.kaggle.com/amritvirsinghx/environmental-news-nlp-dataset
The retrieved results are then compared with those returned by Elasticsearch, a standard open-source real-time search engine.
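The README does not say how the two result lists are compared. As a minimal sketch, assuming each engine returns a ranked list of document identifiers, one simple measure is the fraction of the reference engine's top-k results that also appear in this engine's top-k. The function name `overlap_at_k` and the document IDs below are hypothetical and are not taken from the notebook.

```python
def overlap_at_k(ours, reference, k=10):
    """Fraction of the reference top-k that also appears in our top-k."""
    top_ours = set(ours[:k])
    top_ref = set(reference[:k])
    if not top_ref:
        return 0.0
    return len(top_ours & top_ref) / len(top_ref)


# Hypothetical document IDs, for illustration only.
ours = ["d3", "d7", "d1", "d9"]
reference = ["d7", "d3", "d4", "d8"]
print(overlap_at_k(ours, reference, k=4))  # 0.5
```

This measure ignores the order of results within the top k; a rank-aware comparison would need a different metric.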
Directory Structure:
Code Directory:
- Download the dataset from Kaggle and extract it into the following directory under your current working directory: ./content/content/TelevisionNews/
- Run all the cells in the AIR_Assignment_Team39.ipynb file after updating the dataset path to match your setup.
- In the last cell of the AIR_Assignment_Team39.ipynb file, pass the query as the parameter to the query_search function. This function displays the top 10 results retrieved by the search engine.
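The README does not describe how query_search ranks documents. To illustrate what a top-10 retrieval function of this kind might do, here is a minimal sketch that assumes TF-IDF weighting with cosine similarity. It uses only the Python standard library. The names `tokenize`, `build_index`, and `search_sketch`, and the sample documents, are hypothetical and do not reflect the notebook's actual code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (idf, vectors): an IDF table and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    n = len(docs)
    idf = {t: math.log(n / c) + 1.0 for t, c in df.items()}
    vectors = [
        {t: f * idf[t] for t, f in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_sketch(query, idf, vectors, k=10):
    """Return up to k (doc_index, score) pairs, best match first."""
    tf = Counter(tokenize(query))
    qvec = {t: f * idf[t] for t, f in tf.items() if t in idf}
    scored = [(cosine(qvec, v), i) for i, v in enumerate(vectors)]
    ranked = sorted((s for s in scored if s[0] > 0.0), reverse=True)
    return [(i, round(score, 3)) for score, i in ranked[:k]]


# Hypothetical sample documents, for illustration only.
docs = [
    "climate change raises sea levels",
    "stock markets fell sharply today",
    "sea ice melting due to climate change",
]
idf, vectors = build_index(docs)
print(search_sketch("stock markets", idf, vectors))
```

Documents that share no terms with the query are dropped, so fewer than k results may be returned.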
Snapshots Directory:
Contains output screenshots for the executed queries.