GitHub - DanielT504/Movie-rec-data-miner: A movie recommender using database mining and cosine similarities.

This movie recommender uses data-mining to suggest movies to users based on the contents of a database populated by aggregating the dataset links below.

The recommender connects to the database through connect_to_database(), which uses SQLAlchemy to interact with it. An SQL query then fetches data from the MojoBudgetUpdate table and loads the results into a pandas DataFrame for processing.
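The connection and query step might look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in so that it runs on its own; the connection URL, the `fetch_movies` helper, the `title` column, and the sample rows are assumptions made for illustration and are not taken from main.py.

```python
import pandas as pd
from sqlalchemy import create_engine, text


def connect_to_database(url="sqlite:///:memory:"):
    """Return a SQLAlchemy engine. The URL is a placeholder, not the project's."""
    return create_engine(url)


def fetch_movies(conn):
    """Run the SELECT and load the result set into a pandas DataFrame."""
    result = conn.execute(text("SELECT * FROM MojoBudgetUpdate"))
    columns = list(result.keys())
    rows = [tuple(row) for row in result.fetchall()]
    return pd.DataFrame(rows, columns=columns)


engine = connect_to_database()
with engine.connect() as conn:
    # Stand-in table so the sketch is self-contained (hypothetical schema).
    conn.execute(text(
        "CREATE TABLE MojoBudgetUpdate "
        "(title TEXT, genre_1 TEXT, mpaa TEXT, main_actor_1 TEXT, trivia TEXT)"
    ))
    conn.execute(
        text("INSERT INTO MojoBudgetUpdate VALUES (:t, :g, :m, :a, :tr)"),
        [
            {"t": "Movie A", "g": "Action", "m": "PG-13",
             "a": "Actor One", "tr": None},
            {"t": "Movie B", "g": "Comedy", "m": "PG",
             "a": "Actor Two", "tr": "Shot in ten days"},
        ],
    )
    df = fetch_movies(conn)
```

Building the DataFrame from the fetched rows, rather than handing the connection to pandas, keeps the sketch independent of the pandas and SQLAlchemy version pairing.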

To preprocess the data, the algorithm handles missing values and combines several text features from the dataset (genre_1, mpaa, main_actor_1, and trivia) into a single string. This creates a text representation of each movie, making it easier to assess how similar two movies are.
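A minimal sketch of this preprocessing, assuming missing values are replaced with empty strings and the four columns are joined with spaces. The `combine_features` helper, the `combined_features` column name, and the sample rows are hypothetical.

```python
import pandas as pd

# The four text columns named in the description above.
FEATURES = ["genre_1", "mpaa", "main_actor_1", "trivia"]


def combine_features(df):
    """Fill missing values and join the feature columns into one string per movie."""
    filled = df[FEATURES].fillna("").astype(str)
    out = df.copy()
    # Join the four values of each row with single spaces.
    out["combined_features"] = filled.agg(" ".join, axis=1).str.strip()
    return out


movies = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genre_1": ["Action", "Comedy"],
    "mpaa": ["PG-13", None],
    "main_actor_1": ["Actor One", "Actor Two"],
    "trivia": [None, "Shot in ten days"],
})
movies = combine_features(movies)
```

Filling with empty strings means a movie with a missing field still gets a usable string instead of a NaN that would break the vectorizer.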

The text data is then converted into numerical form so that similarity can be calculated, using CountVectorizer() for feature extraction. The result is a sparse matrix of token counts, in which each row records how often each word occurs in one movie's combined string. Using this matrix, the pairwise cosine_similarity() is computed for all movies in the dataset, measuring how similar the documents are regardless of their length.
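The short sketch below illustrates the vectorizing and similarity step with scikit-learn's default CountVectorizer settings (an assumption; the project may configure it differently). The three sample strings are made up. The second string is the first one repeated twice, which shows why cosine similarity ignores document length.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Action PG-13 Actor One",
    "Action PG-13 Actor One Action PG-13 Actor One",  # same text, twice as long
    "Comedy PG Actor Two",
]

# Sparse matrix of token counts: one row per document, one column per token.
count_matrix = CountVectorizer().fit_transform(docs)

# Square matrix where entry [i, j] is the cosine similarity of docs i and j.
similarity = cosine_similarity(count_matrix)
```

Here `similarity[0, 1]` is 1 (up to floating-point rounding) because doubling every count changes the vector's length but not its direction, while `similarity[0, 2]` falls between 0 and 1 because the two strings share only some tokens.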

get_recommendations() takes a movie title as input, identifies its index, and finds the movies that are most similar based on these cosine similarities. The results are sorted in descending order of similarity and the top ten are returned.
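One possible shape for this lookup is sketched below. The signature, the `top_n` parameter, the `title` and `combined_features` column names, and the choice to exclude the queried movie from its own results are all assumptions; the sample data is invented.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "combined_features": [
        "Action PG-13 Actor One",
        "Action PG-13 Actor Two",
        "Comedy PG Actor Three",
        "Action R Actor One",
    ],
})
similarity = cosine_similarity(
    CountVectorizer().fit_transform(movies["combined_features"])
)


def get_recommendations(title, movies, similarity, top_n=10):
    """Return up to top_n titles ranked by cosine similarity to the given title."""
    titles = movies["title"].tolist()
    if title not in titles:
        raise ValueError(f"Unknown title: {title!r}")
    idx = titles.index(title)  # row position of the queried movie
    # Pair each other movie's score with its position (assumption: skip itself).
    scored = [(score, i) for i, score in enumerate(similarity[idx]) if i != idx]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # descending similarity
    return [titles[i] for _, i in scored[:top_n]]


recommended = get_recommendations("Movie A", movies, similarity)
```

With only four movies in the sample, the call returns the three other titles; on the full dataset the default `top_n=10` gives the top ten.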

To run the data-mining application, first execute:

make install

make run

Once finished using the data-mining application, execute:

make clean

Dataset Links:
