CSM (Conventional and Social Media Movies) Dataset

CSM (Conventional and Social Media Movies) Dataset 2014 and 2015 Data Set

Data Set Information

Year: 2014 and 2015
Source: Twitter, YouTube, IMDB

Abstract

| - | - |
| --- | --- |
| Data Set Characteristics | Multivariate |
| Attribute Characteristics | Integer |
| Number of Attributes | 12 |
| Number of Instances | 217 |
| Associated Tasks | Classification, Regression |
| Missing Values? | Yes |

Source

Mehreen Ahmed, Department of Computer Software Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan. mahreenmcs '@' gmail.com

Result

Paper - Using Crowd-Source Based Features from Social Media and Conventional Features to Predict the Movies Popularity

Rows with missing values are dropped. Accuracy is measured on a test subset (20% of instances), the same as in the paper. "Genre, Gross, Budget, Screens, Sequel, Ratings" are taken as the Conventional Features. Gross is then dropped because gross income is not available before release, and Ratings is the target to be predicted.
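The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the rows are made-up stand-in values, the column names are assumed to match the feature list above, and the 20% split is done with a plain NumPy permutation rather than whatever splitter the original scripts use.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows (NOT real CSM data); column names are assumed
# to match the conventional feature list given above.
df = pd.DataFrame({
    "Genre":   [1, 8, 1, 3, 8, 1, 2, 9, 1, 4],
    "Gross":   [1.0e5, 2.0e8, 3.0e7, 1.0e8, 2.0e7, 3.0e4, 4.0e7, 6.0e6, 2.5e7, 5.0e7],
    "Budget":  [4.0e6, 5.0e7, 3.0e7, np.nan, 1.0e7, 1.0e6, 4.0e7, 2.0e7, 3.0e7, 1.0e7],
    "Screens": [50, 3300, 2900, 3500, 2300, np.nan, 3100, 200, 2800, 3000],
    "Sequel":  [1, 2, 1, 2, 1, 1, 1, 1, 1, 1],
    "Ratings": [6.3, 7.1, 6.2, 6.4, 4.7, 6.1, 7.0, 6.5, 6.6, 6.8],
})

# 1. Drop every row that has at least one missing value.
clean = df.dropna()

# 2. Ratings is the target; Gross is excluded because it is unknown before release.
X = clean.drop(columns=["Gross", "Ratings"])
y = clean["Ratings"]

# 3. Hold out 20% of the remaining instances as the test subset.
rng = np.random.default_rng(0)          # fixed seed so the split is repeatable
order = rng.permutation(len(clean))
n_test = int(np.ceil(0.2 * len(clean)))
test_idx, train_idx = order[:n_test], order[n_test:]

X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
```

With the real spreadsheet loaded into `df`, the same three steps would apply unchanged, provided the column headers match the assumed names.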

Accuracy criteria: according to the paper's criteria (Accuracy 2)

| Model | Accuracy | R2 | MAE | MSE | RMSE | Remark |
| --- | --- | --- | --- | --- | --- | --- |
| Linear Regression Scikit Learn | 0.7955 | 0.0446 | 0.6158 | 0.6334 | 0.7958 | - |
| Linear Regression From Scratch | 0.7955 | 0.9554 | 0.6158 | 0.6334 | 0.7958 | Calculate weight only by "inverse" |
| Linear Regression PyTorch NN | 0.7727 | - | 0.6299 | 0.6738 | 0.8208 | Calculate weight by SGD |
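The two weight-fitting approaches named in the Remark column, and the error metrics in the table, can be illustrated with the sketch below. It is an assumption-laden illustration rather than the repository's code: "inverse" is read here as the normal equation w = (XᵀX)⁻¹Xᵀy, the SGD variant is written in plain NumPy as a stand-in for the PyTorch model, all function names are hypothetical, and the data is synthetic. The paper's accuracy criterion is not reproduced because its definition is not given here.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.column_stack([np.ones(len(X)), X])

def fit_normal_equation(X, y):
    """Closed-form least squares via an explicit inverse: w = (X^T X)^-1 X^T y."""
    Xb = add_bias(X)
    return np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y

def fit_sgd(X, y, lr=0.01, epochs=200, batch_size=8, seed=0):
    """Mini-batch stochastic gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(Xb))
        for start in range(0, len(Xb), batch_size):
            idx = order[start:start + batch_size]
            residual = Xb[idx] @ w - y[idx]
            grad = 2.0 * Xb[idx].T @ residual / len(idx)  # gradient of batch MSE
            w -= lr * grad
    return w

def predict(w, X):
    return add_bias(X) @ w

def regression_metrics(y_true, y_pred):
    """R2, MAE, MSE and RMSE using their standard definitions."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": mse ** 0.5,
    }

# Synthetic regression problem (NOT the CSM data): 3 features plus noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_w = np.array([6.0, 0.5, -0.3, 0.2])  # intercept followed by 3 slopes
y = true_w[0] + X @ true_w[1:] + rng.normal(scale=0.1, size=200)

w_closed = fit_normal_equation(X, y)
w_sgd = fit_sgd(X, y)
metrics_closed = regression_metrics(y, predict(w_closed, X))
metrics_sgd = regression_metrics(y, predict(w_sgd, X))
```

RMSE is the square root of MSE, which is consistent with the MSE and RMSE columns in the table. An explicit inverse is used only to mirror the remark; `np.linalg.solve` or `np.linalg.lstsq` is usually the more numerically stable choice.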