CSM (Conventional and Social Media Movies) Dataset 2014 and 2015
Year: 2014 and 2015. Source: Twitter, YouTube, IMDB.

Property | Value |
---|---|
Data Set Characteristics | Multivariate |
Attribute Characteristics | Integer |
Number of Attributes | 12 |
Number of Instances | 217 |
Associated Tasks | Classification, Regression |
Missing Values? | Yes |

Mehreen Ahmed, Department of Computer Software Engineering, National University of Sciences and Technology (NUST), Islamabad, Pakistan. mahreenmcs '@' gmail.com


- Drop rows with missing values.
- Measure accuracy on a held-out test subset (20% of instances), the same as in the paper.
- Take "Genre, Gross, Budget, Screens, Sequel, Ratings" as the conventional features.
- Drop Gross, since gross income is not available before release.
- Use Ratings as the value to be predicted.

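
The preparation steps above could be sketched as follows. This is a minimal illustration, not the original code: the column names are assumed to match the feature names quoted above, the helper name `prepare` and the `seed` parameter are hypothetical, and the split uses a plain random sample because the exact split procedure is not stated.

```python
import pandas as pd

# Assumed column names; the headers in the actual data file may differ.
CONVENTIONAL = ["Genre", "Gross", "Budget", "Screens", "Sequel", "Ratings"]
TARGET = "Ratings"
NOT_KNOWN_BEFORE_RELEASE = ["Gross"]


def prepare(df, test_fraction=0.2, seed=0):
    """Return X_train, y_train, X_test, y_test built from the conventional features."""
    # Drop every row that has at least one missing value
    data = df.dropna()

    # Inputs: conventional features minus the target and minus Gross
    features = [
        c for c in CONVENTIONAL
        if c != TARGET and c not in NOT_KNOWN_BEFORE_RELEASE
    ]

    # Hold out a random 20% of the remaining rows as the test subset
    # (assumes the DataFrame index is unique)
    test = data.sample(frac=test_fraction, random_state=seed)
    train = data.drop(test.index)

    return train[features], train[TARGET], test[features], test[TARGET]
```

Calling `prepare(df)` on the loaded table yields inputs restricted to Genre, Budget, Screens, and Sequel, with Ratings as the target.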

Accuracy criterion: "Accuracy 2", as defined in the paper.

Model | Accuracy | R2 | MAE | MSE | RMSE | Remark |
---|---|---|---|---|---|---|
Linear Regression Scikit Learn | 0.7955 | 0.0446 | 0.6158 | 0.6334 | 0.7958 | - |
Linear Regression From Scratch | 0.7955 | 0.9554 | 0.6158 | 0.6334 | 0.7958 | Weights computed in closed form using the matrix inverse only |
Linear Regression PyTorch NN | 0.7727 | - | 0.6299 | 0.6738 | 0.8208 | Weights fitted by SGD |
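
For reference, a closed-form fit and the four error metrics in the table can be sketched as below. This assumes that computing weights by "inverse" means the normal equation, which is one common reading and is not confirmed by the source. The function names are hypothetical, and `np.linalg.pinv` is swapped in for a plain inverse so that a singular matrix does not raise an error.

```python
import numpy as np


def fit_normal_equation(X, y):
    """Least-squares weights from the normal equation; w[0] is the intercept."""
    # Prepend a column of ones so the first weight acts as the bias term
    Xb = np.column_stack([np.ones(len(X)), X])
    # w = (Xb^T Xb)^+ Xb^T y, with the pseudo-inverse standing in for the inverse
    return np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y


def predict(w, X):
    return w[0] + X @ w[1:]


def regression_metrics(y_true, y_pred):
    """Standard definitions of R2, MAE, MSE, and RMSE."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))                        # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": mse ** 0.5,
    }
```

Under these definitions RMSE is the square root of MSE, and R2 is 1 minus the ratio of residual to total sum of squares, so the ratio on its own is the complement of R2.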