Alphabet Soup offers donations to charitable organizations and wants to ensure that the funds it donates are targeted toward successful projects.
To help select which applicants receive a donation, an analysis was performed on charity_data.csv, a dataset of more than 34,000 organizations that Alphabet Soup has funded in the past. With this data, a neural network model was created to predict which future applicants are likely to be successful if funded by Alphabet Soup.
The initial deep learning model produced an accuracy of 72.79% with a model loss of 56%.
The original model used the ReLU activation function and was trained for 100 epochs. Below is the model structure:
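As a rough illustration, the sketch below shows what this structure could look like in Keras, assuming the two hidden layers of 80 and 30 neurons implied by the Attempt 1 description; the exact notebook code may differ, and X_train_scaled / y_train are assumed names for the preprocessed training data.

```python
# Rough sketch of the original model (assumed: 80- and 30-neuron hidden layers,
# ReLU activations, sigmoid output); not the exact notebook code.
# X_train_scaled and y_train are assumed to come from the preprocessing step.
import tensorflow as tf

nn = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=80, activation="relu", input_dim=len(X_train_scaled[0])),
    tf.keras.layers.Dense(units=30, activation="relu"),
    tf.keras.layers.Dense(units=1, activation="sigmoid"),
])

nn.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
nn.fit(X_train_scaled, y_train, epochs=100)
```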
To optimize the model and reach an accuracy score of 75% or more, three additional models were trained.
Data Processing - preprocessing was the same as for the original model
- Columns "EIN" and "NAME" were dropped as they were not beneficial for a target or feature variables.
- "IS_SUCCESSFUL" column was selected as the target outcome.
- All remaining columns were features: "APPLICATION_TYPE", "AFFILIATION", "CLASSIFICATION", "USE_CASE", "ORGANIZATION", "STATUS", "INCOME_AMT", "SPECIAL_CONSIDERATIONS" and "ASK_AMT".
- Since "APPLICATION_TYPE" and "CLASSIFICATION" columns had more than 10 unique values, both columns required binning after identifying the density of each.
Compiling, Training and Evaluating the Model
- This model was defined with 3 hidden layers: the first with 80 neurons, the second with 30 and the third with 10. All hidden layers used the ReLU activation function and the output layer used Sigmoid. The model has 6,271 total trainable parameters (a quick parameter check is sketched after this list).
- The model accuracy of 72.27% is very similar to the original model, and the model loss remained at 56% using the same 100 epochs.
- An additional third hidden layer with 10 neurons was introduced, with ReLU activation for all hidden layers. Since the original model performed close to 73%, the same structure as the original was tested with an extra layer (and therefore more parameters) to see if that would increase accuracy. Unfortunately, no significant difference was achieved, implying that additional layers are not always necessary.
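As a quick sanity check on the 6,271 figure, the arithmetic below works out the Dense-layer parameter counts; the 43 input features are an assumption inferred from that total, not a number stated above.

```python
# Dense layer parameters = (inputs * units) + units for the bias terms.
# The 43 input features are an inferred assumption, not stated in the report.
layer_shapes = [(43, 80), (80, 30), (30, 10), (10, 1)]
total_params = sum(n_in * units + units for n_in, units in layer_shapes)
print(total_params)  # 6271
```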
Data Processing
- Columns "EIN", "NAME" and "APPLICATION_TYPE were dropped as they were not beneficial for a target or feature variables.
- "IS_SUCCESSFUL" column was selected as the target outcome.
- All remaining columns were features: "AFFILIATION", "CLASSIFICATION", "USE_CASE", "ORGANIZATION", "STATUS", "INCOME_AMT", "SPECIAL_CONSIDERATIONS" and "ASK_AMT".
- The "CLASSIFICATION" column had more than 10 unique values, both columns required binning after identifying the density.
Compiling, Training and Evaluating the Model
- The model was defined with 2 hidden layers: the first with 70 neurons and the second with 20. The activation functions were changed to Tanh for both hidden layers, and the output layer kept Sigmoid. The model has 3,891 parameters in total, compared with 6,271 in Attempt 1 (a sketch of this structure follows this list).
- The model accuracy of 70.72% is lower than both the original model and Attempt 1, and the test loss is about 2 points higher at 58.20%. The model was also trained for 100 epochs.
- To optimize further, an additional column was categorized as a noisy variable since I could not find significance to the output, so it was removed, leaving 35 features. Since the original model's accuracy was similar to Attempt 1, I reverted to keeping only 2 hidden layers. The number of neurons was reduced following the rule of thumb of "2-3 times the number of features". The activation function was changed to Tanh to try to capture negative values as well.
- Given the accuracy score, it is reasonable to conclude that dropping the additional column negatively impacted this model, leaving it with less data to learn from. This suggests that the "APPLICATION_TYPE" data points are significant and relevant to the prediction.
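For reference, here is a compact sketch of the Attempt 2 structure described above, assuming the same preprocessing variable names as the earlier sketch; the compile settings are assumptions.

```python
# Sketch of Attempt 2: two tanh hidden layers (70 and 20 neurons), sigmoid output.
# X_train_scaled is the assumed name of the reduced, scaled feature matrix.
import tensorflow as tf

nn_attempt2 = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=70, activation="tanh", input_dim=len(X_train_scaled[0])),
    tf.keras.layers.Dense(units=20, activation="tanh"),
    tf.keras.layers.Dense(units=1, activation="sigmoid"),
])
nn_attempt2.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
```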
Data Processing
- For this model, the data was preprocessed the same way as the original: only the "EIN" and "NAME" columns were dropped, and the "APPLICATION_TYPE" and "CLASSIFICATION" columns were binned.
Compiling, Training and Evaluating the Model
- The model structure is the same as Attempt 1 in the number of layers and neurons per layer. The activation functions were changed to Sigmoid for all hidden layers as well as the output layer.
- The model accuracy of 72.39% is slightly lower than the original model and Attempt 1; however, the test loss is also lower, at 55.81%.
- When reviewing the validation set for the original model, it achieved between 74-75% accuracy at 84 epochs (a sketch of this check follows this list).
- Based on that validation result, the number of epochs was reduced to 84 for this third model to try to achieve a higher accuracy than the previous attempts. Unfortunately, the accuracy remained in the 72% range.
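One way this kind of check can be done is to train with a validation split and read the best epoch from the Keras history object; the split fraction and variable names below are assumptions.

```python
# Train with a validation split and locate the epoch with the best validation
# accuracy; the 15% split and variable names are assumptions.
history = nn.fit(X_train_scaled, y_train, validation_split=0.15, epochs=100)

val_acc = history.history["val_accuracy"]
best_epoch = val_acc.index(max(val_acc)) + 1  # Keras reports epochs 1-indexed
print(best_epoch, max(val_acc))
```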
After 3 attempts of changing the number of hidden layers, neurons and activation functions, I was still not able to achieve an accuracy score of 75%. Therefore, I defined a function that creates a sequential model with hyperparameter options. The number of epochs was set at 80, and the first layer was allowed between 1 and 80 neurons. Any additional layers for better optimization were limited to between 1 and 6 layers with a maximum of 40 neurons each.
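The behaviour described here is consistent with a Keras Tuner search; the sketch below assumes keras_tuner's Hyperband with the stated ranges (first layer of 1-80 neurons, 1-6 additional layers of at most 40 neurons, up to 80 epochs) and assumed data variable names. It is not the exact AlphabetSoupCharity_Optimization_trial script.

```python
# Sketch of a hyperparameter-search model builder (assumes keras_tuner / Hyperband).
import keras_tuner as kt
import tensorflow as tf

def create_model(hp):
    model = tf.keras.models.Sequential()
    activation = hp.Choice("activation", ["relu", "tanh", "sigmoid"])

    # First hidden layer: 1-80 neurons, as described above.
    model.add(tf.keras.layers.Dense(
        units=hp.Int("first_units", min_value=1, max_value=80, step=5),
        activation=activation,
        input_dim=len(X_train_scaled[0])))

    # 1-6 additional hidden layers with at most 40 neurons each.
    for i in range(hp.Int("num_layers", 1, 6)):
        model.add(tf.keras.layers.Dense(
            units=hp.Int(f"units_{i}", min_value=1, max_value=40, step=5),
            activation=activation))

    model.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

tuner = kt.Hyperband(create_model, objective="val_accuracy", max_epochs=80)
tuner.search(X_train_scaled, y_train, epochs=80, validation_data=(X_test_scaled, y_test))
```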
The script is saved as AlphabetSoupCharity_Optimization_trial and was created in Google Colaboratory for time efficiency. The search ran 136 trials on the given hyperparameters, and the best trial achieved an accuracy of 73.39%.
The best model consists of 4 hidden layers, with 71 neurons in the first layer, trained for 27 epochs and using the ReLU activation function.
Below is the model evaluation:
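For reference, retrieving and evaluating the best tuner model typically looks like the short sketch below; the variable names follow the tuner sketch above and are assumptions.

```python
# Retrieve the best model found by the tuner and evaluate it on the test set.
best_model = tuner.get_best_models(num_models=1)[0]
model_loss, model_accuracy = best_model.evaluate(X_test_scaled, y_test, verbose=2)
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")
```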
Another machine learning model that could be used is the Random Forest classifier. It builds many decision trees on random subsets of the data and features and combines their votes, which could produce similar or better results on this tabular dataset. It would also reduce the time required to run the code.
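As a hedged sketch of that alternative, a Random Forest baseline with scikit-learn could look like the following, assuming the same preprocessed train/test split; the hyperparameters shown are illustrative defaults, not tuned values.

```python
# Random Forest baseline on the same preprocessed data (assumed variable names).
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rf_model = RandomForestClassifier(n_estimators=200, random_state=1)
rf_model.fit(X_train_scaled, y_train)
print(accuracy_score(y_test, rf_model.predict(X_test_scaled)))
```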