Neural_Network_Charity_Analysis

Overview

The purpose of this analysis is to build a binary classification model capable of predicting whether a charity will effectively use the money raised if funded. By leveraging machine learning and neural networks, a deep learning model is generated to analyze metadata of charities previously funded and evaluate the potential success of charities seeking new funding based on those same criteria. After generating the initial neural network model and measuring its efficacy, various adjustments to its inputs and structure are implemented in order to optimize performance.

Results

Data Preprocessing

The IS_SUCCESSFUL variable is the target of the model.
The following variables are the features of the model:
- APPLICATION_TYPE
- AFFILIATION
- CLASSIFICATION
- USE_CASE
- ORGANIZATION
- STATUS
- INCOME_AMT
- SPECIAL_CONSIDERATIONS
- ASK_AMT
The following identification variables are neither targets nor features and are dropped from the input data:
- EIN
- NAME
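
The split described above can be sketched with pandas. This is a minimal illustration rather than the notebook's actual code: the sample rows are made up, only two of the nine feature columns are shown, and the one-hot encoding step via `pd.get_dummies` is an assumption about how the categorical features were turned into numeric inputs.

```python
import pandas as pd

# Tiny made-up sample with the same column roles as the charity dataset;
# the values are illustrative only.
df = pd.DataFrame({
    "EIN": [1, 2, 3],
    "NAME": ["ORG A", "ORG B", "ORG C"],
    "APPLICATION_TYPE": ["T3", "T3", "T5"],
    "ASK_AMT": [5000, 108590, 5000],
    "IS_SUCCESSFUL": [1, 1, 0],
})

# Drop the identification columns, which are neither features nor target.
df = df.drop(columns=["EIN", "NAME"])

# Separate the target from the features.
y = df["IS_SUCCESSFUL"]
X = df.drop(columns=["IS_SUCCESSFUL"])

# One-hot encode the categorical features (assumed encoding step).
X = pd.get_dummies(X)
```

Encoding expands each categorical column into one column per category, which is how nine raw features can become a larger number of model inputs.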

Compiling, Training, and Evaluating the Model

The initial model is structured as follows:
- Layers:
  - First hidden layer: 80 neurons with ReLU as the activation function
  - Second hidden layer: 30 neurons with ReLU as the activation function
  - Output layer: 1 neuron with Sigmoid as the activation function
- A total of 110 neurons were used across the two hidden layers, as that is roughly 2.5x the number of input features (43)
- The ReLU activation function was selected for both hidden layers as all feature values are positive
- The Sigmoid activation was selected for the output as the model is a binary classifier
- The original model had an accuracy of .7298, falling short of the .75 target performance
- Three attempts were made to optimize performance and achieve an accuracy > .75
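
The initial architecture described above can be sketched in Keras as follows. The layer sizes, activations, and input width come from the description; the loss function and optimizer are assumptions, since they are not stated here, and binary cross-entropy with Adam is simply a common pairing for a sigmoid output.

```python
import tensorflow as tf

n_input_features = 43  # number of input features reported above

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_input_features,)),
    tf.keras.layers.Dense(80, activation="relu"),    # first hidden layer
    tf.keras.layers.Dense(30, activation="relu"),    # second hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
])

# Loss and optimizer are assumed, not taken from the original notebook.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```

With these layer sizes the network has 43*80 + 80 = 3,520 parameters in the first hidden layer, 80*30 + 30 = 2,430 in the second, and 30 + 1 = 31 in the output layer.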
- The different approaches taken to improve performance are outlined below:
  - Attempt 1: dropping the potentially noisy variables, STATUS and SPECIAL_CONSIDERATIONS, from features
  - Attempt 2: using the preprocessed and scaled data from the original attempt, auto-optimize the model by leveraging keras-tuner to get the best hyperparameters for the model
  - Attempt 3: adjusting the original input data by using different bin values and binning an additional variable
    - The APPLICATION_TYPE count threshold was raised to 1,000 from 500
    - The CLASSIFICATION count threshold was lowered to 500 from 1,500
    - The INCOME_AMT variable was binned using a count value threshold of 3,000
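
The count-threshold binning used in Attempt 3 can be illustrated with a short, dependency-free sketch. The helper name `bin_rare_categories`, the `"Other"` label, and the choice to keep categories whose count equals the threshold are all assumptions made for illustration; the toy data uses a threshold of 3 rather than the 500 to 3,000 range applied to the real dataset.

```python
from collections import Counter

def bin_rare_categories(values, min_count, other_label="Other"):
    """Replace categories that occur fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

# Toy APPLICATION_TYPE column: T3 appears 5 times, T4 3 times, T9 once.
application_types = ["T3"] * 5 + ["T4"] * 3 + ["T9"] * 1

# With a threshold of 3, only T9 is rare enough to be folded into "Other".
binned = bin_rare_categories(application_types, min_count=3)
```

Raising the threshold folds more categories into the catch-all bin and reduces the number of one-hot columns, while lowering it keeps more categories distinct.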

Summary

None of the three attempts to optimize the model's performance resulted in a meaningful change to accuracy, and all three versions failed to achieve the target accuracy of .75. The results of each attempt are summarized below:

Attempt 1: dropping potentially noisy variables caused performance to fall slightly, resulting in an accuracy of .7279
Attempt 2: auto-optimizing the hyperparameters led to the best performance with an accuracy of .7326
Attempt 3: adjusting and adding bins resulted in the worst performance with an accuracy of .7275
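
To make the size of these differences concrete, the reported accuracies can be compared against the original model and the target. The numbers below are the ones listed above; the variable names are only illustrative.

```python
TARGET = 0.75
BASELINE = 0.7298  # accuracy of the original model

attempts = {
    "Attempt 1": 0.7279,
    "Attempt 2": 0.7326,
    "Attempt 3": 0.7275,
}

for name, acc in attempts.items():
    change = acc - BASELINE  # movement relative to the original model
    gap = TARGET - acc       # distance still remaining to the target
    print(f"{name}: accuracy={acc:.4f}, change={change:+.4f}, gap to target={gap:.4f}")

best = max(attempts, key=attempts.get)
```

Every attempt moves accuracy by less than three tenths of a percentage point in either direction, while the gap to the target stays close to two percentage points.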

Given the inability to achieve the target performance using the methods summarized above on the existing set of organizational metadata, I would recommend examining the dataset before further attempts to optimize the deep learning model are made. A deeper understanding of the meaning and impact of each variable currently contained in the metadata might allow for better decisions regarding variable inclusion and binning during the data preprocessing phase. If feasible, collecting more financial data, whether in continuous or categorical form, might strengthen the ability to predict if future funding will be used effectively.

In addition to evaluating the quality of the dataset currently used to classify outcomes, a Support Vector Machine (SVM) was applied to this analysis. An SVM's ability to handle non-linear data in binary classification problems could make it an appropriate choice for predicting the IS_SUCCESSFUL target. While SVMs can often outperform deep learning models when dealing with straightforward binary classification, applying an SVM model to the same dataset used for the original model resulted in an accuracy of .723, a result in line with all the deep learning models evaluated.
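
For reference, fitting an SVM classifier with scikit-learn generally looks like the sketch below. It is not the code from the SVM notebook: the data is synthetic (generated with `make_classification`), and the RBF kernel and default hyperparameters are assumptions, so the resulting accuracy will not match the .723 reported above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed charity features and target.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)

svm = SVC(kernel="rbf")  # kernel choice is an assumption
svm.fit(scaler.transform(X_train), y_train)

accuracy = svm.score(scaler.transform(X_test), y_test)
print(f"SVM test accuracy: {accuracy:.3f}")
```

Scaling matters for SVMs because the kernel depends on distances between samples, so features with large ranges such as ASK_AMT would otherwise dominate.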

The similar performance achieved by the SVM suggests that a deep learning model is not necessarily required to solve this classification problem. However, the SVM's failure to improve on that accuracy reinforces the conclusion that the classification dataset should be reevaluated if a higher accuracy is required.
