Added Data Preprocessing, Outlier Detection, and Visualization #3

Open
wants to merge 7 commits into
base: main


@54J4N 54J4N commented Apr 28, 2024

Data Preprocessing: Handling missing values and outliers

Handling Missing Values:
The script applies pd.to_numeric with errors='coerce' to convert non-numeric values to NaN, and then removes the missing values using dropna(). Note that dropna() drops rows containing NaN by default; dropping columns instead requires axis=1.
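
A minimal sketch of the coercion and drop step described above. The sample frame and the column names `a` and `b` are hypothetical, since the PR's dataset is not shown here; the sketch only illustrates how `errors='coerce'` and the `axis` argument of `dropna()` behave.

```python
import pandas as pd

# Hypothetical sample data; the PR's actual dataset is not shown.
df = pd.DataFrame({
    "a": ["1", "2", "x", "4"],
    "b": ["0.5", "1.5", "2.5", "3.5"],
})

# Coerce every column to numeric; entries that cannot be parsed become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Default behaviour (axis=0): drop every row that contains a NaN.
rows_dropped = numeric.dropna()

# axis=1: drop every column that contains a NaN instead.
cols_dropped = numeric.dropna(axis=1)
```

Dropping rows keeps both features but loses one observation, while dropping columns keeps every observation but loses the whole `a` feature, so the choice depends on how sparse the bad values are.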

Handling Outliers:
Outliers are detected using the Isolation Forest algorithm from scikit-learn.
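
For illustration, outlier detection with scikit-learn's `IsolationForest` might look roughly like the sketch below. The synthetic data, the column names, and the `contamination=0.05` setting are assumptions made for this example, not values taken from the PR's script.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical data: 200 points near the origin plus 5 points far away.
inliers = rng.normal(0, 1, size=(200, 2))
far_points = rng.uniform(8, 10, size=(5, 2))
df = pd.DataFrame(np.vstack([inliers, far_points]), columns=["x", "y"])

# contamination is the assumed share of outliers; 0.05 is an illustrative guess.
model = IsolationForest(contamination=0.05, random_state=0)

# fit_predict returns 1 for inliers and -1 for outliers.
labels = model.fit_predict(df)

# Keep only the rows labelled as inliers.
df_clean = df[labels == 1]
```

The `contamination` value directly controls how many rows get flagged, so it is worth checking against domain knowledge rather than leaving it at an arbitrary number.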

Model Selection: Testing various algorithms to identify the best-performing model:
The script uses only the Isolation Forest algorithm, and only for outlier detection. It does not compare multiple algorithms, so no model selection is performed.

Data Visualization: Creating insightful visualizations for better understanding:
The script includes various visualizations such as correlation heatmaps, pair plots, histograms, and boxplots, which help in understanding the data and identifying patterns.
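
The four plot types mentioned above could be produced along these lines. This is a sketch using only pandas and matplotlib on hypothetical random data; the PR's script may well use a different plotting library (for example seaborn), and the column names here are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(0)
# Hypothetical numeric data standing in for the cleaned dataset.
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])

# Correlation heatmap: pairwise Pearson correlations drawn as an image.
corr = df.corr()
fig_corr, ax_corr = plt.subplots()
im = ax_corr.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_corr.set_xticks(range(len(corr.columns)))
ax_corr.set_xticklabels(corr.columns)
ax_corr.set_yticks(range(len(corr.columns)))
ax_corr.set_yticklabels(corr.columns)
fig_corr.colorbar(im, ax=ax_corr)

# Pair plot: scatter plot for every pair of columns.
pair_axes = scatter_matrix(df, figsize=(6, 6))

# Histograms: one per column.
hist_axes = df.hist(bins=20)

# Boxplots: one box per column on a shared axis.
fig_box, ax_box = plt.subplots()
box_ax = df.boxplot(ax=ax_box)
```

In an interactive session, calling `plt.show()` at the end (with the `matplotlib.use("Agg")` line removed) displays the figures.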
Attached images: 2, 3, 4, 5, correlation1, output
