Rkarande1/K-mean-Clustering

K-mean-Clustering

This code runs K-Means clustering on a customer dataset loaded from a CSV file. It first reads the data into a Pandas DataFrame, selects the columns that represent customer features, and converts them into a NumPy array called X.

It then runs K-Means in a loop for cluster counts from 2 to 10 and records the Within-Cluster Sum of Squares (WCSS) for each count. Plotting WCSS against the number of clusters reveals an "elbow point", the count beyond which adding more clusters yields only small reductions in WCSS. That point is taken as the most suitable number of clusters, balancing cluster tightness against over-segmentation.

Finally, K-Means is run once more with the chosen number of clusters (5 in this case), and the resulting cluster assignments are stored in y_kmeans. The code ends with a scatter plot that draws each cluster in a distinct color, showing how customers group by their characteristics. This kind of clustering analysis is useful for customer segmentation and tailored marketing strategies.
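The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: it assumes scikit-learn's `KMeans` is the implementation in use, reads WCSS from its `inertia_` attribute, and substitutes synthetic data from `make_blobs` for the customer CSV, whose file name and column names are not given here. The `plot_results` helper, the `random_state` values, and the `n_init` setting are likewise illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the customer features (assumption: the real code
# builds X from selected columns of a DataFrame loaded from the CSV file).
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=0.8, random_state=42)

# Record WCSS (exposed by scikit-learn as `inertia_`) for 2 to 10 clusters.
cluster_counts = list(range(2, 11))
wcss = []
for k in cluster_counts:
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)

# Final fit with the cluster count chosen from the elbow plot (5 here).
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
y_kmeans = kmeans.fit_predict(X)  # one cluster label per row of X


def plot_results(X, y_kmeans, cluster_counts, wcss):
    """Draw the elbow curve and the cluster scatter plot (needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, (ax_elbow, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

    # Elbow plot: WCSS against the number of clusters.
    ax_elbow.plot(cluster_counts, wcss, marker="o")
    ax_elbow.set_xlabel("Number of clusters")
    ax_elbow.set_ylabel("WCSS")
    ax_elbow.set_title("Elbow method")

    # Scatter plot: first two feature columns, colored by cluster label.
    ax_scatter.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap="viridis", s=20)
    ax_scatter.set_title("Customer clusters")

    plt.show()
```

Calling `plot_results(X, y_kmeans, cluster_counts, wcss)` displays both plots. The elbow is read by eye from the left-hand panel, so the value 5 is a judgment call on the data and not something the loop computes.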
