Fuzzy clustering C implementation for MATLAB (FCM, Gustafson-Kessel, clustering validity, extrapolation with presumed cluster centers)
The only planned changes at this time are to improve the documentation, i.e. this README file.
This repository provides an efficient C implementation of two types of Fuzzy c-Means Clustering algorithms and several related tools that can be used in clustering evaluation and validation. Once installed, the provided functions are directly callable from within MATLAB. To use them outside of MATLAB, one would need to modify the existing C code. Additional implementation details that may be useful are included in the Implementation section below.
The following functionality is supported:
- Fuzzy c-Means Clustering (based on the work of $\textcolor{#00c000}{\text{James C. Bezdek}}$ and $\textcolor{#00c000}{\text{Enrique Ruspini}}$)
  - Euclidean, diagonal, and Mahalanobis distance metric
- $\textcolor{#00c0c0}{\text{Gustafson-Kessel}}$ fuzzy c-means clustering
- $\textcolor{#00c0c0}{\text{Initial fuzzy partition matrix}}$ generation
- $\textcolor{#00c0c0}{\text{12 Clustering validity functionals:}}$
  - partition coefficient, partition entropy, nonfuzzy index, minimum and mean hard tendencies, minimum and maximum relative fuzziness, the minimum nearest maximum membership cardinality, compactness and separation index, fuzzy hypervolume, average partition density, and the partition density of a resulting fuzzy partition
- $\textcolor{#00c0c0}{\text{Fuzzy partition matrix extrapolation}}$ that uses the presumed cluster centers
- $\textcolor{#00c0c0}{\text{Fuzzy scatter and covariance matrices}}$ calculation for fuzzy clusters
- $\textcolor{#00c0c0}{\text{Test code}}$ (in MATLAB) to verify the proper installation of software
- $\textcolor{#00c0c0}{\text{Detailed help text}}$ describing how to use the provided functions
The primary motivation for this work was to estimate membership functions of the Temporal Fuzzy Sets which span the Fuzzy Information Space. Nevertheless, the provided software package may be used for solving any problem that requires fuzzy c-means clustering.
To use this software one must have MATLAB installed and must be able to run the mex command to generate the accelerated versions of the three main algorithms from their C source code.
Once you have verified that MATLAB is installed and that you can successfully run the `mex` command to build MATLAB functions implemented in C, you may proceed with the following steps:
- Clone or download this repository to the system where you have MATLAB installed
- Copy the contents of the `./src` directory to a folder where you want MATLAB to access it
  - As an example, let us assume you copied all the files from `./src` to `D:\MyMatlab\fcmclt` on a Windows system.
- Start MATLAB and within MATLAB change directory to `D:\MyMatlab\fcmclt`: `>> cd D:\MyMatlab\fcmclt`
- Run the following three `mex` commands: `mex extfpm.c`, `mex fcmc.c`, and `mex gkfcmc.c`
- You should see three new files created: `extfpm.<mexext>`, `fcmc.<mexext>`, and `gkfcmc.<mexext>`, where `<mexext>` reflects the mex file extension on your system. On a 64-bit Windows system, it should be `mexw64`.
- You can now test your installation by running the provided test routines, e.g. try running `>> testfcmc` from the same folder.
- Try all other test routines. They have the prefix `test` or `tst`.
  - NOTE: Some may display a warning indicating the dataset may be too small for reliable clustering. That is normal and expected behavior for those examples.
If you want the fcmclt package to be available no matter which folder you are in, you may add `D:\MyMatlab\fcmclt` to your MATLAB path, for example with `addpath`. Consult the MATLAB documentation if you do not know how to do this.
The best way to learn how to use the functions provided in this package is to read their documentation. You may run `>> help <function-name>` to get the function description. In addition to the help text for the three main algorithms (`fcmc`, `gkfcmc`, and `extfpm`), you should consult the MATLAB source code for the test routines provided in the `.m` files. They provide examples of how to integrate these routines into your programs, how to initialize the algorithms, and how to evaluate the results.
The source code may also provide references to additional reading material. The notation used in the source code is based primarily on this book:
- $\textcolor{#00c000}{\text{J.C. Bezdek,}}$ "Pattern Recognition with Fuzzy Objective Function Algorithms," Plenum Press, New York, 1981.
Support is not provided for this package.
The initial implementation of this package was done between 1992 and 1995. Only a handful of minor changes have been made since then. Those were limited to accommodating the MEX API changes introduced by MathWorks, Inc. over a long period of time (25+ years).
The code in the `./src` folder contains C and MATLAB sources. The C files are:
- `extfpm.c`: Extrapolates the fuzzy partition matrix using the presumed cluster centers
- `fcmc.c`: Implements the Fuzzy c-Means Clustering (FCMC) algorithm
- `gkfcmc.c`: Implements the Gustafson-Kessel (GK) variant of the FCMC algorithm
The relevant MATLAB files are:
- `Contents.m`: Contains help text for the `fcmclt` package.
- `fcmcinit.m`: Very important routine that is used to generate the initial fuzzy partition matrix U0.
- `cltvalid.m`: Calculates clustering validity functionals
- `fscat.m`: Calculates fuzzy scatter and covariance matrices for a fuzzy cluster
- `testextfpm.m`: Example of how to extrapolate the fuzzy partition matrix from the presumed cluster centers
- `testfcmc.m`: Example of how to use the FCMC algorithm
- `testgk.m`: Example of the GK variant for two Gaussian classes
- `testgk2.m`: Example of GK for Gustafson's cross
- `tstvalid.m`: Example of how to use validity functionals for the FCMC algorithm
The C functions include only two header files: `math.h` and `mex.h`. The only two core MATLAB algorithms invoked from within the C code are the matrix inverse and the determinant.
For brevity, only the fcmc.c code structure is outlined here:
- The main function called by MATLAB is `mexFunction()`
  - Checks the input arguments and applies the defaults for optional ones as needed
  - Fetches pointers to vector data
  - Allocates some memory
  - Prepares for the FCMC routine
  - Selects the distance metric (may allocate additional memory and may invoke MATLAB inverse function in case of Mahalanobis metric)
  - Executes the FCMC algorithm (`do_fcm()`)
  - Checks and creates output variables
- All other functions are written in plain C code
In essence, `mexFunction()` is a MATLAB wrapper for the algorithms that are to be accelerated.
NOTE: If you would like to use the C code outside of MATLAB, you may need to do a few things:
- Make it re-entrant. That is, you would need to instantiate the variables that are currently global or static. That was not a problem for the MATLAB implementation, but may create problems if you try to integrate this code somewhere else.
- You would need to provide compatible implementations for matrix inverse and determinant MATLAB functions in case you have to use the parts of code that depend on these.
- Finally, you would need to replace the `mexFunction()` implementation with your own wrapper that would handle memory management, optional input arguments, etc. The memory management you may need to implement may substantially differ from the way MATLAB manages memory, especially when it comes to allocating and deallocating memory.
The code in gkfcmc.c and extfpm.c files is structured in a similar way.
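As a rough illustration of the first two points in the note above, the sketch below keeps all scratch state in a caller-owned context instead of global or static variables, and provides a plain C stand-in for the matrix inverse and determinant. Everything here is hypothetical: the `linalg_ctx` type and the function names do not exist in this package, and Gauss-Jordan elimination with partial pivoting is simply one common choice, not necessarily what MATLAB does internally.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-caller context: state that might otherwise live in
 * global or static variables is owned by the caller instead. */
typedef struct {
    size_t p;     /* matrix dimension */
    double *work; /* p*p scratch copy of the input matrix */
} linalg_ctx;

/* Returns 0 on success, -1 if memory could not be allocated. */
int linalg_ctx_init(linalg_ctx *ctx, size_t p)
{
    ctx->p = p;
    ctx->work = malloc(p * p * sizeof *ctx->work);
    return ctx->work != NULL ? 0 : -1;
}

void linalg_ctx_free(linalg_ctx *ctx)
{
    free(ctx->work);
    ctx->work = NULL;
}

static void swap_rows(double *a, size_t p, size_t r1, size_t r2)
{
    for (size_t j = 0; j < p; ++j) {
        double t = a[r1 * p + j];
        a[r1 * p + j] = a[r2 * p + j];
        a[r2 * p + j] = t;
    }
}

/* Gauss-Jordan elimination with partial pivoting.
 * Writes the inverse of the p-by-p row-major matrix a into inv and
 * returns det(a). Returns 0.0 for a singular matrix, in which case the
 * contents of inv are unspecified. */
double invert_with_det(linalg_ctx *ctx, const double *a, double *inv)
{
    assert(ctx != NULL && ctx->work != NULL);
    size_t p = ctx->p;
    double *w = ctx->work;
    double det = 1.0;

    for (size_t i = 0; i < p; ++i)
        for (size_t j = 0; j < p; ++j) {
            w[i * p + j] = a[i * p + j];
            inv[i * p + j] = (i == j) ? 1.0 : 0.0;
        }

    for (size_t col = 0; col < p; ++col) {
        size_t piv = col; /* row with the largest entry in this column */
        for (size_t r = col + 1; r < p; ++r)
            if (fabs(w[r * p + col]) > fabs(w[piv * p + col]))
                piv = r;
        if (w[piv * p + col] == 0.0)
            return 0.0;
        if (piv != col) {
            swap_rows(w, p, piv, col);
            swap_rows(inv, p, piv, col);
            det = -det; /* a row swap flips the sign of the determinant */
        }
        double d = w[col * p + col];
        det *= d;
        for (size_t j = 0; j < p; ++j) {
            w[col * p + j] /= d;
            inv[col * p + j] /= d;
        }
        for (size_t r = 0; r < p; ++r) {
            if (r == col)
                continue;
            double f = w[r * p + col];
            for (size_t j = 0; j < p; ++j) {
                w[r * p + j] -= f * w[col * p + j];
                inv[r * p + j] -= f * inv[col * p + j];
            }
        }
    }
    return det;
}
```

Because each caller owns its own `linalg_ctx`, two threads can invert matrices at the same time without sharing scratch memory. For ill-conditioned covariance matrices, a dedicated linear algebra library would likely be a safer choice than this sketch.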
There are no plans to expand this package with additional features.
Pull requests will only be considered for the following contributions:
- Bug fixes (if you can find any)
- Interesting examples that show how to use the provided functions
- Please try to provide a single self-contained MATLAB file that does not require any data files. If additional MATLAB toolboxes are required, please list them all.
- $\textcolor{red}{\text{Document your code!}}$ If your code cannot be easily reviewed, it will be swiftly rejected no matter how great you think it is.
The software provided here has been developed by Bogdan Kosanovic in the early 1990s during his Ph.D. work at the University of Pittsburgh.
The author would like to acknowledge fcmclt package work for
MATLAB 6.5 R13. If we are to trust the LinkedIn, Dr. Hamadicharef is now (2022) with
the