DBCV value differs from the Matlab implementation on a dataset whose cluster labels are all 0 #9

@davidechicco

Hi Felipe,
Thanks again for fixing the issues in your software package in December.
I am writing to you today because I found another case where your dbcv() function produces a result that differs from the Matlab implementation's.

I applied your dbcv() function to the dataset file DB1_with_307_clusters.csv, whose last (rightmost) column contains the cluster labels, which are all zeros.

I used this piece of code:

import pandas as pd
import dbcv

# Load the dataset
data_file_name = 'DB1_with_307_clusters.csv'
df = pd.read_csv(data_file_name)
print("data file name read: ", data_file_name)

# Drop every row whose cluster label is 2 or greater
df = df.drop(df[df.cluster >= 2].index)

# Features: all columns except the last two; labels: the last column
these_data = df.iloc[:, :-2]
these_clusters = df.iloc[:, -1]
this_dbcv = dbcv.dbcv(these_data, these_clusters, check_duplicates=False)
print("FelSiq/DBCV = ", this_dbcv)

The result is 0.999, which is clearly wrong when every point carries the same cluster label. A collaborator of mine ran the original DBCV Matlab implementation on the same data and obtained 0.
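
In case it helps with reproducing the problem, here is a minimal sketch of a guard for the single-cluster case. It is only an illustration, not how dbcv() works internally: the helper names count_clusters and guarded_score are hypothetical, treating -1 as a noise label is an assumption, and returning 0 simply mirrors the value my collaborator obtained from the Matlab implementation.

```python
from typing import Callable, Iterable


def count_clusters(labels: Iterable[int], noise_label: int = -1) -> int:
    """Count distinct cluster labels, ignoring the (assumed) noise label."""
    return len({int(label) for label in labels if int(label) != noise_label})


def guarded_score(data, labels, score_fn: Callable[..., float]) -> float:
    """Return 0.0 when fewer than two clusters are present.

    Otherwise delegate to score_fn(data, labels). The 0.0 fallback is an
    assumption that mirrors the value reported for the Matlab implementation.
    """
    if count_clusters(labels) < 2:
        return 0.0
    return float(score_fn(data, labels))


# All labels are zero, so only one cluster exists and the scorer is skipped.
print(count_clusters([0, 0, 0, 0]))
print(guarded_score([[0.0], [1.0]], [0, 0], lambda X, y: 0.999))
```

With the code above, the call would become guarded_score(these_data, these_clusters, lambda X, y: dbcv.dbcv(X, y, check_duplicates=False)).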

Can you please investigate this problem?

Thanks

-- Davide
