kcisgroup/SAMM

Recognizing Underlying Patterns in Categorical Data via Symbolization and Masking Mechanisms

- Depending on your transformers toolkit version, the BERT import code may need to be adjusted as follows:
+ from transformers.modeling_bert import BertPreTrainedModel, BertPooler
+ --> from transformers.models.bert.modeling_bert import BertPreTrainedModel, BertPooler
- (Please check your transformers toolkit version and update the import code accordingly.)
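
One way to avoid editing the import by hand is to try both module paths in order. The following is only a minimal sketch, not part of the repository: the helper names `import_first_available` and `load_bert_classes` are hypothetical, and the two module paths are the ones quoted in the note above.

```python
import importlib

# Candidate module paths taken from the note above, newer layout first.
BERT_MODULE_CANDIDATES = (
    "transformers.models.bert.modeling_bert",
    "transformers.modeling_bert",
)


def import_first_available(candidates):
    """Return the first module in `candidates` that can be imported."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported:\n" + "\n".join(errors)
    )


def load_bert_classes():
    """Hypothetical helper: fetch the two classes named in the note above."""
    module = import_first_available(BERT_MODULE_CANDIDATES)
    return module.BertPreTrainedModel, module.BertPooler
```

Calling `load_bert_classes()` requires the transformers toolkit to be installed; if neither path can be imported, the raised `ImportError` lists every path that was tried.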

How to run the code?

After downloading the code, you can run

python3 run.py

directly to perform categorical data clustering. We suggest trying several hyperparameter settings to achieve better results.

What are the scripts used for?

(1) LM/BertForMaskedLM: Contains the model structure and configuration of BERT.

(2) make_dataset: Data processing. Prepares the training set.

(3) models: Defines the network structure of SAMM.

(4) utils: Contains functions for data processing and model evaluation.

Several toolkits may be needed to run the code

(1) pytorch (https://anaconda.org/pytorch/pytorch)

(2) sklearn (https://anaconda.org/anaconda/scikit-learn)

(3) transformers (https://anaconda.org/conda-forge/transformers)
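
Before running the code, it can help to confirm which of these toolkits are installed and in which versions. The sketch below is an assumption-laden convenience, not part of the repository: the helper `installed_versions` is hypothetical, and the distribution names `torch`, `scikit-learn`, and `transformers` are the usual names under which these toolkits register themselves, which may differ in some environments.

```python
from importlib import metadata

# Assumed distribution names for the three toolkits listed above.
REQUIRED_DISTRIBUTIONS = ("torch", "scikit-learn", "transformers")


def installed_versions(names=REQUIRED_DISTRIBUTIONS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

The reported transformers version indicates which of the two import paths mentioned earlier is likely to apply.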
