Commit 209d643

First commit
0 parents  commit 209d643

430 files changed, +1138809 -0 lines changed

.gitignore

+5
@@ -0,0 +1,5 @@
__pycache__/
output_bfs*
*.zip
.idea/
backup_stuff/

README.md

+28
@@ -0,0 +1,28 @@
# SigDirect
This is a Python implementation of the SigDirect classifier [1].
The classifier has only one hyper-parameter to tune:
the p-value threshold for statistical significance,
which is set to 0.05 (you can change it in the config file, though this is the value used in most scientific works).

## Running
First, install the dependencies listed in requirements.txt:
```
pip3 install -r requirements.txt
```

You can test the classifier on one of the provided UCI datasets:
```
python3 sigdirect_test.py iris
```
The classifier is used in much the same way as scikit-learn classifiers:
instantiate it first, then call the ```fit``` method to train a model.
To get predictions, call either the ```predict``` or the ```predict_proba``` method:
the former returns class labels and the latter returns probability distributions over the classes.

The input to the ```fit``` method should be a 2-d numpy array in which each feature is either 0 or 1.
The labels should be a 1-d array of integers (from 0 to n-1 for n classes).

Note: You can pass ```get_logs=sys.stdout``` or ```get_logs=sys.stderr``` as an argument to the constructor
of the classifier so that it prints some logs while building the model.

[1]: Li, Jundong, and Osmar R. Zaiane. "Exploiting statistically significant dependent rules for associative classification." Intelligent Data Analysis 21.5 (2017): 1155-1172.
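
The input format described in the README (binary features, integer labels from 0 to n-1) can be produced from categorical data along the following lines. This is a minimal sketch, not part of the commit: `binarize` and `encode_labels` are hypothetical helper names, and the classifier's import path and constructor are deliberately omitted because this diff does not show them.

```python
# Sketch: turn categorical records into the 0/1 feature matrix and the
# 0..n-1 labels that the README describes. Helper names are hypothetical.

def binarize(records):
    """One-hot encode rows of categorical values into rows of 0/1 integers."""
    # Assign one output column to every (column index, value) pair in the data.
    columns = sorted({(j, value) for row in records for j, value in enumerate(row)})
    index = {pair: k for k, pair in enumerate(columns)}
    matrix = []
    for row in records:
        encoded = [0] * len(columns)
        for j, value in enumerate(row):
            encoded[index[(j, value)]] = 1
        matrix.append(encoded)
    return matrix, columns


def encode_labels(labels):
    """Map arbitrary class names to integers 0..n-1 (ordered by name)."""
    classes = sorted(set(labels))
    lookup = {name: i for i, name in enumerate(classes)}
    return [lookup[name] for name in labels], classes


records = [("red", "small"), ("blue", "small"), ("red", "large")]
labels = ["yes", "no", "yes"]

X, columns = binarize(records)
y, classes = encode_labels(labels)

# With numpy installed, numpy.asarray(X) and numpy.asarray(y) give the 2-d and
# 1-d arrays that fit() expects according to the README.
print(X)  # [[0, 1, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]
print(y)  # [1, 0, 1]
```

Keeping `columns` and `classes` around makes it possible to map mined rules and predicted labels back to the original feature values and class names.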

__init__.py

Whitespace-only changes.

config.py

+10
@@ -0,0 +1,10 @@
random_seed = 1

ALPHA = 0.05
ALPHA_LOG = -3.0000

DEFAULT_LAYER_SIZE = 1

BASE_VALUE_THRESHOLD = 1000000

BATCH_SIZE = 1000
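
The two significance constants above appear to be related: `ALPHA_LOG` looks like a rounded natural logarithm of `ALPHA`. That is an assumption, since the code that consumes these values is not shown in this diff. Under that assumption, the sketch below shows how a threshold check in log space might behave; `is_significant` is a hypothetical helper, not a function from the repository.

```python
import math

ALPHA = 0.05         # value from config.py
ALPHA_LOG = -3.0000  # value from config.py; assumed to approximate ln(ALPHA)


def is_significant(log_p_value, log_threshold=ALPHA_LOG):
    """Return True when a p-value, given as its natural log, meets the threshold."""
    return log_p_value <= log_threshold


exact_log_alpha = math.log(ALPHA)
print(round(exact_log_alpha, 4))           # -2.9957

# exp(-3) is slightly below 0.05, so the rounded threshold is a little stricter.
print(is_significant(math.log(0.0499)))    # False

# Log space still works for p-values too small to represent as a float.
print(math.exp(-800.0))                    # 0.0
print(is_significant(-800.0))              # True
```

If the two constants are meant to agree exactly, deriving one from the other with `math.log(ALPHA)` would avoid the small mismatch.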

requirements.txt

+5
@@ -0,0 +1,5 @@
numpy==1.17.4
pandas==0.25.3
psutil==5.6.7
scikit-learn==0.22
scipy==1.3.2

rule.py

+49
@@ -0,0 +1,49 @@
#!/usr/bin/env python3

import math


class Rule:
    """ Represents a final Rule that is extracted from the tree. """

    def __init__(self, items, label, confidence, ss, support):
        self._items = items
        self._label = label
        self._confidence = confidence
        self._ss = ss
        self._support = support

    def get_items(self):
        return self._items

    def get_label(self):
        return self._label

    def get_confidence(self):
        return self._confidence

    def get_ss(self):
        return self._ss

    def get_support(self):
        return self._support

    def set_items(self, new_items):
        self._items = new_items

    def __str__(self):
        # Layout: "<items> <label>;<support>,<confidence>,<log(ss)>"
        try:
            return "{} {};{:.4f},{:.3f},{:.3f}".format(' '.join(map(str, self._items)),
                                                       self._label,
                                                       self._support,
                                                       self._confidence,
                                                       math.log(self._ss),
                                                       )
        except Exception as e:
            # math.log fails for ss <= 0; report the problem and print 0.0 instead.
            print(repr(e), self._items, self._label, self._ss)
            return "{} {};{:.4f},{:.3f},{:.3f}".format(' '.join(map(str, self._items)),
                                                       self._label,
                                                       float(self._support),
                                                       float(self._confidence),
                                                       0.0,
                                                       )
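
To illustrate the text layout that `Rule.__str__` produces, here is a small standalone sketch. `format_rule` mirrors the format string from `rule.py`. `parse_rule` is a hypothetical inverse that is not part of the commit and assumes items and labels are integers.

```python
import math


def format_rule(items, label, support, confidence, ss):
    """Mirror Rule.__str__: '<items> <label>;<support>,<confidence>,<log(ss)>'."""
    # For non-positive ss, use the same 0.0 fallback as the except branch.
    log_ss = math.log(ss) if ss > 0 else 0.0
    return "{} {};{:.4f},{:.3f},{:.3f}".format(
        " ".join(map(str, items)), label, support, confidence, log_ss)


def parse_rule(line):
    """Hypothetical inverse: split a formatted line back into its parts."""
    head, stats = line.split(";")
    *items, label = head.split()
    support, confidence, log_ss = (float(x) for x in stats.split(","))
    return [int(i) for i in items], int(label), support, confidence, log_ss


line = format_rule([1, 5, 9], 2, 0.1234, 0.9, 1e-5)
print(line)              # 1 5 9 2;0.1234,0.900,-11.513
print(parse_rule(line))  # ([1, 5, 9], 2, 0.1234, 0.9, -11.513)
```

Note that the last field is the natural log of the statistical significance value, so the round trip recovers the logarithm rather than the original `ss`.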

run_all.sh

+15
@@ -0,0 +1,15 @@
#!/bin/bash

INDEX=80
INTERPRETER=python3
declare -a NAMES=( "breast" "flare" "glass" "heart" "iris"
                   "led7" "anneal" "pageBlocks" "pima" "wine"
                   "zoo" "hepati" "horse" "adult" "mushroom"
                   "penDigits" "letRecog" "soybean" "ionosphere" "cylBands")

for NAME in "${NAMES[@]}";
do
    echo "$NAME"
    # Run the test script with the configured interpreter and append its output.
    "$INTERPRETER" sigdirect_test.py "$NAME" >> "output_bfs_$INDEX"
    echo "done"
done
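
The same batch loop could also be written in Python, which may be convenient on systems without bash. This is only a sketch: `build_command` and `run_all` are hypothetical names, and the script name and output file simply follow `run_all.sh`.

```python
import subprocess

# Dataset names copied from run_all.sh.
DATASETS = [
    "breast", "flare", "glass", "heart", "iris",
    "led7", "anneal", "pageBlocks", "pima", "wine",
    "zoo", "hepati", "horse", "adult", "mushroom",
    "penDigits", "letRecog", "soybean", "ionosphere", "cylBands",
]


def build_command(name, interpreter="python3", script="sigdirect_test.py"):
    """Return the argv list for running the test script on one dataset."""
    return [interpreter, script, name]


def run_all(names, output_path):
    """Run each dataset in turn, appending the script's stdout to output_path."""
    with open(output_path, "a") as out:
        for name in names:
            print(name)
            # check=False: keep going even if one dataset fails, as the shell loop does.
            subprocess.run(build_command(name), stdout=out, check=False)
            print("done")


# Example call (commented out so that importing this sketch has no side effects):
# run_all(DATASETS, "output_bfs_80")
```

Passing the command as a list avoids shell quoting issues with dataset names.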
