Commit

fix indicator generation bug
change indicator default count
re-run examples
JohnMount committed Jul 28, 2019
1 parent 560dfcc commit 3d23657
Showing 36 changed files with 6,238 additions and 3,976 deletions.
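
The commit message refers to indicator generation, and the Unsupervised example in this commit passes `'indicator_min_fracton': 0.01` to `vtreat.UnsupervisedTreatment`. The sketch below illustrates the general idea of frequency-thresholded indicator (one-hot) columns. It assumes the parameter is a minimum share of rows a level needs before it gets its own column; the commit itself does not spell this out. The helper `indicator_columns` and the `lev_` naming are hypothetical and are not vtreat's implementation.

```python
from collections import Counter


def indicator_columns(values, min_fraction=0.01):
    """Build one 0/1 column per level whose share of rows is >= min_fraction.

    Hypothetical helper for illustration only; not vtreat's implementation.
    """
    n = len(values)
    # Missing values (None) never get an indicator column of their own here.
    counts = Counter(v for v in values if v is not None)
    # Keep only sufficiently common levels; sort for a stable column order.
    kept = sorted(lev for lev, c in counts.items() if c / n >= min_fraction)
    return {
        f"lev_{lev}": [1 if v == lev else 0 for v in values]
        for lev in kept
    }


zips = ["z00013", "z00003", "z00013", "z00008", None, "z00013"]
cols = indicator_columns(zips, min_fraction=0.2)
print(sorted(cols))
```

With a threshold of 0.2, only `z00013` (3 of 6 rows) is common enough to get a column; lowering the threshold to 0.1 would keep all three levels.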
318 changes: 155 additions & 163 deletions Examples/KDD2009Example/KDD2009Example.ipynb

Large diffs are not rendered by default.

225 changes: 109 additions & 116 deletions Examples/KDD2009Example/KDD2009Example.md

Large diffs are not rendered by default.

Binary file modified Examples/KDD2009Example/output_40_0.png
Binary file modified Examples/KDD2009Example/output_54_0.png
299 changes: 162 additions & 137 deletions Examples/Multinomial/MultinomialExample.ipynb

Large diffs are not rendered by default.

1,150 changes: 1,150 additions & 0 deletions Examples/Multinomial/MultinomialExample.md

Large diffs are not rendered by default.

3,091 changes: 1,548 additions & 1,543 deletions Examples/NoiseColumns/NoiseColumns.ipynb

Large diffs are not rendered by default.

1,918 changes: 961 additions & 957 deletions Examples/NoiseColumns/NoiseColumns.md

Large diffs are not rendered by default.

Binary file modified Examples/NoiseColumns/output_18_1.png
Binary file modified Examples/NoiseColumns/output_19_1.png
Binary file modified Examples/NoiseColumns/output_20_1.png
Binary file modified Examples/NoiseColumns/output_33_1.png
Binary file modified Examples/NoiseColumns/output_34_1.png
Binary file modified Examples/NoiseColumns/output_35_1.png
Binary file modified Examples/NoiseColumns/output_36_1.png
Binary file modified Examples/NoiseColumns/output_37_1.png
Binary file modified Examples/NoiseColumns/output_38_1.png
Binary file modified Examples/NoiseColumns/output_39_1.png
Binary file modified Examples/NoiseColumns/output_40_1.png
394 changes: 197 additions & 197 deletions Examples/Regression/Example_regression_1.ipynb

Large diffs are not rendered by default.

292 changes: 146 additions & 146 deletions Examples/Regression/Example_regression_1.md

Large diffs are not rendered by default.

89 changes: 83 additions & 6 deletions Examples/Unsupervised/Unsupervised.ipynb
@@ -1,9 +1,52 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processing /Users/johnmount/Documents/work/pyvtreat/pkg/dist/vtreat-0.1.tar.gz\n",
"Requirement already satisfied: numpy in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.16.4)\n",
"Requirement already satisfied: pandas in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (0.24.2)\n",
"Requirement already satisfied: statistics in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.0.3.5)\n",
"Requirement already satisfied: scipy in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.2.1)\n",
"Requirement already satisfied: python-dateutil>=2.5.0 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from pandas->vtreat==0.1) (2.8.0)\n",
"Requirement already satisfied: pytz>=2011k in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from pandas->vtreat==0.1) (2019.1)\n",
"Requirement already satisfied: docutils>=0.3 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from statistics->vtreat==0.1) (0.14)\n",
"Requirement already satisfied: six>=1.5 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from python-dateutil>=2.5.0->pandas->vtreat==0.1) (1.12.0)\n",
"Building wheels for collected packages: vtreat\n",
" Building wheel for vtreat (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Stored in directory: /Users/johnmount/Library/Caches/pip/wheels/cf/06/fc/6b2552717486fb6401f19308eec24381555e456e3bd9cfb103\n",
"Successfully built vtreat\n",
"Installing collected packages: vtreat\n",
" Found existing installation: vtreat 0.1\n",
" Uninstalling vtreat-0.1:\n",
" Successfully uninstalled vtreat-0.1\n",
"Successfully installed vtreat-0.1\n"
]
}
],
"source": [
"# To install:\n",
"!pip install /Users/johnmount/Documents/work/pyvtreat/pkg/dist/vtreat-0.1.tar.gz\n",
"#!pip install https://github.com/WinVector/pyvtreat/raw/master/pkg/dist/vtreat-0.1.tar.gz"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
@@ -75,7 +118,7 @@
"4 z00013 1 b"
]
},
"execution_count": 1,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -97,7 +140,42 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'use_hierarchical_estimate': True,\n",
" 'coders': {'clean_copy',\n",
" 'deviance_code',\n",
" 'impact_code',\n",
" 'indicator_code',\n",
" 'logit_code',\n",
" 'missing_indicator',\n",
" 'prevalence_code'},\n",
" 'filter_to_recommended': True,\n",
" 'indicator_min_fracton': 0.01,\n",
" 'cross_validation_plan': <vtreat.KWayCrossPlan at 0x104b2a2b0>,\n",
" 'cross_validation_k': 5}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"transform = vtreat.UnsupervisedTreatment(\n",
" params=vtreat.vtreat_parameters({\n",
" 'indicator_min_fracton': 0.01,\n",
" }))\n",
"transform.params_"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -269,20 +347,19 @@
"4 0 0 0 0 "
]
},
"execution_count": 2,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"transform = vtreat.UnsupervisedTreatment()\n",
"d_treated = transform.fit_transform(d)\n",
"d_treated.head()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -580,7 +657,7 @@
"17 NaN NaN False 16.0 "
]
},
"execution_count": 3,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
82 changes: 70 additions & 12 deletions Examples/Unsupervised/Unsupervised.md
@@ -1,5 +1,37 @@


```python

```


```python
# To install:
!pip install /Users/johnmount/Documents/work/pyvtreat/pkg/dist/vtreat-0.1.tar.gz
#!pip install https://github.com/WinVector/pyvtreat/raw/master/pkg/dist/vtreat-0.1.tar.gz
```

Processing /Users/johnmount/Documents/work/pyvtreat/pkg/dist/vtreat-0.1.tar.gz
Requirement already satisfied: numpy in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.16.4)
Requirement already satisfied: pandas in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (0.24.2)
Requirement already satisfied: statistics in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.0.3.5)
Requirement already satisfied: scipy in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from vtreat==0.1) (1.2.1)
Requirement already satisfied: python-dateutil>=2.5.0 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from pandas->vtreat==0.1) (2.8.0)
Requirement already satisfied: pytz>=2011k in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from pandas->vtreat==0.1) (2019.1)
Requirement already satisfied: docutils>=0.3 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from statistics->vtreat==0.1) (0.14)
Requirement already satisfied: six>=1.5 in /Users/johnmount/anaconda3/envs/aiAcademy/lib/python3.7/site-packages (from python-dateutil>=2.5.0->pandas->vtreat==0.1) (1.12.0)
Building wheels for collected packages: vtreat
Building wheel for vtreat (setup.py) ... done
Stored in directory: /Users/johnmount/Library/Caches/pip/wheels/cf/06/fc/6b2552717486fb6401f19308eec24381555e456e3bd9cfb103
Successfully built vtreat
Installing collected packages: vtreat
Found existing installation: vtreat 0.1
Uninstalling vtreat-0.1:
Successfully uninstalled vtreat-0.1
Successfully installed vtreat-0.1



```python
import numpy.random
import pandas
@@ -80,7 +112,33 @@ d.head()


```python
transform = vtreat.UnsupervisedTreatment()
transform = vtreat.UnsupervisedTreatment(
params=vtreat.vtreat_parameters({
'indicator_min_fracton': 0.01,
}))
transform.params_
```




{'use_hierarchical_estimate': True,
'coders': {'clean_copy',
'deviance_code',
'impact_code',
'indicator_code',
'logit_code',
'missing_indicator',
'prevalence_code'},
'filter_to_recommended': True,
'indicator_min_fracton': 0.01,
'cross_validation_plan': <vtreat.KWayCrossPlan at 0x104b2a2b0>,
'cross_validation_k': 5}




```python
d_treated = transform.fit_transform(d)
d_treated.head()
```
@@ -113,13 +171,13 @@ d_treated.head()
<th>zip_lev_z00013</th>
<th>zip_lev_z00003</th>
<th>zip_lev_z00008</th>
<th>zip_lev_z00004</th>
<th>zip_lev_z00015</th>
<th>zip_lev_z00004</th>
<th>zip_lev_z00005</th>
<th>zip_lev_z00014</th>
<th>zip_lev_z00001</th>
<th>zip_lev_z00006</th>
<th>zip_lev_z00014</th>
<th>zip_lev_z00012</th>
<th>zip_lev_z00006</th>
<th>zip_lev_z00002</th>
<th>zip_lev_z00010</th>
</tr>
@@ -153,7 +211,6 @@ d_treated.head()
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
@@ -162,6 +219,7 @@ d_treated.head()
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<th>2</th>
@@ -196,8 +254,8 @@ d_treated.head()
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
@@ -350,7 +408,7 @@ transform.score_frame_
</tr>
<tr>
<th>7</th>
<td>zip_lev_z00004</td>
<td>zip_lev_z00015</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>
@@ -362,7 +420,7 @@ transform.score_frame_
</tr>
<tr>
<th>8</th>
<td>zip_lev_z00015</td>
<td>zip_lev_z00004</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>
@@ -386,7 +444,7 @@ transform.score_frame_
</tr>
<tr>
<th>10</th>
<td>zip_lev_z00014</td>
<td>zip_lev_z00001</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>
@@ -398,7 +456,7 @@ transform.score_frame_
</tr>
<tr>
<th>11</th>
<td>zip_lev_z00001</td>
<td>zip_lev_z00014</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>
@@ -410,7 +468,7 @@ transform.score_frame_
</tr>
<tr>
<th>12</th>
<td>zip_lev_z00006</td>
<td>zip_lev_z00012</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>
@@ -422,7 +480,7 @@ transform.score_frame_
</tr>
<tr>
<th>13</th>
<td>zip_lev_z00012</td>
<td>zip_lev_z00006</td>
<td>zip</td>
<td>indicator_code</td>
<td>False</td>