Commit 48bae74

committed
rebuild and retest
1 parent 6d78489 commit 48bae74

5 files changed: +54 -52 lines changed

coverage.txt

Lines changed: 1 addition & 1 deletion
@@ -102,4 +102,4 @@ pkg/vtreat/vtreat_impl.py 711 61 91%
 -------------------------------------------------------------
 TOTAL 1593 126 92%

-================= 45 passed, 15 warnings in 137.81s (0:02:17) ==================
+================== 45 passed, 15 warnings in 81.34s (0:01:21) ==================

docs/vtreat.html

Lines changed: 51 additions & 50 deletions
@@ -114,57 +114,58 @@ <h1 class="modulename">
 </span><span id="L-8"><a href="#L-8"><span class="linenos"> 8</span></a><span class="c1"># noinspection PyUnresolvedReferences</span>
 </span><span id="L-9"><a href="#L-9"><span class="linenos"> 9</span></a><span class="kn">import</span> <span class="nn">numpy</span>
 </span><span id="L-10"><a href="#L-10"><span class="linenos">10</span></a>
-</span><span id="L-11"><a href="#L-11"><span class="linenos">11</span></a><span class="kn">from</span> <span class="nn">vtreat.vtreat_api</span> <span class="kn">import</span> <span class="o">*</span>
+</span><span id="L-11"><a href="#L-11"><span class="linenos">11</span></a><span class="kn">from</span> <span class="nn">vtreat.vtreat_api</span> <span class="kn">import</span> <span class="n">unsupervised_parameters</span><span class="p">,</span> <span class="n">vtreat_parameters</span><span class="p">,</span> <span class="n">BinomialOutcomeTreatment</span><span class="p">,</span> <span class="n">MultinomialOutcomeTreatment</span><span class="p">,</span> <span class="n">NumericOutcomeTreatment</span><span class="p">,</span> <span class="n">UnsupervisedTreatment</span>
 </span><span id="L-12"><a href="#L-12"><span class="linenos">12</span></a>
-</span><span id="L-13"><a href="#L-13"><span class="linenos">13</span></a><span class="n">__docformat__</span> <span class="o">=</span> <span class="s2">&quot;restructuredtext&quot;</span>
-</span><span id="L-14"><a href="#L-14"><span class="linenos">14</span></a><span class="n">__version__</span> <span class="o">=</span> <span class="s2">&quot;1.3.0&quot;</span>
-</span><span id="L-15"><a href="#L-15"><span class="linenos">15</span></a>
-</span><span id="L-16"><a href="#L-16"><span class="linenos">16</span></a><span class="vm">__doc__</span> <span class="o">=</span> <span class="s2">&quot;&quot;&quot;</span>
-</span><span id="L-17"><a href="#L-17"><span class="linenos">17</span></a><span class="s2">This&lt;https://github.com/WinVector/pyvtreat&gt; is the Python version of the vtreat data preparation system</span>
-</span><span id="L-18"><a href="#L-18"><span class="linenos">18</span></a><span class="s2">(also available as an R package&lt;https://winvector.github.io/vtreat/&gt;.</span>
-</span><span id="L-19"><a href="#L-19"><span class="linenos">19</span></a>
-</span><span id="L-20"><a href="#L-20"><span class="linenos">20</span></a><span class="s2">vtreat is a DataFrame processor/conditioner that prepares</span>
-</span><span id="L-21"><a href="#L-21"><span class="linenos">21</span></a><span class="s2">real-world data for supervised machine learning or predictive modeling</span>
-</span><span id="L-22"><a href="#L-22"><span class="linenos">22</span></a><span class="s2">in a statistically sound manner.</span>
-</span><span id="L-23"><a href="#L-23"><span class="linenos">23</span></a>
-</span><span id="L-24"><a href="#L-24"><span class="linenos">24</span></a><span class="s2">vtreat takes an input DataFrame</span>
-</span><span id="L-25"><a href="#L-25"><span class="linenos">25</span></a><span class="s2">that has a specified column called &quot;the outcome variable&quot; (or &quot;y&quot;)</span>
-</span><span id="L-26"><a href="#L-26"><span class="linenos">26</span></a><span class="s2">that is the quantity to be predicted (and must not have missing</span>
-</span><span id="L-27"><a href="#L-27"><span class="linenos">27</span></a><span class="s2">values). Other input columns are possible explanatory variables</span>
-</span><span id="L-28"><a href="#L-28"><span class="linenos">28</span></a><span class="s2">(typically numeric or categorical/string-valued, these columns may</span>
-</span><span id="L-29"><a href="#L-29"><span class="linenos">29</span></a><span class="s2">have missing values) that the user later wants to use to predict &quot;y&quot;.</span>
-</span><span id="L-30"><a href="#L-30"><span class="linenos">30</span></a><span class="s2">In practice such an input DataFrame may not be immediately suitable</span>
-</span><span id="L-31"><a href="#L-31"><span class="linenos">31</span></a><span class="s2">for machine learning procedures that often expect only numeric</span>
-</span><span id="L-32"><a href="#L-32"><span class="linenos">32</span></a><span class="s2">explanatory variables, and may not tolerate missing values.</span>
-</span><span id="L-33"><a href="#L-33"><span class="linenos">33</span></a>
-</span><span id="L-34"><a href="#L-34"><span class="linenos">34</span></a><span class="s2">To solve this, vtreat builds a transformed DataFrame where all</span>
-</span><span id="L-35"><a href="#L-35"><span class="linenos">35</span></a><span class="s2">explanatory variable columns have been transformed into a number of</span>
-</span><span id="L-36"><a href="#L-36"><span class="linenos">36</span></a><span class="s2">numeric explanatory variable columns, without missing values. The</span>
-</span><span id="L-37"><a href="#L-37"><span class="linenos">37</span></a><span class="s2">vtreat implementation produces derived numeric columns that capture</span>
-</span><span id="L-38"><a href="#L-38"><span class="linenos">38</span></a><span class="s2">most of the information relating the explanatory columns to the</span>
-</span><span id="L-39"><a href="#L-39"><span class="linenos">39</span></a><span class="s2">specified &quot;y&quot; or dependent/outcome column through a number of numeric</span>
-</span><span id="L-40"><a href="#L-40"><span class="linenos">40</span></a><span class="s2">transforms (indicator variables, impact codes, prevalence codes, and</span>
-</span><span id="L-41"><a href="#L-41"><span class="linenos">41</span></a><span class="s2">more). This transformed DataFrame is suitable for a wide range of</span>
-</span><span id="L-42"><a href="#L-42"><span class="linenos">42</span></a><span class="s2">supervised learning methods from linear regression, through gradient</span>
-</span><span id="L-43"><a href="#L-43"><span class="linenos">43</span></a><span class="s2">boosted machines.</span>
-</span><span id="L-44"><a href="#L-44"><span class="linenos">44</span></a>
-</span><span id="L-45"><a href="#L-45"><span class="linenos">45</span></a><span class="s2">The idea is: you can take a DataFrame of messy real world data and</span>
-</span><span id="L-46"><a href="#L-46"><span class="linenos">46</span></a><span class="s2">easily, faithfully, reliably, and repeatably prepare it for machine</span>
-</span><span id="L-47"><a href="#L-47"><span class="linenos">47</span></a><span class="s2">learning using documented methods using vtreat. Incorporating</span>
-</span><span id="L-48"><a href="#L-48"><span class="linenos">48</span></a><span class="s2">vtreat into your machine learning workflow lets you quickly work</span>
-</span><span id="L-49"><a href="#L-49"><span class="linenos">49</span></a><span class="s2">with very diverse structured data.</span>
-</span><span id="L-50"><a href="#L-50"><span class="linenos">50</span></a>
-</span><span id="L-51"><a href="#L-51"><span class="linenos">51</span></a><span class="s2">Worked examples can be found `here`&lt;https://github.com/WinVector/pyvtreat/tree/master/Examples&gt;.</span>
-</span><span id="L-52"><a href="#L-52"><span class="linenos">52</span></a>
-</span><span id="L-53"><a href="#L-53"><span class="linenos">53</span></a><span class="s2">For more detail please see here: `arXiv:1611.09477</span>
-</span><span id="L-54"><a href="#L-54"><span class="linenos">54</span></a><span class="s2">stat.AP`&lt;https://arxiv.org/abs/1611.09477&gt; (the documentation describes the R version,</span>
-</span><span id="L-55"><a href="#L-55"><span class="linenos">55</span></a><span class="s2">however all of the examples can be found worked in Python </span>
-</span><span id="L-56"><a href="#L-56"><span class="linenos">56</span></a><span class="s2">`here`&lt;https://github.com/WinVector/pyvtreat/tree/master/Examples/vtreat_paper1&gt;).</span>
-</span><span id="L-57"><a href="#L-57"><span class="linenos">57</span></a>
-</span><span id="L-58"><a href="#L-58"><span class="linenos">58</span></a><span class="s2">vtreat is available</span>
-</span><span id="L-59"><a href="#L-59"><span class="linenos">59</span></a><span class="s2">as a `Python/Pandas package`&lt;https://github.com/WinVector/vtreat&gt;,</span>
-</span><span id="L-60"><a href="#L-60"><span class="linenos">60</span></a><span class="s2">and also as an `R package`&lt;https://github.com/WinVector/vtreat&gt;.</span>
-</span><span id="L-61"><a href="#L-61"><span class="linenos">61</span></a><span class="s2">&quot;&quot;&quot;</span>
+</span><span id="L-13"><a href="#L-13"><span class="linenos">13</span></a>
+</span><span id="L-14"><a href="#L-14"><span class="linenos">14</span></a><span class="n">__docformat__</span> <span class="o">=</span> <span class="s2">&quot;restructuredtext&quot;</span>
+</span><span id="L-15"><a href="#L-15"><span class="linenos">15</span></a><span class="n">__version__</span> <span class="o">=</span> <span class="s2">&quot;1.3.0&quot;</span>
+</span><span id="L-16"><a href="#L-16"><span class="linenos">16</span></a>
+</span><span id="L-17"><a href="#L-17"><span class="linenos">17</span></a><span class="vm">__doc__</span> <span class="o">=</span> <span class="s2">&quot;&quot;&quot;</span>
+</span><span id="L-18"><a href="#L-18"><span class="linenos">18</span></a><span class="s2">This&lt;https://github.com/WinVector/pyvtreat&gt; is the Python version of the vtreat data preparation system</span>
+</span><span id="L-19"><a href="#L-19"><span class="linenos">19</span></a><span class="s2">(also available as an R package&lt;https://winvector.github.io/vtreat/&gt;.</span>
+</span><span id="L-20"><a href="#L-20"><span class="linenos">20</span></a>
+</span><span id="L-21"><a href="#L-21"><span class="linenos">21</span></a><span class="s2">vtreat is a DataFrame processor/conditioner that prepares</span>
+</span><span id="L-22"><a href="#L-22"><span class="linenos">22</span></a><span class="s2">real-world data for supervised machine learning or predictive modeling</span>
+</span><span id="L-23"><a href="#L-23"><span class="linenos">23</span></a><span class="s2">in a statistically sound manner.</span>
+</span><span id="L-24"><a href="#L-24"><span class="linenos">24</span></a>
+</span><span id="L-25"><a href="#L-25"><span class="linenos">25</span></a><span class="s2">vtreat takes an input DataFrame</span>
+</span><span id="L-26"><a href="#L-26"><span class="linenos">26</span></a><span class="s2">that has a specified column called &quot;the outcome variable&quot; (or &quot;y&quot;)</span>
+</span><span id="L-27"><a href="#L-27"><span class="linenos">27</span></a><span class="s2">that is the quantity to be predicted (and must not have missing</span>
+</span><span id="L-28"><a href="#L-28"><span class="linenos">28</span></a><span class="s2">values). Other input columns are possible explanatory variables</span>
+</span><span id="L-29"><a href="#L-29"><span class="linenos">29</span></a><span class="s2">(typically numeric or categorical/string-valued, these columns may</span>
+</span><span id="L-30"><a href="#L-30"><span class="linenos">30</span></a><span class="s2">have missing values) that the user later wants to use to predict &quot;y&quot;.</span>
+</span><span id="L-31"><a href="#L-31"><span class="linenos">31</span></a><span class="s2">In practice such an input DataFrame may not be immediately suitable</span>
+</span><span id="L-32"><a href="#L-32"><span class="linenos">32</span></a><span class="s2">for machine learning procedures that often expect only numeric</span>
+</span><span id="L-33"><a href="#L-33"><span class="linenos">33</span></a><span class="s2">explanatory variables, and may not tolerate missing values.</span>
+</span><span id="L-34"><a href="#L-34"><span class="linenos">34</span></a>
+</span><span id="L-35"><a href="#L-35"><span class="linenos">35</span></a><span class="s2">To solve this, vtreat builds a transformed DataFrame where all</span>
+</span><span id="L-36"><a href="#L-36"><span class="linenos">36</span></a><span class="s2">explanatory variable columns have been transformed into a number of</span>
+</span><span id="L-37"><a href="#L-37"><span class="linenos">37</span></a><span class="s2">numeric explanatory variable columns, without missing values. The</span>
+</span><span id="L-38"><a href="#L-38"><span class="linenos">38</span></a><span class="s2">vtreat implementation produces derived numeric columns that capture</span>
+</span><span id="L-39"><a href="#L-39"><span class="linenos">39</span></a><span class="s2">most of the information relating the explanatory columns to the</span>
+</span><span id="L-40"><a href="#L-40"><span class="linenos">40</span></a><span class="s2">specified &quot;y&quot; or dependent/outcome column through a number of numeric</span>
+</span><span id="L-41"><a href="#L-41"><span class="linenos">41</span></a><span class="s2">transforms (indicator variables, impact codes, prevalence codes, and</span>
+</span><span id="L-42"><a href="#L-42"><span class="linenos">42</span></a><span class="s2">more). This transformed DataFrame is suitable for a wide range of</span>
+</span><span id="L-43"><a href="#L-43"><span class="linenos">43</span></a><span class="s2">supervised learning methods from linear regression, through gradient</span>
+</span><span id="L-44"><a href="#L-44"><span class="linenos">44</span></a><span class="s2">boosted machines.</span>
+</span><span id="L-45"><a href="#L-45"><span class="linenos">45</span></a>
+</span><span id="L-46"><a href="#L-46"><span class="linenos">46</span></a><span class="s2">The idea is: you can take a DataFrame of messy real world data and</span>
+</span><span id="L-47"><a href="#L-47"><span class="linenos">47</span></a><span class="s2">easily, faithfully, reliably, and repeatably prepare it for machine</span>
+</span><span id="L-48"><a href="#L-48"><span class="linenos">48</span></a><span class="s2">learning using documented methods using vtreat. Incorporating</span>
+</span><span id="L-49"><a href="#L-49"><span class="linenos">49</span></a><span class="s2">vtreat into your machine learning workflow lets you quickly work</span>
+</span><span id="L-50"><a href="#L-50"><span class="linenos">50</span></a><span class="s2">with very diverse structured data.</span>
+</span><span id="L-51"><a href="#L-51"><span class="linenos">51</span></a>
+</span><span id="L-52"><a href="#L-52"><span class="linenos">52</span></a><span class="s2">Worked examples can be found `here`&lt;https://github.com/WinVector/pyvtreat/tree/master/Examples&gt;.</span>
+</span><span id="L-53"><a href="#L-53"><span class="linenos">53</span></a>
+</span><span id="L-54"><a href="#L-54"><span class="linenos">54</span></a><span class="s2">For more detail please see here: `arXiv:1611.09477</span>
+</span><span id="L-55"><a href="#L-55"><span class="linenos">55</span></a><span class="s2">stat.AP`&lt;https://arxiv.org/abs/1611.09477&gt; (the documentation describes the R version,</span>
+</span><span id="L-56"><a href="#L-56"><span class="linenos">56</span></a><span class="s2">however all of the examples can be found worked in Python </span>
+</span><span id="L-57"><a href="#L-57"><span class="linenos">57</span></a><span class="s2">`here`&lt;https://github.com/WinVector/pyvtreat/tree/master/Examples/vtreat_paper1&gt;).</span>
+</span><span id="L-58"><a href="#L-58"><span class="linenos">58</span></a>
+</span><span id="L-59"><a href="#L-59"><span class="linenos">59</span></a><span class="s2">vtreat is available</span>
+</span><span id="L-60"><a href="#L-60"><span class="linenos">60</span></a><span class="s2">as a `Python/Pandas package`&lt;https://github.com/WinVector/vtreat&gt;,</span>
+</span><span id="L-61"><a href="#L-61"><span class="linenos">61</span></a><span class="s2">and also as an `R package`&lt;https://github.com/WinVector/vtreat&gt;.</span>
+</span><span id="L-62"><a href="#L-62"><span class="linenos">62</span></a><span class="s2">&quot;&quot;&quot;</span>
 </span></pre></div>


pkg/build/lib/vtreat/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,8 @@
 # noinspection PyUnresolvedReferences
 import numpy

-from vtreat.vtreat_api import *
+from vtreat.vtreat_api import unsupervised_parameters, vtreat_parameters, BinomialOutcomeTreatment, MultinomialOutcomeTreatment, NumericOutcomeTreatment, UnsupervisedTreatment
+

 __docformat__ = "restructuredtext"
 __version__ = "1.3.0"
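
The change above swaps a wildcard import for an explicit list of names. As a rough illustration of why that matters (not taken from the vtreat sources), the sketch below builds a hypothetical stand-in module called `demo_api` and compares what each import style binds. The module name and its helper functions are assumptions made up for this example.

```python
import sys
import types

# Hypothetical stand-in for an API module; "demo_api" and its contents
# are invented for illustration and are not part of vtreat.
api = types.ModuleType("demo_api")
exec(
    "import os\n"
    "def public_helper():\n"
    "    return 'ok'\n"
    "def _private_helper():\n"
    "    return 'hidden'\n",
    api.__dict__,
)
sys.modules["demo_api"] = api

# Wildcard import: with no __all__ defined, every name that does not
# start with an underscore is copied, including the re-exported `os`.
star_ns = {}
exec("from demo_api import *", star_ns)

# Explicit import: only the listed name is bound.
explicit_ns = {}
exec("from demo_api import public_helper", explicit_ns)

# Ignore dunder entries such as __builtins__ that exec() adds itself.
star_names = sorted(n for n in star_ns if not n.startswith("__"))
explicit_names = sorted(n for n in explicit_ns if not n.startswith("__"))

print(star_names)
print(explicit_names)
```

The wildcard form also pulls in the incidental `os` import, while the explicit form binds only what was asked for, which keeps the package namespace predictable for readers and static analysis tools.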
45 Bytes
Binary file not shown.

pkg/dist/vtreat-1.3.0.tar.gz

33 Bytes
Binary file not shown.
