Varying first cutoff time for each target group (#258)
* implement and test varying minimum data per group
* update release notes
* pin scikit-learn for doc builds
* update docstring
* add guide for controlling cutoff times
* fix dfs test
* add guide to index
* Revert "fix dfs test"
This reverts commit 584a5cb.
* pin version of featuretools
* update docstring
* update test case
* update docstring
* lint fix
* parametrize test
* lint fix
 num_examples_per_instance (int): Number of examples per unique instance of target entity.
-minimum_data (str): Minimum data before starting the search. Default value is first time of index.
+minimum_data (int or str or Series): The amount of data needed before starting the search. Defaults to the first value in the time index.
+The value can be a datetime string to directly set the first cutoff time or a timedelta string to denote the amount of data needed before
+the first cutoff time. The value can also be an integer to denote the number of rows needed before the first cutoff time.
+If a Series, minimum_data should be datetime string, timedelta string, or integer values with a unique set of target groups as the corresponding index.
 maximum_data (str): Maximum data before stopping the search. Default value is last time of index.
 gap (str or int): Time between examples. Default value is window size.
 If an integer, search will start on the first event after the minimum data.
 """Checks whether example count corresponds to data slices."""
 if self.window_size is None and gap is None:
@@ -195,8 +147,11 @@ def search(self,
 df (DataFrame): Data frame to search and extract labels.
 num_examples_per_instance (int or dict): The expected number of examples to return from each entity group.
 A dictionary can be used to further specify the expected number of examples to return from each label.
-minimum_data (str): Minimum data before starting the search. Default value is first time of index.
-maximum_data (str): Maximum data before stopping the search. Default value is last time of index.
+minimum_data (int or str or Series): The amount of data needed before starting the search. Defaults to the first value in the time index.
+The value can be a datetime string to directly set the first cutoff time or a timedelta string to denote the amount of data needed before
+the first cutoff time. The value can also be an integer to denote the number of rows needed before the first cutoff time.
+If a Series, minimum_data should be datetime string, timedelta string, or integer values with a unique set of target groups as the corresponding index.
+maximum_data (str): Maximum data before stopping the search. Defaults to the last value in the time index.
 gap (str or int): Time between examples. Default value is window size.
 If an integer, search will start on the first event after the minimum data.
 drop_empty (bool): Whether to drop empty slices. Default value is True.
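
The updated minimum_data docstring describes four cases: the default, a datetime string, a timedelta string, and an integer row count, plus a Series that assigns one of those per target group. The sketch below illustrates one way those rules could resolve to a first cutoff time, using pandas only. It is not the code from this commit. The helper names first_cutoff_time and first_cutoff_times, the column names, and the reading of an integer n as "the cutoff is the time of row n, so n rows precede it" are all assumptions made for illustration.

```python
import numbers

import pandas as pd


def first_cutoff_time(times, minimum_data=None):
    """Hypothetical helper: resolve minimum_data for ONE group's time index."""
    times = pd.Series(pd.to_datetime(times)).sort_values().reset_index(drop=True)

    # Default: the first value in the time index.
    if minimum_data is None:
        return times.iloc[0]

    # Integer: number of rows needed before the first cutoff time
    # (assumed here to mean the cutoff is the time of row n).
    if isinstance(minimum_data, numbers.Integral):
        if minimum_data >= len(times):
            return None  # not enough rows in this group
        return times.iloc[int(minimum_data)]

    # String: try a timedelta first (offset from the first time),
    # and fall back to an absolute datetime.
    try:
        return times.iloc[0] + pd.Timedelta(minimum_data)
    except ValueError:
        return pd.Timestamp(minimum_data)


def first_cutoff_times(df, target, time_index, minimum_data=None):
    """Hypothetical helper: apply a scalar or per-group Series of minimum_data."""
    cutoffs = {}
    for key, group in df.groupby(target):
        value = minimum_data
        if isinstance(minimum_data, pd.Series):
            # Groups missing from the Series fall back to the default.
            value = minimum_data.get(key)
        cutoffs[key] = first_cutoff_time(group[time_index], value)
    return cutoffs


# Illustrative data: two target groups with three events each.
df = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 2],
    "time": pd.to_datetime([
        "2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00",
        "2020-01-01 00:30", "2020-01-01 01:30", "2020-01-01 02:30",
    ]),
})

# Group 1 waits one hour of data; group 2 waits for two rows.
minimum_data = pd.Series({1: "1h", 2: 2})
cutoffs = first_cutoff_times(df, "customer_id", "time", minimum_data)
```

Trying the timedelta parse before the datetime parse is a design choice of this sketch: short strings such as "1h" can be ambiguous to lenient date parsers, whereas a full date string is rejected by the timedelta parser.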