|
132 | 132 | "When Woodwork's type inference does not return any LogicalTypes for a column, Woodwork will set the column's logical type as the default LogicalType, `Unknown`. A logical type being inferred as `Unknown` may be a good indicator that a more specific logical type can be chosen and set by the user.\n", |
133 | 133 | "\n", |
134 | 134 | "- **physical type**: `string`\n", |
135 | | - "- **nullable**: yes\n", |
136 | 135 | "\n", |
137 | 136 | "Below is an example of a column for which no logical type is inferred, resulting in a Series with `Unknown` logical type. Looking at the contents of the Series, though, we can see that it contains country codes, so we set the logical type to `CountryCode`." |
138 | 137 | ] |
|
179 | 178 | "\n", |
180 | 179 | "- **physical type**: `int64`\n", |
181 | 180 | "- **standard tags**: `{'numeric'}`\n", |
182 | | - "- **nullable**: no\n", |
183 | 181 | "\n", |
184 | 182 | "##### AgeFractional\n", |
185 | 183 | "\n", |
186 | 184 | "Represents Logical Types that contain non-negative floating point numbers indicating a person’s age. May contain null values.\n", |
187 | 185 | "\n", |
188 | 186 | "- **physical type**: `float64`\n", |
189 | 187 | "- **standard tags**: `{'numeric'}`\n", |
190 | | - "- **nullable**: yes\n", |
191 | 188 | "\n", |
192 | 189 | "\n", |
193 | 190 | "##### AgeNullable\n", |
|
196 | 193 | "\n", |
197 | 194 | "- **physical type**: `Int64`\n", |
198 | 195 | "- **standard tags**: `{'numeric'}`\n", |
199 | | - "- **nullable**: yes\n", |
200 | 196 | "\n", |
201 | 197 | "##### Double\n", |
202 | 198 | "\n", |
203 | 199 | "Represents Logical Types that contain positive and negative numbers, some of which include a fractional component.\n", |
204 | 200 | "\n", |
205 | 201 | "- **physical type**: `float64`\n", |
206 | 202 | "- **standard tags**: `{'numeric'}`\n", |
207 | | - "- **nullable**: yes\n", |
208 | 203 | "\n", |
209 | 204 | "##### Integer\n", |
210 | 205 | "\n", |
211 | 206 | "Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0).\n", |
212 | 207 | "\n", |
213 | 208 | "- **physical type**: `int64`\n", |
214 | 209 | "- **standard tags**: `{'numeric'}`\n", |
215 | | - "- **nullable**: no\n", |
216 | 210 | "\n", |
217 | 211 | "##### IntegerNullable \n", |
218 | 212 | "Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0). May contain null values. \n", |
219 | 213 | "\n", |
220 | 214 | "- **physical type**: `Int64`\n", |
221 | 215 | "- **standard tags**: `{'numeric'}`\n", |
222 | | - "- **nullable**: yes\n", |
223 | 216 | "\n", |
224 | 217 | "\n", |
225 | 218 | "\n", |
|
259 | 252 | "Represents a Logical Type with few unique values relative to the size of the data.\n", |
260 | 253 | "\n", |
261 | 254 | "- **physical type**: `category`\n", |
262 | | - "- **nullable**: yes\n", |
263 | 255 | "- **inference**: Woodwork defines a threshold for percentage unique values relative to the size of the series below which a series will be considered categorical. See [setting config options guide](setting_config_options.ipynb#Categorical-Threshold) for more information on how to control this threshold.\n", |
264 | 256 | "- **koalas note**: Koalas does not support the `category` dtype, so for Koalas DataFrames and Series, the `string` dtype will be used.\n", |
265 | 257 | "\n", |
|
277 | 269 | "Represents Logical Types that use the ISO-3166 standard country code to represent countries. ISO 3166-1 (countries) are supported. These codes should be in the Alpha-2 format.\n", |
278 | 270 | "\n", |
279 | 271 | "- **physical type**: `category`\n", |
280 | | - "- **nullable**: yes\n", |
281 | 272 | "- **standard tags**: `{'category'}`\n", |
282 | 273 | "- **koalas note**: Koalas does not support the `category` dtype, so for Koalas DataFrames and Series, the `string` dtype will be used.\n", |
283 | 274 | "\n", |
|
288 | 279 | "An Ordinal variable type can take ordered discrete values. Similar to Categorical, it usually has a limited, fixed number of possible values. However, these discrete values have a certain order, and the ordering is important to understanding the values. Ordinal variable types can be represented as strings or integers. \n", |
289 | 280 | "\n", |
290 | 281 | "- **physical type**: `category`\n", |
291 | | - "- **nullable**: yes\n", |
292 | 282 | "- **standard tags**: `{'category'}`\n", |
293 | 283 | "- **parameters**:\n", |
294 | 284 | " - `order` - the order of the ordinal values in the column from low to high\n", |
|
308 | 298 | "Represents Logical Types that contain a series of postal codes for representing a group of addresses.\n", |
309 | 299 | "\n", |
310 | 300 | "- **physical type**: `category`\n", |
311 | | - "- **nullable**: yes\n", |
312 | 301 | "- **standard tags**: `{'category'}`\n", |
313 | 302 | "- **koalas note**: Koalas does not support the `category` dtype, so for Koalas DataFrames and Series, the `string` dtype will be used.\n", |
314 | 303 | "\n", |
|
317 | 306 | "Represents Logical Types that use the ISO-3166 standard sub-region code to represent a portion of a larger geographic region. ISO 3166-2 (sub-regions) codes are supported. These codes should be in the Alpha-2 format.\n", |
318 | 307 | "\n", |
319 | 308 | "- **physical type**: `category`\n", |
320 | | - "- **nullable**: yes\n", |
321 | 309 | "- **standard tags**: `{'category'}`\n", |
322 | 310 | "- **koalas note**: Koalas does not support the `category` dtype, so for Koalas DataFrames and Series, the `string` dtype will be used.\n", |
323 | 311 | "\n", |
|
360 | 348 | "Represents Logical Types that contain binary values indicating true/false.\n", |
361 | 349 | "\n", |
362 | 350 | "- **physical type**: `bool`\n", |
363 | | - "- **nullable**: no\n", |
364 | 351 | "\n", |
365 | 352 | "##### BooleanNullable\n", |
366 | 353 | "Represents Logical Types that contain binary values indicating true/false. May also contain null values.\n", |
367 | 354 | "\n", |
368 | 355 | "- **physical type**: `boolean`\n", |
369 | | - "- **nullable**: yes\n", |
370 | 356 | "\n", |
371 | 357 | "##### Datetime\n", |
372 | 358 | "A Datetime is a representation of a date and/or time. Datetime variable types can be represented as strings or integers.\n", |
373 | 359 | "\n", |
374 | 360 | "- **physical type**: `datetime64[ns]`\n", |
375 | | - "- **nullable**: yes\n", |
376 | 361 | "- **transformation**: Will convert valid strings or numbers to pandas datetimes, and will parse more datetime formats with the use of the `datetime_format` parameter.\n", |
377 | 362 | "- **parameters**:\n", |
378 | 363 | " - `datetime_format` - the format of the datetimes in the column, ex: `'%Y-%m-%d'` vs `'%m-%d-%Y'`\n", |
|
388 | 373 | "Represents Logical Types that contain email address values.\n", |
389 | 374 | "\n", |
390 | 375 | "- **physical type**: `string`\n", |
391 | | - "- **nullable**: yes\n", |
392 | 376 | "- **inference**: Uses an email address regex that, if the data matches, means that the column contains email addresses. To learn more about controlling the regex used, see the [setting config options guide](setting_config_options.ipynb#Email-Inference-Regex).\n", |
393 | 377 | "\n", |
394 | 378 | "##### LatLong\n", |
395 | 379 | "A LatLong represents an ordered pair (Latitude, Longitude) that specifies a location on Earth. The order of the tuple is important. LatLongs can be represented as a tuple of floating point numbers. \n", |
396 | 380 | "\n", |
397 | 381 | "- **physical type**: `object`\n", |
398 | | - "- **nullable**: yes\n", |
399 | 382 | "- **transformation**: Will convert inputs into a tuple of floats. Any null values will be stored as `np.nan`\n", |
400 | 383 | "- **koalas note**: Koalas does not support tuples, so latlongs will be stored as a list of floats\n", |
401 | 384 | "\n", |
|
404 | 387 | "Represents Logical Types that contain values specifying a duration of time.\n", |
405 | 388 | "\n", |
406 | 389 | "- **physical type**: `timedelta64[ns]`\n", |
407 | | - "- **nullable**: yes\n", |
408 | 390 | "\n", |
409 | 391 | "\n", |
410 | 392 | "Examples could include:\n", |
|
475 | 457 | "Represents Logical Types that contain long-form text or characters representing natural human language.\n", |
476 | 458 | "\n", |
477 | 459 | "- **physical type**: `string`\n", |
478 | | - "- **nullable**: yes\n", |
479 | 460 | "\n", |
480 | 461 | "Examples of natural language data:\n", |
481 | 462 | "\n", |
|
488 | 469 | "Represents Logical Types that contain address values.\n", |
489 | 470 | "\n", |
490 | 471 | "- **physical type**: `string`\n", |
491 | | - "- **nullable**: yes\n", |
492 | 472 | "\n", |
493 | 473 | "\n", |
494 | 474 | "##### Filepath\n", |
495 | 475 | "\n", |
496 | 476 | "Represents Logical Types that specify locations of directories and files in a file system.\n", |
497 | 477 | "\n", |
498 | 478 | "- **physical type**: `string`\n", |
499 | | - "- **nullable**: yes\n", |
500 | 479 | "\n", |
501 | 480 | "\n", |
502 | 481 | "##### PersonFullName\n", |
503 | 482 | "\n", |
504 | 483 | "Represents Logical Types that may contain first, middle and last names, including honorifics and suffixes.\n", |
505 | 484 | "\n", |
506 | 485 | "- **physical type**: `string`\n", |
507 | | - "- **nullable**: yes\n", |
508 | 486 | "\n", |
509 | 487 | "##### PhoneNumber\n", |
510 | 488 | "\n", |
511 | 489 | "Represents Logical Types that contain numeric digits and characters representing a phone number.\n", |
512 | 490 | "\n", |
513 | 491 | "- **physical type**: `string`\n", |
514 | | - "- **nullable**: yes\n", |
515 | 492 | "\n", |
516 | 493 | "\n", |
517 | 494 | "##### URL\n", |
518 | 495 | "\n", |
519 | 496 | "Represents Logical Types that contain URLs, which may include protocol, hostname and file name.\n", |
520 | 497 | "\n", |
521 | 498 | "- **physical type**: `string`\n", |
522 | | - "- **nullable**: yes\n", |
523 | 499 | "\n", |
524 | 500 | "##### IPAddress\n", |
525 | 501 | "\n", |
526 | 502 | "Represents Logical Types that contain IP addresses, including both IPv4 and IPv6 addresses.\n", |
527 | 503 | "\n", |
528 | | - "- **physical type**: `string`\n", |
529 | | - "- **nullable**: yes\n" |
| 504 | + "- **physical type**: `string`\n" |
530 | 505 | ] |
531 | 506 | }, |
532 | 507 | { |
|