Commit c808f22

Validation and docs for datetime
1 parent ed95f78 commit c808f22

7 files changed

+137
-2
lines changed


README.md

+6
@@ -82,6 +82,7 @@ These directives are currently available:
 | [Parse as Simple Date](wrangler-docs/directives/parse-as-simple-date.md) | Parses date strings |
 | [Parse XML To JSON](wrangler-docs/directives/parse-xml-to-json.md) | Parses an XML document into a JSON structure |
 | [Parse as Currency](wrangler-docs/directives/parse-as-currency.md) | Parses a string representation of currency into a number. |
+| [Parse as Datetime](wrangler-docs/directives/parse-as-datetime.md) | Parses strings with datetime values to CDAP datetime type |
 | **Output Formatters** | |
 | [Write as CSV](wrangler-docs/directives/write-as-csv.md) | Converts a record into CSV format |
 | [Write as JSON](wrangler-docs/directives/write-as-json-map.md) | Converts the record into a JSON map |
@@ -115,6 +116,11 @@ These directives are currently available:
 | [Diff Date](wrangler-docs/directives/diff-date.md) | Calculates the difference between two dates |
 | [Format Date](wrangler-docs/directives/format-date.md) | Custom patterns for date-time formatting |
 | [Format Unix Timestamp](wrangler-docs/directives/format-unix-timestamp.md) | Formats a UNIX timestamp as a date |
+| **DateTime Transformations** | |
+| [Current DateTime](wrangler-docs/directives/current-datetime.md) | Generates the current datetime using the given zone or UTC by default |
+| [Datetime To Timestamp](wrangler-docs/directives/datetime-to-timestamp.md) | Converts a datetime value to timestamp with the given zone |
+| [Format Datetime](wrangler-docs/directives/format-datetime.md) | Formats a datetime value to custom date time pattern strings |
+| [Timestamp To Datetime](wrangler-docs/directives/timestamp-to-datetime.md) | Converts a timestamp value to datetime |
 | **Lookups** | |
 | [Catalog Lookup](wrangler-docs/directives/catalog-lookup.md) | Static catalog lookup of ICD-9, ICD-10-2016, ICD-10-2017 codes |
 | [Table Lookup](wrangler-docs/directives/table-lookup.md) | Performs lookups into Table datasets |

wrangler-core/src/main/java/io/cdap/wrangler/utils/RecordConvertor.java

+20-2
@@ -35,6 +35,7 @@
 import java.time.LocalDateTime;
 import java.time.LocalTime;
 import java.time.ZonedDateTime;
+import java.time.format.DateTimeParseException;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
@@ -112,17 +113,34 @@ public StructuredRecord decodeRecord(Row row, Schema schema) throws RecordConver
   private Object decode(String name, Object object, Schema schema) throws RecordConvertorException {
     // Extract the type of the field.
     Schema.Type type = schema.getType();
-    Schema.LogicalType logicalType = schema.getLogicalType();
+    Schema.LogicalType logicalType =
+      schema.isNullable() ? schema.getNonNullable().getLogicalType() :
+        schema.getLogicalType();

     if (logicalType != null) {
       switch (logicalType) {
+        case DATETIME:
+          if (schema.isNullable() && object == null || object instanceof LocalDateTime) {
+            return object;
+          }
+          if (object == null) {
+            throw new UnexpectedFormatException(
+              String.format("Datetime field %s should have a non null value", name));
+          }
+          try {
+            LocalDateTime.parse((String) object);
+          } catch (DateTimeParseException exception) {
+            throw new UnexpectedFormatException(
+              String.format("Datetime field '%s' with value '%s' is not in ISO-8601 format.",
+                            name, object), exception);
+          }
+          return object;
         case DATE:
         case TIME_MILLIS:
         case TIME_MICROS:
         case TIMESTAMP_MILLIS:
         case TIMESTAMP_MICROS:
         case DECIMAL:
-        case DATETIME:
           return object;
         default:
           throw new UnexpectedFormatException("field type " + logicalType + " is not supported.");
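
The new `DATETIME` branch relies on `LocalDateTime.parse`, which accepts only ISO-8601 local datetimes (no zone or offset suffix). A minimal standalone sketch of that check follows; the class `IsoDatetimeCheck` and the helper `isIsoLocalDatetime` are hypothetical names used for illustration and are not part of this commit.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the commit: mirrors the string check in decode().
final class IsoDatetimeCheck {

  // Returns true when the value parses as an ISO-8601 local datetime.
  static boolean isIsoLocalDatetime(String value) {
    try {
      LocalDateTime.parse(value);
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // ISO local datetime: accepted.
    System.out.println(isIsoLocalDatetime("2021-03-04T10:15:30"));
    // Space instead of 'T': rejected.
    System.out.println(isIsoLocalDatetime("2021-03-04 10:15:30"));
    // Trailing offset marker: rejected, since a local datetime carries no zone.
    System.out.println(isIsoLocalDatetime("2021-03-04T10:15:30Z"));
  }
}
```

Unlike `decode`, this sketch takes a `String` parameter, so it sidesteps the cast from `Object` that the committed code performs.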

wrangler-docs/directives/current-datetime.md

@@ -0,0 +1,21 @@
+# Current Datetime
+
+The CURRENT-DATETIME directive generates the current datetime using the given zone or UTC by default.
+
+
+## Syntax
+```
+current-datetime <colname> 'timezone'
+```
+
+
+## Usage Notes
+
+The CURRENT-DATETIME directive generates the current datetime using the given zone or UTC by default.
+The zone can be a region-based string like America/Los_Angeles or Europe/Paris,
+a simple offset like +08:00, or a prefix and offset like UTC+01:00, GMT+08:00, or UT+04:00.
+
+## Examples
+current-datetime :col1 'UTC-08:00'
+
+current-datetime :col2 'America/Los_Angeles'
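
The zone forms listed in the usage notes match the identifiers that `java.time.ZoneId.of` accepts. The sketch below assumes, without confirmation from this commit, that the directive resolves zones that way; `ZoneFormsSketch`, `resolveZone`, and `currentDatetime` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical sketch: resolves the documented zone forms with ZoneId.of.
final class ZoneFormsSketch {

  // Assumption: a missing or empty zone falls back to UTC, as the docs describe.
  static ZoneId resolveZone(String zone) {
    return (zone == null || zone.isEmpty()) ? ZoneId.of("UTC") : ZoneId.of(zone);
  }

  // Current wall-clock datetime in the resolved zone.
  static LocalDateTime currentDatetime(String zone) {
    return LocalDateTime.now(resolveZone(zone));
  }

  public static void main(String[] args) {
    String[] zones = {
      "America/Los_Angeles", "Europe/Paris",  // region-based
      "+08:00",                               // simple offset
      "UTC+01:00", "GMT+08:00", "UT+04:00"    // prefix and offset
    };
    for (String zone : zones) {
      System.out.println(zone + " -> " + currentDatetime(zone));
    }
    System.out.println("default -> " + currentDatetime(null));
  }
}
```

An unrecognized identifier makes `ZoneId.of` throw a `DateTimeException`, so a caller would need to decide how to report a bad zone argument.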

wrangler-docs/directives/datetime-to-timestamp.md

@@ -0,0 +1,24 @@
+# Datetime To Timestamp
+
+The DATETIME-TO-TIMESTAMP directive converts a datetime value to a timestamp with the given zone.
+
+
+## Syntax
+```
+datetime-to-timestamp <colname> 'timezone'
+```
+
+
+## Usage Notes
+
+The DATETIME-TO-TIMESTAMP directive converts datetime values to
+timestamp values using the given time zone (UTC by default).
+The zone can be a region-based string like America/Los_Angeles or Europe/Paris,
+a simple offset like +08:00, or a prefix and offset like UTC+01:00, GMT+08:00, or UT+04:00.
+
+If the column is `null`, applying this directive is a no-op.
+
+## Examples
+datetime-to-timestamp :col1 'UTC-08:00'
+
+datetime-to-timestamp :col2 'America/Los_Angeles'
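
Attaching a zone to a zoneless datetime can be sketched with `LocalDateTime.atZone`. This is an illustration under the assumption that the directive interprets the datetime as wall-clock time in the given zone; `DatetimeToTimestampSketch` and `toTimestamp` are hypothetical names, not code from this commit.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch: interprets a local datetime as wall-clock time in a zone.
final class DatetimeToTimestampSketch {

  // Null input stays null, mirroring the documented no-op for null columns.
  static ZonedDateTime toTimestamp(LocalDateTime datetime, String zone) {
    if (datetime == null) {
      return null;
    }
    return datetime.atZone(ZoneId.of(zone));
  }

  public static void main(String[] args) {
    LocalDateTime midnight = LocalDateTime.of(2021, 1, 1, 0, 0);
    // Midnight at UTC-08:00 corresponds to 08:00 UTC on the same day.
    System.out.println(toTimestamp(midnight, "UTC-08:00").toInstant());
    // Los Angeles is also eight hours behind UTC in January.
    System.out.println(toTimestamp(midnight, "America/Los_Angeles").toInstant());
  }
}
```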

wrangler-docs/directives/format-datetime.md

@@ -0,0 +1,23 @@
+# Format Datetime
+
+The FORMAT-DATETIME directive formats CDAP datetime values to custom pattern strings.
+
+
+## Syntax
+```
+format-datetime <datetime_column> "<pattern>"
+```
+
+
+## Usage Notes
+
+The FORMAT-DATETIME directive will format CDAP datetime values to custom pattern strings.
+Pattern is the format for the output string.
+
+
+If the column is `null`, applying this directive is a no-op.
+The column to be formatted should be of type datetime.
+
+
+## Examples
+See [FORMAT-DATE](format-date.md) for an explanation and examples of the pattern strings.
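
As a rough illustration of pattern-based formatting, the sketch below uses `java.time.format.DateTimeFormatter`. It assumes the directive's patterns behave like `DateTimeFormatter` patterns, which this commit does not confirm; `FormatDatetimeSketch` and `format` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch: formats a datetime with a custom pattern string.
final class FormatDatetimeSketch {

  // Null input stays null, mirroring the documented no-op for null columns.
  static String format(LocalDateTime datetime, String pattern) {
    if (datetime == null) {
      return null;
    }
    return datetime.format(DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
  }

  public static void main(String[] args) {
    LocalDateTime value = LocalDateTime.of(2021, 3, 4, 10, 15, 30);
    System.out.println(format(value, "MM/dd/yyyy HH:mm"));        // 03/04/2021 10:15
    System.out.println(format(value, "yyyy-MM-dd'T'HH:mm:ss"));   // 2021-03-04T10:15:30
  }
}
```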

wrangler-docs/directives/parse-as-datetime.md

@@ -0,0 +1,23 @@
+# Parse as Datetime
+
+The PARSE-AS-DATETIME directive parses strings with datetime values to CDAP datetime type.
+
+
+## Syntax
+```
+parse-as-datetime <column> "<pattern>"
+```
+
+
+## Usage Notes
+
+The PARSE-AS-DATETIME directive will parse strings with datetime values to CDAP
+datetime type. Pattern is the format of the input strings.
+The input values and pattern should have both a date and a time component.
+
+If the column is `null` or is already a datetime field, applying this directive
+is a no-op. The column to be parsed as a datetime should be of type string.
+
+
+## Examples
+See [FORMAT-DATE](format-date.md) for an explanation and examples of these pattern strings.
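
The requirement that input and pattern carry both a date and a time component can be illustrated with `LocalDateTime.parse`. The sketch assumes `DateTimeFormatter`-style patterns, which this commit does not confirm; `ParseDatetimeSketch` and `parse` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical sketch: parses a string into a datetime with a custom pattern.
final class ParseDatetimeSketch {

  // Null input stays null, mirroring the documented no-op for null columns.
  static LocalDateTime parse(String value, String pattern) {
    if (value == null) {
      return null;
    }
    return LocalDateTime.parse(value, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
  }

  public static void main(String[] args) {
    // Date and time present: parses to 2021-03-04T10:15.
    System.out.println(parse("03/04/2021 10:15", "MM/dd/yyyy HH:mm"));
    try {
      // Date only: there is no time component to build a datetime from.
      parse("03/04/2021", "MM/dd/yyyy");
    } catch (DateTimeParseException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```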

wrangler-docs/directives/timestamp-to-datetime.md

@@ -0,0 +1,20 @@
+# Timestamp To Datetime
+
+The TIMESTAMP-TO-DATETIME directive converts a timestamp value to datetime.
+
+
+## Syntax
+```
+timestamp-to-datetime <colname>
+```
+
+
+## Usage Notes
+
+The TIMESTAMP-TO-DATETIME directive converts timestamp values to
+datetime values.
+
+If the column is `null`, applying this directive is a no-op.
+
+## Examples
+timestamp-to-datetime :col1
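
The docs do not say which zone the resulting datetime reflects. One plausible reading, sketched here purely as an assumption, is that the timestamp's own wall-clock fields are kept and the zone is dropped; `TimestampToDatetimeSketch` and `toDatetime` are hypothetical names, not code from this commit.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch: drops the zone and keeps the timestamp's wall-clock fields.
final class TimestampToDatetimeSketch {

  // Null input stays null, mirroring the documented no-op for null columns.
  static LocalDateTime toDatetime(ZonedDateTime timestamp) {
    return timestamp == null ? null : timestamp.toLocalDateTime();
  }

  public static void main(String[] args) {
    ZonedDateTime timestamp = ZonedDateTime.of(2021, 1, 1, 8, 0, 0, 0, ZoneId.of("UTC"));
    System.out.println(toDatetime(timestamp));  // 2021-01-01T08:00
  }
}
```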
