Update SQL reference guide for Log Workspaces #28591

Merged: 9 commits, Apr 10, 2025
92 changes: 82 additions & 10 deletions content/en/logs/workspaces/sql_reference.md
@@ -90,13 +90,13 @@
| `cast(value AS type)` | type | Converts the given value to the specified data type. |
| `length(string s)` | integer | Returns the number of characters in the string. |
| `trim(string s)` | string | Removes leading and trailing whitespace from the string. |
| `replace(string s, from_string s1, to_string s2)`| string | Replaces occurrences of a substring within a string with another substring. |
| `substring(string s, start_position_int i, length_int l)` | string | Extracts a substring from a string, starting at a given position and for a specified length. |
| `extract(field from timestamp/interval)` | numeric | Extracts a part of a date or time field (such as year or month) from a timestamp or interval. |
| `to_timestamp(numeric n)` | timestamp with time zone | Converts a numeric value to a timestamp with time zone. |
| `to_char(timestamp t / interval i / numeric n, format f)` | string | Converts a timestamp, interval, or numeric value to a string using a format.|
| `date_trunc(field f, source [, time_zone])` | timestamp [with time zone] / interval | Truncates a timestamp or interval to a specified precision. |
| `regexp_like(string s, pattern p [flags])` | boolean | Evaluates if a string matches a regular expression pattern. |
| `replace(string s, string from, string to)` | string | Replaces occurrences of a substring within a string with another substring. |
| `substring(string s, int start, int length)` | string | Extracts a substring from a string, starting at a given position and for a specified length. |
| `extract(unit from timestamp/interval)` | numeric | Extracts a part of a date or time field (such as year or month) from a timestamp or interval. |
| `to_timestamp(string timestamp, string format)` | timestamp | Converts a string to a timestamp according to the given format. |
| `to_char(timestamp t, string format)` | string | Converts a timestamp to a string according to the given format. |
| `date_trunc(string unit, timestamp t)` | timestamp | Truncates a timestamp to a specified precision based on the provided unit. |
| `regexp_like(string s, pattern p)` | boolean | Evaluates if a string matches a regular expression pattern. |

Check warning on line 99 in content/en/logs/workspaces/sql_reference.md (GitHub Actions / vale, Datadog.words): Use 'Boolean' instead of 'boolean'.
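
As a minimal sketch of how several of the string functions in the table combine, the query below assumes a hypothetical `web_logs` table with `message` and `request_path` string columns. None of these names come from the reference guide, and the 1-based start position for `substring` is an assumption borrowed from PostgreSQL.

```sql
-- Hypothetical table and columns (web_logs, message, request_path), for illustration only.
SELECT
  length(trim(message)) AS trimmed_length,            -- character count after stripping whitespace
  replace(request_path, '/v1/', '/v2/') AS new_path,  -- swap one substring for another
  substring(message, 1, 20) AS message_prefix         -- first 20 characters, assuming 1-based positions
FROM
  web_logs
WHERE
  regexp_like(message, 'timeout|refused')             -- keep rows whose message matches the pattern
```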


{{% collapse-content title="Examples" level="h3" %}}
@@ -177,6 +177,13 @@
{{< /code-block >}}

### `CAST`

Supported cast target types:
- `VARCHAR`
- `BIGINT`
- `DECIMAL`
- `TIMESTAMP`
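
A short sketch of one cast per supported target type follows. The `request_logs` table and its columns are hypothetical, and whether a given source value converts cleanly is assumed rather than confirmed by the guide.

```sql
-- Hypothetical table and columns; only the four target types come from the list above.
SELECT
  CAST(status_code AS VARCHAR) AS status_text,
  CAST(bytes_sent AS BIGINT) AS bytes_sent_int,
  CAST(duration AS DECIMAL) AS duration_decimal,
  CAST(logged_at AS TIMESTAMP) AS logged_at_ts
FROM
  request_logs
```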

{{< code-block lang="sql" >}}
SELECT
CAST(order_id AS VARCHAR) AS order_id_string,
@@ -225,6 +232,23 @@
{{< /code-block >}}

### `EXTRACT`

Supported extraction units:
| Literal | Input Type | Description |
| ------------------| ------------------------ | -------------------------------------------- |
| `day` | `timestamp` / `interval` | day of the month |
| `dow` | `timestamp` | day of the week `1` (Monday) to `7` (Sunday) |
| `doy` | `timestamp` | day of the year (`1` - `366`) |
| `hour` | `timestamp` / `interval` | hour of the day (`0` - `23`) |
| `minute` | `timestamp` / `interval` | minute of the hour (`0` - `59`) |
| `second` | `timestamp` / `interval` | second of the minute (`0` - `59`) |
| `week` | `timestamp` | week of the year (`1` - `53`) |
| `month` | `timestamp` | month of the year (`1` - `12`) |
| `quarter` | `timestamp` | quarter of the year (`1` - `4`) |
| `year` | `timestamp` | year |
| `timezone_hour` | `timestamp` | hour of the time-zone offset |

Probably use "timezone" instead of "time-zone" for consistency?

Same below.

Contributor Author

Actually, I think I'd rather move it to "time zone", as that's how I see it used more often in documentation (including Postgres), regardless of consistency with the literal.

| `timezone_minute` | `timestamp` | minute of the time-zone offset |
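
To illustrate a few of the units listed above, here is a hedged sketch that assumes a hypothetical `purchases` table with a `purchase_date` timestamp column; the table name is not taken from the guide.

```sql
-- Hypothetical table name; the unit literals come from the table above.
SELECT
  extract(dow FROM purchase_date) AS day_of_week,          -- 1 (Monday) to 7 (Sunday), per the table above
  extract(quarter FROM purchase_date) AS purchase_quarter, -- 1 to 4
  extract(hour FROM purchase_date) AS purchase_hour        -- 0 to 23
FROM
  purchases
```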

{{< code-block lang="sql" >}}
SELECT
extract(year FROM purchase_date) AS purchase_year
@@ -233,14 +257,50 @@
{{< /code-block >}}

### `TO_TIMESTAMP`

Supported patterns for date/time formatting:
| Pattern | Description |
| ----------- | ------------------------------------ |
| `YYYY` | year (4 digits) |
| `YY` | year (2 digits) |
| `MM` | month number (01 - 12) |
| `DD` | day of month (01 - 31) |
| `HH24` | hour of day (00 - 23) |
| `HH12` | hour of day (01 - 12) |
| `HH` | hour of day (01 - 12) |
| `MI` | minute (00 - 59) |
| `SS` | second (00 - 59) |
| `MS` | millisecond (000 - 999) |
| `TZ` | time-zone abbreviation |
| `OF` | time-zone offset from UTC |
| `AM` / `am` | meridiem indicator (without periods) |
| `PM` / `pm` | meridiem indicator (without periods) |
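
The patterns above can be combined with literal separators. The sketch below uses made-up input strings and assumes separators such as `-`, `:`, and `.` are matched literally, as they are in PostgreSQL; the guide itself does not state this.

```sql
-- Input strings are invented for illustration; patterns come from the table above.
SELECT
  to_timestamp('2025-04-10 16:23:05', 'YYYY-MM-DD HH24:MI:SS') AS ts_seconds,
  to_timestamp('2025-04-10 16:23:05.123', 'YYYY-MM-DD HH24:MI:SS.MS') AS ts_millis
```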

{{< code-block lang="sql" >}}
SELECT
to_timestamp(epoch_time) AS formatted_time
FROM
event_logs
to_timestamp('25/12/2025 04:23 pm', 'DD/MM/YYYY HH:MI am') AS ts

Something I noticed now that technically is unrelated to this PR: We seem to be inconsistent in how we case function names. E.g., CAST is upper case, this is lower case.

I think it would be good to do this consistently. And I would in general prefer upper case, but that's a weak preference. I think consistency is the more important aspect.

I'm curious on your thoughts here, but regardless I think it would make sense to tackle in a separate PR (so it can be done holistically and make that easier to review).

Contributor Author

Agreed, it seems like this entire section has them in lower case. I'll make a follow-up PR to edit them all together. Looking at Postgres, I see lower case used when listing functions in tables, but upper case in SQL statements. I prefer that: lower case is easier on the eyes when scanning a list, while upper case fits inside SQL alongside other capitalized keywords.

{{< /code-block >}}

### `TO_CHAR`

Supported patterns for date/time formatting:
| Pattern | Description |

We only have this list specified twice, right?

Seems ok to keep it duplicated for now, but might be nice to put into a shared section at least at some point. The more functions that we have that need this, the more useful a shared section would be. I'll defer to you whether that's worth doing now.

Contributor Author

Yep, just twice. I had thought about a common section, but decided against it to save the user a click. Agreed that it should be moved if it's specified more often.

| ----------- | ------------------------------------ |
| `YYYY` | year (4 digits) |
| `YY` | year (2 digits) |
| `MM` | month number (01 - 12) |
| `DD` | day of month (01 - 31) |
| `HH24` | hour of day (00 - 23) |
| `HH12` | hour of day (01 - 12) |
| `HH` | hour of day (01 - 12) |
| `MI` | minute (00 - 59) |
| `SS` | second (00 - 59) |
| `MS` | millisecond (000 - 999) |
| `TZ` | time-zone abbreviation |
| `OF` | time-zone offset from UTC |
| `AM` / `am` | meridiem indicator (without periods) |
| `PM` / `pm` | meridiem indicator (without periods) |
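
As a rough sketch of formatting the same timestamp two ways with the patterns above, the query below assumes a hypothetical `orders` table with an `order_date` timestamp column; the table name is not from the guide.

```sql
-- Hypothetical table name; patterns come from the table above.
SELECT
  to_char(order_date, 'YYYY-MM-DD HH24:MI:SS') AS sortable_text, -- 24-hour clock
  to_char(order_date, 'DD/MM/YY HH12:MI AM') AS short_text       -- 12-hour clock with meridiem indicator
FROM
  orders
```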

{{< code-block lang="sql" >}}
SELECT
to_char(order_date, 'MM-DD-YYYY') AS formatted_date
@@ -249,6 +309,18 @@
{{< /code-block >}}

### `DATE_TRUNC`

Supported truncations:
- `milliseconds`
- `seconds` / `second`
- `minutes` / `minute`
- `hours` / `hour`
- `days` / `day`
- `weeks` / `week`
- `months` / `month`
- `quarters` / `quarter`
- `years` / `year`
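
A common use of truncation is bucketing rows by time. The sketch below assumes a hypothetical `events` table with an `event_time` timestamp column and assumes that `count(*)` and `GROUP BY` are available in the workspace; neither the table name nor that pairing is shown in this section of the guide.

```sql
-- Hypothetical table name; 'hour' is one of the truncation units listed above.
SELECT
  date_trunc('hour', event_time) AS hour_start, -- start of the hour each event falls in
  count(*) AS event_count
FROM
  events
GROUP BY
  date_trunc('hour', event_time)
```

Swapping `'hour'` for another supported unit, such as `'day'` or `'week'`, changes the bucket size without altering the rest of the query.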

{{< code-block lang="sql" >}}
SELECT
date_trunc('month', event_time) AS month_start