Merged
29 commits
1 change: 1 addition & 0 deletions docs/additional-functionality/advanced_configs.md
@@ -399,6 +399,7 @@ Name | SQL Function(s) | Description | Default Value | Notes
<a name="sql.expression.CollectSet"></a>spark.rapids.sql.expression.CollectSet|`collect_set`|Collect a set of unique elements, not supported in reduction|true|None|
<a name="sql.expression.Count"></a>spark.rapids.sql.expression.Count|`count`|Count aggregate operator|true|None|
<a name="sql.expression.First"></a>spark.rapids.sql.expression.First|`first_value`, `first`|first aggregate operator|true|None|
<a name="sql.expression.HyperLogLogPlusPlus"></a>spark.rapids.sql.expression.HyperLogLogPlusPlus|`approx_count_distinct`|Aggregation approximate count distinct|true|None|
<a name="sql.expression.Last"></a>spark.rapids.sql.expression.Last|`last_value`, `last`|last aggregate operator|true|None|
<a name="sql.expression.Max"></a>spark.rapids.sql.expression.Max|`max`|Max aggregate operator|true|None|
<a name="sql.expression.MaxBy"></a>spark.rapids.sql.expression.MaxBy|`max_by`|MaxBy aggregate operator. It may produce different results than CPU when multiple rows in a group have same minimum value in the ordering column and different associated values in the value column.|true|None|
8 changes: 8 additions & 0 deletions docs/compatibility.md
@@ -865,3 +865,11 @@ Seq(0L, Long.MaxValue).toDF("val")

But this is not something that can be done generically and requires inner knowledge about
what can trigger a side effect.

## HyperLogLogPlusPlus (approx_count_distinct)
Spark supports precision values in the range [4, Infinity), whereas the GPU supports only the range [4, 14].
The precision is derived from the `rsd` parameter as follows:
```scala
Math.ceil(2.0d * Math.log(1.106d / rsd) / Math.log(2.0d)).toInt
```
`rsd` is an abbreviation of relative standard deviation.
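
To make the formula concrete, the following minimal sketch evaluates it for a few `rsd` values and checks each result against the GPU range [4, 14] stated above. The object name `HllPrecisionSketch` and the helpers `precisionFor` and `inGpuRange` are hypothetical names introduced only for this illustration; they are not part of Spark or the RAPIDS plugin.

```scala
// Illustrative sketch only: the names below are hypothetical helpers,
// not APIs of Spark or the RAPIDS Accelerator.
object HllPrecisionSketch {
  // The precision formula quoted above, applied to a given rsd.
  def precisionFor(rsd: Double): Int =
    Math.ceil(2.0d * Math.log(1.106d / rsd) / Math.log(2.0d)).toInt

  // The GPU-supported precision range [4, 14] stated in this section.
  def inGpuRange(precision: Int): Boolean =
    precision >= 4 && precision <= 14

  def main(args: Array[String]): Unit = {
    // 0.05 is the default rsd of approx_count_distinct in Spark.
    Seq(0.05, 0.01, 0.005).foreach { rsd =>
      val p = precisionFor(rsd)
      println(s"rsd=$rsd precision=$p inGpuRange=${inGpuRange(p)}")
    }
  }
}
```

Under this formula, an `rsd` of 0.05 gives precision 9 and an `rsd` of 0.01 gives precision 14, both inside the GPU range, while an `rsd` of 0.005 gives precision 16, which falls outside it.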