Daft Functions Roadmap #4824
Replies: 2 comments 17 replies
Hi @kevinzwang, is it correct to understand that the concept of Expression will go away in the future, and that everything will uniformly be called a Function, distinguished only as system built-in functions (BIF) or user-defined functions (UDF)? If so, what is the difference between a BIF and a UDF? What I can think of is:
import daft
from daft import col

@daft.func
def my_udf(x: int) -> str:  # return dtype is inferred from type hint
    return f"{x}"

df.with_column("y", my_udf(col("x")))
Take the built-in ...
df = daft.from_pydict({
    "json": [
        '{"a": 1, "b": 2}',
        '{"a": 3, "b": 4}',
    ],
})
df = df.with_column("a", df["json"].json.query(".a"))

If we abandon the concept of Expression and switch to BIF, will it evolve into the following usage style? (It is assumed that ...)
df = df.with_column("a", json_query(df["json"], ".a"))

This example is mainly to show that for users, ...
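To make the two call styles concrete, here is a minimal, library-free sketch. `Expr`, `JsonNamespace`, `col`, and `json_query` are hypothetical stand-ins written for illustration, not Daft's actual classes or functions. The only point is that a free function and a namespaced method can delegate to the same logic, so both spellings build the same expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical expression node (a stand-in, not Daft's class)."""
    op: str
    args: tuple = ()

    @property
    def json(self) -> "JsonNamespace":
        # Namespace accessor, mimicking the expr.json.query(...) style.
        return JsonNamespace(self)


@dataclass(frozen=True)
class JsonNamespace:
    """Hypothetical namespace wrapper around an expression."""
    expr: Expr

    def query(self, path: str) -> Expr:
        # The method simply delegates to the free function.
        return json_query(self.expr, path)


def col(name: str) -> Expr:
    """Hypothetical column reference."""
    return Expr("col", (name,))


def json_query(expr: Expr, path: str) -> Expr:
    """Hypothetical function-style spelling of the same operation."""
    return Expr("json_query", (expr, path))


method_style = col("json").json.query(".a")
function_style = json_query(col("json"), ".a")
assert method_style == function_style
```

Under these assumptions the choice between the two styles is purely about how the call reads, since both produce an identical node.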
Do we think calling these like normal functions would be confusing for users, given that the DataFrame API accepts ColumnInputType in many places?
df.select("a").with_column("b", do_work("a"))  # !!! ERROR !!! do_work will be evaluated on a string literal
Here "a" is a column reference in one context but a string literal in another.
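The ambiguity can be reproduced with a small library-free sketch. `Col`, `Lit`, `select`, and `do_work` below are hypothetical stand-ins, not Daft's API. The DataFrame-style helper coerces a bare string to a column reference, while the function-style helper coerces it to a literal, so the same "a" ends up meaning two different things.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Col:
    """Hypothetical column reference."""
    name: str


@dataclass(frozen=True)
class Lit:
    """Hypothetical literal value."""
    value: object


Expr = Union[Col, Lit]


def select(arg: Union[str, Expr]) -> Expr:
    """DataFrame-style helper: a bare string is read as a column name."""
    return Col(arg) if isinstance(arg, str) else arg


def do_work(arg: Union[str, Expr]) -> Expr:
    """Function-style helper: a bare string is read as a string literal.

    For brevity it just returns its coerced input.
    """
    return Lit(arg) if isinstance(arg, str) else arg


assert select("a") == Col("a")        # "a" is a column here
assert do_work("a") == Lit("a")       # the same "a" is a literal here
assert do_work(Col("a")) == Col("a")  # an explicit column is unambiguous
```

In this sketch, passing an explicit column object is what removes the ambiguity.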
Recently, we created a new daft.functions module which we hope to expand upon. The work here will serve two purposes:
- ... col("a").str.capitalize()) which users have found confusing

For each relevant method on daft.Expression or its namespaces, we will do the following:
- ... Expression. Keep the original but add a deprecation warning once the move is complete. We'll remove it in v0.6
- ... daft.functions with the same name
- ... explode is only applicable to expressions)
- ... if_else, there may not need to be an expression method variant.
- ... Expr enum or as a FunctionExpr but should really be implemented as a ScalarFunction

Tasks
- str namespace
- dt namespace
- embedding namespace
- float namespace
- url namespace
- list namespace
- struct namespace
- map namespace
- image namespace
- partitioning namespace
- json namespace
- binary namespace
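The "keep the original but add a deprecation warning" step might look roughly like the following. This is a library-free sketch under stated assumptions: `Expr`, `StrNamespace`, `col`, and `capitalize` are hypothetical stand-ins and the warning text is illustrative. Only `warnings.warn` with `DeprecationWarning` is standard Python.

```python
import warnings


class Expr:
    """Hypothetical expression node (a stand-in, not Daft's class)."""

    def __init__(self, op, args=()):
        self.op = op
        self.args = tuple(args)

    def __eq__(self, other):
        return isinstance(other, Expr) and (self.op, self.args) == (other.op, other.args)

    def capitalize(self):
        # New home: a method directly on the expression.
        return capitalize(self)

    @property
    def str(self):
        # Old namespace accessor, kept only for backwards compatibility.
        return StrNamespace(self)


class StrNamespace:
    """Hypothetical legacy namespace that forwards to the new method."""

    def __init__(self, expr):
        self._expr = expr

    def capitalize(self):
        warnings.warn(
            "expr.str.capitalize() is deprecated; "
            "use expr.capitalize() or capitalize(expr) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._expr.capitalize()


def capitalize(expr):
    """Hypothetical function-style variant with the same name."""
    return Expr("capitalize", (expr,))


def col(name):
    """Hypothetical column reference."""
    return Expr("col", (name,))


# The legacy spelling still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = col("a").str.capitalize()

new = capitalize(col("a"))
assert old == new == col("a").capitalize()
assert caught[0].category is DeprecationWarning
```

Keeping a single implementation in the free function means the method and the deprecated namespace path cannot drift apart while both exist.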