Releases · snowflakedb/snowpark-python · GitHub

12 Jan 20:01

sfc-gh-yixie

v0.3.0 Pre-release

Pre-release

0.3.0 (2022-01-09)

New Features

Added Column.isin(), with an alias Column.in_().
Added Column.try_cast(), which is a special version of cast(). It tries to cast a string expression to other types and returns null if the cast is not possible.
Added Column.startswith() and Column.substr() to process string columns.
Column.cast() now also accepts a str value to indicate the cast type in addition to a DataType instance.
Added DataFrame.describe() to summarize stats of a DataFrame.
Added DataFrame.explain() to print the query plan of a DataFrame.
DataFrame.filter() and DataFrame.select_expr() now accepts a sql expression.
Added a new bool parameter create_temp_table to methods DataFrame.saveAsTable() and Session.write_pandas() to optionally create a temp table.
Added DataFrame.minus() and DataFrame.subtract() as aliases to DataFrame.except_().
Added regexp_replace(), concat(), concat_ws(), to_char(), current_timestamp(), current_date(), current_time(), months_between(), cast(), try_cast(), greatest(), least(), and hash() to module snowflake.snowpark.functions.

Bug Fixes

Fixed an issue where Session.createDataFrame(pandas_df) and Session.write_pandas(pandas_df) raise an exception when the Pandas DataFrame has spaces in the column name.
DataFrame.copy_into_table() sometimes prints an error level log entry while it actually works. It's fixed now.
Fixed an API docs issue where some DataFrame APIs are missing from the docs.

Dependency updates

Update snowflake-connector-python to 2.7.2, which upgrades pyarrow dependency to 6.0.x. Refer to the python connector 2.7.2 release notes for more details.

Assets 2

10 Jan 23:44

sfc-gh-yixie

v0.2.0 Pre-release

Pre-release

0.2.0 (2021-12-02)

New Features

Updated the Session.createDataFrame() method for creating a DataFrame from a Pandas DataFrame.
Added the Session.write_pandas() method for writing a Pandas DataFrame to a table in Snowflake and getting a Snowpark DataFrame object back.
Added new classes and methods for calling window functions.
Added the new functions cume_dist(), to find the cumulative distribution of a value with regard to other values within a window partition,
and row_number(), which returns a unique row number for each row within a window partition.
Added functions for computing statistics for DataFrames in the DataFrameStatFunctions class.
Added functions for handling missing values in a DataFrame in the DataFrameNaFunctions class.
Added new methods rollup(), cube(), and pivot() to the DataFrame class.
Added the GroupingSets class, which you can use with the DataFrame groupByGroupingSets method to perform a SQL GROUP BY GROUPING SETS.
Added the new FileOperation(session)
class that you can use to upload and download files to and from a stage.
Added the DataFrame.copy_into_table()
method for loading data from files in a stage into a table.
In CASE expressions, the functions when() and otherwise()
now accept Python types in addition to Column objects.
When you register a UDF you can now optionally set the replace parameter to True to overwrite an existing UDF with the same name.

Improvements

UDFs are now compressed before they are uploaded to the server. This makes them about 10 times smaller, which can help
when you are using large ML model files.
When the size of a UDF is less than 8196 bytes, it will be uploaded as in-line code instead of uploaded to a stage.

Bug Fixes

Fixed an issue where the statement df.select(when(col("a") == 1, 4).otherwise(col("a"))), [Row(4), Row(2), Row(3)] raised an exception.
Fixed an issue where df.toPandas() raised an exception when a DataFrame was created from large local data.

Assets 2

29 Oct 22:37

sfc-gh-abhatnagar

Private Preview Release Pre-release

Pre-release

Initial private preview release of snowflake-snowpark-python

Assets 2