Releases: snowflakedb/snowpark-python
v0.3.0
0.3.0 (2022-01-09)
New Features
- Added Column.isin(), with an alias Column.in_().
- Added Column.try_cast(), which is a special version of cast(). It tries to cast a string expression to other types and returns null if the cast is not possible.
- Added Column.startswith() and Column.substr() to process string columns.
- Column.cast() now also accepts a str value to indicate the cast type in addition to a DataType instance.
- Added DataFrame.describe() to summarize stats of a DataFrame.
- Added DataFrame.explain() to print the query plan of a DataFrame.
- DataFrame.filter() and DataFrame.select_expr() now accept a SQL expression.
- Added a new bool parameter create_temp_table to methods DataFrame.saveAsTable() and Session.write_pandas() to optionally create a temp table.
- Added DataFrame.minus() and DataFrame.subtract() as aliases to DataFrame.except_().
- Added regexp_replace(), concat(), concat_ws(), to_char(), current_timestamp(), current_date(), current_time(), months_between(), cast(), try_cast(), greatest(), least(), and hash() to module snowflake.snowpark.functions.
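The try_cast() behavior described above (return null instead of failing) can be pictured with a plain-Python analogue. This is only a sketch of the semantics, not Snowpark code: the helper name try_cast and the sample values are hypothetical, and Python's None stands in for SQL null.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_cast(value: Optional[str], caster: Callable[[str], T]) -> Optional[T]:
    """Plain-Python analogue of a 'try cast': return None instead of raising."""
    if value is None:
        # A null input stays null.
        return None
    try:
        return caster(value)
    except (ValueError, TypeError):
        # The cast is not possible, so yield None rather than an error.
        return None


# Hypothetical sample data: two numeric strings, one non-numeric string, one null.
raw_values = ["42", "3.5", "abc", None]
as_ints = [try_cast(v, int) for v in raw_values]
as_floats = [try_cast(v, float) for v in raw_values]
```

A regular cast would raise on "abc"; the try variant lets the rest of the column be processed and leaves a null in that position.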
Bug Fixes
- Fixed an issue where Session.createDataFrame(pandas_df) and Session.write_pandas(pandas_df) raise an exception when the Pandas DataFrame has spaces in the column name.
- Fixed an issue where DataFrame.copy_into_table() sometimes printed an error-level log entry even though the operation succeeded.
- Fixed an API docs issue where some DataFrame APIs are missing from the docs.
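Column names containing spaces are only valid in Snowflake SQL as double-quoted identifiers, which is the usual reason such names need special handling. The sketch below shows that quoting rule in plain Python. It is not the library's actual fix, and the helper name quote_identifier and the sample column names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


# Hypothetical column names, including one with a space and one with quotes.
columns = ["id", "first name", 'size "cm"']
quoted = [quote_identifier(c) for c in columns]
```

Note that double-quoted identifiers are case-sensitive in Snowflake, so quoting every name unconditionally changes how unquoted references to that column resolve.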
Dependency updates
- Update snowflake-connector-python to 2.7.2, which upgrades pyarrow dependency to 6.0.x. Refer to the python connector 2.7.2 release notes for more details.
v0.2.0
0.2.0 (2021-12-02)
New Features
- Updated the Session.createDataFrame() method for creating a DataFrame from a Pandas DataFrame.
- Added the Session.write_pandas() method for writing a Pandas DataFrame to a table in Snowflake and getting a Snowpark DataFrame object back.
- Added new classes and methods for calling window functions.
- Added the new functions cume_dist(), to find the cumulative distribution of a value with regard to other values within a window partition, and row_number(), which returns a unique row number for each row within a window partition.
- Added functions for computing statistics for DataFrames in the DataFrameStatFunctions class.
- Added functions for handling missing values in a DataFrame in the DataFrameNaFunctions class.
- Added new methods rollup(), cube(), and pivot() to the DataFrame class.
- Added the GroupingSets class, which you can use with the DataFrame groupByGroupingSets method to perform a SQL GROUP BY GROUPING SETS.
- Added the new FileOperation(session) class that you can use to upload and download files to and from a stage.
- Added the DataFrame.copy_into_table() method for loading data from files in a stage into a table.
- In CASE expressions, the functions when() and otherwise() now accept Python types in addition to Column objects.
- When you register a UDF you can now optionally set the replace parameter to True to overwrite an existing UDF with the same name.
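To make the two window functions above concrete, here is a plain-Python sketch of what they compute over a single partition ordered by value. It follows the standard SQL definitions (cumulative distribution is the share of rows with a value less than or equal to the current one) and is not Snowpark code. The function names mirror the SQL names for readability, and the sample partition is hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def row_number(values: Sequence[float]) -> List[int]:
    """1-based position of each row when the partition is ordered by value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    numbers = [0] * len(values)
    for position, index in enumerate(order, start=1):
        numbers[index] = position
    return numbers


def cume_dist(values: Sequence[float]) -> List[float]:
    """Fraction of rows in the partition whose value is <= the current row's."""
    ordered = sorted(values)
    total = len(ordered)
    return [bisect_right(ordered, v) / total for v in values]


# Hypothetical partition with one tie.
partition = [10, 20, 20, 40]
numbers = row_number(partition)
distribution = cume_dist(partition)
```

Tied rows share the same cumulative distribution but still receive distinct row numbers. Here ties keep their input order because Python's sort is stable; a database makes no such promise unless the ORDER BY is fully deterministic.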
Improvements
- UDFs are now compressed before they are uploaded to the server. This makes them about 10 times smaller, which can help when you are using large ML model files.
- When the size of a UDF is less than 8196 bytes, it will be uploaded as in-line code instead of being uploaded to a stage.
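The two improvements above amount to a size-based decision followed by compression. The sketch below illustrates that idea only. The notes do not say which compression format is used or whether the 8196-byte threshold applies before or after compression, so zlib and a pre-compression check are assumptions, and plan_upload and the sample payloads are hypothetical.

```python
import zlib
from typing import Tuple

# Threshold quoted in the release notes.
INLINE_LIMIT_BYTES = 8196


def plan_upload(serialized_udf: bytes) -> Tuple[str, bytes]:
    """Decide how a serialized UDF would be shipped (illustrative only)."""
    if len(serialized_udf) < INLINE_LIMIT_BYTES:
        # Small payloads travel in-line with the statement.
        return "inline", serialized_udf
    # Larger payloads are compressed before going to a stage.
    # zlib is an assumption; the release notes do not name the format.
    return "stage", zlib.compress(serialized_udf)


# Hypothetical payloads: one tiny, one large and highly repetitive.
small = b"x" * 100
large = b"model-weights " * 10_000

small_mode, small_payload = plan_upload(small)
large_mode, large_payload = plan_upload(large)
```

The compression ratio depends entirely on the payload, so this sketch makes no attempt to reproduce the "about 10 times smaller" figure.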
Bug Fixes
- Fixed an issue where the statement df.select(when(col("a") == 1, 4).otherwise(col("a"))), [Row(4), Row(2), Row(3)] raised an exception.
- Fixed an issue where df.toPandas() raised an exception when a DataFrame was created from large local data.
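The when()/otherwise() statement in the first fix is a CASE expression: replace the value when it equals 1, otherwise keep it. A plain-Python analogue makes the quoted result easier to read. The input values 1, 2, 3 are an assumption inferred from the rows shown in the note, and case_when_eq is a hypothetical helper, not part of Snowpark.

```python
from typing import List, Sequence


def case_when_eq(values: Sequence[int], match: int, replacement: int) -> List[int]:
    """CASE WHEN value = match THEN replacement ELSE value END, row by row."""
    return [replacement if v == match else v for v in values]


# Assumed contents of column "a".
column_a = [1, 2, 3]
result = case_when_eq(column_a, match=1, replacement=4)
```

The literal 4 here plays the role of the plain Python value passed to when(), which is the kind of argument that release 0.2.0 started accepting alongside Column objects.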
Private Preview Release
Initial private preview release of snowflake-snowpark-python