Releases: snowflakedb/snowpark-python
Release
1.33.0 (2025-06-19)
Snowpark Python API Updates
New Features
- Added support for MySQL in `DataFrameWriter.dbapi` (PrPr) for both Parquet and UDTF-based ingestion.
- Added support for PostgreSQL in `DataFrameReader.dbapi` (PrPr) for both Parquet and UDTF-based ingestion.
- Added support for Databricks in `DataFrameWriter.dbapi` (PrPr) for UDTF-based ingestion.
- Added support to `DataFrameReader` to enable use of `PATTERN` when reading files with `INFER_SCHEMA` enabled.
- Added support for the following AI-powered functions in `functions.py`:
  - `ai_complete`
  - `ai_similarity`
  - `ai_summarize_agg` (originally `summarize_agg`)
  - different config options for `ai_classify`
- Added support for more options when reading XML files with a row tag using the `rowTag` option:
  - Added support for removing namespace prefixes from column names using the `ignoreNamespace` option.
  - Added support for specifying the prefix for the attribute column in the result table using the `attributePrefix` option.
  - Added support for excluding attributes from the XML element using the `excludeAttributes` option.
  - Added support for specifying the column name for the value when there are attributes in an element that has no child elements using the `valueTag` option.
  - Added support for specifying the value to treat as a `null` value using the `nullValue` option.
  - Added support for specifying the character encoding of the XML file using the `charset` option.
  - Added support for ignoring surrounding whitespace in the XML element using the `ignoreSurroundingWhitespace` option.
- Added support for parameter `return_dataframe` in `Session.call`, which can be used to set the return type of the functions to a `DataFrame` object.
- Added a new argument to `DataFrame.describe` called `strings_include_math_stats` that triggers `stddev` and `mean` to be calculated for String columns.
- Added support for retrieving `Edge.properties` when retrieving lineage from `DGQL` in `DataFrame.lineage.trace`.
- Added a parameter `table_exists` to `DataFrameWriter.save_as_table` that allows specifying if a table already exists. This allows skipping a table lookup that can be expensive.
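The `return_dataframe` and `table_exists` parameters above can be sketched as follows. This is a minimal illustration, not the library's own code: `call_as_dataframe` and `append_to_known_table` are hypothetical helper names, and `session` and `df` are assumed to be an existing Snowpark `Session` and `DataFrame`.

```python
from typing import Any


def call_as_dataframe(session: Any, procedure_name: str, *args: Any) -> Any:
    """Hypothetical helper: call a stored procedure and ask for a DataFrame back."""
    # return_dataframe is the Session.call parameter named in the notes above.
    return session.call(procedure_name, *args, return_dataframe=True)


def append_to_known_table(df: Any, table_name: str) -> None:
    """Hypothetical helper: append to a table the caller already knows exists."""
    # table_exists=True tells the writer the table is there, so the
    # existence lookup described in the notes can be skipped.
    df.write.save_as_table(table_name, mode="append", table_exists=True)
```

Passing `table_exists=True` only makes sense when the caller is certain the table exists; how a wrong value is handled is not covered by these notes.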
Bug Fixes
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where the `create_connection` defined as local function was incompatible with multiprocessing.
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where the Databricks `TIMESTAMP` type was converted to the Snowflake `TIMESTAMP_NTZ` type instead of the `TIMESTAMP_LTZ` type.
- Fixed a bug in `DataFrameReader.json` where repeated reads with the same reader object would create incorrectly quoted columns.
- Fixed a bug in `DataFrame.to_pandas()` that would drop column names when converting a dataframe that did not originate from a select statement.
- Fixed a bug where `DataFrame.create_or_replace_dynamic_table` raised an error when the dataframe contained a UDTF and `SELECT *` in the UDTF was not parsed correctly.
- Fixed a bug where casted columns could not be used in the values clause of `in` functions.
Improvements
- Improved the error message for `Session.write_pandas()` and `Session.create_dataframe()` when the input pandas DataFrame does not have a column.
- Improved `DataFrame.select` when the arguments contain a table function with output columns that collide with columns of the current dataframe. With the improvement, if the user provides non-colliding columns in `df.select("col1", "col2", table_func(...))` as string arguments, then the query generated by the Snowpark client will not raise an ambiguous column error.
- Improved `DataFrameReader.dbapi` (PrPr) to use in-memory Parquet-based ingestion for better performance and security.
- Improved `DataFrameReader.dbapi` (PrPr) to use `MATCH_BY_COLUMN_NAME=CASE_SENSITIVE` in copy into table operation.
Snowpark Local Testing Updates
New Features
- Added support for snow urls (snow://) in local file testing.
Bug Fixes
- Fixed a bug in `Column.isin` that would cause incorrect filtering on joined or previously filtered data.
- Fixed a bug in `snowflake.snowpark.functions.concat_ws` that would cause results to have an incorrect index.
Snowpark pandas API Updates
Dependency Updates
- Updated `modin` dependency constraint from 0.32.0 to >=0.32.0, <0.34.0. The latest version tested with Snowpark pandas is `modin` 0.33.1.
New Features
- Added support for Hybrid Execution (PrPr). By running `from modin.config import AutoSwitchBackend; AutoSwitchBackend.enable()`, Snowpark pandas will automatically choose whether to run certain pandas operations locally or on Snowflake. This feature is disabled by default.
Improvements
- Set the default value of the `index` parameter to `False` for `DataFrame.to_view`, `Series.to_view`, `DataFrame.to_dynamic_table`, and `Series.to_dynamic_table`.
- Added `iceberg_version` option to table creation functions.
- Reduced query count for many operations, including `insert`, `repr`, and `groupby`, that previously issued a query to retrieve the input data's size.
Bug Fixes
- Fixed a bug in `Series.where` when the `other` parameter is an unnamed `Series`.
Release
1.32.0 (2025-05-15)
Snowpark Python API Updates
Improvements
- Invoking Snowflake system procedures does not invoke an additional `describe procedure` call to check the return type of the procedure.
- Added support for `Session.create_dataframe()` with the stage URL and FILE data type.
- Added support for different modes for dealing with corrupt XML records when reading an XML file using `session.read.option('mode', <mode>).option('rowTag', <tag_name>).xml(<stage_file_path>)`. Currently `PERMISSIVE`, `DROPMALFORMED` and `FAILFAST` are supported.
- Improved the error message of the XML reader when the specified row tag is not found in the file.
- Improved query generation for `DataFrame.drop` to use `SELECT * EXCLUDE ()` to exclude the dropped columns. To enable this feature, set `session.conf.set("use_simplified_query_generation", True)`.
- Added support for `VariantType` to `StructType.from_json`.
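The XML `mode` option above is set on the reader before `rowTag` and the file path. A minimal sketch, assuming an existing Snowpark `session`: `read_xml_with_mode` is a hypothetical helper, and the `PERMISSIVE` default it uses is an assumption made for the example, not something these notes state.

```python
from typing import Any

# Mode names as listed in the 1.32.0 notes.
SUPPORTED_XML_MODES = ("PERMISSIVE", "DROPMALFORMED", "FAILFAST")


def read_xml_with_mode(session: Any, stage_path: str, row_tag: str,
                       mode: str = "PERMISSIVE") -> Any:
    """Hypothetical helper: read an XML stage file with a corrupt-record mode."""
    if mode not in SUPPORTED_XML_MODES:
        raise ValueError(
            f"unsupported mode {mode!r}; expected one of {SUPPORTED_XML_MODES}"
        )
    # Same call chain as in the release note: option('mode') then option('rowTag').
    return (
        session.read
        .option("mode", mode)
        .option("rowTag", row_tag)
        .xml(stage_path)
    )
```

Validating the mode on the client is a design choice of this sketch; it surfaces a typo before any query is issued.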
Bug Fixes
- Fixed a bug in `DataFrameWriter.dbapi` (PrPr) where unicode or double-quoted column names in an external database caused errors because they were not quoted correctly.
- Fixed a bug where named fields in nested OBJECT data could cause errors when containing spaces.
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug in `snowflake.snowpark.functions.rank` that would cause sort direction to not be respected.
- Fixed a bug in `snowflake.snowpark.functions.to_timestamp_*` that would cause incorrect results on filtered data.
Snowpark pandas API Updates
New Features
- Added support for dict values in `Series.str.get`, `Series.str.slice`, and `Series.str.__getitem__` (`Series.str[...]`).
- Added support for `DataFrame.to_html`.
- Added support for `DataFrame.to_string` and `Series.to_string`.
- Added support for reading files from S3 buckets using `pd.read_csv`.
Improvements
- Make `iceberg_config` a required parameter for `DataFrame.to_iceberg` and `Series.to_iceberg`.
Release
1.31.1 (2025-05-05)
Snowpark Python API Updates
Bug Fixes
- Updated conda build configuration to deprecate Python 3.8 support, preventing installation in incompatible environments.
Release
1.31.0 (2025-04-24)
Snowpark Python API Updates
New Features
- Added support for `restricted caller` permission of `execute_as` argument in `StoredProcedure.register()`.
- Added support for non-select statement in `DataFrame.to_pandas()`.
- Added support for `artifact_repository` parameter to `Session.add_packages`, `Session.add_requirements`, `Session.get_packages`, `Session.remove_package`, and `Session.clear_packages`.
- Added support for reading an XML file using a row tag by `session.read.option('rowTag', <tag_name>).xml(<stage_file_path>)` (experimental).
  - Each XML record is extracted as a separate row.
  - Each field within that record becomes a separate column of type VARIANT, which can be further queried using dot notation, e.g., `col(a.b.c)`.
- Added updates to `DataFrameReader.dbapi` (PrPr):
  - Added `fetch_merge_count` parameter for optimizing performance by merging multiple fetched data into a single Parquet file.
  - Added support for Databricks.
  - Added support for ingestion with Snowflake UDTF.
- Added support for the following AI-powered functions in `functions.py` (Private Preview):
  - `prompt`
  - `ai_filter` (added support for `prompt()` function and image files, and changed the second argument name from `expr` to `file`)
  - `ai_classify`
Improvements
- Renamed the `relaxed_ordering` param into `enforce_ordering` for `DataFrame.to_snowpark_pandas`. Also the new default value is `enforce_ordering=False` which has the opposite effect of the previous default value, `relaxed_ordering=False`.
- Improved `DataFrameReader.dbapi` (PrPr) reading performance by setting the default `fetch_size` parameter value to 1000.
- Improved the error message for invalid identifier SQL errors by suggesting the potentially matching identifiers.
- Reduced the number of describe queries issued when creating a DataFrame from a Snowflake table using `session.table`.
- Improved performance and accuracy of `DataFrameAnalyticsFunctions.time_series_agg()`.
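Because `enforce_ordering` inverts the meaning of `relaxed_ordering`, call sites written for 1.30.0 need their flag flipped rather than just renamed. A minimal sketch of that translation; `migrate_ordering_kwargs` is a hypothetical helper for illustration and is not part of Snowpark.

```python
from typing import Any, Dict


def migrate_ordering_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rewrite 1.30.0-style keyword arguments for 1.31.0."""
    migrated = dict(kwargs)  # leave the caller's dict untouched
    if "relaxed_ordering" in migrated:
        # relaxed_ordering=False meant "ordering enforced", so the new
        # flag carries the negated value.
        migrated["enforce_ordering"] = not migrated.pop("relaxed_ordering")
    return migrated


print(migrate_ordering_kwargs({"relaxed_ordering": True}))
# {'enforce_ordering': False}
```

Code that never passed `relaxed_ordering` and relied on the old default would need to pass `enforce_ordering=True` explicitly to keep the previous behavior.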
Bug Fixes
- Fixed a bug in `DataFrame.group_by().pivot().agg` when the pivot column and aggregate column are the same.
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where a `TypeError` was raised when `create_connection` returned a connection object of an unsupported driver type.
- Fixed a bug where `df.limit(0)` call would not properly apply.
- Fixed a bug in `DataFrameWriter.save_as_table` that caused reserved names to throw errors when using append mode.
Deprecations
- Deprecated support for Python 3.8.
- Deprecated argument `sliding_interval` in `DataFrameAnalyticsFunctions.time_series_agg()`.
Snowpark Local Testing Updates
New Features
- Added support for Interval expression to `Window.range_between`.
- Added support for `array_construct` function.
Bug Fixes
- Fixed a bug in local testing where transient `__pycache__` directory was unintentionally copied during stored procedure execution via import.
- Fixed a bug in local testing that created incorrect result for `Column.like` calls.
- Fixed a bug in local testing that caused `Column.getItem` and `snowflake.snowpark.functions.get` to raise `IndexError` rather than return null.
- Fixed a bug in local testing where `df.limit(0)` call would not properly apply.
- Fixed a bug in local testing where a `Table.merge` into an empty table would cause an exception.
Snowpark pandas API Updates
Dependency Updates
- Updated `modin` from 0.30.1 to 0.32.0.
- Added support for `numpy` 2.0 and above.
New Features
- Added support for `DataFrame.create_or_replace_view` and `Series.create_or_replace_view`.
- Added support for `DataFrame.create_or_replace_dynamic_table` and `Series.create_or_replace_dynamic_table`.
- Added support for `DataFrame.to_view` and `Series.to_view`.
- Added support for `DataFrame.to_dynamic_table` and `Series.to_dynamic_table`.
- Added support for `DataFrame.groupby.resample` for aggregations `max`, `mean`, `median`, `min`, and `sum`.
- Added support for reading stage files using:
  - `pd.read_excel`
  - `pd.read_html`
  - `pd.read_pickle`
  - `pd.read_sas`
  - `pd.read_xml`
- Added support for `DataFrame.to_iceberg` and `Series.to_iceberg`.
- Added support for dict values in `Series.str.len`.
Improvements
- Improve performance of `DataFrame.groupby.apply` and `Series.groupby.apply` by avoiding expensive pivot step.
- Added estimate for row count upper bound to `OrderedDataFrame` to enable better engine switching. This could potentially result in increased query counts.
- Renamed the `relaxed_ordering` param into `enforce_ordering` for `pd.read_snowflake`. Also the new default value is `enforce_ordering=False` which has the opposite effect of the previous default value, `relaxed_ordering=False`.
Bug Fixes
- Fixed a bug for `pd.read_snowflake` when reading iceberg tables and `enforce_ordering=True`.
Release
1.30.0 (2025-03-27)
Snowpark Python API Updates
New Features
- Added support for relaxed consistency and ordering guarantees in `DataFrame.to_snowpark_pandas` by introducing the new parameter `relaxed_ordering`.
- `DataFrameReader.dbapi` (PrPr) now accepts a list of strings for the `session_init_statement` parameter, allowing multiple SQL statements to be executed during session initialization.
Improvements
- Improved query generation for `DataFrame.stat.sample_by` to generate a single flat query that scales well with a large `fractions` dictionary compared to the older method of creating a UNION ALL subquery for each key in `fractions`. To enable this feature, set `session.conf.set("use_simplified_query_generation", True)`.
- Improved performance of `DataFrameReader.dbapi` by enabling the vectorized option when copying a Parquet file into a table.
- Improved query generation for `DataFrame.random_split` in the following ways. They can be enabled by setting `session.conf.set("use_simplified_query_generation", True)`:
  - Removed the need to `cache_result` in the internal implementation of the input dataframe resulting in a pure lazy dataframe operation.
  - The `seed` argument now behaves as expected with repeatable results across multiple calls and sessions.
- `DataFrame.fillna` and `DataFrame.replace` now both support fitting `int` and `float` into `Decimal` columns if `include_decimal` is set to True.
- Added documentation for the following UDF and stored procedure functions in `files.py` as a result of their General Availability.
  - `SnowflakeFile.write`
  - `SnowflakeFile.writelines`
  - `SnowflakeFile.writeable`
- Minor documentation changes for `SnowflakeFile` and `SnowflakeFile.open()`.
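The `random_split` improvements above only apply once simplified query generation is switched on for the session. A minimal sketch, assuming an existing Snowpark `session` and `df`; `split_train_test`, the 80/20 weights, and the seed value are illustrative choices, not taken from the notes.

```python
from typing import Any, Tuple


def split_train_test(session: Any, df: Any, seed: int = 42) -> Tuple[Any, Any]:
    """Hypothetical helper: opt in to simplified queries, then split 80/20."""
    # Config key exactly as given in the release note.
    session.conf.set("use_simplified_query_generation", True)
    # With the flag set, the notes say the same seed yields repeatable splits.
    train, test = df.random_split([0.8, 0.2], seed=seed)
    return train, test
```

The flag is session-wide, so setting it here also affects the other operations these notes list under simplified query generation.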
Bug Fixes
- Fixed a bug for the following functions that raised errors when `.cast()` is applied to their output:
  - `from_json`
  - `size`
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug in aggregation that caused empty groups to still produce rows.
- Fixed a bug in `DataFrame.except_` that would cause rows to be incorrectly dropped.
- Fixed a bug that caused `to_timestamp` to fail when casting filtered columns.
Snowpark pandas API Updates
New Features
- Added support for list values in `Series.str.__getitem__` (`Series.str[...]`).
- Added support for `pd.Grouper` objects in group by operations. When `freq` is specified, the default values of the `sort`, `closed`, `label`, and `convention` arguments are supported; `origin` is supported when it is `start` or `start_day`.
- Added support for relaxed consistency and ordering guarantees in `pd.read_snowflake` for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter `relaxed_ordering`.
Improvements
- Raise a warning whenever `QUOTED_IDENTIFIERS_IGNORE_CASE` is found to be set, and ask the user to unset it.
- Improved how a missing `index_label` in `DataFrame.to_snowflake` and `Series.to_snowflake` is handled when `index=True`. Instead of raising a `ValueError`, system-defined labels are used for the index columns.
- Improved error message for `groupby` or `DataFrame` or `Series.agg` when the function name is not supported.
Release
1.29.1 (2025-03-12)
Snowpark Python API Updates
Bug Fixes
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) that prevents usage in stored procedure and snowbooks.
Release
1.29.0 (2025-03-05)
Snowpark Python API Updates
New Features
- Added support for the following AI-powered functions in `functions.py` (Private Preview):
  - `ai_filter`
  - `ai_agg`
  - `summarize_agg`
- Added support for the new FILE SQL type, with the following related functions in `functions.py` (Private Preview):
  - `fl_get_content_type`
  - `fl_get_etag`
  - `fl_get_file_type`
  - `fl_get_last_modified`
  - `fl_get_relative_path`
  - `fl_get_scoped_file_url`
  - `fl_get_size`
  - `fl_get_stage`
  - `fl_get_stage_file_url`
  - `fl_is_audio`
  - `fl_is_compressed`
  - `fl_is_document`
  - `fl_is_image`
  - `fl_is_video`
- Added support for importing third-party packages from PyPi using Artifact Repository (Private Preview):
  - Use keyword arguments `artifact_repository` and `artifact_repository_packages` to specify your artifact repository and packages respectively when registering stored procedures or user defined functions.
  - Supported APIs are:
    - `Session.sproc.register`
    - `Session.udf.register`
    - `Session.udaf.register`
    - `Session.udtf.register`
    - `functions.sproc`
    - `functions.udf`
    - `functions.udaf`
    - `functions.udtf`
    - `functions.pandas_udf`
    - `functions.pandas_udtf`
Bug Fixes
- Fixed a bug where creating a Dataframe with large number of values raised `Unsupported feature 'SCOPED_TEMPORARY'.` error if thread-safe session was disabled.
- Fixed a bug where `df.describe` raised internal SQL execution error when the dataframe is created from reading a stage file and CTE optimization is enabled.
- Fixed a bug where `df.order_by(A).select(B).distinct()` would generate invalid SQL when simplified query generation was enabled using `session.conf.set("use_simplified_query_generation", True)`.
  - Disabled simplified query generation by default.
Improvements
- Improved version validation warnings for `snowflake-snowpark-python` package compatibility when registering stored procedures. Now, warnings are only triggered if the major or minor version does not match, while bugfix version differences no longer generate warnings.
- Bumped cloudpickle dependency to also support `cloudpickle==3.0.0` in addition to previous versions.
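The version rule above (warn on a major or minor mismatch, ignore bugfix differences) can be illustrated with a small comparison. This is a sketch of the stated rule only, not Snowpark's actual implementation; `major_minor` and `should_warn` are hypothetical names, and plain `MAJOR.MINOR.PATCH` version strings are assumed.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return the (major, minor) pair of a 'MAJOR.MINOR.PATCH' string."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def should_warn(client_version: str, server_version: str) -> bool:
    """Warn only when the major or minor components differ."""
    return major_minor(client_version) != major_minor(server_version)


print(should_warn("1.29.0", "1.29.1"))  # False: only the bugfix part differs
print(should_warn("1.29.0", "1.28.0"))  # True: minor versions differ
```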
Snowpark Local Testing Updates
New Features
- Added support for literal values to `range_between` window function.
Snowpark pandas API Updates
New Features
- Added support for applying Snowflake Cortex functions `ClassifyText`, `Translate`, and `ExtractAnswer`.
Improvements
- Improve error message for `pd.to_snowflake`, `DataFrame.to_snowflake`, and `Series.to_snowflake` when the table does not exist.
- Improve readability of docstring for the `if_exists` parameter in `pd.to_snowflake`, `DataFrame.to_snowflake`, and `Series.to_snowflake`.
- Improve error message for all pandas functions that use UDFs with Snowpark objects.
Bug Fixes
- Fixed a bug in `Series.rename_axis` where an `AttributeError` was being raised.
- Fixed a bug where `pd.get_dummies` didn't ignore NULL/NaN values by default.
- Fixed a bug where repeated calls to `pd.get_dummies` results in 'Duplicated column name error'.
- Fixed a bug in `pd.get_dummies` where passing list of columns generated incorrect column labels in output DataFrame.
- Update `pd.get_dummies` to return bool values instead of int.
Release
1.28.0 (2025-02-20)
Snowpark Python API Updates
New Features
- Added support for the following functions in `functions.py`:
  - `normal`
  - `randn`
- Added support for `allow_missing_columns` parameter to `DataFrame.union_by_name` and `DataFrame.union_all_by_name`.
Improvements
- Improved the random object name generation to avoid collisions.
- Improved query generation for `DataFrame.distinct` to generate `SELECT DISTINCT` instead of `SELECT` with `GROUP BY` all columns. To disable this feature, set `session.conf.set("use_simplified_query_generation", False)`.
Deprecations
- Deprecated Snowpark Python function `snowflake_cortex_summarize`. Users can install snowflake-ml-python and use the snowflake.cortex.summarize function instead.
- Deprecated Snowpark Python function `snowflake_cortex_sentiment`. Users can install snowflake-ml-python and use the snowflake.cortex.sentiment function instead.
Bug Fixes
- Fixed a bug where session-level query tag was overwritten by a stacktrace for dataframes that generate multiple queries. Now, the query tag will only be set to the stacktrace if `session.conf.set("collect_stacktrace_in_query_tag", True)`.
- Fixed a bug in `Session._write_pandas` where it was erroneously passing `use_logical_type` parameter to `Session._write_modin_pandas_helper` when writing a Snowpark pandas object.
- Fixed a bug in options sql generation that could cause multiple values to be formatted incorrectly.
- Fixed a bug in `Session.catalog` where empty strings for database or schema were not handled correctly and were generating erroneous sql statements.
Experimental Features
- Added support for writing pyarrow Tables to Snowflake tables.
Snowpark pandas API Updates
New Features
- Added support for applying Snowflake Cortex functions `Summarize` and `Sentiment`.
- Added support for list values in `Series.str.get`.
Bug Fixes
- Fixed a bug in `apply` where kwargs were not being correctly passed into the applied function.
Snowpark Local Testing Updates
New Features
- Added support for the following functions:
  - `hour`
  - `minute`
- Added support for NULL_IF parameter to csv reader.
- Added support for `date_format`, `datetime_format`, and `timestamp_format` options when loading csvs.
Bug Fixes
- Fixed a bug in `DataFrame.join` that caused columns to have incorrect typing.
- Fixed a bug in `when` statements that caused incorrect results in the `otherwise` clause.
Release
1.27.0 (2025-02-03)
Snowpark Python API Updates
New Features
- Added support for the following functions in `functions.py`:
  - `array_reverse`
  - `divnull`
  - `map_cat`
  - `map_contains_key`
  - `map_keys`
  - `nullifzero`
  - `snowflake_cortex_sentiment`
  - `acosh`
  - `asinh`
  - `atanh`
  - `bit_length`
  - `bitmap_bit_position`
  - `bitmap_bucket_number`
  - `bitmap_construct_agg`
  - `cbrt`
  - `equal_null`
  - `from_json`
  - `ifnull`
  - `localtimestamp`
  - `max_by`
  - `min_by`
  - `nth_value`
  - `nvl`
  - `octet_length`
  - `position`
  - `regr_avgx`
  - `regr_avgy`
  - `regr_count`
  - `regr_intercept`
  - `regr_r2`
  - `regr_slope`
  - `regr_sxx`
  - `regr_sxy`
  - `regr_syy`
  - `try_to_binary`
  - `base64`
  - `base64_decode_string`
  - `base64_encode`
  - `editdistance`
  - `hex`
  - `hex_encode`
  - `instr`
  - `log1p`
  - `log2`
  - `log10`
  - `percentile_approx`
  - `unbase64`
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
- Added support for `DataFrameWriter.insert_into/insertInto`. This method also supports local testing mode.
- Added support for `DataFrame.create_temp_view` to create a temporary view. It will fail if the view already exists.
- Added support for multiple columns in the functions `map_cat` and `map_concat`.
- Added an option `keep_column_order` for keeping original column order in `DataFrame.with_column` and `DataFrame.with_columns`.
- Added options to column casts that allow renaming or adding fields in StructType columns.
- Added support for `contains_null` parameter to ArrayType.
- Added support for creating a temporary view via `DataFrame.create_or_replace_temp_view` from a DataFrame created by reading a file from a stage.
- Added support for `value_contains_null` parameter to MapType.
- Added `interactive` to telemetry that indicates whether the current environment is an interactive one.
- Allow `session.file.get` in a Native App to read file paths starting with `/` from the current version.
- Added support for multiple aggregation functions after `DataFrame.pivot`.
Experimental Features
- Added `Catalog` class to manage snowflake objects. It can be accessed via `Session.catalog`.
  - `snowflake.core` is a dependency required for this feature.
- Allow user input schema when reading JSON file on stage.
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
Improvements
- Updated README.md to include instructions on how to verify package signatures using `cosign`.
Bug Fixes
- Fixed a bug in local testing mode that caused a column to contain None when it should contain 0.
- Fixed a bug in `StructField.from_json` that prevented TimestampTypes with `tzinfo` from being parsed correctly.
- Fixed a bug in function `date_format` that caused an error when the input column was date type or timestamp type.
- Fixed a bug in dataframe where a null value could be inserted in a non-nullable column.
- Fixed a bug in `replace` and `lit` which raised type hint assertion error when passing `Column` expression objects.
- Fixed a bug in `pandas_udf` and `pandas_udtf` where `session` parameter was erroneously ignored.
- Fixed a bug that raised incorrect type conversion error for system function called through `session.call`.
Snowpark pandas API Updates
New Features
- Added support for `Series.str.ljust` and `Series.str.rjust`.
- Added support for `Series.str.center`.
- Added support for `Series.str.pad`.
- Added support for applying Snowpark Python function `snowflake_cortex_sentiment`.
- Added support for `DataFrame.map`.
- Added support for `DataFrame.from_dict` and `DataFrame.from_records`.
- Added support for mixed case field names in struct type columns.
- Added support for `SeriesGroupBy.unique`.
- Added support for `Series.dt.strftime` with the following directives:
  - %d: Day of the month as a zero-padded decimal number.
  - %m: Month as a zero-padded decimal number.
  - %Y: Year with century as a decimal number.
  - %H: Hour (24-hour clock) as a zero-padded decimal number.
  - %M: Minute as a zero-padded decimal number.
  - %S: Second as a zero-padded decimal number.
  - %f: Microsecond as a decimal number, zero-padded to 6 digits.
  - %j: Day of the year as a zero-padded decimal number.
  - %X: Locale’s appropriate time representation.
  - %%: A literal '%' character.
- Added support for `Series.between`.
- Added support for `include_groups=False` in `DataFrameGroupBy.apply`.
- Added support for `expand=True` in `Series.str.split`.
- Added support for `DataFrame.pop` and `Series.pop`.
- Added support for `first` and `last` in `DataFrameGroupBy.agg` and `SeriesGroupBy.agg`.
- Added support for `Index.drop_duplicates`.
- Added support for aggregations `"count"`, `"median"`, `np.median`, `"skew"`, `"std"`, `np.std`, `"var"`, and `np.var` in `pd.pivot_table()`, `DataFrame.pivot_table()`, and `pd.crosstab()`.
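The `Series.dt.strftime` directives listed above use the same codes as Python's standard `datetime.strftime`, so their meaning can be checked locally without a Snowflake connection. A minimal sketch using only the standard library; that Snowpark pandas renders each directive identically is an assumption, and `%X` is left out because its output depends on the locale.

```python
from datetime import datetime

# Sample timestamp: 3 February 2025, 14:05:09 and 123 microseconds.
ts = datetime(2025, 2, 3, 14, 5, 9, 123)

# %d/%m/%Y date, %H:%M:%S time, %f microseconds, %j day of year, %% literal percent.
formatted = ts.strftime("%d/%m/%Y %H:%M:%S.%f (day %j) 100%%")
print(formatted)  # 03/02/2025 14:05:09.000123 (day 034) 100%
```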
Improvements
- Improve performance of `DataFrame.map`, `Series.apply` and `Series.map` methods by mapping numpy functions to snowpark functions if possible.
- Added documentation for `DataFrame.map`.
- Improve performance of `DataFrame.apply` by mapping numpy functions to snowpark functions if possible.
- Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.
- Infer return type of functions in `Series.map`, `Series.apply` and `DataFrame.map` if type-hint is not provided.
- Added `call_count` to telemetry that counts method calls including interchange protocol calls.
Release
1.26.0 (2024-12-05)
Snowpark Python API Updates
New Features
- Added support for property `version` and class method `get_active_session` for `Session` class.
- Added new methods and variables to enhance data type handling and JSON serialization/deserialization:
  - To `DataType`, its derived classes, and `StructField`:
    - `type_name`: Returns the type name of the data.
    - `simple_string`: Provides a simple string representation of the data.
    - `json_value`: Returns the data as a JSON-compatible value.
    - `json`: Converts the data to a JSON string.
  - To `ArrayType`, `MapType`, `StructField`, `PandasSeriesType`, `PandasDataFrameType` and `StructType`:
    - `from_json`: Enables these types to be created from JSON data.
  - To `MapType`:
    - `keyType`: keys of the map
    - `valueType`: values of the map
- Added support for method `appName` in `SessionBuilder`.
- Added support for `include_nulls` argument in `DataFrame.unpivot`.
- Added support for following functions in `functions.py`:
  - `size` to get size of array, object, or map columns.
  - `collect_list` an alias of `array_agg`.
  - `substring` makes `len` argument optional.
- Added parameter `ast_enabled` to session for internal usage (default: `False`).
Improvements
- Added support for specifying the following to `DataFrame.create_or_replace_dynamic_table`:
  - `iceberg_config` A dictionary that can hold the following iceberg configuration options:
    - `external_volume`
    - `catalog`
    - `base_location`
    - `catalog_sync`
    - `storage_serialization_policy`
- Added support for nested data types to `DataFrame.print_schema`.
- Added support for `level` parameter to `DataFrame.print_schema`.
- Improved flexibility of `DataFrameReader` and `DataFrameWriter` API by adding support for the following:
  - Added `format` method to `DataFrameReader` and `DataFrameWriter` to specify file format when loading or unloading results.
  - Added `load` method to `DataFrameReader` to work in conjunction with `format`.
  - Added `save` method to `DataFrameWriter` to work in conjunction with `format`.
  - Added support to read keyword arguments to `options` method for `DataFrameReader` and `DataFrameWriter`.
- Relaxed the cloudpickle dependency for Python 3.11 to simplify build requirements. However, for Python 3.11, `cloudpickle==2.2.1` remains the only supported version.
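The `format`, `load`, `save`, and keyword-argument `options` additions above combine into a reader and writer chain along these lines. A minimal sketch, assuming an existing Snowpark `session`: `copy_csv_to_parquet` is a hypothetical helper, and the stage paths and the `skip_header=1` option are placeholders chosen for illustration.

```python
from typing import Any


def copy_csv_to_parquet(session: Any, source_path: str, target_path: str) -> Any:
    """Hypothetical helper: load CSV files from a stage, unload them as Parquet."""
    df = (
        session.read
        .format("csv")            # choose the file format up front
        .options(skip_header=1)   # keyword-argument form of options()
        .load(source_path)        # load() works in conjunction with format()
    )
    # save() is the writer-side counterpart of load().
    df.write.format("parquet").save(target_path)
    return df
```

Depending on the files, a CSV read may also need an explicit schema; that detail is outside what these notes describe.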
Bug Fixes
- Removed warnings that dynamic pivot features were in private preview, because dynamic pivot is now generally available.
- Fixed a bug in `session.read.options` where `False` Boolean values were incorrectly parsed as `True` in the generated file format.
Dependency Updates
- Added a runtime dependency on `python-dateutil`.
Snowpark pandas API Updates
New Features
- Added partial support for `Series.map` when `arg` is a pandas `Series` or a `collections.abc.Mapping`. No support for instances of `dict` that implement `__missing__` but are not instances of `collections.defaultdict`.
- Added support for `DataFrame.align` and `Series.align` for `axis=1` and `axis=None`.
- Added support for `pd.json_normalize`.
- Added support for `GroupBy.pct_change` with `axis=0`, `freq=None`, and `limit=None`.
- Added support for `DataFrameGroupBy.__iter__` and `SeriesGroupBy.__iter__`.
- Added support for `np.sqrt`, `np.trunc`, `np.floor`, numpy trig functions, `np.exp`, `np.abs`, `np.positive` and `np.negative`.
- Added partial support for the dataframe interchange protocol method `DataFrame.__dataframe__()`.
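The `Series.map` limitation above concerns `dict` subclasses that define `__missing__` without being a `collections.defaultdict`. The following sketch shows which kinds of mapping fall into that category; `FallbackDict` and `is_unsupported_map_arg` are hypothetical names for illustration, not Snowpark pandas code.

```python
from collections import defaultdict
from typing import Any


class FallbackDict(dict):
    """A dict subclass with __missing__: the case the notes call unsupported."""

    def __missing__(self, key: Any) -> str:
        return "unknown"


def is_unsupported_map_arg(arg: Any) -> bool:
    """True for dicts that implement __missing__ but are not defaultdicts."""
    return (
        isinstance(arg, dict)
        and not isinstance(arg, defaultdict)
        and hasattr(type(arg), "__missing__")
    )


print(is_unsupported_map_arg({"a": 1}))                      # False: plain dict
print(is_unsupported_map_arg(defaultdict(int)))              # False: defaultdict
print(is_unsupported_map_arg(FallbackDict({"a": "alpha"})))  # True
```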
Bug Fixes
- Fixed a bug in `df.loc` where setting a single column from a series results in unexpected `None` values.
Improvements
- Use UNPIVOT INCLUDE NULLS for unpivot operations in pandas instead of sentinel values.
- Improved documentation for pd.read_excel.