Releases: snowflakedb/snowpark-python

1.33.0 (2025-06-19)

Snowpark Python API Updates

New Features

  • Added support for MySQL in DataFrameWriter.dbapi (PrPr) for both Parquet and UDTF-based ingestion.
  • Added support for PostgreSQL in DataFrameReader.dbapi (PrPr) for both Parquet and UDTF-based ingestion.
  • Added support for Databricks in DataFrameWriter.dbapi (PrPr) for UDTF-based ingestion.
  • Added support to DataFrameReader to enable use of PATTERN when reading files with INFER_SCHEMA enabled.
  • Added support for the following AI-powered functions in functions.py:
    • ai_complete
    • ai_similarity
    • ai_summarize_agg (originally summarize_agg)
    • different config options for ai_classify
  • Added support for more options when reading XML files with a row tag using the rowTag option (see the sketch after this list):
    • Added support for removing namespace prefixes from col names using ignoreNamespace option.
    • Added support for specifying the prefix for the attribute column in the result table using attributePrefix option.
    • Added support for excluding attributes from the XML element using excludeAttributes option.
    • Added support for specifying the column name for the value when there are attributes in an element that has no child elements using valueTag option.
    • Added support for specifying the value to treat as a null value using nullValue option.
    • Added support for specifying the character encoding of the XML file using charset option.
    • Added support for ignoring surrounding whitespace in the XML element using ignoreSurroundingWhitespace option.
  • Added support for the parameter return_dataframe in Session.call, which can be used to set the return type of the function to a DataFrame object.
  • Added a new argument to Dataframe.describe called strings_include_math_stats that causes stddev and mean to be calculated for string columns.
  • Added support for retrieving Edge.properties when retrieving lineage from DGQL in DataFrame.lineage.trace.
  • Added a parameter table_exists to DataFrameWriter.save_as_table that allows specifying if a table already exists. This allows skipping a table lookup that can be expensive.
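
For illustration, a minimal sketch of the new XML reader options listed above. The stage path, row tag, and option values are hypothetical; only the option names come from these notes, and session is an existing snowflake.snowpark.Session.

    # Hypothetical stage path and row tag; option values shown are illustrative.
    df = (
        session.read
        .option("rowTag", "book")                    # each <book> element becomes one row
        .option("ignoreNamespace", True)             # strip namespace prefixes from column names
        .option("attributePrefix", "_")              # prefix for attribute columns in the result
        .option("excludeAttributes", False)          # keep XML attributes
        .option("valueTag", "_value")                # column name for element text when attributes are present
        .option("nullValue", "N/A")                  # treat this literal as a null value
        .option("charset", "utf-8")                  # character encoding of the XML file
        .option("ignoreSurroundingWhitespace", True) # trim surrounding whitespace in element values
        .xml("@my_stage/books.xml")
    )
    df.show()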

Bug Fixes

  • Fixed a bug in DataFrameReader.dbapi (PrPr) where the create_connection defined as local function was incompatible with multiprocessing.
  • Fixed a bug in DataFrameReader.dbapi (PrPr) where the Databricks TIMESTAMP type was converted to the Snowflake TIMESTAMP_NTZ type instead of TIMESTAMP_LTZ.
  • Fixed a bug in DataFrameReader.json where repeated reads with the same reader object would create incorrectly quoted columns.
  • Fixed a bug in DataFrame.to_pandas() that would drop column names when converting a dataframe that did not originate from a select statement.
  • Fixed a bug where DataFrame.create_or_replace_dynamic_table raised an error when the dataframe contained a UDTF because SELECT * in the UDTF was not parsed correctly.
  • Fixed a bug where casted columns could not be used in the values-clause of in functions.

Improvements

  • Improved the error message for Session.write_pandas() and Session.create_dataframe() when the input pandas DataFrame does not have a column.
  • Improved DataFrame.select when the arguments contain a table function whose output columns collide with columns of the current dataframe. With this improvement, if the user provides non-colliding columns as string arguments in df.select("col1", "col2", table_func(...)), the query generated by the Snowpark client will not raise an ambiguous column error (see the sketch after this list).
  • Improved DataFrameReader.dbapi (PrPr) to use in-memory Parquet-based ingestion for better performance and security.
  • Improved DataFrameReader.dbapi (PrPr) to use MATCH_BY_COLUMN_NAME=CASE_SENSITIVE in copy into table operation.
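
A sketch of the DataFrame.select improvement above, assuming the built-in split_to_table table function; the data and column names are made up.

    from snowflake.snowpark.functions import lit, table_function

    # split_to_table outputs SEQ, INDEX, and VALUE, which do not collide with col1/col2.
    split_to_table = table_function("split_to_table")
    df = session.create_dataframe([["a", "x,y,z"]], schema=["col1", "col2"])
    # Passing the non-colliding column "col1" as a string no longer raises an
    # ambiguous column error in the generated query.
    df.select("col1", split_to_table(df["col2"], lit(","))).show()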

Snowpark Local Testing Updates

New Features

  • Added support for snow urls (snow://) in local file testing.

Bug Fixes

  • Fixed a bug in Column.isin that would cause incorrect filtering on joined or previously filtered data.
  • Fixed a bug in snowflake.snowpark.functions.concat_ws that would cause results to have an incorrect index.

Snowpark pandas API Updates

Dependency Updates

  • Updated modin dependency constraint from 0.32.0 to >=0.32.0, <0.34.0. The latest version tested with Snowpark pandas is modin 0.33.1.

New Features

  • Added support for Hybrid Execution (PrPr). By running from modin.config import AutoSwitchBackend; AutoSwitchBackend.enable(), Snowpark pandas will automatically choose whether to run certain pandas operations locally or on Snowflake. This feature is disabled by default.
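
A minimal sketch of turning this on, using the import shown above; the plugin import is the standard Snowpark pandas entry point and the data is made up.

    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin  # registers Snowpark pandas as a modin backend
    from modin.config import AutoSwitchBackend

    AutoSwitchBackend.enable()  # let Snowpark pandas choose local vs. Snowflake execution

    df = pd.DataFrame({"a": [1, 2, 3]})  # small data like this may now run locally
    print(df.sum())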

Improvements

  • Set the default value of the index parameter to False for DataFrame.to_view, Series.to_view, DataFrame.to_dynamic_table, and Series.to_dynamic_table.
  • Added iceberg_version option to table creation functions.
  • Reduced query count for many operations, including insert, repr, and groupby, that previously issued a query to retrieve the input data's size.

Bug Fixes

  • Fixed a bug in Series.where when the other parameter is an unnamed Series.

1.32.0 (2025-05-15)

Snowpark Python API Updates

Improvements

  • Invoking Snowflake system procedures no longer issues an additional describe procedure call to check the procedure's return type.
  • Added support for Session.create_dataframe() with the stage URL and FILE data type.
  • Added support for different modes for dealing with corrupt XML records when reading an XML file using session.read.option('mode', <mode>).option('rowTag', <tag_name>).xml(<stage_file_path>). Currently PERMISSIVE, DROPMALFORMED, and FAILFAST are supported (see the sketch after this list).
  • Improved the error message of the XML reader when the specified row tag is not found in the file.
  • Improved query generation for Dataframe.drop to use SELECT * EXCLUDE () to exclude the dropped columns. To enable this feature, set session.conf.set("use_simplified_query_generation", True).
  • Added support for VariantType to StructType.from_json
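
A sketch of the corrupt-record modes described above; the stage path and row tag are hypothetical.

    # mode can be PERMISSIVE, DROPMALFORMED, or FAILFAST, per the note above.
    df = (
        session.read
        .option("mode", "PERMISSIVE")
        .option("rowTag", "book")
        .xml("@my_stage/books.xml")
    )
    df.show()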

Bug Fixes

  • Fixed a bug in DataFrameWriter.dbapi (PrPr) where Unicode or double-quoted column names in the external database caused errors because they were not quoted correctly.
  • Fixed a bug where named fields in nested OBJECT data could cause errors when containing spaces.

Snowpark Local Testing Updates

Bug Fixes

  • Fixed a bug in snowflake.snowpark.functions.rank that would cause sort direction to not be respected.
  • Fixed a bug in snowflake.snowpark.functions.to_timestamp_* that would cause incorrect results on filtered data.

Snowpark pandas API Updates

New Features

  • Added support for dict values in Series.str.get, Series.str.slice, and Series.str.__getitem__ (Series.str[...]).
  • Added support for DataFrame.to_html.
  • Added support for DataFrame.to_string and Series.to_string.
  • Added support for reading files from S3 buckets using pd.read_csv.

Improvements

  • Made iceberg_config a required parameter for DataFrame.to_iceberg and Series.to_iceberg.
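
A sketch of calling to_iceberg with the now-required iceberg_config; the table name, volume, and location are placeholders, and the config keys are assumed to mirror the iceberg_config options listed for dynamic tables later in these notes.

    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin

    df = pd.DataFrame({"id": [1, 2]})
    # Placeholder identifiers; iceberg_config must now be supplied explicitly.
    df.to_iceberg(
        "MY_DB.MY_SCHEMA.MY_ICEBERG_TABLE",
        iceberg_config={
            "external_volume": "my_external_volume",
            "catalog": "SNOWFLAKE",
            "base_location": "my/base/location",
        },
    )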

1.31.1 (2025-05-05)

Snowpark Python API Updates

Bug Fixes

  • Updated conda build configuration to deprecate Python 3.8 support, preventing installation in incompatible environments.

1.31.0 (2025-04-24)

Snowpark Python API Updates

New Features

  • Added support for restricted caller permission of execute_as argument in StoredProcedure.register().
  • Added support for non-select statement in DataFrame.to_pandas().
  • Added support for artifact_repository parameter to Session.add_packages, Session.add_requirements, Session.get_packages, Session.remove_package, and Session.clear_packages.
  • Added support for reading an XML file using a row tag by session.read.option('rowTag', <tag_name>).xml(<stage_file_path>) (experimental).
    • Each XML record is extracted as a separate row.
    • Each field within that record becomes a separate column of type VARIANT, which can be further queried using dot notation, e.g., col("a.b.c") (see the sketch after this list).
  • Added updates to DataFrameReader.dbapi (PrPr):
    • Added fetch_merge_count parameter for optimizing performance by merging multiple fetched data into a single Parquet file.
    • Added support for Databricks.
    • Added support for ingestion with Snowflake UDTF.
  • Added support for the following AI-powered functions in functions.py (Private Preview):
    • prompt
    • ai_filter (added support for prompt() function and image files, and changed the second argument name from expr to file)
    • ai_classify
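
A sketch of the experimental rowTag XML reader and dot-notation access described above; the stage path, tag, and field names are hypothetical.

    from snowflake.snowpark.functions import col

    # Each <book> element becomes a row; each field becomes a VARIANT column.
    df = session.read.option("rowTag", "book").xml("@my_stage/books.xml")
    # A nested field can be reached with dot notation, per the note above.
    df.select(col("author.name")).show()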

Improvements

  • Renamed the relaxed_ordering parameter to enforce_ordering for DataFrame.to_snowpark_pandas. The new default value, enforce_ordering=False, has the opposite effect of the previous default, relaxed_ordering=False.
  • Improved DataFrameReader.dbapi (PrPr) reading performance by setting the default fetch_size parameter value to 1000.
  • Improved the error message for the invalid identifier SQL error by suggesting potentially matching identifiers.
  • Reduced the number of describe queries issued when creating a DataFrame from a Snowflake table using session.table.
  • Improved performance and accuracy of DataFrameAnalyticsFunctions.time_series_agg().

Bug Fixes

  • Fixed a bug in DataFrame.group_by().pivot().agg when the pivot column and aggregate column are the same.
  • Fixed a bug in DataFrameReader.dbapi (PrPr) where a TypeError was raised when create_connection returned a connection object of an unsupported driver type.
  • Fixed a bug where df.limit(0) call would not properly apply.
  • Fixed a bug in DataFrameWriter.save_as_table that caused reserved names to throw errors when using append mode.

Deprecations

  • Deprecated support for Python 3.8.
  • Deprecated argument sliding_interval in DataFrameAnalyticsFunctions.time_series_agg().

Snowpark Local Testing Updates

New Features

  • Added support for Interval expression to Window.range_between.
  • Added support for array_construct function.

Bug Fixes

  • Fixed a bug in local testing where transient __pycache__ directory was unintentionally copied during stored procedure execution via import.
  • Fixed a bug in local testing that created incorrect result for Column.like calls.
  • Fixed a bug in local testing that caused Column.getItem and snowflake.snowpark.functions.get to raise IndexError rather than return null.
  • Fixed a bug in local testing where df.limit(0) call would not properly apply.
  • Fixed a bug in local testing where a Table.merge into an empty table would cause an exception.

Snowpark pandas API Updates

Dependency Updates

  • Updated modin from 0.30.1 to 0.32.0.
  • Added support for numpy 2.0 and above.

New Features

  • Added support for DataFrame.create_or_replace_view and Series.create_or_replace_view.
  • Added support for DataFrame.create_or_replace_dynamic_table and Series.create_or_replace_dynamic_table.
  • Added support for DataFrame.to_view and Series.to_view.
  • Added support for DataFrame.to_dynamic_table and Series.to_dynamic_table.
  • Added support for DataFrame.groupby.resample for aggregations max, mean, median, min, and sum.
  • Added support for reading stage files using:
    • pd.read_excel
    • pd.read_html
    • pd.read_pickle
    • pd.read_sas
    • pd.read_xml
  • Added support for DataFrame.to_iceberg and Series.to_iceberg.
  • Added support for dict values in Series.str.len.

Improvements

  • Improved the performance of DataFrame.groupby.apply and Series.groupby.apply by avoiding an expensive pivot step.
  • Added estimate for row count upper bound to OrderedDataFrame to enable better engine switching. This could potentially result in increased query counts.
  • Renamed the relaxed_ordering parameter to enforce_ordering for pd.read_snowflake. The new default value, enforce_ordering=False, has the opposite effect of the previous default, relaxed_ordering=False.

Bug Fixes

  • Fixed a bug for pd.read_snowflake when reading iceberg tables and enforce_ordering=True.

1.30.0 (2025-03-27)

Snowpark Python API Updates

New Features

  • Added support for relaxed consistency and ordering guarantees in Dataframe.to_snowpark_pandas by introducing the new parameter relaxed_ordering.
  • DataFrameReader.dbapi (PrPr) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.

Improvements

  • Improved query generation for Dataframe.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, set session.conf.set("use_simplified_query_generation", True).
  • Improved the performance of DataFrameReader.dbapi by enabling the vectorized option when copying the Parquet file into a table.
  • Improved query generation for DataFrame.random_split in the following ways. They can be enabled by setting session.conf.set("use_simplified_query_generation", True) (see the sketch after this list):
    • Removed the need to cache_result in the internal implementation of the input dataframe, resulting in a purely lazy dataframe operation.
    • The seed argument now behaves as expected with repeatable results across multiple calls and sessions.
  • DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.
  • Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability.
    • SnowflakeFile.write
    • SnowflakeFile.writelines
    • SnowflakeFile.writeable
  • Minor documentation changes for SnowflakeFile and SnowflakeFile.open()
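
A sketch of the simplified query generation opt-in and the now-repeatable random_split seed; the weights and seed are arbitrary.

    # Opt in to the simplified query generation described above.
    session.conf.set("use_simplified_query_generation", True)

    df = session.range(1000)
    # The split is now a purely lazy operation, and the seed gives repeatable
    # results across calls and sessions.
    train, test = df.random_split([0.8, 0.2], seed=42)
    print(train.count(), test.count())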

Bug Fixes

  • Fixed a bug for the following functions that raised errors when .cast() was applied to their output:
    • from_json
    • size

Snowpark Local Testing Updates

Bug Fixes

  • Fixed a bug in aggregation that caused empty groups to still produce rows.
  • Fixed a bug in Dataframe.except_ that would cause rows to be incorrectly dropped.
  • Fixed a bug that caused to_timestamp to fail when casting filtered columns.

Snowpark pandas API Updates

New Features

  • Added support for list values in Series.str.__getitem__ (Series.str[...]).
  • Added support for pd.Grouper objects in group by operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day (see the sketch after this list).
  • Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter relaxed_ordering.
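
A sketch of grouping with pd.Grouper when freq is specified; the data is made up, and origin="start_day" is one of the supported values noted above.

    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin

    df = pd.DataFrame(
        {"ts": pd.date_range("2025-01-01", periods=6, freq="12h"), "v": range(6)}
    )
    # Group rows into daily buckets; sort/closed/label/convention keep their defaults.
    print(df.groupby(pd.Grouper(key="ts", freq="1D", origin="start_day")).sum())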

Improvements

  • Raised a warning whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.
  • Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.
  • Improved the error message for groupby, DataFrame.agg, and Series.agg when the function name is not supported.

1.29.1 (2025-03-12)

Snowpark Python API Updates

Bug Fixes

  • Fixed a bug in DataFrameReader.dbapi (PrPr) that prevented usage in stored procedures and snowbooks.

1.29.0 (2025-03-05)

Snowpark Python API Updates

New Features

  • Added support for the following AI-powered functions in functions.py (Private Preview):
    • ai_filter
    • ai_agg
    • summarize_agg
  • Added support for the new FILE SQL type, with the following related functions in functions.py (Private Preview):
    • fl_get_content_type
    • fl_get_etag
    • fl_get_file_type
    • fl_get_last_modified
    • fl_get_relative_path
    • fl_get_scoped_file_url
    • fl_get_size
    • fl_get_stage
    • fl_get_stage_file_url
    • fl_is_audio
    • fl_is_compressed
    • fl_is_document
    • fl_is_image
    • fl_is_video
  • Added support for importing third-party packages from PyPi using Artifact Repository (Private Preview):
    • Use the keyword arguments artifact_repository and artifact_repository_packages to specify your artifact repository and packages, respectively, when registering stored procedures or user defined functions (see the sketch after this list).
    • Supported APIs are:
      • Session.sproc.register
      • Session.udf.register
      • Session.udaf.register
      • Session.udtf.register
      • functions.sproc
      • functions.udf
      • functions.udaf
      • functions.udtf
      • functions.pandas_udf
      • functions.pandas_udtf
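
A sketch of registering a UDF with the two keyword arguments named above; the repository name and package list are placeholders, and session is an existing Snowpark Session.

    def plus_one(x: int) -> int:
        return x + 1

    # Placeholder repository and packages pulled from PyPI via the artifact repository.
    plus_one_udf = session.udf.register(
        plus_one,
        name="plus_one_udf",
        replace=True,
        artifact_repository="MY_ARTIFACT_REPOSITORY",
        artifact_repository_packages=["scikit-learn"],
    )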

Bug Fixes

  • Fixed a bug where creating a Dataframe with a large number of values raised an Unsupported feature 'SCOPED_TEMPORARY' error if the thread-safe session was disabled.
  • Fixed a bug where df.describe raised an internal SQL execution error when the dataframe was created from reading a stage file and CTE optimization was enabled.
  • Fixed a bug where df.order_by(A).select(B).distinct() would generate invalid SQL when simplified query generation was enabled using session.conf.set("use_simplified_query_generation", True).
    • Disabled simplified query generation by default.

Improvements

  • Improved version validation warnings for snowflake-snowpark-python package compatibility when registering stored procedures. Now, warnings are only triggered if the major or minor version does not match, while bugfix version differences no longer generate warnings.
  • Bumped cloudpickle dependency to also support cloudpickle==3.0.0 in addition to previous versions.

Snowpark Local Testing Updates

New Features

  • Added support for literal values to range_between window function.

Snowpark pandas API Updates

New Features

  • Added support for applying Snowflake Cortex functions ClassifyText, Translate, and ExtractAnswer.

Improvements

  • Improved the error message for pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake when the table does not exist.
  • Improved the readability of the docstring for the if_exists parameter in pd.to_snowflake, DataFrame.to_snowflake, and Series.to_snowflake.
  • Improved the error message for all pandas functions that use UDFs with Snowpark objects.

Bug Fixes

  • Fixed a bug in Series.rename_axis where an AttributeError was being raised.
  • Fixed a bug where pd.get_dummies didn't ignore NULL/NaN values by default.
  • Fixed a bug where repeated calls to pd.get_dummies resulted in a 'Duplicated column name' error.
  • Fixed a bug in pd.get_dummies where passing a list of columns generated incorrect column labels in the output DataFrame.
  • Updated pd.get_dummies to return bool values instead of int.

1.28.0 (2025-02-20)

Snowpark Python API Updates

New Features

  • Added support for the following functions in functions.py
    • normal
    • randn
  • Added support for allow_missing_columns parameter to Dataframe.union_by_name and Dataframe.union_all_by_name.
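
A sketch of the allow_missing_columns parameter; the frames are made up, and columns missing from one side are presumably filled with NULLs.

    df1 = session.create_dataframe([[1, "a"]], schema=["id", "name"])
    df2 = session.create_dataframe([[2]], schema=["id"])
    # Without allow_missing_columns=True, mismatched schemas raise an error.
    df1.union_by_name(df2, allow_missing_columns=True).show()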

Improvements

  • Improved the random object name generation to avoid collisions.
  • Improved query generation for Dataframe.distinct to generate SELECT DISTINCT instead of SELECT with GROUP BY all columns. To disable this feature, set session.conf.set("use_simplified_query_generation", False).

Deprecations

  • Deprecated Snowpark Python function snowflake_cortex_summarize. Users can install snowflake-ml-python and use the snowflake.cortex.summarize function instead.
  • Deprecated Snowpark Python function snowflake_cortex_sentiment. Users can install snowflake-ml-python and use the snowflake.cortex.sentiment function instead.

Bug Fixes

  • Fixed a bug where the session-level query tag was overwritten by a stacktrace for dataframes that generate multiple queries. Now, the query tag is only set to the stacktrace if session.conf.set("collect_stacktrace_in_query_tag", True) is configured.
  • Fixed a bug in Session._write_pandas where it was erroneously passing use_logical_type parameter to Session._write_modin_pandas_helper when writing a Snowpark pandas object.
  • Fixed a bug in options sql generation that could cause multiple values to be formatted incorrectly.
  • Fixed a bug in Session.catalog where empty strings for database or schema were not handled correctly and were generating erroneous sql statements.

Experimental Features

  • Added support for writing pyarrow Tables to Snowflake tables.

Snowpark pandas API Updates

New Features

  • Added support for applying Snowflake Cortex functions Summarize and Sentiment.
  • Added support for list values in Series.str.get.

Bug Fixes

  • Fixed a bug in apply where kwargs were not being correctly passed into the applied function.

Snowpark Local Testing Updates

New Features

  • Added support for the following functions
    • hour
    • minute
  • Added support for the NULL_IF parameter to the CSV reader.
  • Added support for date_format, datetime_format, and timestamp_format options when loading CSVs.

Bug Fixes

  • Fixed a bug in Dataframe.join that caused columns to have incorrect typing.
  • Fixed a bug in when statements that caused incorrect results in the otherwise clause.

1.27.0 (2025-02-03)

Snowpark Python API Updates

New Features

  • Added support for the following functions in functions.py
    • array_reverse
    • divnull
    • map_cat
    • map_contains_key
    • map_keys
    • nullifzero
    • snowflake_cortex_sentiment
    • acosh
    • asinh
    • atanh
    • bit_length
    • bitmap_bit_position
    • bitmap_bucket_number
    • bitmap_construct_agg
    • cbrt
    • equal_null
    • from_json
    • ifnull
    • localtimestamp
    • max_by
    • min_by
    • nth_value
    • nvl
    • octet_length
    • position
    • regr_avgx
    • regr_avgy
    • regr_count
    • regr_intercept
    • regr_r2
    • regr_slope
    • regr_sxx
    • regr_sxy
    • regr_syy
    • try_to_binary
    • base64
    • base64_decode_string
    • base64_encode
    • editdistance
    • hex
    • hex_encode
    • instr
    • log1p
    • log2
    • log10
    • percentile_approx
    • unbase64
  • Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe (see the sketch after this list).
  • Added support for DataFrameWriter.insert_into/insertInto. This method also supports local testing mode.
  • Added support for DataFrame.create_temp_view to create a temporary view. It will fail if the view already exists.
  • Added support for multiple columns in the functions map_cat and map_concat.
  • Added an option keep_column_order for keeping original column order in DataFrame.with_column and DataFrame.with_columns.
  • Added options to column casts that allow renaming or adding fields in StructType columns.
  • Added support for contains_null parameter to ArrayType.
  • Added support for creating a temporary view via DataFrame.create_or_replace_temp_view from a DataFrame created by reading a file from a stage.
  • Added support for value_contains_null parameter to MapType.
  • Added interactive to telemetry that indicates whether the current environment is an interactive one.
  • Allowed session.file.get in a Native App to read file paths starting with / from the current version.
  • Added support for multiple aggregation functions after DataFrame.pivot.
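
A sketch of passing a schema string when creating a dataframe (shown here on Session.create_dataframe); the DDL-like string syntax and the data are assumptions.

    # Implicit struct syntax: field names and types without an explicit struct<> wrapper.
    df = session.create_dataframe(
        [[1, "Alice"], [2, "Bob"]],
        schema="id int, name string",
    )
    df.print_schema()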

Experimental Features

  • Added Catalog class to manage snowflake objects. It can be accessed via Session.catalog.
    • snowflake.core is a dependency required for this feature.
  • Allowed a user-provided schema when reading a JSON file on a stage.
  • Added support for specifying a schema string (including implicit struct syntax) when calling DataFrame.create_dataframe.

Improvements

  • Updated README.md to include instructions on how to verify package signatures using cosign.

Bug Fixes

  • Fixed a bug in local testing mode that caused a column to contain None when it should contain 0.
  • Fixed a bug in StructField.from_json that prevented TimestampTypes with tzinfo from being parsed correctly.
  • Fixed a bug in function date_format that caused an error when the input column was date type or timestamp type.
  • Fixed a bug in dataframe where a null value could be inserted into a non-nullable column.
  • Fixed a bug in replace and lit which raised a type hint assertion error when passing Column expression objects.
  • Fixed a bug in pandas_udf and pandas_udtf where the session parameter was erroneously ignored.
  • Fixed a bug that raised an incorrect type conversion error for system functions called through session.call.

Snowpark pandas API Updates

New Features

  • Added support for Series.str.ljust and Series.str.rjust.
  • Added support for Series.str.center.
  • Added support for Series.str.pad.
  • Added support for applying Snowpark Python function snowflake_cortex_sentiment.
  • Added support for DataFrame.map.
  • Added support for DataFrame.from_dict and DataFrame.from_records.
  • Added support for mixed case field names in struct type columns.
  • Added support for SeriesGroupBy.unique
  • Added support for Series.dt.strftime with the following directives (see the sketch after this list):
    • %d: Day of the month as a zero-padded decimal number.
    • %m: Month as a zero-padded decimal number.
    • %Y: Year with century as a decimal number.
    • %H: Hour (24-hour clock) as a zero-padded decimal number.
    • %M: Minute as a zero-padded decimal number.
    • %S: Second as a zero-padded decimal number.
    • %f: Microsecond as a decimal number, zero-padded to 6 digits.
    • %j: Day of the year as a zero-padded decimal number.
    • %X: Locale’s appropriate time representation.
    • %%: A literal '%' character.
  • Added support for Series.between.
  • Added support for include_groups=False in DataFrameGroupBy.apply.
  • Added support for expand=True in Series.str.split.
  • Added support for DataFrame.pop and Series.pop.
  • Added support for first and last in DataFrameGroupBy.agg and SeriesGroupBy.agg.
  • Added support for Index.drop_duplicates.
  • Added support for aggregations "count", "median", np.median,
    "skew", "std", np.std "var", and np.var in
    pd.pivot_table(), DataFrame.pivot_table(), and pd.crosstab().
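
A sketch of Series.dt.strftime using directives from the supported list above; the timestamps are made up.

    import modin.pandas as pd
    import snowflake.snowpark.modin.plugin

    s = pd.Series(pd.to_datetime(["2025-01-02 03:04:05", "2025-12-31 23:59:59"]))
    print(s.dt.strftime("%Y-%m-%d %H:%M:%S"))  # e.g. 2025-01-02 03:04:05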

Improvements

  • Improved performance of the DataFrame.map, Series.apply, and Series.map methods by mapping numpy functions to Snowpark functions where possible.
  • Added documentation for DataFrame.map.
  • Improved performance of DataFrame.apply by mapping numpy functions to Snowpark functions where possible.
  • Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.
  • Inferred the return type of functions in Series.map, Series.apply, and DataFrame.map if a type hint is not provided.
  • Added call_count to telemetry that counts method calls including interchange protocol calls.

1.26.0 (2024-12-05)

Snowpark Python API Updates

New Features

  • Added support for property version and class method get_active_session for Session class.
  • Added new methods and variables to enhance data type handling and JSON serialization/deserialization:
    • To DataType, its derived classes, and StructField:
      • type_name: Returns the type name of the data.
      • simple_string: Provides a simple string representation of the data.
      • json_value: Returns the data as a JSON-compatible value.
      • json: Converts the data to a JSON string.
    • To ArrayType, MapType, StructField, PandasSeriesType, PandasDataFrameType and StructType:
      • from_json: Enables these types to be created from JSON data.
    • To MapType:
      • keyType: keys of the map
      • valueType: values of the map
  • Added support for method appName in SessionBuilder.
  • Added support for include_nulls argument in DataFrame.unpivot.
  • Added support for following functions in functions.py:
    • size to get size of array, object, or map columns.
    • collect_list an alias of array_agg.
    • substring makes len argument optional.
  • Added parameter ast_enabled to session for internal usage (default: False).

Improvements

  • Added support for specifying the following to DataFrame.create_or_replace_dynamic_table:
    • iceberg_config A dictionary that can hold the following iceberg configuration options:
      • external_volume
      • catalog
      • base_location
      • catalog_sync
      • storage_serialization_policy
  • Added support for nested data types to DataFrame.print_schema
  • Added support for level parameter to DataFrame.print_schema
  • Improved the flexibility of the DataFrameReader and DataFrameWriter API by adding support for the following (see the sketch after this list):
    • Added format method to DataFrameReader and DataFrameWriter to specify file format when loading or unloading results.
    • Added load method to DataFrameReader to work in conjunction with format.
    • Added save method to DataFrameWriter to work in conjunction with format.
    • Added support to read keyword arguments to options method for DataFrameReader and DataFrameWriter.
  • Relaxed the cloudpickle dependency for Python 3.11 to simplify build requirements. However, for Python 3.11, cloudpickle==2.2.1 remains the only supported version.
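
A sketch of the format/load/save style described above; the stage paths are hypothetical, and INFER_SCHEMA is an existing reader option used for illustration.

    # Load with an explicit format, then unload the result as Parquet.
    df = (
        session.read
        .format("csv")
        .option("INFER_SCHEMA", True)
        .load("@my_stage/input/data.csv")
    )
    df.write.format("parquet").save("@my_stage/output/")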

Bug Fixes

  • Removed warnings that dynamic pivot features were in private preview, because
    dynamic pivot is now generally available.
  • Fixed a bug in session.read.options where False Boolean values were incorrectly parsed as True in the generated file format.

Dependency Updates

  • Added a runtime dependency on python-dateutil.

Snowpark pandas API Updates

New Features

  • Added partial support for Series.map when arg is a pandas Series or a
    collections.abc.Mapping. No support for instances of dict that implement
    __missing__ but are not instances of collections.defaultdict.
  • Added support for DataFrame.align and Series.align for axis=1 and axis=None.
  • Added support for pd.json_normalize.
  • Added support for GroupBy.pct_change with axis=0, freq=None, and limit=None.
  • Added support for DataFrameGroupBy.__iter__ and SeriesGroupBy.__iter__.
  • Added support for np.sqrt, np.trunc, np.floor, numpy trig functions, np.exp, np.abs, np.positive and np.negative.
  • Added partial support for the dataframe interchange protocol method
    DataFrame.__dataframe__().

Bug Fixes

  • Fixed a bug in df.loc where setting a single column from a series results in unexpected None values.

Improvements

  • Use UNPIVOT INCLUDE NULLS for unpivot operations in pandas instead of sentinel values.
  • Improved documentation for pd.read_excel.