Releases: snowflakedb/snowpark-python
Release
1.43.0 (2025-12-03)
Snowpark Python API Updates
New Features
- Added support for `DataFrame.lateral_join`.
- Added support for the PrPr feature `Session.client_telemetry`.
- Added support for `Session.udf_profiler`.
- Added support for `functions.ai_translate`.
- Added support for the following `iceberg_config` options in `DataFrameWriter.save_as_table` and `DataFrame.copy_into_table`: `target_file_size`, `partition_by`.
- Added support for the following functions in `functions.py`:
  - String and binary functions: `base64_decode_binary`, `bucket`, `compress`, `day`, `decompress_binary`, `decompress_string`, `md5_binary`, `md5_number_lower64`, `md5_number_upper64`, `sha1_binary`, `sha2_binary`, `soundex_p123`, `strtok`, `truncate`, `try_base64_decode_binary`, `try_base64_decode_string`, `try_hex_decode_binary`, `try_hex_decode_string`, `unicode`, `uuid_string`
  - Conditional expressions: `booland_agg`, `boolxor_agg`, `regr_valy`, `zeroifnull`
  - Numeric expressions: `cot`, `mod`, `pi`, `square`, `width_bucket`
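A minimal sketch of the new `iceberg_config` options (the table name, size value, and partition column below are hypothetical; the accepted value formats are not specified in these notes):

```python
# Hypothetical iceberg_config using the two new options from the notes.
iceberg_config = {
    "target_file_size": "64MB",   # assumed string form; check the docs for accepted values
    "partition_by": ["region"],   # hypothetical partition column
}

# With a live Snowpark session (not created here):
# df.write.save_as_table("my_iceberg_table", iceberg_config=iceberg_config)
# df.copy_into_table("my_iceberg_table", iceberg_config=iceberg_config)
```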
Bug Fixes
- Fixed a bug where automatically-generated temporary objects were not properly cleaned up.
- Fixed a bug in SQL generation when joining two DataFrames created using `DataFrame.alias` while CTE optimization is enabled.
- Fixed a bug in `XMLReader` where finding the start position of a row tag could return an incorrect file position.
Improvements
- Enhanced `DataFrame.sort()` to support `ORDER BY ALL` when no columns are specified.
- Removed the experimental warning from `Session.cte_optimization_enabled`.
Snowpark pandas API Updates
New Features
- Added support for `DataFrame.groupby.rolling()`.
- Added support for mapping `np.percentile` with DataFrame and Series inputs to `Series.quantile`.
- Added support for setting the `random_state` parameter to an integer when calling `DataFrame.sample` or `Series.sample`.
- Added support for the following `iceberg_config` options in `to_iceberg`: `target_file_size`, `partition_by`.
Improvements
- Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
  - `shift()` with `suffix` or non-integer `periods` parameters
  - `sort_index()` with `axis=1` or `key` parameters
  - `sort_values()` with `axis=1`
  - `melt()` with `col_level` parameter
  - `apply()` with `result_type` parameter for DataFrame
  - `pivot_table()` with `sort=True`, a non-string `index` list, a non-string `columns` list, a non-string `values` list, or an `aggfunc` dict with non-string values
  - `fillna()` with `downcast` parameter, or using `limit` together with `value`
  - `dropna()` with `axis=1`
  - `asfreq()` with `how` parameter, `fill_value` parameter, `normalize=True`, or a `freq` parameter of week, month, quarter, or year
  - `groupby()` with `axis=1`, both `by != None` and `level != None`, or `by` containing any non-pandas-hashable labels
  - `groupby_fillna()` with `downcast` parameter
  - `groupby_first()` with `min_count > 1`
  - `groupby_last()` with `min_count > 1`
  - `groupby_shift()` with `freq` parameter
- Slightly improved the performance of `agg`, `nunique`, `describe`, and related methods on 1-column DataFrame and Series objects.
Bug Fixes
- Fixed a bug in `DataFrameGroupBy.agg` where `func` is a list of tuples used to set the names of the output columns.
- Fixed a bug where converting a modin datetime index with a timezone to a numpy array with `np.asarray` would cause a `TypeError`.
- Fixed a bug where `Series.isin` with a Series argument matched index labels instead of the row position.
Improvements
- Added support for the following in faster pandas: `groupby.apply`, `groupby.nunique`, `groupby.size`, `concat`, `copy`, `str.isdigit`, `str.islower`, `str.isupper`, `str.istitle`, `str.lower`, `str.upper`, `str.title`, `str.match`, `str.capitalize`, `str.__getitem__`, `str.center`, `str.count`, `str.get`, `str.pad`, `str.len`, `str.ljust`, `str.rjust`, `str.split`, `str.replace`, `str.strip`, `str.lstrip`, `str.rstrip`, `str.translate`, `dt.tz_localize`, `dt.tz_convert`, `dt.ceil`, `dt.round`, `dt.floor`, `dt.normalize`, `dt.month_name`, `dt.day_name`, `dt.strftime`, `dt.dayofweek`, `dt.weekday`, `dt.dayofyear`, `dt.isocalendar`, `rolling.min`, `rolling.max`, `rolling.count`, `rolling.sum`, `rolling.mean`, `rolling.std`, `rolling.var`, `rolling.sem`, `rolling.corr`, `expanding.min`, `expanding.max`, `expanding.count`, `expanding.sum`, `expanding.mean`, `expanding.std`, `expanding.var`, `expanding.sem`, `cumsum`, `cummin`, `cummax`, `groupby.groups`, `groupby.indices`, `groupby.first`, `groupby.last`, `groupby.rank`, `groupby.shift`, `groupby.cumcount`, `groupby.cumsum`, `groupby.cummin`, `groupby.cummax`, `groupby.any`, `groupby.all`, `groupby.unique`, `groupby.get_group`, `groupby.rolling`, `groupby.resample`, `to_snowflake`, `to_snowpark`, `resample.min`, `resample.max`, `resample.count`, `resample.sum`, `resample.mean`, `resample.median`, `resample.std`, `resample.var`, `resample.size`, `resample.first`, `resample.last`, `resample.quantile`, `resample.nunique`
- Made faster pandas disabled by default (opt-in instead of opt-out).
- Improved the performance of `drop_duplicates` in faster pandas by avoiding joins when `keep != False`.
Release
1.42.0 (2025-10-28)
Snowpark Python API Updates
New Features
- Snowpark Python DB-API is now generally available. Access this feature with `DataFrameReader.dbapi()` to read data from a database table or query into a DataFrame using a DBAPI connection.
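As a sketch of the pattern (the table contents here are hypothetical, and sqlite3 stands in for whatever DBAPI-compliant driver you actually use):

```python
import sqlite3

def create_connection():
    """Return a fresh DBAPI connection; sqlite3 stands in for a real database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO employees VALUES (1, 'Ada')")
    conn.commit()
    return conn

# With a live Snowpark session (not created here), ingestion looks like:
# df = session.read.dbapi(create_connection, table="employees")
# df.show()
```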
Release
1.41.0 (2025-10-23)
Snowpark Python API Updates
New Features
- Added a new function `service` in `snowflake.snowpark.functions` that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
- Added a `connection_parameters` parameter to the `DataFrameReader.dbapi()` (PuPr) method to allow passing keyword arguments to the `create_connection` callable.
- Added support for `Session.begin_transaction`, `Session.commit`, and `Session.rollback`.
- Added support for the following functions in `functions.py`:
  - Geospatial functions: `st_interpolate`, `st_intersection`, `st_intersection_agg`, `st_intersects`, `st_isvalid`, `st_length`, `st_makegeompoint`, `st_makeline`, `st_makepolygon`, `st_makepolygonoriented`, `st_disjoint`, `st_distance`, `st_dwithin`, `st_endpoint`, `st_envelope`, `st_geohash`, `st_geomfromgeohash`, `st_geompointfromgeohash`, `st_hausdorffdistance`, `st_makepoint`, `st_npoints`, `st_perimeter`, `st_pointn`, `st_setsrid`, `st_simplify`, `st_srid`, `st_startpoint`, `st_symdifference`, `st_transform`, `st_union`, `st_union_agg`, `st_within`, `st_x`, `st_xmax`, `st_xmin`, `st_y`, `st_ymax`, `st_ymin`, `st_geogfromgeohash`, `st_geogpointfromgeohash`, `st_geographyfromwkb`, `st_geographyfromwkt`, `st_geometryfromwkb`, `st_geometryfromwkt`, `try_to_geography`, `try_to_geometry`
- Added a parameter to enable and disable automatic column name aliasing for the `interval_day_time_from_parts` and `interval_year_month_from_parts` functions.
Bug Fixes
- Fixed a bug where `DataFrameReader.xml` failed to parse XML files with undeclared namespaces when `ignoreNamespace` is `True`.
- Added a fix for floating point precision discrepancies in `interval_day_time_from_parts`.
- Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with `to_snowflake` would raise a `KeyError`.
- Fixed a bug where `DataFrameReader.dbapi` (PuPr) was not compatible with oracledb 3.4.0.
- Fixed a bug where `modin` would unintentionally be imported during session initialization in some scenarios.
- Fixed a bug where `session.udf|udtf|udaf|sproc.register` failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.
Improvements
- Increased the default maximum length for inferred StringType columns during schema inference in `DataFrameReader.dbapi` from 16 MB to 128 MB for Parquet-based ingestion.
Dependency Updates
- Updated the dependency to `snowflake-connector-python>=3.17,<5.0.0`.
Snowpark pandas API Updates
New Features
- Added support for the `dtypes` parameter of `pd.get_dummies`.
- Added support for `nunique` in `df.pivot_table`, `df.agg`, and other places where aggregate functions can be used.
- Added support for `DataFrame.interpolate` and `Series.interpolate` with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL `INTERPOLATE_LINEAR`, `INTERPOLATE_FFILL`, and `INTERPOLATE_BFILL` functions (PuPr).
Improvements
- Improved performance of `Series.to_snowflake` and `pd.to_snowflake(series)` for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable `modin.config.PandasToSnowflakeParquetThresholdBytes`.
- Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
  - `get_dummies()` with `dummy_na=True`, `drop_first=True`, or custom `dtype` parameters
  - `cumsum()`, `cummin()`, `cummax()` with `axis=1` (column-wise operations)
  - `skew()` with `axis=1` or `numeric_only=False` parameters
  - `round()` with `decimals` parameter as a Series
  - `corr()` with `method != "pearson"`
- Set `cte_optimization_enabled` to True for all Snowpark pandas sessions.
- Added support for the following in faster pandas: `isin`, `isna`, `isnull`, `notna`, `notnull`, `str.contains`, `str.startswith`, `str.endswith`, `str.slice`, `dt.date`, `dt.time`, `dt.hour`, `dt.minute`, `dt.second`, `dt.microsecond`, `dt.nanosecond`, `dt.year`, `dt.month`, `dt.day`, `dt.quarter`, `dt.is_month_start`, `dt.is_month_end`, `dt.is_quarter_start`, `dt.is_quarter_end`, `dt.is_year_start`, `dt.is_year_end`, `dt.is_leap_year`, `dt.days_in_month`, `dt.daysinmonth`, `sort_values`, `loc` (setting columns), `to_datetime`, `rename`, `drop`, `invert`, `duplicated`, `iloc`, `head`, `columns` (e.g., `df.columns = ["A", "B"]`), `agg`, `min`, `max`, `count`, `sum`, `mean`, `median`, `std`, `var`, `groupby.agg`, `groupby.min`, `groupby.max`, `groupby.count`, `groupby.sum`, `groupby.mean`, `groupby.median`, `groupby.std`, `groupby.var`, `drop_duplicates`
- Reuse the row count from the relaxed query compiler in `get_axis_len`.
Bug Fixes
- Fixed a bug where the row count was not cached in the ordered dataframe each time `count_rows()` was called.
Release
1.40.0 (2025-10-02)
Snowpark Python API Updates
New Features
- Added a new module `snowflake.snowpark.secrets` that provides Python wrappers for accessing Snowflake Secrets within Python UDFs and stored procedures that execute inside Snowflake: `get_generic_secret_string`, `get_oauth_access_token`, `get_secret_type`, `get_username_password`, `get_cloud_provider_token`
- Added support for the following scalar functions in `functions.py`:
  - Conditional expression functions: `booland`, `boolnot`, `boolor`, `boolxor`, `boolor_agg`, `decode`, `greatest_ignore_nulls`, `least_ignore_nulls`, `nullif`, `nvl2`, `regr_valx`
  - Semi-structured and structured data functions: `array_remove_at`, `as_boolean`, `map_delete`, `map_insert`, `map_pick`, `map_size`
  - String & binary functions: `chr`, `hex_decode_binary`
  - Numeric functions: `div0null`
  - Differential privacy functions: `dp_interval_high`, `dp_interval_low`
  - Context functions: `last_query_id`, `last_transaction`
  - Geospatial functions: `h3_cell_to_boundary`, `h3_cell_to_children`, `h3_cell_to_children_string`, `h3_cell_to_parent`, `h3_cell_to_point`, `h3_compact_cells`, `h3_compact_cells_strings`, `h3_coverage`, `h3_coverage_strings`, `h3_get_resolution`, `h3_grid_disk`, `h3_grid_distance`, `h3_int_to_string`, `h3_polygon_to_cells`, `h3_polygon_to_cells_strings`, `h3_string_to_int`, `h3_try_grid_path`, `h3_try_polygon_to_cells`, `h3_try_polygon_to_cells_strings`, `h3_uncompact_cells`, `h3_uncompact_cells_strings`, `haversine`, `h3_grid_path`, `h3_is_pentagon`, `h3_is_valid_cell`, `h3_latlng_to_cell`, `h3_latlng_to_cell_string`, `h3_point_to_cell`, `h3_point_to_cell_string`, `h3_try_coverage`, `h3_try_coverage_strings`, `h3_try_grid_distance`, `st_area`, `st_asewkb`, `st_asewkt`, `st_asgeojson`, `st_aswkb`, `st_aswkt`, `st_azimuth`, `st_buffer`, `st_centroid`, `st_collect`, `st_contains`, `st_coveredby`, `st_covers`, `st_difference`, `st_dimension`
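A sketch of the secrets wrappers inside a UDF handler. Only the function names come from these notes; the secret names and usage shape below are hypothetical, and this code only runs inside Snowflake, so it is shown as comments:

```python
# Runs only inside a Snowflake Python UDF or stored procedure with the
# relevant secrets granted; not runnable locally.
# from snowflake.snowpark.secrets import (
#     get_generic_secret_string,
#     get_username_password,
#     get_secret_type,
# )
#
# def handler():
#     kind = get_secret_type("my_secret")              # hypothetical secret name
#     creds = get_username_password("my_cred_secret")  # hypothetical secret name
#     token = get_generic_secret_string("my_api_key")  # hypothetical secret name
#     ...
```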
Bug Fixes
- Fixed a bug where `DataFrame.limit()` failed if there was parameter binding in the executed SQL when used in a non-stored-procedure/UDxF environment.
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
- Fixed multiple bugs in `DataFrameReader.dbapi` (PuPr):
  - Fixed UDTF ingestion failure with the `pyodbc` driver caused by unprocessed row data.
  - Fixed SQL Server query input failure due to incorrect select query generation.
  - Fixed UDTF ingestion not preserving column nullability in the output schema.
  - Fixed an issue that caused the program to hang during multithreaded Parquet-based ingestion when a data fetching error occurred.
  - Fixed a bug in schema parsing when custom schema strings used upper-cased data type names (NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
- Fixed a bug in `Session.create_dataframe` where schema string parsing failed when using upper-cased data type names (e.g., NUMERIC, NUMBER, DECIMAL, VARCHAR, STRING, TEXT).
Improvements
- Improved `DataFrameReader.dbapi` (PuPr) so that it does not retry on non-retryable errors, such as SQL syntax errors in the external data source query.
- Removed unnecessary warnings about local package version mismatch when using `session.read.option('rowTag', <tag_name>).xml(<stage_file_path>)` or `xpath` functions.
- Improved `DataFrameReader.dbapi` (PuPr) reading performance by setting the default `fetch_size` parameter value to 100000.
- Improved the error message for XSD validation failure when reading XML files using `session.read.option('rowValidationXSDPath', <xsd_path>).xml(<stage_file_path>)`.
Snowpark pandas API Updates
Dependency Updates
- Updated the supported `modin` versions to >=0.36.0 and <0.38.0 (was previously >=0.35.0 and <0.37.0).
New Features
- Added support for `DataFrame.query` for dataframes with single-level indexes.
- Added support for `DataFrameGroupby.__len__` and `SeriesGroupBy.__len__`.
Improvements
- Hybrid execution mode is now enabled by default. Certain operations on smaller data will now automatically execute in native pandas in-memory. Use `from modin.config import AutoSwitchBackend; AutoSwitchBackend.disable()` to turn this off and force all execution to occur in Snowflake.
- Added a session parameter `pandas_hybrid_execution_enabled` to enable/disable hybrid execution as an alternative to using `AutoSwitchBackend`.
- Removed an unnecessary `SHOW OBJECTS` query issued from `read_snowflake` under certain conditions.
- When hybrid execution is enabled, `pd.merge`, `pd.concat`, `DataFrame.merge`, and `DataFrame.join` may now move arguments to backends other than those among the function arguments.
- Improved performance of `DataFrame.to_snowflake` and `pd.to_snowflake(dataframe)` for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable `modin.config.PandasToSnowflakeParquetThresholdBytes`.
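A sketch of tuning that threshold. The variable name comes from the notes; the `.put()` setter is modin's usual config interface (an assumption here), and the 200 MB value is purely illustrative:

```python
# Requires snowflake-snowpark-python[modin]; shown as comments since modin
# may not be installed locally.
# from modin.config import PandasToSnowflakeParquetThresholdBytes
#
# Only switch to parquet upload for datasets larger than ~200 MB
# (illustrative value, not a recommendation):
# PandasToSnowflakeParquetThresholdBytes.put(200 * 1024 * 1024)
```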
Release
1.39.1 (2025-09-25)
Snowpark Python API Updates
Bug Fixes
- Added an experimental fix for a bug in schema query generation that could cause invalid SQL to be generated when using nested structured types.
Release
1.39.0 (2025-09-17)
Snowpark Python API Updates
New Features
- Added support for unstructured data engineering in Snowpark, powered by Snowflake AISQL and Cortex functions:
  - `DataFrame.ai.complete`: Generate per-row LLM completions from prompts built over columns and files.
  - `DataFrame.ai.filter`: Keep rows where an AI classifier returns TRUE for the given predicate.
  - `DataFrame.ai.agg`: Reduce a text column into one result using a natural-language task description.
  - `RelationalGroupedDataFrame.ai_agg`: Perform the same natural-language aggregation per group.
  - `DataFrame.ai.classify`: Assign single or multiple labels from given categories to text or images.
  - `DataFrame.ai.similarity`: Compute cosine-based similarity scores between two columns via embeddings.
  - `DataFrame.ai.sentiment`: Extract overall and aspect-level sentiment from text into JSON.
  - `DataFrame.ai.embed`: Generate VECTOR embeddings for text or images using configurable models.
  - `DataFrame.ai.summarize_agg`: Aggregate and produce a single comprehensive summary over many rows.
  - `DataFrame.ai.transcribe`: Transcribe audio files to text with optional timestamps and speaker labels.
  - `DataFrame.ai.parse_document`: OCR/layout-parse documents or images into structured JSON.
  - `DataFrame.ai.extract`: Pull structured fields from text or files using a response schema.
  - `DataFrame.ai.count_tokens`: Estimate token usage for a given model and input text per row.
  - `DataFrame.ai.split_text_markdown_header`: Split Markdown into hierarchical header-aware chunks.
  - `DataFrame.ai.split_text_recursive_character`: Split text into size-bounded chunks using recursive separators.
  - `DataFrameReader.file`: Create a DataFrame containing all files from a stage as FILE data type for downstream unstructured data processing.
- Added a new datatype `YearMonthIntervalType` that allows users to create intervals for datetime operations.
- Added a new function `interval_year_month_from_parts` that allows users to easily create `YearMonthIntervalType` without using SQL.
- Added a new datatype `DayTimeIntervalType` that allows users to create intervals for datetime operations.
- Added a new function `interval_day_time_from_parts` that allows users to easily create `DayTimeIntervalType` without using SQL.
- Added support for `FileOperation.list` to list files in a stage with metadata.
- Added support for `FileOperation.remove` to remove files in a stage.
- Added an option to specify `copy_grants` for the following `DataFrame` APIs: `create_or_replace_view`, `create_or_replace_temp_view`, `create_or_replace_dynamic_table`
- Added a new function `snowflake.snowpark.functions.vectorized` that allows users to mark a function as a vectorized UDF.
- Added support for the parameter `use_vectorized_scanner` in the function `Session.write_pandas()`.
- Added support for the following scalar functions in `functions.py`: `getdate`, `getvariable`, `invoker_role`, `invoker_share`, `is_application_role_in_session`, `is_database_role_in_session`, `is_granted_to_invoker_role`, `is_role_in_session`, `localtime`, `systimestamp`
Deprecations
- Deprecated warnings will be triggered when using snowpark-python with Python 3.9. For more details, please refer to https://docs.snowflake.com/en/developer-guide/python-runtime-support-policy.
Improvements
- Unsupported types in `DataFrameReader.dbapi` (PuPr) are now ingested as `StringType`.
- Improved the error message to list available columns when a dataframe cannot resolve a given column name.
- Added a new option `cacheResult` to `DataFrameReader.xml` that allows users to cache the result of the XML reader to a temporary table after calling `xml`. It helps improve performance when subsequent operations are performed on the same DataFrame.
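A sketch of the option in use. The option names come from these notes; the stage path and row tag are hypothetical placeholders, and a live session is assumed (not created here):

```python
# Hypothetical reader options; "rowTag" selects the repeating XML element,
# "cacheResult" caches the parsed rows to a temporary table.
xml_options = {
    "rowTag": "book",
    "cacheResult": True,
}

# With a live Snowpark session:
# reader = session.read
# for key, value in xml_options.items():
#     reader = reader.option(key, value)
# df = reader.xml("@my_stage/books.xml")
# df.count()  # subsequent operations reuse the cached temporary table
```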
Snowpark pandas API Updates
New Features
Improvements
- Downgraded to level `logging.DEBUG - 1` the log message saying that the `SnowparkDataFrame` reference of an internal `DataFrameReference` object has changed.
- Eliminate duplicate parameter check queries for casing status when retrieving the session.
- Retrieve dataframe row counts through object metadata to avoid a `COUNT(*)` query (performance).
- Added support for applying the Snowflake Cortex function `Complete`.
- Introduced faster pandas: improved performance by deferring row position computation.
  - The following operations are currently supported and can benefit from the optimization: `read_snowflake`, `repr`, `loc`, `reset_index`, `merge`, and binary operations.
  - If a lazy object (e.g., DataFrame or Series) depends on a mix of supported and unsupported operations, the optimization will not be used.
- Updated the error message for when Snowpark pandas is referenced within `apply`.
- Added a session parameter `dummy_row_pos_optimization_enabled` to enable/disable dummy row position optimization in faster pandas.
Dependency Updates
- Updated the supported `modin` versions to >=0.35.0 and <0.37.0 (was previously >=0.34.0 and <0.36.0).
Bug Fixes
- Fixed an issue with `drop_duplicates` where the same data source could be read multiple times in the same query, but in a different order each time, resulting in missing rows in the final result. The fix ensures that the data source is read only once.
- Fixed a bug with hybrid execution mode where an `AssertionError` was unexpectedly raised by certain indexing operations.
Snowpark Local Testing Updates
New Features
- Added support to allow patching `functions.ai_complete`.
Release
1.38.0 (2025-09-04)
Snowpark Python API Updates
New Features
- Added support for the following AI-powered functions in `functions.py`: `ai_extract`, `ai_parse_document`, `ai_transcribe`
- Added time travel support for querying historical data:
  - `Session.table()` now supports time travel parameters: `time_travel_mode`, `statement`, `offset`, `timestamp`, `timestamp_type`, and `stream`.
  - `DataFrameReader.table()` supports the same time travel parameters as direct arguments.
  - `DataFrameReader` supports time travel via option chaining (e.g., `session.read.option("time_travel_mode", "at").option("offset", -60).table("my_table")`).
- Added support for specifying the following parameters to `DataFrameWriter.copy_into_location` for validation and writing data to external locations: `validation_mode`, `storage_integration`, `credentials`, `encryption`
- Added support for `Session.directory` and `Session.read.directory` to retrieve the list of all files on a stage with metadata.
- Added support for `DataFrameReader.jdbc` (PrPr) that allows ingesting external data sources with a JDBC driver.
- Added support for `FileOperation.copy_files` to copy files from a source location to an output stage.
- Added support for the following scalar functions in `functions.py`: `all_user_names`, `bitand`, `bitand_agg`, `bitor`, `bitor_agg`, `bitxor`, `bitxor_agg`, `current_account_name`, `current_client`, `current_ip_address`, `current_role_type`, `current_organization_name`, `current_organization_user`, `current_secondary_roles`, `current_transaction`, `getbit`
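A sketch of the time travel options. The parameter names and the chained form come from these notes; the table name is a placeholder, a live session is assumed (not created here), and the direct-argument shape is inferred from the listed parameter names:

```python
# Read a table as it existed 60 seconds ago.
time_travel_options = {"time_travel_mode": "at", "offset": -60}

# Option-chaining form on DataFrameReader:
# reader = session.read
# for key, value in time_travel_options.items():
#     reader = reader.option(key, value)
# df = reader.table("my_table")

# Direct-argument form on Session.table() (shape assumed from the notes):
# df = session.table("my_table", **time_travel_options)
```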
Bug Fixes
- Fixed the repr of `TimestampType` to match the actual subtype it represents.
- Fixed a bug in `DataFrameReader.dbapi` where UDTF ingestion did not work in stored procedures.
- Fixed a bug in schema inference that caused incorrect stage prefixes to be used.
Improvements
- Enhanced error handling in `DataFrameReader.dbapi` thread-based ingestion to prevent unnecessary operations, which improves resource efficiency.
- Bumped the cloudpickle dependency to also support `cloudpickle==3.1.1` in addition to previous versions.
- Improved `DataFrameReader.dbapi` (PuPr) ingestion performance for PostgreSQL and MySQL by using a server-side cursor to fetch data.
Snowpark pandas API Updates
New Features
- Completed support for `pd.read_snowflake()`, `pd.to_iceberg()`, `pd.to_pandas()`, `pd.to_snowpark()`, `pd.to_snowflake()`, `DataFrame.to_iceberg()`, `DataFrame.to_pandas()`, `DataFrame.to_snowpark()`, `DataFrame.to_snowflake()`, `Series.to_iceberg()`, `Series.to_pandas()`, `Series.to_snowpark()`, and `Series.to_snowflake()` on the "Pandas" and "Ray" backends. Previously, only some of these functions and methods were supported on the Pandas backend.
- Added support for `Index.get_level_values()`.
Improvements
- Set the default transfer limit in hybrid execution for data leaving Snowflake to 100k, which can be overridden with the `SnowflakePandasTransferThreshold` environment variable. This configuration is appropriate for scenarios with two available engines, "Pandas" and "Snowflake", on relational workloads.
- Improved the import error message by adding `--upgrade` to `pip install "snowflake-snowpark-python[modin]"` in the error message.
- Reduced the telemetry messages from the modin client by pre-aggregating into 5-second windows and only keeping a narrow band of metrics which are useful for tracking hybrid execution and native pandas performance.
- Set the initial row count only when hybrid execution is enabled. This reduces the number of queries issued for many workloads.
- Added a new test parameter for integration tests to enable hybrid execution.
Bug Fixes
- Raised `NotImplementedError` instead of `AttributeError` on attempting to call the Snowflake extension functions/methods `to_dynamic_table()`, `cache_result()`, `to_view()`, `create_or_replace_dynamic_table()`, and `create_or_replace_view()` on dataframes or series using the pandas or ray backends.
Release
1.37.0 (2025-08-18)
Snowpark Python API Updates
New Features
- Added support for the following `xpath` functions in `functions.py`: `xpath`, `xpath_string`, `xpath_boolean`, `xpath_int`, `xpath_float`, `xpath_double`, `xpath_long`, `xpath_short`
- Added support for the parameter `use_vectorized_scanner` in the function `Session.write_arrow()`.
- The dataframe profiler now adds the following information about each query: describe query time, execution time, and SQL query text. To view this information, call `session.dataframe_profiler.enable()` and call `get_execution_profile` on a dataframe.
- Added support for `DataFrame.col_ilike`.
- Added support for non-blocking stored procedure calls that return `AsyncJob` objects:
  - Added a `block: bool = True` parameter to `Session.call()`. When `block=False`, it returns an `AsyncJob` instead of blocking until completion.
  - Added a `block: bool = True` parameter to `StoredProcedure.__call__()` for async support across both named and anonymous stored procedures.
  - Added `Session.call_nowait()` that is equivalent to `Session.call(block=False)`.
Bug Fixes
- Fixed a bug in the CTE optimization stage where a `deepcopy` of internal plans would cause a memory spike when a dataframe is created locally via `session.create_dataframe()` with large input data.
- Fixed a bug in `DataFrameReader.parquet` where the `ignore_case` option in `infer_schema_options` was not respected.
- Fixed a bug where `to_pandas()` produced different column name formats when the query result format is set to 'JSON' versus 'ARROW'.
Deprecations
- Deprecated `pkg_resources`.
Dependency Updates
- Added a dependency on `protobuf<6.32`.
Snowpark pandas API Updates
New Features
- Added support for efficient transfer of data between Snowflake and Ray with the `DataFrame.set_backend` method. The installed version of `modin` must be at least 0.35.0, and `ray` must be installed.
Dependency Updates
- Updated the supported `modin` versions to >=0.34.0 and <0.36.0 (was previously >=0.33.0 and <0.35.0).
- Added support for pandas 2.3 when the installed `modin` version is at least 0.35.0.
Bug Fixes
- Fixed an issue in hybrid execution mode (PrPr) where `pd.to_datetime` and `pd.to_timedelta` would unexpectedly raise `IndexError`.
- Fixed a bug where `pd.explain_switch` would raise `IndexError` or return `None` if called before any potential switch operations were performed.
Release
1.36.0 (2025-08-05)
Snowpark Python API Updates
New Features
- `Session.create_dataframe` now accepts keyword arguments that are forwarded to the internal call to `Session.write_pandas` or `Session.write_arrow` when creating a DataFrame from a pandas DataFrame or a pyarrow Table.
- Added new APIs for `AsyncJob`:
  - `AsyncJob.is_failed()` returns a `bool` indicating whether a job has failed. It can be used in combination with `AsyncJob.is_done()` to determine if a job is finished and errored.
  - `AsyncJob.status()` returns a string representing the current query status (e.g., "RUNNING", "SUCCESS", "FAILED_WITH_ERROR") for detailed monitoring without calling `result()`.
- Added a dataframe profiler. To use it, call `get_execution_profile()` on your desired dataframe. The profiler reports the queries executed to evaluate a dataframe and statistics about each of the query operators. Currently an experimental feature.
- Added support for the following functions in `functions.py`: `ai_sentiment`
- Updated the interface for the experimental feature `context.configure_development_features`. All development features are disabled by default unless explicitly enabled by the user.
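The monitoring pattern these APIs enable can be sketched with a local stand-in. Only the method names `is_done`, `is_failed`, `status`, and `result` come from the notes; `MockAsyncJob` below is an illustration, not the Snowpark class — with a real session the job would come from Snowflake:

```python
class MockAsyncJob:
    """Local stand-in mimicking the AsyncJob polling surface described above.

    Completion is simulated with a countdown so the polling loop has work to do;
    a real AsyncJob refreshes its status from the server.
    """

    def __init__(self, ticks_until_done=2, fails=False):
        self._remaining = ticks_until_done
        self._fails = fails

    def poll(self):
        # Simulates one server round-trip (artificial; not part of the real API).
        if self._remaining > 0:
            self._remaining -= 1

    def is_done(self):
        return self._remaining == 0

    def is_failed(self):
        return self.is_done() and self._fails

    def status(self):
        if not self.is_done():
            return "RUNNING"
        return "FAILED_WITH_ERROR" if self._fails else "SUCCESS"

    def result(self):
        if self.is_failed():
            raise RuntimeError("job failed")
        return "done"


def wait_for(job):
    """Poll until the job finishes, then return its result or raise."""
    while not job.is_done():
        job.poll()  # in real code, sleep briefly between status checks
    if job.is_failed():
        raise RuntimeError(f"job ended with status {job.status()}")
    return job.result()
```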
Snowpark pandas API Updates
Improvements
- Hybrid execution row estimate improvements and a reduction of eager calls.
- Add a new configuration variable to control transfer costs out of Snowflake when using hybrid execution.
- Added support for creating permanent and immutable UDFs/UDTFs with
DataFrame/Series/GroupBy.apply,map, andtransformby passing thesnowflake_udf_paramskeyword argument. See documentation for details.
Bug Fixes
- Fixed an issue where the Snowpark pandas plugin would unconditionally disable `AutoSwitchBackend` even when users had explicitly configured it via environment variables or programmatically.
Release
1.35.0 (2025-07-24)
Snowpark Python API Updates
New Features
- Added support for the following functions in `functions.py`: `ai_embed`, `try_parse_json`
Bug Fixes
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where `dbapi` failed in a Python stored procedure, causing the process to exit with code 1.
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where `custom_schema` accepted an illegal schema.
- Fixed a bug in `DataFrameReader.dbapi` (PrPr) where `custom_schema` did not work when connecting to PostgreSQL and MySQL.
- Fixed a bug in schema inference that would cause it to fail for external stages.
Improvements
- Improved the `query` parameter in `DataFrameReader.dbapi` (PrPr) so that parentheses are not needed around the query.
- Improved the error experience in `DataFrameReader.dbapi` (PrPr) when an exception happens while inferring the schema of the target data source.
Snowpark Local Testing Updates
New Features
- Added local testing support for reading files with `SnowflakeFile` using local file paths, the Snow URL semantic (`snow://...`), local testing framework stages, and Snowflake stages (`@stage/file_path`).
Snowpark pandas API Updates
New Features
- Added support for `DataFrame.boxplot`.
Improvements
- Reduced the number of UDFs/UDTFs created by repeated calls to `apply` or `map` with the same arguments on Snowpark pandas objects.
Bug Fixes
- Added an upper bound to the row estimation when the cartesian product from an align or join results in a very large number. This mitigates a performance regression.
- Fixed a `pd.read_excel` bug when reading files inside a stage's inner directory.