Releases: StarRocks/starrocks
3.3.5
Release date: October 23, 2024
New Features
- Supports millisecond and microsecond precision in the DATETIME type.
- Resource groups support CPU hard isolation.
Improvements
- Optimized performance and extraction strategy for Flat JSON. #50696
- Reduced memory usage for the following ARRAY functions:
- Optimized error messages when loading `Null` values into List partition keys with the `Not Null` attribute. #51086
- Optimized error messages for Files() when authentication fails. #51697
- Optimized internal statistics for `INSERT OVERWRITE`. #50417
- Shared-data clusters support garbage collection (GC) for persistent index files. #51684
- Added FE logs to help diagnose FE out-of-memory (OOM) issues. #51528
- Supports recovering metadata from the metadata directory of FE. #51040
Bug Fixes
Fixed the following issues:
- A deadlock issue caused by PIPE exceptions. #50841
- Dynamic partition creation failures block subsequent partition creation. #51440
- An error is returned for `UNION ALL` queries with `ORDER BY`. #51647
- CTE in UPDATE statements causes hints to be ignored. #51458
- The `load_finish_time` field in the system-defined view `statistics.loads_history` does not update as expected after a loading task is completed. #51174
- UDTF mishandles multibyte UTF-8 characters. #51232
Behavior Changes
- Modified the return content of the `EXPLAIN` statement. After the change, the return content is equivalent to that of `EXPLAIN COSTS`. You can configure the level of detail returned by `EXPLAIN` using the dynamic FE parameter `query_detail_explain_level`. The default value is `COSTS`; the other valid values are `NORMAL` and `VERBOSE`. #51439
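As a minimal sketch of the `EXPLAIN` change above, the following statements switch the detail level and then run a plain `EXPLAIN`. The table name `orders` is hypothetical, and the sketch assumes that `query_detail_explain_level` is set with the same `ADMIN SET FRONTEND CONFIG` syntax as other dynamic FE parameters.

```sql
-- Assumption: query_detail_explain_level can be changed at runtime
-- like other dynamic FE parameters.
ADMIN SET FRONTEND CONFIG ("query_detail_explain_level" = "NORMAL");

-- `orders` is a hypothetical table. With the setting above, a plain EXPLAIN
-- is expected to return the NORMAL level of detail instead of the COSTS default.
EXPLAIN SELECT COUNT(*) FROM orders;
```

Setting the parameter back to `COSTS` should restore the default behavior described in this release.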
3.3.4
Release date: September 30, 2024
New Features
- Supports creating asynchronous materialized views on List Partition tables. #46680 #46808
- List Partition tables now support Nullable partition columns. #47797
- Supports viewing external file schema information using `DESC FILES()`. #50527
- Supports viewing replication task metrics via `SHOW PROC '/replications'`. #50483
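The two statements introduced above might be used as follows. This is a sketch only: the bucket, file path, region, and credential placeholders are hypothetical, and the property names assume the usual FILES() options for files on AWS S3.

```sql
-- Hypothetical bucket, path, region, and credentials; replace with your own.
DESC FILES(
    "path" = "s3://example-bucket/data/events.parquet",
    "format" = "parquet",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    "aws.s3.region" = "us-west-2"
);

-- View replication task metrics.
SHOW PROC '/replications';
```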
Improvements
- Optimized data recycling performance for `TRUNCATE TABLE` in shared-data clusters. #49975
- Supports intermediate result spilling for CTE operators. #47982
- Supports adaptive phased scheduling to alleviate OOM issues caused by complex queries. #47868
- Supports predicate pushdown for STRING-type date or datetime columns in specific scenarios. #50643
- Supports COUNT DISTINCT computation on constant semi-structured data. #48273
- Added a new FE parameter `lake_enable_balance_tablets_between_workers` to enable tablet balancing for tables in shared-data clusters. #50843
- Enhanced query rewrite capabilities for generated columns. #50398
- Partial Update now supports automatically populating columns with default values of `CURRENT_TIMESTAMP`. #50287
Bug Fixes
Fixed the following issues:
- The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
- ISO-formatted DATETIME types cannot be pushed down. #49358
- In concurrent scenarios, data still existed after the tablet was deleted. #50382
- Incorrect results returned by the `yearweek` function. #51065
- An issue with low cardinality dictionaries in ARRAY during CTE queries. #51148
- After FE restarts, partition TTL-related parameters were lost for materialized views. #51028
- Data loss in columns defined with `CURRENT_TIMESTAMP` after upgrading. #50911
- A stack overflow caused by the `array_distinct` function. #51017
- Activation failures for materialized views after upgrading due to changes in default field lengths. You can avoid such issues by setting `enable_active_materialized_view_schema_strict_check` to `false`. #50869
- Resource group property `cpu_weight` can be set to a negative value. #51005
- Incorrect statistics for disk capacity information. #50669
- An issue with constant folding in the `replace` function. #50828
Behavior Changes
- Changed the default replica number for external catalog-based materialized views from `1` to the value of the FE parameter `default_replication_num` (default value: `3`). #50931
3.2.11
Release date: September 9, 2024
Improvements
- Supports masking authentication information for Files() and PIPE. #47629
- Supports automatic inference for the STRUCT type when reading Parquet files through Files(). #50481
Bug Fixes
Fixed the following issues:
- An error is returned for equi-join queries because they failed to be rewritten by the global dictionary. #50690
- The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
- Incorrect scheduling for unhealthy replica repairs after distributing data based on labels. #50331
- An error in the statistics collection log: "Unknown column '%s' in '%s'." #50785
- Incorrect timezone usage when reading complex types like TIMESTAMP from Parquet files via Files(). #50448
Behavior Changes
- When StarRocks is downgraded from v3.3.x to v3.2.11, the system ignores any incompatible metadata. #49636
3.3.3
Release date: September 5, 2024
New Features
- Supports user-level variables. #48477
- Supports Delta Lake Catalog metadata cache with manual and periodic refresh strategies. #46526 #49069
- Supports loading JSON types from Parquet files. #49385
- JDBC SQL Server Catalog supports queries with LIMIT. #48248
- Shared-data clusters support Partial Updates with INSERT INTO. #49336
Improvements
- Optimized error messages for loading:
  - When memory limits are reached during loading, the IP of the corresponding BE node is returned for easier troubleshooting. #49335
  - Detailed messages are provided when CSV data is loaded to target table columns that are not long enough. #49713
  - Specific node information is provided when Kerberos authentication fails in Broker Load. #46085
- Optimized the partitioning mechanism during data loading to reduce memory usage in the initial stage. #47976
- Optimized memory usage for shared-nothing clusters by limiting metadata memory usage to avoid issues when there are too many Tablets or Segment files. #49170
- Optimized the performance of queries using `max(partition_column)`. #49391
- Partition pruning is used to optimize query performance when the partition column is a generated column (a column calculated from a native column in the table) and the query predicate includes the native column. #48692
- Supports masking authentication information for Files() and PIPE. #47629
- Introduced a new statement `show proc '/global_current_queries'` to view queries running on all FE nodes. `show proc '/current_queries'` only shows queries running on the current FE node. #49826
Bug Fixes
Fixed the following issues:
- The source cluster's BE nodes were mistakenly added to the current cluster when exporting data to the destination cluster via StarRocks external tables. #49323
- TINYINT values were returned as NULL when StarRocks read ORC files using `select * from files` on clusters deployed on aarch64 machines. #49517
- Stream Load fails when loading JSON files containing large Integer types. #49927
- Incorrect schema is returned due to improper handling of invisible characters when users load CSV files with Files(). #49718
- An issue with temporary partition replacement in tables with multiple partition columns. #49764
Behavior Changes
- Introduced a new parameter `object_storage_rename_file_request_timeout_ms` to better accommodate backup scenarios with cloud object storage. This parameter will be used as the backup timeout, with a default value of 30 seconds. #49706
- `to_json`, `CAST(AS MAP)`, and `STRUCT AS JSON` will return NULL instead of throwing an error by default when the conversion fails. You can allow errors by setting the system variable `sql_mode` to `ALLOW_THROW_EXCEPTION`. #50157
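To restore the earlier error-throwing behavior for these conversions, the system variable might be set as sketched below. The sketch assumes that assigning `sql_mode` replaces the current mode list, so it inspects the existing value first.

```sql
-- Check the current value before changing it, in case other modes are in use.
SHOW VARIABLES LIKE 'sql_mode';

-- Assumption: this assignment replaces the existing mode list for the session.
SET sql_mode = 'ALLOW_THROW_EXCEPTION';
```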
3.1.15
Release date: September 4, 2024
Bug Fixes
Fixed the following issues:
3.2.10
Release date: August 23, 2024
Improvements
- Files() will automatically convert `BYTE_ARRAY` data with a `logical_type` of `JSON` in Parquet files to the JSON type in StarRocks. #49385
- Optimized error messages for Files() when Access Key ID and Secret Access Key are missing. #49090
- `information_schema.columns` supports the `GENERATION_EXPRESSION` field. #49734
Bug Fixes
Fixed the following issues:
- Downgrading a v3.3 shared-data cluster to v3.2 after setting the Primary Key table property `"persistent_index_type" = "CLOUD_NATIVE"` causes a crash. #48149
- Exporting data to CSV files using SELECT INTO OUTFILE may cause data inconsistency. #48052
- Queries encounter failures during concurrent query execution. #48180
- Queries would hang due to a timeout in the Plan phase without exiting. #48405
- After disabling index compression for Primary Key tables in older versions and then upgrading to v3.2.9, accessing `page_off` information causes an array out-of-bounds crash. #48230
- BE crash caused by concurrent execution of ADD/DROP COLUMN operations. #49355
- Queries against negative `TINYINT` values in ORC format files return `None` on the aarch64 architecture. #49517
- If the disk write operation fails, failures of `l0` snapshots for Primary Key Persistent Index may cause data loss. #48045
- Partial Update in Column mode for Primary Key tables fails under scenarios with large-volume data updates. #49054
- BE crash caused by Fast Schema Evolution when downgrading a v3.3.0 shared-data cluster to v3.2.9. #42737
- `partition_live_number` does not take effect. #49213
- The conflict between index persistence and compaction in Primary Key tables could cause clone failures. #49341
- Modifications of `partition_live_number` using ALTER TABLE do not take effect. #49437
- Rewrite of CTE distinct grouping sets generates an invalid plan. #48765
- RPC failures polluted the thread pool. #49619
- Authentication failure issues when loading files from AWS S3 via PIPE. #49837
Behavior Changes
- Added a check for the `meta` directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
- Added a memory limit parameter `load_process_max_memory_hard_limit_ratio` for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495
3.3.2
Release date: August 8, 2024
New Features
- Supports renaming columns within StarRocks internal tables. #47851
- Supports reading Iceberg views. Currently, only Iceberg views created through StarRocks are supported. #46273
- [Experimental] Supports adding and removing fields of STRUCT-type data. #46452
- Supports specifying the compression level for ZSTD compression format during table creation. #46839
- Added the following FE dynamic parameters to limit table boundaries. #47896
  - `auto_partition_max_creation_number_per_load`
  - `max_partition_number_per_table`
  - `max_bucket_number_per_partition`
  - `max_column_number_per_table`
- Supports runtime optimization of table data distribution, ensuring optimization tasks do not conflict with DML operations on the table. #43747
- Added an observability interface for the global hit rate of Data Cache. #48450
- Added the SQL function array_repeat. #47862
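Because the table-boundary parameters listed above are dynamic FE parameters, they can presumably be adjusted at runtime. The following is a sketch under that assumption; the limit values are hypothetical and chosen only for illustration, not recommended settings.

```sql
-- Hypothetical limit values, chosen only for illustration.
ADMIN SET FRONTEND CONFIG ("max_partition_number_per_table" = "10000");
ADMIN SET FRONTEND CONFIG ("max_column_number_per_table" = "1000");

-- Inspect the current value of one of the limits.
ADMIN SHOW FRONTEND CONFIG LIKE "max_partition_number_per_table";
```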
Improvements
- Optimized the error messages for Routine Load failures due to Kafka authentication failures. #46136 #47649
- Stream Load supports using `\t` and `\n` as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
- Optimized the asynchronous statistics collection method for write operators, addressing the issue of increased latency when there are many import tasks. #48162
- Added the following BE dynamic parameters to control resource hard limits during loading, reducing the impact on BE stability when writing a large number of tablets. #48495
  - `load_process_max_memory_hard_limit_ratio`
  - `enable_new_load_on_memory_limit_exceeded`
- Added consistency checks for Column IDs within the same table to prevent Compaction errors. #48498
- Supports persisting PIPE metadata to prevent metadata loss due to FE restarts. #48852
Bug Fixes
- The process could not end when creating a dictionary from an FE Follower. #47802
- Inconsistent information returned by the SHOW PARTITIONS command in shared-data clusters and shared-nothing clusters. #48647
- Data errors caused by incorrect type handling when loading data from JSON fields to `ARRAY<BOOLEAN>` columns. #48387
- The `query_id` column in `information_schema.task_runs` cannot be queried. #48876
- During Backup, multiple requests for the same operation are submitted to different Brokers, causing request errors. #48856
- Downgrading to versions earlier than v3.1.11 or v3.2.4 causes Primary Key table index decompression failures, leading to query errors. #48659
Downgrade Notes
If you have used the column renaming feature, you must rename the columns back to their original names before downgrading your cluster to an earlier version. You can check the audit log of your cluster after upgrading to identify any `ALTER TABLE RENAME COLUMN` operations and the original names of the columns.
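Reverting a rename before a downgrade might look like the sketch below. The table and column names are hypothetical; the original names would come from the audit log.

```sql
-- Hypothetical table and column names.
-- Suppose the audit log shows this rename was run after upgrading:
--   ALTER TABLE orders RENAME COLUMN amount TO total_amount;
-- Before downgrading, restore the original column name:
ALTER TABLE orders RENAME COLUMN total_amount TO amount;
```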
3.1.14
Release date: July 29, 2024
Improvements
- Stream Load now supports using `\t` and `\n` as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
Bug Fixes
Fixed the following issues:
- Frequent INSERT and UPDATE operations on Primary Key tables may cause write and query delays in the database. #47838
- When a Primary Key table encounters data persistence failures, the persistent index may fail to capture the error, leading to data loss and reporting the error "Insert found duplicate key". #48045
- Materialized views may report insufficient permissions when refreshed. #47561
- Materialized view reports the error "For input string" when refreshed. #46131
- During materialized view refresh, the lock is held excessively long, causing the Leader FE to be restarted by the deadlock detection script. #48256
- Queries against views with the IN clause in their definitions may return inaccurate results. #47484
- Global Runtime Filter causes incorrect results. #48496
- MySQL protocol `COM_CHANGE_USER` does not support `conn_attr`. #47796
Behavior Changes
- When users create a non-partitioned table without specifying the bucket number, the minimum bucket number the system sets for the table is `16` (instead of `2` based on the formula `2*BE or CN count`). If users want to set a smaller bucket number when creating a small table, they must set it explicitly. #47005
3.3.1
Release date: July 18, 2024
New Features
- [Preview] Supports temporary tables.
- [Preview] JDBC Catalog supports Oracle and SQL Server.
- [Preview] Unified Catalog supports Kudu.
- Loading data into Primary Key tables with INSERT INTO supports partial updates in column mode.
- User-defined variables support the ARRAY type. #42631
- Stream Load supports converting JSON-type data and loading it into columns of STRUCT/MAP/ARRAY types. #45406
- Supports global dictionary cache.
- Supports deleting partitions in batch. #44744
- Supports queries on Iceberg views. #46273
- Supports managing column-level permissions in Apache Ranger. (Column-level permissions for materialized views and views must be set under the table object.) #47702
Improvements
- Optimized the IdChain hashcode implementation to reduce the FE restart time. #47599
- Improved error messages for the `csv.trim_space` parameter in the FILES() function, checking for illegal characters and providing reasonable prompts. #44740
- Stream Load supports using `\t` and `\n` as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
Bug Fixes
Fixed the following issues:
- Schema Change failures due to file location changes caused by Tablet migration during the Schema Change process. #45517
- Cross-cluster Data Migration Tool fails to create tables in the target cluster due to control characters such as `\` and `\r` in the default values of fields. #47861
- Persistent bRPC failures after BE restarts. #40229
- The `user_admin` role can change the root password using the ALTER USER command. #47801
- Primary key index write failures cause data write errors. #48045
Behavior Changes
- Intermediate result spilling is enabled by default when sinking data to Hive and Iceberg. #47118
- Changed the default value of the BE configuration item `max_cumulative_compaction_num_singleton_deltas` to `500`. #47621
- When users create a partitioned table without specifying the bucket number, if the number of partitions exceeds 5, the rule for setting the bucket count is changed to `max(2*BE or CN count, bucket number calculated based on the largest historical partition data volume)`. The previous rule was to calculate the bucket number based on the largest historical partition data volume. #47949
Downgrade notes
To downgrade a cluster from v3.3.1 or later to v3.2, users must clean all temporary tables in the cluster by following these steps:
- Disallow users from creating new temporary tables:

      ADMIN SET FRONTEND CONFIG("enable_experimental_temporary_table"="false");

- Check if there are any temporary tables in the cluster:

      SELECT * FROM information_schema.temp_tables;

- If there are temporary tables in the system, clean them up using the following command (the SYSTEM-level OPERATE privilege is required):

      CLEAN TEMPORARY TABLE ON SESSION 'session';
3.2.9
New Features
- Paimon tables now support DELETE Vectors. #45866
- Supports Column-level access control through Apache Ranger. #47702
- Stream Load can automatically convert JSON strings into STRUCT/MAP/ARRAY types during loading. #45406
- JDBC Catalog now supports Oracle and SQL Server. #35691
Improvements
- Improved privilege management by restricting user_admin role users from resetting the password of the root user. #47801
- Stream Load now supports using \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
- Optimized memory usage during data loading. #47047
- Supports masking authentication information for the Files() function in audit logs. #46893
- Hive tables now support the skip.header.line.count property. #47001
- JDBC Catalog supports more data types. #47618
Bug Fixes
Fixed the following issues:
- BE crash caused by ALTER TABLE ADD COLUMN after upgrading a shared-data cluster from v3.2.x to v3.3.0 and then rolling it back. #47826
- Tasks initiated through SUBMIT TASK showed a Running status indefinitely in the QueryDetail interface. #47619
- Forwarding queries to the FE Leader node caused a null pointer exception. #47559
- SHOW MATERIALIZED VIEWS with WHERE conditions caused a null pointer exception. #47811
- Vertical Compaction fails for Primary Key tables in shared-data clusters. #47192
- Improper handling of I/O Error when sinking data to Hive or Iceberg tables. #46979
- Table properties do not take effect when whitespaces are added to their values. #47119
- BE crash caused by concurrent migration and Index Compaction operations on Primary Key tables. #46675