Releases: StarRocks/starrocks

3.3.5

24 Oct 03:08
6d81f75

Release date: October 23, 2024

New Features

  • Supports millisecond and microsecond precision in the DATETIME type.
  • Resource groups support CPU hard isolation.

Improvements

  • Optimized performance and extraction strategy for Flat JSON. #50696
  • Reduced memory usage for the following ARRAY functions:
  • Optimized error messages when loading Null values into List partition keys with the Not Null attribute. #51086
  • Optimized error messages for Files() when authentication fails. #51697
  • Optimized internal statistics for INSERT OVERWRITE. #50417
  • Shared-data clusters support garbage collection (GC) for persistent index files. #51684
  • Added FE logs to help diagnose FE out-of-memory (OOM) issues. #51528
  • Supports recovering metadata from the metadata directory of FE. #51040

Bug Fixes

Fixed the following issues:

  • A deadlock issue caused by PIPE exceptions. #50841
  • Dynamic partition creation failures block subsequent partition creation. #51440
  • An error is returned for UNION ALL queries with ORDER BY. #51647
  • CTE in UPDATE statements causes hints to be ignored. #51458
  • The load_finish_time field in the system-defined view statistics.loads_history does not update as expected after a loading task is completed. #51174
  • UDTF mishandles multibyte UTF-8 characters. #51232

Behavior Changes

  • Modified the return content of the EXPLAIN statement. After the change, the return content is equivalent to EXPLAIN COST. You can configure the level of details returned by EXPLAIN using the dynamic FE parameter query_detail_explain_level. The default value is COSTS, with other valid values being NORMAL and VERBOSE. #51439
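
As a rough sketch of how the detail level described above might be adjusted, the statements below inspect the parameter, switch it to NORMAL, and run EXPLAIN. The table name orders is hypothetical, and the example assumes the parameter accepts the values listed in this note.

```sql
-- Inspect the current level (this release note gives COSTS as the default).
ADMIN SHOW FRONTEND CONFIG LIKE 'query_detail_explain_level';

-- Switch to the less detailed output; VERBOSE is the other listed value.
ADMIN SET FRONTEND CONFIG ("query_detail_explain_level" = "NORMAL");

-- "orders" is a hypothetical table used only for illustration.
EXPLAIN SELECT COUNT(*) FROM orders;
```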

3.3.4

30 Sep 08:24
56bcf6f

Release date: September 30, 2024

New Features

  • Supports creating asynchronous materialized views on List Partition tables. #46680 #46808
  • List Partition tables now support Nullable partition columns. #47797
  • Supports viewing external file schema information using DESC FILES(). #50527
  • Supports viewing replication task metrics via SHOW PROC '/replications'. #50483
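
The two inspection statements listed above might be used as follows. This is a minimal sketch: the bucket path, region, and credentials are placeholders, and the exact set of FILES() properties required depends on the storage system.

```sql
-- View the schema that StarRocks infers for an external file.
-- The path and credentials below are hypothetical placeholders.
DESC FILES(
    "path" = "s3://example-bucket/data/orders.parquet",
    "format" = "parquet",
    "aws.s3.access_key" = "<access_key>",
    "aws.s3.secret_key" = "<secret_key>",
    "aws.s3.region" = "us-west-2"
);

-- View replication task metrics.
SHOW PROC '/replications';
```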

Improvements

  • Optimized data recycling performance for TRUNCATE TABLE in shared-data clusters. #49975
  • Supports intermediate result spilling for CTE operators. #47982
  • Supports adaptive phased scheduling to alleviate OOM issues caused by complex queries. #47868
  • Supports predicate pushdown for STRING-type date or datetime columns in specific scenarios. #50643
  • Supports COUNT DISTINCT computation on constant semi-structured data. #48273
  • Added a new FE parameter lake_enable_balance_tablets_between_workers to enable tablet balancing for tables in shared-data clusters. #50843
  • Enhanced query rewrite capabilities for generated columns. #50398
  • Partial Update now supports automatically populating columns with default values of CURRENT_TIMESTAMP. #50287
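
A sketch of how the tablet-balancing parameter mentioned above could be inspected and enabled. It assumes the parameter can be changed at runtime; if it cannot, it would have to be set in the FE configuration file instead.

```sql
-- Check the current value of the parameter.
ADMIN SHOW FRONTEND CONFIG LIKE 'lake_enable_balance_tablets_between_workers';

-- Assumption: the parameter is dynamic and accepts "true"/"false".
ADMIN SET FRONTEND CONFIG ("lake_enable_balance_tablets_between_workers" = "true");
```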

Bug Fixes

Fixed the following issues:

  • The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
  • ISO-formatted DATETIME types cannot be pushed down. #49358
  • In concurrent scenarios, data still existed after the tablet was deleted. #50382
  • Incorrect results returned by the yearweek function. #51065
  • An issue with low cardinality dictionaries in ARRAY during CTE queries. #51148
  • After FE restarts, partition TTL-related parameters were lost for materialized views. #51028
  • Data loss in columns defined with CURRENT_TIMESTAMP after upgrading. #50911
  • A stack overflow caused by the array_distinct function. #51017
  • Activation failures for materialized views after upgrading due to changes in default field lengths. You can avoid such issues by setting enable_active_materialized_view_schema_strict_check to false. #50869
  • Resource group property cpu_weight can be set to a negative value. #51005
  • Incorrect statistics for disk capacity information. #50669
  • An issue with constant folding in the replace function. #50828

Behavior Changes

  • Changed the default replica number for external catalog-based materialized views from 1 to the value of the FE parameter default_replication_num (Default value: 3). #50931
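
If the previous single-replica behavior is still wanted for a particular materialized view, one option is to set the replica number explicitly at creation time. The sketch below assumes a hypothetical external catalog hive_catalog with a table sales.orders; all object and column names are illustrative.

```sql
-- Explicitly request one replica instead of relying on default_replication_num.
-- Catalog, database, table, and column names are hypothetical.
CREATE MATERIALIZED VIEW mv_orders_daily
REFRESH ASYNC
PROPERTIES ("replication_num" = "1")
AS
SELECT order_date, SUM(amount) AS total_amount
FROM hive_catalog.sales.orders
GROUP BY order_date;
```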

3.2.11

09 Sep 08:28
10a5f0e

Release date: September 9, 2024

Improvements

  • Supports masking authentication information for Files() and PIPE. #47629
  • Supports automatic inference for the STRUCT type when reading Parquet files through Files(). #50481

Bug Fixes

Fixed the following issues:

  • An error is returned for equi-join queries because they failed to be rewritten by the global dictionary. #50690
  • The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
  • Incorrect scheduling for unhealthy replica repairs after distributing data based on labels. #50331
  • An error in the statistics collection log: "Unknown column '%s' in '%s'." #50785
  • Incorrect timezone usage when reading complex types like TIMESTAMP from Parquet files via Files(). #50448

Behavior Changes

  • When StarRocks is downgraded from v3.3.x to v3.2.11, the system ignores any incompatible metadata. #49636

3.3.3

05 Sep 05:55
312ed45

Release date: September 5, 2024

New Features

  • Supports user-level variables. #48477
  • Supports Delta Lake Catalog metadata cache with manual and periodic refresh strategies. #46526 #49069
  • Supports loading JSON types from Parquet files. #49385
  • JDBC SQL Server Catalog supports queries with LIMIT. #48248
  • Shared-data clusters support Partial Updates with INSERT INTO. #49336

Improvements

  • Optimized error messages for loading:
    • When memory limits are reached during loading, the IP of the corresponding BE node is returned for easier troubleshooting. #49335
    • Detailed messages are provided when CSV data is loaded to target table columns that are not long enough. #49713
    • Specific node information is provided when Kerberos authentication fails in Broker Load. #46085
  • Optimized the partitioning mechanism during data loading to reduce memory usage in the initial stage. #47976
  • Optimized memory usage for shared-nothing clusters by limiting metadata memory usage to avoid issues when there are too many Tablets or Segment files. #49170
  • Optimized the performance of queries using max(partition_column). #49391
  • Partition pruning is used to optimize query performance when the partition column is a generated column (a column that is calculated based on a native column in the table), and the query predicate filter condition includes the native column. #48692
  • Supports masking authentication information for Files() and PIPE. #47629
  • Introduced a new statement show proc '/global_current_queries' to view queries running on all FE nodes. show proc '/current_queries' only shows queries running on the current FE node. #49826
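
The difference between the two statements in the last item can be seen by running them side by side:

```sql
-- Queries running on the FE node this session is connected to.
SHOW PROC '/current_queries';

-- Queries running on all FE nodes in the cluster.
SHOW PROC '/global_current_queries';
```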

Bug Fixes

Fixed the following issues:

  • The source cluster's BE nodes were mistakenly added to the current cluster when exporting data to the destination cluster via StarRocks external tables. #49323
  • TINYINT values are returned as NULL when StarRocks reads ORC files using SELECT * FROM FILES() on clusters deployed on aarch64 machines. #49517
  • Stream Load fails when loading JSON files containing large Integer types. #49927
  • Incorrect schema is returned due to improper handling of invisible characters when users load CSV files with Files(). #49718
  • An issue with temporary partition replacement in tables with multiple partition columns. #49764

Behavior Changes

  • Introduced a new parameter object_storage_rename_file_request_timeout_ms to better accommodate backup scenarios with cloud object storage. This parameter will be used as the backup timeout, with a default value of 30 seconds. #49706
  • to_json, CAST(AS MAP), and STRUCT AS JSON will return NULL instead of throwing an error by default when the conversion fails. You can allow errors by setting the system variable sql_mode to ALLOW_THROW_EXCEPTION. #50157
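
To restore the error-throwing behavior for failed conversions, the system variable named above can be set for the current session. This is a minimal sketch; note that assigning sql_mode this way replaces whatever value was set before, so any existing modes would need to be listed again.

```sql
-- Check the current value before changing it.
SHOW VARIABLES LIKE 'sql_mode';

-- Allow conversion failures to raise errors instead of returning NULL.
SET sql_mode = 'ALLOW_THROW_EXCEPTION';
```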

3.1.15

04 Sep 09:04
5625961

Release date: September 4, 2024

Bug Fixes

Fixed the following issues:

  • During query rewrite with asynchronous materialized views, count(*) on certain tables returns NULL. #49288
  • partition_live_number does not take effect. #49213
  • FE throws a tablet exception: BE disk offline, and cannot migrate tablets. #47833

3.2.10

23 Aug 06:13
f61f51a

Release date: August 23, 2024

Improvements

  • Files() will automatically convert BYTE_ARRAY data with a logical_type of JSON in Parquet files to the JSON type in StarRocks. #49385
  • Optimized error messages for Files() when Access Key ID and Secret Access Key are missing. #49090
  • information_schema.columns supports the GENERATION_EXPRESSION field. #49734
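
As an illustration of the new field, a query along the following lines could list the generation expressions of a table's columns. The database and table names are hypothetical.

```sql
-- "sales" and "orders" are hypothetical names used only for illustration.
SELECT COLUMN_NAME, DATA_TYPE, GENERATION_EXPRESSION
FROM information_schema.columns
WHERE TABLE_SCHEMA = 'sales'
  AND TABLE_NAME = 'orders';
```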

Bug Fixes

Fixed the following issues:

  • Downgrading a v3.3 shared-data cluster to v3.2 after setting the Primary Key table property "persistent_index_type" = "CLOUD_NATIVE" causes a crash. #48149
  • Exporting data to CSV files using SELECT INTO OUTFILE may cause data inconsistency. #48052
  • Queries encounter failures during concurrent query execution. #48180
  • Queries would hang due to a timeout in the Plan phase without exiting. #48405
  • After disabling index compression for Primary Key tables in older versions and then upgrading to v3.2.9, accessing page_off information causes an array out-of-bounds crash. #48230
  • BE crash caused by concurrent execution of ADD/DROP COLUMN operations. #49355
  • Queries against negative TINYINT values in ORC format files return None on the aarch64 architecture. #49517
  • If the disk write operation fails, failures of l0 snapshots for Primary Key Persistent Index may cause data loss. #48045
  • Partial Update in Column mode for Primary Key tables fails under scenarios with large-volume data updates. #49054
  • BE crash caused by Fast Schema Evolution when downgrading a v3.3.0 shared-data cluster to v3.2.9. #42737
  • partition_live_number does not take effect. #49213
  • The conflict between index persistence and compaction in Primary Key tables could cause clone failures. #49341
  • Modifications of partition_live_number using ALTER TABLE do not take effect. #49437
  • Rewrite of CTE distinct grouping sets generates an invalid plan. #48765
  • RPC failures polluted the thread pool. #49619
  • Authentication failures when loading files from AWS S3 via PIPE. #49837

Behavior Changes

  • Added a check for the meta directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
  • Added a memory limit parameter load_process_max_memory_hard_limit_ratio for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495

3.3.2

08 Aug 08:14
857dd73

Release date: August 8, 2024

New Features

  • Supports renaming columns within StarRocks internal tables. #47851

  • Supports reading Iceberg views. Currently, only Iceberg views created through StarRocks are supported. #46273

  • [Experimental] Supports adding and removing fields of STRUCT-type data. #46452

  • Supports specifying the compression level for ZSTD compression format during table creation. #46839

  • Added the following FE dynamic parameters to limit table boundaries. #47896

    Including:

    • auto_partition_max_creation_number_per_load
    • max_partition_number_per_table
    • max_bucket_number_per_partition
    • max_column_number_per_table
  • Supports runtime optimization of table data distribution, ensuring optimization tasks do not conflict with DML operations on the table. #43747

  • Added an observability interface for the global hit rate of Data Cache. #48450

  • Added the SQL function array_repeat. #47862
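
The table-boundary parameters listed above are FE dynamic parameters, so they can presumably be inspected and changed at runtime. The sketch below shows the general pattern; the value 10000 is an arbitrary illustration, not a recommended setting.

```sql
-- Inspect the current limits.
ADMIN SHOW FRONTEND CONFIG LIKE 'max_partition_number_per_table';
ADMIN SHOW FRONTEND CONFIG LIKE 'max_column_number_per_table';

-- Change one limit at runtime; 10000 is a hypothetical value.
ADMIN SET FRONTEND CONFIG ("max_column_number_per_table" = "10000");
```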

Improvements

  • Optimized the error messages for Routine Load failures due to Kafka authentication failures. #46136 #47649

  • Stream Load supports using \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302

  • Optimized the asynchronous statistics collection method for write operators, addressing the issue of increased latency when there are many import tasks. #48162

  • Added the following BE dynamic parameters to control resource hard limits during loading, reducing the impact on BE stability when writing a large number of tablets. #48495

    Including:

    • load_process_max_memory_hard_limit_ratio
    • enable_new_load_on_memory_limit_exceeded
  • Added consistency checks for Column IDs within the same table to prevent Compaction errors. #48498

  • Supports persisting PIPE metadata to prevent metadata loss due to FE restarts. #48852
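
To see how the two BE loading parameters mentioned above are currently set on each BE node, one option is to query the information_schema.be_configs view. This is a sketch that assumes the view is available in your version and exposes the parameter name in a column called name.

```sql
-- List the current values of the loading hard-limit parameters per BE node.
SELECT *
FROM information_schema.be_configs
WHERE name IN (
    'load_process_max_memory_hard_limit_ratio',
    'enable_new_load_on_memory_limit_exceeded'
);
```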

Bug Fixes

  • The process could not end when creating a dictionary from an FE Follower. #47802
  • Inconsistent information returned by the SHOW PARTITIONS command in shared-data clusters and shared-nothing clusters. #48647
  • Data errors caused by incorrect type handling when loading data from JSON fields to ARRAY<BOOLEAN> columns. #48387
  • The query_id column in information_schema.task_runs cannot be queried. #48876
  • During Backup, multiple requests for the same operation are submitted to different Brokers, causing request errors. #48856
  • Downgrading to versions earlier than v3.1.11 or v3.2.4 causes Primary Key table index decompression failures, leading to query errors. #48659

Downgrade Notes

If you have used the renaming column feature, you must rename the columns to their original names before downgrading your cluster to an earlier version. You can check the audit log of your cluster after upgrading to identify any ALTER TABLE RENAME COLUMN operations and the original names of the columns.
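
For example, if a column was renamed after the upgrade, it could be renamed back before the downgrade as sketched below. The table and column names are hypothetical.

```sql
-- After upgrading, the column was renamed (found in the audit log):
--   ALTER TABLE orders RENAME COLUMN amount TO total_amount;

-- Before downgrading, restore the original name.
ALTER TABLE orders RENAME COLUMN total_amount TO amount;
```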

3.1.14

30 Jul 03:20
d8e1fc5

Release date: July 29, 2024

Improvements

  • Stream Load now supports using \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302

Bug Fixes

Fixed the following issues:

  • Frequent INSERT and UPDATE operations on Primary Key tables may cause write and query delays in the database. #47838
  • When a Primary Key table encounters data persistence failures, the persistent index may fail to capture the error, leading to data loss and reporting the error "Insert found duplicate key". #48045
  • Materialized views may report insufficient permissions when refreshed. #47561
  • Materialized view reports the error "For input string" when refreshed. #46131
  • During materialized view refresh, the lock is held excessively long, causing the Leader FE to be restarted by the deadlock detection script. #48256
  • Queries against views with the IN clause in its definition may return inaccurate results. #47484
  • Global Runtime Filter causes incorrect results. #48496
  • MySQL protocol COM_CHANGE_USER does not support conn_attr. #47796

Behavior Changes

  • When users create a non-partitioned table without specifying the bucket number, the minimum bucket number the system sets for the table is 16 (instead of 2 based on the formula 2*BE or CN count). If users want to set a smaller bucket number when creating a small table, they must set it explicitly. #47005
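
A small non-partitioned table that should keep a low bucket count can specify it explicitly, as in the sketch below. The table definition is hypothetical.

```sql
-- Without BUCKETS, the system would use at least 16 buckets for this table.
CREATE TABLE small_dim (
    id INT,
    name VARCHAR(64)
)
DUPLICATE KEY (id)
DISTRIBUTED BY HASH (id) BUCKETS 2;
```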

3.3.1

19 Jul 03:31
2b87854

Release date: July 18, 2024

New Features

  • [Preview] Supports temporary tables.
  • [Preview] JDBC Catalog supports Oracle and SQL Server.
  • [Preview] Unified Catalog supports Kudu.
  • Loading data into Primary Key tables with INSERT INTO supports partial updates in column mode.
  • User-defined variables support the ARRAY type. #42631
  • Stream Load supports converting JSON-type data and loading it into columns of STRUCT/MAP/ARRAY types. #45406
  • Supports global dictionary cache.
  • Supports deleting partitions in batch. #44744
  • Supports queries on Iceberg views. #46273
  • Supports managing column-level permissions in Apache Ranger. (Column-level permissions for materialized views and views must be set under the table object.) #47702

Improvements

  • Optimized the IdChain hashcode implementation to reduce the FE restart time. #47599
  • Improved error messages for the csv.trim_space parameter in the FILES() function, checking for illegal characters and providing reasonable prompts. #44740
  • Stream Load supports using \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302

Bug Fixes

Fixed the following issues:

  • Schema Change failures due to file location changes caused by Tablet migration during the Schema Change process. #45517
  • Cross-cluster Data Migration Tool fails to create tables in the target cluster due to control characters such as \, \r in the default values of fields. #47861
  • Persistent bRPC failures after BE restarts. #40229
  • The user_admin role can change the root password using the ALTER USER command. #47801
  • Primary key index write failures cause data write errors. #48045

Behavior Changes

  • Intermediate result spilling is enabled by default when sinking data to Hive and Iceberg. #47118
  • Changed the default value of the BE configuration item max_cumulative_compaction_num_singleton_deltas to 500. #47621
  • When users create a partitioned table without specifying the bucket number, if the number of partitions exceeds 5, the rule for setting the bucket count is changed to max(2*BE or CN count, bucket number calculated based on the largest historical partition data volume). The previous rule was to calculate the bucket number based only on the largest historical partition data volume. #47949
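
As a worked example of the new bucket rule, assume a hypothetical cluster with 3 BE nodes and a table whose largest historical partition would call for 4 buckets. The old rule would pick 4; the new rule picks the larger of 2*3 and 4:

```sql
-- 3 BE nodes and a data-volume-based estimate of 4 are hypothetical inputs.
SELECT GREATEST(2 * 3, 4) AS bucket_count;  -- returns 6
```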

Downgrade notes

To downgrade a cluster from v3.3.1 or later to v3.2, users must clean all temporary tables in the cluster by following these steps:

  1. Prevent users from creating new temporary tables:

    ADMIN SET FRONTEND CONFIG("enable_experimental_temporary_table"="false"); 
  2. Check if there are any temporary tables in the cluster:

    SELECT * FROM information_schema.temp_tables;
  3. If there are temporary tables in the system, clean them up using the following command (the SYSTEM-level OPERATE privilege is required):

    CLEAN TEMPORARY TABLE ON SESSION 'session';

3.2.9

11 Jul 12:23
63ce1bd

New Features

  • Paimon tables now support DELETE Vectors. #45866
  • Supports Column-level access control through Apache Ranger. #47702
  • Stream Load can automatically convert JSON strings into STRUCT/MAP/ARRAY types during loading. #45406
  • JDBC Catalog now supports Oracle and SQL Server. #35691

Improvements

  • Improved privilege management by restricting user_admin role users from resetting the password of the root user. #47801
  • Stream Load now supports using \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
  • Optimized memory usage during data loading. #47047
  • Supports masking authentication information for the Files() function in audit logs. #46893
  • Hive tables now support the skip.header.line.count property. #47001
  • JDBC Catalog supports more data types. #47618

Bug Fixes

Fixed the following issues:

  • BE crash caused by ALTER TABLE ADD COLUMN after upgrading a shared-data cluster from v3.2.x to v3.3.0 and then rolling it back. #47826
  • Tasks initiated through SUBMIT TASK showed a Running status indefinitely in the QueryDetail interface. #47619
  • Forwarding queries to the FE Leader node caused a null pointer exception. #47559
  • SHOW MATERIALIZED VIEWS with WHERE conditions caused a null pointer exception. #47811
  • Vertical Compaction fails for Primary Key tables in shared-data clusters. #47192
  • Improper handling of I/O Error when sinking data to Hive or Iceberg tables. #46979
  • Table properties do not take effect when whitespaces are added to their values. #47119
  • BE crash caused by concurrent migration and Index Compaction operations on Primary Key tables. #46675