3.0.0
Release date: April 28, 2023
New Features
System architecture
- Decouple storage and compute. StarRocks now supports data persistence into S3-compatible object storage, enhancing resource isolation, reducing storage costs, and making compute resources more scalable. Local disks are used as hot data cache for boosting query performance. The query performance of the new shared-data architecture is comparable to the classic architecture (shared-nothing) when local cache is hit. For more information, see Deploy and use shared-data StarRocks.
Storage engine and data ingestion
- The AUTO_INCREMENT attribute is supported to provide globally unique IDs, which simplifies data management.
- Automatic partitioning and partitioning expressions are supported, which makes partition creation easier to use and more flexible.
- Primary Key tables support more complete UPDATE and DELETE syntax, including the use of CTEs and references to multiple tables.
- Added Load Profile for Broker Load and INSERT INTO jobs. You can view the details of a load job by querying the load profile. The usage is the same as Analyze query profile.
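As a rough illustration of the AUTO_INCREMENT and expression-partitioning items above, the sketch below creates two tables. The table and column names (`user_ids`, `site_access`) are invented for this example, and the bucket counts are arbitrary.

```sql
-- Hypothetical Primary Key table: `uid` is generated by StarRocks.
CREATE TABLE user_ids (
    user_name VARCHAR(64) NOT NULL,
    uid BIGINT NOT NULL AUTO_INCREMENT
)
PRIMARY KEY (user_name)
DISTRIBUTED BY HASH (user_name) BUCKETS 4;

-- Omit the AUTO_INCREMENT column so that an ID is assigned automatically.
INSERT INTO user_ids (user_name) VALUES ('alice'), ('bob');

-- Hypothetical table partitioned by an expression: partitions are
-- created per day as data arrives, without listing them up front.
CREATE TABLE site_access (
    event_day DATETIME NOT NULL,
    site_id INT,
    pv BIGINT
)
DUPLICATE KEY (event_day, site_id)
PARTITION BY date_trunc('day', event_day)
DISTRIBUTED BY HASH (site_id) BUCKETS 4;
```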
Data Lake Analytics
- [Preview] Supports Presto/Trino compatible dialect. Presto/Trino's SQL can be automatically rewritten into StarRocks' SQL pattern. For more information, see the system variable sql_dialect.
- [Preview] Supports JDBC catalogs.
- Supports using SET CATALOG to manually switch between catalogs in the current session.
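A minimal sketch of switching catalogs and dialects within a session. The catalog name `hive_catalog` is an assumption (it must already exist in your cluster); `default_catalog` is the internal catalog.

```sql
-- Switch the current session to a hypothetical external catalog.
SET CATALOG hive_catalog;
SHOW DATABASES;

-- [Preview] Interpret subsequent statements as Trino-style SQL.
SET sql_dialect = 'trino';

-- Switch back to the internal catalog.
SET CATALOG default_catalog;
```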
Privileges and security
- Provides a new privilege system with full RBAC functionalities, supporting role inheritance and default roles. For more information, see Overview of privileges.
- Provides more privilege management objects and more fine-grained privileges. For more information, see Privileges supported by StarRocks.
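The following sketch shows how role inheritance and default roles might be combined under the new privilege system. The role names, the user `'jack'@'%'`, and the table `sales_db.orders` are hypothetical, and the user is assumed to exist already.

```sql
-- Create two roles and let `analyst` inherit everything granted to `reader`.
CREATE ROLE reader;
CREATE ROLE analyst;
GRANT SELECT ON TABLE sales_db.orders TO ROLE reader;
GRANT reader TO ROLE analyst;

-- Assign the role to a user and activate it by default at login.
GRANT analyst TO USER 'jack'@'%';
SET DEFAULT ROLE analyst TO 'jack'@'%';

-- Inside the user's own session: activate a role and inspect it.
SET ROLE analyst;
SELECT current_role();
SHOW ROLES;
```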
Query engine
- Allows more queries on joined tables to benefit from the query cache. For example, the query cache now supports Broadcast Join and Bucket Shuffle Join.
- Supports Global UDFs.
- Dynamic adaptive parallelism: StarRocks can automatically adjust the pipeline_dop parameter for query concurrency.
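For reference, both the query cache and adaptive parallelism are controlled through session variables. The sketch below assumes the variable names `enable_query_cache` and `pipeline_dop` as documented by StarRocks, and that a `pipeline_dop` value of 0 leaves the degree of parallelism to StarRocks; verify both against the documentation for your version.

```sql
-- Enable the query cache for the current session.
SET enable_query_cache = true;

-- Let StarRocks choose the pipeline parallelism adaptively
-- (assumption: 0 means "decide automatically").
SET pipeline_dop = 0;
```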
SQL reference
- Added the following privilege-related SQL statements: SET DEFAULT ROLE, SET ROLE, SHOW ROLES, and SHOW USERS.
- Added the following semi-structured data analysis functions: map_apply, map_from_arrays, map_filter, transform_keys, and transform_values.
- array_agg supports ORDER BY.
- Window functions lead and lag support IGNORE NULLS.
- Added string functions replace, hex_decode_binary, and hex_decode_string.
- Added encryption functions base64_decode_binary and base64_decode_string.
- Added math functions sinh, cosh, and tanh.
- Added utility function current_role.
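A few of the functions listed above, sketched as standalone queries. The table `scores(student, subject, score, exam_ts)` is hypothetical and exists only to give the aggregate and window examples something to read.

```sql
-- Build a map from a key array and a value array.
SELECT map_from_arrays([1, 2], ['a', 'b']);

-- Round-trip a string through its hexadecimal form.
SELECT hex_decode_string(hex('StarRocks'));

-- New hyperbolic math functions.
SELECT sinh(0), cosh(0), tanh(0);

-- array_agg with ORDER BY: students per subject, best score first.
SELECT subject, array_agg(student ORDER BY score DESC) AS ranked_students
FROM scores
GROUP BY subject;

-- lag with IGNORE NULLS: the previous non-NULL score for each student.
SELECT student,
       exam_ts,
       lag(score IGNORE NULLS, 1) OVER (PARTITION BY student ORDER BY exam_ts) AS prev_score
FROM scores;
```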
Improvements
Deployment
- Updated Docker image and the related Docker deployment document for version 3.0. #20623 #21021
Storage engine and data ingestion
- Supports more CSV parameters for data ingestion, including SKIP_HEADER, TRIM_SPACE, ENCLOSE, and ESCAPE. See STREAM LOAD, BROKER LOAD, and ROUTINE LOAD.
- The primary key and sort key are decoupled in Primary Key tables. The sort key can be separately specified in ORDER BY when you create a table.
- Optimized the memory usage of data ingestion into Primary Key tables in scenarios such as large-volume ingestion, partial updates, and persistent primary indexes.
- Supports creating asynchronous INSERT tasks. For more information, see INSERT and SUBMIT TASK. #20609
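To illustrate the decoupled sort key and asynchronous INSERT tasks, here is a sketch under assumed names: the tables `orders` and `staging_orders` and their columns are invented, and the bucket count is arbitrary.

```sql
-- The primary key is `order_id`, but rows are sorted by (dt, merchant_id).
CREATE TABLE orders (
    order_id BIGINT NOT NULL,
    dt DATE NOT NULL,
    merchant_id INT NOT NULL,
    revenue BIGINT
)
PRIMARY KEY (order_id)
DISTRIBUTED BY HASH (order_id) BUCKETS 4
ORDER BY (dt, merchant_id);

-- Run the INSERT as an asynchronous task instead of blocking the session.
SUBMIT TASK AS
INSERT INTO orders
SELECT order_id, dt, merchant_id, revenue FROM staging_orders;

-- Check the state of submitted task runs.
SELECT * FROM information_schema.task_runs;
```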
Materialized view
- Optimized the rewriting capabilities of materialized views, including:
  - Supports rewrite of View Delta Join, Outer Join, and Cross Join.
  - Optimized SQL rewrite of Union with partition.
- Improved materialized view building capabilities: supporting CTE, select *, and Union.
- Optimized the information returned by SHOW MATERIALIZED VIEWS.
- Supports adding MV partitions in batches, which improves the efficiency of partition addition during materialized view building. #21167
Query engine
- All operators are supported in the pipeline engine. Non-pipeline code will be removed in later versions.
- Improved Big Query Positioning and added big query log. SHOW PROCESSLIST supports viewing CPU and memory information.
- Optimized Outer Join Reorder.
- Optimized error messages in the SQL parsing stage, providing more accurate error positioning and clearer error messages.
Data Lake Analytics
- Optimized metadata statistics collection.
- Supports using SHOW CREATE TABLE to view the creation statements of the tables that are managed by an external catalog and are stored in Apache Hive™, Apache Iceberg, Apache Hudi, or Delta Lake.
Bug Fixes
- Some URLs in the license header of StarRocks' source file cannot be accessed. #2224
- An unknown error is returned during SELECT queries. #19731
- Supports SHOW/SET CHARACTER. #17480
- When the loaded data exceeds the field length supported by StarRocks, the error message returned is not correct. #14
- Supports show full fields from 'table'. #17233
- Partition pruning causes MV rewrites to fail. #14641
- MV rewrite fails when the CREATE MATERIALIZED VIEW statement contains count(distinct) and count(distinct) is applied to the DISTRIBUTED BY column. #16558
- FEs fail to start when a VARCHAR column is used as the partitioning column of a materialized view. #19366
- Window functions LEAD and LAG incorrectly handle IGNORE NULLS. #21001
- Adding temporary partitions conflicts with automatic partition creation. #21222
Behavior Change
- The new role-based access control (RBAC) system supports the previous privileges and roles. However, the syntax of related statements such as GRANT and REVOKE is changed.
- Renamed SHOW MATERIALIZED VIEW as SHOW MATERIALIZED VIEWS.
- Added the following reserved keywords: AUTO_INCREMENT, CURRENT_ROLE, DEFERRED, ENCLOSE, ESCAPE, IMMEDIATE, PRIVILEGES, SKIP_HEADER, TRIM_SPACE, VARBINARY.
Upgrade Notes
You can upgrade from v2.5 to v3.0 or downgrade from v3.0 to v2.5.
In theory, an upgrade from a version earlier than v2.5 is also supported. To ensure system availability, we recommend that you first upgrade your cluster to v2.5 and then to v3.0.
Take note of the following points when you perform a downgrade from v3.0 to v2.5.
BDBJE
StarRocks upgrades the BDB library in v3.0. However, BDBJE cannot be rolled back, so you must keep using the v3.0 BDB library after a downgrade. Perform the following steps:
- After you replace the FE package with a v2.5 package, copy fe/lib/starrocks-bdb-je-18.3.13.jar of v3.0 to the fe/lib directory of v2.5.
- Delete fe/lib/je-7.*.jar.
Privilege system
The new RBAC privilege system is used by default after you upgrade to v3.0. You can only downgrade to v2.5.
After a downgrade, run ALTER SYSTEM CREATE IMAGE to create a new image and wait for the new image to be synchronized to all follower FEs. If you do not run this command, some of the downgrade operations may fail. This command is supported in v2.5.3 and later.
For details about the differences between the privilege system of v2.5 and v3.0, see "Upgrade notes" in Privileges supported by StarRocks.