Releases: apache/druid
Druid 34.0.0
Apache Druid 34.0.0 contains over 270 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 48 contributors.
See the complete set of changes for additional details, including bug fixes.
Review the upgrade notes and incompatible changes before you upgrade to Druid 34.0.0.
If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.
Important features, changes, and deprecations
This section contains important information about new and existing features.
Java 11 support
Java 11 support has been deprecated since Druid 32.0.0, and official support will be removed as early as Druid 35.0.0.
Hadoop-based ingestion
Hadoop-based ingestion has been deprecated since Druid 32.0 and will be removed as early as Druid 35.0.0.
We recommend one of Druid's other supported ingestion methods, such as SQL-based ingestion or MiddleManager-less ingestion using Kubernetes.
As part of this change, you must now opt in to using the deprecated index_hadoop task type. If you don't, your Hadoop-based ingestion tasks will fail. To opt in, set druid.indexer.task.allowHadoopTaskExecution to true in your common.runtime.properties file.
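For example, in common.runtime.properties:

# Allow the deprecated index_hadoop task type to run
druid.indexer.task.allowHadoopTaskExecution=true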
Use SET statements for query context parameters
You can now use SET statements to define query context parameters for a query through the Druid console or the API.
SET statements in the Druid console
The web console now supports using SET statements to specify query context parameters. For example, if you include SET timeout = 20000; in your query, the timeout query context parameter is set:
SET timeout = 20000;
SELECT "channel", "page", SUM("added") FROM "wikipedia" GROUP BY 1, 2
SET statements with the API
SQL queries issued to /druid/v2/sql can now include multiple SET statements to build up the context for the final statement. For example, the following SQL query includes the timeout, useCache, populateCache, vectorize, and engine query context parameters:
SET timeout = 20000;
SET useCache = false;
SET populateCache = false;
SET vectorize = 'force';
SET engine = 'msq-dart';
SELECT "channel", "page", SUM("added") FROM "wikipedia" GROUP BY 1, 2
The API call for this query looks like the following:
curl --location 'http://HOST:PORT/druid/v2/sql' \
--header 'Content-Type: application/json' \
--data '{
"query": "SET timeout=20000; SET useCache=false; SET populateCache=false; SET engine='\''msq-dart'\'';SELECT user, commentLength,COUNT(*) AS \"COUNT\" FROM wikipedia GROUP BY 1, 2 ORDER BY 2 DESC",
"resultFormat": "array",
"header": true,
"typesHeader": true,
"sqlTypesHeader": true
}'
This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported.
Improved HTTP endpoints
You can now use raw SQL in the HTTP body for the /druid/v2/sql endpoints. Set Content-Type to text/plain instead of application/json so you can provide raw text that isn't escaped.
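For example, a sketch of a raw-text request (HOST:PORT is a placeholder, and the body is sent as-is with no JSON escaping):
curl --location 'http://HOST:PORT/druid/v2/sql' \
--header 'Content-Type: text/plain' \
--data 'SET timeout = 20000; SELECT "channel", SUM("added") FROM "wikipedia" GROUP BY 1'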
Cloning Historicals (experimental)
You can now configure clones for Historicals using the dynamic Coordinator configuration cloneServers. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one.
Set the config to a map from the target Historical server to the source Historical:
"cloneServers": {"historicalClone": "historicalOriginal"}
The clone doesn't participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical don't count towards replica counts either. If the original Historical disappears, the clone remains in the last known state of the source server until it is removed from the cloneServers config.
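For example, a sketch that sets this through the Coordinator dynamic configuration API (the endpoint replaces the whole dynamic config, so in practice merge this key into your existing config; the host and server names are placeholders):
curl -X POST 'http://COORDINATOR_HOST:PORT/druid/coordinator/v1/config' \
--header 'Content-Type: application/json' \
--data '{"cloneServers": {"historicalClone": "historicalOriginal"}}'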
When you query your data using the native query engine, you can prefer (preferClones), exclude (excludeClones), or include (includeClones) clones by setting the query context parameter cloneQueryMode. By default, clones are excluded.
As part of this change, new Coordinator APIs are available. For more information, see Coordinator APIs for clones.
Embedded kill tasks on the Overlord (experimental)
You can now run kill tasks directly on the Overlord itself. Embedded kill tasks provide several benefits; they:
- Kill segments as soon as they're eligible
- Don't take up task slots
- Finish faster since they use optimized metadata queries and don't launch a new JVM
- Kill a small number of segments per task, ensuring locks on an interval aren't held for too long
- Skip locked intervals to avoid head-of-line blocking
- Require minimal configuration
- Can keep up with a large number of unused segments in the cluster
This feature is controlled by the following configs:
- druid.manager.segments.killUnused.enabled - Whether the feature is enabled (defaults to false)
- druid.manager.segments.killUnused.bufferPeriod - The amount of time that a segment must be unused before it can be permanently removed from metadata and deep storage. This serves as a buffer period to prevent data loss if the data turns out to be needed after being marked unused (defaults to P30D)
To use embedded kill tasks, you must have the segment metadata cache enabled.
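A minimal sketch of the relevant Overlord properties (the segment metadata cache has its own configuration, which isn't shown here):

# Enable embedded kill tasks on the Overlord
druid.manager.segments.killUnused.enabled=true
# Optional: keep unused segments for 30 days before deleting them (the default)
druid.manager.segments.killUnused.bufferPeriod=P30D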
As part of this feature, new metrics have been added.
Preferred tier selection
You can now configure the Broker service to prefer Historicals on a specific tier. This is useful for deployments that span availability zones: Brokers in one AZ select Historicals in the same AZ by default, but can still select Historicals in another AZ if none in the same AZ are available.
To enable this, set the property druid.broker.select.tier to preferred in the Broker runtime properties. Then set druid.broker.select.tier.preferred.tier to the tier you want each Broker to prefer (for example, for Brokers in AZ1, set it to the tier name of your AZ1 Historical servers).
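For example, for Brokers running in AZ1 (a sketch; az1-tier is a hypothetical tier name):

# broker/runtime.properties
druid.broker.select.tier=preferred
druid.broker.select.tier.preferred.tier=az1-tier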
Dart improvements
The Dart query engine now uses the /druid/v2/sql endpoint like other SQL query engines. The former Dart-specific endpoint is no longer supported. To use Dart for a query, include the engine query context parameter and set it to msq-dart.
Enabling Dart remains the same: add the following line to your broker/runtime.properties and historical/runtime.properties files:
druid.msq.dart.enabled = true
Additionally, Dart now queries real-time tasks by default. You can control this behavior by setting the query context parameter includeSegmentSource to REALTIME (the default) or NONE, in a similar way to MSQ tasks. You can also run synchronous or asynchronous queries.
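Combined with the SET support described earlier, running a query on Dart without real-time data could look like the following sketch:
SET engine = 'msq-dart';
SET includeSegmentSource = 'NONE';
SELECT "channel", COUNT(*) FROM "wikipedia" GROUP BY 1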
SegmentMetadataCache on the Coordinator
Functional area and related changes
This section contains detailed release notes separated by areas.
Web console
Druid 33.0.0
Apache Druid 33.0.0 contains over 190 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 44 contributors.
See the complete set of changes for additional details, including bug fixes.
Review the upgrade notes before you upgrade to Druid 33.0.0.
If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.
# Important features, changes, and deprecations
This section contains important information about new and existing features.
# Increase segment load speed
You can now increase the speed at which segments get loaded on a Historical by providing a list of servers in the Coordinator dynamic config turboLoadingNodes. For these servers, the Coordinator ignores druid.coordinator.loadqueuepeon.http.batchSize and uses the value of the respective numLoadingThreads instead. Note that putting a Historical in turbo-loading mode might affect query performance, since the segment loading threads consume more resources.
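For example, as part of the Coordinator dynamic config (a sketch; the server names are hypothetical):
"turboLoadingNodes": ["historical-turbo-1:8083", "historical-turbo-2:8083"]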
# Overlord APIs for compaction (experimental)
You can use the following Overlord compaction APIs to manage compaction status and configs. These APIs work seamlessly whether or not compaction supervisors are enabled.
For more information, see Compaction APIs.
# Scheduled batch ingestion (experimental)
You can now schedule batch ingestion with the MSQ task engine by using the scheduled batch supervisor. You can specify the schedule using either the standard Unix cron syntax or the Quartz cron syntax by setting the type field to either unix or quartz. The Unix syntax also supports macro expressions such as @daily.
Submit your supervisor spec to the /druid/v2/sql/task/ endpoint.
The following example scheduled batch supervisor spec submits a REPLACE query every 5 minutes:
{
  "type": "scheduled_batch",
  "schedulerConfig": {
    "type": "unix",
    "schedule": "*/5 * * * *"
  },
  "spec": {
    "query": "REPLACE INTO foo OVERWRITE ALL SELECT * FROM bar PARTITIONED BY DAY"
  },
  "suspended": false
}
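A sketch of submitting this spec (saved as supervisor.json) to the endpoint named above:
curl -X POST --header 'Content-Type: application/json' -d @supervisor.json 'http://HOST:PORT/druid/v2/sql/task/'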
# Improved S3 upload
Druid can now use the AWS S3 Transfer Manager for S3 uploads, which can significantly reduce segment upload time. This feature is on by default and is controlled with the following configs in common.runtime.properties:
druid.storage.transfer.useTransferManager=true
druid.storage.transfer.minimumUploadPartSize=20971520
druid.storage.transfer.multipartUploadThreshold=20971520
# Functional area and related changes
This section contains detailed release notes separated by areas.
# Web console
# MERGE INTO
The MERGE INTO keyword is now highlighted in the web console, and the query is treated as an insert query.
# Other web console improvements
- Added the ability to multi-select in table filters, and added suggestions to the Status field for tasks and supervisors as well as to the service type #17765
- The Explore view now supports timezones #17650
- Data exported from the web console is now normalized to how Druid exports data. Additionally, you can now export results as Markdown tables #17845
# Ingestion
# SQL-based ingestion
# Other SQL-based ingestion improvements
# Streaming ingestion
# Query parameter for restarts
You can now use an optional query parameter called skipRestartIfUnmodified for the /druid/indexer/v1/supervisor endpoint. Set skipRestartIfUnmodified=true to avoid restarting the supervisor if the spec is unchanged.
For example:
curl -X POST --header "Content-Type: application/json" -d @supervisor.json localhost:8888/druid/indexer/v1/supervisor?skipRestartIfUnmodified=true
# Other streaming ingestion improvements
- Improved the efficiency of streaming ingestion by fetching active tasks from memory. This reduces the number of calls to the metadata store for active datasource task payloads #16098
# Querying
# Improved the query results API
The query results API (`GE...
Druid 32.0.1
The Apache Druid team is proud to announce the release of Apache Druid 32.0.1.
Druid is a high performance analytics data store for event-driven data.
Apache Druid 32.0.1 contains security fixes for CVE-2025-27888.
Source and binary distributions can be downloaded from:
https://druid.apache.org/downloads.html
Full Changelog: druid-32.0.0...druid-32.0.1
A big thank you to all the contributors in this milestone release!
Druid 31.0.2
The Apache Druid team is proud to announce the release of Apache Druid 31.0.2.
Druid is a high performance analytics data store for event-driven data.
Apache Druid 31.0.2 contains security fixes for CVE-2025-27888.
Source and binary distributions can be downloaded from:
https://druid.apache.org/downloads.html
Full Changelog: druid-31.0.1...druid-31.0.2
A big thank you to all the contributors in this milestone release!
Druid 32.0.0
Apache Druid 32.0.0 contains over 220 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 52 contributors.
See the complete set of changes for additional details, including bug fixes.
Review the incompatible changes before you upgrade to Druid 32.
If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.
# Important features
This section contains important information about new and existing features.
# New Overlord APIs
APIs for marking segments as used or unused have been moved from the Coordinator to the Overlord service:
- Mark all (non-overshadowed) segments of a datasource as used: POST /druid/indexer/v1/datasources/{dataSourceName}
- Mark all segments of a datasource as unused: DELETE /druid/indexer/v1/datasources/{dataSourceName}
- Mark multiple (non-overshadowed) segments as used: POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed
- Mark multiple segments as unused: POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused
- Mark a single segment as used: POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}
- Mark a single segment as unused: DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}
As part of this change, the corresponding Coordinator APIs have been deprecated and will be removed in a future release:
- POST /druid/coordinator/v1/datasources/{dataSourceName}
- POST /druid/coordinator/v1/datasources/{dataSourceName}/markUsed
- POST /druid/coordinator/v1/datasources/{dataSourceName}/markUnused
- POST /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}
- DELETE /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}
- DELETE /druid/coordinator/v1/datasources/{dataSourceName}
The Coordinator now calls the Overlord to serve these requests.
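For example, a sketch of marking all segments in an interval as unused through the Overlord (the host, datasource, and interval are placeholders; the payload shape mirrors the deprecated Coordinator API):
curl -X POST 'http://OVERLORD_HOST:PORT/druid/indexer/v1/datasources/wikipedia/markUnused' \
--header 'Content-Type: application/json' \
--data '{"interval": "2024-01-01/2024-02-01"}'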
# Realtime query processing for multi-value strings
Realtime query processing no longer considers all strings as multi-value strings during expression processing, fixing a number of bugs and unexpected failures. This should also improve realtime query performance of expressions on string columns.
This change impacts topN queries for realtime segments where rows of data are implicitly null, such as when a property is missing from a JSON object.
Before this change, these rows were handled as [] instead of null, leading to inconsistency between processing realtime segments and published segments: when processing realtime segments, the value was treated as [], which topN ignores; after publishing, the value became null, which topN does not ignore. The same query could therefore return different results before and after the data was persisted.
After this change, the topN engine treats [] as null when processing realtime segments, which is consistent with published segments.
This change doesn't impact actual multi-value string columns, regardless of whether they're realtime.
# Changes and deprecations
# ANSI-SQL compatibility and query results
Support for the configs that let you maintain older behavior that wasn't ANSI-SQL compliant has been removed:
druid.generic.useDefaultValueForNull=true
druid.expressions.useStrictBooleans=false
druid.generic.useThreeValueLogicForNativeFilters=false
They no longer affect your query results. Only SQL-compliant, non-legacy behavior is supported now.
If the configs are set to the legacy behavior, Druid services will fail to start.
If you want to continue to get the same results without these settings, you must update your queries; otherwise your results will be incorrect after you upgrade.
For more information about how to update your queries, see the migration guide.
# Java support
Java support in Druid has been updated:
- Java 8 support has been removed
- Java 11 support is deprecated
We recommend that you upgrade to Java 17.
# Hadoop-based ingestion
Hadoop-based ingestion is now deprecated. We recommend that you migrate to SQL-based ingestion.
# Join hints in MSQ task engine queries
Druid now supports hints for SQL JOIN queries that use the MSQ task engine. This allows queries to provide hints for the JOIN type that should be used at a per-join level. Join hints recursively affect subqueries.
SELECT /*+ sort_merge */ w1.cityName, w2.countryName
FROM
(
  SELECT /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS countryName
  FROM wikipedia w3
  LEFT JOIN "wikipedia-set2" w4 ON w3.regionName = w4.regionName
) w1
JOIN "wikipedia-set1" w2 ON w1.cityName = w2.cityName
WHERE w1.cityName = 'New York';
(#17406)
# Functional area and related changes
This section contains detailed release notes separated by areas.
# Web console
# Explore view (experimental)
Several improvements have been made to the Explore view in the web console.
# Segment timeline view
The segment timeline is now more interactive and no longer forces day granularity.
# Other web conso...
Druid 31.0.1
Apache Druid 31.0.1 is a patch release that contains important fixes for topN queries using query granularity other than 'ALL' and for the new complex metric column compression feature introduced in Druid 31.0.0. It also contains fixes for the web console, the new projections feature, and a fix for a minor performance regression.
See the complete set of changes for 31.0.1 for additional details.
For information about new features in Druid 31, see the Druid 31 release notes.
# Bug fixes
- Fixes an issue with topN queries that use a query granularity other than 'ALL', which could cause some query correctness issues #17565
- Fixes an issue with complex metric compression that caused some data to be read incorrectly, resulting in segment data corruption or system instability due to out-of-memory exceptions. We recommend that you reingest data if you use compression for complex metric columns #17422
- Fixes an issue with projection segment merging #17460
- Fixes web console progress indicator #17334
- Fixes a minor performance regression with query processing #17397
# Credits
@clintropolis
@findingrish
@gianm
@techdocsmith
@vogievetsky
Druid 31.0.0
Apache Druid 31.0.0 contains over 589 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 64 contributors.
See the complete set of changes for additional details, including bug fixes.
Review the upgrade notes and incompatible changes before you upgrade to Druid 31.0.0.
If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.
# Important features, changes, and deprecations
This section contains important information about new and existing features.
# Compaction features
Druid now supports the following features:
- Compaction scheduler with greater flexibility and control over when and what to compact.
- MSQ task engine-based auto-compaction for more performant compaction jobs.
For more information, see Compaction supervisors.
Additionally, compaction tasks that take advantage of concurrent append and replace are now generally available, as part of concurrent append and replace becoming GA.
# Window functions are GA
Window functions are now generally available in Druid's native engine and in the MSQ task engine. You no longer need to set the enableWindowing query context parameter to use window functions. #17087
# Concurrent append and replace GA
Concurrent append and replace is now GA. The feature safely replaces the existing data in an interval of a datasource while new data is being appended to that interval. One of the most common applications of this feature is appending new data (such as with streaming ingestion) to an interval while compaction of that interval is already in progress.
# Delta Lake improvements
The community extension for Delta Lake has been improved to support complex types and snapshot versions.
# Iceberg improvements
The community extension for Iceberg has been improved. For more information, see Iceberg improvements.
# Projections (experimental)
Druid 31.0.0 includes experimental support for new feature called projections. Projections are grouped pre-aggregates of a segment that are automatically used at query time to optimize execution for any queries which 'fit' the shape of the projection by reducing both computation and i/o cost by reducing the number of rows which need to be processed. Projections are contained within segments of a datasource and do increase the segment size. But they can share data, such as value dictionaries of dictionary encoded columns, with the columns of the base segment.
Projections currently only support JSON-based ingestion, but they can be used by queries that use the MSQ task engine or the new Dart engine. Future development will allow projections to be created as part of SQL-based ingestion.
We have a lot of plans to continue to improve this feature in the coming releases, but are excited to get it out there so users can begin experimentation since projections can dramatically improve query performance.
For more information, see Projections.
# Low latency high complexity queries using Dart (experimental)
Distributed Asynchronous Runtime Topology (Dart) is designed to support high-complexity queries, such as large joins, high-cardinality GROUP BY, subqueries, and common table expressions, which are common in ad-hoc data warehouse workloads. Instead of using data warehouse engines like Spark or Presto to execute high-complexity queries, you can use Dart, alleviating the need for additional infrastructure.
For more information, see Dart.
# Storage improvements
Druid 31.0.0 includes several improvements to how data is stored by Druid, including compressed columns and flexible segment sorting. For more information, see Storage improvements.
# Upgrade-related changes
See the Upgrade notes for more information about the following upgrade-related changes:
- Array ingest mode now defaults to array
- Disabled ZK-based segment loading
- Removed task action audit logging
- Removed Firehose and FirehoseFactory
- Removed the scan query legacy mode
# Deprecations
# Java 8 support
Java 8 support is now deprecated and will be removed in 32.0.0.
# Other deprecations
- The deprecated API /lockedIntervals has been removed #16799
- The cluster-level compaction API deprecates the task slots compaction API #16803
- The arrayIngestMode context parameter is deprecated and will be removed. For more information, see Array ingest mode now defaults to array.
# Functional areas and related changes
This section contains detailed release notes separated by areas.
# Web console
# Improvements to the stages display
A number of improvements have been made to the query stages visualization. These changes include:
- Added a graph visualization to illustrate the flow of query stages #17135
- Added a column for CPU counters in the query stages detail view when they are present. Also added tooltips to expose potentially hidden data like CPU time #17132
# Dart
Added the ability to detect the presence of the Dart engine and to run Dart queries from the console as well as to see currently running Dart queries.
druid-30.0.1
The Apache Druid team is proud to announce the release of Apache Druid 30.0.1.
Druid is a high performance analytics data store for event-driven data.
Apache Druid 30.0.1 contains security fixes for CVE-2024-45384, CVE-2024-45537.
The release also contains minor doc and task monitor fixes.
Source and binary distributions can be downloaded from:
https://druid.apache.org/downloads.html
Full Changelog: druid-30.0.0...druid-30.0.1
A big thank you to all the contributors in this milestone release!
Druid 30.0.0
Apache Druid 30.0.0 contains over 407 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 50 contributors.
See the complete set of changes for additional details, including bug fixes.
Review the upgrade notes and incompatible changes before you upgrade to Druid 30.0.0.
If you are upgrading across multiple versions, see the Upgrade notes page, which lists upgrade notes for the most recent Druid versions.
# Upcoming removals
As part of the continued improvements to Druid, we are deprecating certain features and behaviors in favor of newer iterations that offer more robust features and are more aligned with standard ANSI SQL. Many of these new features have been the default for new deployments for several releases.
The following features are deprecated, and we currently plan to remove support in Druid 32.0.0:
- Non-SQL compliant null handling: By default, Druid now differentiates between an empty string and a record with no data, as well as between an empty numerical record and 0. For more information, see NULL values. For a tutorial on the SQL-compliant logic, see the Null handling tutorial.
- Non-strict Boolean handling: Druid now strictly uses 1 (true) or 0 (false). Previously, true and false could be represented either as true and false or as 1 and 0, respectively. In addition, Druid now returns a null value for Boolean comparisons like True && NULL. For more information, see Boolean logic. For examples of filters that use the SQL-compliant logic, see Query filters.
- Two-value logic: By default, Druid now uses three-valued logic for both ingestion and querying. This primarily affects filters using logical NOT operations on columns with NULL values. For more information, see Boolean logic. For examples of filters that use the SQL-compliant logic, see Query filters.
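As a sketch of how three-valued logic surfaces in queries (t and col are hypothetical), rows where col is NULL match neither a predicate nor its negation:
SELECT
  COUNT(*) FILTER (WHERE col = 'x') AS matches,
  COUNT(*) FILTER (WHERE NOT (col = 'x')) AS non_matches,
  COUNT(*) FILTER (WHERE col IS NULL) AS nulls
FROM t
-- matches + non_matches + nulls = COUNT(*), because NULL = 'x' evaluates to unknown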
# Important features, changes, and deprecations
This section contains important information about new and existing features.
# Concurrent append and replace improvements
Streaming ingestion supervisors now support concurrent append; that is, streaming tasks can run concurrently with a replace task (compaction or re-indexing) if the replace task also uses concurrent locks. Set the context parameter useConcurrentLocks to true to enable concurrent append.
Once you update the supervisor to have "useConcurrentLocks": true, the transition to concurrent append happens seamlessly without causing any ingestion lag or task failures.
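A sketch of where the parameter goes in a streaming supervisor spec (the surrounding fields are placeholders, and the field placement is an assumption; see the Druid docs for your supervisor type):
{
  "type": "kafka",
  "spec": { ... },
  "context": {
    "useConcurrentLocks": true
  }
}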
Druid now performs active cleanup of stale pending segments by tracking the set of tasks using such pending segments.
This allows concurrent append and replace to upgrade only a minimal set of pending segments, which improves performance and eliminates errors.
It also reduces load on the metadata store.
# Grouping on complex columns
Druid now supports grouping on complex columns and nested arrays.
This means that both native queries and the MSQ task engine can group on complex columns and nested arrays while returning results.
Additionally, the MSQ task engine can roll up and sort on the supported complex columns, such as JSON columns, during ingestion.
# Removed ZooKeeper-based segment loading
ZooKeeper-based segment loading is being removed due to known issues.
It has been deprecated for several releases.
Recent improvements to the Druid Coordinator have significantly enhanced performance with HTTP-based segment loading.
# Improved groupBy queries
Before Druid pushes realtime segments to deep storage, the segments consist of spill files.
Segment metrics such as query/segment/time now report on each spill file for a realtime segment, rather than for the entire segment.
This change eliminates the need to materialize results on the heap, which improves the performance of groupBy queries.
# Improved AND filter performance
Druid query processing now adaptively determines when children of AND filters should compute indexes and when to simply match rows during the scan, based on the selectivity of other filters.
Known as filter partitioning, this can produce dramatic performance increases, depending on the order of filters in the query.
For example, take a query like SELECT SUM(longColumn) FROM druid.table WHERE stringColumn1 = '1000' AND stringColumn2 LIKE '%1%'. Previously, Druid used indexes when processing filters if they were available.
That's not always ideal; imagine if stringColumn1 = '1000' matches 100 rows. With indexes, we have to find every value of stringColumn2 LIKE '%1%' that is true to compute the indexes for the filter. If stringColumn2 has more than 100 values, it ends up being worse than simply checking for a match in those 100 remaining rows.
With the new logic, Druid now checks the selectivity of indexes as it processes each clause of the AND filter.
If it determines it would take more work to compute the index than to match the remaining rows, Druid skips computing the index.
The order in which you write filters in the WHERE clause of a query can improve its performance.
More improvements are coming, but you can try out the existing improvements by reordering a query.
Put indexes that are less intensive to compute, such as IS NULL, =, and comparisons (>, >=, <, and <=), near the start of AND filters so that Druid processes your queries more efficiently.
Not ordering your filters this way won't degrade performance relative to previous releases, since the fallback behavior is what Druid did previously.
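The ordering advice as a sketch, using the hypothetical columns from the example above:
SELECT SUM(longColumn)
FROM druid.table
WHERE stringColumn1 = '1000'   -- cheap equality index first
  AND stringColumn2 LIKE '%1%' -- expensive pattern match last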
# Centralized datasource schema (alpha)
You can now configure Druid to manage datasource schema centrally on the Coordinator.
Previously, Brokers...
druid-29.0.1
Druid 29.0.1
Apache Druid 29.0.1 is a patch release that fixes some issues in the Druid 29.0.0 release.
Bug fixes
- Added type verification for INSERT and REPLACE to validate that strings and string arrays aren't mixed #15920
- Concurrent replace now allows pending Peon segments to be upgraded using the Supervisor #15995
- Changed the targetDataSource attribute to return a string containing the name of the datasource. This reverts the breaking change introduced in Druid 29.0.0 for INSERT and REPLACE MSQ queries #16004 #16031
- Decreased the size of the distribution Docker image #15968
- Fixed an issue with SQL-based ingestion where string inputs, such as from CSV, TSV, or string-value fields in JSON, are ingested as null values when they are typed as LONG or BIGINT #15999
- Fixed an issue where a web console-generated Kafka supervisor spec has flattenSpec in the wrong location #15946
- Fixed an issue with filters on expression virtual column indexes incorrectly considering values null in some cases for expressions that translate null values into not-null values #15959
- Fixed an issue where the data loader crashes if the incoming data can't be parsed #15983
- Improved DOUBLE type detection in the web console #15998
- Web console-generated queries now only set the context parameter arrayIngestMode to array when you explicitly opt in to use arrays #15927
- The web console now displays the results of an MSQ query that writes to an external destination through the EXTERN function #15969
Incompatible changes
Changes to targetDataSource in EXPLAIN queries
Druid 29.0.1 includes a breaking change that restores the behavior of targetDataSource to its 28.0.0 and earlier state, which differs from Druid 29.0.0 and only 29.0.0. In 29.0.0, targetDataSource returns a JSON object that includes the datasource name. In all other versions, targetDataSource returns a string containing the name of the datasource.
If you're upgrading from any version other than 29.0.0, there is no change in behavior.
If you are upgrading from 29.0.0, this is an incompatible change.
Dependency updates
- Updated PostgreSQL JDBC Driver version to 42.7.2 #15931
Credits
@abhishekagarwal87
@adarshsanjeev
@AmatyaAvadhanula
@clintropolis
@cryptoe
@dependabot[bot]
@ektravel
@gargvishesh
@gianm
@kgyrtkirk
@LakshSingla
@somu-imply
@techdocsmith
@vogievetsky