Releases: redpanda-data/redpanda

v24.3.15

18 Jun 04:29
0d70730

Bug Fixes

  • Allow partition balancing to operate when space management is enabled but the local target capacity is unset. by @ztlpn in #26306
  • Enable TCP keepalive for cloud storage connections. by @Lazin in #26409
  • Fix Redpanda crash if partition_autobalancing_concurrent_moves was set to 0. by @ztlpn in #26306
  • #26190 Fixes a bug in which a broker would crash during sliding window compaction when started with log_compaction_use_sliding_window=false and its value was later set to true without restarting. by @WillemKauf in #26196
  • partition_autobalancing_mode=off now stops on-demand partition rebalance as well. by @ztlpn in #26306
  • PR #26082 [v24.3.x] archival: Fix archival_stm_snapshot installation by @Lazin
  • PR #26277 [v24.3.x] r/consensus: stop consumable offset monitor by @mmaslankaprv
  • PR #26307 [v24.3.x] Fix archival STM shutdown race by @bashtanov
  • PR #26432 [v24.3.x] storage: fix index state truncate overflow by @andrwng
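
TCP keepalive lets the operating system probe an idle connection so that a dead peer is eventually detected. As a rough illustration of the socket option involved (this is not Redpanda's implementation, and the helper name `make_keepalive_socket` is hypothetical), a minimal Python sketch:

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create a TCP socket with SO_KEEPALIVE enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the OS to send keepalive probes on this connection when it is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    s = make_keepalive_socket()
    # A non-zero value means keepalive is enabled on the socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Probe timing (idle time, interval, probe count) is controlled by platform-specific options and system defaults, which this sketch leaves untouched.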

Improvements

  • Improved handling of rf=1 partitions in health reporting by @mmaslankaprv in #26178
  • The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26349
  • Made it easier to detect and diagnose node operation issues by @mmaslankaprv in #26176
  • #26202 Adds the storage_log_adjacent_segments_compacted metric for better observability into adjacent segment compaction. by @WillemKauf in #26204
  • #26252 rpk: decommission-status reports reallocation failure details by @daisukebe in #26263
  • rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26163
  • PR #26108 [v24.3.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
  • PR #26173 [v24.3.x] storage: add _segment_cleanly_compacted to probe (Manual backport) by @WillemKauf
  • PR #26176 [v24.3.x] Decommission status improvements by @mmaslankaprv
  • PR #26223 [v24.3.x] c/archival: wakeup upload loop after flush by @ztlpn
  • PR #26230 [v24.3.x] storage: output full segment in WARN log in offset_to_filepos.cc by @WillemKauf
  • PR #26339 [v24.3.x] c/partition_manager: added log entries when partition is being shutdown by @mmaslankaprv

Full Changelog: v24.3.14...v24.3.15

v24.2.25

18 Jun 14:39
f49a910

Bug Fixes

  • Allow partition balancing to operate when space management is enabled but the local target capacity is unset. by @ztlpn in #26305
  • Enable TCP keepalive for cloud storage connections. by @Lazin in #26410
  • Fix Redpanda crash if partition_autobalancing_concurrent_moves was set to 0. by @ztlpn in #26305
  • When Tiered Storage is paused and data is allowed to expire from local storage, there can be gaps between the last offset in Tiered Storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), Tiered Storage might get stuck with the following exception: Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N)). This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might contain a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26064
  • partition_autobalancing_mode=off now stops on-demand partition rebalance as well. by @ztlpn in #26305
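
The Tiered Storage fix above amounts to moving the upload start forward when the requested offset N falls before the start M of the translation range. A minimal sketch of that idea, assuming plain integer offsets; the function and parameter names are hypothetical and do not come from the Redpanda code base:

```python
def adjust_upload_start(requested_start: int, first_valid_offset: int) -> int:
    """Return the offset an upload should start from.

    If prefix truncation removed the requested start offset (N < M in the
    exception message), begin at the first available and valid offset instead.
    """
    return max(requested_start, first_valid_offset)


if __name__ == "__main__":
    # Requested start 5 was truncated away; first valid offset is 10.
    print(adjust_upload_start(5, 10))
    # Requested start 12 is still inside the valid range, so it is kept.
    print(adjust_upload_start(12, 10))
```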

Improvements

  • The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26350
  • #26135 Swap out an internal data structure in the storage layer to prevent oversized allocations and crashes when a large number of segments are present in a partition. by @WillemKauf in #26138
  • rpk transform now uses TinyGo v37 to compile Go to Wasm. by @r-vasquez in #26217
  • rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26214
  • PR #26010 [v24.2.x] create STMs based on original topic cfg by @bashtanov
  • PR #26340 [v24.2.x] c/partition_manager: added log entries when partition is being shutdown by @mmaslankaprv
  • PR #26344 [v24.2.x] raft/test/leadership_transfer_delay: increase tolerance by @bashtanov
  • PR #26345 [v24.2.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
  • PR #26354 Revert "[v24.2.x] raft/log_eviction_stm: avoid unnecessary wait on visible offset" by @bharathv
  • PR #26420 [v24.2.x] Fix archival STM shutdown race by @bashtanov
  • PR #26428 [backport] [v24.2.x] raft/c: warn on struck truncation. by @bharathv
  • PR #26442 [v24.2.x] storage: call reserve() in storage::range() by @WillemKauf

Full Changelog: v24.2.24...v24.2.25

v25.1.5

12 Jun 15:42
f0e74e1

Bug Fixes

  • Allow partition balancing to operate when space management is enabled but the local target capacity is unset. by @ztlpn in #26304
  • Enable TCP keepalive for cloud storage connections. by @Lazin in #26411
  • Fix Redpanda crash if partition_autobalancing_concurrent_moves was set to 0. by @ztlpn in #26304
  • Properly set TLS SNI information for Iceberg REST catalog connections. by @wdberkeley in #26370
  • Several Iceberg REST catalog configurations are now correctly marked as needing restart. by @andrwng in #26218
  • When Tiered Storage is paused and data is allowed to expire from local storage, there can be gaps between the last offset in Tiered Storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), Tiered Storage might get stuck with the following exception: Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N)). This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might contain a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26066
  • #26191 Fixes a bug in which a broker would crash during sliding window compaction when started with log_compaction_use_sliding_window=false and its value was later set to true without restarting. by @WillemKauf in #26197
  • partition_autobalancing_mode=off now stops on-demand partition rebalance as well. by @ztlpn in #26304

Improvements

  • Adds the storage_log_adjacent_segments_compacted metric for better observability into adjacent segment compaction. by @WillemKauf in #26203
  • Allow changing redpanda.iceberg.mode dynamically at runtime by @bharathv in #26171
  • Improved handling of rf=1 partitions in health reporting by @mmaslankaprv in #26101
  • The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26347
  • Made it easier to detect and diagnose node operation issues by @mmaslankaprv in #26147
  • Swap out an internal data structure in the storage layer to prevent oversized allocations and crashes when a large number of segments are present in a partition. by @WillemKauf in #26134
  • Better observability of state machine shutdown issues by @mmaslankaprv in #26413
  • rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26164
  • rpk: decommission-status reports reallocation failure details by @daisukebe in #26253
  • rpk: introduce logger to our rpk registry commands. (Works with -v) by @r-vasquez in #26251
  • PR #26081 [v25.1.x] archival: Fix archival_stm_snapshot installation by @Lazin
  • PR #26145 [v25.1.x] r/consensus: do not block leadership completely in maintenance mode by @mmaslankaprv
  • PR #26168 [v25.1.x] storage: CORE-10056: Remove contiguous allocations in lock_manager by @wdberkeley
  • PR #26267 [v25.1.x] Fix archival STM shutdown race by @bashtanov
  • PR #26275 [v25.1.x] r/consensus: stop consumable offset monitor by @mmaslankaprv
  • PR #26327 [v25.1.x] datalake: hold gate when interacting with catalog in schema manager by @mmaslankaprv

Full Changelog: v25.1.4...v25.1.5

v24.3.14

21 May 13:26
d6fbc26

Bug Fixes

  • When Tiered Storage is paused and data potentially expires from local storage, there can be gaps between the last offset in Tiered Storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), Tiered Storage might get stuck with the following exception: Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N)). This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might contain a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26065

Improvements

  • #26136 Swap out an internal data structure in the storage layer to prevent oversized allocations and crashes when a large number of segments are present in a partition. by @WillemKauf in #26137
  • PR #26166 [v24.3.x] storage: CORE-10056: Remove contiguous allocations in lock_manager by @wdberkeley

Full Changelog: v24.3.13...v24.3.14

v24.1.21

13 May 00:17
5f6634d

Bug Fixes

  • PR #26090 [v24.1.x] storage: fix non_data_timestamps with overflow (Manual backport) by @WillemKauf

Full Changelog: v24.1.20...v24.1.21

v25.1.4

10 May 04:08
6cd1d4e

Bug Fixes

Full Changelog: v25.1.3...v25.1.4

v24.3.13

10 May 02:38
5415223

Bug Fixes

Full Changelog: v24.3.12...v24.3.13

v24.2.24

10 May 02:37
82c549b

Bug Fixes

  • PR #26089 [v24.2.x] storage: fix non_data_timestamps with overflow (Manual backport) by @WillemKauf

Full Changelog: v24.2.23...v24.2.24

v24.2.23

09 May 05:03
7650b5a

Bug Fixes

  • Fix a bug that resulted in brokers being reported with is_alive=false in the broker list returned by admin API if node ids were reused. by @ztlpn in #25966
  • Fix a bug that resulted in spurious partition balancer actions during rolling restart by @ztlpn in #25992
  • Fixes an issue with local segment indexes that could result in incorrect Raft recovery, compaction, or fetches in topic partitions with very large offsets (greater than 4294967295). by @andrwng in #26024
  • #25984 rpk: Fixed *.tls.enabled flags to correctly disable TLS when set to "false". by @r-vasquez in #25985
  • #26015 Fixes a bug that led to the base_offset in segment_indexes being persisted as 0 during and after compaction. by @WillemKauf in #26036
  • Removes stale uncompacted tx_fence batches from consumer offsets topic. by @bharathv in #26042
  • PR #25299 [v24.2.x] k/topic_utils: do not iterate over the assignments set by @mmaslankaprv
  • PR #25950 [v24.2.x] storage: make_ghost_batches off by one by @andrwng
  • PR #25999 [v24.2.x] storage: consider uint32_t offset space in adjacent segment compaction (Manual backport) by @WillemKauf
  • PR #26000 [v24.2.x] rpk: prevent security role in cloud by @r-vasquez
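
The threshold 4294967295 quoted in the segment index fix is 2**32 - 1, the largest value an unsigned 32-bit integer can hold. The sketch below only illustrates why values past that limit are a problem for any 32-bit field; it assumes nothing about Redpanda's actual index layout, and both helper names are hypothetical:

```python
UINT32_MAX = 4294967295  # 2**32 - 1, the threshold quoted in the release note


def store_as_uint32(value: int) -> int:
    """Simulate writing a non-negative integer into a 32-bit unsigned field."""
    # Only the low 32 bits survive; higher bits are silently discarded.
    return value & 0xFFFFFFFF


def fits_in_uint32(value: int) -> bool:
    """Check whether a value can be stored in 32 bits without wrapping."""
    return 0 <= value <= UINT32_MAX


if __name__ == "__main__":
    for offset in (UINT32_MAX, UINT32_MAX + 1):
        print(offset, store_as_uint32(offset), fits_in_uint32(offset))
```

A value one past the limit wraps around to 0, which is the kind of silent corruption a range check such as `fits_in_uint32` is meant to catch.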

Full Changelog: v24.2.22...v24.2.23

v24.3.12

08 May 14:41
ff9333f

Bug Fixes

  • Fix a bug that resulted in brokers being reported with is_alive=false in the broker list returned by admin API if node ids were reused. by @ztlpn in #25964
  • Fix a bug that resulted in spurious partition balancer actions during rolling restart by @ztlpn in #25978
  • Fixes an issue with local segment indexes that could result in incorrect Raft recovery, compaction, or fetches in topic partitions with very large offsets (greater than 4294967295). by @andrwng in #26022
  • #25982 rpk: Fixed *.tls.enabled flags to correctly disable TLS when set to "false". by @r-vasquez in #25983
  • #26003 Fixes a bug in which the broker response to a ListOffsets timequery would return incorrect record offsets when considering a compressed record batch. by @WillemKauf in #26006
  • #26017 Fixes a bug that led to the base_offset in segment_indexes being persisted as 0 during and after compaction. by @WillemKauf in #26018
  • Removes stale uncompacted tx_fence batches from consumer offsets topic. by @bharathv in #26041
  • PR #25902 [v24.3.x] Fixed URL encoding in Azure Blob Store client by @mmaslankaprv
  • PR #25927 [v24.3.x] application: stop coordinator before translators by @andrwng
  • PR #25928 [v24.3.x] Use partition max collectible offset to gate FPM by @mmaslankaprv
  • PR #25948 [v24.3.x] storage: make_ghost_batches off by one by @andrwng
  • PR #25967 [v24.3.x] create STMs based on original topic cfg by @bashtanov
  • PR #25998 [v24.3.x] storage: consider uint32_t offset space in adjacent segment compaction (Manual backport) by @WillemKauf
  • PR #26001 [v24.3.x] rpk: prevent security role in cloud by @r-vasquez
  • PR #26047 [v24.3.x] tx/chaos: deflake TxSubscribeTest by @bharathv
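
The release note for the *.tls.enabled flags does not describe the root cause. One common way a flag set to "false" ends up enabling a feature is treating any non-empty string as true. The sketch below illustrates that general pitfall and an explicit parser; it is not rpk's code (rpk is written in Go), and `parse_bool_flag` is a hypothetical helper:

```python
def parse_bool_flag(raw: str) -> bool:
    """Parse a textual boolean flag value explicitly (hypothetical helper)."""
    value = raw.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean flag value: {raw!r}")


if __name__ == "__main__":
    # Pitfall: truthiness of a non-empty string ignores its content.
    print(bool("false"))
    # Explicit parsing respects the intended meaning.
    print(parse_bool_flag("false"))
```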

Full Changelog: v24.3.11...v24.3.12