Releases: redpanda-data/redpanda
v24.3.15
Bug Fixes
- Allow partition balancing to operate when space management was enabled but local target capacity was unset. by @ztlpn in #26306
- Enable TCP keepalive for cloud storage connections. by @Lazin in #26409
- Fix Redpanda crash if `partition_autobalancing_concurrent_moves` was set to 0. by @ztlpn in #26306
- #26190 Fixes a bug in which a broker would crash during sliding window compaction when started with `log_compaction_use_sliding_window=false` and its value was later set to `true` without restarting. by @WillemKauf in #26196
- `partition_autobalancing_mode=off` now stops on-demand partition rebalance as well. by @ztlpn in #26306
- PR #26082 [v24.3.x] archival: Fix archival_stm_snapshot installation by @Lazin
- PR #26277 [v24.3.x] r/consensus: stop consumable offset monitor by @mmaslankaprv
- PR #26307 [v24.3.x] Fix archival STM shutdown race by @bashtanov
- PR #26432 [v24.3.x] storage: fix index state truncate overflow by @andrwng
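
For readers unfamiliar with the TCP keepalive fix listed above, the following is a minimal sketch of what enabling keepalive on a socket looks like at the OS level. It uses Python's standard `socket` module and is not Redpanda's implementation; the helper name `enable_keepalive` and the timer values are illustrative assumptions.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and tune its timers where the platform allows.

    The timer values are illustrative assumptions, not Redpanda defaults.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are platform-specific, so each one is
    # applied only if this Python build defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the drop
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With keepalive enabled, the kernel probes an idle connection and reports a dead peer instead of leaving the connection open indefinitely.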
Improvements
- Improved handling of health reporting for rf=1 partitions by @mmaslankaprv in #26178
- The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26349
- Made it easier to detect and diagnose node operation issues by @mmaslankaprv in #26176
- #26202 Adds the `storage_log_adjacent_segments_compacted` metric for better observability into adjacent segment compaction. by @WillemKauf in #26204
- #26252 rpk: `decommission-status` reports reallocation failure details by @daisukebe in #26263
- rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26163
- PR #26108 [v24.3.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
- PR #26173 [v24.3.x] `storage`: add `_segment_cleanly_compacted` to `probe` (Manual backport) by @WillemKauf
- PR #26176 [v24.3.x] Decommission status improvements by @mmaslankaprv
- PR #26223 [v24.3.x] c/archival: wakeup upload loop after flush by @ztlpn
- PR #26230 [v24.3.x] `storage`: output full `segment` in `WARN` log in `offset_to_filepos.cc` by @WillemKauf
- PR #26339 [v24.3.x] c/partition_manager: added log entries when partition is being shutdown by @mmaslankaprv
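
The REASSIGNMENT_IN_PROGRESS change above affects how a client might interpret per-partition results. The sketch below shows one way to separate partitions that are worth retrying from those that failed outright. It assumes a hypothetical response shape, a dict mapping `(topic, partition)` to a Kafka error code, and is not tied to any client library's API. Error code 60 is the Kafka protocol value for REASSIGNMENT_IN_PROGRESS.

```python
NONE = 0
REASSIGNMENT_IN_PROGRESS = 60  # Kafka protocol error code


def split_by_outcome(partition_errors):
    """Group per-partition results into accepted, retry-later, and failed.

    `partition_errors` is a hypothetical shape used for illustration:
    {(topic, partition): error_code}.
    """
    accepted, retry_later, failed = [], [], []
    for topic_partition, code in sorted(partition_errors.items()):
        if code == NONE:
            accepted.append(topic_partition)
        elif code == REASSIGNMENT_IN_PROGRESS:
            # The balancer is already moving this partition; try again later.
            retry_later.append(topic_partition)
        else:
            failed.append(topic_partition)
    return accepted, retry_later, failed


response = {("foo", 0): 0, ("foo", 1): 60, ("bar", 0): 3}
accepted, retry_later, failed = split_by_outcome(response)
```

Treating the code as retryable is a design choice for this sketch: the balancer's move is expected to finish, after which the same request may succeed.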
Full Changelog: v24.3.14...v24.3.15
v24.2.25
Bug Fixes
- Allow partition balancing to operate when space management was enabled but local target capacity was unset. by @ztlpn in #26305
- Enable TCP keepalive for cloud storage connections. by @Lazin in #26410
- Fix Redpanda crash if `partition_autobalancing_concurrent_moves` was set to 0. by @ztlpn in #26305
- When Tiered Storage is paused and data is allowed to expire from local storage, there will be gaps between the last offset in tiered storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), tiered storage might get stuck with the following exception: `Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N))`. This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might hold a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26064
- `partition_autobalancing_mode=off` now stops on-demand partition rebalance as well. by @ztlpn in #26305
Improvements
- The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26350
- #26135 Swap out an internal data structure in the `storage` layer to prevent oversized allocations and crashes when a large number of `segment`s are present in a `partition`. by @WillemKauf in #26138
- `rpk transform` now uses tinygo v37 to compile golang to Wasm. by @r-vasquez in #26217
- rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26214
- PR #26010 [v24.2.x] create STMs based on original topic cfg by @bashtanov
- PR #26340 [v24.2.x] c/partition_manager: added log entries when partition is being shutdown by @mmaslankaprv
- PR #26344 [v24.2.x] raft/test/leadership_transfer_delay: increase tolerance by @bashtanov
- PR #26345 [v24.2.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
- PR #26354 Revert "[v24.2.x] raft/log_eviction_stm: avoid unnecessary wait on visible offset" by @bharathv
- PR #26420 [v24.2.x] Fix archival STM shutdown race by @bashtanov
- PR #26428 [backport] [v24.2.x] raft/c: warn on struck truncation. by @bharathv
- PR #26442 [v24.2.x] `storage`: call `reserve()` in `storage::range()` by @WillemKauf
Full Changelog: v24.2.24...v24.2.25
v25.1.5
Features
- Allow use of `rpk cluster config get` in cloud clusters. by @andresaristizabal in #26161
- PR #26078 [v25.1.x] make ntp_callbacks actually support multiple callbacks by @bashtanov
- PR #26183 [v25.1.x] c/archival: wakeup upload loop after flush by @ztlpn
- PR #26211 [v25.1.x] Improve safe pause resume by @Lazin
- PR #26225 [v25.1.x] kafka/debug: add a debug end point for offset_for_leader_epoch by @bharathv
- PR #26229 [v25.1.x] `storage`: output full `segment` in `WARN` log in `offset_to_filepos.cc` by @WillemKauf
- PR #26438 Backport #26426 to v25.1.x by @wdberkeley
Bug Fixes
- Allow partition balancing to operate when space management was enabled but local target capacity was unset. by @ztlpn in #26304
- Enable TCP keepalive for cloud storage connections. by @Lazin in #26411
- Fix Redpanda crash if `partition_autobalancing_concurrent_moves` was set to 0. by @ztlpn in #26304
- Properly set TLS SNI information for Iceberg REST catalog connections. by @wdberkeley in #26370
- Several Iceberg REST catalog configurations are now correctly marked as needing restart. by @andrwng in #26218
- When Tiered Storage is paused and data is allowed to expire from local storage, there will be gaps between the last offset in tiered storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), tiered storage might get stuck with the following exception: `Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N))`. This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might hold a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26066
- #26191 Fixes a bug in which a broker would crash during sliding window compaction when started with `log_compaction_use_sliding_window=false` and its value was later set to `true` without restarting. by @WillemKauf in #26197
- `partition_autobalancing_mode=off` now stops on-demand partition rebalance as well. by @ztlpn in #26304
Improvements
- Adds the `storage_log_adjacent_segments_compacted` metric for better observability into adjacent segment compaction. by @WillemKauf in #26203
- Allow changing `redpanda.iceberg.mode` dynamically at runtime by @bharathv in #26171
- Improved handling of health reporting for rf=1 partitions by @mmaslankaprv in #26101
- The per-partition response in AlterPartitionReassignmentsResponse now uses the REASSIGNMENT_IN_PROGRESS error code if a reassignment is requested while the Partition Balancer is moving partition replicas. by @bashtanov in #26347
- Made it easier to detect and diagnose node operation issues by @mmaslankaprv in #26147
- Swap out an internal data structure in the `storage` layer to prevent oversized allocations and crashes when a large number of `segment`s are present in a `partition`. by @WillemKauf in #26134
- Better observability of state machine shutdown issues by @mmaslankaprv in #26413
- rpk debug bundle: improve reliability of debug bundle collection in k8s environments. by @r-vasquez in #26164
- rpk: `decommission-status` reports reallocation failure details by @daisukebe in #26253
- rpk: introduce logger to our `rpk registry` commands. (Works with -v) by @r-vasquez in #26251
- PR #26081 [v25.1.x] archival: Fix archival_stm_snapshot installation by @Lazin
- PR #26145 [v25.1.x] r/consensus: do not block leadership completely in maintenance mode by @mmaslankaprv
- PR #26168 [v25.1.x] storage: CORE-10056: Remove contiguous allocations in lock_manager by @wdberkeley
- PR #26267 [v25.1.x] Fix archival STM shutdown race by @bashtanov
- PR #26275 [v25.1.x] r/consensus: stop consumable offset monitor by @mmaslankaprv
- PR #26327 [v25.1.x] datalake: hold gate when interacting with catalog in schema manager by @mmaslankaprv
Full Changelog: v25.1.4...v25.1.5
v24.3.14
Bug Fixes
- When Tiered Storage is paused and data potentially expires from local storage, there can be gaps between the last offset in tiered storage and the first offset in local storage. If local storage was truncated in the middle of a segment (e.g. by time-based retention or via trim-prefix/delete records commands), tiered storage might get stuck with the following exception: `Failed to schedule upload: std::runtime_error (ntp {kafka/foo/0}: log offset N is outside the translation range (starting at M > N))`. This is fixed by adjusting the upload start offset to the first available and valid offset. Although the segment might hold a bit more data, other information about that data (i.e. offset translation) is gone with prefix truncation. by @nvartolomei in #26065
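
The adjustment described in the fix above can be pictured as a simple clamp. This is a minimal sketch and not the actual Redpanda code. The function name and parameters are hypothetical: N stands for the offset the uploader wanted to start from, and M for the first offset that still has valid offset translation state after prefix truncation.

```python
def adjust_upload_start(requested_start, first_valid_local_offset):
    """Return the offset an upload should start from.

    If the requested offset (N) falls before the first offset that local
    storage can still translate (M), start from M instead of failing.
    Both parameter names are hypothetical, chosen for illustration.
    """
    return max(requested_start, first_valid_local_offset)


# N = 100 was truncated away; local storage now starts at M = 250.
start_after_truncation = adjust_upload_start(100, 250)

# No gap: the requested offset is still available locally.
start_without_gap = adjust_upload_start(300, 250)
```

In the first case the upload resumes from 250, accepting the gap between tiered and local storage rather than getting stuck on an offset that can no longer be translated.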
Improvements
- #26136 Swap out an internal data structure in the `storage` layer to prevent oversized allocations and crashes when a large number of `segment`s are present in a `partition`. by @WillemKauf in #26137
- PR #26166 [v24.3.x] storage: CORE-10056: Remove contiguous allocations in lock_manager by @wdberkeley
Full Changelog: v24.3.13...v24.3.14
v24.1.21
Bug Fixes
- PR #26090 [v24.1.x] `storage`: fix `non_data_timestamps` with overflow (Manual backport) by @WillemKauf
Full Changelog: v24.1.20...v24.1.21
v25.1.4
Bug Fixes
- PR #26085 [v25.1.x] `storage`: fix `non_data_timestamps` with overflow by @WillemKauf
Full Changelog: v25.1.3...v25.1.4
v24.3.13
Bug Fixes
- PR #26086 [v24.3.x] `storage`: fix `non_data_timestamps` with overflow by @WillemKauf
Full Changelog: v24.3.12...v24.3.13
v24.2.24
Bug Fixes
- PR #26089 [v24.2.x] `storage`: fix `non_data_timestamps` with overflow (Manual backport) by @WillemKauf
Full Changelog: v24.2.23...v24.2.24
v24.2.23
Bug Fixes
- Fix a bug that resulted in brokers being reported with `is_alive=false` in the broker list returned by admin API if node ids were reused. by @ztlpn in #25966
- Fix a bug that resulted in spurious partition balancer actions during rolling restart by @ztlpn in #25992
- Fixes an issue with local segment indexes that could result in incorrect Raft recovery, compaction, or fetches in topic partitions with very large offsets (greater than 4294967295). by @andrwng in #26024
- #25984 rpk: Fixed `*.tls.enabled` flags to correctly disable TLS when set to "false". by @r-vasquez in #25985
- #26015 Fixes a bug that led to the `base_offset` in `segment_index`es being persisted as 0 during and after compaction. by @WillemKauf in #26036
- Removes stale uncompacted tx_fence batches from consumer offsets topic. by @bharathv in #26042
- PR #25299 [v24.2.x] k/topic_utils: do not iterate over the assignments set by @mmaslankaprv
- PR #25950 [v24.2.x] storage: make_ghost_batches off by one by @andrwng
- PR #25999 [v24.2.x] `storage`: consider `uint32_t` offset space in adjacent segment compaction (Manual backport) by @WillemKauf
- PR #26000 [v24.2.x] rpk: prevent security role in cloud by @r-vasquez
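
The threshold 4294967295 mentioned in the segment index fix above is the largest value a 32-bit unsigned integer can hold. The notes do not describe the exact mechanism of the bug, so the following is only a generic illustration of how a value above that limit wraps around when stored in 32 bits. The helper `as_u32` is hypothetical.

```python
U32_MAX = 0xFFFFFFFF  # 4294967295, the largest 32-bit unsigned value


def as_u32(value):
    """Simulate storing a non-negative integer in a 32-bit unsigned field."""
    return value & U32_MAX


largest_safe = as_u32(4_294_967_295)  # fits: unchanged
wrapped_zero = as_u32(4_294_967_296)  # one past the limit wraps to 0
wrapped_five = as_u32(4_294_967_301)  # six past the limit wraps to 5
```

An index that silently wraps like this would point at the wrong position in a log, which matches the kinds of symptoms listed in the fix (incorrect recovery, compaction, or fetches).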
Full Changelog: v24.2.22...v24.2.23
v24.3.12
Bug Fixes
- Fix a bug that resulted in brokers being reported with `is_alive=false` in the broker list returned by admin API if node ids were reused. by @ztlpn in #25964
- Fix a bug that resulted in spurious partition balancer actions during rolling restart by @ztlpn in #25978
- Fixes an issue with local segment indexes that could result in incorrect Raft recovery, compaction, or fetches in topic partitions with very large offsets (greater than 4294967295). by @andrwng in #26022
- #25982 rpk: Fixed `*.tls.enabled` flags to correctly disable TLS when set to "false". by @r-vasquez in #25983
- #26003 Fixes a bug in which the broker response to a `ListOffsets` timequery would return incorrect record offsets when considering a compressed record batch. by @WillemKauf in #26006
- #26017 Fixes a bug that led to the `base_offset` in `segment_index`es being persisted as 0 during and after compaction. by @WillemKauf in #26018
- Removes stale uncompacted tx_fence batches from consumer offsets topic. by @bharathv in #26041
- PR #25902 [v24.3.x] Fixed URL encoding in Azure Blob Store client by @mmaslankaprv
- PR #25927 [v24.3.x] application: stop coordinator before translators by @andrwng
- PR #25928 [v24.3.x] Use partition max collectible offset to gate FPM by @mmaslankaprv
- PR #25948 [v24.3.x] storage: make_ghost_batches off by one by @andrwng
- PR #25967 [v24.3.x] create STMs based on original topic cfg by @bashtanov
- PR #25998 [v24.3.x] `storage`: consider `uint32_t` offset space in adjacent segment compaction (Manual backport) by @WillemKauf
- PR #26001 [v24.3.x] rpk: prevent security role in cloud by @r-vasquez
- PR #26047 [v24.3.x] tx/chaos: deflake TxSubscribeTest by @bharathv
Full Changelog: v24.3.11...v24.3.12