perf: override `ArrayIter` default impl for `nth`, `nth_back`, `last` and `count` #8785

rluvaton · 2025-11-04T22:37:08Z

Which issue does this PR close?

N/A

Rationale for this change

The default implementations of these methods step through the iterator one element at a time to reach the value, while `ArrayIter` can produce the same result in constant time.

What changes are included in this PR?

Override `nth`, `nth_back`, `last` and `count` for `ArrayIter`.

Are these changes tested?

Yes, by the existing tests in this file, which I added in a previous PR.

Are there any user-facing changes?

Nope

Extracted from the following PR, which I will probably close because it is not faster locally in some cases:

perf: override default implementation in ArrayIter with dedicated null/non-nullable versions #8697


rluvaton · 2025-11-04T22:37:43Z

arrow-array/src/iterator.rs

    pub fn new(array: T) -> Self {
        let len = array.len();
-        let logical_nulls = array.logical_nulls();
+        let logical_nulls = array.logical_nulls().filter(|x| x.null_count() > 0);


This avoids per-element null checks when a null buffer exists but contains no nulls.
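The effect of the `filter` call can be illustrated without the arrow crate. The sketch below is only an illustration: `Nulls` and `effective_nulls` are hypothetical stand-ins, not arrow types, and the validity mask is modelled as a plain `Vec<bool>`.

```rust
/// Hypothetical stand-in for a null buffer: `true` marks a valid slot.
struct Nulls {
    valid: Vec<bool>,
}

impl Nulls {
    /// Number of slots marked as null.
    fn null_count(&self) -> usize {
        self.valid.iter().filter(|v| !**v).count()
    }
}

/// Keep the null buffer only when it actually contains at least one null,
/// mirroring `array.logical_nulls().filter(|x| x.null_count() > 0)`.
fn effective_nulls(nulls: Option<Nulls>) -> Option<Nulls> {
    nulls.filter(|n| n.null_count() > 0)
}
```

With the buffer collapsed to `None`, the iterator can take the same path it would take for an array that has no null buffer at all.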

rluvaton · 2025-11-04T22:41:15Z

arrow-array/src/iterator.rs

+    fn last(mut self) -> Option<Self::Item> {
+        self.next_back()
+    }


The default implementation is O(n) and does not (currently) take advantage of the iterator being a `DoubleEndedIterator`, while we can do it in constant time.

This is the default impl:

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn last(self) -> Option<Self::Item>
    where
        Self: Sized,
    {
        #[inline]
        fn some<T>(_: Option<T>, x: T) -> Option<T> {
            Some(x)
        }

        self.fold(None, some)
    }

from Rust source code
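As a rough illustration of the override, here is a minimal sketch on a hypothetical stand-in type. `SliceIter` is not `ArrayIter`; it only borrows the `current` / `current_end` field names from the diff and walks a plain `&[i32]`.

```rust
/// Hypothetical stand-in for `ArrayIter`: walks a slice by index.
struct SliceIter<'a> {
    values: &'a [i32],
    current: usize,
    current_end: usize,
}

impl<'a> SliceIter<'a> {
    fn new(values: &'a [i32]) -> Self {
        Self { values, current: 0, current_end: values.len() }
    }
}

impl<'a> Iterator for SliceIter<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        if self.current == self.current_end {
            return None;
        }
        let v = self.values[self.current];
        self.current += 1;
        Some(v)
    }

    // O(1): the last remaining element is exactly what `next_back` yields.
    fn last(mut self) -> Option<i32> {
        self.next_back()
    }
}

impl<'a> DoubleEndedIterator for SliceIter<'a> {
    fn next_back(&mut self) -> Option<i32> {
        if self.current == self.current_end {
            return None;
        }
        self.current_end -= 1;
        Some(self.values[self.current_end])
    }
}
```

The override respects elements already consumed from either end, because `next_back` only looks at the remaining `current..current_end` range.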

rluvaton · 2025-11-04T22:43:42Z

arrow-array/src/iterator.rs

+    fn count(self) -> usize
+    where
+        Self: Sized,
+    {
+        self.len()
+    }


The default impl of `count` does not (currently) take advantage of the iterator being an `ExactSizeIterator` and runs in O(n), while we can do it in constant time:

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn count(self) -> usize
    where
        Self: Sized,
    {
        self.fold(
            0,
            #[rustc_inherit_overflow_checks]
            |count, _| count + 1,
        )
    }

From Rust source code
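A minimal sketch of the same idea on a hypothetical stand-in type follows. `IndexIter` is an assumption made for illustration and simply yields the indices in `current..current_end`; the point is that `count` can delegate to `ExactSizeIterator::len` once `size_hint` is exact.

```rust
/// Hypothetical stand-in for `ArrayIter`: yields indices in `current..current_end`.
struct IndexIter {
    current: usize,
    current_end: usize,
}

impl Iterator for IndexIter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }

    // Exact bounds are required for `ExactSizeIterator::len` to be correct.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.current_end - self.current;
        (remaining, Some(remaining))
    }

    // O(1): report the remaining length instead of folding over every item.
    fn count(self) -> usize {
        self.len()
    }
}

impl ExactSizeIterator for IndexIter {}
```

Because `len` is derived from `size_hint`, the result already accounts for any elements consumed before `count` is called.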

rluvaton · 2025-11-04T22:46:55Z

arrow-array/src/iterator.rs

+    fn nth(&mut self, n: usize) -> Option<Self::Item> {
+        // Check if we can advance to the desired offset
+        match self.current.checked_add(n) {
+            // Yes, and still within bounds
+            Some(new_current) if new_current < self.current_end => {
+                self.current = new_current;
+            }
+
+            // Either overflow or would exceed current_end
+            _ => {
+                self.current = self.current_end;
+                return None;
+            }
+        }
+
+        self.next()
+    }


The default implementation is O(n), while we can do it in constant time.

We also can't implement `advance_by`, which the default `nth` implementation calls, because it is unstable.

This is the default implementation:

    #[inline]
    #[unstable(feature = "iter_advance_by", reason = "recently added", issue = "77404")]
    fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
        /// Helper trait to specialize `advance_by` via `try_fold` for `Sized` iterators.
        trait SpecAdvanceBy {
            fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>;
        }

        impl<I: Iterator + ?Sized> SpecAdvanceBy for I {
            default fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
                for i in 0..n {
                    if self.next().is_none() {
                        // SAFETY: `i` is always less than `n`.
                        return Err(unsafe { NonZero::new_unchecked(n - i) });
                    }
                }
                Ok(())
            }
        }

        impl<I: Iterator> SpecAdvanceBy for I {
            fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
                let Some(n) = NonZero::new(n) else {
                    return Ok(());
                };

                let res = self.try_fold(n, |n, _| NonZero::new(n.get() - 1));

                match res {
                    None => Ok(()),
                    Some(n) => Err(n),
                }
            }
        }

        self.spec_advance_by(n)
    }

    #[inline]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn nth(&mut self, n: usize) -> Option<Self::Item> {
        self.advance_by(n).ok()?;
        self.next()
    }

From Rust source code
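To see how the constant-time `nth` from the diff behaves at the boundaries, the sketch below applies the same `checked_add` logic to a hypothetical stand-in type. `Cursor` is an assumption for illustration and yields the indices in `current..current_end`; only the body of `nth` is taken from the diff.

```rust
/// Hypothetical stand-in for `ArrayIter`: yields indices in `current..current_end`.
struct Cursor {
    current: usize,
    current_end: usize,
}

impl Iterator for Cursor {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }

    fn nth(&mut self, n: usize) -> Option<usize> {
        // Check if we can advance to the desired offset
        match self.current.checked_add(n) {
            // Yes, and still within bounds
            Some(new_current) if new_current < self.current_end => {
                self.current = new_current;
            }
            // Either overflow or would exceed current_end:
            // mark the iterator as exhausted, as the default impl would.
            _ => {
                self.current = self.current_end;
                return None;
            }
        }
        self.next()
    }
}
```

Setting `current` to `current_end` on the failure path keeps the observable behavior aligned with the default implementation, which consumes every remaining element before returning `None`.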

rluvaton · 2025-11-04T22:47:09Z

arrow-array/src/iterator.rs

        }
    }
+
+    fn nth_back(&mut self, n: usize) -> Option<Self::Item> {


Same idea as in nth
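The body of `nth_back` is not shown in this excerpt. A mirrored version of the `nth` logic might look like the sketch below. This is an assumption, not the PR's code: `BackCursor` is a hypothetical stand-in that yields the indices in `current..current_end`, and the use of `checked_sub` is a guess at how the symmetric bounds check could be written.

```rust
/// Hypothetical stand-in for `ArrayIter`: yields indices in `current..current_end`.
struct BackCursor {
    current: usize,
    current_end: usize,
}

impl Iterator for BackCursor {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }
}

impl DoubleEndedIterator for BackCursor {
    fn next_back(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        self.current_end -= 1;
        Some(self.current_end)
    }

    // Assumed mirror of `nth`: skip `n` elements from the back in O(1).
    fn nth_back(&mut self, n: usize) -> Option<usize> {
        match self.current_end.checked_sub(n) {
            // Still at least one element left after skipping
            Some(new_end) if self.current < new_end => {
                self.current_end = new_end;
            }
            // Underflow or nothing left: mark the iterator as exhausted
            _ => {
                self.current_end = self.current;
                return None;
            }
        }
        self.next_back()
    }
}
```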

perf: override ArrayIter default impl for nth, nth_back, last…

8af989e

… and `count`

github-actions bot added the arrow Changes to the arrow crate label Nov 4, 2025

add inline to match rust impl

565fd1f
