@rluvaton rluvaton commented Nov 4, 2025

Which issue does this PR close?

N/A

Rationale for this change

The default implementations walk the iterator element by element to get the value, while we can do that in constant time.

What changes are included in this PR?

Override nth, nth_back, last and count.

Are these changes tested?

Covered by the existing tests in this file, which I added in a previous PR.

Are there any user-facing changes?

Nope


Extracted from the following PR, which I will probably close because it is not faster locally in some cases:

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 4, 2025
    pub fn new(array: T) -> Self {
        let len = array.len();
-       let logical_nulls = array.logical_nulls();
+       let logical_nulls = array.logical_nulls().filter(|x| x.null_count() > 0);

To avoid checking for nulls when a null buffer exists but contains no nulls.
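
As a rough illustration of the filter above, the sketch below uses a hypothetical `Nulls` struct and a hypothetical `effective_nulls` helper (neither is an arrow type or function; they only stand in for a validity buffer with a `null_count`). The point is only that `Option::filter` turns "buffer present but zero nulls" into `None`, so later code can take the no-nulls path.

```rust
/// Hypothetical stand-in for a validity buffer; this is NOT arrow's NullBuffer.
struct Nulls {
    valid: Vec<bool>,
}

impl Nulls {
    /// Number of entries marked as null (i.e. not valid).
    fn null_count(&self) -> usize {
        self.valid.iter().filter(|v| !**v).count()
    }
}

/// Hypothetical helper: keep the buffer only if it actually contains a null.
fn effective_nulls(nulls: Option<Nulls>) -> Option<Nulls> {
    nulls.filter(|x| x.null_count() > 0)
}
```

With this shape, a buffer whose entries are all valid is treated the same as having no buffer at all.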

Comment on lines +124 to +126
fn last(mut self) -> Option<Self::Item> {
    self.next_back()
}

The default implementation runs in O(n) and does not (currently) take advantage of this being a DoubleEndedIterator, whereas we can do it in constant time.

This is the default implementation:

#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
fn last(self) -> Option<Self::Item>
where
    Self: Sized,
{
    #[inline]
    fn some<T>(_: Option<T>, x: T) -> Option<T> {
        Some(x)
    }

    self.fold(None, some)
}

From Rust source code
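
A minimal sketch of the same idea on a toy iterator, assuming an index-based design with `current` and `current_end` fields. `SliceIter` is a hypothetical type for illustration, not the arrow iterator. Because `next_back` is O(1), overriding `last` to call it avoids the fold over every remaining element.

```rust
/// Hypothetical index-based iterator over a slice; not the arrow type.
struct SliceIter<'a> {
    values: &'a [i32],
    current: usize,
    current_end: usize,
}

impl<'a> SliceIter<'a> {
    fn new(values: &'a [i32]) -> Self {
        Self { values, current: 0, current_end: values.len() }
    }
}

impl<'a> Iterator for SliceIter<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        if self.current == self.current_end {
            return None;
        }
        let v = self.values[self.current];
        self.current += 1;
        Some(v)
    }

    // O(1): take the back element instead of folding over everything.
    fn last(mut self) -> Option<i32> {
        self.next_back()
    }
}

impl<'a> DoubleEndedIterator for SliceIter<'a> {
    fn next_back(&mut self) -> Option<i32> {
        if self.current == self.current_end {
            return None;
        }
        self.current_end -= 1;
        Some(self.values[self.current_end])
    }
}
```

The override respects elements already consumed from either end, since it reads whatever `current_end` currently points at.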

Comment on lines +128 to +133
fn count(self) -> usize
where
    Self: Sized,
{
    self.len()
}

The default count implementation does not (currently) take advantage of this being an ExactSizeIterator, so it runs in O(n), whereas we can do it in constant time:

#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
fn count(self) -> usize
where
    Self: Sized,
{
    self.fold(
        0,
        #[rustc_inherit_overflow_checks]
        |count, _| count + 1,
    )
}

From Rust source code
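
To make the difference concrete, here is a small sketch on a hypothetical `IndexIter` (an assumption for illustration, not the arrow iterator) that yields the indices in `current..current_end`. With an exact `size_hint`, `ExactSizeIterator::len` is a subtraction, so `count` can simply return it.

```rust
/// Hypothetical iterator yielding the indices in `current..current_end`.
struct IndexIter {
    current: usize,
    current_end: usize,
}

impl Iterator for IndexIter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }

    // Exact bounds are required for the default `ExactSizeIterator::len`.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.current_end - self.current;
        (remaining, Some(remaining))
    }

    // O(1): the number of remaining items is already known.
    fn count(self) -> usize
    where
        Self: Sized,
    {
        self.len()
    }
}

impl ExactSizeIterator for IndexIter {}
```

Counting a range of `usize::MAX` elements returns immediately with this override, whereas the fold-based default would have to step through every element.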

Comment on lines +106 to +122
fn nth(&mut self, n: usize) -> Option<Self::Item> {
    // Check if we can advance to the desired offset
    match self.current.checked_add(n) {
        // Yes, and still within bounds
        Some(new_current) if new_current < self.current_end => {
            self.current = new_current;
        }

        // Either overflow or would exceed current_end
        _ => {
            self.current = self.current_end;
            return None;
        }
    }

    self.next()
}

The default implementation runs in O(n), while we can do it in constant time.

We also can't implement advance_by, which the default nth implementation uses, because it is unstable.

This is the default implementation:

#[inline]
#[unstable(feature = "iter_advance_by", reason = "recently added", issue = "77404")]
fn advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
    /// Helper trait to specialize `advance_by` via `try_fold` for `Sized` iterators.
    trait SpecAdvanceBy {
        fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>>;
    }

    impl<I: Iterator + ?Sized> SpecAdvanceBy for I {
        default fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
            for i in 0..n {
                if self.next().is_none() {
                    // SAFETY: `i` is always less than `n`.
                    return Err(unsafe { NonZero::new_unchecked(n - i) });
                }
            }
            Ok(())
        }
    }

    impl<I: Iterator> SpecAdvanceBy for I {
        fn spec_advance_by(&mut self, n: usize) -> Result<(), NonZero<usize>> {
            let Some(n) = NonZero::new(n) else {
                return Ok(());
            };

            let res = self.try_fold(n, |n, _| NonZero::new(n.get() - 1));

            match res {
                None => Ok(()),
                Some(n) => Err(n),
            }
        }
    }

    self.spec_advance_by(n)
}

#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
fn nth(&mut self, n: usize) -> Option<Self::Item> {
    self.advance_by(n).ok()?;
    self.next()
}

From Rust source code
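
The override from the diff can be tried in isolation with a sketch like the one below. `IndexIter` is a hypothetical stand-in that yields the index itself, so no array type is needed; only the `nth` body mirrors the code under review.

```rust
/// Hypothetical iterator yielding the indices in `current..current_end`.
struct IndexIter {
    current: usize,
    current_end: usize,
}

impl Iterator for IndexIter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }

    // O(1): jump straight to the target index instead of calling `next` n times.
    fn nth(&mut self, n: usize) -> Option<usize> {
        match self.current.checked_add(n) {
            // The target index is still inside the remaining range.
            Some(new_current) if new_current < self.current_end => {
                self.current = new_current;
            }
            // Overflow or past the end: exhaust the iterator, as the default does.
            _ => {
                self.current = self.current_end;
                return None;
            }
        }
        self.next()
    }
}
```

Using `checked_add` matters once some elements have been consumed: `current + n` could otherwise overflow for a very large `n`.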

}
}

fn nth_back(&mut self, n: usize) -> Option<Self::Item> {

Same idea as in nth.
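
The body of nth_back is not shown in this excerpt, so the following is only one way the mirrored logic might look, written against a hypothetical `IndexIter` rather than the arrow iterator. It uses `checked_sub` on `current_end` where nth uses `checked_add` on `current`.

```rust
/// Hypothetical iterator yielding the indices in `current..current_end`.
struct IndexIter {
    current: usize,
    current_end: usize,
}

impl Iterator for IndexIter {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        let i = self.current;
        self.current += 1;
        Some(i)
    }
}

impl DoubleEndedIterator for IndexIter {
    fn next_back(&mut self) -> Option<usize> {
        if self.current == self.current_end {
            return None;
        }
        self.current_end -= 1;
        Some(self.current_end)
    }

    // Assumed mirror of `nth`: move the end backwards by n in one step.
    fn nth_back(&mut self, n: usize) -> Option<usize> {
        match self.current_end.checked_sub(n) {
            // At least one element remains after skipping n from the back.
            Some(new_end) if new_end > self.current => {
                self.current_end = new_end;
            }
            // Underflow or nothing left: exhaust the iterator.
            _ => {
                self.current_end = self.current;
                return None;
            }
        }
        self.next_back()
    }
}
```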
