Skip to content

Conversation

friendlymatthew
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

Mapping from an iterator of Variants into a VariantArray currently requires a lot of ceremony. This PR abstracts over this.

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Oct 15, 2025
@friendlymatthew
Copy link
Contributor Author

cc @alamb @scovich

Re: #8611 (comment)

I'll throw in the other impl From in the second commit

@friendlymatthew friendlymatthew force-pushed the friendlymatthew/impl-into-iter-variant-array branch from 1722f43 to ece0ffb Compare October 15, 2025 14:26
Copy link
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I'll throw in the other impl From in the second commit

I'm not sure this is needed? VariantArray::from_iterator is pretty clear, and a VariantArray::from wouldn't be able to do anything smarter (can't just take ownership of the vec's data, for example)

Comment on lines 453 to 455
let mut b = VariantArrayBuilder::new(0);
b.extend(iter.into_iter().map(Some));
b.build()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this work?

Suggested change
let mut b = VariantArrayBuilder::new(0);
b.extend(iter.into_iter().map(Some));
b.build()
Self::from_iter(iter.into_iter().map(Some))

Variant::ShortString(ShortString::try_new("norm").unwrap()),
];

let variant_array = VariantArray::from_iter(v.into_iter());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't need that into_iter call, the trait takes IntoIterator as input?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was just testing something and forgot to revert that. Luckily clippy caught it

Comment on lines 443 to 456
impl<'m, 'v> FromIterator<Option<Variant<'m, 'v>>> for VariantArray {
fn from_iter<T: IntoIterator<Item = Option<Variant<'m, 'v>>>>(iter: T) -> Self {
let mut b = VariantArrayBuilder::new(0);
b.extend(iter);
b.build()
}
}

impl<'m, 'v> FromIterator<Variant<'m, 'v>> for VariantArray {
fn from_iter<T: IntoIterator<Item = Variant<'m, 'v>>>(iter: T) -> Self {
let mut b = VariantArrayBuilder::new(0);
b.extend(iter);
b.build()
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's a way to get the iterator size without consuming the iterator...

But then again, I haven't ran a profile on the VariantArray code yet. I plan on looking at perf later on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most std lib code I've seen relies on the iterator's lower bound hint, with a fallback path if that turned out to be too low.

There's also the ExactSizeIterator trait, but given how rust treats potentially overlapping generic definitions as conflicts, I don't know how to provide implementations for both Iterator and ExactSizeIterator?

@friendlymatthew
Copy link
Contributor Author

LGTM!

I'll throw in the other impl From in the second commit

I'm not sure this is needed? VariantArray::from_iterator is pretty clear, and a VariantArray::from wouldn't be able to do anything smarter (can't just take ownership of the vec's data, for example)

Yeah, I agree. I'll just remove the second impl From

@friendlymatthew friendlymatthew force-pushed the friendlymatthew/impl-into-iter-variant-array branch from ece0ffb to eda7457 Compare October 15, 2025 14:31
@friendlymatthew friendlymatthew force-pushed the friendlymatthew/impl-into-iter-variant-array branch from eda7457 to ea264c4 Compare October 15, 2025 14:31
}
}

impl<'m, 'v> FromIterator<Option<Variant<'m, 'v>>> for VariantArray {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this:

Suggested change
impl<'m, 'v> FromIterator<Option<Variant<'m, 'v>>> for VariantArray {
impl<'m, 'v, V: Into<Variant<'m, 'v>> FromIterator<Option<V>> for VariantArray {

then we can create a variant array from anything that converts cleanly to variant (e.g. i64 values or &str)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about this as well, but I figured we'd be creating 2 abstractions?

As in, I wonder if it's better to guide the user to do the Variant::from explicitly 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly going to be used by test code, and a homogenous set of e.g. i64 is a pretty common use case. But I don't have strong feelings either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Variant] Remove ceremony of going from list of Variant to VariantArray

2 participants