GH-46677: [C++] Expose an BinaryViewBuilder interface for append a binary and multiple subslice #46730
base: main
Conversation
@mapleFU is the currently implemented interface what you expected?
cpp/src/arrow/array/builder_binary.h
Outdated
    return AppendBlock(value.data(), static_cast<int64_t>(value.size()));
  }

  Status AppendViewFromBuffer(int32_t buffer_id, int32_t buffer_offset, int32_t start,
naming: from buffer or from block?
Personally, either is fine with me. I prefer Buffer since the variable is named buffer_index.
cpp/src/arrow/array/builder_binary.h
Outdated
@@ -645,6 +657,28 @@ class ARROW_EXPORT BinaryViewBuilder : public ArrayBuilder {
    UnsafeAppend(value.data(), static_cast<int64_t>(value.size()));
  }

  Result<std::pair<int32_t, int32_t>> AppendBlock(const uint8_t* value,
Can we use a more specific name rather than pair<i32, i32>?
Can I directly use BinaryViewType::c_type, since it already contains the two pieces of information we need?
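To illustrate the naming suggestion in this thread, here is a minimal standalone sketch, not Arrow code. `BufferLocation`, `BlockStoreSketch`, and `AppendBlockSketch` are hypothetical names, and a `std::vector<std::string>` stands in for the builder's data blocks. It only shows how a named struct documents the two integers at the call site compared with `std::pair<int32_t, int32_t>`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical named result type, replacing std::pair<int32_t, int32_t>.
struct BufferLocation {
  int32_t buffer_index;  // which data block holds the bytes
  int32_t offset;        // byte offset of the value inside that block
};

// Toy stand-in for the builder. Assumption made to keep the sketch short:
// every appended value gets its own block, so the offset is always 0.
class BlockStoreSketch {
 public:
  BufferLocation AppendBlockSketch(const std::string& value) {
    // The block index must remain representable as int32_t.
    assert(blocks_.size() <
           static_cast<size_t>(std::numeric_limits<int32_t>::max()));
    blocks_.push_back(value);
    return BufferLocation{static_cast<int32_t>(blocks_.size() - 1), 0};
  }

  const std::string& block(int32_t index) const { return blocks_.at(index); }

 private:
  std::vector<std::string> blocks_;
};
```

With a named type, callers read `loc.buffer_index` and `loc.offset` instead of `loc.first` and `loc.second`, which avoids mixing up the two values.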
The syntax is a bit weird here? Append a BinaryView and then append the sub-slice of the view?
@@ -100,6 +100,13 @@ void BinaryViewBuilder::Reset() {
  data_heap_builder_.Reset();
}

Result<std::pair<int32_t, int32_t>> BinaryViewBuilder::AppendBlock(const uint8_t* value,
                                                                    const int64_t length) {
  DCHECK_GT(length, TypeClass::kInlineSize);
If length <= kInlineSize, should this return false or ok? Why just DCHECK here?
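The concern behind this question can be sketched as follows. This is a standalone illustration, not the PR's code: `StatusSketch`, `CheckLengthDebugOnly`, and `CheckLengthAlways` are hypothetical names, and plain `assert` stands in for `DCHECK_GT`, on the assumption that both are active only in debug builds. The constant 12 is the inline capacity of a binary view.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Inline capacity of a binary view, in bytes.
constexpr int64_t kInlineSizeSketch = 12;

// Hypothetical minimal status type, standing in for arrow::Status.
struct StatusSketch {
  bool ok;
  std::string message;
};

// Debug-only check: compiled out when NDEBUG is defined, so a release
// build would silently accept a value that is short enough to be inlined.
inline void CheckLengthDebugOnly(int64_t length) {
  assert(length > kInlineSizeSketch);
  (void)length;  // avoid an unused-parameter warning under NDEBUG
}

// Checked variant: the caller learns about the short value in every build.
inline StatusSketch CheckLengthAlways(int64_t length) {
  if (length <= kInlineSizeSketch) {
    return {false, "value of length " + std::to_string(length) +
                       " fits inline, no block is needed"};
  }
  return {true, ""};
}
```

Whether a short value should be an error or simply be accepted is the open design question here; the sketch only shows that a returned status is visible in release builds while an assertion is not.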
cpp/src/arrow/array/builder_binary.h
Outdated
  c_type GetViewFromBlock(int32_t block_id, int32_t block_offset, int32_t offset,
                          int32_t length) const {
    const auto* value = blocks_.at(block_id)->data_as<uint8_t>() + block_offset + offset;
    if (length <= BinaryViewType::kInlineSize) {
Use ToBinaryView?
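For readers unfamiliar with the inline branch in the snippet above, here is a simplified standalone sketch of how a view over a sub-slice of a block can be built. It is not Arrow's implementation: `ViewSketch`, `MakeView`, and `Resolve` are hypothetical names, the struct uses separate fields rather than the real packed 16-byte layout, blocks are modelled as `std::string`, and a single `offset` replaces the `block_offset + offset` pair. Values of at most 12 bytes are copied into the view; longer ones store a 4-byte prefix plus the block index and offset.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

constexpr int32_t kInlineSize = 12;  // bytes that fit inside the view itself
constexpr int32_t kPrefixSize = 4;   // prefix kept for out-of-line values

// Simplified view: the real type packs these fields into 16 bytes.
struct ViewSketch {
  int32_t size = 0;
  bool is_inline = true;
  char inline_data[kInlineSize] = {};  // valid when is_inline
  char prefix[kPrefixSize] = {};       // valid when !is_inline
  int32_t buffer_index = 0;            // valid when !is_inline
  int32_t offset = 0;                  // valid when !is_inline
};

// Build a view over blocks[block_id][offset, offset + length).
ViewSketch MakeView(const std::vector<std::string>& blocks, int32_t block_id,
                    int32_t offset, int32_t length) {
  const std::string& block = blocks.at(block_id);  // throws on a bad index
  assert(offset >= 0 && length >= 0);
  assert(static_cast<size_t>(offset) + static_cast<size_t>(length) <=
         block.size());
  ViewSketch view;
  view.size = length;
  const char* value = block.data() + offset;
  if (length <= kInlineSize) {
    // Short value: copy the bytes into the view, no block reference needed.
    std::memcpy(view.inline_data, value, static_cast<size_t>(length));
  } else {
    // Long value: keep a prefix and point back into the shared block.
    view.is_inline = false;
    std::memcpy(view.prefix, value, kPrefixSize);
    view.buffer_index = block_id;
    view.offset = offset;
  }
  return view;
}

// Recover the bytes a view refers to.
std::string Resolve(const std::vector<std::string>& blocks,
                    const ViewSketch& view) {
  if (view.is_inline) {
    return std::string(view.inline_data, static_cast<size_t>(view.size));
  }
  return blocks.at(view.buffer_index)
      .substr(static_cast<size_t>(view.offset),
              static_cast<size_t>(view.size));
}
```

Several out-of-line views can point into the same block without copying its bytes again, which is the point of appending a buffer once and then appending multiple sub-slices of it.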
Should we rename? Maybe aligning with arrow-rs's impl is fine.
Some personal thoughts:
Not sure why these 2 CI jobs always fail.
  /// let array = builder.finish();
  ASSERT_OK_AND_ASSIGN(const auto buffer,
                       src_builder.AppendBuffer("helloworldbingobongo"));
  ASSERT_OK(src_builder.AppendViewFromBuffer(buffer, 0, 5));
Can we try appending a nonexistent buffer_index?
  // Verify the content of the resulting array
  ASSERT_EQ(src->length(), 6);
  const auto& binary_view_array = static_cast<const BinaryViewArray&>(*src);
Can we test both StringView and BinaryView?
  ASSERT_OK(src_builder.AppendViewFromBuffer(buffer, 10, 5));
  ASSERT_OK(src_builder.AppendViewFromBuffer(buffer, 15, 5));
  ASSERT_OK(src_builder.AppendViewFromBuffer(buffer, 0, 15));
Can we test appending multiple buffers?
Rationale for this change
see #46677
What changes are included in this PR?
see #46677
Are these changes tested?
Yes
Are there any user-facing changes?
No