Add record batch adapter to zero-copy stream record batches into csp #541

Draft
wants to merge 1 commit into
base: main
from

Conversation

arhamchopra
Collaborator

No description provided.

@arhamchopra arhamchopra force-pushed the ac/record_batch_adapter branch from 0da8c1a to 643546c Compare June 4, 2025 14:47
@arhamchopra arhamchopra force-pushed the ac/record_batch_adapter branch from 643546c to 395a316 Compare June 4, 2025 18:21
}
else
{
auto last_time = m_tsArray -> Value( m_numRows - 1 );
Collaborator

Could we use std::upper_bound for this logic? Same question for std::find_if in the small-batch case.

Collaborator Author

I don't think Arrow arrays have iterators. We would need to define a forward iterator class to use std::upper_bound and std::find_if.

Collaborator

m_tsArray is contiguous, right? I wonder if we could create a std::span over it; then we wouldn't need to define iterators ourselves.

Collaborator Author

I'm not sure we can rely on the data being contiguous. I would expect Arrow to expose an iterator interface.


long long findFirstMatchingIndex( DateTime time )
{
// Find the first index with time equal or greater than `time`
Collaborator

Same idea, could we use std::lower_bound here rather than implementing our own binary search?

{
case ::arrow::TimeUnit::SECOND:
{
m_multiplier = 1000000000;
Collaborator

These constants are already defined here:

const int64_t NANOS_PER_SECOND = 1000000000;
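A hedged sketch of how the switch might read with named constants in place of literals. Only `NANOS_PER_SECOND` appears in the linked definition above; the other two constant names, the `nanosPerUnit` helper, and the local `TimeUnit` enum (a stand-in that mirrors the four `::arrow::TimeUnit` values so the example compiles without Arrow) are assumptions for illustration.

```cpp
#include <cstdint>

// From the linked definition.
const std::int64_t NANOS_PER_SECOND = 1000000000;
// Assumed names: not confirmed to exist in the codebase.
const std::int64_t NANOS_PER_MILLISECOND = 1000000;
const std::int64_t NANOS_PER_MICROSECOND = 1000;

// Stand-in for ::arrow::TimeUnit::type so the sketch is self-contained.
enum class TimeUnit { SECOND, MILLI, MICRO, NANO };

// Hypothetical helper: multiplier that converts a value in `unit` to nanoseconds.
constexpr std::int64_t nanosPerUnit( TimeUnit unit )
{
    switch( unit )
    {
        case TimeUnit::SECOND: return NANOS_PER_SECOND;
        case TimeUnit::MILLI:  return NANOS_PER_MILLISECOND;
        case TimeUnit::MICRO:  return NANOS_PER_MICROSECOND;
        case TimeUnit::NANO:   return 1;
    }
    return 1; // unreachable for valid enum values
}
```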
