Interoperability between arrow-rs and nanoarrow #5052

Closed
@evgenyx00

Which part is this question about
Deserialization from arrow-rs into nanoarrow

Describe your question
I've run into a problem when serializing a basic Arrow object with arrow-rs's StreamWriter (a single RecordBatch) and then deserializing it with nanoarrow. Deserializing the RecordBatch fails because of the header alignment verification in flatcc: https://github.com/apache/arrow-nanoarrow/blob/d104e9065101401c63e931acdc7c10f114c47eaf/dist/flatcc.c#L2453

The alignment failure occurs in the calculation of the base offset and the offset of the union value relative to the base.
I'm not sure whether the problem lies in arrow-rs or in flatbuffers.

Additional context
Tested with arrow-rs versions 5.0.0 and 47.0.0, so this does not look like a regression; it may simply never have worked.

Steps to reproduce:

  1. Create an Arrow object and save only the RecordBatch bytes

  2. Test using nanoarrow or the example code: https://github.com/apache/arrow-nanoarrow/blob/d104e9065101401c63e931acdc7c10f114c47eaf/examples/cmake-ipc/src/app.c

  3. Reproduced on Debian 11 (x86) and macOS (M1)

  4. Code snippet:

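```rust
use std::sync::Arc;

use arrow::array::StringArray;
use arrow::datatypes::{DataType, Field, Schema};
use arrow::ipc::writer;
use arrow::record_batch::RecordBatch;

fn get_arrow_bytes() -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::new();

    {
        // Schema with a single nullable Utf8 column.
        let schema = Schema::new(vec![
            Field::new("AAAAAAAA", DataType::Utf8, true)
        ]);

        // The writer borrows `buf` mutably until the end of this block.
        let mut arrow_writer = writer::StreamWriter::try_new(&mut buf, &schema).unwrap();

        let id_array = StringArray::from(vec!["BBBBBBBB".to_string()]);

        let batch = RecordBatch::try_new(
            Arc::new(schema),
            vec![Arc::new(id_array)]
        ).unwrap();

        arrow_writer.write(&batch).unwrap();
        arrow_writer.finish().unwrap();
    }

    buf
}
```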
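To help narrow down where the RecordBatch message sits inside the bytes returned by `get_arrow_bytes()`, here is a minimal sketch that walks the stream framing by hand using only the standard library. It relies on the encapsulated message layout from the Arrow IPC specification (an optional `0xFFFFFFFF` continuation marker, a 4-byte little-endian metadata length, then the flatbuffer metadata) and assumes the first message is the schema, which carries no body. The names `MessageSpan`, `message_at`, `message_after_schema` and `is_8_byte_aligned` are hypothetical helpers for illustration; they are not part of arrow-rs or nanoarrow.

```rust
/// Marker that precedes each message in the current Arrow IPC stream format.
const CONTINUATION: u32 = 0xFFFF_FFFF;

/// Location of one message's flatbuffer metadata inside a stream (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct MessageSpan {
    /// Offset of the first metadata byte, relative to the start of the stream.
    metadata_start: usize,
    /// Metadata length taken from the prefix (it includes padding).
    metadata_len: usize,
}

/// Read a little-endian u32 at `at`, or None if the slice is too short.
fn read_u32_le(bytes: &[u8], at: usize) -> Option<u32> {
    let raw = bytes.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Locate the message whose prefix starts at `offset`.
/// Returns None for the end-of-stream marker (length 0) or truncated input.
fn message_at(stream: &[u8], offset: usize) -> Option<MessageSpan> {
    let first = read_u32_le(stream, offset)?;
    let (len, start) = if first == CONTINUATION {
        // Current format: continuation marker, then the length.
        (read_u32_le(stream, offset + 4)?, offset + 8)
    } else {
        // Legacy format: the first four bytes are the length itself.
        (first, offset + 4)
    };
    let len = len as usize;
    if len == 0 || start.checked_add(len)? > stream.len() {
        return None;
    }
    Some(MessageSpan { metadata_start: start, metadata_len: len })
}

/// Assumption: the first message is the schema and has an empty body,
/// so the next message prefix starts right after the schema metadata.
fn message_after_schema(stream: &[u8]) -> Option<MessageSpan> {
    let schema = message_at(stream, 0)?;
    message_at(stream, schema.metadata_start + schema.metadata_len)
}

/// True if the metadata starts at a multiple of 8 bytes from the stream start.
fn is_8_byte_aligned(span: &MessageSpan) -> bool {
    span.metadata_start % 8 == 0
}
```

Calling `message_after_schema(&get_arrow_bytes())` should give the span of the RecordBatch metadata under the assumptions above. Note that this only reports alignment relative to the start of the stream; whether the address handed to the decoder is aligned in memory also depends on where the sliced bytes are copied, which may be worth checking separately.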
