[C++][Parquet] Fuzz: Integer or heap overflow when reading bad file #47184

@mapleFU

Describe the enhancement requested


Running: /mnt/scratch0/clusterfuzz/bot/inputs/fuzzer-testcases/crash-9b38f5396ac22bce51abd03db1c3ec649d333186
/src/arrow/cpp/src/arrow/array/builder_binary.cc:165:3: runtime error: signed integer overflow: 4611686018427387903 * 16 cannot be represented in type 'int64_t' (aka 'long')
    #0 0x5b4331e7d4ab in arrow::FixedSizeBinaryBuilder::Resize(long) [arrow/cpp/src/arrow/array/builder_binary.cc:165](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/arrow/array/builder_binary.cc#L165):3
    #1 0x5b43311ef56f in arrow::ArrayBuilder::Reserve(long) [arrow/cpp/src/arrow/array/builder_base.h:145](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/arrow/array/builder_base.h#L145):12
    #2 0x5b43312fab7f in parquet::internal::(anonymous namespace)::FLBARecordReader::ReserveValues(long) [arrow/cpp/src/parquet/column_reader.cc:1974](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/column_reader.cc#L1974):5
    #3 0x5b4331046f2b in parquet::arrow::(anonymous namespace)::LeafReader::LoadBatch(long) [arrow/cpp/src/parquet/arrow/reader.cc:488](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L488):21
    #4 0x5b433104406e in parquet::arrow::ColumnReaderImpl::NextBatch(long, std::__1::shared_ptr<arrow::ChunkedArray>*) [arrow/cpp/src/parquet/arrow/reader.cc:110](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L110):5
    #5 0x5b4331063767 in parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadColumn(int, std::__1::vector<int, std::__1::allocator<int>> const&, parquet::arrow::ColumnReader*, std::__1::shared_ptr<arrow::ChunkedArray>*) [arrow/cpp/src/parquet/arrow/reader.cc:286](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L286):20
    #6 0x5b4331080c00 in parquet::arrow::(anonymous namespace)::FileReaderImpl::DecodeRowGroups(std::__1::shared_ptr<parquet::arrow::(anonymous namespace)::FileReaderImpl>, std::__1::vector<int, std::__1::allocator<int>> const&, std::__1::vector<int, std::__1::allocator<int>> const&, arrow::internal::Executor*)::$_0::operator()(unsigned long, std::__1::shared_ptr<parquet::arrow::ColumnReaderImpl>) const [arrow/cpp/src/parquet/arrow/reader.cc:1282](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L1282):5
    #7 0x5b433107e5d8 in OptionalParallelForAsync<(lambda at /src/arrow/cpp/src/parquet/arrow/reader.cc:1278:22) &, std::__1::shared_ptr<parquet::arrow::ColumnReaderImpl>, std::__1::shared_ptr<arrow::ChunkedArray> > [arrow/cpp/src/arrow/util/parallel.h:97](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/arrow/util/parallel.h#L97):7
    #8 0x5b433107e5d8 in parquet::arrow::(anonymous namespace)::FileReaderImpl::DecodeRowGroups(std::__1::shared_ptr<parquet::arrow::(anonymous namespace)::FileReaderImpl>, std::__1::vector<int, std::__1::allocator<int>> const&, std::__1::vector<int, std::__1::allocator<int>> const&, arrow::internal::Executor*) [arrow/cpp/src/parquet/arrow/reader.cc:1300](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L1300):10
    #9 0x5b43310356c8 in parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroups(std::__1::vector<int, std::__1::allocator<int>> const&, std::__1::vector<int, std::__1::allocator<int>> const&, std::__1::shared_ptr<arrow::Table>*) [arrow/cpp/src/parquet/arrow/reader.cc:1261](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L1261):14
    #10 0x5b4331034e76 in parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroup(int, std::__1::vector<int, std::__1::allocator<int>> const&, std::__1::shared_ptr<arrow::Table>*) [arrow/cpp/src/parquet/arrow/reader.cc:323](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L323):12
    #11 0x5b43310350b9 in parquet::arrow::(anonymous namespace)::FileReaderImpl::ReadRowGroup(int, std::__1::shared_ptr<arrow::Table>*) [arrow/cpp/src/parquet/arrow/reader.cc:327](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L327):12
    #12 0x5b4331029a1b in FuzzReader [arrow/cpp/src/parquet/arrow/reader.cc:1405](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L1405):37
    #13 0x5b4331029a1b in parquet::arrow::internal::FuzzReader(unsigned char const*, long) [arrow/cpp/src/parquet/arrow/reader.cc:1432](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/reader.cc#L1432):11
    #14 0x5b4331025413 in LLVMFuzzerTestOneInput [arrow/cpp/src/parquet/arrow/fuzz.cc:22](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/parquet/arrow/fuzz.cc#L22):17
    #15 0x5b4330f879e0 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:614:13
    #16 0x5b4330f72c55 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:327:6
    #17 0x5b4330f786ef in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:862:9
    #18 0x5b4330fa3992 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #19 0x79372c745082 in __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/libc-start.c:308:16
    #20 0x5b4330f6ae3d in _start
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior [arrow/cpp/src/arrow/array/builder_binary.cc:165](https://github.com/apache/arrow/blob/bb33493bd34dcd21d71b6b942203992e67f5ef3c/cpp/src/arrow/array/builder_binary.cc#L165):3
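
The trace shows the overflow in frame #0, where a requested capacity of 4611686018427387903 is multiplied by 16 as an `int64_t`. The real product exceeds the `int64_t` maximum, so the multiplication is undefined behavior. One way to guard a computation like this is to check the bound before multiplying. The following is only a minimal sketch under stated assumptions: `CheckedByteSize` is a hypothetical standalone helper, not Arrow's code and not necessarily how the fix should look, and the assumption that the factor 16 is the builder's byte width is inferred from the error message.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not part of Arrow): returns capacity * byte_width,
// or std::nullopt when the inputs are negative or the product would not
// fit in int64_t. The check uses a division, so no overflow can occur.
std::optional<int64_t> CheckedByteSize(int64_t capacity, int32_t byte_width) {
  if (capacity < 0 || byte_width < 0) {
    return std::nullopt;
  }
  if (byte_width != 0 &&
      capacity > std::numeric_limits<int64_t>::max() / byte_width) {
    return std::nullopt;  // capacity * byte_width would overflow
  }
  return capacity * static_cast<int64_t>(byte_width);
}
```

With the values from the report, `CheckedByteSize(4611686018427387903, 16)` returns `std::nullopt`, which a caller could turn into an error status instead of attempting the allocation.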

Component(s)

C++, Parquet
