
GH-46708: [C++][Gandiva] Added zero return values for castDECIMAL_utf8 #46709


Open
wants to merge 3 commits into
base: main

DenisTarasyuk
Contributor

@DenisTarasyuk DenisTarasyuk commented Jun 4, 2025

Rationale for this change

castDECIMAL_utf8 has undefined behavior when the input value cannot be parsed, which causes a SIGSEGV for some expressions in Projector.

What changes are included in this PR?

Set out_high and out_low to 0 in castDECIMAL_utf8 so that both values are initialized when the input value cannot be parsed.
Added a corresponding test that reproduces the SIGSEGV in Projector.
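
As an illustration of the pattern described above, here is a minimal sketch, not the Gandiva code itself. `CastDecimalSketch` and its signature are hypothetical; it only handles unsigned digit strings and ignores overflow, precision and scale. The point is that the out-parameters are written before any early return, so a failed parse never leaves them uninitialized.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for a string-to-decimal cast that reports a 128-bit
// result through two out-parameters. This is NOT the Gandiva function.
// Returns false when the input cannot be parsed.
bool CastDecimalSketch(std::string_view in, int64_t* out_high, uint64_t* out_low) {
  assert(out_high != nullptr && out_low != nullptr);

  // Initialize the outputs first so every early return leaves them defined.
  *out_high = 0;
  *out_low = 0;

  if (in.empty()) return false;

  uint64_t value = 0;
  for (char c : in) {
    if (c < '0' || c > '9') return false;  // parse failure: outputs stay 0
    // Overflow is deliberately not handled in this sketch.
    value = value * 10 + static_cast<uint64_t>(c - '0');
  }

  *out_low = value;  // small non-negative values fit in the low word
  return true;
}
```

With this shape, a caller that reads the outputs after a failed cast of "foo" sees 0 in both words rather than whatever happened to be in memory.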

Are these changes tested?

Yes

Are there any user-facing changes?

No

@DenisTarasyuk DenisTarasyuk changed the title GH-46708: [C++][Gandiva] Added zero return values for castDECIMAL_utf… GH-46708: [C++][Gandiva] Added zero return values for castDECIMAL_utf8 Jun 4, 2025

github-actions bot commented Jun 4, 2025

⚠️ GitHub issue #46708 has been automatically assigned in GitHub to PR creator.


int num_records = 1;
auto invalid_in = MakeArrowArrayUtf8({"1.345"}, {true});
auto in_batch_1 = arrow::RecordBatch::Make(schema, num_records, {invalid_in});
Member

Suggested change
- auto in_batch_1 = arrow::RecordBatch::Make(schema, num_records, {invalid_in});
+ ASSERT_OK_AND_ASSIGN(auto in_batch_1, arrow::RecordBatch::Make(schema, num_records, {invalid_in}));

BTW, why do we need the _1 suffix here? There seems to be only one record batch.

Contributor Author

@DenisTarasyuk DenisTarasyuk Jun 5, 2025

The _1 suffix was just a copy-paste leftover; fixed.
arrow::RecordBatch::Make does not return Result<>, so ASSERT_OK_AND_ASSIGN should not work, right?
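
To make the distinction in this comment concrete, here is a small sketch with toy types. Everything in it (`ResultSketch`, `BatchSketch`, `MakeBatch`, `MakeBatchChecked`) is hypothetical and only mimics the general shape of a Result-like wrapper; it is not Arrow's API. A factory that returns the pointer directly has nothing to unwrap, while a fallible factory hands back a wrapper that must be checked first.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy stand-in for a Result-like type: holds either a value or an error message.
template <typename T>
class ResultSketch {
 public:
  ResultSketch(T value) : value_(std::move(value)) {}

  static ResultSketch Error(std::string message) {
    ResultSketch r;
    r.error_ = std::move(message);
    return r;
  }

  bool ok() const { return value_.has_value(); }

  T& ValueOrDie() {
    assert(ok());
    return *value_;
  }

 private:
  ResultSketch() = default;
  std::optional<T> value_;
  std::string error_;
};

struct BatchSketch {
  int num_rows;
};

// Infallible factory: returns the pointer directly, so there is nothing to unwrap.
std::shared_ptr<BatchSketch> MakeBatch(int num_rows) {
  return std::make_shared<BatchSketch>(BatchSketch{num_rows});
}

// Fallible factory: the caller must check ok() before using the value.
ResultSketch<std::shared_ptr<BatchSketch>> MakeBatchChecked(int num_rows) {
  if (num_rows < 0) {
    return ResultSketch<std::shared_ptr<BatchSketch>>::Error("negative row count");
  }
  return MakeBatch(num_rows);
}
```

An unwrap-and-assign helper only makes sense for the second factory; the first one can be assigned with a plain `auto`.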

Member

Oh, sorry.

Comment on lines 1250 to 1262
auto int_literal = TreeExprBuilder::MakeLiteral((int32_t)100);
auto int_literal_multiply = TreeExprBuilder::MakeLiteral((int32_t)10);
auto string_literal = TreeExprBuilder::MakeStringLiteral("foo");
auto cast_multiply_literal =
    TreeExprBuilder::MakeFunction("castDECIMAL", {int_literal_multiply}, decimal_type_10_0);
auto cast_int_literal =
    TreeExprBuilder::MakeFunction("castDECIMAL", {int_literal}, decimal_type_38_30);
auto cast_string_func =
    TreeExprBuilder::MakeFunction("castDECIMAL", {string_literal}, decimal_type_38_30);
auto multiply_func =
    TreeExprBuilder::MakeFunction("multiply", {cast_multiply_literal, cast_int_literal}, decimal_type_38_27);
auto equal_func = TreeExprBuilder::MakeFunction("equal", {multiply_func, cast_string_func}, arrow::boolean());
auto expr = TreeExprBuilder::MakeExpression(equal_func, res_bool);
Member

Do we need them?

Contributor Author

The SIGSEGV does not really reproduce without that complex expression.

Member

I think that we don't need to reproduce SIGSEGV here.
Can we check only that out_high/out_low are set on error instead?

Contributor Author

Hmm. Let me try that.

Contributor Author

It seems that if the function fails to parse the data, the corresponding element is not set in the output array.
@kou can you suggest how I can test that castDecimal returned 0?
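
One possible approach, sketched under assumptions: test below the projector level by calling the cast routine directly with out-parameters pre-filled with sentinel values, then assert that a failed parse overwrote them with 0. The functions below are hypothetical stand-ins with a simplified signature, not the Gandiva functions. One variant leaves its outputs untouched on failure and the other zeroes them first, so the checker can be seen to tell them apart.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

using CastFn = bool (*)(std::string_view, int64_t*, uint64_t*);

// Sentinels that a correct cast must overwrite, even when parsing fails.
constexpr int64_t kSentinelHigh = -1;
constexpr uint64_t kSentinelLow = 0xDEADBEEFULL;

// Hypothetical cast that does not write its outputs when parsing fails.
bool CastLeavesOutputsUntouched(std::string_view in, int64_t* out_high, uint64_t* out_low) {
  if (in.empty() || in.find_first_not_of("0123456789") != std::string_view::npos) {
    return false;  // outputs keep whatever the caller had in them
  }
  *out_high = 0;
  *out_low = 0;
  for (char c : in) {
    *out_low = *out_low * 10 + static_cast<uint64_t>(c - '0');
  }
  return true;
}

// Hypothetical cast that zeroes its outputs before attempting to parse.
bool CastZeroesOutputs(std::string_view in, int64_t* out_high, uint64_t* out_low) {
  *out_high = 0;
  *out_low = 0;
  return CastLeavesOutputsUntouched(in, out_high, out_low);
}

// Returns true only if the cast failed AND replaced both sentinels with 0.
bool OutputsZeroedOnFailure(CastFn cast, std::string_view bad_input) {
  assert(cast != nullptr);
  int64_t high = kSentinelHigh;
  uint64_t low = kSentinelLow;
  const bool ok = cast(bad_input, &high, &low);
  return !ok && high == 0 && low == 0;
}
```

The sentinels matter here: starting from zero-initialized locals would let the untouched variant pass by accident.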

Member

Hmm. I tried the current changes, but the test still crashes with SIGSEGV.

I couldn't get a backtrace:

$ LD_LIBRARY_PATH=$PWD/cpp.build/debug gdb --args cpp.build/debug/gandiva-projector-test --gtest_filter=TestDecimal.TestCastDecimalVarCharInvalidInputInvalidOutput
...
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from TestDecimal
[ RUN      ] TestDecimal.TestCastDecimalVarCharInvalidInputInvalidOutput
/home/kou/work/cpp/arrow.kou/cpp/src/gandiva/cache.cc:58: Creating gandiva cache with capacity of 5000
/home/kou/work/cpp/arrow.kou/cpp/src/gandiva/engine.cc:282: Detected CPU Name : arrowlake-s
/home/kou/work/cpp/arrow.kou/cpp/src/gandiva/engine.cc:283: Detected CPU Features: [ +prfchw -cldemote +avx +aes +sahf +pclmul -xop +crc32 +xsaves -avx512fp16 -usermsr +sm4 -egpr +sse4.1 -avx512ifma +xsave +sse4.2 -tsxldtrk +sm3 +ptwrite -widekl +invpcid +64bit +xsavec -avx10.1-512 -avx512vpopcntdq +cmov -avx512vp2intersect -avx512cd +movbe +avxvnniint8 -ccmp -amx-int8 -kl -avx10.1-256 +sha512 +avxvnni -rtm +adx +avx2 +hreset +movdiri +serialize +vpclmulqdq -avx512vl +uintr -cf +clflushopt -raoint +cmpccxadd +bmi -amx-tile +sse +gfni +avxvnniint16 -amx-fp16 -ndd +xsaveopt +rdrnd -avx512f -amx-bf16 -avx512bf16 -avx512vnni -push2pop2 +cx8 -avx512bw +sse3 +pku +fsgsbase -clzero -mwaitx -lwp +lzcnt +sha +movdir64b -ppx +wbnoinvd -enqcmd +avxneconvert -tbm -pconfig -amx-complex +ssse3 +cx16 +bmi2 +fma +popcnt +avxifma +f16c -avx512bitalg -rdpru +clwb +mmx +sse2 +rdseed -avx512vbmi2 -prefetchi +rdpid -fma4 -avx512vbmi +shstk +vaes +waitpkg -sgx +fxsr -avx512dq -sse4a]

Program received signal SIGSEGV, Segmentation fault.
0x00007fffe5861d6e in ?? ()
(gdb) bt
#0  0x00007fffe5861d6e in ?? ()
#1  0x00007fffffffd770 in ?? ()
#2  0x000055015597d138 in ?? ()
#3  0x4674edea40000000 in ?? ()
#4  0x0000000c9f2c9cd0 in ?? ()
#5  0x85acef8100000000 in ?? ()
#6  0x000004ee2d6d415b in ?? ()
#7  0x00007fffffffd710 in ?? ()
#8  0x0000555555a2a550 in ?? ()
#9  0x0000000000000000 in ?? ()

Could you share the backtrace from your environment?

Contributor Author

@DenisTarasyuk DenisTarasyuk Jun 9, 2025

Strange. The fix works on my machine. The test was also green in this PR, and I also tried it in a Docker x86 environment.

[ RUN      ] TestDecimal.TestCastDecimalVarCharInvalidInputInvalidOutput
/arrow2/cpp/src/gandiva/cache.cc:58: Creating gandiva cache with capacity of 5000
/arrow2/cpp/src/gandiva/engine.cc:282: Detected CPU Name : znver2
/arrow2/cpp/src/gandiva/engine.cc:283: Detected CPU Features: [ +prfchw -cldemote +avx +aes +sahf +pclmul -xop +crc32 +xsaves -avx512fp16 -sm4 +sse4.1 -avx512ifma +xsave -avx512pf +sse4.2 -tsxldtrk -ptwrite -widekl -sm3 -invpcid +64bit +xsavec -avx512vpopcntdq +cmov -avx512vp2intersect -avx512cd +movbe -avxvnniint8 -avx512er -amx-int8 -kl -sha512 -avxvnni -rtm +adx +avx2 -hreset -movdiri -serialize -vpclmulqdq -avx512vl -uintr +clflushopt -raoint -cmpccxadd +bmi -amx-tile +sse -gfni -avxvnniint16 -amx-fp16 +xsaveopt +rdrnd -avx512f -amx-bf16 -avx512bf16 -avx512vnni +cx8 -avx512bw +sse3 -pku +fsgsbase +clzero -mwaitx -lwp +lzcnt +sha -movdir64b -wbnoinvd -enqcmd -prefetchwt1 -avxneconvert -tbm -pconfig -amx-complex +ssse3 +cx16 +bmi2 +fma +popcnt -avxifma +f16c -avx512bitalg -rdpru +clwb +mmx +sse2 +rdseed -avx512vbmi2 -prefetchi +rdpid -fma4 -avx512vbmi -shstk -vaes -waitpkg -sgx +fxsr -avx512dq +sse4a]

Thread 1 "gandiva-project" received signal SIGSEGV, Segmentation fault.
0x00007ffff786fdfc in ?? ()
(gdb) bt
#0  0x00007ffff786fdfc in ?? ()
#1  0x00007fffffffd700 in ?? ()
#2  0x0000000108b76b98 in ?? ()
#3  0x0000000c9f2c9cd0 in ?? ()
#4  0x4674edea40000000 in ?? ()
#5  0x4674edea40000000 in ?? ()
#6  0x0000000c9f2c9cd0 in ?? ()
#7  0x85acef8100000000 in ?? ()
#8  0x000004ee2d6d415b in ?? ()
#9  0x00007fffffffd6b0 in ?? ()
#10 0x0000000002064191 in std::_Tuple_impl<0ul, unsigned char**, std::default_delete<unsigned char* []> >::_M_head (__t=...)
    at /opt/rh/devtoolset-10/root/usr/include/c++/10/tuple:204
#11 0x000000000203d925 in gandiva::Projector::Evaluate (this=0x8d66a50, batch=..., selection_vector=0x0, pool=0x8768f40 <arrow::global_state+576>, output=0x7fffffffdaa0)
    at /arrow2/cpp/src/gandiva/projector.cc:176
#12 0x000000000203d5e5 in gandiva::Projector::Evaluate (this=0x8d66a50, batch=..., pool=0x8768f40 <arrow::global_state+576>, output=0x7fffffffdaa0)
    at /arrow2/cpp/src/gandiva/projector.cc:154
#13 0x0000000001f9ea88 in gandiva::TestDecimal_TestCastDecimalVarCharInvalidInputInvalidOutput_Test::TestBody (this=0x8aedb20)
    at /arrow2/cpp/src/gandiva/tests/decimal_test.cc:1207

If you can reproduce it even with my fix, can you please share the commands you used so I can reproduce it? What is your environment?

Contributor Author

I have updated the issue with some additional information: #46708

@github-actions github-actions bot added the "awaiting changes" and "awaiting change review" labels and removed the "awaiting review" and "awaiting changes" labels Jun 5, 2025
Fixed PR comments
@DenisTarasyuk
Contributor Author

@github-actions autotune

fixed formatting
@github-actions github-actions bot added the "awaiting changes" label and removed the "awaiting change review" label Jun 6, 2025

2 participants