fix: test Pyodide with cibuildwheel and fix 32-bit issue #3270

Merged
merged 5 commits into from
Jan 24, 2025

Conversation

henryiii
Member

@henryiii henryiii commented Oct 3, 2024

Using cibuildwheel for pyodide. Adds testing.

@jpivarski jpivarski added the pr-inactive A pull request that hasn't been touched in a long time label Nov 7, 2024
@henryiii henryiii force-pushed the henryiii/ci/cibwpyodide branch 2 times, most recently from 2f7cc25 to a18048a Compare November 7, 2024 16:41
@jpivarski
Member

Since "Build C++ WASM" still fails, we'll keep this in pr-inactive.

It looks like it has a significant merge conflict, too.

@henryiii henryiii force-pushed the henryiii/ci/cibwpyodide branch 3 times, most recently from 1f020d2 to 00a73c2 Compare November 9, 2024 04:57
@henryiii
Member Author

henryiii commented Nov 9, 2024

Okay, now getting a Pyodide fatal error:

Error: Dynamic linking error: cannot resolve symbol _ZN7awkward12ArrayBuilderC1ERKNS_7OptionsIJxdEEE

@jpivarski jpivarski removed the pr-inactive A pull request that hasn't been touched in a long time label Nov 14, 2024
@jpivarski
Member

_ZN7awkward12ArrayBuilderC1ERKNS_7OptionsIJxdEEE

is

awkward::ArrayBuilder::ArrayBuilder(awkward::Options<long long, double> const&)

which is core libawkward.

@ariostas
Collaborator

Okay, now getting a Pyodide fatal error:

Error: Dynamic linking error: cannot resolve symbol _ZN7awkward12ArrayBuilderC1ERKNS_7OptionsIJxdEEE

This issue is solved by bumping CIBW to 2.22. However, the tests are skipped or fail because none of the dependencies are installed and Pyodide doesn't have multiprocessing. Maybe it would make sense to add a tests-wasm directory with a few simple tests that don't have any dependencies.

@agriyakhetarpal

I've just merged pyodide/pyodide#5247, so awkward-cpp will be fully up to date in the Pyodide 0.27 release which we'll get out any day now. If the fatal error is indeed resolved with cibuildwheel 2.22 here as mentioned by @ariostas, it would be great to revisit this. :)

That said, I can't fully tell what caused the error or what resolved it: there weren't many changes between Pyodide 0.26.1 (in cibuildwheel < 2.22) and Pyodide 0.26.4 (in cibuildwheel >= 2.22), besides a few more JSPI-related bug fixes for stack switching and a bump from Python 3.12.1 to Python 3.12.7.

@agriyakhetarpal

I tried it on my fork via agriyakhetarpal#1 and, after temporarily skipping seven tests where ForthMachine does not seem to work with 32-bit environments and emits conversion errors, the rest of the tests all pass now:
https://github.com/agriyakhetarpal/awkward/actions/runs/12483892163/job/34840440332

I'd like to reinstate the WASM job that was disabled as an effect of #3329 so that I can work on some further improvements around interactive docs for awkward via jupyterlite-sphinx, as an extension to the "Try it" button on the homepage. If this would be alright with those here and with the rest of the awkward-cpp developers, could I go ahead and perform those improvements in a new PR that builds on this one?

@jpivarski
Member

Fixing awkward-cpp's WASM issues is very welcome! (I had been under the impression that it was working. We do have a Windows 32-bit suite of tests to ensure that all integer/pointer handling correctly distinguishes between size_t and uint32_t/uint64_t, so I don't think the WASM 32-bit issue could be only because it's 32-bit...)

awkward-cpp has two main pieces: one is exposed through pybind11 in the ordinary way for a Python extension, and 3 subsystems are accessed this way:

  • ArrayBuilder (with all of its component classes and subclasses, including GrowableBuffer)
  • ForthMachine (which has ForthInputBuffer and ForthOutputBuffer and that's it—the rest is internal)
  • JSON I/O (a few functions for different cases: from string, from file, one big JSON, newline-delimited JSON)

The other main piece is accessed through ctypes, by compiling the C++ functions with an extern C API:

  • ArrayBuilder again, so that it can be used in Numba (which can call C ABI function pointers)
  • a few hundred "kernels," which are given raw buffers from NumPy arrays for most Awkward Array functionality (so that the CPU-based interface would look as much as possible like a CUDA one)
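As a rough illustration of the ctypes side described above, here is a minimal sketch of calling a C-ABI function pointer with raw buffers. The signature and the doubling "kernel" are hypothetical stand-ins (a Python callback wrapped with ctypes.CFUNCTYPE), not an actual awkward-cpp kernel; the real kernels live in a compiled shared library.

```python
import ctypes

# Hypothetical kernel signature (an assumption, not taken from awkward-cpp):
#   int kernel(int64_t* toptr, const int64_t* fromptr, int64_t length)
KERNEL = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_int64),
    ctypes.POINTER(ctypes.c_int64),
    ctypes.c_int64,
)


def _double_each(toptr, fromptr, length):
    # Stand-in body: a real kernel would be compiled C++ behind extern "C".
    for i in range(length):
        toptr[i] = fromptr[i] * 2
    return 0  # 0 means success in this sketch


# Wrapping the callback yields an object that is called through the C ABI.
kernel = KERNEL(_double_each)

# Raw buffers, playing the role of the memory behind NumPy arrays.
src = (ctypes.c_int64 * 4)(1, 2, 3, 4)
dst = (ctypes.c_int64 * 4)()

status = kernel(dst, src, len(src))
result = list(dst)
```

The point is only that the caller owns both buffers and the callee sees plain pointers plus a length, which is why symbol visibility and integer widths matter so much at this boundary.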

I think all of these things are loaded on startup, when import awkward is called, so a dynamic link failure in any one of them probably has more to do with the linking itself (visibility of symbols? name clashes? namespaces?) than the implementation. Some explicit C macros were added to ensure visibility, but they might have been needed for reasons that are no longer relevant and (who knows?) might be interfering with the WASM build. I'm referring to EXPORT_SYMBOL defined here:

#ifdef _MSC_VER
#define EXPORT_SYMBOL __declspec(dllexport)

#else
#define EXPORT_SYMBOL __attribute__((visibility("default")))

#ifdef __GNUC__
// Silence a gcc warning: type attributes ignored after type is already defined
#define EXPORT_TEMPLATE_INST
#else
#define EXPORT_TEMPLATE_INST EXPORT_SYMBOL
#endif

This is one possible culprit, but anything like this is suspect and can be investigated. We have enough tests for all the other builds that if you need to change something to make WASM work and all other tests still pass, it's a good fix. (To be super-cautious, we can also manually check the CUDA tests—which don't run in GitHub Actions—but I think they're very separate from the rest of the build procedure because they use CuPy for runtime compilation.)

@agriyakhetarpal

Thanks for the extended information! awkward in WASM is working and no longer fails on import, fortunately – it is just the tests in these files:

  • test_3209_awkwardforth_read_negative_number_of_items.py, and
  • test_1345_avro_reader.py

that are failing at the moment. I am not an experienced C++ programmer in any fashion, but I can try to see where I can debug it while also getting the WASM wheels deployed with the docs again.

@jpivarski
Member

The Avro reader is a "canary in a coal mine" in the sense that it's one of the few extended uses of AwkwardForth in the Awkward codebase. AwkwardForth is primarily needed for Uproot, but Awkward Array can't load Uproot in its tests without a circular dependence.

Even though these two AwkwardForth-based tests fail, do the Uproot tests run (when using this compiled version of awkward-cpp)? That would give us an indication of how serious the failure is. For instance, what if it's failing in these two tests for a reason that's irrelevant to Uproot? If that's the case, then these two tests are using functionality we may not need and perhaps could remove. AwkwardForth is not a user-facing API (technically public so that a library like Uproot can use it, but not publicized); we can consider removing features.

Of course, it would be even better to understand the failure and fix it. I've been assuming that the issue has something to do with compiling and linking—an area in which I'm not familiar with the details—but if it's a feature that's not used by Uproot, then it's possible that there's a segfault or other undefined behavior in the C++. So, checking the Uproot tests would be informative, either way.

The Uproot tests that use AwkwardForth most intensively are:

Uproot does have a Pyodide test. I think it only tests local files (to avoid unimplemented features in Pyodide, many of which involve threading), but the above tests are based on local files. (scikit-hep-testdata downloads the file locally before using it.)

@henryiii
Member Author

henryiii commented Jan 23, 2025

I'm getting:

machine = awkward.forth.ForthMachine64(forth_code)
E       ValueError: stoul: no conversion
E       
E       This error occurred while calling
E       
E           ak.from_avro_file(
E               file = '/home/runner/work/awkward/awkward/tests/samples/bool_test'...
E           )

On lots of (all?) Forth tests. Could this happen if it can't find the file?

@agriyakhetarpal

Sorry for not responding here – I did start looking into this some time ago, but I don't recall why I stopped (likely because pyodide-build was broken at that moment). We can check whether it's due to the Avro file itself going missing; that would be much more helpful.

@ariostas
Collaborator

ariostas commented Jan 23, 2025

I made a PR for Uproot (scikit-hep/uproot5#1365) that adds a test that uses AwkwardForth in wasm. The setup is pretty different (cibw is not used), and it fails there too, so AwkwardForth is definitely broken in wasm.

@ariostas
Collaborator

The most basic functionality like

vm = ForthMachine32("3 5 +")
vm.run()

works, but as soon as there's anything more complex like

ForthMachine32(": callme 1 2 3 4 ;")

it crashes with the same error.

As far as I can tell, the only usage of stoul in AwkwardForth is here

template <typename T, typename I>
bool
ForthMachineOf<T, I>::is_integer(const std::string& word, int64_t& value) const {
  if (word.size() >= 2 && word.substr(0, 2) == std::string("0x")) {
    try {
      value = (int64_t)std::stoul(word.substr(2, word.size() - 2), nullptr, 16);
    }
    catch (std::invalid_argument& err) {
      return false;
    }
    return true;
  }
  else {
    try {
      value = (int64_t)std::stoul(word, nullptr, 10);
    }
    catch (std::invalid_argument& err) {
      return false;
    }
    return true;
  }
}

So maybe the try-catch block is not working properly?
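To make the control flow concrete, here is a loose Python mirror of the function above. It is only a sketch: Python's int() is stricter than std::stoul (for example, it rejects trailing garbage that stoul would ignore), and the tuple return value is an assumption made for illustration. What it shows is that every non-numeric word, such as a user-defined name like "callme", is expected to raise during parsing and be turned into a plain "not an integer" answer by the handler.

```python
from typing import Optional, Tuple


def is_integer(word: str) -> Tuple[bool, Optional[int]]:
    """Return (True, value) if word parses as an integer, else (False, None)."""
    try:
        if len(word) >= 2 and word[:2] == "0x":
            # Hexadecimal literal: parse everything after the "0x" prefix.
            return True, int(word[2:], 16)
        # Otherwise try a plain decimal literal.
        return True, int(word, 10)
    except ValueError:
        # The C++ version relies on catching std::invalid_argument here.
        return False, None


hex_result = is_integer("0x1f")
dec_result = is_integer("42")
word_result = is_integer("callme")
```

If the handler never runs, the conversion error escapes to the caller instead of becoming a False result, which would match the "stoul: no conversion" message surfacing from ForthMachine construction.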

I'm not sure what's the best way to debug this in wasm, but I'll see if I can reproduce it with a minimal Emscripten code.

@ariostas
Collaborator

I was able to reproduce this in a small program compiled with Emscripten. Then I found that the reason for this is that exceptions are disabled in Emscripten by default (see here). Enabling them made the code work as expected.

So @henryiii I think the fix is just to tell cibw to use -fwasm-exceptions when building.

@henryiii
Member Author

pybind11 adds this for you. But maybe this part doesn't use pybind11 when compiling?

Signed-off-by: Henry Schreiner <[email protected]>
@henryiii
Member Author

Now I get:

FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_bytes - ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/bytes_tes'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_string - ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/string_te'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_null - ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/null_test'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_enum - TypeError: size of array (5) is less than size of form (10)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/enum_test'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_arrays_int - ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/array_tes'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_array_string - TypeError: size of array (3) is less than size of form (6)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/array_str'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_array_enum - TypeError: size of array (3) is less than size of form (6)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/array_enu'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_Unions_string_null - TypeError: size of array (4) is less than size of form (8)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/string_nu'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_Unions_enum_null - ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/enum_null'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_Unions_record_null - TypeError: size of array (3) is less than size of form (6)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/record_nu'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_Unions_null_X_Y - TypeError: size of array (4) is less than size of form (8)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/int_strin'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_record_1 - TypeError: size of array (1) is less than size of form (2)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/record_1_'...
      )
  FAILED ../../../../home/runner/work/awkward/awkward/tests/test_1345_avro_reader.py::test_records - TypeError: size of array (4) is less than size of form (8)
  
  This error occurred while calling
  
      ak.from_avro_file(
          file = '/home/runner/work/awkward/awkward/tests/samples/record_te'...
      )
  ========== 13 failed, 24438 passed, 678 skipped in 173.64s (0:02:53) ===========
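Both kinds of failure in this log would be consistent with a buffer being read back with an item width twice as large as the one used to write it. That is only one plausible reading of the messages, not something confirmed in the thread. A minimal standard-library sketch of the effect, with a hypothetical helper name and made-up data:

```python
import struct


def reinterpret(raw: bytes, itemsize: int) -> list:
    """Decode little-endian signed integers of the given width from raw bytes."""
    fmt = {4: "i", 8: "q"}[itemsize]
    if len(raw) % itemsize != 0:
        raise ValueError(
            f"{len(raw)} bytes is not a multiple of itemsize {itemsize}"
        )
    count = len(raw) // itemsize
    return list(struct.unpack(f"<{count}{fmt}", raw))


# Ten 4-byte integers (40 bytes) and five 4-byte integers (20 bytes).
ten_items = struct.pack("<10i", *range(10))
five_items = struct.pack("<5i", *range(5))

as_written = reinterpret(ten_items, 4)     # ten values, as written
as_misread = reinterpret(ten_items, 8)     # only five values come back

try:
    reinterpret(five_items, 8)             # 20 bytes cannot be split into 8-byte items
    misread_failed = False
except ValueError:
    misread_failed = True
```

An even number of narrow items silently halves in length (compare "size of array (5) is less than size of form (10)"), while an odd number cannot be reinterpreted at all (compare the "larger dtype" divisor error).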

@ariostas
Collaborator

It should work now. It was just missing an extra check for 32-bit systems (it was only checking for x86).

@henryiii
Member Author

henryiii commented Jan 24, 2025

Would this work more portably?

#include <cstdint>
#if defined _MSC_VER || INTPTR_MAX == INT32_MAX
    // 32 bit or Windows
#else
    // 64 bit
#endif

?
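For the Python side (for example, to skip or adjust tests on 32-bit targets such as Pyodide), a comparable runtime check can be sketched with the standard library. The helper names are hypothetical; struct.calcsize("P") reports the size of a C pointer for the running interpreter, which is the runtime counterpart of comparing INTPTR_MAX with INT32_MAX at compile time.

```python
import struct
import sys


def pointer_bits() -> int:
    """Width of a C pointer (void *) in bits for the running interpreter."""
    return struct.calcsize("P") * 8


def is_32bit() -> bool:
    """True on 32-bit targets; Pyodide's wasm32 build is expected to be one."""
    return pointer_bits() == 32


bits = pointer_bits()
narrow = is_32bit()
```

Like the preprocessor version, this keys on pointer width rather than on a platform name, so it would not need changing if a 64-bit WebAssembly target appeared later.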

@ariostas
Collaborator

Would this work more portably?

Yeah, that would be better. I was also wondering whether it could become a problem later on if Emscripten starts supporting wasm64.

Up to you if you want me to change it or you go ahead.

@henryiii henryiii changed the title ci: use cibuildwheel for pyodide test fix: test Pyodide with cibuildwheel and fix 32-bit issue Jan 24, 2025
@henryiii
Member Author

#ifdef _MSC_VER
  #define EXPORT_SYMBOL __declspec(dllexport)
  #ifdef _WIN64
    typedef signed   __int64 ssize_t;
    typedef unsigned __int64 size_t;
  #else
    typedef signed   int     ssize_t;
    typedef unsigned int     size_t;
  #endif
  typedef   unsigned char    uint8_t;
  typedef   signed   char    int8_t;
  typedef   unsigned short   uint16_t;
  typedef   signed   short   int16_t;
  typedef   unsigned int     uint32_t;
  typedef   signed   int     int32_t;
  typedef   unsigned __int64 uint64_t;
  typedef   signed   __int64 int64_t;
  #define ERROR Error
#else
  #define EXPORT_SYMBOL __attribute__((visibility("default")))
  #include <cstddef>
  #include <cstdint>
  #define ERROR struct Error
#endif

Isn't <cstdint> available on Windows for a few years now?

@ianna
Collaborator

ianna commented Jan 24, 2025

@henryiii - shall I wait for this PR integration before the release? Thanks!

@henryiii
Member Author

It contains a fix for emscripten now, so I'd say yes.

@henryiii
Member Author

If it passes CI, it's ready.

Collaborator

@ianna ianna left a comment


@henryiii - Great! Thank you. CI passes, I'm merging it now.

@ianna ianna merged commit 08c7789 into main Jan 24, 2025
52 checks passed
@ianna ianna deleted the henryiii/ci/cibwpyodide branch January 24, 2025 17:01
@agriyakhetarpal

Thanks! I can now revive my branch that will reinstate Pyodide-enabled interactive docs for awkward. Apologies for not being able to help as much as I would have hoped!
