Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debye efficiency (macOS) #2882

Merged
merged 25 commits into from
Oct 24, 2024
Merged

Debye efficiency (macOS) #2882

merged 25 commits into from
Oct 24, 2024

Conversation

klytje
Copy link
Contributor

@klytje klytje commented May 10, 2024

Test branch for MacOS support of the external AUSAXS library.

@lucas-wilkins
Copy link
Contributor

Seems like its not finding it!

@lucas-wilkins
Copy link
Contributor

I'll be in the office for a little while longer if there's something you need doing.

@klytje
Copy link
Contributor Author

klytje commented May 10, 2024

I think it's working - we didn't mark our new release as the latest, so on the first run the tests were still using the old library. I forgot we have two different actions running, one for pull_request and another for push, so I accidentally only requested a rerun of the former after fixing the issue.

Edit: Yep, everything's passing. @wpotrzebowski can you check the new installer from here?

@wpotrzebowski
Copy link
Contributor

@klytje: sorry for the slow reply. It seems to be crashing on my machine.

Screenshot 2024-05-14 at 12 28 33

@butlerpd
Copy link
Member

Question: is that possibly a chip issue? I think different macs can use very different chips?

Also out of curiosity: does it really crash? I see Mac saying it crashed and offering to kill the program but "ignore" is also an option and looking at the log explorer it looks like SasView caught the crash of ausaxs and fell back to the old method and is just taking a very long time (9% of that calculation at the time of the screenshot)? If one hits ignore does it continue? If so and it is difficult to have it not crash on some chips/platforms it might be good to capture the failure better so that the user does not see the crash warning but just the log explorer popping up saying we could not use it falling back to slow method?

@klytje
Copy link
Contributor Author

klytje commented May 14, 2024

Question: is that possibly a chip issue?

Perhaps? Depends on the kind of optimizations Clang decided to use when @lucas-wilkins compiled it. I'll have to research how to specify what it can and cannot use; it's not obvious to me how to do this with universal binaries like this. The current implementation hides the actual crash message; I'll have to change this to debug further.

Also out of curiosity: does it really crash?

A new process is spawned to 'test-run' my library, which indeed did fatally crash. Perhaps the OS is picking up on this and offering to kill the rest of the program as well? Such a popup is not ideal, to say the least. I'm not sure how to get rid of it. When I tested with Linux & Windows, neither of them seemed to care about the crashing process.

If one hits ignore does it continue?

It should recover just fine - that's why we use a new process to test-run it. Any crash, no matter how severe, will not affect the main process which is thus free to continue. As you pointed out it seems to already have started the calculation using the built-in GSC.

If so and it is difficult to have it not crash on some chips/platforms it might be good to capture the failure better so that the user does not see the crash warning

That's how it's supposed to work (and how it does work on both Windows & Linux), but MacOS seems to have picked up on the crashing child process and decided to intervene all on its own.

@krzywon
Copy link
Contributor

krzywon commented May 17, 2024

As another data point, I see the same error as @wpotrzebowski. I am running an M2 processor. Attached is the MacOS crash report which might help you debug the issue.
sasview-2024-05-14-134817.txt

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

@wpotrzebowski Can you test this artifact with this script? Should be easier to debug through this than the sasview installers.

@wpotrzebowski
Copy link
Contributor

@klytje sure!
It seems to work ok

python test_shared_lib.py .
AUSAXS: Started evaluating Debye equation.
OK

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

@wpotrzebowski That's odd. I didn't change anything compared to when Lucas compiled it. What happens if you use this library with sasview? It should go in the src/sas/sascalc/calculator/ausaxs/lib folder.

@wpotrzebowski
Copy link
Contributor

Hmm, it doesn't work with just swapping lib files for the app from the installer. (Mac security-keeper is not happy about replacing files in the signed executable). I need to think if I can test it differently. We probably can disable signing for this PR (it doesn't work for the moment anyway).

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

I don't think you need an actual installer, you should be able to use it through the run.py script as long as the library is in the correct folder

@wpotrzebowski
Copy link
Contributor

I don't think you need an actual installer, you should be able to use it through the run.py script as long as the library is in the correct folder

This is how I made it work before (with the library compiled on my machine), so I know this approach works but indeed it may be worth trying with external one.

@wpotrzebowski
Copy link
Contributor

I don't think you need an actual installer, you should be able to use it through the run.py script as long as the library is in the correct folder

This is how I made it work before (with the library compiled on my machine), so I know this approach works but indeed it may be worth trying with external one.

Yes, this works but I had to allow in security settings to run it.
Screenshot 2024-05-22 at 14 39 29

Screenshot 2024-05-22 at 14 40 41

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

@wpotrzebowski I wonder why you didn't get this warning when you were running my testscript? They hook into the library in the exact same way. Could this be related to why it doesn't work in the installer?

If we fix the signing problem I can try compiling a new installer using this version of the library since we know it should work.

@wpotrzebowski
Copy link
Contributor

@klytje I don't think this is the reason. The signing script goes through the entire bundle and signs everything (I can probably check if your lib has been signed properly). I had to allow running your lib as it was downloaded from the Internet.

@wpotrzebowski
Copy link
Contributor

But disabling signing completely may help understand it. I will try to do it in the evening.

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

@wpotrzebowski I updated the Releases so all new installer runs should include the new library version.

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

Ah - the failing tests is my mistake. I recently changed the distance binning width from 0.1Å to 0.25Å, but this is apparently not good enough for the highly ordered diamond structure we're testing with. I'll fix this in the next version.

@wpotrzebowski
Copy link
Contributor

Without signing it returns the same error as for the signed package.

@wpotrzebowski
Copy link
Contributor

I´ve run test script on the library from the installer. Maybe this tells you something

 python test_shared_lib.py /Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs
Traceback (most recent call last):
  File "/Users/wojciechpotrzebowski/Downloads/macos-latest-binaries/test_shared_lib.py", line 38, in <module>
    ausaxs = ct.CDLL(str(path))
             ^^^^^^^^^^^^^^^^^^
  File "/Users/wojciechpotrzebowski/miniconda3/envs/sasview_dev/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: dlopen(/Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib, 0x0006): tried: '/Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib' (no such file), '/Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

I´ve run test script on the library from the installer. Maybe this tells you something

Can you run file libausaxs.dylib for me?

@wpotrzebowski
Copy link
Contributor

I´ve run test script on the library from the installer. Maybe this tells you something

Can you run file libausaxs.dylib for me?

file /Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib
/Volumes/SasView6/SasView6.app/Contents/Frameworks/sas/sascalc/calculator/ausaxs/lib/libausaxs.dylib: Mach-O 64-bit dynamically linked shared library arm64

@klytje
Copy link
Contributor Author

klytje commented May 22, 2024

Can you verify that the uncompressed library has both x86_64 and arm64? This is what my GitHub runner tells me after compilation:

 build/lib/libausaxs.dylib: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit dynamically linked shared library x86_64] [arm64]

@wpotrzebowski
Copy link
Contributor

Can you verify that the uncompressed library has both x86_64 and arm64? This is what my GitHub runner tells me after compilation:

 build/lib/libausaxs.dylib: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit dynamically linked shared library x86_64] [arm64]

@klytje: What do you mean by uncompressed library?

@klytje
Copy link
Contributor Author

klytje commented May 23, 2024

@wpotrzebowski The library artifact you downloaded from my repo.

I've read through the PyInstaller docs. Three observations:

  1. PyInstaller has the ability to trim multi-arch binaries to only a single arch. I am sure this is what must be happening. I don't know why it is removing the wrong arch, though.
  2. PyInstaller has a binary section where you're supposed to put binaries. I moved the library there. Previously it was in the data section. Can you try using the latest installer with this change?
  3. PyInstaller defaults to building for the architecture of the host machine unless otherwise specified. The macos-latest runners are all arm64, which explains why SasView is looking for that version of my library. But wouldn't this mean that the installer will not work with older Intel MacOS machines?

@wpotrzebowski
Copy link
Contributor

For uncompressed

file libausaxs.dylib 
libausaxs.dylib: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit dynamically linked shared library x86_64] [arm64]
libausaxs.dylib (for architecture x86_64):	Mach-O 64-bit dynamically linked shared library x86_64
libausaxs.dylib (for architecture arm64):	Mach-O 64-bit dynamically linked shared library arm64

@klytje
Copy link
Contributor Author

klytje commented May 23, 2024

Actually after checking your previous comments again, I am even more confused. The installer is definitely building for arm64 (from CI output):
image
So PyInstaller seems to be doing it right, trimming away the x86_64 version of my library (your earlier file output):
libausaxs.dylib: Mach-O 64-bit dynamically linked shared library arm64

But then at runtime, your system seems to be running SasView as x86_64? That's the library architecture it is looking for, anyway (from your earlier debugging output):
OSError: dlopen [...] libausaxs.dylib' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')

I don't understand how this is possible? How can an arm64 program be running in x86_64 mode?

@klytje
Copy link
Contributor Author

klytje commented May 23, 2024

Could this perhaps be due to the way an external Python process is spawned?
image

The first time we try to hook into my library, we do it from a spawned child process to avoid propagating crashes. But this spawns a fresh Python interpreter process. What would happen if you only have x86_64 Python installed on your system, but the process is spawned from an arm64 program?

Edit: I think we can test this by skipping the safety feature and just directly hooking into my library from the main process. @wpotrzebowski please try the new installer when you have time :)

@wpotrzebowski
Copy link
Contributor

The latest installer crashes silently.

@klytje
Copy link
Contributor Author

klytje commented Sep 9, 2024

@krzywon can you maybe give the latest installers a try since Wojciech is not available? Just try to run the generic calculator and see if it crashes

@krzywon
Copy link
Contributor

krzywon commented Sep 9, 2024

Thanks for the update, @klytje. I'll give it a try sometime this week.

@krzywon
Copy link
Contributor

krzywon commented Sep 13, 2024

I was able to test the latest build this morning. Unfortunately, I am still seeing the error pop up saying SasView stopped, but I also only see a single MacOS build. I am going to update this branch to the latest release to see if we can have parallel builds.

@klytje
Copy link
Contributor Author

klytje commented Sep 13, 2024

I was able to test the latest build this morning. Unfortunately, I am still seeing the error pop up saying SasView stopped, but I also only see a single MacOS build. I am going to update this branch to the latest release to see if we can have parallel builds.

It should already have been parallel builds? Anyway, if it's still crashing, I don't think we can get around waiting for the macbook to be shipped so I can do some proper debugging.

@krzywon
Copy link
Contributor

krzywon commented Sep 16, 2024

It should already have been parallel builds? Anyway, if it's still crashing, I don't think we can get around waiting for the macbook to be shipped so I can do some proper debugging.

I reran the build so the parallel builds were actually created. I just tested the MacOS latest build and am still seeing the error when running the debye calculation. @rozyczko - any news on the MacBook?

@klytje
Copy link
Contributor Author

klytje commented Oct 22, 2024

@krzywon or @wpotrzebowski Can you try again but running SasView from the terminal? I.e. navigating to the downloaded build and running ./SasView6.app/Contents/MacOS/sasview ? I have a suspicion of what might be going wrong.

@klytje
Copy link
Contributor Author

klytje commented Oct 22, 2024

I recompiled my own library to no longer automatically create a /temp folder on startup, and recompiled SasView to use this updated version. This seems to have fixed the crashing on my intel mac even when running the app directly. Can someone check if it also works on arm?

@krzywon
Copy link
Contributor

krzywon commented Oct 22, 2024

I was able to test this on my arm based Mac and the error message is no longer present when running any of the calculations.

@klytje
Copy link
Contributor Author

klytje commented Oct 23, 2024

Latest build confirmed working on all three platforms. I think this is ready for testing & review.

@klytje klytje marked this pull request as ready for review October 23, 2024 17:27
@krzywon
Copy link
Contributor

krzywon commented Oct 23, 2024

As noted in an email conversation, we are holding this for the next release.

@wpotrzebowski
Copy link
Contributor

Tested functionality on arm Mac and indeed it seems to work!

Copy link
Contributor

@krzywon krzywon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some thought, having this in the release would be useful for Mac users. Approved

@krzywon krzywon merged commit fb5d45c into release_6.0.0 Oct 24, 2024
40 checks passed
@krzywon krzywon deleted the mac_test branch October 24, 2024 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants