Skip to content

Re-enable Scala-Native build #394

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Re-enable Scala-Native build #394

wants to merge 1 commit into from

Conversation

lihaoyi
Copy link
Member

@lihaoyi lihaoyi commented May 24, 2025

  • ZipOps.scala just moved to src-jvm for now since we rely on Java sources
  • The ProcessHandle APIs that previously were missing are now present in scala-native 0.5.7

@LeeTibbert
Copy link

I'd like to see os-lib up and running on Scala Native again.

I took a look at two of the log files; one for linux & one for macos.

The good news is that they both have the same fundamental failure. Detecting child process exit on
linux-like is almost instantaneous. doing that on macos must deal with a timing delay. The fact that
they are both the same implies a logic error, somewhere in the call chain leading to it or in
the methods sense of "recovery" from what would otherwise be an error.

I'll have to see if I can replicate the error in a private os-lib build.

I believe I have seen this "waitpid failed: No child processes", or its close kin, before in an
os-lib context. Without jumping to conclusions before evidence, I seem to remember
that it is a race condition: where two un-synchronized threads are manipulating
the same pid.

May I assume the tests which fail on SN pass on JVM?

As I mentioned on the SN Discourse channel:

The 'obvious' fix is the treat a failed  "get the exit status of this child process" as a non-fatal "get the exit status of this possibly non-extant child process, I'll risk leaving zombie processes around."  I need to think through how that change
would reverberate.

An alternate 'obvious' fix would be to make (an ancestor of) the method in question synchronized.
I'll have to read the JDK doc for the 5 thousandth time to see what it says about synchronization
or lack there of in the Process class.

This is the work of days & weeks, not hours.

@LeeTibbert
Copy link

CI for SN PR #4349 appears in danger of succeeding.

That PR is, hopefully, a "reduction in strength" change which might help this issue.

The change is to not throw and to return an Process exit code of 1 (general error). This
should be similar to current SN WindowsProcess behavior. I have not been able to
trace far enough to see how the currently failing os-lib tests will react. There seems to
be a parameter to os-lib Process.(call?, ???) which says "ignore non-zero error codes from children".

I do not know if the cited PR will be merged. I also do not know, beyond Real Soon Now, SN 0.5.8
will be released.

My time is asymptotically zero over the next few months, but I will monitor this PR #394.

Please do let me know what you experience.

What do you think about disabling the problematic SubProcess Tests on Scala Native
until I/we can sort this out?

SN ProcessTest has a year old test for "waitfor after waitfor succeeds" so I think os-lib usage in
the wild will be OK. SubProcess on SN could be documented as "at your risk" or "occassionally throws".

@LeeTibbert
Copy link

quick update:

With sufficient thrashing, I figured out how to run the tests. I established a baseline replicating
the "No such process" part of this Issue. The tests consistently passed on JVM and failed
once every 7 to 10 runs on Scala Native. No news here for os-lib devos, but getting
that baseline was an achievement for me.

I then ran the SubProcess tests against a version of Scala Native which incorporates
a debug (printf) version of the Scala Native PR #4349. The printf's there let me
know that I was actually executing the code I thought I was execution. The result
of the one run I was able to do before running out of time for this session was that
I had 0 SubProcess failures in 50 runs.

In a future session, I want to run those tests in a loop for a few thousand executions.

I did consistently see a 'isExecutable` test failure in another group of tests.
I whistled past it because I did not want to loose my focus on the "No such process"
error.

I mention that failure here to help set expectations. The SN PR #4349 should reduce
the number of errors in this issue, but it appears to not be a total fix.

When the ECHILD (No such process) concerns settle, I'll try to chase the isExecutable failure.

@LeeTibbert
Copy link

LeeTibbert commented May 29, 2025

Status: 2025-05-29

I switched my attention to the failing "test.os.FilesystemMetadataTests.isExecutable.0",
since that seems to be the blocker once SN PR #4349 becomes generally available.

  • I have not intentionally run the previously failing "ProcessHandle" tests as "testOnly".
    They run as part of an overall ".test". I have not seen any failures there in the 15 or so
    full ".test" runs I've done today.

  • As previously reported, there is a ""test.os.FilesystemMetadataTests" error in the
    scala native build log here, specifically for "isExecutable". I can replicate that manually.

  • FilesystemMetadataTests.scala uses an idiom test - prep { wd =>. I could not find
    a description of that idiom in the mill or utest documentation or via Google in economic
    time. Can I take that to mean "execute the prep task and then execute this test"?

  • It looks like TestUtil.scala#prep copies the "misc/echo" file using the code below.

    By my reading, nothing on that like specifies COPY_ATTRIBUTES.

    The 'isExecutable' test fails on SN because the destination file indeed
    has does not have the executable bit set.

    The JVM appears to copy the attributes, including the execute bit, unbidden.

       override def visitFile(file: Path, attrs: BasicFileAttributes) = {
          Files.copy(file, directory.resolve(original.relativize(file)), LinkOp\
tion.NOFOLLOW_LINKS)
          FileVisitResult.CONTINUE
        }
  • I also have to study if that copy interacts with a non-default user umask, say 0077 instead of 022.

  • My next step is to create SN test cases to see how JVM & SN handle Files.copy(Path, Path, COPY_OPTIONS) with
    on a origin file with the user execute bit set. I believe that Files.copy has some variance from the JVM when
    it comes to file timestamps. I'd expect the permission bits to be set correctly.

  • The JVM 24 javadoc is pretty unclear, at least to me, as to what happens when COPY_OPTIONS is not set.
    I'd expect the os or current process defaults to be used. JVM evidence suggests elsewise.

  • An immediate is for the "os-lib" "isExecutable" test to explicitly set the "x" bit
    if it expects it, similar to the way it is setting expected the expected bits for "File.txt" in the same test.
    os.perms.set(wd / "File.txt", "rw-rw-rw-"). This would make the os-lib test more robust and a test
    of the executable bit, not of Files.copy() behavior.

    • This would be a "point" fix for that specific case but would not address possible other occurrences of the
      same.
  • I can also explore in a sandbox what happens if prep is changed to specify COPY_OPTIONS. That is a more general
    fix and what was probably intended in the first place. This is probably a better long term fix. It makes it easier
    for readers of the "prep" code to follow what the maintainer intended/intends.

@LeeTibbert
Copy link

Status: 2025-05-30

@lihaoyi @lefou

I have come to the point where I would like some guidance, as your time permits. Thank you.

  1. Making the change below to FilesystemMetadataTests#isExecutable allows both
    mill "os.native[3.3.5].test and mill "os.jvm[3.3.5].test to successfully complete.
    The Scala Native version was a private build of 0.5.8-SNAPSHOT, which contains
    the fix from SN PR #4349.

    If you give the go ahead, I can create an os-lib PR. That should allow os-lib
    native builds to pass unit-tests once SN 0.5.8 is available. I do not know the
    schedule for that release.

  test("isExecutable") {
      test - prep { wd =>
        if (Unix()) {
          os.perms.set(wd / "File.txt", "rw-rw-rw-")
          os.isExecutable(wd / "File.txt") ==> false
          os.perms.set(wd / "misc/echo", "r-xr--r--") // THIS IS THE CHANGE
          os.isExecutable(wd / "misc/echo") ==> true
        }
      } 
  1. TestUtil#prep can be altered to allow copy attributes. With that change alone and no isExecutable change
    both native and JVM complete test runs succeed.

    The change is messay and betrays my lack of a deep JVM understanding, so I an not recommending it this time.
    I do think it is worth considering in the mid-range, after I have studied why copying link attributes fail.

  2. As a starting point, I think that the semantics of TestUtil#prep copies should be close approximation
    of os.copy().

    -- That invites the question "What are the intended semantics of 'os.copy()' with respect
    to copying attributes, especially file permissions?".

    -- I read the nicely done os-lib repository README but did not see the semantics of
    os.copy() described. Since that is, I think, meant as an intro, I also scratched around
    for a Advance Topics link or such. Sorry if I missed anything, especially anything obvious,
    in my haste.

  3. There may or may not be two or more bugs in the Scala Native Files.copy() implementation.
    I need to do a focused study of that method. I'd like to get this os-lib build succeeding before
    I do such a time-consuming recursive edit. IMO, the isExecutable fix above in os-lib is worthwhile
    in any case.

@LeeTibbert
Copy link

Status: 2025-06-07

@lihaoyi @lefou @ekrich @WojciechMazur

FYI

  • I built this branch of os-lib using the publicly available repository of the recently released
    Scala Native 0.5.8 and the FilesystemPermissionsTests.scala#isExecutable() edit suggested
    in the entry just above. All looks good to me.
    2025-06-07 07:07 -0400

    These all work for me, JVM 24 Zulu:

    - mill "os.native[3.3.5].compile"

    - mill "os.native[3.3.5].test.testOnly" "test.os.SubprocessTests"

    - mill "os.native[3.3.5].test.testOnly"  "test.os.FilesystemMetadataTests"

    - mill "os.native[3.3.5].test"

    - mill "os.jvm[3.3.5].test"
  • Please advise if there is anything I can do, such as submitting a PR for the
    isExecutable change above.

    I think we are all on the same page that it would be nice to see os-lib built
    for a current version of Scala Native.

  • It is unfortunate that a Scala Native specific change/workaround is needed. I am still
    studying SN javalib Files.copy() to determine proper behavior. The isExecutable edit
    is probably applicable in any world.

@LeeTibbert
Copy link

Once Scala Native PR #4383 (or successor) is merged, any
change to FilesystemMetadataTests#isExecutable should no longer be necessary.

After that PR is merged, I hope to again attempt building os-lib with a Scala Native
build. I expect it to succeed, as my previous SN os-lib builds have,
but want to rule out that "one last build breaker".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants