Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unique names for generated "test.fs" files #18308

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

majocha
Copy link
Contributor

@majocha majocha commented Feb 10, 2025

WIP, just to see how many things break.

See: #18049 (comment)

Main problem is that the hardcoded "test.fs" makes it into expected IL as implicit module name (?) in many VerifyIL tests.

Copy link
Contributor

github-actions bot commented Feb 10, 2025

✅ No release notes required

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

I don't really like the approach in this PR.
I wonder what it would take to implement what the comment here says instead:

// Note: we should likely adjust this code to always normalize. However some testing (and possibly some
// product behaviour) appears to be sensitive to error messages reporting un-normalized file names.
// Currently all names going through 'mkRange' get normalized, while this going through just 'fileIndexOfFile'
// do not. Also any file names which are not put into ranges at all are non-normalized.
//
// TO move forward we should eventually introduce a new type NormalizedFileName that tracks this invariant.
member t.FileToIndex normalize filePath =

And if it would help?

@KevinRansom
Copy link
Member

What are you trying to solve?

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

There's a problem with FileIndexTable that Martin is hitting in his PR:

A week ago I found that some random test failure that appeared in some of the CI runs were no longer flaky tests that everybody experiences (like a few weeks ago), but are specific to this PR. The reason turned out to be that, for a certain class of tests, global resources (in this case, the FileIndex tables) are kept across tests but are used in ways that were not foreseen (in this case, using different source files with the same name). Therefore these tests are not suited for the parallelization that was recently introduced. It seems that for other uses of FileIndex (like for #line directives, or for file names for error messages) this issue is not visible or just very rare, because of low test coverage.

It seems it is fundamentally non thread safe :(

@KevinRansom
Copy link
Member

You mean the compiler caches based on just filename rather than the path?

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Yes, so it seems.

@KevinRansom
Copy link
Member

Well, that is a compiler bug then.

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Yes, exactly! Working around it doesn't seems right to me.

@KevinRansom
Copy link
Member

For IL verify, because the module name appears in the output I try to use
withName "MyNameForTest"
and
|> withOutputContainsAllInOrderWithWildcards [
"All Classes and Methods in*MyNameForTest.exe Verified."

rather than rely on generated names.

Someone should add a bug and repro for the FileIndex issue. I wonder what it means for deterministic builds though.

@Martin521
Copy link
Contributor

Aren't these two different problems?

  • FileIndex seems to be computed based on file name only instead of path
  • Some test sets use (with the same checker instance) that same file name "test.fs" in multiple tests

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Aren't these two different problems?

  • FileIndex seems to be computed based on file name only instead of path
  • Some test sets use (with the same checker instance) that same file name "test.fs" in multiple tests

Yes, right. I have to take a deeper look again.

//static do testFileName.Value <- "test"
static member internal MakeTestFileNameUniqueForThisTestCase() =
testFileName.Value <- $"{Interlocked.Increment &counter}{Path.PathSeparator}test"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you didn't want to use Path.PathSeparator, which is "A platform-specific separator character used to separate path strings in environment variables."

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

@Martin521 I'll try to redo it with more limited approach, only touching the places where the checker is used directly. (i.e. ParseAndCheckFileInProject). Other code paths using checker.Compile actually do commit the files to a unique temp directory first, so they should be good.

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Ok, so I just made the names of synthetic sources (that never got written to disk) unique.
But now I wonder if this is necessary at all.
The checker checking different sources under the same filename is equivalent of checking many quick changes in the same file. This should never break anything, even when executed in parallel. If it does break, it is a bug, not a test suite issue.

@Martin521
Copy link
Contributor

The checker checking different sources under the same filename is equivalent of checking many quick changes in the same file. This should never break anything, even when executed in parallel. If it does break, it is a bug, not a test suite issue.

Well, if it is done in parallel, one test might use the (intermediate) FileIndex-based results of another one. In "my" case, one might skip a warning because the other one had a "#nowarn". This is a rare race condition, but it is happening. It is at least my explanation for runs like this one that show errors that did not appear before or after.

I would love to test your code in my PR.

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Ok, then it seems my approach is wrong again. Probably we should just give a unique ProjectId or Stamp:

ProjectFileName = "Z:\\test.fsproj"
ProjectId = None
SourceFiles = [|"test.fs"|]
OtherOptions = Array.append testDefaults assemblies
ReferencedProjects = [||]
IsIncompleteTypeCheckEnvironment = false
UseScriptResolutionRules = false
LoadTime = DateTime()
UnresolvedReferences = None
OriginalLoadReferences = []
Stamp = None

each time we generate a test project.

@Martin521
Copy link
Contributor

Probably we should just give a unique ProjectId or Stamp

How would that help? Does it change the file name?

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

It instructs the service that we have different project. I think it shoud create separate incremental builders for each, or in case of Transparent Compiler, cache the intermediate results correctly, because project snapshots are different.

@Martin521
Copy link
Contributor

Martin521 commented Feb 10, 2025

But as long as it runs in the same process, we have the same global FileIndex table, no?

(Which means also that my remark above about "the same checker instance" does not make sense. It is about the same process, I think.)

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

Yes! I'm not familiar with that stuff, but I think parsing and lexing operates on a lower level that is oblivious of the checker, and uses global state that is somehow not fully threadsafe. If it comes out that having a file with same name in two different projects leads to errors in parallel checking, this is really a compiler bug.

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

It would be great to have a small clean repro.

@Martin521
Copy link
Contributor

In all production scenarios it is not a problem, because a file with the same path is the same file. The problem appears only in testing.

@Martin521
Copy link
Contributor

It would be great to have a small clean repro.

Let me try to create that

@majocha
Copy link
Contributor Author

majocha commented Feb 10, 2025

It seems when this is called, there is no requirement for fileName to be absolute path:

lexbuf.EndPos <- Position.FirstLine(FileIndex.fileIndexOfFile fileName)

I wonder if it is sufficient to have two different files with same name in two different projects for this to break.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: New
Development

Successfully merging this pull request may close these issues.

3 participants