Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include the GHC "Project Unit Id" in the cabal store path #9326

Closed
wants to merge 1 commit into from

Conversation

sol
Copy link
Member

@sol sol commented Oct 12, 2023

  • This allows the use of several API incompatible builds of the same version of GHC without corrupting the cabal store.
  • This relies on the "Project Unit Id" which is available since GHC 9.8.1, older versions of GHC do not benefit from this change.

This PR is an alternative to #9325. It changes the store path from e.g. [...]store/ghc-9.8.1 to [..]store/ghc-9.8.1-f7d8.

[fixes #8114]

@sol sol force-pushed the abi-tag-in-store-path branch 4 times, most recently from 63c7fc3 to 23b1e7a Compare October 12, 2023 11:30
@sol sol marked this pull request as ready for review October 12, 2023 11:32
@sol sol requested a review from hasufell October 12, 2023 11:36
@andreabedini
Copy link
Collaborator

This design seems reasonable, perhaps worth spending a moment to make sure it does not backfire with some unexpected consequences ☺️

I am a bit surprised to see how far down we need to bring Compiler. I never really liked the way StoreLayout is passed around. It's not modular, it's passed in the same form everywhere even if a single path is actually needed.

I'll have another look tomorrow or early next week.

@hasufell
Copy link
Member

This design seems reasonable, perhaps worth spending a moment to make sure it does not backfire with some unexpected consequences ☺️

Does anything rely on the shape of the hashes? Some 3rd party tool?

@gbaz
Copy link
Collaborator

gbaz commented Oct 12, 2023

Of the two proposed tickets to address this, I think this is the cleaner one conceptually, though the amount of code necessarily touched is unfortunate.

@andreabedini
Copy link
Collaborator

andreabedini commented Oct 16, 2023

As we discuss this topic, I noticed today that the platform is present in the cabal hash but not in the store path. This is different from what we do with dist-newstyle where we have dist-newstyle/build/x86_64-linux/ghc-9.4.7/....

Perhaps this is something to consider togheter with the abi-hash.

Edit: On a second thought. I think I'd prefer adding the platform to the path, just like we do with dist-newstyle, and adding the ghc abi hash to cabal hash, rather that to the store path. This has a simple structure and a clear meaning to the user.

A power user can still delete all cached entries from a particular GHC by grepping for the GHC ABI tag. cabal-install could also provide some store-management commands to help.

@andreabedini
Copy link
Collaborator

Does anything rely on the shape of the hashes? Some 3rd party to1ol?

This PR changes the store path, not the hashes. Am I misunderstanding?

@andreabedini
Copy link
Collaborator

Apologies if I keep commenting. I was under the impression we were going to include a full hash, while GHC does not actually provide one but only ("Project Unit Id","ghc-9.8.1-5640").

@hasufell
Copy link
Member

Stack indexes like this: snapshots/x86_64-linux/56dcb4c9ff9dd05479063f17af7f28176fa92fad90f4c0a1da92505d2a5f0010/9.2.7/

The hashing is here: https://github.com/commercialhaskell/stack/blob/bffa6d8cd33620fdf962d1d16fca0eb92cea7549/src/Stack/Build/Source.hs#L144-L183

@hasufell
Copy link
Member

A power user can still delete all cached entries from a particular GHC by grepping for the GHC ABI tag. cabal-install could also provide some store-management commands to help.

I'm a power user and I have no idea how to do that. I don't want to think about nix hashes at all. I want to delete directories.

We should add the hashes to the nix store AND the path. There's no sensible sharing of artifacts possible between two different GHC ABIs. Directory structure is unix interface.

@andreabedini
Copy link
Collaborator

Let's make a summary.

The goal is to prevent cabal from linking togheter two packages compiled with incompatible binary distributions of GHC. Given cabal only tracks the compiler id in the package hash, it is not able to tell the difference between them; and it will use a package compiled with bindist1 as a dependency of a package being compiled with bindist2.

After some discussions, the proposed solution seems to be the following:

  1. store/ghc-9.8.1/ -> store/x86_64-linux/ghc-9.8.1-5640

The platform is just like distParamPlatform.

The current implementation creates ghc-9.8.1-5640 as follows:

compilerId = CompilerId GHC ghcVersion
compilerAbiTag = maybe NoAbiTag AbiTag (Map.lookup "Project Unit Id" ghcInfoMap >>= stripPrefix (prettyShow compilerId <> "-"))

storeDirectory compiler =
  storeRoot </> case compilerAbiTag compiler of
    NoAbiTag -> prettyShow (compilerId compiler)
    AbiTag tag -> prettyShow (compilerId compiler) <> "-" <> tag

This feels like a roundabout way (stripping the prefix and putting it back later) and making the assumption that Project Unit Id does not change format is a bit dangerous. I am not sure what else to do though. @ulysses4ever @Kleidukos @Mikolaj any ideas?

  1. compilerAbiTag (if present) is also added to PackageHashConfigInputs (so far it only includes CompilerId).

I would be ok with this approach.

On the other hand. I am thinking that the problem might magically disappear in GHC 9.8.1 anyway (note that the above does not solve anything for GHC < 9.8.1 since before 9.8.1 GHC does not report a Project Unit Id).

I am suggesting this because in the GHC 9.8.1 bindist the boot packages have hashes in their unit-id:

❯ ghc-pkg-9.8 field base id
id: base-4.19.0.0-c1f2

which automatically become part of cabal-hash.txt

❯ cat ~/.cabal/store/ghc-9.8.1/recursion-schemes-5.2.2.5-096d5be76b146a7451245d55b00e825d439de1f5f38035fef98d58717306ba07/cabal-hash.txt
pkgid: recursion-schemes-5.2.2.5
component: ComponentLib
src: 5cb79bd5d6dd5a0adf61ccc37a93c4fcfaeb6077f60a975c895feb32744d97ec
pkg-config-deps:
deps: base-4.19.0.0-c1f2, comonad-5.0.8-26bba58834fdfe953c6c90a6ef050bac0853862fe18a710c09715b3c29224e79, containers-0.6.8-3ad8, data-fix-0.3.2-66268e4eff8c2b635e86cd1ff1dcd4a3e9d93b6b37e50d8021849fc34c233314, free-5.2-c2bcd459c60818c125bce8f9c43cfbe0fd5a27a7c844aad5a140ddcfe113d115, template-haskell-2.21.0.0-32bc, th-abstraction-0.6.0.0-2a7c5dce902ff7623b3dd30662122a08ae46f7cb08fbe2488264492afb84279c, transformers-0.6.1.0-0c63
compilerid: ghc-9.8.1
platform: x86_64-linux
flags: +template-haskell
shared-lib: True
stripped-exe: False
debug-info: 0

Could this be enough to prevent the binary compatibility issues we want to avoid?

@bgamari @Ericson2314 @angerman can you comment from the GHC side of things?

@hasufell
Copy link
Member

Let's not try to be too smart. It is the right thing to do to isolate compilers that have zero ABI compatibility.

For GHCs that don't provide Unit ID, we can employ similar hashing like stack. I linked it above.

@andreabedini
Copy link
Collaborator

Let's not try to be too smart. It is the right thing to do to isolate compilers that have zero ABI compatibility.

😒 I don't think I have said or suggested anything smart. I was trying to define what you mean by "compilers that have zero ABI compatibility".

For GHCs that don't provide Unit ID, we can employ similar hashing like stack. I linked it above.

I still don't know how that is computed. If we want to be very sure what about using sha256sum --binary $(which ghc)?

🤷

I won't contribute to this any further. I will be ok with whatever solution is accepted.

@hasufell
Copy link
Member

I still don't know how that is computed.

  • it hashes the output of ghc --info (this is a nice approximation prior to the unit ID, although in some cases too weak and in some cases too eager)
  • compiler path (I think that is in fact inappropriate)

I don't think I have said or suggested anything smart. I was trying to define what you mean by "compilers that have zero ABI compatibility".

I think relying on boot libraries abi hashes is somewhat "smart" (we do that in the HLS wrapper script in fact, but because there was no exposed GHC ABI at that time). We also don't know exactly how that will pan out if we ever have proper decoupling of GHC and base (reinstallable base).

- This allows the use of several **API incompatible builds of the same
  version of GHC** without corrupting the cabal store.
- This relies on the "Project Unit Id" which is available since GHC
  9.8.1, older versions of GHC do not benefit from this change.

[fixes haskell#8114]
@mpickering
Copy link
Collaborator

I think @andreabedini is right to observe that including the compiler ABI hash in the package hash would be sufficient to fix this problem. Remember, nix builds literally everything into the same store without any issues.

If we remember the purpose of the separating out the build artifact in per-version directories is based on the incorrect assumption that there will only be one, compatible ghc version installed and used on your system the natural progression of this idea is to do what is done in this MR and increase the resolution of this distinction to distinguish between GHC's of the same version but different ABI.

I agree with Julian that it's more ergonomic to maintain the per-compiler directories for manually gc-ing build artifacts.

Therefore big 👍 from me on this direction of travel.

@mpickering
Copy link
Collaborator

Anything which can be done to move this PR forwards?

@alt-romes
Copy link
Collaborator

@sol thanks for putting this forward. Would it be okay for me to try and take it over the finish line?
Do let me know if you intended to pick this up again soon.

I agree with @mpickering's summary of the discussion:

  • It is likely sufficient for the ghc unit id to be included in the package hash to avoid corrupting the cabal store from incompatible ghcs
  • As @hasufell put it, I agree it is better for this to be simultaneously reflected in the directory structure, as the current PR already does.

alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 16, 2024
This complements the previous commit in order to fix haskell#9326
@alt-romes
Copy link
Collaborator

I've rebased this commit and additionally included the ABI hash in the package hashes in #9618.
It should subsume this PR. I've kept your commit there @sol. Hope that is fine.

@alt-romes alt-romes closed this Jan 16, 2024
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 16, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 16, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 16, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 16, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 17, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 17, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 17, 2024
This complements the previous commit in order to fix haskell#9326
alt-romes added a commit to alt-romes/cabal that referenced this pull request Jan 17, 2024
This complements the previous commit in order to fix haskell#9326
@sol sol deleted the abi-tag-in-store-path branch January 19, 2024 09:52
@sol
Copy link
Member Author

sol commented Jan 19, 2024

Would it be okay for me to try and take it over the finish line?

Yes, please.

@alt-romes
Copy link
Collaborator

@sol I've finished it in #9618.

Mikolaj pushed a commit to alt-romes/cabal that referenced this pull request Jan 19, 2024
This complements the previous commit in order to fix haskell#9326
erikd pushed a commit to erikd/cabal that referenced this pull request Apr 22, 2024
This complements the previous commit in order to fix haskell#9326
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cabal does not consider GHC ABI
6 participants