Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[native_assets_cli] HookConfigBuilder.checksum should exclude some keys #1803

Closed
dcharkes opened this issue Dec 11, 2024 · 5 comments
Closed
Labels
P2 A bug or feature request we're likely to work on package:native_assets_cli type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@dcharkes
Copy link
Collaborator

The refactoring in #1643 does a blanket checksum over the whole json representation of the config, this leads to the build-directory changing and cache disappearing for various config elements that should not have this effect:

If any of these change, we're likely not interested in any previous invocations of the same hook with the previous versions. (Note for config.codeConfig.targetIOSSdk we do actually want to cache the simulator and device separately, because you'd actually switch between building those in a Flutter project.)

I worked around linkConfig.assetsForLinking by applying setupLinkConfig after computing the hash, but that doesn't work for setupCodeConfig, because some parts should be taken into account for the hash while others don't.

Possible solutions:

  1. Keep a json and a jsonForChecksum.
  2. Keep a listOfPathsForChecksum

Side note, currently the checksum is dependent on the order in which the separate parts are added. This is a bug, the maps in the JSON should be regarded as inherently unsorted.

Possible solutions:

  1. recursively sort before hashing
  2. Use a hashing algorithm for sets

cc @mkustermann

@dcharkes dcharkes added type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) P2 A bug or feature request we're likely to work on package:native_assets_cli labels Dec 11, 2024
@dcharkes dcharkes added this to the Native Assets v1.0 milestone Dec 11, 2024
@mkustermann
Copy link
Member

The refactoring in #1643 does a blanket checksum over the whole json ... this leads to the build-directory changing and cache disappearing for various config elements that should not have this effect:

It should exclude the out / out_shared members I believe, doesn't it?

cache disappearing for various config elements that should not have this effect:

  • cCompilerConfig.*

How often does the C compiler config change? Does it matter at all?

It's the safest thing to do to re-run the hook in a clean slate if the config (anything in the config modulo out / out_shared) changed. The hook doesn't even have to think about doing internal invalidation for things changing in the config. Imagine the hook would be invoked in the same directory but with different android api levels, then it would need to remember which one it used last time and basically invalidate bits and pieces that depend on api level. Seems complicated and error prone.

If there are expensive operations that one wants to cache across different configs, then users have the out_shared they can use for that. That's why we have it in the first place, no?

Also in the future, if we add metadata support (maybe it's still there?) we'd run it from clean state if metadata changed. If we don't do this, then again the hook itself needs to deal with this internally somehow, remember last metadata new metadata and magically know what needs to be invalidated.

Basically it's a trade-off of how much the system takes care of caching/invalidation vs the individual hooks. I think the way it's now is a reasonable sweet spot: For a typical developer they will flutter run many times and it will use the same config every time, so the system ensures to use same directory & developer gets incremental builds. And a hook author doesn't have to guard against being invoked in a "dirty" directory that matches a different configuration from a previous run and has to deal with that somehow.

  • linkConfig.assetsForLinking

This particular one we could explicitly exclude, yes. Currently it will only use a new directory if the set of assets changes (or the file uris in the assets), but not if the assets have different content.

Though as bugs like #1412 show reusing the same dir leads to bugs. So this is a "safer" solution.

I'm open to excluding that one.

If any of these change, we're likely not interested in any previous invocations of the same hook with the previous versions.

Exactly, so we don't benefit from an old populated out directory (normally we benefit because we have an incremental build, but if api levels change we have to compile things from scratch anyway).

Side note, currently the checksum is dependent on the order in which the separate parts are added. This is a bug, the maps in the JSON should be regarded as inherently unsorted.

That's not a bug as it causes no trouble. Also, a tool like flutter does generate them in a deterministic order (for same config it will always generate them in the same order and the checksum will therefore be the same) - only if the flutter SDK was updated may it change the order, which isn't a problem.

If we want to keep it sorted (and we really don't have to) we should do it in one place instead of assuming N different places in the codebase happen to remember to sort the maps, I've filed #1650 for this some time ago.

Possible solutions:

Keep a json and a jsonForChecksum.
Keep a listOfPathsForChecksum

Or leave it like it is :)

Possible solutions:

recursively sort before hashing
Use a hashing algorithm for sets

Or leave it like it is :)

@dcharkes
Copy link
Collaborator Author

Also in the future, if we add metadata support (maybe it's still there?) we'd run it from clean state if metadata changed. If we don't do this, then again the hook itself needs to deal with this internally somehow, remember last metadata new metadata and magically know what needs to be invalidated.

With that kind of logic we should also run from a clean state if the file system changed. Basically preventing hooks from doing any kind of internal caching when being reinvoked. If we move meta data to be files it's the same spiel, if the contents of the metadata change, we reinvoke the hook. It's up to the hook to inspect whether files changed and to do internal caching if they want. Whether metadata is in the config or in a file shouldn't change whether caching is reinvoked or not.

If there are expensive operations that one wants to cache across different configs, then users have the out_shared they can use for that. That's why we have it in the first place, no?

With that kind of logic, why do we even try to provide them the same dir on subsequent invocations of flutter run, then we should just always provide a random temp dir for out.

We discussed at length to preserve the outdir in #1593 and #1375.

If we go the direction of pushing users to use the out_shared, then also all users start to have to worry about locking. And users would have to make subdirs per config entry they are interested (e.g. $outdirShared/$os/$architecture). Having the convenience of the outdir being not shared by default is a safer default for users.

Okay, with that kind of argument, target OS versions should also influence it. (I guess I just convinced myself to include everything.)

Or leave it like it is :)

We're taking a checksum of an unordered collection and depend on the order for the checksum, that's just plain wrong. We can address it by #1650 indeed.

Okay, closing this issue as WAI. (Which means metadata caching semantics changes if we switch to files. #1251) The checksums being influenced by order can be solved as part of #1650.

@mkustermann
Copy link
Member

@dcharkes I'm just expressing my own view. If there's good reasons we should do it differently, we should re-open and discuss more!

IMHO we should optimize the case where the out folder contains cpu-intensively computed things that can be re-used in a follow-up build. Then re-using the same out folder will provide a faster incremental build (e.g. if a c source file changed, we have faster incremental builds). Though if we have to re-do all the work again anyway (e.g. because android target abi levels change / new c compiler / ... - require recompiling everything anyway) then the new build will not be incremental, it's going to be a full build. So having the same out dir doesn't buy us much.

That being said, I am a little concerned of us creating lots of directories that users eventually need to rm -rf because they run out of space. But this is somewhat similar to flutter builds already I think, they also accumulate large temp directories with hashes that have to be user-removed after a while. Especially this point here may be an argument for excluding the assetsForLinking in the checksum.

@mkustermann
Copy link
Member

Side note:

We're taking a checksum of an unordered collection and depend on the order for the checksum, that's just plain wrong. We can address it by #1650 indeed.

It's actually an ordered collection not an unordered collection: Our maps guarantee iteration order being insertion order (that's why deterministic code building a json will iterate keys in the same order as they were inserted and produce the same hash)

@dcharkes
Copy link
Collaborator Author

That being said, I am a little concerned of us creating lots of directories that users eventually need to rm -rf because they run out of space.

That would be a second benefit of including less of the JSON into the checksum indeed.

Presumably, the asset-specific-config-provider would know which parts of the config would cause unnecessary creation of new output directors. But I'm not entirely convinced of this yet.

It's actually an ordered collection not an unordered collection

(Right, we don't have the equivalent of unordered_map, which does not allow us to express precisely what we care about. We're over-specifying.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 A bug or feature request we're likely to work on package:native_assets_cli type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)
Projects
Status: Done
Development

No branches or pull requests

2 participants