Use sets rather than lists in compiler types #247

Merged 4 commits on Jul 25, 2024

Conversation

Contributor

@ppedrot ppedrot commented Jul 24, 2024

We still have to keep the order for some reason, but at least the filtering process now checks for membership in O(log n) rather than O(n). I am not sure this matters in practice, but at least we are no longer calling the generic structural equality on potentially arbitrary data.
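
The idea described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the module name `OrderedSet` and the functions `add`, `mem` and `to_list` are hypothetical, and only the field names `set` and `lst` echo the record visible in the review fragments. A set answers membership queries in O(log n) through a dedicated comparison function, while a list kept alongside it remembers the insertion order.

```ocaml
(* Hypothetical sketch: an insertion-ordered collection with O(log n)
   membership. The names here are illustrative, not Elpi's API. *)
module OrderedSet (Ord : Set.OrderedType) = struct
  module S = Set.Make (Ord)

  type t = {
    set : S.t;         (* used only for membership tests *)
    lst : Ord.t list;  (* elements in reverse insertion order *)
  }

  let empty = { set = S.empty; lst = [] }

  (* O(log n), and it goes through Ord.compare rather than the
     polymorphic structural equality. *)
  let mem x t = S.mem x t.set

  (* Adds x unless it is already present; returns t itself when unchanged. *)
  let add x t =
    if S.mem x t.set then t
    else { set = S.add x t.set; lst = x :: t.lst }

  (* Elements in insertion order. *)
  let to_list t = List.rev t.lst
end

module IntOSet = OrderedSet (Int)

let example = IntOSet.(empty |> add 3 |> add 1 |> add 3 |> add 2)
let () = assert (IntOSet.to_list example = [3; 1; 2])
```

The duplicate `3` is filtered out by the set lookup, and the list still reports first-insertion order.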

ppedrot added 2 commits July 17, 2024 23:39
Contributor Author

ppedrot commented Jul 24, 2024

@gares the merge_type function is a hotspot of HB for some reason. This PR doesn't solve the underlying efficiency problem, but 1. it makes it algorithmically more reasonable 2. it abstracts away the implementation and 3. it doesn't rely anymore on structural equality.

Contributor Author

ppedrot commented Jul 24, 2024

(Full disclaimer: I have no idea what purpose this list of types is supposed to have, I am just messing with the code and hinting at a suspect point.)

if set' == t.set && lst' == t.lst && def' == t.def then t
else { set = set'; lst = lst'; def = def' }

let append x t = {

Isn't this cons?

Contributor

gares commented Jul 24, 2024

I'd like to understand why you need ord all over the place.
Also, ord for constants is risky: does it do compare x y = x - y or equivalent? I'd rather write that one by hand and be sure it is efficient (no function call).
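
For reference, here is a sketch of the two hand-written styles of integer comparison mentioned above. The function names are hypothetical, and what the derived ord actually generates for constants is not shown in this thread. The subtraction form is compact, but it is only correct when `x - y` cannot overflow, which is an assumption about the range of the values being compared.

```ocaml
(* Subtraction-based comparison: compact, but wraps around on overflow. *)
let compare_sub (x : int) (y : int) = x - y

(* Hand-written comparison specialised to int: the type annotations let
   the compiler use primitive integer comparisons, and it cannot overflow. *)
let compare_int (x : int) (y : int) =
  if x < y then -1 else if x > y then 1 else 0

let () =
  (* Both agree on small values. *)
  assert (compare_sub 1 2 < 0);
  assert (compare_int 1 2 < 0);
  (* max_int - (-1) wraps around to min_int, so the sign is wrong here. *)
  assert (compare_sub max_int (-1) < 0);
  assert (compare_int max_int (-1) > 0)
```

If the constants are known to be small integers, the subtraction form is safe; otherwise the explicit form avoids the wrap-around.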

Contributor

gares commented Jul 24, 2024

merge_type function is a hotspot of HB for some reason. This PR doesn't solve the underlying efficiency problem, but 1. it makes it algorithmically more reasonable 2. it abstracts away the implementation and 3. it doesn't rely anymore on structural equality.

Thanks, your change makes a lot of sense.

At the same time, I don't get why it should be a bottleneck. In which file?
These are the lists of types, as declared by the user. If I am merging these lists over and over, then I should avoid that. I mean, they should be merged once and for all, early in the compilation chain.

TBH, the compilation chain will require some serious scrutiny and speedup, which is scheduled for this fall.

Contributor Author

ppedrot commented Jul 24, 2024

At the same time, I don't get why it should be a bottleneck. In which file?

In the first HB calls in e.g. the mathcomp-analysis/lebesgue_* files. There, merging types accounts for ~70% of the runtime of the HB call, but I've seen it in other places too. I'm not sure that the problem is the list proper; the merging of the map itself seems to be costly as well. The typical call stack goes through Compiler.Assemble.assemble, which calls ToDBL.merge_types, and that call is the main root of the issue. Rereading this piece of code, I assume there are many collisions there because of poor algorithmics.

Contributor

gares commented Jul 25, 2024

I'm looking into this. It is very weird that this is expensive, since most of these maps should be empty.

Contributor

gares commented Jul 25, 2024

I pushed a commit. I guess you have a setup where you can benchmark this; if not, I will do it myself.

In my local tests the lists/sets of types have size 1 or 2, so anything would do.
What was insane was to use Map.merge instead of Map.union.
I'm still puzzled that you got a hot spot in that branch (the one calling Types.merge), since it is called very rarely, as far as I can tell. Could it be the case you saw a merge_type and you optimized the wrong one?
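
To illustrate the difference between the two functions, here is a small sketch using the OCaml standard library `Map`. It assumes the maps in question behave like `Stdlib.Map`, which this thread does not confirm, and the conflict policy `keep_left` is hypothetical. With `merge`, the callback is invoked for every key bound in either map; with `union`, it is only invoked for keys bound in both, so the implementation is free to reuse whole parts of the inputs, which matters when most maps are empty.

```ocaml
module M = Map.Make (String)

(* Hypothetical conflict policy: keep the binding from the left map. *)
let keep_left _key a _b = Some a

(* With merge, the callback sees every key of either map. *)
let merge_with_merge m1 m2 =
  M.merge
    (fun k a b ->
      match a, b with
      | Some a, Some b -> keep_left k a b
      | Some a, None -> Some a
      | None, b -> b)
    m1 m2

(* With union, the callback only runs on keys bound in both maps. *)
let merge_with_union m1 m2 = M.union keep_left m1 m2

let m1 = M.(empty |> add "a" 1 |> add "b" 2)
let m2 = M.(empty |> add "b" 20 |> add "c" 30)

let () =
  (* Both formulations compute the same map. *)
  assert (M.bindings (merge_with_merge m1 m2)
          = M.bindings (merge_with_union m1 m2));
  assert (M.bindings (merge_with_union m1 m2) = [ ("a", 1); ("b", 2); ("c", 30) ])
```

Both versions return the same bindings; the difference is only in how much work is done to get there.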

Contributor

gares commented Jul 25, 2024

New (make tests ONLY=sepcomp_perf):

OK       sepcomp_perf1          0.23   0.00   0.23   55.1M  dune
OK       sepcomp_perf2          0.21   0.00   0.21   54.9M  dune
OK       sepcomp_perf3          0.83   0.00   0.83  275.3M  dune
OK       sepcomp_perf4          1.61   0.00   1.61  486.6M  dune

Old:

OK       sepcomp_perf1          0.27   0.00   0.27   56.6M  dune
OK       sepcomp_perf2          0.24   0.00   0.24   55.2M  dune
OK       sepcomp_perf3          1.00   0.00   1.00  270.7M  dune
OK       sepcomp_perf4          1.95   0.00   1.95  487.5M  dune

Contributor Author

ppedrot commented Jul 25, 2024

Could it be the case you saw a merge_type and you optimized the wrong one?

No, I'm confident this is the precise call stack I mentioned before.

In any case, your last commit has indeed solved the issue on mathcomp-analysis.

@gares gares merged commit b0b0d6c into LPCIC:master Jul 25, 2024
7 of 8 checks passed
@ppedrot ppedrot deleted the saner-merge-types branch July 25, 2024 13:28
Comment on lines +479 to +484
let fold t accu =
  let t' = f t in
  if t' == t then accu
  else Set.add t' (Set.remove t accu)
in
let set' = Set.fold fold t.set t.set in
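
The fragment above follows a sharing-preserving pattern: fold over the set with the set itself as the initial accumulator, and only touch it for elements that the function actually changes, as detected by physical equality. A self-contained sketch of the same pattern on a set of strings follows; `smart_map` and `rename` are hypothetical names chosen for this illustration.

```ocaml
module SS = Set.Make (String)

(* Applies f to every element, rebuilding only around the elements that
   f changes. The input set doubles as the initial accumulator. *)
let smart_map f set =
  let step x acc =
    let x' = f x in
    if x' == x then acc  (* physically unchanged: keep acc as is *)
    else SS.add x' (SS.remove x acc)
  in
  SS.fold step set set

(* Hypothetical rewriting function: renames "old" and returns every other
   input untouched, that is, the very same value. *)
let rename x = if String.equal x "old" then "new" else x

let s = SS.of_list [ "a"; "old"; "z" ]

let () =
  assert (SS.elements (smart_map rename s) = [ "a"; "new"; "z" ]);
  (* When nothing changes, the very same set value comes back. *)
  assert (smart_map (fun x -> x) s == s)
```

Returning the original value when nothing changed is what lets a caller use `==` to decide whether a new record needs to be allocated.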
