-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve minimize performance #94
base: master
Are you sure you want to change the base?
Conversation
I'm not sure I understand. It may be the case that a proof is shortened by statements using constants not appearing in the current proof. Maybe you know in advance that such shortenings will not be caught by the current minimize_with program, so you're saying we might as well discard them from the beginning ? |
Yes, minimize_with X tries to use X at one of the steps of the proof, so naturally all constants used in X must be already present in the proof of our statement. I can imagine a multi-step subproof using some new constants yet shortening the proof of the original statement, but minimize_with can't find those, unfortunately. |
src/metamath.c
Outdated
|
||
/* Fill the usefulStatements array */ | ||
for (k = 1; k < g_proveStatement; k++) { | ||
usefulStatements[k] = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to preset it to 0, so you dont need all the usefulStatements[k] = 0 instructions for the early-out-of-loopbody instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
for (i = 0; i < mlen; i++) { | ||
if (g_MathToken[mString[i]].tokenType == (char)con_) { | ||
if (!g_MathToken[mString[i]].tmp) | ||
usefulStatements[k] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be followed by a break; instruction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
It's certainly possible to add one. The usual approach is a simple data structure with a count of the "number of elements" and a pointer to the (beginning of the) array elements; if not NULL it needs have been allocated (e.g., with malloc). To resize, call realloc() to resize the array, move things as necessary, and change the count. I did a quick Google search and found this one, implemented as OSS: https://github.com/eteran/c-vector |
https://github.com/eteran/c-vector |
We could move to C11 if that's helpful. The key is compiler support in deployed systems. GCC seems to have good C11 support. However, I don't think that helps. C11 doesn't have a vector type as far as I know, and https://github.com/eteran/c-vector doesn't require C11. We should write code that is compatible with C11, but I think we should only require C11 if there's advantage to it. I didn't see any advantages in this list of C11 additions |
See for example https://en.cppreference.com/w/c/io/fprintf to see how much C has developed since 1989. Functions like snprintf really make sense. Or // style comments.
This is the command executed by the build.sh script. It does not check for ANSI C. The option |
I agree snprintf and // are useful, but I those are C99 additions, not C11. I'm not opposed to switching to a later C spec, I just want to make sure that a spec move is justified and that it is widely implemented by commonly-available & distributed compilers. We should at least move up to C99, no question in my mind!!! C99 is very widely supported & I think we're already using its facilities. The only question is if we should move to C11 or not. |
We are already on |
It is, but is it worth doing so for one application? Also, metamath-exe uses its own memory management system, for example, in code I used |
Indeed: CONTRIBUTIONS.md states this literally. |
/* usefulStatements is a boolean array, where usefulStatements[k] == 1 | ||
iff statement k might be useful in minimization */ | ||
char *usefulStatements; | ||
usefulStatements = poolMalloc(g_proveStatement*sizeof(char)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use bool
, true
, false
as advised in #17 (comment) ? Then a few statements below have minor modifications, e.g., usefulCount += usefulStatements[k];
has to be replaced by something like if (usefulStatements[k]) {usefulCount += 1}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think originally I used char
for consistency, for example, the .tmp
field uses 0
and 1
instead of bool
. As a positive side effect, usefulCount
can be updated without branching (although I don't think it makes any great difference in this case).
This is an implementation of the idea presented in #10 . One precomputes the set of math constants used in the proof of the minimized statement (it is stored in .tmp field of g_MathToken in the code), and for each statement we check if all its constants lie in this set. Statements which pass this validation step are marked with 1 in the boolean array usefulStatements. Then the usual back-and-forth minimize_with loop is run, but now it checks only "useful statements".
If minimize_with is supplied with the /VERBOSE option, it will output the number of potentially useful theorems at the beginning.
This optimization reduces the minimization time twofold. For example, on my computer
smfsupmpt
requires 94s to minimize with the vanilla version, and 42s with this optimization;ssmapsn
requires 32s without and 13s with the optimization.One downside of my code is the allocation of the usefulStatements array, I would prefer it to be an expandable vector (not a boolean array), but as far as I understand metamath codebase doesn't have a vector implementation.