Please formally specify numeric type handling rules #170
Comments
Another thing I realized: since |
I don't see this. If the result is a float then wouldn't
I think that's actually the whole purpose.
I think its origin was mainly in the fact that without it random variation could (and often would) produce single numbers that exhaust all of memory, e.g. from a loop that performs repeated exponentiation. We have to make the environment safe from such run-killing excesses, which is also why we limit the size of values on the That explains the
Maybe that's a fine value, but if the denominator is allowed to grow exponentially then we'd be in trouble again, possibly consuming all memory and crashing the run because one program did something foolish. Conceivably, though, we'd want different limits for rationals, in which case we should ditch the generic approach and break out everything by numeric type.
No, you're looking at the underflow. If the result of |
So (not asking for a history, just proposing) why not simply check values pushed to the stacks? In other words why not state explicitly that any ( Then we can simply check each numeric value before pushing it to the stack, and change or discard it as appropriate. That separates the gate-keeping responsibility from the instructions (which are doing it wrong now), and expresses your goals better. As things are now, they are tied up inextricably with one another. |
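To make the push-time idea concrete, here is a minimal sketch in Python rather than Clojure. Every name in it (`MAX_MAGNITUDE`, `GATEKEEPERS`, `push`, `clamp_integer`, `clamp_float`) is hypothetical and chosen for illustration; none of it is taken from the Clojush source. The limit value is an assumption as well. The point is only that instructions compute results and a single `push` function decides what is allowed onto each stack.

```python
# Hypothetical limit; the real value and policy are up to the maintainers.
MAX_MAGNITUDE = 10 ** 12


def clamp_integer(value):
    """Force an integer into [-MAX_MAGNITUDE, MAX_MAGNITUDE]."""
    return max(-MAX_MAGNITUDE, min(MAX_MAGNITUDE, int(value)))


def clamp_float(value):
    """Force a float into the same range, always returning a float."""
    limit = float(MAX_MAGNITUDE)
    return max(-limit, min(limit, float(value)))


# One gatekeeper per stack; stacks without an entry accept values unchanged.
GATEKEEPERS = {"integer": clamp_integer, "float": clamp_float}


def push(stacks, stack_name, value):
    """Single entry point for pushing: the only place limits are enforced."""
    gate = GATEKEEPERS.get(stack_name, lambda v: v)
    stacks.setdefault(stack_name, []).append(gate(value))
    return stacks


def float_mult(stacks):
    """An instruction that knows nothing about limits; it just multiplies."""
    floats = stacks.get("float", [])
    if len(floats) >= 2:
        b, a = floats.pop(), floats.pop()
        push(stacks, "float", a * b)
    return stacks


stacks = {}
push(stacks, "float", 10000000.0)
push(stacks, "float", 10000000.0)
float_mult(stacks)
print(stacks["float"])
```

With this shape, changing the limit for one numeric type (or adding one for rationals) means editing one gatekeeper, not every instruction that produces that type.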
Alternately, just use exception-handling as it's intended, and capture over- and underflows and do the truncation when it occurs. So |
I still don't see it. Here's the full definition:
If |
Isn't that exactly what we do? When a numeric instruction is about to push a result to the stack it runs it through
Perhaps that's a good alternative implementation, but I'm not sure. I liked and understood exceptions ("conditions") in Common Lisp, but when I moved to Clojure I found them confusing and never got them to do what I wanted (I think because they're handled in a different context than I expected). So I've avoided them. If there's a way to get the functionality of |
@lspector you're right about the On that note: 😄
All I'm saying is that instead of having fifty different calls to a single function called I'm saying that if you want to have integers fall within a particular range, then there should be one call to the function that forces them into that range, and it should be right there where every Similarly, there should not be a
My heartfelt and carefully-considered frustration comes, to be honest, from trying to imagine how to make simple, small changes that could be incorporated into your working codebase. And failing. Because it's so tied together in knots, there's almost no single method or object that I can approach without potentially affecting a whole slew of globals and at least one thing tucked away in To avoid introducing dangerous regressions (whatever that means when there are no automated tests) the entire codebase either has to all be tested first as it is (in other words without making any changes to improve or change it), or started over from scratch. I wish somebody from your side had the resources to actually pair with me. Preferably you, since you are the final gatekeeper, and seem to have the most to gain from understanding the strengths of automated testing and the extraordinary dangers this untested and hard-to-extend codebase poses to your students and research program. :( |
I thought we were sort of doing this by having |
I would like this too, but it will be unusually tough for me in the next couple of weeks at least, since I've just taken on an administrative position, covering for a colleague who is out on maternity leave... But I will bring this up at our lab meeting tomorrow and see what we can do!
In the meantime I've kicked the wall too many times, and started this project. I'm not a good enough programmer to write any interpreter without tests, and definitely am not the man to fix this born-legacy code. As I move forward (much faster now), what I'll do is stop opening unresolvable issues here that involve architecture changes nobody can make, and start asking specifically what it is each de facto "feature" does. So for example in this case, the code defines the Push
Cool.
I guess. My real intention was more like "integers, but don't let them get big enough to crash the world." But in practice, yes.
I'm definitely late to this party (had a crazy week and saw a long thread). But, I like @Vaguery's idea of checking the legitimacy of things at pushing-to-stack time instead of in-instruction time. If we decided to do this, we could do all size checking at push time, including making sure things like strings and vectors and code don't get too big. This would definitely clean up the code base a lot. I'm a bit sad that @Vaguery is quitting on the idea of separating the Clojush interpreter from the GP. It would definitely be a very useful thing to have done. But, I agree that it's a mess of wires as it currently stands, and is not simple.
Dude, I'm totally not quitting the idea of separating the interpreter. I'm just not able to heal the current interpreter.
Oh yay! I misinterpreted you then. Are you aiming to have feature parity with the current interpreter?
Absolutely. Except with complete coverage with automated tests, and without 729 duplicates of "are there two integers on the :integer stack?" written in sixteen different dialects :). The first thing is to make it clean, and make it extensible. It will work in fundamentally different (and more stable) ways, internally, but it will not behave in any detectably different way. Except where I have found bugs in the Clojush codebase.
That said, I am still going to implement this weird-ass (no offense) hard precision limit on integers and floats as a "quirks mode" behavior, for backwards-compatibility.
OK, this really should get addressed at some point before I am able to say that the new interpreter works "the same" as the old one. At the moment (and for the foreseeable future I suppose) the behavior of Clojush is:
Summary: a "Clojush
Summary: a "Clojush

The question that I need answered is: Moving forward, do you want numeric values to be truncated this way, or do you want me to try to pursue an approach that leaves the behavior of fixed-point and floating-point numerics up to Clojure as much as possible? We agree that the problem arises because GP often tries to use large values in inappropriate situations. But Clojure's (and Java's) boxed numerics are capable of quickly calculating arbitrary-precision numbers, and are apparently smart enough to avoid doing so prematurely. The problems of memory consumption @lspector mentioned are things we can perhaps better manage by monitoring and restricting memory consumption, rather than numeric precision. The troubling thing is that there is no logging of when truncation has occurred, a situation I will fix in the new interpreter if you want to have this behavior preserved.
So I'm trying to write tests for the `Interpreter` "routing" system, which is the cascade of recognizers that send items on the `:exec` stack to the appropriate other stacks.

At the moment this is the code that handles that, but the behavior of particular programs and the results of calculations are also filtered through this kind of ad hoc thingie, and as a result there are some non-obvious consequences for numeric values calculated in the course of running a Push program.
So as far as I can see, more or less every function that "returns an `integer`" explicitly calls `keep-number-reasonable`, so the de facto type we call a Push `integer` is a kind of truncated thing. Clojure will happily treat arithmetic (and other) numeric results as either unboxed (fixed precision) or boxed (arbitrary precision) values, and it also looks as if the basics of Clojush arithmetic use arbitrary precision for results, but then inevitably apply `keep-number-reasonable` to that. Similarly, `float` values are protected by `keep-number-reasonable` from both the same overflow and also underflow: anything within some small $\epsilon$ of `0.0` is rounded down to `0.0`.

But it is also true that if a `float` result is larger than `max-number-magnitude`, then `keep-number-reasonable` will convert it to an `integer`. You don't have to look at the tests I'm writing that surfaced this problem, but can see this in the code itself. Whenever the value of `max-number-magnitude` is the default value of `1000000000000`, then the result of `(float_mult 10000000.0 10000000.0)` (or any larger `float` result) will be dropped down to the `integer` value `1000000000000`.

And because the result of `keep-number-reasonable` is used inline within numeric instructions of several types, the `integer` will be present on the `float` stack as a result of `float_mult`. I suspect at least a few more of these are lurking in there for transcendental and exponential functions as well.

So what is `keep-number-reasonable` really supposed to do? Can we be a bit more rigorous about its goal? For instance, can we pick a maximum precision for any Push `integer` value? I expect the origin of the truncation kludge is something about the way Clojure handles numeric data type arithmetic, and that you maybe were getting exceptions when `BigDecimal` results had infinite representations, but pure Java `float` primitives were overflowing and raising exceptions too?

But what do you want it to do, really? Besides "not raise exceptions"? At the moment it's injecting subtle and confusing ad hoc changes to running code. And there are no such checks and truncations on the "routers", so a program that contains `2000000000000000000000` as a value would be perfectly runnable.
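
For anyone who wants to see the float-to-integer leak without reading the Clojush source, here is a small Python sketch of the failure mode. It is an illustration under assumptions, not a port: `leaky_clamp` and `type_preserving_clamp` are hypothetical names, `MAX_NUMBER_MAGNITUDE` mirrors the default quoted above, and the value of `MIN_NUMBER_MAGNITUDE` is invented. The sketch assumes the clamp returns the limit constant itself on overflow, which is the behavior described in this issue.

```python
# Hypothetical stand-ins for the limits discussed in this issue. The maximum
# matches the default quoted above; the minimum is an invented placeholder.
MAX_NUMBER_MAGNITUDE = 1000000000000   # an int, like the quoted default
MIN_NUMBER_MAGNITUDE = 1.0e-10         # assumed value, for illustration only


def leaky_clamp(n):
    """Clamp n, returning the limit constant itself on overflow."""
    if n > MAX_NUMBER_MAGNITUDE:
        return MAX_NUMBER_MAGNITUDE        # an int, even when n was a float
    if n < -MAX_NUMBER_MAGNITUDE:
        return -MAX_NUMBER_MAGNITUDE
    if isinstance(n, float) and abs(n) < MIN_NUMBER_MAGNITUDE:
        return 0.0                         # underflow rounds to zero
    return n


def type_preserving_clamp(n):
    """Same limits, but the result always has the same type as the input."""
    return type(n)(leaky_clamp(n))


product = 10000000.0 * 10000000.0          # 1e14, above the limit
leaked = leaky_clamp(product)
fixed = type_preserving_clamp(product)
print(type(leaked).__name__, leaked)
print(type(fixed).__name__, fixed)
```

The first result is an integer produced from a purely floating-point calculation, which is how an `integer` can end up on the `float` stack. Coercing the clamped value back to the input's type is one possible fix; whether that is the desired rule is exactly the question this issue asks.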