Refactor ECMA-SL Operators to Use External Functions #146

andreffnascimento · 2024-07-24T12:31:57Z

In this PR, we update the ECMA-SL operators, changing some of them to be implemented as external functions. To avoid clutter, we will push the changes to the interpreters after the PR is reviewed. Below, we provide a rationale for each commit:

Changes

refactor(syntax): remove constants from operators
The implementation of constants was inconsistent. Some were directly converted to float values in the parser, while others were represented internally as operators and then converted to float values in the compiler. To address this, we have removed constants from the .esl operators and now convert them to values in the .esl parser.

Note: I feel like the proper place for constants is in the stdlib of ECMA-SL.

chore(syntax): change the syntax of the eq operator to ==
Since the syntax for the not equals operator is defined as !=, we have changed the syntax for the equals operator from = to ==.
clean(syntax): rename NE operator to Ne
All operators (variants) are written with the first letter of each word capitalized and the rest lowercase. This change fixes the name of the operator to match the convention.
chore(syntax): match the syntax of the ite operator in .cesl with .esl
The syntax for the ifThenElse operator in .esl (e1 ? e2 : e3) was different from that in .cesl (ite(e1, e2, e3)). This commit updates the syntax in the .cesl parser to match the one from the .esl parser. Additionally, we rename the operator to Conditional operator to align with the standard naming convention.

I noticed that this operator does not behave like the standard conditional operators, in that both expressions e2 and e3 are always evaluated regarding the value of e1. Should we change this?

chore(parser): remove locations from .cesl and .esl parsers
Locations should not exist in the parsers, as they are internal values used to manipulate objects. Since the programmer has no control over the location assigned to an object, they should not be able to access the object directly using its location.

Unless there is another reason which I'm not aware?

chore(syntax): externalize math operators
Most mathematical operators were already externalized except for the abs, sqrt, floor, ceil, and trunc. With this change, they are now implemented as external functions.
chore(syntax): externalize string operators
The functions to manipulate strings should all be implemented as external functions, as discussed in ECMA-SL Operator Refactoring (+more syntax) #134
chore(syntax): externalize list operators
All list operations can be defined using the hd (head) and tl (tail) operations. Therefore, these two were retained as unary operators, while the remaining list operations were moved to external functions.
chore(syntax): rename casting operators for consistency
The syntax for some of the type casting operators did not match their internal names in the language. This commit fixes the discrepancies.

Instead of always requiring explicit casting between two types (e.g., type1_to_type2), should we allow our type casting operators to be polymorphic, where we simply specify the target type (similar to the approach in C/C++)?

chore(syntax): remove symbolic expressions
After Refactor: Using values of Smtml #128, symbolic expressions became obsolete. This commit removes them from the syntax of both .esl and .cesl.
refactor(interpreter): clean the eval_op module
With this commit, we cleaned the operator.ml and eval_op.ml modules. Now all operators are implemented in the same style. We also fix some of the error messages that can be generated while evaluating these operators.
refactor(parser): remove the op_parser file
The op_parser.mly file was introduced previously to handle the large number of operators available in .esl and .cesl. This file helped prevent duplicated code in both parsers. The main consequence of having this shared file, however, is that it becomes slightly harder to comprehend the parsers: the operator tokens and their usage are defined in the shared file, but their precedence and some operator tokens need to be in the main parser file. Since most of these operators have been externalized, I believe the disadvantages outweigh the advantages, hence the removal of this file.

I purposefully made this the last commit of this PR, so we can easily revert these changes if you don't approve them.

Future Work

Since the changes in Refactor: Using values of Smtml #128 (and probably even more so after this PR) there have been a lot of problems introduced in the ECMARef interpreters (small issues that cause the interpreter to crash). In of my next PRs I will attempt to address most of them, so we can go back to having a 93% test success rate.
The external files (in both the symbolic and specially in the concrete interpreters) are really messy. I plan on cleaning them ASAP. The same thing goes for the stdlib, which I want to split into multiple files.
If you don't oppose @filipeom, I will be changing the typeof function to be an operator once again. I think it makes total sense.
Two of the objects functions objectToList and objectFields should also be defined as external operators. The in_obj can also be defined as an external operator, but this would require another mechanism for differentiating a field not existing in an object vs it being set to 'undefined. Perhaps introducing the special absent value as in JISET
We are still missing line comments in the parsers
We are still missing proper syntax errors

filipeom

Sorry had busy day, only got to see this now.

Some comments about the Changes:

1. refactor(syntax): remove constants from operators

Note: I feel like the proper place for constants is in the stdlib of ECMA-SL.

I agree! 😄

4. chore(syntax): match the syntax of the ite operator in .cesl with .esl

I noticed that this operator does not behave like the standard conditional operators, in that both expressions e2 and e3 are always evaluated regarding the value of e1. Should we change this?

I don't think I understood, can you elaborate? For clarity, Eval.tripo Ite should return the appropriate value during evaluation: https://github.com/formalsec/smtml/blob/main/lib/eval.ml#L228

5. chore(parser): remove locations from .cesl and .esl parsers

Unless there is another reason which I'm not aware?

Unless you want to leak memory 😅 I think it was a major security issue we allowed this to happen. Nice catch!

6. chore(syntax): externalize math operators

I actually think abs, floor, ceil, and trunc could be operators. abs is similar to unary negation and floor or trunc are similar to casting operators. I.e., floor is just the composition of int_to_float o float_to_int (I think this turned into a pitch to just define these functions directly in the stdlib instead of as external functions 😆). And, ceil could also be defined similarly.

12. refactor(parser): remove the op_parser file

I think it's fine. My major concern of duplication would be about changing one thing in one parser and not in another. But I think the tests we have already will catch inconsistencies.

Future Work

If you don't oppose @filipeom, I will be changing the typeof function to be an operator once again. I think it makes total sense.

I think as long as it still returns a string we can still reason about it symbolically.

Two of the objects functions objectToList and objectFields should also be defined as external operators. The in_obj can also be defined as an external operator, but this would require another mechanism for differentiating a field not existing in an object vs it being set to 'undefined. Perhaps introducing the special absent value as in JISET

I like in_obj as an infix operator. Also, I dislike adding another value for representing nothing. We already have null, undefined, and void. I actually think we could move into Kotlin's direction in the future: https://stackoverflow.com/questions/55953052/kotlin-void-vs-unit-vs-nothing. We could build a stronger type system with Unit and Nothing instead of undefined and null.

Below I left you some nitpicks:

P.s.: Sorry for the possible typos. I'm doing this in a small screen.

ECMA-SL/semantics/core/concrete/eval_op.ml

ECMA-SL/semantics/core/functorial/symbolic_esl_ffi.ml

ECMA-SL/semantics/error/runtime_error.ml

andreffnascimento · 2024-07-25T11:03:08Z

I don't think I understood, can you elaborate? For clarity, Eval.tripo Ite should return the appropriate value during evaluation: https://github.com/formalsec/smtml/blob/main/lib/eval.ml#L228

It returns the proper value, yes. But have a look at this example:

function foo() { print "foo"; return 10; }
function bar() { print "bar"; return 20; }
x := true ? foo() : bar();
print x;

With this code, the system is going to print:

foo
bar
10

My point is, should we even evaluate the bar function?

filipeom · 2024-07-25T11:26:30Z

I don't think I understood, can you elaborate? For clarity, Eval.tripo Ite should return the appropriate value during evaluation: https://github.com/formalsec/smtml/blob/main/lib/eval.ml#L228

It returns the proper value, yes. But have a look at this example:
function foo() { print "foo"; return 10; }
function bar() { print "bar"; return 20; }
x := true ? foo() : bar();
print x;
With this code, the system is going to print:
foo
bar
10
My point is, should we even evaluate the bar function?

Oh, I see. Yes, for concrete execution this is indeed unintended behaviour:

ECMA-SL/ECMA-SL/semantics/core/concrete/interpreter.ml

Lines 94 to 99 in 8e71fa3

    
           | TriOpt (op, e1, e2, e3) -> 
        
             let v1 = eval_expr state e1 in 
        
             let v2 = eval_expr state e2 in 
        
             let v3 = eval_expr state e3 in 
        
             let eval_op_fun () = Eval_op.triop_semantics op v1 v2 v3 in 
        
             eval_operator eval_op_fun [ e1; e2; e3 ]

Seems like triops always evaluate the arguments before dispatching on the operator. We probably need a another case for TriOpt (Conditional, ... now to avoid unwanted side-effects

Constants are now converted to their respective values in the lexer from .esl. This is a temporary solution, since they should only exist within the stdlib of .esl.

Closes #64

andreffnascimento · 2024-07-25T13:21:08Z

@filipeom I've addressed the review in 8a9219b

I've also pushed the changes to the interpreters/tests

filipeom

LGTM! 😄

andreffnascimento self-assigned this Jul 24, 2024

andreffnascimento requested a review from filipeom July 24, 2024 12:32

andreffnascimento added refactor General change of the implementation scope-language Related to the ECMA-SL language design labels Jul 24, 2024

filipeom reviewed Jul 24, 2024

View reviewed changes

andreffnascimento closed this Jul 25, 2024

andreffnascimento reopened this Jul 25, 2024

andreffnascimento added a commit that referenced this pull request Jul 25, 2024

address PR #146 review

090b0bc

andreffnascimento added 16 commits July 25, 2024 14:17

refactor(syntax): remove constants from operators

b903cbf

Constants are now converted to their respective values in the lexer from .esl. This is a temporary solution, since they should only exist within the stdlib of .esl.

chore(syntax): change the syntax of the eq operator to ==

e81a3b6

clean(syntax): rename NE operator to Ne

9db97be

chore(syntax): match the syntax of the ite operator in .cesl with .esl

5bcfe55

chore(parser): remove locations from .cesl and .esl parsers

8fb5a4b

chore(syntax): externalize math operators

8946474

chore(syntax): externalize string operators

e200689

chore(syntax): externalize list operators

df8f715

chore(syntax): rename casting operators for consistency

dc0bf8b

chore(syntax): remove symbolic expressions

7488fd2

Closes #64

refactor(interpreter): clean the eval_op module

4337e79

refactor(parser): remove the op_parser file

9a4de6e

chore(ecmaref): update the interpreters with the new parsing rules

dff245c

test(ecmaref): promote tests

a502928

address PR #146 review

8a9219b

docs: update changelog

fdcd2b8

andreffnascimento force-pushed the fix-operators branch from 264df67 to fdcd2b8 Compare July 25, 2024 13:19

andreffnascimento requested a review from filipeom July 25, 2024 13:19

filipeom approved these changes Jul 25, 2024

View reviewed changes

andreffnascimento merged commit 4e21eff into main Jul 25, 2024
2 checks passed

andreffnascimento added a commit that referenced this pull request Jul 25, 2024

address PR #146 review

ecaf09e

andreffnascimento deleted the fix-operators branch July 25, 2024 16:29

andreffnascimento mentioned this pull request Jul 26, 2024

Fix conditional operator evaluation both branches in the concrete interpreter #149

Merged

filipeom pushed a commit that referenced this pull request Dec 27, 2024

address PR #146 review

4cfa4b5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor ECMA-SL Operators to Use External Functions #146

Refactor ECMA-SL Operators to Use External Functions #146

andreffnascimento commented Jul 24, 2024

filipeom left a comment

andreffnascimento commented Jul 25, 2024

filipeom commented Jul 25, 2024 •

edited

Loading

andreffnascimento commented Jul 25, 2024

filipeom left a comment

Refactor ECMA-SL Operators to Use External Functions #146

Refactor ECMA-SL Operators to Use External Functions #146

Conversation

andreffnascimento commented Jul 24, 2024

Changes

Future Work

filipeom left a comment

Choose a reason for hiding this comment

1. refactor(syntax): remove constants from operators

4. chore(syntax): match the syntax of the ite operator in .cesl with .esl

5. chore(parser): remove locations from .cesl and .esl parsers

6. chore(syntax): externalize math operators

12. refactor(parser): remove the op_parser file

Future Work

andreffnascimento commented Jul 25, 2024

filipeom commented Jul 25, 2024 • edited Loading

andreffnascimento commented Jul 25, 2024

filipeom left a comment

Choose a reason for hiding this comment

filipeom commented Jul 25, 2024 •

edited

Loading