diff --git a/README.md b/README.md index ca3f501..880aab0 100644 --- a/README.md +++ b/README.md @@ -33,13 +33,15 @@ * [3. Object Methods](#3-object-methods) * [3.1 Defininig New Object Methods](#31-defininig-new-object-methods) * [Github Setup](#github-setup) +* [Fuzz Testing](#fuzz-testing) + + # Monkey This repository contains an interpreter for the "Monkey" programming language, as described in [Write an Interpreter in Go](https://interpreterbook.com). - #### My changes The interpreter in _this_ repository has been significantly extended from the starting point: @@ -85,7 +87,6 @@ The interpreter in _this_ repository has been significantly extended from the st * Add support for explicit `null` usage: * `a = null; if ( a == null ) { .. }` - #### See Also If you enjoyed this repository you might find the related ones interesting: @@ -149,12 +150,15 @@ If no script-name is passed to the interpreter it will read from STDIN and execute that instead, allowing simple tests to be made. + + # 2 Syntax **NOTE**: Example-programs can be found beneath [examples/](examples/) which demonstrate these things, as well as parts of the standard-library. + ## 2.1 Definitions Variables are defined using the `let` keyword, with each line ending with `;`. @@ -184,6 +188,7 @@ typos will cause much confusion! puts( "Hello, " + name + "\n"); + ## 2.2 Arithmetic operations `monkey` supports all the basic arithmetic operation of `int` and `float` types. @@ -204,6 +209,7 @@ Here `**` is used to raise the first number to the power of the second. When operating with integers the modulus operator is available too, via `%`. + ## 2.3 Builtin containers `monkey` contains two builtin containers: `array` and `hash`. @@ -267,6 +273,7 @@ changing it in-place). Hash functions are demonstrated in the [examples/hash.mon](examples/hash.mon) sample. + ## 2.4 Builtin functions The core primitives are: @@ -329,6 +336,7 @@ Nothing special is required, the following will suffice as you'd expect: go build . 
+ ## 2.5 Functions `monkey` uses `fn` to define a function which will be assigned to a variable for @@ -379,6 +387,7 @@ The same thing works for literal functions: meh( {"Steve":"Kemp", true:1, false:0, 7:"seven"} ); + ## 2.6 If-else statements `monkey` supports if-else statements. @@ -408,6 +417,8 @@ would expect with a C-background: Note that in the interests of clarity nested ternary-expressions are illegal! + + ## 2.7 Switch Statements Monkey supports the `switch` and `case` expressions, as the following example demonstrates: @@ -434,6 +445,7 @@ Monkey supports the `switch` and `case` expressions, as the following example de See also [examples/switch.mon](examples/switch.mon). + ## 2.8 For-loop statements `monkey` supports a golang-style for-loop statement. @@ -452,6 +464,7 @@ See also [examples/switch.mon](examples/switch.mon). puts(sum(100)); // Outputs: 4950 + ## 2.8.1 Foreach statements In addition to iterating over items with the `for` statement, as shown above, it is also possible to iterate over various items via the `foreach` statement. @@ -474,6 +487,7 @@ The same style of iteration works for Arrays, Hashes, and the characters which m When iterating over hashes you can receive either the keys, or the keys and value at each step in the iteration, otherwise you receive the value and an optional index. + ## 2.9 Comments `monkey` support two kinds of comments: @@ -482,6 +496,7 @@ When iterating over hashes you can receive either the keys, or the keys and valu * Multiline comments between `/*` and `*/`. 
+ ## 2.10 Postfix Operators The `++` and `--` modifiers are permitted for integer-variables, for example the following works as you would expect showing the numbers from `0` to `5`: @@ -510,6 +525,7 @@ The update-operators work with integers and doubles by default, when it comes to puts( str ); // -> "Forename Surname\n" + ## 2.11 Command Execution As with many scripting languages commands may be executed via the backtick @@ -528,6 +544,7 @@ The output will be a hash with two keys `stdout` and `stderr`. NULL is returned if the execution fails. This can be seen in [examples/exec.mon](examples/exec.mon). + ## 2.12 Regular Expressions The `match` function allows matching a string against a regular-expression. @@ -548,6 +565,8 @@ You can also perform matching (complete with captures), with a literal regular e printf("Matched! %s.%s.%s.%s\n", $1, $2, $3, $4 ); } + + ## 2.13 File I/O The `open` primitive is used to open files, and can be used to open files for either reading, or writing: @@ -590,6 +609,7 @@ By default three filehandles will be made available, as constants: * Used for writing messages. + ## 2.14 File Operations The primitive `stat` will return a hash of details about the given file, or @@ -608,6 +628,9 @@ And finally to make a directory: mkdir( "/tmp/blah" ); + + + # 3. Object Methods There is now support for "object-methods". Object methods are methods @@ -649,6 +672,7 @@ The `string` object has the most methods at the time of writing, but no doubt things will change over time. + ## 3.1 Defininig New Object Methods The object-methods mentioned above are implemented in Go, however it is also @@ -671,6 +695,7 @@ in this fashion, for example the functional-programming methods `array.map`, `array.filter`, `string.toupper`, etc, etc. 
+ ## Github Setup This repository is configured to run tests upon every commit, and when @@ -682,5 +707,49 @@ Releases are automated in a similar fashion via [.github/build](.github/build), and the [github-action-publish-binaries](https://github.com/skx/github-action-publish-binaries) action. + +## Fuzz Testing + +Fuzz-testing involves generating random input and running the program under test with it, to see what happens. + +The intention is that most of the random inputs will be invalid, so you can exercise your error-handling and find the kinds of input you failed to consider. + +The 1.18 release of the golang compiler/toolset has integrated support for fuzz-testing, and you can launch it like so: + +```sh +go test -fuzztime=300s -parallel=1 -fuzz=FuzzMonkey -v +``` + +Sample output looks like this: + +``` +$ go test -fuzztime=300s -parallel=1 -fuzz=FuzzMonkey -v +=== RUN FuzzMonkey +fuzz: elapsed: 0s, gathering baseline coverage: 0/240 completed +fuzz: elapsed: 0s, gathering baseline coverage: 240/240 completed, now fuzzing with 1 workers +fuzz: elapsed: 3s, execs: 4321 (1440/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 6s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 9s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 12s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 15s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 18s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 21s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 24s, execs: 4321 (0/sec), new interesting: 6 (total: 246) +fuzz: elapsed: 27s, execs: 73463 (23060/sec), new interesting: 17 (total: 257) +fuzz: elapsed: 30s, execs: 75639 (725/sec), new interesting: 18 (total: 258) +fuzz: elapsed: 33s, execs: 125712 (16701/sec), new interesting: 25 (total: 265) +fuzz: elapsed: 36s, execs: 139338 (4543/sec), new interesting: 34 (total: 274) +fuzz: elapsed: 
39s, execs: 173881 (11511/sec), new interesting: 49 (total: 289) +fuzz: elapsed: 42s, execs: 198046 (8055/sec), new interesting: 55 (total: 295) +fuzz: elapsed: 45s, execs: 210203 (4054/sec), new interesting: 75 (total: 315) +fuzz: elapsed: 48s, execs: 262945 (17580/sec), new interesting: 85 (total: 325) +fuzz: elapsed: 51s, execs: 297505 (11517/sec), new interesting: 108 (total: 348) +fuzz: elapsed: 54s, execs: 308672 (3722/sec), new interesting: 116 (total: 356) +fuzz: elapsed: 57s, execs: 341614 (10984/sec), new interesting: 123 (total: 363) +fuzz: elapsed: 1m0s, execs: 366053 (8146/sec), new interesting: 131 (total: 371) +fuzz: elapsed: 1m3s, execs: 396575 (10172/sec), new interesting: 137 (total: 377) +... +``` + Steve -- diff --git a/evaluator/evaluator.go b/evaluator/evaluator.go index 5f9d372..064f565 100644 --- a/evaluator/evaluator.go +++ b/evaluator/evaluator.go @@ -277,6 +277,11 @@ func evalBangOperatorExpression(right object.Object) object.Object { } func evalMinusPrefixOperatorExpression(right object.Object) object.Object { + // Found by fuzzing + if right == nil { + return newError("null operand %v", right) + } + switch obj := right.(type) { case *object.Integer: return &object.Integer{Value: -obj.Value} @@ -288,6 +293,12 @@ func evalMinusPrefixOperatorExpression(right object.Object) object.Object { } func evalInfixExpression(operator string, left, right object.Object, env *object.Environment) object.Object { + + // Found by fuzzing + if left == nil || right == nil { + return newError("null operand %v %v", left, right) + } + switch { case left.Type() == object.INTEGER_OBJ && right.Type() == object.INTEGER_OBJ: return evalIntegerInfixExpression(operator, left, right) @@ -412,6 +423,11 @@ func evalBooleanInfixExpression(operator string, left, right object.Object) obje } func evalIntegerInfixExpression(operator string, left, right object.Object) object.Object { + // Found by fuzzing + if left == nil || right == nil { + return newError("null operand %v 
%v", left, right) + } + leftVal := left.(*object.Integer).Value rightVal := right.(*object.Integer).Value switch operator { @@ -420,6 +436,11 @@ func evalIntegerInfixExpression(operator string, left, right object.Object) obje case "+=": return &object.Integer{Value: leftVal + rightVal} case "%": + // Found by fuzzing + if rightVal == 0 { + return newError("divide by zero") + } + return &object.Integer{Value: leftVal % rightVal} case "**": return &object.Integer{Value: int64(math.Pow(float64(leftVal), float64(rightVal)))} @@ -432,6 +453,10 @@ func evalIntegerInfixExpression(operator string, left, right object.Object) obje case "*=": return &object.Integer{Value: leftVal * rightVal} case "/": + // Found by fuzzing + if rightVal == 0 { + return newError("divide by zero") + } return &object.Integer{Value: leftVal / rightVal} case "/=": return &object.Integer{Value: leftVal / rightVal} @@ -463,6 +488,11 @@ func evalIntegerInfixExpression(operator string, left, right object.Object) obje step = -1.0 } + // Found by fuzzing + if len > 2048 { + return newError("impossible large range for .. 
operator") + } + // Make an array to hold the return value array := make([]object.Object, len) @@ -498,6 +528,10 @@ func evalFloatInfixExpression(operator string, left, right object.Object) object case "**": return &object.Float{Value: math.Pow(leftVal, rightVal)} case "/": + // Found by fuzzing + if rightVal == 0 { + return newError("divide by zero") + } return &object.Float{Value: leftVal / rightVal} case "/=": return &object.Float{Value: leftVal / rightVal} @@ -538,6 +572,10 @@ func evalFloatIntegerInfixExpression(operator string, left, right object.Object) case "**": return &object.Float{Value: math.Pow(leftVal, rightVal)} case "/": + // Found by fuzzing + if rightVal == 0 { + return newError("divide by zero") + } return &object.Float{Value: leftVal / rightVal} case "/=": return &object.Float{Value: leftVal / rightVal} @@ -578,6 +616,10 @@ func evalIntegerFloatInfixExpression(operator string, left, right object.Object) case "**": return &object.Float{Value: math.Pow(leftVal, rightVal)} case "/": + // Found by fuzzing + if rightVal == 0 { + return newError("divide by zero") + } return &object.Float{Value: leftVal / rightVal} case "/=": return &object.Float{Value: leftVal / rightVal} @@ -1009,9 +1051,29 @@ func trimQuotes(in string, c byte) string { // `stderr`, `stdout`, and `error` will be the fields func backTickOperation(command string) object.Object { + command = strings.TrimSpace(command) + if command == "" { + return newError("empty command") + } + + // default arguments, if none are found + args := []string{} + // split the command toExec := splitCommand(command) - cmd := exec.Command(toExec[0], toExec[1:]...) + + // Did that work? + if len(toExec) == 0 { + return newError("error - empty command") + } + + // Use the real args if we got any + if len(toExec) > 1 { + args = toExec[1:] + } + + // Run the command. + cmd := exec.Command(toExec[0], args...) 
// get the result var outb, errb bytes.Buffer @@ -1049,6 +1111,12 @@ func backTickOperation(command string) object.Object { } func evalIndexExpression(left, index object.Object) object.Object { + + // Found by fuzzing + if left == nil || index == nil { + return newError("null operand %v[%v]", left, index) + } + switch { case left.Type() == object.ARRAY_OBJ && index.Type() == object.INTEGER_OBJ: return evalArrayIndexExpression(left, index) @@ -1126,6 +1194,11 @@ func evalHashLiteral(ctx context.Context, node *ast.HashLiteral, env *object.Env } func applyFunction(ctx context.Context, env *object.Environment, fn object.Object, args []object.Object) object.Object { + + // Found by fuzzing + if fn == nil { + return newError("impossible empty body on function-call") + } switch fn := fn.(type) { case *object.Function: extendEnv := extendFunctionEnv(ctx, fn, args) @@ -1171,6 +1244,11 @@ func RegisterBuiltin(name string, fun object.BuiltinFunction) { func evalObjectCallExpression(ctx context.Context, call *ast.ObjectCallExpression, env *object.Environment) object.Object { obj := EvalContext(ctx, call.Object, env) + + if obj == nil { + return newError("impossible object-call on an empty object") + } + if method, ok := call.Call.(*ast.CallExpression); ok { // diff --git a/fuzz_test.go b/fuzz_test.go new file mode 100644 index 0000000..575e8ca --- /dev/null +++ b/fuzz_test.go @@ -0,0 +1,67 @@ +//go:build go1.18 +// +build go1.18 + +package main + +import ( + "context" + "strings" + "testing" + "time" + + "github.com/skx/monkey/evaluator" + "github.com/skx/monkey/lexer" + "github.com/skx/monkey/object" + "github.com/skx/monkey/parser" +) + +// FuzzMonkey runs the fuzz-testing against our parser and interpreter. 
+func FuzzMonkey(f *testing.F) { + + // Known errors we might see + known := []string{ + "as integer", + "divide by zero", + "null operand", + "could not parse", + "exceeded", + "expected assign", + "expected next token", + "impossible", + "nested ternary expressions are illegal", + "no prefix parse function", + } + + f.Fuzz(func(t *testing.T, input []byte) { + + ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond) + defer cancel() + + env := object.NewEnvironment() + l := lexer.New(string(input)) + p := parser.New(l) + + program := p.ParseProgram() + falsePositive := false + + // No errors? Then execute + if len(p.Errors()) == 0 { + + evaluator.EvalContext(ctx, program, env) + return + } + + for _, msg := range p.Errors() { + for _, ignored := range known { + if strings.Contains(msg, ignored) { + falsePositive = true + } + } + + } + + if !falsePositive { + t.Fatalf("error running input: '%s': %v", input, p.Errors()) + } + }) +} diff --git a/lexer/lexer.go b/lexer/lexer.go index 23320c6..379e2ca 100644 --- a/lexer/lexer.go +++ b/lexer/lexer.go @@ -271,7 +271,26 @@ func (l *Lexer) NextToken() token.Token { return tok } + + // Not printable? That's a bug + if !unicode.IsPrint(l.ch) { + tok.Literal = string(l.ch) + tok.Type = token.ILLEGAL + + // skip the characters + l.readChar() + return tok + } + tok.Literal = l.readIdentifier() + + // Did we fail to read a token? 
+ if len(tok.Literal) == 0 { + // Then we've got an illegal + tok.Type = token.ILLEGAL + l.readChar() + return tok + } tok.Type = token.LookupIdentifier(tok.Literal) l.prevToken = tok diff --git a/parser/parser.go b/parser/parser.go index 73ce6dd..0049bc1 100644 --- a/parser/parser.go +++ b/parser/parser.go @@ -930,7 +930,12 @@ func (p *Parser) parseAssignExpression(name ast.Expression) ast.Expression { if n, ok := name.(*ast.Identifier); ok { stmt.Name = n } else { - msg := fmt.Sprintf("expected assign token to be IDENT, got %s instead around line %d", name.TokenLiteral(), p.l.GetLine()) + msg := "expected assign token to be IDENT, got null instead" + + // found by fuzzer + if name != nil { + msg = fmt.Sprintf("expected assign token to be IDENT, got %s instead around line %d", name.TokenLiteral(), p.l.GetLine()) + } p.errors = append(p.errors, msg) } diff --git a/testdata/fuzz/FuzzMonkey/275dcde72610da78 b/testdata/fuzz/FuzzMonkey/275dcde72610da78 new file mode 100644 index 0000000..0fd4d42 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/275dcde72610da78 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("0..48888880") diff --git a/testdata/fuzz/FuzzMonkey/2d4e792fd0bfa014 b/testdata/fuzz/FuzzMonkey/2d4e792fd0bfa014 new file mode 100644 index 0000000..b299135 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/2d4e792fd0bfa014 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("'[0]") diff --git a/testdata/fuzz/FuzzMonkey/2fb1f2f9af56548f b/testdata/fuzz/FuzzMonkey/2fb1f2f9af56548f new file mode 100644 index 0000000..ecc8b12 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/2fb1f2f9af56548f @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("0..4888888488880") diff --git a/testdata/fuzz/FuzzMonkey/34fb718936c5abee b/testdata/fuzz/FuzzMonkey/34fb718936c5abee new file mode 100644 index 0000000..4f69cdb --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/34fb718936c5abee @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("0%00") diff --git a/testdata/fuzz/FuzzMonkey/3eefa231244e4231 b/testdata/fuzz/FuzzMonkey/3eefa231244e4231 
new file mode 100644 index 0000000..7325d8b --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/3eefa231244e4231 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("`0`") diff --git a/testdata/fuzz/FuzzMonkey/6b91ac7f9f7618fb b/testdata/fuzz/FuzzMonkey/6b91ac7f9f7618fb new file mode 100644 index 0000000..6353c80 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/6b91ac7f9f7618fb @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("0b=") diff --git a/testdata/fuzz/FuzzMonkey/6ca869e0fd4fbda4 b/testdata/fuzz/FuzzMonkey/6ca869e0fd4fbda4 new file mode 100644 index 0000000..a907d5c --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/6ca869e0fd4fbda4 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("\xe0.0A)000") diff --git a/testdata/fuzz/FuzzMonkey/7503d7e4a29aaa55 b/testdata/fuzz/FuzzMonkey/7503d7e4a29aaa55 new file mode 100644 index 0000000..f2fa28a --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/7503d7e4a29aaa55 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("7\xdc%00000000000\"00000000000000") diff --git a/testdata/fuzz/FuzzMonkey/a0acac5dae65a123 b/testdata/fuzz/FuzzMonkey/a0acac5dae65a123 new file mode 100644 index 0000000..bfa5314 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/a0acac5dae65a123 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("'()") diff --git a/testdata/fuzz/FuzzMonkey/bb33601cb4718064 b/testdata/fuzz/FuzzMonkey/bb33601cb4718064 new file mode 100644 index 0000000..ffcbcaa --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/bb33601cb4718064 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("``") diff --git a/testdata/fuzz/FuzzMonkey/f35e37c0e30ffd2e b/testdata/fuzz/FuzzMonkey/f35e37c0e30ffd2e new file mode 100644 index 0000000..95d1611 --- /dev/null +++ b/testdata/fuzz/FuzzMonkey/f35e37c0e30ffd2e @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("-")
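
Review note: the evaluator hunks above guard `/` and `%` against a zero right-hand operand, but the `/=` branches appear only as unchanged context, so they may still be reachable with a zero divisor. The following is a minimal, standalone sketch of routing all three operators through one guard. `safeIntOp` and `errDivideByZero` are hypothetical names for illustration only; they are not part of the monkey code base, and the real evaluator returns `object.Object` values via `newError` rather than a Go `error`.

```go
package main

import (
	"errors"
	"fmt"
)

// errDivideByZero reuses the "divide by zero" message from the patch.
// The variable itself is hypothetical and exists only in this sketch.
var errDivideByZero = errors.New("divide by zero")

// safeIntOp is a hypothetical helper that applies an integer division-like
// operator and rejects a zero divisor up front, instead of letting the Go
// runtime panic with an integer divide-by-zero error.
func safeIntOp(op string, left, right int64) (int64, error) {
	switch op {
	case "/", "/=":
		if right == 0 {
			return 0, errDivideByZero
		}
		return left / right, nil
	case "%":
		if right == 0 {
			return 0, errDivideByZero
		}
		return left % right, nil
	}
	return 0, fmt.Errorf("unknown operator: %s", op)
}

func main() {
	// Each operator reports an error rather than panicking on a zero divisor.
	for _, op := range []string{"/", "/=", "%"} {
		if _, err := safeIntOp(op, 7, 0); err != nil {
			fmt.Printf("7 %s 0 -> error: %v\n", op, err)
		}
	}

	// A non-zero divisor is evaluated normally.
	if v, err := safeIntOp("%", 7, 2); err == nil {
		fmt.Println("7 % 2 =", v)
	}
}
```

Whether `/=` needs the same guard in the evaluator depends on how the parser desugars compound assignment, which this diff does not show; a seed input exercising that form under `testdata/fuzz/FuzzMonkey/` would settle it.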