NewLang Specification

"NewLang" is my placeholder name for the language idea I'm working on.

1. Introduction

NewLang is a high-level, statically typed programming language with a garbage collector. Its design goals are:

Clarity and Readability: A clean syntax inspired by Rust, OCaml, TypeScript, and Zig.
Robust Type System: A flexible system blending nominal, structural, and duck typing concepts to provide strong guarantees with expressive power.
Immutability by Default: Promoting a value-oriented and functional programming style for safer and more predictable code.
Ergonomic Concurrency: Built-in support for concurrent programming that is easy to use and reason about.
Developer Productivity: Features designed to reduce boilerplate and common errors, such as powerful pattern matching and type inference.

This document details the core features and semantics of NewLang.

2. Core Concepts

2.1. Values and Immutability

Everything in NewLang is a value. All types exhibit value semantics by default. This means that assignment, parameter passing, and function returns conceptually create a copy of the value. Variables are immutable by default, meaning their value cannot be changed after initialization.

var message = "Hello, NewLang!";
// message = "New message"; // Error: cannot assign to immutable variable 'message'

var p1 = [ x: 10, y: 20 ];
var p2 = p1; // p2 is a new value, a full copy of p1.
             // Modifying p2 (if it were mutable) would not affect p1.

2.2. Mutability

Variables can be declared as mutable using the mut keyword. This allows reassignment of the variable and, for collection types or objects, in-place modification of their contents if the type itself supports it.

mut var counter = 0;
counter = counter + 1; // Allowed: counter is rebound to a new value.

type Point = [ x: I32, y: I32 ];
mut var current_pos = Point [ x = 0, y = 0 ];
current_pos.x = 100; // Allowed: in-place modification of a field in a mutable object.

mut var numbers = [1, 2, 3];
numbers[0] = 10; // Allowed: in-place modification of an element in a mutable array.
// numbers.push(4); // If array methods for mutation exist

Mutability also extends to function parameters (see Section 6.2. Parameters).

2.3. References (`&`)

While NewLang defaults to value semantics, explicit reference semantics can be opted into using the & operator. References allow multiple variables to point to the same underlying data. NewLang employs a garbage collector, so manual lifetime management is not required; the GC will reclaim memory when objects are no longer reachable.

type Data = [ value: I32 ];
var d1 = [ value = 42 ];
var d2_val_copy = d1; // d2_val_copy is a copy of d1.

var d_ref1: &Data = &d1; // d_ref1 now references the data of d1.
var d_ref2: &Data = d_ref1; // d_ref2 now references the same data as d_ref1.

// Hypothetical: modifying through a mutable reference (assumes the data is mutable and a &mut syntax exists)
// mut var m_data = [ value = 10 ];
// var m_ref: &mut Data = &m_data;
// m_ref.value = 20; // m_data.value would now be 20

print(d1 == d2_val_copy); // True (deep structural equality)
print(d_ref1 == d_ref2);  // True for reference equality (pointing to same instance)

var d3 = [ value = 42 ];
var d_ref3: &Data = &d3;
print(d_ref1 == d_ref3); // False (references to different instances, even if structurally equal)

The precise semantics of mutable references (&mut T) will be detailed in an advanced section if they are introduced; for now, &T provides shared, immutable access via reference. The primary use of & is for sharing large data structures without copying or for specific interop scenarios.

You can also use the .* operator to dereference a reference.

var d1 = [ value = 42 ];
var d_ref = &d1;
var d_val = d_ref.*; // d_val is a copy of the value of d1
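
The difference between copying a value and sharing a reference can be modeled roughly in Python. This is a non-normative sketch, not NewLang code: a dict stands in for the Data object, copy.deepcopy stands in for value-semantics assignment, and Python's `is` stands in for reference equality. Those mappings are assumptions made for illustration.

```python
import copy

# Python stand-in for `var d1 = [ value = 42 ];` (a dict models the object).
d1 = {"value": 42}

# Value semantics: `var d2_val_copy = d1;` behaves like a deep copy.
d2_val_copy = copy.deepcopy(d1)

# Reference semantics: `&d1` behaves like another name for the same object.
d_ref1 = d1
d_ref2 = d_ref1

# A structurally equal but separate instance, like d3 above.
d3 = {"value": 42}
d_ref3 = d3

print(d1 == d2_val_copy)  # True: structural equality of values
print(d_ref1 is d_ref2)   # True: both refer to the same instance
print(d_ref1 is d_ref3)   # False: different instances, although structurally equal
```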

2.4. Blocks as Expressions

Code blocks enclosed in {} are expressions. The value of a block is the value of the last expression within it. This allows blocks to be assigned to variables, returned from functions, or used in any context where an expression is expected.

var computation = {
    var x = 10;
    var y = 20;
    x + y // The value of this block is 30
};
print(computation); // Output: 30

var message = if computation > 20 {
    var prefix = "High";
    "${prefix} value: ${computation}"
} else {
    "Low value"
};

2.5. The `None` Type

NewLang has a built-in None type. It represents the absence of a value and has exactly one inhabitant, also written None. Functions that do not explicitly return a value implicitly return None.

var do_nothing() {
    // No return statement, implicitly returns None
};

var result: None = do_nothing();

3. Types

3.1. Primitive Types

I32, I64: 32-bit and 64-bit signed integers.
F32, F64: 32-bit and 64-bit floating-point numbers.
String: UTF-8 encoded, immutable strings.
Bool: True or False.
Char: A single Unicode scalar value.
None: The unit type, representing no value (see Section 2.5).

3.2. Nominal Typing with `tag`

The tag keyword is used to create all new nominal types. A nominal type is distinct from any other type, even if they share the same underlying structure. This is crucial for type safety, ensuring, for example, that a UserId cannot be accidentally used where a ProductId is expected, even if both are internally represented as integers.

Simple Tags (Marker Types): Creates a new type with no associated data, often used as a marker or for distinct unit-like values. The underlying data is implicitly None.

tag AdminUser; // Creates a new, distinct type AdminUser.
               // An instance `AdminUser` wraps `None`.
var admin_marker = AdminUser; // `admin_marker` is of type AdminUser.

Wrapper Tags: Wraps an existing type to create a new, distinct nominal type. The new type contains the wrapped value.

tag UserId = I32;
tag ProductId = I32;

var user_id = UserId(123);       // user_id is of type UserId, contains 123.
var product_id = ProductId(456); // product_id is of type ProductId.

// user_id = product_id; // Error: UserId and ProductId are distinct types.
// user_id = 123;        // Error: UserId and I32 are distinct types.

If you assign a value of a wrapper tag type to a variable or parameter of the wrapped type, it is automatically untagged. The untag keyword performs the same conversion explicitly.

var raw_user_id: I32 = user_id; // raw_user_id is 123 (type I32)

tag Email = String;
var user_email = Email("user@example.com");
var email_str = untag user_email; // type String

If a simple tag (wrapping None) is untagged, the result is of type None.

tag Confirmed;
var confirmation = Confirmed;
var val_none = untag confirmation; // val_none is of type None

Nominal Objects: Defines an object (a collection of named fields) with a nominal name. Instances are distinct from other object types, even if those objects have identical fields.

tag NominalPerson = [
    name: String,
    id: I32,
];
var person_a = NominalPerson([ name = "Alice", id = 1 ]);

// To access fields, use dot notation:
print(person_a.name); // "Alice"

// `untag` on a nominal object yields its underlying structural representation:
// var structural_person = untag person_a; // type is [ name: String, id: I32 ]
// print(structural_person.name); // "Alice"

Spreading in Nominal Objects: Compose objects using the spread (...) operator. Fields from the spread type are included. If there are overlapping field names, the later definition (or the host object's field) takes precedence.

tag BaseProfile = [ id: String, last_login: String ];
tag FullUserProfile = [
    ...BaseProfile, // Includes id, last_login
    username: String,
    email: String,
    id: Uuid // Overrides BaseProfile.id (replacing String with Uuid; Uuid is assumed to be defined elsewhere).
           // The exact rules for overriding versus reporting an error on a type
           // mismatch still need careful definition. For now, assume same-name
           // fields must be compatible or the outer (later) field wins.
];

3.3. Structural Typing: Object Types

Objects defined using a type alias are structurally typed. Two structural object types are compatible if they have the same field names and their corresponding field types are compatible. The order of fields does not matter for structural equivalence.

type Person = [ name: String, age: I32 ];
type Human = [ age: I32, name: String ]; // Structurally identical to Person

var alice Person = [ name: "Alice", age: 30 ];
var bob Human = [ name: "Bob", age: 32 ];

var another_human Human = alice; // Valid: Person is structurally compatible with Human.
var another_person Person = bob; // Also valid, for the same reason.

3.4. Union Types and Enums (`|`, `::`)

Union types, defined with |, allow a value to be one of several specified types. The enum keyword is sugar for defining tags and immediately including them in a union; the resulting variants are namespaced and accessed with ::.

Enum-style Unions (Simple Enums):

tag Red;
tag Green;
tag Blue;

type Color = Red | Green | Blue;

var primary_color: Color = Red;
var secondary_color: Color = Green;

Unions with Data (Tagged Unions): Enum variants can carry data.

tag Some<T> = T;
tag None = None;
type Option<T> = Some<T> | None;

var num: Option<I32> = Some(123);
var no_num: Option<I32> = None;

var text: Option<String> = Some("hello");
var no_text: Option<String> = None;

Some tags a value of type T. None is a simple tag (tagging None).

Namespaced Enums with enum:

enum Color {
    Red,
    Green,
    Blue,
};
var primary_color: Color = Color::Red;
var secondary_color: Color = Color::Green;

3.5. Built-in Generic Union Types

Option<T>: Represents an optional value. Variants: Some<T> (wraps T) and None.
Result<T, E>: Represents a computation that can succeed with a value of type T (Ok<T>) or fail with an error of type E (Err<E>).
- tag Ok<T> = T;
- tag Err<E> = E;
- type Result<T, E> = Ok<T> | Err<E>;

3.6. Tuple Types

Tuples are ordered, fixed-size collections of values, where each element can have a different type. They are defined using square brackets [], just like objects. In fact, an object can be thought of as a tuple whose fields are named: its fields can also be accessed by index, and a tuple is simply an object defined without field names.

type Point2D = [F32, F32];
var origin Point2D = [0.0, 0.0];

type MixData = [String, I32, Bool];
var data_record MixData = ["label", 10, True];

// Access by index (0-based):
var x_coord = origin[0];
var label = data_record[0];

3.7. Array Types

Fixed-Size Arrays: Syntactically similar to tuples, but every element has the same type. T[N] denotes an array of N elements of type T. For small N (up to an implementation-defined limit, e.g., 32), these may be stack-allocated.
```
type Vector3 = F32[3]; // equivalent to [F32, F32, F32]
var v: Vector3 = [1.0, 2.0, 3.0];
// type Point = [I32, I32]; // also like a tuple
```

Dynamic Arrays (T[]): Collections of a single type whose size can change at runtime. These are heap-allocated.

mut var my_numbers: I32[] = [10, 20, 30];
my_numbers::push(40);
var first = my_numbers[0];
var last = my_numbers[-1]; // same as my_numbers[my_numbers::length() - 1]

3.8. String Literal and Template Types

Types can be constrained to specific string literals or string patterns, enabling more precise type checking.

String Literal Types:

type TrafficLightState = "red" | "yellow" | "green";
var go_signal: TrafficLightState = "green";
// var invalid_signal: TrafficLightState = "blue"; // Error: "blue" is not in the union.

String Template Types: (Advanced Feature) Define types that match a string pattern, potentially using generic parameters.

type HexChar = "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9"|"A"|"B"|"C"|"D"|"E"|"F";
type HexColor<S1 extends HexChar, S2 extends HexChar, S3 extends HexChar, S4 extends HexChar, S5 extends HexChar, S6 extends HexChar> =
    "#${S1}${S2}${S3}${S4}${S5}${S6}";

// Or a simpler form using a generic constraint for the pattern part:
type UserIdFormat<T extends String> = "user_${T}";
// var user1: UserIdFormat<"123"> = "user_123"; // Valid
// var user2: UserIdFormat<"abc"> = "user_abc"; // Valid
// var user3: UserIdFormat<"1-2"> = "user_1-2"; // Valid if "1-2" matches String constraint
// var invalid_user: UserIdFormat<"123"> = "usr_123"; // Error

This requires sophisticated compile-time string manipulation and type checking.

4. Generics

NewLang supports generic programming, allowing types and functions to be parameterized by other types.

4.1. Generic Types

Types can be defined with type parameters. Option<T>, Result<T, E>, and Future<T> are examples.

tag Box<T> = [ content: T ];

var int_box: Box<I32> = Box([ content: 10 ]);
var str_box: Box<String> = Box([ content: "hello" ]);

4.2. Generic Functions

Functions can also be parameterized by types.

fn identity = <T>(value T) -> T {
    value
};

// without the =
fn identity<T>(value T) -> T {
    value
};

// with type inference
fn identity<T>(value T) {
    value // return type inferred as T, the type of value
};

var num = identity(5);       // num is type I32, since T is inferred as I32
var text = identity("echo"); // text is type String, since T is inferred as String

4.3. Type Constraints (`extends`)

Generic type parameters can be constrained to ensure they meet certain requirements (e.g., possess certain fields or methods). The extends keyword is used for this. Constraints are defined using structural type descriptions or by referring to existing nominal types or traits (if a full trait system is added).

// Define a structural constraint: any type with a 'name' field of type String.
type HasName = [ name: String ];

fn greet<P extends HasName>(person P) {
    print("Hello, ${person.name}!");
};

var p1 = [ name: "Alice", age: 30 ]; // Structurally matches HasName
greet(p1); // Valid

tag NamedUser = [ name: String, id: I32 ];
var u1 = NamedUser([ name: "Bob", id: 1 ]);
greet(u1); // Valid, NamedUser structurally matches HasName

// This would fail as I32 does not have a 'name' field:
// greet(123); // Error

// Constraints can also involve methods (see Section 10):
type Renderable = [ render() -> String ]; // or [ render: () -> String ]
fn display<T extends Renderable>(item T) {
    print(item.render());
};

4.4. Variance

For this version of the specification, generic types are invariant. For example, Box<String> is not a subtype of Box<String | I32>. Covariance and contravariance annotations (for example, in and out) might be introduced in future versions for more flexible subtype relationships with generics.

5. Type System Operators & Keywords

5.1. `typeof`

Returns the type of a value expression or the definition of a type alias/tag.

var greeting = "Hello";
type GreetingType = typeof greeting; // GreetingType is String

tag MyUnitTag;
type MyUnitTagDefinition = typeof MyUnitTag; // MyUnitTagDefinition is None (underlying type)

tag MyWrapperTag = I32;
type MyWrapperTagDefinition = typeof MyWrapperTag; // MyWrapperTagDefinition is I32

type Point = [ x: I32, y: I32 ];
type PointDef = typeof Point; // PointDef is [ x: I32, y: I32 ]

5.2. `keyof`

For a given object type (nominal or structural), keyof returns a type representing all of its field names: a union of string literal types, as in the examples below (a tuple of special "key tags" is an alternative representation still under consideration). Dot-notation for property access (record.field) is syntactic sugar for indexing with the corresponding key (record["field"]).

tag User = [ username: String, email: String ];
type UserKeys = keyof User;
// UserKeys is a union of string literals "username" | "email"
type UserKeys2 = keyof [I32, String];
// or for a tuple type, it would be a union of the indices 0 | 1
type UserKeys3 = keyof F32[];
// or for an array type, it would be the type of the indices: U64

var active_user = User([ username: "jdoe", email: "jdoe@example.com" ]);

var name1 = active_user.username;
// If UserKeys is ["username", "email"]:
// var name2 = active_user["username"]; // Access by string key
// var key_tuple = keyof User; // ["username", "email"]
// var name3 = active_user[key_tuple[0]];

// This allows for generic functions that operate on object fields:
// fn get_field<T, K extends keyof T>(obj T, key K) -> T[K] { ... };

5.3. Type Arithmetic: Combination (`+`) and Subtraction (`-`)

These operators work on type constraints or structural types.

+ (Intersection/Combination): Creates a new type constraint that must satisfy all combined constraints.

type HasName = [ name: String ];     // Structural constraint for a name
type HasAge = [ age: I32 ];         // Structural constraint for an age
// type HasRender = [ render() -> String ]; // Structural constraint for a method

type PersonData = HasName + HasAge; // Requires name: String AND age: I32

fn process_person<P extends PersonData>(person P) {
    print("${person.name} is ${person.age}");
};

- (Difference/Exclusion): Creates a new type constraint by excluding properties or variants.

// Refine a union type
type MustBeSome<T> = Option<T> - None; // Effectively Some<T>

// Refine a record type (structurally)
type FullPerson = [ name: String, age: I32, address: String ];
type PersonNameOnly = FullPerson - [ age: I32, address: String ]; // [ name: String ]

// Exclude specific fields by their "shape"
// type PersonWithoutAge = FullPerson - [ age: I32 ];
// This would mean PersonWithoutAge is [ name: String, address: String ], since [ age: I32 ] means "any type with an age field of type I32".

The .-syntax for field constraints (e.g., .name String) can be seen as shorthand for an anonymous structural type [ name: String, ... ] used in combination/subtraction.
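
As a rough, non-normative model of + and - on structural object types, the Python sketch below represents an object type as a dict from field names to type names. The helper names combine and subtract are hypothetical, and raising an error when the same field appears with two different types is an assumption; the specification above does not define that case.

```python
def combine(left, right):
    """Model of `left + right`: the result requires every field of both operands."""
    for name in left.keys() & right.keys():
        if left[name] != right[name]:
            # Assumed behavior: conflicting field types are rejected.
            raise TypeError(f"conflicting types for field '{name}'")
    return {**left, **right}


def subtract(left, right):
    """Model of `left - right`: drop the fields of `left` that `right` describes."""
    return {name: ty for name, ty in left.items() if right.get(name) != ty}


has_name = {"name": "String"}
has_age = {"age": "I32"}
person_data = combine(has_name, has_age)

full_person = {"name": "String", "age": "I32", "address": "String"}
person_name_only = subtract(full_person, {"age": "I32", "address": "String"})
person_without_age = subtract(full_person, {"age": "I32"})

print(person_data)         # {'name': 'String', 'age': 'I32'}
print(person_name_only)    # {'name': 'String'}
print(person_without_age)  # {'name': 'String', 'address': 'String'}
```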

5.4. Optional Type Suffix (`?`)

A shorthand for creating an Option type. T? is equivalent to Option<T>.

type OptionalName = String?; // Equivalent to Option<String>
var name1: OptionalName = Some("Alice");
var name2: OptionalName = None;
var name3: String? = "Bob"; // Implicitly Some("Bob")

5.5. Helper Types

The standard library provides helper types. Awaited<F> is the type of the value produced by awaiting a future of type F, so Awaited<Future<T>> is T.

type FutureResult = Awaited<Future<String>>; // FutureResult is String

Similarly, Return<F> is the type of the value returned by a function of type F, so Return<() -> I32> is I32.

type ReturnType = Return<() -> I32>; // ReturnType is I32

6. Functions

Functions are first-class citizens in NewLang. They can be assigned to variables, passed as arguments, and returned from other functions.

6.1. Declaration

Functions are typically declared with var. The fn keyword can be used for functions that need to be hoisted (e.g., for mutual recursion, or defining before use if preferred stylistically). var-defined functions are not hoisted. Return types can often be inferred but can also be explicitly annotated.

// Function
var subtract = (a I32, b I32) -> I32 { 
    a - b
};

// Hoisted function (can be called before its definition in the same scope)
fn factorial = (n I32) -> I32 {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
};

You can omit the = when defining a function.

var add = (a I32, b I32) -> I32 { a + b };
// is equivalent to:
var add(a I32, b I32) -> I32 { a + b };

// this applies to hoisted functions as well:
fn factorial = (n I32) -> I32 {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
};
// is equivalent to:
fn factorial(n I32) -> I32 {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
};

You can omit the return type annotation when defining a function; the compiler then infers the return type.

var add = (a I32, b I32) -> I32 { a + b };
// is equivalent to:
var add = (a I32, b I32) { a + b };

// this applies to hoisted functions as well as functions without the assignment operator `=`
fn factorial(n I32) -> I32 {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
};
// is equivalent to:
fn factorial(n I32) {
    if n <= 1 { 1 } else { n * factorial(n - 1) }
};

6.2. Parameters

Type Annotations: Parameters must have type annotations, unless the type can be inferred from a default value (see Default Values below).

Mutability: Parameters are immutable by default. Use mut to make a parameter's binding mutable within the function, allowing the local variable to be reassigned. Because arguments are passed by value, any in-place modification made through a mut parameter affects the function's local copy, not the caller's variable.

fn append_item(mut list: I32[], item: I32) {
    // list.push(item); // Would mutate the local copy in place, assuming push exists
    list = list + [item]; // Reassigns the local 'list' parameter
};

mut var my_list = [1, 2];
append_item(my_list, 3); // my_list is still [1, 2]: under value semantics the argument
                         // is copied in, and `mut` only makes the local copy mutable.
                         // Modifying the caller's variable would require a mutable
                         // reference (compare &mut T in Rust), for example:
                         // fn append_item_ref(list: &mut I32[], item: I32) { list.push(item); }
                         // Mutable references are not yet specified (see Section 2.3).

Clarification on mut parameters: mut param: Type means the binding param inside the function is mutable. Because arguments are copied in, reassigning the parameter or mutating its contents (for example, the elements of a dynamic array) changes only the local copy. Modifying the caller's variable would require an explicit, GC-managed mutable reference (&mut Type), which is not yet specified.

Default Values: Parameters can have default values. A default value is used only when the corresponding argument is omitted.

fn greet(name: String = "World") {
    print("Hello, ${name}!");
};
greet(); // "Hello, World!"
greet("Alice"); // "Hello, Alice!"

A default value also allows the parameter type to be inferred.

fn greet(name = "World") { // equivalent to fn greet(name: String = "World") {
    print("Hello, ${name}!");
};
greet(); // "Hello, World!"

Variadic Parameters: Parameters can be variadic.

fn print_all(...items: String[]) {
    for item in items {
        print(item);
    };
};
print_all("Hello", "World"); // prints "Hello", then "World"

Optional Parameters: Parameters can be optional.

fn greet(name: String? = None) {
    if name {
        print("Hello, ${name}!");
    } else {
        print("Hello, World!");
    };
};
greet(); // "Hello, World!"
greet("Alice"); // "Hello, Alice!"

Optional parameters automatically default to None without specifying = None.

fn greet(name: String?) {
    if name {
        print("Hello, ${name}!");
    } else {
        print("Hello, World!");
    };
};
greet(); // "Hello, World!"
greet("Alice"); // "Hello, Alice!"

Parameter Destructuring: Parameters can destructure records, tuples, and arrays.

type Person = [ name: String, age: I32 ];
var greet_person = ([ name, age ]: Person) { // Destructures Person
    print("Hello ${name}, you are ${age} years old.");
};
var p: Person = [ name: "Alice", age: 30 ];
greet_person(p);

var process_point = ([x, y]: [I32, I32]) { // Destructures a tuple/fixed array
    print("Point: (${x}, ${y})");
};
process_point([10, 20]);

6.3. Return Types

Explicit: Written as -> Type after the parameter list.
Inferred: If the -> Type annotation is omitted, the compiler infers the return type from return statements or the block's final expression.

None: Functions not returning a value implicitly return None.

var greet = (name: String) { // Implicitly returns None
    print("Hello, ${name}!");
};

6.4. Anonymous Functions (Lambdas) and IIFEs

Functions can be defined anonymously and used as expressions.

var multiply = (a: I32, b: I32) -> I32 { a * b }; // Regular var-binding

var divide = (a: F32, b: F32) -> Result<F32, String> {
    if b == 0.0 { Err("Division by zero") } else { Ok(a / b) }
};

// Immediately-Invoked Function Expression (IIFE):
(name: String) {
    print("Hello, ${name} from IIFE!");
}("World");

6.5. Closures

There is nothing special about closures in NewLang. All functions capture their environment.

var make_adder = (x: I32) {
    // The returned function is a closure.
    // It captures the 'x' from its defining environment (make_adder's scope).
    var adder_fn = (y: I32) -> I32 {
        x + y // 'x' is accessible here because adder_fn is a closure.
    };
    adder_fn
};

var add5 = make_adder(5);    // add5 is now a function that will add 5 to its argument.
var add10 = make_adder(10);  // add10 is now a function that will add 10 to its argument.

print(add5(3));     // Output: 8  (since 5 + 3 = 8)
print(add10(3));    // Output: 13 (since 10 + 3 = 13)

// You can also call it more directly:
var result = make_adder(20)(5); // result = 25
print(result); // Output: 25

7. Control Flow

7.1. `if/else` Expressions

if/else constructs are expressions, meaning they evaluate to a value. An if expression without an else block evaluates to None if the condition is false. This means an else block is not mandatory, even when used in a context requiring a value (e.g., assignment).

var num = -5;
var description: String = if num > 0 {
    "positive"
} else if num < 0 {
    "negative"
} else {
    "zero"
};

if num > 10 { print("Large number"); } // Statement form, no else needed if not assigning.

var value = if num > 10 { "large" }; // `value` will be `None` because num is -5.
var unwrapped = value else "not large"; // `unwrapped` will be "not large".

7.2. `match` Expressions

match provides exhaustive pattern matching and is also an expression. The compiler ensures all possible cases for the matched type are handled. An else branch can be used as a catch-all.

var outcome: Result<F32, String> = Ok(5.0);

var message = match outcome {
    Ok val => "Success with value: ${val}",
    Err msg => "Failure with message: ${msg}"
};
print(message);

var opt_val: Option<I32> = Some(10);
match opt_val {
    Some x if x > 5 => print("Large some: ${x}"), // Guard condition
    Some x => print("Small some: ${x}"),
    None => print("It was None")
};

// Exhaustiveness with `else`
enum MyEnum { A, B, C };
var val: MyEnum = MyEnum::A;
match val {
    MyEnum::A => print("It's A"),
    MyEnum::B => print("It's B"),
    // MyEnum::C is not listed; without the `else` branch below, this would be a compile error
    else => print("It's something else, i.e., C")
};
// A non-exhaustive match without an `else` branch is a compile-time error.

Patterns can include literals, variables (which bind parts of the matched value), destructuring for records/tuples/arrays, and type checks.

7.3. Loops

NewLang supports several loop constructs. break can be used to exit a loop prematurely, and continue to skip to the next iteration.

Range-based for loop: Iterates over a range or an iterable collection.

mut var sum = 0;
for i in 0..10 { sum += i; } // Iterates i from 0 up to (but not including) 10
print(sum); // 45

mut var sum_by_step = 0;
for i in 0..10 by 2 { sum_by_step += i; }; // 0, 2, 4, 6, 8
print(sum_by_step); // 20

var items = ["a", "b", "c"];
for item in items { print(item); }

while loop: Executes as long as a condition is True.

mut var k = 0;
while k < 5 {
    print(k);
    k = k + 1;
    if k == 3 { break; }; // Exit loop
};

do while loop: Like a while loop, but the condition is checked at the end of each iteration, so the body always runs at least once.

mut var k = 0;
do {
    print(k);
    k = k + 1;
} while k < 5;

List Comprehensions (Array Comprehensions): A for block can be used as an expression to create an array. An optional if condition can filter elements.

var squares = for i in 0..5 { i * i }; // [0, 1, 4, 9, 16]

var even_numbers = for i in 0..10 if i % 2 == 0 { i }; // [0, 2, 4, 6, 8]

var processed = for item in ["a", "bb", "ccc"] if item::length() < 3 {
    item::to_upper_case() + "!"
}; // ["A!", "BB!"]

This also applies to while loops.

mut var i = 0;
var squares = while i < 5 { i = i + 1; (i - 1) * (i - 1) }; // [0, 1, 4, 9, 16]

You can use continue and break inside for and while comprehensions. A bare continue skips the current iteration, so no element is produced for it.

var squares = for i in 0..5 {
    if i == 3 { continue; };
    i * i
}; // [0, 1, 4, 16]

You can call continue with a value.

var squares = for i in 0..5 {
    if i == 3 { continue 69; };
    i * i
}; // [0, 1, 4, 69, 16]

Or break with a value.

var squares = for i in 0..5 {
    if i == 3 { break 69; };
    i * i
}; // [0, 1, 4, 69]
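
One way to read these rules is as a collection loop in which continue and break may each contribute a final value. The Python sketch below is a non-normative model, not NewLang code: the names Continue, Break, and for_comprehension are hypothetical, and exceptions are used only to imitate the control flow.

```python
class Continue(Exception):
    """Models `continue;` (no value) or `continue v;` (one value)."""

    def __init__(self, *values):
        super().__init__()
        self.values = values


class Break(Exception):
    """Models `break;` (no value) or `break v;` (one value)."""

    def __init__(self, *values):
        super().__init__()
        self.values = values


def for_comprehension(iterable, body):
    """Collect the value of `body` for each item, honoring Continue and Break."""
    result = []
    for item in iterable:
        try:
            result.append(body(item))
        except Continue as signal:
            result.extend(signal.values)  # a bare continue contributes nothing
        except Break as signal:
            result.extend(signal.values)  # a break value becomes the last element
            break
    return result


def continue_with_value(i):
    if i == 3:
        raise Continue(69)
    return i * i


def break_with_value(i):
    if i == 3:
        raise Break(69)
    return i * i


print(for_comprehension(range(5), continue_with_value))  # [0, 1, 4, 69, 16]
print(for_comprehension(range(5), break_with_value))     # [0, 1, 4, 69]
```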

8. Operators

NewLang supports standard arithmetic, logical, comparison, and bitwise operators with common precedence rules.

8.1. Destructuring and Spread (`...`)

Destructuring assignment works on tuples, arrays, and records. The spread operator (...) can be used in expressions to copy fields from one record to another or elements from one array/tuple to another. It can also capture remaining elements/fields during destructuring.

Destructuring Assignment:

var point: [I32, I32] = [10, 20];
var [x, y] = point; // x is 10, y is 20

var profile = [ name: "Eve", age: 28, city: "Codeville" ];
var [ name, age, ...other_fields ] = profile;
// name is "Eve", age is 28
// other_fields is [ city: "Codeville" ] (a new record)

var numbers = [1, 2, 3, 4, 5];
var [first, second, ...rest_array] = numbers;
// first is 1, second is 2, rest_array is [3, 4, 5]

Spread in Expressions:

var defaults = [ x: 0, y: 0, color: "red" ];
var custom_point = [ ...defaults, y: 10, z: 100 ];
// custom_point is [ x: 0, y: 10, color: "red", z: 100 ]
// Fields listed after the spread override same-named fields from `defaults` (here y); z is added.

var arr1 = [1, 2];
var arr2 = [3, 4];
var combined_arr = [...arr1, ...arr2, 5]; // [1, 2, 3, 4, 5]
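
For comparison, Python's unpacking operators behave much like the spread and rest forms described above. This is only an analogy under the assumption that NewLang objects act like ordered maps and arrays like lists; it is not NewLang code.

```python
# Object spread: later keys override earlier ones, new keys are appended.
defaults = {"x": 0, "y": 0, "color": "red"}
custom_point = {**defaults, "y": 10, "z": 100}
print(custom_point)  # {'x': 0, 'y': 10, 'color': 'red', 'z': 100}

# Array spread: elements are copied in order.
arr1 = [1, 2]
arr2 = [3, 4]
combined_arr = [*arr1, *arr2, 5]
print(combined_arr)  # [1, 2, 3, 4, 5]

# Rest capture during destructuring.
first, second, *rest_array = [1, 2, 3, 4, 5]
print(first, second, rest_array)  # 1 2 [3, 4, 5]
```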

Destructuring with aliases:

var person = [ name: "Alice", age: 30 ];
var [name as person_name, age] = person;
// person_name is "Alice", age is 30

9. Error Handling

NewLang promotes robust error handling primarily through the Result<T, E> and Option<T> types.

9.1. `Result<T, E>` Type

Used for operations that can fail. It is an enum with two variants:

Ok<T>: Represents success and contains a value of type T.
Err<E>: Represents failure and contains an error value of type E.

fn parse_int(s: String) -> Result<I32, String> {
    // Simplified parsing logic
    if s::is_numeric() { Ok(s::to_int()) } // Assume these string methods exist
    else { Err("Invalid integer format") }
};

9.2. `Option<T>` Type

Used for values that might be absent. It is an enum with two variants:

Some<T>: Represents an existing value of type T.
None: Represents the absence of a value (internally tag None = None;).

fn find_user(id UserId) -> Option<User> {
    // ... logic to find user ...
    if found { Some(user_data) } else { None }
};

9.3. Error/None Propagation (`?` Operator)

The ? operator unwraps an Ok<T> or Some<T> value. If the value is Err<E> or None, it immediately returns that Err<E> or None value from the current function. The current function's return type must be compatible (i.e., a Result or Option respectively).

fn process_data() -> Result<String, String> {
    var val1 = parse_int("123")?;       // val1 is 123 (I32)
    var val2 = parse_int("abc")?;       // This call returns Err "Invalid integer format".
                                        // The `?` propagates this Err from process_data().
    var sum_str = (val1 + val2)::to_string();
    Ok("Sum: ${sum_str}")
};

fn get_first_char_option(data Option<String>) -> Option<Char> {
    var s = data?; // If data is None, returns None from here. s is String.
    if s::is_empty() { None } else { Some(s::char_at(0)) }
};
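
The ? operator can be understood as shorthand for an explicit early return. The Python sketch below spells that out as a non-normative model: the Ok and Err classes and the Python parse_int are stand-ins, and process_data takes its two inputs as parameters (unlike the version above) so that both the success and failure paths can be exercised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ok:
    value: object


@dataclass(frozen=True)
class Err:
    error: object


def parse_int(s):
    """Python stand-in for the parse_int function above."""
    return Ok(int(s)) if s.isdecimal() else Err("Invalid integer format")


def process_data(a, b):
    """Each `?` becomes: return early on Err, otherwise unwrap the Ok value."""
    r1 = parse_int(a)
    if isinstance(r1, Err):
        return r1  # propagate the Err unchanged
    val1 = r1.value

    r2 = parse_int(b)
    if isinstance(r2, Err):
        return r2
    val2 = r2.value

    return Ok(f"Sum: {val1 + val2}")


print(process_data("123", "abc"))  # Err(error='Invalid integer format')
print(process_data("123", "4"))    # Ok(value='Sum: 127')
```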

9.4. Unwrapping with Default

A common pattern is to use a match expression to unwrap a value with a default.

var data = Some("hello");
var first_char = match data {
    Some s => s,
    None => "I am empty",
};

We can make this more concise with the else keyword.

var data = Some("hello"); // type is Option<String>
var first_char: String = data else "I am empty";

Same goes for Results.

var data: Result<String, String> = Ok("hello");
var first_char: String = data else "I was an error";

You can use the special err keyword to handle the error dynamically if you pass a block to else (thereby creating a new lexical scope). For Result<T, E>, err is bound to the error value, of type E.

var data: Result<String, String> = Err("Not Found");
var message: String = data else { "Error: ${err}" };

You can also use continue or break.

for i in [Ok(1), Err("Not Found"), Ok(3)] {
    var x = i else continue;
    print(x);
}
// or with break
for i in [Ok(1), Err("Not Found"), Ok(3)] {
    var x = i else break;
    print(x);
}
// or inline
for i in [Ok(1), Err("Not Found"), Ok(3)] {
    print(i else break);
}

Here's a cool example of it all generalizing well with list comprehensions.

var data = [Ok(1), Err("Not Found"), Ok(3), Err("Not Found"), Ok(12)];
var just_the_ok_values = for i in data { i else continue }; // [1, 3, 12]
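
The else forms above can be modeled in a few lines of Python. This is only a sketch of the described behavior: Ok, Err, and unwrap_or_else are hypothetical names made up for the illustration, and the callback argument plays the role of the err binding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ok:
    value: object


@dataclass(frozen=True)
class Err:
    error: object


def unwrap_or_else(result, on_err):
    """Model of `result else { ... }`: on_err receives what NewLang binds to err."""
    if isinstance(result, Ok):
        return result.value
    return on_err(result.error)


message = unwrap_or_else(Err("Not Found"), lambda err: f"Error: {err}")

# Model of `for i in data { i else continue }`: failed items are skipped.
data = [Ok(1), Err("Not Found"), Ok(3)]
ok_values = [item.value for item in data if isinstance(item, Ok)]
```

Here message ends up as "Error: Not Found" and ok_values keeps only the unwrapped successes, in their original order.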

9.5. Panic Unwrapping (`!` Operator)

The ! operator unwraps an Ok<T> or Some<T> value. If the value is Err<E> or None, the program panics and typically terminates. This should be used sparingly, only when the presence of the value is a guaranteed invariant.

var config = load_config()!; // If load_config returns Err, panics.
var definitely_present_value = Some(10)!; // value is 10.
// var failure = None!; // Panics!

A panic includes a stack trace and error message.

9.6. Defining Custom Error Types

For more structured error handling, E in Result<T, E> can be any type, often a custom enum or record.

enum FileError {
    NotFound(String),
    PermissionDenied,
    IOError,
}

fn read_file_contents(path: String) -> Result<String, FileError> {
    if !file_exists(path) { return Err(FileError::NotFound(path)); }
    // ... other checks and operations ...
    Ok("file contents")
};

10. Methods and Properties on Types

10.1. Extension Methods

Functions can be associated with a type to define its extension methods. New behavior can be added to any existing type, even those from external modules or primitives. The this keyword refers to the instance the method is called on, and is lexically bound to the type you are defining the method on.

tag Counter = [ value: I32 ];

// Define a method on the Counter type (impl mut, because increment modifies this)
impl mut Counter {
    fn increment(amount: I32) {
        this.value = this.value + amount;
    },
};

impl Counter {
    fn get_value() -> I32 {
        this.value
    },
};

mut var c = Counter [ value = 0 ];
c::increment(5);
print(c::get_value()); // Output: 5

Methods can be defined for tag types, type aliases (structurally), and even primitive types (though less common for primitives directly).

You can use type inference when defining a method, like with any other function.

impl Counter {
    fn get_value() {
        this.value
    },
};

// Define an extension method for any type that has a 'length' extension method that returns an I32
impl ::length() -> I32 {
    fn describe_length() {
        "This item has a length of ${this::length()}."
    }
};

var my_array = [1, 2, 3]; // Arrays have a `length` extension
print(my_array::describe_length()); // "This item has a length of 3."

var my_string = "hello"; // Strings also have a `length` extension
print(my_string::describe_length()); // "This item has a length of 5."

// Extension methods can also be grouped together:
impl String {
    fn lines() -> String[] {
        this::split("\n")
    },
    fn describe_length() {
        "This item has a length of ${this::length()}."
    }
}
var text = "line1\nline2";
print(text::lines()); // ["line1", "line2"]
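
The structural form, impl ::length() -> I32, is close in spirit to duck typing. As a loose analogy only, the Python sketch below accepts any value that supports len(). Python checks this at run time (or through an optional type checker), whereas the structural impl above is presumably resolved by the NewLang compiler.

```python
from collections.abc import Sized


def describe_length(item: Sized) -> str:
    """Works for any value that supports len(), whatever its nominal type."""
    return f"This item has a length of {len(item)}."


array_description = describe_length([1, 2, 3])
string_description = describe_length("hello")
```

The same function serves lists and strings because both satisfy the structural requirement, which mirrors how describe_length applies to arrays and strings above.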

To mimic an object-oriented style, you can use these blocks to define lots of methods on a type.

tag Person = [ name: String, age: I32 ];
impl Person {
    fn greet() { print("Hello, ${this.name}!") },
    fn say_age() { print("I am ${this.age} years old.") },
};

var p = Person([ name = "Alice", age = 30 ]);
p::greet();
p::say_age();

If you want to operate on a mutable type, just specify so in the impl declaration.

impl mut Person {
    fn increment(amount I32) {
        // now this is of type mut Person
        this.age = this.age + amount;
    },
};

mut var p = Person([ name = "Alice", age = 30 ]);
p::increment(1);
print(p.age); // Expected output: 31

Conflict Resolution for Extension Methods:

Extension methods on a tag always take precedence over extension methods on a type.
Extension methods from within the same block take precedence over extension methods from other blocks.
For now, attempting to define the same extension method for the exact same tag or potentially overlapping structural type within the same block will be a compile error.

If you define multiple extension methods with the same name on different members of a union type, the method will be defined on the union type itself.

tag Dog = [ name: String ];
tag Cat = [ name: String ];
tag Animal = Dog | Cat;

impl Dog {
    fn make_sound() -> String {
        "Woof"
    },
};

impl Cat {
    fn make_sound() -> I32 {
        42
    },
};

var dog = Dog([ name = "Fido" ]);
var cat = Cat([ name = "Whiskers" ]);

var animals: Animal[] = [dog, cat];

for animal in animals {
    var sound = animal::make_sound(); // type is String | I32
    print(sound); // "Woof" and 42
}
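
One way to picture a method that exists on a union because each member defines it is dispatch on the runtime type. The sketch below uses Python's functools.singledispatch as an analogy. It is not how NewLang resolves the call, and the dataclasses are stand-ins for the Dog and Cat tags.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Union


@dataclass(frozen=True)
class Dog:
    name: str


@dataclass(frozen=True)
class Cat:
    name: str


@singledispatch
def make_sound(animal) -> Union[str, int]:
    # Fallback for types that never registered an implementation.
    raise TypeError(f"make_sound is not defined for {type(animal).__name__}")


@make_sound.register
def _(animal: Dog) -> str:
    return "Woof"


@make_sound.register
def _(animal: Cat) -> int:
    return 42


animals = [Dog("Fido"), Cat("Whiskers")]
sounds = [make_sound(a) for a in animals]
```

The list sounds has the element type str | int, matching the String | I32 result type in the example above.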

10.2. Static Methods, Properties, and Types

Constant values or functions that belong to a type itself (rather than instances) can be defined as static properties/methods.

tag MathConstants;

impl MathConstants {
    static pi: F32 = 3.1415926535;
    static e: F32 = 2.71828;
}

var my_pi: F32 = MathConstants::pi;

You can also define static methods on a type like this:

tag MathConstants;

impl MathConstants {
    static fn sin(x F32) -> F32 {
        // ... implementation of sin ...
    },
};

var sin_of_one: F32 = MathConstants::sin(1.0);
// var cos_of_one: F32 = MathConstants::cos(1.0); // Error: no static method 'cos' is defined on MathConstants

You can also define static types on a type.

impl MathConstants {
    static type Pi = F32;
}

var pi: MathConstants::Pi = 3.1415926535;

11. Concurrency

NewLang provides high-level concurrency primitives inspired by Go's goroutines and channels, simplifying concurrent programming. It uses a model of lightweight, cooperatively scheduled concurrent tasks managed by a runtime.

11.1. `spawn`: Launching Concurrent Tasks

The spawn keyword starts a new concurrent task (similar to a goroutine). It takes a block of code (a thunk or function call) to execute concurrently. spawn immediately returns a Future<T>, where T is the result type of the block.

fn long_computation(id I32) -> String {
    // Simulate work
    sleep(1000); // Assume sleep function exists
    "Computation ${id} done"
};

var future1: Future<String> = spawn { long_computation(1) };
var future2: Future<String> = spawn { long_computation(2) };

print("Tasks spawned.");

11.2. `Future<T>`: The Result of a Concurrent Task

A Future<T> is a handle to the eventual result of a concurrent computation. The actual computation may or may not have completed when the Future is returned.

11.3. `await`: Waiting for Results

The await keyword pauses the current task's execution until a Future<T> is resolved (i.e., its computation completes or fails). await then returns the result.

If the spawned task completes successfully with a value v of type T, await future evaluates to v.
If the spawned task returns a Result<T, E>, the future has type Future<Result<T, E>>, and await future evaluates to that Result. An Err(e) therefore comes back as an ordinary value and is handled like any other Result, as in the error-handling example below.
If the spawned task panics, the await future expression re-raises that panic in the awaiting task.

// (Continuing from above)
print("Waiting for future1...");
var result1: String = await future1; // Pauses here until future1 completes
print("Result 1: ${result1}");

print("Waiting for future2...");
var result2: String = await future2; // Pauses here until future2 completes
print("Result 2: ${result2}");

// Error Handling with Futures:
fn task_that_might_fail(should_fail: Bool) -> Result<I32, String> {
    sleep(500);
    if should_fail { Err("Task failed intentionally") } else { Ok(100) }
};

var future_ok: Future<Result<I32, String>> = spawn { task_that_might_fail(False) };
var future_err: Future<Result<I32, String>> = spawn { task_that_might_fail(True) };

var res_ok = await future_ok; // res_ok is Ok(100) of type Result<I32, String>
match res_ok {
    Ok v => print("Succeeded: ${v}"),
    Err e => print("Failed: ${e}"),
};

var res_err = await future_err; // res_err is Err("Task failed intentionally")
match res_err {
    Ok v => print("Succeeded: ${v}"), // Won't happen
    Err e => print("Failed: ${e}"),   // Prints this
};

// Panics in spawned tasks:
var future_panic = spawn {
    panic("Something went wrong in spawned task!");
    "never reached" // Type String
};
// var panicking_result: String = await future_panic; // This line would cause the current task to panic.
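
For readers who want to experiment with these semantics today, Python's concurrent.futures behaves similarly in the ways that matter here: submit returns a future immediately, result() blocks until the task finishes, and an exception raised inside the task is re-raised in the caller. This is an analogy only. It uses OS threads rather than the lightweight, cooperatively scheduled tasks described above, and failing_task is a made-up stand-in for a panicking block.

```python
from concurrent.futures import ThreadPoolExecutor


def long_computation(task_id: int) -> str:
    return f"Computation {task_id} done"


def failing_task() -> str:
    raise RuntimeError("Something went wrong in spawned task!")


with ThreadPoolExecutor(max_workers=2) as pool:
    future1 = pool.submit(long_computation, 1)  # like `spawn { long_computation(1) }`
    future_panic = pool.submit(failing_task)    # like the panicking spawn above

    result1 = future1.result()                  # like `await future1`
    try:
        future_panic.result()                   # the failure resurfaces here
        panic_message = None
    except RuntimeError as exc:
        panic_message = str(exc)
```

The failure stays dormant until the future is awaited, which is the same point at which the commented-out await future_panic line above would panic.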

You can await a Future<T> without first assigning it to a variable, like this:

print(await spawn { long_computation(1) });

You can await a tuple of Future<T>s to simultaneously await all of them and get a tuple of the results, like this:

var results: [String, String] = await [spawn { long_computation(1) }, spawn { long_computation(2) }];
print(results[0]);
print(results[1]);

Same with lists. You can await a list of Future<T>s to simultaneously await all of them and get a list of the results, like this:

var results = await for i in 0..10 { spawn { long_computation(i) } }; // results is type String[]
for result in results {
    print(result);
};
// or even this — remember, `for` blocks are expressions!
for result in await for i in 0..10 { spawn { long_computation(i) } } {
    print(result);
};
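
Awaiting a whole list can be pictured as "start everything first, then collect in order". The Python sketch below shows that shape with a thread pool. It is an analogy under the assumption that results come back in spawn order, and it uses three tasks to keep the output short.

```python
from concurrent.futures import ThreadPoolExecutor


def long_computation(task_id: int) -> str:
    return f"Computation {task_id} done"


with ThreadPoolExecutor() as pool:
    # Start every task first, like `for i in ... { spawn { long_computation(i) } }`.
    futures = [pool.submit(long_computation, i) for i in range(3)]
    # Then wait for all of them; results keep the order the tasks were spawned in.
    results = [future.result() for future in futures]
```

Because all tasks are submitted before any result is requested, the total wait is governed by the slowest task rather than the sum of all of them.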

NewLang's spawn/await system aims to avoid "colored functions" (async/sync distinction infecting call stacks), making concurrency more natural to integrate. The underlying runtime manages scheduling and resource allocation for these lightweight tasks.

TODO: richer patterns (e.g., fan-in, fan-out, timeouts, pipelines, streams), you might end up needing something like channels + select {} blocks

12. Modules and Program Structure

NewLang code is organized into modules. Each file typically represents a module.

12.1. File-based Modules

A file with the .newlang extension (e.g., my_module.newlang) constitutes a module. The module's name is implicitly derived from its filename or can be declared.

12.2. `export` Keyword

By default, all top-level declarations (types, functions, constants) within a module are private to that module. The export keyword makes a declaration public and thus importable by other modules.

// In file: utils.newlang
export tag UserId = I32;

export fn helper_function() {
    print("Utility action.");
};

var private_constant: I32 = 123; // Not exported

12.3. `import` Keyword

Modules can use declarations from other modules using the import keyword.

Import specific items:

// In file: main.newlang
import { UserId, helper_function } from "./utils.newlang";

var user_id: UserId = UserId(10);
helper_function();

Import with aliasing:

import { UserId as UID, helper_function as util_fn } from "./utils.newlang";
var user_id: UID = UID(20);
util_fn();

Import all exported items under a namespace:

import * as Utils from "./utils.newlang";

var user_id: Utils.UserId = Utils.UserId(30);
Utils.helper_function();

// TODO: this needs improvement, including project root and library paths, etc.

12.4. Module Resolution and Directory Structure

Module paths are typically relative to the current file or can be resolved based on a project configuration (e.g., from a project root or library paths).

"./module.newlang": Refers to module.newlang in the same directory.
"../module.newlang": Refers to module.newlang in the parent directory.
"./directory_name/module.newlang": Refers to module.newlang within directory_name.

A directory can itself be a module if it contains a special file, e.g., mod.newlang (or index.newlang). If directory_name/mod.newlang exists: import { item } from "./directory_name"; would import item exported from directory_name/mod.newlang.
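
The lookup rules above can be sketched as a small resolver. This is a hypothetical model, not the compiler's algorithm: the function name, the in-memory set of files, and the order of the candidates (exact file, then mod.newlang, then index.newlang) are all assumptions made for the illustration.

```python
import posixpath
from typing import Optional, Set


def resolve_module(importer: str, spec: str, files: Set[str]) -> Optional[str]:
    """Resolve an import path relative to the importing file."""
    base = posixpath.dirname(importer)
    target = posixpath.normpath(posixpath.join(base, spec))
    candidates = [
        target,                                  # "./module.newlang"
        posixpath.join(target, "mod.newlang"),   # directory used as a module
        posixpath.join(target, "index.newlang"),
    ]
    for candidate in candidates:
        if candidate in files:
            return candidate
    return None


project = {
    "src/main.newlang",
    "src/utils.newlang",
    "src/net/mod.newlang",
    "lib.newlang",
}
```

With this project layout, resolve_module("src/main.newlang", "./net", project) finds src/net/mod.newlang, and a path that matches nothing yields None, which a real compiler would report as an import error.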

12.5. Program Entry Point

The entry point for a NewLang executable program is a function named main in the root module (or a specifically designated entry module). The main function takes no arguments and returns None or Result<None, ErrorType> for top-level error reporting.

// In main.newlang (or the project's entry file)
export fn main() -> Result<None, String> {
    print("Application started.");
    // ... application logic ...
    if error_occurred {
        Err("An error occurred during execution.")
    } else {
        Ok(None)
    }
};

13. Metaprogramming

13.1. Type Functions

Type functions are functions that return a type rather than a value. They are defined with the type keyword, but otherwise look a lot like regular functions. They can take types as generics and return types as output. They can also take values as parameters, but those values must be known at compile time. To specify that a value must be known at compile time, put the @ operator in front of the type. Because every compile-time-known T is still a T, @T is always a subtype of T. A type function returns a TypeResult<type, String>; if it returns an Err with a String message, you will get nice error squiggles and messages in the editor where the type function was called. You can make a type mutable by adding the mut keyword, just like variables.

type CSVRecord = (path: @String) {

    var file = await File::open(path)?;
    var content = await file::read()?;
    var records = content::split("\n");

    if records::length() < 2 {
        return Err("CSV file must have at least 2 lines");
    };

    var header = records[0]::split(",");
    var first_record = records[1]::split(",");

    mut type CSVRecord = [];

    for i in 0..header::length() {

        var field_name = header[i];
        var field_value = first_record[i];

        type Value = match field_value {
            /^\d+$/ => I32,
            /^[0-9]+\.[0-9]+$/ => F64,
            else => String,
        };

        CSVRecord += [ [field_name::to_camel_case()]: Value ];
    };

    CSVRecord
}

// let's say file.csv is:
//      name,age,email
//      John Doe,25,john.doe@example.com
//      Jane Smith,30,jane.smith@example.com

type MyCSVRecord = CSVRecord("file.csv");
// type is [ name: String, age: I32, email: String ], known by the language server!

var record: MyCSVRecord = [
    name: "John Doe",
    age: 25,
    email: "john.doe@example.com",
];

The @ operator is automatically added to type function parameters.
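
The inference that the CSVRecord type function performs can be prototyped at run time. The Python sketch below is a rough model under a few assumptions: infer_record_type is a made-up name, it takes the CSV text directly instead of a file path, it skips the to_camel_case step, and it produces a dictionary of Python types where the type function would produce a NewLang object type at compile time.

```python
import re
from typing import Dict


def infer_record_type(csv_text: str) -> Dict[str, type]:
    """Infer a field -> type mapping from the header and the first record."""
    lines = csv_text.split("\n")
    if len(lines) < 2:
        raise ValueError("CSV file must have at least 2 lines")

    header = lines[0].split(",")
    first_record = lines[1].split(",")

    fields: Dict[str, type] = {}
    for name, value in zip(header, first_record):
        if re.fullmatch(r"\d+", value):
            fields[name] = int      # plays the role of I32
        elif re.fullmatch(r"[0-9]+\.[0-9]+", value):
            fields[name] = float    # plays the role of F64
        else:
            fields[name] = str      # plays the role of String
    return fields


record_type = infer_record_type("name,age,email\nJohn Doe,25,someone")
```

Only the first data row is inspected, as in the type function above, so a column whose first value looks like an integer is typed as an integer even if later rows disagree.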

13.2. Macros

You can define macros using the macro keyword for custom code generation. Macros are functions that return an AST Expression that contains other AST nodes like Statements or Blocks. They can also return semantic information about the macro call content to inform syntax highlighting in the code editor. They are passed arbitrary strings that can be lexed and parsed to create custom syntax. They also have access to the scope of the macro call.

For example, you should be able to do something like this, and it should be properly syntax highlighted if the macro is well built. If you hover over the macro call, it should show the code that was generated, and if you hover over the x variable, it should show you the type of x, etc.

var x = 1;
var dom = #jsx(
    <div>
        <h1>Hello {x}</h1>
    </div>
);

// or just codegen a block
#jsx();

To define a macro, you can use the macro keyword.

macro jsx(content @String, scope Scope) {
    return [
        code: some_ast_block,
        // Semantic information about the macro call content: basically a list of
        // tokens and their spans, each with the token's type, a link to the matching
        // variable in the scope, and any errors that apply.
        semantics: some_semantic_info,
    ];
}

There are some rules for the structure of the macro string such that the compiler knows when it ends. Mainly, it must be parenthesis-balanced.

14. Comments

Comments are written like this:

// This is a comment

Multi-line comments are written like this:

/*
This is a multi-line comment
*/

15. Regular Expressions

NewLang has built-in regular expression literals, similar to TypeScript.

var re = /hello/; // type is Regex
var matches = re::matches("hello world"); // matches is type String[]

var valid = re::validate("hello world"); // valid is type Bool

// etc.

16. Unit Testing

NewLang has built-in unit testing. You can use the test keyword to define a test block.

test "test name" {
    assert(1 + 1 == 2);
}

Create groups of tests by using describe blocks.

describe "test group name" {
    test "test name" {
        assert(1 + 1 == 2);
    }
}

That's it for now.

17. Future Considerations

While this specification covers the core language, future development might include:

Extended Standard Library: Comprehensive APIs for collections, I/O, networking, date/time, etc.
Foreign Function Interface (FFI): Interoperability with other languages (e.g., C, Rust, WebAssembly).
Package Management & Build System: Standardized tools for managing dependencies and building projects.
Detailed Memory Model: Precise rules for GC behavior, especially concerning & references and potential optimizations like escape analysis.
Variance Annotations: For generic types (in, out).
Operator Overloading: Perhaps like operator (lhs: Vec) + (rhs: Vec) { Vec [ x: lhs.x + rhs.x, y: lhs.y + rhs.y ] }

joshdchang/newlang
