hophop

hophop is a small programming language in the C/Go family. The reference compiler is written in strict C11 and can run programs through an evaluator, generate WASM modules and generate freestanding C code for a hophop package.

Source on GitHub
Language specification [md]

Example

fn main() {
    greetings := [
        "Hej världen!"
        "Hello, world!"
        "¡Hola Mundo!"
        "Γειά σου Κόσμε!"
        "Привіт, світе!"
        "こんにちは世界!"
    ]
    for greeting in greetings {
        print(greeting)
    }
}

Run it directly:

$ hop run hello.hop
Hej världen!
Hello, world!
¡Hola Mundo!
Γειά σου Κόσμε!
Привіт, світе!
こんにちは世界!

Compile a wasm-compatible package to a WASM module:

$ hop build --platform wasm-min app.hop -o app.wasm

Language guide

Functions

Functions are declared with fn. Programs start at fn main().

fn add(a, b i32) i32 {
    return a + b
}

fn main() {
    assert add(1, b: 2) == 3
}

Function-call arguments must be named at the call site, with the exception of the first argument. "Named" means either using a variable with the same name as the parameter or an explicit label param: value.

HopHop uses uniform function call syntax to support expr.function() calls without special syntax or a distinction between functions and methods.

struct Vec2 {
    x, y f64
}

fn mul(v Vec2, exp f64) Vec2 {
    return { x: v.x*exp, y: v.y*exp }
}

fn main() {
    var v = Vec2{ x: 3.0, y: 4.0 }
    assert mul(v, exp: 4.3) == v.mul(4.3)
}

Comments

HopHop accepts // line comments and /* ... */ block comments. Block comments may nest, which makes it possible to comment out code that already contains block comments.

Most semicolons are inserted from newlines, so ordinary code is written one statement per line.

fn main() {
    // line comment
    /* outer
       /* nested */
       outer */
    print("comments")
}

Literals

Literals include numbers, strings, runes, booleans, null, structs and arrays.

Rune literals use single quotes and represent one Unicode codepoint. null is only assignable where the type explicitly accepts it, such as optionals and rawptr.

fn main() {
    n      := 42
    hex    := 0xff
    pi     := 3.14
    msg    := "hello\n"
    raw    := `hello\n`
    letter := 'å'
    ok     := true
    array  := [1, 2, 3]

    assert n + hex == 297
    assert pi > 3.0
    assert len(msg) == 6
    assert len(raw) == 7
    assert letter == 'å'
    assert ok
    assert array[0] == 1
}

Variables

var creates mutable local storage. const creates a compile-time value and must have an initializer.

The type can be written explicitly or inferred from the initializer. var x T zero-initializes most types, while direct pointer and reference locals must be assigned before use.

fn main() {
    const limit = 10
    var count     = 0
    var total i32 = 0
    var ready bool

    count += limit
    total = count as i32
    assert total == 10
    assert !ready
}

:= is local short-assignment syntax. It assigns to an existing local when one exists, otherwise it declares a new mutable local inferred from the right-hand side.

Regular assignment uses =, compound assignment uses operators such as +=, and multi-assignment evaluates right-hand sides before storing left-to-right.

fn main() {
    x := 1
    x += 2
    y, z := 3, 4
    y, z = z, y
    assert x == 3
    assert y == 4 && z == 3
}

Lexical scope

Names are block scoped, and the nearest declaration wins. The _ name is a discard hole: it accepts a value without creating or updating a binding.

Top-level declarations are collected before function bodies are checked, so functions can call declarations that appear later in the file.

fn main() {
    var x = 1
    {
        var x = 2
        var _, y = x, 3
        assert y == 3
    }
    assert x == 1
}

Types

Built-in types include bool, fixed-width integers, pointer-sized int and uint, f32, f64, rawptr and type. Text uses str, and rune is a Unicode codepoint.

Constant numeric expressions use const_int and const_float until they are assigned or cast to concrete numeric types.

fn main() {
    var signed i32    = -1
    var size   uint   = 10
    var text   &str   = "hej"
    var r      rune   = 'h'
    var p      rawptr = null

    assert signed < 0
    assert size == 10
    assert len(text) == 3
    assert r == 'h'
    assert p == null
}

Operators

HopHop has the usual arithmetic, bitwise, relational and logical operators. Assignment is an expression form, but the left side must be assignable.

Numeric conversions between concrete types are explicit. Casts use as, and pointer/reference casts through rawptr are the explicit low-level escape hatch.

fn main() {
    var a i32 = 10
    var b i64 = a as i64
    var ok    = a > 0 && b < 100
    var p     = null as rawptr

    assert ok
    assert p == null
}

Control flow

if takes a bool condition expression and splits execution into two branches.

fn greet(name ?&str) {
    if name {
        print(name)
    } else {
        print("anonymous")
    }
}

When an "optional" value (?T) is used as the condition, its effective type is narrowed inside the branches. A non-null branch sees the payload type, while the null branch sees the null case.

for supports infinite loops, condition loops, C-style loops and for ... in iteration. break exits a loop or switch, and continue starts the next loop iteration.

The for ... in form can bind values, key/value pairs or discard values with _.

fn count(items &[i32]) i32 {
    var total i32 = 0
    for i, value in items {
        total += (i as i32) + value
    }
    return total
}

switch supports expression switches and condition switches. Cases are tested left-to-right, there is no fallthrough, and finite domains such as bool and enums must be exhaustive unless default is present.

Enum payload variants can be narrowed by switching on the enum value.

fn classify(n i32) &str {
    switch {
        case n < 0  { return "negative" }
        case n == 0 { return "zero" }
        default     { return "positive" }
    }
}

Error handling

assert checks a condition and traps if it fails. panic traps with a message and returns no value.

defer schedules a statement or block to run when the current scope exits through structured control flow: reaching the end of the scope, return, break or continue.

fn use_value(x i32) {
    defer print("leaving")
    assert x >= 0, "expected non-negative"
    if x == 0 {
        panic("zero")
    }
}

Arrays

Arrays have fixed length in the type. Slices are unsized views and must be used through pointer or reference forms when passed around.

len reports the length of strings, arrays and slices. Indexing and slicing use bracket syntax, and copy(dst, src) copies sequence elements.

fn first(xs &[i32]) i32 {
    assert len(xs) > 0
    return xs[0]
}

fn prefix(xs &[i32]) &[i32] {
    return xs[0:2]
}

Pointers

*T is writable reference-like access to T. &T is read-only reference-like access.

The address-of operator & forms a read-only reference, and unary * dereferences a pointer or reference. Slice mutability follows the wrapper: *[T] is writable, while &[T] is read-only.

fn read(x &i32) i32 {
    return *x
}

fn set(x *i32, value i32) {
    *x = value
}

Optional

?T represents either a T or null. A plain T can lift into ?T, but an optional does not implicitly convert back to T.

Use control-flow narrowing for ordinary code and postfix ! when an explicit runtime null trap is intended.

fn length(s ?&str) int {
    if s == null {
        return 0
    }
    return len(s)
}
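
A minimal sketch of postfix !, assuming the operator is written directly after the optional expression. must_length is a hypothetical name used only for illustration.

fn must_length(s ?&str) int {
    // s! traps at runtime when s is null, otherwise it yields the &str payload
    return len(s!)
}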

Structs

struct groups named fields, and union stores one of several field layouts in the same storage. Fields can have defaults, and omitted struct fields are initialized from defaults or zero values.

Unions may initialize at most one field explicitly.

struct Point {
    x i32 = 0
    y i32 = 0
}

union Word {
    i i32
    u u32
}

Enums

Enums have an integer base type, and enum items are scoped under the enum type. Plain enum values are selected as Name.Item.

Variants may carry payload types. Struct payload constructors use compound-literal syntax, and switches can narrow payload variants for field access.

enum Result i32 {
    Ok struct {
        value i32
    }
    Err struct {
        code i32
    }
}

fn read(r Result) i32 {
    switch r {
        case Result.Ok  { return r.value }
        case Result.Err { return 0 }
    }
}

fn main() {
    assert read(Result.Ok{ value: 7 }) == 7
}

Compound literals

Compound literals use named fields. A literal may name its type, or it may be inferred from an expected aggregate type.

Field names can be dotted for nested initialization. Explicit initializers override defaults for the initialized path.

struct Size {
    w i32
    h i32
}

fn main() {
    var size      = Size{ w: 640, h: 480 }
    var same Size = { w: 640, h: 480 }
    assert size.w == same.w
    assert size.h == same.h
}
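
As a sketch of dotted field names, assuming a dotted path is written as outer.inner: value. Window is a hypothetical struct that nests the Size type from the example above.

struct Window {
    title &str
    size  Size
}

fn main() {
    // size.w and size.h initialize fields of the nested Size value
    var win = Window{ title: "main", size.w: 640, size.h: 480 }
    assert win.size.w == 640
    assert win.size.h == 480
}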

Type aliases

type Name T declares a distinct named type with T as its base. Assignment can implicitly peel from the alias to its base type, but not from the base type back to the alias.

Type names and value names live in separate namespaces, so a type and a function can share a spelling when their uses are unambiguous.

type UserId u64

fn raw(id UserId) u64 {
    return id
}

Packages

A package is a file or directory; there is no package keyword. Imports appear before top-level declarations.

pub exports a top-level declaration. Imports can use the default alias, an explicit alias, a side-effect-only _ alias or named symbol imports.

import "compiler" as c { error }

pub fn fail_with_alias() {
    c.error("stopped by package alias")
}

pub fn fail_with_symbol() {
    error("stopped by named import")
}

Function overloading

Functions may be overloaded by signature. Overload resolution ranks conversion costs deterministically and reports an ambiguity when there is no single best match.

For calls only, recv.f(args...) can resolve as f(recv, args...). Real fields take precedence over selector-call sugar.

struct Cat {
    score int
}

struct Dog {
    score int
}

fn pick(v Cat) int {
    return v.score
}

fn pick(v Dog) int {
    return v.score
}

fn main() {
    cat := Cat{ score: 9 }
    dog := Dog{ score: 4 }
    assert pick(cat) == 9
    assert dog.pick() == 4
}

Tuples

Tuple types and tuple-style result clauses represent multiple values. Functions with tuple returns must return one value per position.

fn apply(f fn(i32) i32, x i32) i32 {
    return f(x)
}

fn divmod(a, b i32) (i32, i32) {
    return a / b, a % b
}
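
A call to divmod might look like the sketch below. It assumes that tuple results can be bound with the multi-assignment form shown under Variables, which the text above does not confirm.

fn main() {
    // b is labeled because only the first argument may be left unnamed
    q, r := divmod(7, b: 2)
    assert q == 3
    assert r == 1
}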

Memory management

HopHop is a manual memory-management language. You control allocations and deallocations with alloc and dealloc.

alloc T allocates memory and returns *T. alloc [T n] allocates an array and returns *[T] (dynamically sized) or *[T n], depending on the receiver type and on whether n can be computed at compile time.

fn make_count() *i32 {
    var p = alloc i32
    *p = 1
    return p
}

fn release(p *i32) {
    dealloc p
}

alloc and dealloc use the allocator defined by context.allocator by default. You can specify an explicit allocator with in, e.g. v := alloc [i32 3] in my_allocator.
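
A sketch of the explicit form, assuming Allocator is the type of context.allocator (as used in the Context section) and that dealloc accepts the same in clause as alloc. make_triple and release_triple are hypothetical names.

fn make_triple(my_allocator Allocator) *[i32 3] {
    // n is a compile-time constant here, so the result is assumed to be *[i32 3]
    v := alloc [i32 3] in my_allocator
    return v
}

fn release_triple(v *[i32 3], my_allocator Allocator) {
    // assumption: dealloc takes an explicit allocator the same way alloc does
    dealloc v in my_allocator
}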

Unlike in C and Go, *T cannot be null in hophop. Use ?*T (an optional) when something may be null, and check it before use.

fn example(never_null *int, may_be_null ?*int) {
    if may_be_null {
        // type automatically narrowed in positive branch
        assert typeof(may_be_null) == type *int
    }
}

Strings

str is UTF-8 text, and most string operations are byte-oriented.

fn main() {
    name := "hello" // type is &str
    print(name)
}

str is a specialized form of [u8] that guarantees valid UTF-8 text. An expression of type &str can be used wherever &str or &[u8] is needed, and an expression of type *str can be used as *str, &str, *[u8] or &[u8]. The reverse does not hold: an expression of type &[u8] or *[u8] cannot be used as &str or *str without an explicit cast. Such a cast is the only way to break the guarantee that a string contains valid UTF-8 data.
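
These conversion rules can be sketched as follows, assuming the explicit cast is spelled with as like other casts. byte_count and from_bytes are hypothetical names.

fn byte_count(bytes &[u8]) int {
    return len(bytes)
}

fn from_bytes(bytes &[u8]) &str {
    // explicit cast: the caller is responsible for bytes being valid UTF-8
    return bytes as &str
}

fn main() {
    text := "hej"                      // type is &str
    assert byte_count(text) == 3       // &str is accepted where &[u8] is needed
    assert len(from_bytes(text)) == 3  // going back to &str requires the cast
}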

Generics

Structs, unions, enums, type aliases and functions may declare type parameters with brackets after the declaration name. Type parameters are compile-time values of metatype type.

Named generic types are instantiated in type positions with type arguments. Generic function calls infer type arguments from ordinary arguments.

struct Box[T] {
    value T
}

fn get[T](box Box[T]) T {
    return box.value
}

Types as values

Types can be treated as values, for example i32.kind() or assert i64 != u32. The type of a type is type, which together with compile-time evaluation allows declaring functions that take types as arguments and/or produce types.

The type expr syntax can be used in non-type contexts where the syntax would otherwise be ambiguous. For example, &T means "make a reference to expression T" in a value grammar position, while in a type position it means "reference type with base T". Writing type T disambiguates to always mean "the type T". typeof(x) returns the type of an expression.

Reflection helpers such as kind, base, is_alias, type_name, ptr, slice and array operate on type values.

fn main() {
    const T type = typeof(123 as i32)
    const P type = ptr(T)

    assert T == i32
    assert P == ptr(i32)
}

Variadic parameters

A final parameter may be variadic with name ...T. In the body, concrete variadic parameters behave like a slice of T.

At the call site, arguments can be passed one by one, or a final argument can be spread with the ... operator.

fn sum(values ...i32) i32 {
    var total i32 = 0
    for value in values {
        total += value
    }
    return total
}

fn main() {
    assert sum(1, 2, 3) == 6
}

Function closures

Anonymous functions use fn(...) { ... } in expression context. They produce ordinary function values, so they can be assigned to locals, passed to other functions and called indirectly.

When an anonymous function references enclosing locals, it forms a limited stack closure. Captured locals may be read and mutated, but the closure must not outlive the scope that owns those locals. This keeps callbacks allocation-free and lets the compiler reject escaping captures.

fn keep_value(x int, keep fn(int) bool) bool {
    return keep(x)
}

fn main() {
    threshold := 40
    is_large := fn(x int) bool {
        return x > threshold
    }

    count := 0
    next_count := fn() int {
        count += 1
        return count
    }

    assert keep_value(42, keep: is_large)
    assert next_count() == 1
    assert next_count() == 2
}

Local named functions can also capture enclosing locals under the same non-escaping rules.
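
A minimal sketch, assuming a local named function is declared with the same fn name(...) form used at the top level. add_base is a hypothetical name.

fn main() {
    base := 10
    // add_base captures base and must not outlive main's scope
    fn add_base(x int) int {
        return x + base
    }
    assert add_base(5) == 15
}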

Compile-time evaluation

A parameter marked const requires a const-evaluable argument at the call site. This lets library code validate values while typechecking the caller.

Const evaluation also powers constant numeric values, sizeof, compile-time function calls and type-value computations.

fn make_array_type(const n uint) type {
    return array(i32, N: n)
}

const FourI32 = make_array_type(4 as uint)

fn main() {
    var values FourI32
    assert len(values) == 4
}

anytype is valid in function parameter positions and captures the static type of the corresponding argument. Each non-variadic anytype parameter binds independently.

...anytype forms a heterogeneous compile-time pack. Pack length is available with len(args), and const indexes preserve the selected element type.

fn debug(value anytype) {
    print(type_name(typeof(value)))
}

fn count(args ...anytype) int {
    return len(args)
}

const { ... } executes a statement block in compile-time evaluation context. If it cannot be evaluated at compile time, compilation fails.

The compiler package provides diagnostic functions for code that validates itself during const evaluation.

import "compiler"

fn require_positive(const n const_int) {
    const {
        if n <= 0 {
            compiler.error("expected positive value")
        }
    }
}

Anonymous aggregate types

Anonymous structs and unions can be written directly as types. Anonymous struct identity is structural, based on field names and field types.

This is useful for local shapes and context-like values that do not need a named declaration.

fn length_squared(p struct { x f64; y f64 }) f64 {
    return p.x*p.x + p.y*p.y
}

fn main() {
    assert length_squared({ x: 3.0, y: 4.0 }) == 25.0
}

Struct composition

A struct may embed one named base struct as its first field. Direct fields win lookup, then promoted fields are searched through the embedded chain.

Embedded bases support implicit upcasts by value, pointer and reference forms.

struct Entity {
    id u64
}

struct User {
    Entity
    name &str
}

fn entity_id(e &Entity) u64 {
    return e.id
}

fn main() {
    user := User{ Entity: { id: 42 }, name: "Ada" }
    assert entity_id(&user) == 42
}

Variable-size structs

Variable-size structs (VSS) use dependent fields whose length comes from a previous integer field. Once the first dependent field appears, following fields must also be dependent.

VSS values cannot be used by value in locals, parameters or returns. They are intended for pointer/reference layout work.

struct Packet {
    len  u32
    data [u8 .len]
}

fn packet_len(p &Packet) u32 {
    return p.len
}

Context

context is an ambient builtin expression inside function bodies. The builtin Context provides at least allocator, temp_allocator and logger.

Operations such as alloc, dealloc, concat, fmt and print use context capabilities when no explicit resource is supplied.

fn print_hi() {
    var msg = concat("hi, ", "there")
    print(msg)
    dealloc msg
}

fn example(ma Allocator) {
    // allocate in ma by default until this scope ends
    context.allocator = ma
    print_hi()
}

fn main() {
    example(context.allocator)
}

Platform imports

import "platform" loads the builtin platform package. The platform surface is intentionally small and target dependent.

The common platform import exposes process/platform operations such as exit(status i32).

import "platform"

fn main() {
    platform.exit(0)
}

Iterator protocol

for ... in works over built-in sequence-like values and can also use the iterator protocol. A custom source provides __iterator(x) and matching next_* functions for the binding mode used by the loop.

This keeps the loop syntax simple while letting libraries define their own iteration shapes.

fn walk(xs &[i32]) i32 {
    var total i32 = 0
    for value in xs {
        total += value
    }
    return total
}

Directives and foreign linkage

Directives attach to the following top-level declaration. Current foreign-linkage directives include @c_import, @wasm_import and @export.

Imported declarations have no HopHop body because the symbol is supplied externally. @export publishes a public HopHop function under a requested external name.

@wasm_import("env", "now")
fn now() f64

@export("run")
pub fn run() {
    now()
}

Build selection

Filename build tags let packages include files for specific targets. Active tags come from the selected backend and platform.

This keeps platform-specific code close to the package that needs it without adding conditional syntax inside the language.

logger_wasm.hop
logger_macos.hop
logger_linux.hop