mirror of
https://github.com/exoticorn/curlywas.git
synced 2026-01-20 19:56:42 +01:00
# CurlyWas

CurlyWas is a (still WIP) curly-braces, infix syntax for WebAssembly.
The goal is to have as close to a 1:1 mapping to the resulting wasm instructions
as possible while still being reasonably convenient to write.

For this reason alone (and in no way because I'm a little lazy) does this
compiler not implement any optimizations except for constant folding.

## Example

```rust
import "env.memory" memory(4);
import "env.sin" fn sin(f32) -> f32;

export fn tic(time: i32) {
    let i: i32;
    loop screen {
        let lazy t = time as f32 / 2000 as f32;
        let lazy o = sin(t) * 0.8;
        let lazy q = (i % 320) as f32 - 160.1;
        let lazy w = (i / 320 - 120) as f32;
        let lazy r = sqrt(q*q + w*w);
        let lazy z = q / r;
        let lazy s = z * o + sqrt(z * z * o * o + 1 as f32 - o * o);
        let lazy q2 = (z * s - o) * 10 as f32 + t;
        let lazy w2 = w / r * s * 10 as f32 + t;
        let lazy s2 = s * 50 as f32 / r;
        i?120 = max(
            0 as f32,
            ((q2 as i32 ^ w2 as i32 & ((s2 + t) * 20 as f32) as i32) & 5) as f32 *
            (2 as f32 - s2) * 22 as f32
        ) as i32;
        branch_if (i := i + 1) < 320*240: screen
    }
}
```

You can compile this to `technotunnel.wasm` with the command:

```
curlywas technotunnel.cwa
```

Then run it on [MicroW8](https://exoticorn.github.io/microw8/v0.1pre2).

## Syntax

### Comments

```
// This is a single line comment.

/*
   Multiline comments are also supported.
*/
```

### Include

Other source files can be included with the `include` top-level statement:

```
include "platform_imports.cwa"
```

### Types

There are four types in WebAssembly and therefore CurlyWas:

* `i32`: 32bit integer
* `i64`: 64bit integer
* `f32`: 32bit float
* `f64`: 64bit float

There are no unsigned types, but there are unsigned operators where it makes a difference.

### Literals

Integer numbers can be given either in decimal or hex:

```
123, -7878, 0xf00
```

For floating point numbers, only the most basic decimal format is currently implemented (no scientific notation or hex-floats, yet):

```
0.464, 3.141, -10.0
```

String literals are used for include paths, import names and as literal strings in the data section. The following escapes are supported:

| Escape | Result | Comment |
| ------ | ------ | ------- |
| `\"` | `"` | |
| `\'` | `'` | |
| `\t` | 9 | |
| `\n` | 10 | |
| `\r` | 13 | |
| `\N` | 0x0N | (Can't be followed by a hex digit) |
| `\NN` | 0xNN | |

```
"env.memory", "Hello World!"

"one line\nsecond line", "They said: \"Enough!\""
```

Character literals are enclosed in single quotes `'` and support the same escapes as strings. They can contain up to 4 characters and evaluate to the
little-endian representation of these characters. For example: `'A'` evaluates to `0x41`, `'hi'` evaluates to `0x6968`, and `'Crly'` to `0x796c7243`.

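The little-endian packing of character literals can be checked with a small Python model. This is only a sketch of the arithmetic, not the compiler's code: `char_literal` is a hypothetical helper name, and it assumes every character occupies a single byte.

```python
def char_literal(chars: str) -> int:
    """Model of a character literal: 1 to 4 single-byte characters,
    packed little-endian (the first character ends up in the lowest byte)."""
    data = chars.encode("latin-1")  # assumption: one byte per character
    if not 1 <= len(data) <= 4:
        raise ValueError("character literals hold 1 to 4 characters")
    return int.from_bytes(data, "little")


print(hex(char_literal("A")))     # 0x41
print(hex(char_literal("hi")))    # 0x6968
print(hex(char_literal("abcd")))  # 0x64636261
```

Because the first character lands in the lowest byte, storing such a value with a 32bit write puts the characters into memory in reading order.
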
### Imports

WebAssembly imports are specified with a module and a name. In CurlyWas you give them inside a single string literal, separated by a dot. So a module `env` and name `printString` would be written `"env.printString"`.

Linear memory can be imported like this:

```
import "module.name" memory(min_pages);
```

giving the minimum required size as the number of 64KB pages.

Global variables can be imported with:

```
import "module.name" global var_name: type; // const global
import "module.name" global mut var_name: type; // mutable global
```

`var_name` being the name you want to reference the variable by in your code.

Functions are imported like this:

```
import "module.name" fn fun_name(param_types) [-> return_type];

examples:

import "env.cls" fn cls(i32); // no return type
import "env.random" fn rand() -> i32; // no params
import "env.atan2" fn atan2(f32, f32) -> f32;
```

### Global variables

Global variables are declared like this:

```
global name[: type] = value; // immutable global value
global mut name[: type] = initial_value; // mutable variable
```

An immutable global is probably of very limited use: you would usually export it so that some other module
can use it, but exporting global variables is not yet supported in CurlyWas.

The type is optional; if missing, it is inferred from the init value.

### Constants

Constants can be declared in the global scope:

```
const name[: type] = value;
```

`value` has to be an expression evaluating to a constant value. It may reference other constants.

The type is optional, but if given has to match the type of `value`.

### Functions

Functions look like this:

```
[export] fn name(param_list) [-> return_type] {
    [...]
}

examples:

fn getPixel(x: i32, y: i32) -> i32 {
    ...
}

export fn upd() {
    ...
}
```

The body of a function is a block (see below), meaning a sequence of statements followed by an optional expression which gives the return value of the function.

#### Local variables

Variables are defined using `let`:

```
let name: type;
```

They can also be initialized to a value at the same time. In this case the type can be left out and will be inferred:

```
let name = expression;
```

Local variables are lexically scoped and shadowing variables declared earlier is explicitly allowed.

`name = value;` assigns a new value to a (non-inline) variable.

There are two modifiers that change when the initializer expression is actually evaluated. They both can reduce the size of the resulting code (when used appropriately), but it is up to the coder to make sure that the delayed evaluation doesn't change the semantics of the code.

```
let inline name = expression;
```

The expression is evaluated (inlined) every time you use the variable.

```
let lazy name = expression;
```

The expression is evaluated and assigned to the variable at the first place it is used (and only there).

`let lazy` uses the `local.tee` instruction which combines `local.set` and `local.get` and therefore saves one instruction (usually 2 bytes).

Examples of mistakes to watch out for:

```
let x = 4;
let inline y = x + 7;
print(y); // prints 11
x = x + 2;
print(y); // prints 13
```

```
let inline num_bytes = write(buffer, size);
print(num_bytes);
read(buffer, num_bytes); // calls write a second time
```

```
let lazy num_bytes = write(header, 8);
write(body, size);
print(num_bytes); // the header is only written now
```

```
let lazy foo = 42;
if rand() & 1 {
    printNumber(foo); // foo is initialized here
} else {
    printNumber(foo / 2 + 2); // foo is never initialized in the else branch
}
```

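The first and last pitfalls can be mimicked in plain Python. This is a hedged model of the semantics described above, not output of the compiler: `inline_example` and `lazy_example` are hypothetical names, the inline variable is assumed to be defined as `x + 7`, and the lazy case assumes that any later use compiles to a plain read of the local. WebAssembly locals start out as zero, which is what the uninitialized path sees.

```python
def inline_example() -> list:
    """An inline variable re-evaluates its expression at every use."""
    printed = []
    x = 4
    y = lambda: x + 7      # inline: kept as an expression, not as a value
    printed.append(y())    # evaluates 4 + 7
    x = x + 2
    printed.append(y())    # evaluates 6 + 7, because x changed in between
    return printed


def lazy_example(take_if_branch: bool) -> int:
    """A lazy variable is assigned only at the first place it appears in the source."""
    foo = 0                # WebAssembly locals are zero-initialized
    if take_if_branch:
        foo = 42           # the initializer runs here, at the first textual use
        return foo
    return foo // 2 + 2    # foo was never assigned on this path


print(inline_example())     # [11, 13]
print(lazy_example(True))   # 42
print(lazy_example(False))  # 2
```
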
#### Expressions

Expressions are written in familiar infix operator and function call syntax.

The available operators are:

| Precedence | Symbol           | WASM instruction(s)                    | Description                   |
| ---------- | ---------------- | -------------------------------------- | ----------------------------- |
| 1          | -                | fxx.neg, ixx.sub                       | Unary negate                  |
|            | !                | i32.eqz                                | Unary not / equal to zero     |
| 2          | as               | default signed casts                   | Type cast                     |
| 3          | ?, !             | i32.load8_u, i32.load                  | load byte/word                |
| 4          | *                | ixx.mul, fxx.mul                       | Multiplication                |
|            | /, %             | ixx.div_s, fxx.div, ixx.rem_s          | signed division / remainder   |
|            | #/, #%           | ixx.div_u, ixx.rem_u                   | unsigned division / remainder |
| 5          | +, -             | xxx.add, xxx.sub                       | Addition, subtraction         |
| 6          | <<, >>, #>>      | ixx.shl, ixx.shr_s, ixx.shr_u          | Shifts                        |
| 7          | ==, !=           | xxx.eq, xxx.ne                         | Equal, not equal              |
|            | <, <=, >, >=     | ixx.lt_s, ixx.le_s, ixx.gt_s, ixx.ge_s | signed comparison             |
|            |                  | fxx.lt, fxx.le, fxx.gt, fxx.ge         |                               |
|            | #<, #<=, #>, #>= | ixx.lt_u, ixx.le_u, ixx.gt_u, ixx.ge_u | unsigned comparison           |
| 8          | &, \|, ^         | ixx.and, ixx.or, ixx.xor               | Bitwise logic                 |
| 9          | <\|              | n/a                                    | take first, see sequencing    |

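Since there are no unsigned types, the `#` variants only change how the same 32 bits are interpreted. The following Python sketch models that difference for `i32` operands, following the WebAssembly definitions of `div_s`/`div_u`, `shr_s`/`shr_u` and `lt_s`/`lt_u`. All helper names are made up for this illustration.

```python
MASK = 0xFFFFFFFF


def to_u32(v: int) -> int:
    """Reinterpret the low 32 bits as an unsigned value."""
    return v & MASK


def to_s32(v: int) -> int:
    """Reinterpret the low 32 bits as a signed (two's complement) value."""
    v &= MASK
    return v - (1 << 32) if v & 0x80000000 else v


def div_s(a: int, b: int) -> int:      # `/`  (i32.div_s, truncates toward zero)
    a, b = to_s32(a), to_s32(b)
    if b == 0 or (a == -2**31 and b == -1):
        raise ArithmeticError("i32.div_s traps")
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


def div_u(a: int, b: int) -> int:      # `#/` (i32.div_u)
    if to_u32(b) == 0:
        raise ArithmeticError("i32.div_u traps")
    return to_s32(to_u32(a) // to_u32(b))


def shr_s(a: int, n: int) -> int:      # `>>`  (i32.shr_s, keeps the sign bit)
    return to_s32(a) >> (n & 31)


def shr_u(a: int, n: int) -> int:      # `#>>` (i32.shr_u, shifts in zeros)
    return to_s32(to_u32(a) >> (n & 31))


def lt_s(a: int, b: int) -> int:       # `<`
    return int(to_s32(a) < to_s32(b))


def lt_u(a: int, b: int) -> int:       # `#<`
    return int(to_u32(a) < to_u32(b))


print(div_s(-8, 2), div_u(-8, 2))   # -4 2147483644
print(shr_s(-8, 1), shr_u(-8, 1))   # -4 2147483644
print(lt_s(-1, 1), lt_u(-1, 1))     # 1 0
```
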
You can obviously group sub-expressions using brackets `( )`.

Functions can be called using familiar `function_name(parameters)` syntax.

There are intrinsic functions for all WASM instructions that simply take a number of parameters and return a value. So you can, for example, do something like `i32.clz(value)` to use instructions that don't map to their own operator.

Some common float instructions have shortcuts, i.e. they can be used without the `f32.` / `f64.` prefix:

`sqrt, min, max, ceil, floor, trunc, nearest, abs, copysign`

`name := value` both assigns the value to the variable `name` and returns the value (using the `local.tee` WASM instruction).

Blocks are delimited by curly braces `{ }`. They contain zero or more statements, optionally followed by an expression. They evaluate to the
value of that final expression if it is there.

So for example this block evaluates to 12:

```
{
    let a = 5;
    let b = 7;
    a + b
}
```

Blocks are used as function bodies and in flow control (`if`, `block`, `loop`), but can also be used at any point inside an expression.

Variable re-assignments of the form `name = name <op> expression` can be shortened to `name <op>= expression`, for example `x += 1` to increment `x` by one. This works for all arithmetic, bit and shift operators.
The same is allowed for `name := name <op> expression`, i.e. `x +:= 1` increments `x` and returns the new value.

#### Flow control

`if condition_expression { if_true_block } [else {if_false_block}]` executes the `if_true_block` if the condition evaluates to a non-zero integer and
the `if_false_block` otherwise (if it exists). It can also be used as an expression, for example:

```
let a = if 0 { 2 } else { 3 }; // assigns 3 to a
```

If the `if_false_block` contains exactly one `if` expression or statement you may omit the curly braces, writing `else if` chains like:

```
if x == 0 {
    doOneThing()
} else if x == 1 {
    doThatOtherThing()
} else {
    keepWaiting()
}
```

`block name { ... }` opens a named block scope. A branch statement can be used to jump to the end of the block. Currently, `block` can only be used
as a statement, returning a value from the block is not yet supported.

`loop name { ... }` opens a named loop scope. A branch statement can be used to jump back to the beginning of the loop.

`branch name` jumps to the end/start of the named `block` or `loop` scope. `branch_if condition: name` does the same if the condition evaluates to a
non-zero integer.

`return [expression]` returns from the current function with the value of the optional expression.

#### Memory load/store

To read from memory you specify a memory location as `base?offset`, `base!offset` or `base$offset`. `?` reads a byte, `!` reads a 32bit word
and `$` reads a 32bit float.

`base` can be any expression that evaluates to an `i32` while `offset` has to be a constant `i32` value. The effective memory address is the sum of both.

Writing to memory looks just like an assignment to a memory location: `base?offset = expression`, `base!offset = expression` and `base$offset = expression`.

When reading/writing 32bit words you need to make sure the address is 4-byte aligned.

These compile to `i32.load8_u`, `i32.load`, `f32.load`, `i32.store8`, `i32.store` and `f32.store`.

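What the three accessors do to linear memory can be modelled in Python with a `bytearray` and the `struct` module. This is a sketch of the semantics only: the function names are hypothetical, the memory is a single assumed 64KB page, and WebAssembly's little-endian byte order is the one fact taken from the WASM specification rather than from this document.

```python
import struct

memory = bytearray(64 * 1024)  # one 64KB page of linear memory


def load8_u(base: int, offset: int = 0) -> int:           # base?offset
    return memory[base + offset]


def store8(base: int, offset: int, value: int) -> None:   # base?offset = value
    memory[base + offset] = value & 0xFF


def load32(base: int, offset: int = 0) -> int:            # base!offset
    return struct.unpack_from("<i", memory, base + offset)[0]


def store32(base: int, offset: int, value: int) -> None:  # base!offset = value
    struct.pack_into("<I", memory, base + offset, value & 0xFFFFFFFF)


def loadf32(base: int, offset: int = 0) -> float:         # base$offset
    return struct.unpack_from("<f", memory, base + offset)[0]


def storef32(base: int, offset: int, value: float) -> None:  # base$offset = value
    struct.pack_into("<f", memory, base + offset, value)


store32(16, 4, 0x12345678)     # effective address is 16 + 4 = 20
print(hex(load8_u(20)))        # 0x78, the lowest byte is stored first
print(hex(load32(16, 4)))      # 0x12345678
storef32(0, 8, 1.5)
print(loadf32(8))              # 1.5
```
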
In addition, all wasm memory instructions are available as intrinsics:

```
<load-ins>(<base-address>[, <offset>, [<align>]])

offset defaults to 0, align to the natural alignment: 0 for 8bit loads, 1 for 16bit, 2 for 32 bit and 3 for 64bit.
```

with `<load-ins>` being one of `i32.load`, `i32.load8_u`, `i32.load8_s`, `i32.load16_u`, `i32.load16_s`,
`i64.load`, `i64.load8_u`, `i64.load8_s`, `i64.load16_u`, `i64.load16_s`, `i64.load32_u`, `i64.load32_s`,
`f32.load` and `f64.load`.

```
<store-ins>(<value>, <base-address>[, <offset>, [<align>]])

offset and align defaults are the same as the load intrinsics.
```

with `<store-ins>` being one of `i32.store`, `i32.store8`, `i32.store16`, `i64.store`, `i64.store8`,
`i64.store16`, `i64.store32`, `f32.store` and `f64.store`.

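The default `align` values are the base-2 logarithm of the access width in bytes, which is how WebAssembly encodes alignment. A one-function Python sketch (the name `natural_align` is made up here) reproduces the numbers given above:

```python
def natural_align(access_bits: int) -> int:
    """log2 of the access width in bytes, e.g. 32 bits -> 4 bytes -> 2."""
    size_bytes = access_bits // 8
    return size_bytes.bit_length() - 1


for bits in (8, 16, 32, 64):
    print(bits, natural_align(bits))
```

So `i32.load16_u` defaults to an align of 1, while `i64.load` and `f64.load` default to 3.
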
#### Data

Data sections are written in `data` blocks:

```
data <address> {
    ...
}
```

The content of such a block is loaded at the given address at module start.

Inside the data block you can include 8, 16, 32 and 64 bit integer values as well as f32 or f64 values:

```
i8(1, 255) i16(65535) i32(0x12345678) i64(0x1234567890abcdefi64) f32(1.0, 3.141) f64(0.5f64)
```

Strings:

```
"First line" i8(13, 10) "Second line"
```

And binary files:

```
file("font.bin")
```

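To get a feeling for the bytes such a block produces, here is a Python sketch that packs a few of the values above with `struct`. It assumes that values are emitted back to back in source order, in WebAssembly's little-endian byte order, and that strings contribute their raw bytes without a terminator or length prefix. Those are assumptions made for this illustration, and `data_block` is a hypothetical name.

```python
import struct


def data_block() -> bytes:
    out = bytearray()
    out += struct.pack("<BB", 1, 255)       # i8(1, 255)
    out += struct.pack("<H", 65535)         # i16(65535)
    out += struct.pack("<I", 0x12345678)    # i32(0x12345678)
    out += struct.pack("<f", 1.0)           # f32(1.0)
    # "First line" i8(13, 10) "Second line"
    out += b"First line" + bytes([13, 10]) + b"Second line"
    return bytes(out)


blob = data_block()
print(blob[:12].hex())  # 01ffffff785634120000803f
print(len(blob))        # 35
```
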
#### Advanced sequencing

Sometimes when size-optimizing it helps to be able to execute some side-effecty code in the middle of an expression.
Using a block scope, we can execute any number of statements before evaluating a final expression to an actual value. For example:

```
let x = { randomSeed(time); random() }; // set the random seed right before obtaining a random value
```

To execute something after evaluating the value we want to return, we can use the `<|` operator. Here is an example from the Wasm4 version of
Skip Ahead (see the example folder for the full source):

```
text(8000, set_color(c) <| rect(rx, y, rw, 1), set_color(4));
```

Here, we first set the color to `c`. `set_color` also happens to return the constant `6` which we want to use for the text x-position but only
after drawing a rectangle with color `c` and setting the color for the text to `4`. This line compiles to the following sequence:

* Push `8000` onto the stack
* Call `set_color(c)` which sets the drawing color and pushes 6 on the stack
* Call `rect` which draws a rectangle with the set color. This call doesn't affect the stack.
* Call `set_color(4)` which sets the drawing color to `4` and pushes another 6 on the stack.
* Call `text` with the parameters (`8000`, `6`, `6`) pushed on the stack.

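The evaluation order of that line can be replayed in Python, where function arguments are also evaluated left to right. The stand-in functions below only record that they were called. The concrete argument values and the helper `take_first` (a model of `first <| then`) are invented for this sketch.

```python
calls = []


def set_color(c: int) -> int:
    calls.append(f"set_color({c})")
    return 6  # per the text above, set_color happens to return 6


def rect(x: int, y: int, w: int, h: int) -> None:
    calls.append("rect")


def text(addr: int, x: int, y: int) -> None:
    calls.append(f"text({addr}, {x}, {y})")


def take_first(first, _then):
    """Model of `first <| then`: both operands run in order, the first value is kept."""
    return first


# text(8000, set_color(c) <| rect(rx, y, rw, 1), set_color(4));  with c = 2
text(8000, take_first(set_color(2), rect(10, 20, 30, 1)), set_color(4))
print(calls)
# ['set_color(2)', 'rect', 'set_color(4)', 'text(8000, 6, 6)']
```
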
## Limitations

The idea of CurlyWas is to be able to hand-craft any valid WASM program, i.e. having the same amount of control over the instruction sequence as if you would write in the WebAssembly text format (`.wat`), just with better ergonomics.

This goal is not yet fully reached, with the following being the main limitations:

* CurlyWas currently only targets MVP WebAssembly + non-trapping float-to-int conversions. No other post-MVP features are currently supported. Especially "Multi-value" will be problematic as this allows programs that don't map cleanly to an expression tree.
* Memory intrinsics are still missing, so only (unsigned) 8 and 32 bit integer reads and writes are possible.
* `block`s cannot return values, as the branch instructions are missing syntax to pass along a value.
* `br_table` and `call_indirect` are not yet implemented.