ArkScript bytecode specification

This page contains the ArkScript bytecode specification, for readers who want to implement their own virtual machine or just want to learn more.

ArkScript bytecode headers

Name | Size | Description
--- | --- | ---
Magic number | 4 bytes | 6386283, numeric version of "ark\0"
Compiler.Major | 2 bytes | Big endian layout
Compiler.Minor | 2 bytes | Big endian layout
Compiler.Patch | 2 bytes | Big endian layout
Timestamp | 8 bytes | Build time (Unix format), big endian layout
SHA256 | 32 bytes | SHA256 of the tables and code segments, for integrity check
Symbols table | |
Symbols.count | 2 bytes | Big endian layout
Symbol.value | Variable | Null-terminated string
Values table | |
Values.count | 2 bytes | Big endian layout
Value.type | 1 byte | 1 for number, 2 for string, 3 for function
Number.value | Variable | Null-terminated string representation of the number
String.value | Variable | Null-terminated string
Function.value | 2 bytes | Big endian layout
Code segments | |
Instruction count | 2 bytes | Big endian layout, can be 0
Instruction | 4 bytes | Instructions follow this layout: pppppppp iiiiiiii dddddddd dddddddd, where p is padding (always ignored), i is the instruction, and d is the immediate argument
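
As an illustration of the layout above, here is a minimal Python sketch that reads the fixed-size part of the header and decodes a single 4-byte instruction. It is not taken from the ArkScript code base: the names `parse_header`, `verify_integrity` and `decode_instruction` are hypothetical, and it assumes that the magic bytes appear in the file as 'a', 'r', 'k', '\0', that the SHA256 covers every byte following the hash field, and that the immediate argument is big endian like the other fields.

```python
import hashlib
import struct
from typing import NamedTuple, Tuple

# magic (4) + major/minor/patch (3 * 2) + timestamp (8) + SHA256 (32)
HEADER_SIZE = 4 + 3 * 2 + 8 + 32


class Header(NamedTuple):
    major: int
    minor: int
    patch: int
    timestamp: int
    sha256: bytes


def parse_header(data: bytes) -> Header:
    """Read the fixed-size fields that precede the symbols table."""
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated header")
    # Assumption: the magic is stored as the raw bytes 'a', 'r', 'k', '\0'.
    # (int.from_bytes(b"ark", "big") is 6386283, the number quoted in the table.)
    if data[:4] != b"ark\0":
        raise ValueError("bad magic number")
    # ">" means big endian without alignment padding: three u16, then one u64.
    major, minor, patch, timestamp = struct.unpack_from(">HHHQ", data, 4)
    return Header(major, minor, patch, timestamp, data[18:50])


def verify_integrity(data: bytes, header: Header) -> bool:
    """Assumption: the hash covers everything after the 50-byte header."""
    return hashlib.sha256(data[HEADER_SIZE:]).digest() == header.sha256


def decode_instruction(word: bytes) -> Tuple[int, int]:
    """Split pppppppp iiiiiiii dddddddd dddddddd into (opcode, argument)."""
    if len(word) != 4:
        raise ValueError("an instruction is exactly 4 bytes")
    _padding, opcode, argument = struct.unpack(">BBH", word)
    return opcode, argument
```

For example, `decode_instruction(bytes([0x00, 0x02, 0x00, 0x05]))` returns `(0x02, 5)`, which reads as LOAD_CONST with constant id 5. The variable-length tables are deliberately left out of this sketch.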

Note on builtins

Builtins are handled with BUILTIN id, with id being the id of the builtin function object. The ids of the builtins are listed below.


Name | ID
--- | ---
false | 0
true | 1
nil | 2

The other builtins are listed in Builtins.cpp.
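
A virtual machine can treat the builtin ids as indices into a table. The short Python sketch below only covers the three ids listed above; `BUILTINS` and `exec_builtin` are hypothetical names, Python's `None` stands in for nil, and the remaining entries from Builtins.cpp are not reproduced here.

```python
# Index in this list == builtin id. Only the three documented ids are filled in.
BUILTINS = [False, True, None]  # 0: false, 1: true, 2: nil (None stands in for nil)


def exec_builtin(stack: list, builtin_id: int) -> None:
    """BUILTIN id: push the builtin object registered under this id."""
    if not 0 <= builtin_id < len(BUILTINS):
        raise IndexError(f"unknown builtin id {builtin_id}")
    stack.append(BUILTINS[builtin_id])


stack = []
exec_builtin(stack, 1)  # pushes true
exec_builtin(stack, 2)  # pushes nil
```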

The stack and the locals

The stack is used for passing temporary values around, for example the arguments of a function. On the other hand, the locals are there to store long-term values: the variables. They are stored in a LIFO stack and should be referenced by their identifier (the index in the symbols table, also used by instructions like LOAD_SYMBOL).
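
One possible way to model this variable storage is a LIFO stack of scopes, each mapping a symbol id to a value. The Python sketch below is only an illustration: the `Scopes` class and its method names are hypothetical, and it assumes that lookups search from the innermost scope outwards, following the "nearest scope" wording used for STORE in the instruction table.

```python
class Scopes:
    """LIFO stack of scopes; keys are symbol ids (indices in the symbols table)."""

    def __init__(self):
        self._scopes = [{}]  # the first scope is the global one

    def push_scope(self):
        self._scopes.append({})

    def pop_scope(self):
        self._scopes.pop()

    def define(self, symbol_id, value):
        """MUT / LET: create the binding in the current (innermost) scope."""
        self._scopes[-1][symbol_id] = value

    def store(self, symbol_id, value):
        """STORE: update the nearest scope that already holds the variable."""
        for scope in reversed(self._scopes):
            if symbol_id in scope:
                scope[symbol_id] = value
                return
        raise NameError(f"unbound symbol id {symbol_id}")

    def load(self, symbol_id):
        """LOAD_SYMBOL: read from the nearest scope holding the variable."""
        for scope in reversed(self._scopes):
            if symbol_id in scope:
                return scope[symbol_id]
        raise NameError(f"unbound symbol id {symbol_id}")


symbols = ["x", "y"]      # hypothetical symbols table: x has id 0, y has id 1
scopes = Scopes()
scopes.define(0, 1)       # x = 1 in the global scope
scopes.push_scope()
scopes.define(1, 2)       # y = 2 in the inner scope
scopes.store(0, 5)        # x is found in the enclosing scope and updated
scopes.pop_scope()        # y disappears together with its scope
```

After this sequence, `scopes.load(0)` returns 5 and `scopes.load(1)` raises a `NameError`. Constants created by LET are not distinguished from variables in this sketch.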

Instructions

TS represents the element at the top of the stack, TS1 represents the element below it, and so on.
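
To make the notation concrete, the Python sketch below models the stack as a list whose last element is TS. The helpers `ts` and `binary_op` are hypothetical, and the sketch assumes that a binary operator removes both operands before pushing its result, which this page does not state explicitly.

```python
import operator


def ts(stack, n=0):
    """Return TSn: n = 0 is the top of the stack, n = 1 the element below it."""
    return stack[-1 - n]


def binary_op(stack, fn):
    """Apply fn(TS1, TS) and push the result (operands are assumed to be popped)."""
    right = stack.pop()  # TS
    left = stack.pop()   # TS1
    stack.append(fn(left, right))


stack = [10, 3]                 # 10 was pushed first, so TS1 = 10 and TS = 3
binary_op(stack, operator.sub)  # SUB: push TS1 - TS
print(stack)                    # [7]
```

The operand order matters for non-commutative operations: with the same stack, swapping the operands would give -7 instead of 7.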

Code | Argument(s) | Job
--- | --- | ---
NOP (0x00) | | Does nothing
LOAD_SYMBOL (0x01) | symbol id | Load a symbol from its id onto the stack
LOAD_CONST (0x02) | constant id | Load a constant from its id onto the stack. Should check for a saved environment and push a Closure with the page address + environment instead of the constant
POP_JUMP_IF_TRUE (0x03) | absolute address to jump to | Jump to the provided address if the last value on the stack was equal to true. Remove the value from the stack no matter what it is
STORE (0x04) | symbol id | Take the value on top of the stack and put it inside the variable named by the symbol id (cf. symbols table), in the nearest scope. Raise an error if no scope containing the variable can be found
LET (0x05) | symbol id | Take the value on top of the stack and create a constant in the current scope, named by the given symbol id (cf. symbols table)
POP_JUMP_IF_FALSE (0x06) | absolute address to jump to | Jump to the provided address if the last value on the stack was equal to false. Remove the value from the stack no matter what it is
JUMP (0x07) | absolute address to jump to (two bytes, big endian) | Jump to the provided address
RET (0x08) | | If in a code segment other than the main one, quit it and push the value on top of the stack to the new stack; should also delete the current environment. Otherwise, acts as a HALT
HALT (0x09) | | Stop the Virtual Machine
CALL (0x0a) | number of arguments when calling the function | Call function from its symbol id located on top of the stack. Take the given number of arguments from the top of the stack and give them to the function (the first argument taken from the stack will be the last one of the function). The stack of the function is now composed of its arguments, from the first to the last one
CAPTURE (0x0b) | symbol id | Used to tell the Virtual Machine to capture the variable from the current environment. Main goal is to be able to handle closures, which need to save the environment in which they were created
BUILTIN (0x0c) | id of builtin | Push the builtin function object on the stack
MUT (0x0d) | symbol id | Take the value on top of the stack and create a variable in the current scope, named by the given symbol id (cf. symbols table)
DEL (0x0e) | symbol id | Remove the variable/constant named by the given symbol id (cf. symbols table)
SAVE_ENV (0x0f) | | Save the current environment, useful for quoted code
GET_FIELD (0x10) | symbol id | Used to read the field named by the given symbol id (cf. symbols table) of a Closure stored in TS. Pop TS and push the value of the field read on the stack
PLUGIN (0x11) | constant id | Used to load a plugin dynamically; the plugin name is stored as a string in the constants table
LIST (0x12) | number of arguments | Create a list from the elements pushed on the stack in reverse order
APPEND (0x13) | number of arguments | Append elements to a list in reverse order (first the last element, then the others, then the list itself)
CONCAT (0x14) | number of arguments | Concatenate lists in reverse order
APPEND_IN_PLACE (0x15) | number of arguments | Append elements to a reference to a list (TS) in reverse order (first the last element, then the others, then the list itself). Push nil on the stack
CONCAT_IN_PLACE (0x16) | number of arguments | Concatenate lists in reverse order, to a reference to a list (TS). Push nil on the stack
POP_LIST (0x17) | | Remove an element from a list (TS), given an index (TS1). Push the modified list on the stack
POP_LIST_IN_PLACE (0x18) | | Remove an element from a reference to a list (TS), given an index (TS1). Push nil on the stack
POP (0x19) | | Remove the top of the stack
ADD (0x20) | | Push TS1 + TS
SUB (0x21) | | Push TS1 - TS
MUL (0x22) | | Push TS1 * TS
DIV (0x23) | | Push TS1 / TS
GT (0x24) | | Push TS1 > TS
LT (0x25) | | Push TS1 < TS
LE (0x26) | | Push TS1 <= TS
GE (0x27) | | Push TS1 >= TS
NEQ (0x28) | | Push TS1 != TS
EQ (0x29) | | Push TS1 == TS
LEN (0x2a) | | Push len(TS), TS must be a list
EMPTY (0x2b) | | Push empty?(TS), TS must be a list
TAIL (0x2c) | | Push tail(TS), all the elements of TS except the first one (TS must be a list)
HEAD (0x2d) | | Push head(TS), the first element of TS or nil if empty (TS must be a list)
ISNIL (0x2e) | | Push true if TS is nil, false otherwise
ASSERT (0x2f) | | Throw an exception if TS1 is false, and display TS (must be a string). Otherwise, push nil
TO_NUM (0x30) | | Convert TS to number (must be a string)
TO_STR (0x31) | | Convert TS to string (must be a number)
AT (0x32) | | Push the value at index TS (must be a number) in TS1 (must be a list)
AND_ (0x33) | | Push true if TS and TS1 are true, false otherwise
OR_ (0x34) | | Push true if TS or TS1 is true, false otherwise
MOD (0x35) | | Push TS1 % TS
TYPE (0x36) | | Push the type of TS as a string
HASFIELD (0x37) | | Check if TS1 is a closure field of TS. TS must be a Closure and TS1 a String
NOT (0x38) | | Push !TS
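
To show how a handful of these instructions fit together, here is a minimal Python sketch of a dispatch loop covering only LOAD_SYMBOL, LOAD_CONST, POP_JUMP_IF_TRUE, STORE, JUMP, HALT, MUT, ADD and LT. It is an illustration, not the ArkScript virtual machine: `encode` and `run` are hypothetical helpers, there is a single global scope, jump addresses are assumed to be instruction indices, and binary operators are assumed to pop both operands.

```python
import struct

# Opcodes copied from the instruction table.
LOAD_SYMBOL, LOAD_CONST, POP_JUMP_IF_TRUE = 0x01, 0x02, 0x03
STORE, JUMP, HALT, MUT = 0x04, 0x07, 0x09, 0x0D
ADD, LT = 0x20, 0x25


def encode(opcode, argument=0):
    """Pack one instruction: padding byte, opcode byte, 16-bit big endian argument."""
    return struct.pack(">BBH", 0, opcode, argument)


def run(code, constants):
    """Execute a single code segment and return the variables by symbol id."""
    stack, variables, ip = [], {}, 0
    while ip < len(code) // 4:
        _padding, opcode, arg = struct.unpack_from(">BBH", code, ip * 4)
        ip += 1
        if opcode == LOAD_CONST:
            stack.append(constants[arg])
        elif opcode == LOAD_SYMBOL:
            stack.append(variables[arg])
        elif opcode == MUT:
            variables[arg] = stack.pop()
        elif opcode == STORE:
            if arg not in variables:
                raise NameError(f"unbound symbol id {arg}")
            variables[arg] = stack.pop()
        elif opcode in (ADD, LT):
            right, left = stack.pop(), stack.pop()  # TS, then TS1
            stack.append(left + right if opcode == ADD else left < right)
        elif opcode == POP_JUMP_IF_TRUE:
            if stack.pop() is True:  # the value is removed in both cases
                ip = arg             # assumption: address = instruction index
        elif opcode == JUMP:
            ip = arg
        elif opcode == HALT:
            break
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")
    return variables


# Symbol 0 is a counter i, symbol 1 a running total; constants are 0, 1 and 4.
PROGRAM = b"".join([
    encode(LOAD_CONST, 0), encode(MUT, 0),                               # i = 0
    encode(LOAD_CONST, 0), encode(MUT, 1),                               # total = 0
    encode(LOAD_SYMBOL, 1), encode(LOAD_SYMBOL, 0), encode(ADD), encode(STORE, 1),
    encode(LOAD_SYMBOL, 0), encode(LOAD_CONST, 1), encode(ADD), encode(STORE, 0),
    encode(LOAD_SYMBOL, 0), encode(LOAD_CONST, 2), encode(LT),           # i < 4
    encode(POP_JUMP_IF_TRUE, 4),                                         # loop again
    encode(HALT),
])
print(run(PROGRAM, [0, 1, 4]))  # {0: 4, 1: 6}
```

The loop adds i to the total for i = 0, 1, 2, 3, so the run ends with i = 4 and a total of 6. Closures, CALL/RET and the list instructions would need additional state (environments and a call stack) that this sketch leaves out.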