Understanding the project architecture
Global architecture
The src/
folder is divided in two subfolders:
arkreactor/
, the compiler and the runtimearkscript/
, the CLI and the REPL
Builtins
All the builtin functions (and constants) are located in include/Ark/Builtins
. Those can be used with
the bytecode instructions BUILTIN id
. Adding one will need to reference it in include/Ark/Builtins/Builtins.hpp
and in src/arkreactor/Builtins/Builtins.cpp
, and then implementing it accordingly under
src/arkreactor/Builtins/file.cpp
.
For more details on how to implement a module see the tutorial.
Compiler
All the compiler passes are located under include/Ark/Compiler/
.
- the Welder calls the Parser on a given piece of code
- the Parser returns an AST
- the ImportSolver is called to resolve any
import
left- the Parser returns an AST for each file
- the AST are aggregated into one
- the Welder calls the Macro Processor on the ImportSolver final's AST, to process and remove macros from the AST
- the Welder calls the Name Resolver on the Macro Processor final's AST, to resolve names, prefixes and packages. It also checks for undefined symbols, constness violations and constant redefinitions
- the Welder calls the AST optimizer on the Name Resolver's AST to remove dead code
- the Compiler generates IR for the IR Optimizer from the modified AST
- the IR Optimizer replaces some groups of 2 or more instructions with super instructions (one instruction doing specialized work)
- the IR Compiler compiles the modified IR to bytecode, which can be used as is by the virtual machine, or be saved to a file
For more details on how it works, see the implementation details.
REPL (read-eval-print-loop)
Is located under include/CLI/REPL/
. Basically it's an abstraction level over replxx
(external library for
completion and coloration in the shell) and our virtual machine to run user inputs.
Virtual Machine
It lies under include/Ark/VM/
and all the folders under it.
- it handles the Closures which capture whole Scopes through
shared_ptr
. Closures are functions with a saved scope, so they can retain information over multiple calls - the Plugin loader is a generic DLL / SO / DYNLIB loader, used by the virtual machine to load ArkScript modules (
.arkm
files) - the Scope is how we store our mapping
variable id => value
, heavily optimized for our needs - the State is:
- reading bytecode
- decoding it
- filling multiple tables with it (symbol table, value table, code pages), which are then used by the virtual machine. It allows us to load a single ArkScript bytecode file and use it in multiple virtual machines.
- the State retains tables which are never altered by the virtual machines
- it can also compile ArkScript code and files on the go, and run them right away
- the UserType is how we store C++ types unknown to our virtual machine, to use them in ArkScript code
- the Value is a very big proxy class to a
variant
to store our types (std::string, double, Closure, UserType and more), thus it must stay small because it's the primitive type of the virtual machine and the language. It provides proxy functions to the underlyingvariant
. - the virtual machine handles:
- the stack, a single
array
(the stack size is a constexpr value, thus it can be changed at compile time only) - a reference to the State, to read the tables and code segments
- a
void*
user_data, retrievable by modules and C++ user functions - the scopes, and their destruction
- the instructions, executed in
safeRun
which is enclosed in a try/catch to display tracebacks when errors occur - external function calls through its private
call
method (to call modules' functions and builtins) - value resolving, useful for modules when a function receives an ArkScript function, through the public method
resolve
- the stack, a single
For more details, see the implementation details.
If you feel that this section is lacking information, please open an issue.