A highly inefficient, very experimental, and absolutely explicit compiled general-purpose language that is adequately comfortable to write.
It features manual memory management and inherently UTF-8 safe strings.
The book Writing An Interpreter In Go got me started. Highly recommend it!
An amalgamation of Go, Rust, C#, JS, and PHP syntax, maybe some Zig added for flavour.
This will be a compiler targeting C as its back-end.
The bootstrapping process will be done entirely in PHP, for personal reasons.
Once v0.0.1 is fully operable, this entire repository will be rewritten in Wes (with some C interop).
Wes is intended to be a compiled general purpose language. It will feature a vast standard library, hopefully as great as Go, including HTTP server capabilities. It also will feature interoperability with C (mostly for reliance on established C libraries to get started).
You don't. Not yet.
You can run the hand-compiled examples to see that the resulting C works. So far, no process exists to automatically generate C from Wes source.
This will be the first proper compiler I will have ever written.
I have not spent much time with functional languages; that's why this isn't one.
It is mostly an exercise, although I intend to use this in personal projects to prove real world application and improve the language and standard library.
This language does not fill any niche, it does not satisfy an active need for "something new", it will probably die with me once my own clock runs out, but why would that stop me?
I intend to conquer the concept of writing interpreters and compilers, first with this, next maybe with something else entirely.
For now and probably the near future I don't think anyone should use this. Actually: Please don't use this in production code. Use:
- C, C++, Rust, Zig or Odin for your system programming
- Go, PHP, JS, etc. for your web services
- Or just use any other language you are comfortable with
Yes, bootstrapping will be done entirely in PHP.
Why? Because that's the language I feel most comfortable writing. I have spent (as of writing) 6 years with this language and would consider myself quite competent with PHP.
Since this is a throwaway product, why bother writing it in Rust, OCaml or Go? Just use the thing you know best to get the job done. It's only for bootstrapping after all.
No you don't!
I know I'm likely talking to no one here. But since it will be open source, someone might decide to help out. Maybe some day, in a few years. Until then this is just for my personal learning.
- `bool` is a C `bool`
- `byte` is a C `unsigned char`
- `int` is a C `long`
- `float` is a C `double`
- `char` is a custom UTF-8 character (5 bytes, yep)
- `string` is a custom string of (not C) `char`
- `array` is essentially a map with array capabilities, just like PHP's `array`
- `error` is a caught error message consisting of
  - message `string`
  - file `string`
  - line `int`
  - column `int`
  - trace `array[array{line: int, column: int, class: string, function: string, args: array[?]}]`
  - previous `error|null`
- `Object` is any form of object
  - an instance of a `class`
  - ranges created by the `..` or `..=` syntax are instances of `\Standard\Range`
- `type` represents any of the above types
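To make the scalar mapping concrete, here is a minimal C sketch of how these types could be declared. The `wes_*` names are hypothetical, and the 5-byte `char` layout (up to four UTF-8 bytes plus one length byte) is only an assumption about what "5 bytes" means, not the actual implementation. The reader does not reject overlong encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: the wes_* typedefs are illustrative only. */
typedef bool          wes_bool;
typedef unsigned char wes_byte;
typedef long          wes_int;
typedef double        wes_float;

/* Assumed layout: up to 4 UTF-8 bytes plus 1 length byte = 5 bytes. */
typedef struct {
    unsigned char bytes[4];
    unsigned char length;
} wes_char;

/* Number of bytes in the UTF-8 sequence introduced by `lead`,
   or 0 if `lead` cannot start a sequence. */
static size_t wes_utf8_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Reads one character from a NUL-terminated string.
   A result with length 0 signals invalid or truncated input. */
static wes_char wes_char_read(const char *text) {
    const wes_char invalid = { {0}, 0 };
    wes_char c = { {0}, 0 };
    assert(text != NULL);
    size_t n = wes_utf8_length((unsigned char)text[0]);
    for (size_t i = 0; i < n; i++) {
        unsigned char b = (unsigned char)text[i];
        /* continuation bytes must look like 10xxxxxx */
        if (i > 0 && (b & 0xC0) != 0x80) return invalid;
        c.bytes[i] = b;
    }
    c.length = (unsigned char)n;
    return c;
}
```

Storing the length next to the bytes keeps every `char` the same size, which would make indexing into a `string` a simple multiplication at the cost of some wasted space for ASCII text.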
- `never` is not a data type, you cannot ever store `never`
  - it is a method return type similar to `void`, not returning anything
  - it forces a method marked as returning `never` to
    - either directly call `\Standard\Process::exit()`, which also returns `never`
    - or call another method with return type `never`
  - a call to a method with return type `never` does not necessarily require the calling method to return `never`
    - only if a method always ends execution by calling a method with return type `never` (like `Process::exit()`) should you use `never`
  - `Program::main()` SHOULD NOT be written with the return type `never` (it should always be `void`)
    - since the end of `main()` only implicitly stops the program
    - and `never` is for explicit code stops
    - which aids any potential IDE in detecting dead code
  - methods with irrecoverable permanent loops SHOULD return `never` to signify no other code will ever run after it
    - those methods are required to catch all runtime errors inside the loop
It is basically the PHP never type.
/!\ This will be used but not enforced during the initial phase of this compiler.
Implementing this ruleset requires extensive "linting" and will be implemented eventually.
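Since the back-end is C, one plausible lowering of `never` is the C11 `_Noreturn` function specifier. The sketch below is an assumption about how that could look, not the generated output; `wes_process_exit`, `wes_fatal` and `wes_checked_div` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for \Standard\Process::exit(). */
static _Noreturn void wes_process_exit(int code) {
    exit(code);
}

/* A `never` method: its only way out is another `never` call. */
static _Noreturn void wes_fatal(const char *message) {
    assert(message != NULL);
    fprintf(stderr, "%s\n", message);
    wes_process_exit(1);
}

/* Calls a `never` method on one path only, so it keeps its
   ordinary return type instead of becoming `never` itself. */
static long wes_checked_div(long a, long b) {
    if (b == 0) {
        wes_fatal("division by zero");
    }
    return a / b;
}
```

With `_Noreturn` in place, a C compiler can already warn about code placed after the call to `wes_fatal`, which is the same dead-code detection the rules above aim for.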
int, float, bool, char, type: always live on the stack.
byte also lives on the stack, you should only handle bytes with streams!
string, array, error, Object: always live on the heap
and need to be manually freed with the delete keyword.
Simple rules, no boxing and unboxing, no nothing.
All types are passed by value by default.
Strings will be automatically cloned before being passed to a new variable or a function. Similarly, arrays will be recursively cloned. Objects are expensively cloned recursively on every pass.
To improve performance, explicit passing by reference should be used where adequate.
References need to be explicitly dereferenced.
References themselves do not need to be freed as they live on the stack. Only the non-scalar value inside the reference needs manual cleanup.
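The difference between the implicit clone and an explicit reference can be sketched in C as follows. This is an illustration under assumptions: `wes_string` is a hypothetical byte-based stand-in (the real string is built from the custom `char`), and the sketch assumes the callee owns and frees the clone it receives.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap string, simplified to plain bytes. */
typedef struct {
    char  *data;
    size_t length;
} wes_string;

static wes_string wes_string_new(const char *text) {
    wes_string s;
    s.length = strlen(text);
    s.data = malloc(s.length + 1);
    assert(s.data != NULL);
    memcpy(s.data, text, s.length + 1);
    return s;
}

/* What an implicit pass-by-value has to do: a full copy. */
static wes_string wes_string_clone(const wes_string *source) {
    return wes_string_new(source->data);
}

/* The C side of `delete`: release the heap buffer. */
static void wes_string_delete(wes_string *s) {
    free(s->data);
    s->data = NULL;
    s->length = 0;
}

/* By value: works on a clone (assumed to be owned by the callee),
   so the caller's string is never touched. */
static void mark_by_value(wes_string copy) {
    if (copy.length > 0) copy.data[0] = '!';
    wes_string_delete(&copy);
}

/* By reference: no clone, and the change is visible to the caller. */
static void mark_by_reference(wes_string *ref) {
    if (ref->length > 0) ref->data[0] = '!';
}
```

A caller would write `mark_by_value(wes_string_clone(&s))` for the default behaviour and `mark_by_reference(&s)` for the explicit reference; only the first one allocates.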
This is basically a soft specification of the language features.
Full example projects are available in the examples directory.
namespace \App;
class Program {
public static function main() void {
// runs forever, until broken or returned
loop {
}
// exclusive range 0..10, from 0 up to and including 9
for i in 0..10 {
}
// classic while
while condition == true {
}
// classic do while
do {
} while condition == true;
}
}

namespace \App;
use \Standard\Format;
class Program {
public static function main() void {
let foo int = 1234;
// no (cast) shenanigans, we just have conversion methods
let bar float = foo.toFloat();
// strings need to be cleaned up to not leak memory
let str string = bar.toString();
Format::println(str);
// cleans up the string
delete str;
}
}

namespace \App;
use \Standard\Format;
class Program {
public static function main() void {
let myObject MyClass = new MyClass();
// what PHP couldn't give us
myObject.{
foo = 13,
bar = 37,
}.doSomething();
// this is equivalent to
myObject.foo = 13;
myObject.bar = 37;
myObject.doSomething();
Format::println("{}{}".format(myObject.foo, myObject.bar));
delete myObject;
// can be easily used as initializer, returns the object instance
let otherObject OtherClass = new OtherClass().{fooBar = 1337};
let leet string = otherObject.toString();
delete otherObject;
Format::println(leet);
delete leet;
}
}

namespace \App;
class Foo : Stringable {
private a int;
private b int;
public c int = 3;
public Foo(a int, b int = 2) {
this.a = a;
this.b = b;
}
public function toString() string {
return "A: {}\nB: {}\nC: {}".format(this.a, this.b, this.c);
}
}

let myObj Foo = new Foo(1).{c = 4};
Format::println(myObj.toString());
// A: 1
// B: 2
// C: 4

namespace \App;
use \Standard\Format;
use \Standard\Types;
class Dumper {
// prints the dump, so it returns nothing
public static function printDump(
data int|float|string|Stringable|null,
) void {
Format::println(self::getValue(data));
}
private static function getValue(data int|float|string|Stringable|null) string {
if data == null {
return "NULL";
}
let typeString string = Types::getType(data).toString();
let dataString string = data.toString();
defer {
delete typeString;
delete dataString;
};
return "{}({})".format(typeString, dataString);
}
}

Dumper::printDump(1); // int(1)
Dumper::printDump(13.37); // float(13.37)
Dumper::printDump("Hello"); // string(Hello)
Dumper::printDump(null); // NULL

The compiler will enforce strict typing.
A value of string|int|null cannot be passed to a method expecting int
without previous assertions about the type as that would cause
undefined behaviour.
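On the C side, a union type such as `string|int|null` would most likely need a tag, and narrowing then amounts to checking that tag before touching the payload. The following is a minimal sketch under that assumption; all names (`wes_tag`, `wes_string_int_null`, `add_if_int`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WES_NULL, WES_INT, WES_STRING } wes_tag;

/* Hypothetical representation of a `string|int|null` value. */
typedef struct {
    wes_tag tag;
    union {
        long        as_int;
        const char *as_string;
    } value;
} wes_string_int_null;

/* Expects a plain int, like a method with an `int` parameter. */
static long add(long a, long b) {
    return a + b;
}

/* Narrows step by step; reading `as_int` is only valid once
   null and string have been ruled out. */
static bool add_if_int(wes_string_int_null v, long addend, long *out) {
    assert(out != NULL);
    if (v.tag == WES_NULL) return false;    /* now string|int */
    if (v.tag == WES_STRING) return false;  /* now int */
    *out = add(v.value.as_int, addend);
    return true;
}
```

Reading `as_int` without the two checks would reinterpret a pointer or an unset payload as a number, which is exactly the undefined behaviour the compiler is supposed to rule out.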
let value string|int|null = ValueGenerator::something();
// THIS WILL NOT COMPILE:
// value = MyCustomMathClass::add(value, 123);
if value === null {
return;
}
// compiler now knows value may only be string|int
if value instanceof string {
return;
}
// compiler now knows value may only be int
value = MyCustomMathClass::add(value, 123);

namespace \App;
class Foo {
public function test(
// Go equivalent: []string
myStringArray array[string],
// Go equivalent: map[string]int
myStringIntMap array[string -> int],
// Go equivalent: map[string][]int
mapCouldHaveStringsOrArraysOfInts array[string|array[int]],
// Go equivalent: NONE
arrayWithKnownKeys array{id: int, uuid: string},
// Go equivalent: NONE ([string|int|float KEY] map[KEY]any)
arbitraryArrayMustBeExplicit array[?],
) string {
// ...
}
}

public static function main() void {
let foo array[int] = array(10); // bucket-capacity¹
foo["a"] = 1300;
foo[0] = 37;
// append
foo[] = 123;
let bar array[string] = [
"123",
"456",
"789",
];
// this lets you modify a value via variable
// but more importantly this prevents copying
// which can be very important for large values like objects!
let bar1Ref = bar&[1];
*bar1Ref = "000";
// bar now is ["123", "000", "789"]
Format::println(
"{} {} {} {}".format(
foo["a"] + foo[0],
foo[1],
foo.capacity,
foo.length,
),
);
}

¹ A larger bucket capacity may significantly improve performance at the cost of slightly higher memory usage, because collisions are resolved with linked lists, which are slower than direct access.
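The trade-off in the footnote can be seen in a small chained hash map. This is a simplified sketch, not the actual `array` implementation: it only supports string keys and `long` values, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node of a bucket's collision list. */
typedef struct bucket_entry {
    char *key;
    long value;
    struct bucket_entry *next;
} bucket_entry;

typedef struct {
    bucket_entry **buckets;
    size_t capacity;  /* number of buckets */
    size_t length;    /* number of stored entries */
} wes_array;

/* Simple multiply-and-add string hash, good enough for a sketch. */
static size_t hash_key(const char *key) {
    size_t h = 5381;
    for (; *key != '\0'; key++) h = h * 33 + (unsigned char)*key;
    return h;
}

static wes_array wes_array_new(size_t capacity) {
    wes_array a;
    assert(capacity > 0);
    a.buckets = calloc(capacity, sizeof *a.buckets);
    assert(a.buckets != NULL);
    a.capacity = capacity;
    a.length = 0;
    return a;
}

static void wes_array_set(wes_array *a, const char *key, long value) {
    size_t i = hash_key(key) % a->capacity;
    /* On collision, walk the list: this is the slow path. */
    for (bucket_entry *e = a->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) { e->value = value; return; }
    }
    bucket_entry *e = malloc(sizeof *e);
    assert(e != NULL);
    e->key = malloc(strlen(key) + 1);
    assert(e->key != NULL);
    strcpy(e->key, key);
    e->value = value;
    e->next = a->buckets[i];
    a->buckets[i] = e;
    a->length++;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static const long *wes_array_get(const wes_array *a, const char *key) {
    size_t i = hash_key(key) % a->capacity;
    for (const bucket_entry *e = a->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return &e->value;
    }
    return NULL;
}

static void wes_array_delete(wes_array *a) {
    for (size_t i = 0; i < a->capacity; i++) {
        bucket_entry *e = a->buckets[i];
        while (e != NULL) {
            bucket_entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
    }
    free(a->buckets);
    a->buckets = NULL;
    a->capacity = 0;
    a->length = 0;
}
```

With a capacity of 1 every key lands in the same list and lookups degrade to a linear scan; more buckets shorten the lists but reserve one pointer per bucket up front.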
There is no extends keyword.
There also is no single- or multi-inheritance.
We just have classes with traits.
namespace \App;
trait User {
public id int;
public username string;
}

namespace \App;
class Administrator {
use User;
public function doSomethingAdministrative() void {}
}

namespace \App;
class User {
use User;
}

public function getUserId(user User) int {
return user.id;
}

public function isAdministrator(user User) bool {
return user instanceof Administrator;
}

namespace \App;
use \Standard\Process;
trait LoggerTrait {
protected function log(message string) void {
// do the logging to file
}
protected function fatal(message string) never {
this.log(message);
Process::exit(1);
}
}

namespace \App;
use \Standard\Format;
use \Standard\Process;
trait FatalTrait {
protected function fatal(message string) never {
Format::println(message);
Process::exit(1);
}
}

namespace \App;
class FooService {
use LoggerTrait { log }, FatalTrait;
// always ends in fatal(), so the return type is never
public function whatever() never {
this.log("Test"); // LoggerTrait
this.fatal("OH NO!"); // FatalTrait
}
}

Errors are basically interfaces.
They always only have one attached value, a message string.
I mostly came up with the syntax on my own and after the fact realized it is quite similar to Zig. After taking a look at Zig, I streamlined the syntax to look saner.
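Because an error can list several parents, a `catch` branch has to decide whether a thrown error is, or descends from, the error it names. A possible C sketch of that check follows; it borrows the error names from this README's examples, but the `wes_error_kind` structure and `wes_error_is` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WES_MAX_PARENTS 2

/* Hypothetical descriptor: an error kind has a name and parents. */
typedef struct wes_error_kind {
    const char *name;
    const struct wes_error_kind *parents[WES_MAX_PARENTS]; /* NULL-padded */
} wes_error_kind;

static const wes_error_kind APP_ERROR =
    { "AppError", { NULL, NULL } };
static const wes_error_kind INVALID_ARGUMENT_ERROR =
    { "InvalidArgumentError", { NULL, NULL } };
static const wes_error_kind APP_INVALID_ARGUMENT_ERROR =
    { "AppInvalidArgumentError", { &APP_ERROR, &INVALID_ARGUMENT_ERROR } };

/* True if `kind` is `wanted` or has `wanted` among its ancestors. */
static bool wes_error_is(const wes_error_kind *kind,
                         const wes_error_kind *wanted) {
    assert(wanted != NULL);
    if (kind == NULL) return false;
    if (kind == wanted) return true;
    for (size_t i = 0; i < WES_MAX_PARENTS; i++) {
        if (wes_error_is(kind->parents[i], wanted)) return true;
    }
    return false;
}
```

A `catch` branch for `AppError` would then run when `wes_error_is(thrown, &APP_ERROR)` holds, which covers `AppInvalidArgumentError` as well.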
namespace \App\Error;
error AppError;

namespace \App\Error;
use \Standard\Error\InvalidArgumentError;
error AppInvalidArgumentError : AppError, InvalidArgumentError;

namespace \App\Error;
error AppSomethingIsNotRightError : AppError;

namespace \App;
use \App\Error\AppError;
use \App\Error\AppSomethingIsNotRightError;
use \Standard\Format;
use \Standard\Process;
class Program {
// ! indicates this may return an error
private static function doSomethingRisky() !void {
throw AppSomethingIsNotRightError "You done goofed up.";
}
// this has to still be explicitly !void
// since it doesn't intercept nested errors
// compiler will (at some point) enforce that
private static function mightDoSomethingRisky() !void {
self::doSomethingRisky();
}
public static function main() void {
self::mightDoSomethingRisky() catch err {
AppError => {
Format::println(err.toString());
Process::exit(1);
},
};
}
}

Taken right out of the Go cookbook.
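C has no `defer`, so the generated code needs some other way to run the deferred body on every exit path. One common pattern is a single cleanup label reached via `goto`; the sketch below uses it under the assumption that the compiler could emit something similar. The helper names and the `live_strings` counter are hypothetical and exist only to make the cleanup observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Counts live allocations so leaks show up in a test. */
static int live_strings = 0;

static char *string_new(const char *text) {
    char *copy = malloc(strlen(text) + 1);
    assert(copy != NULL);
    strcpy(copy, text);
    live_strings++;
    return copy;
}

static void string_delete(char *s) {
    free(s);
    live_strings--;
}

/* Hypothetical risky operation: fails on an empty string. */
static bool do_something_risky(const char *s) {
    return s[0] != '\0';
}

/* Returns false on error; `bar` is released on both paths. */
static bool foo(const char *text) {
    bool ok = true;
    char *bar = string_new(text);

    if (!do_something_risky(bar)) {
        ok = false;
        goto deferred;  /* error path still runs the deferred body */
    }
    /* ... more work with bar ... */

deferred:
    string_delete(bar);  /* body of `defer { delete bar; };` */
    return ok;
}
```

Several `defer` blocks in one method would need their bodies emitted in reverse order under the label, matching Go's last-in, first-out behaviour.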
public function foo() !void {
let bar string = "Abc 123";
// bar will be automatically cleaned up no matter what
defer { delete bar; };
this.doSomethingRiskyWithString(bar);
}

public function foo() void {
// this doesn't leak memory
let fmt string = "Foo: {}";
Format::println(fmt.format(1337));
delete fmt;
// this would leak memory, if it were just a string, but it's a string-literal!
// they get automatically cleaned up once out of scope
// string variable assignments clone literals to gain ownership, which probably eats performance
// I desperately need to remember to implement it this way!
Format::println("Foo: {}".format(1337));
}

$PROJECT_DIR$/wes.toml
[project]
# TODO
[project.source]
namespace="App"
directory="src"
[project.interop]
load="./c/load.h"

$PROJECT_DIR$/c/load.h
#include "libs/leet.h"

$PROJECT_DIR$/c/libs/leet.h
double leet(long x, double y) {
return (double)x + y;
}

Identifiers starting with $ are called "Lexer Directives".
A thing I just came up with after 4 days of trying to create an interop syntax.
They completely change the behaviour of the lexer until the end-sequence $end is encountered¹.
¹ "encountering" really depends on the individual lexer directive,
as the $run directive has C-string, C-char, and comment aware parsing and
won't "encounter" $end inside string-/char-literals and comments.
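The literal- and comment-aware search for `$end` can be sketched as a small scanner. This is written in C for illustration only (the bootstrap lexer is in PHP), the function name is hypothetical, and it does not check what follows `$end`, so `$endless` would also match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first "$end" that is outside C string
   literals, char literals and comments, or -1 if there is none. */
static long find_end_directive(const char *src) {
    assert(src != NULL);
    size_t i = 0;
    while (src[i] != '\0') {
        char c = src[i];
        if (c == '"' || c == '\'') {
            /* skip the literal, honouring backslash escapes */
            i++;
            while (src[i] != '\0' && src[i] != c) {
                if (src[i] == '\\' && src[i + 1] != '\0') i++;
                i++;
            }
            if (src[i] != '\0') i++;  /* closing quote */
        } else if (c == '/' && src[i + 1] == '/') {
            /* line comment: skip to the end of the line */
            while (src[i] != '\0' && src[i] != '\n') i++;
        } else if (c == '/' && src[i + 1] == '*') {
            /* block comment: skip to the closing marker */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/')) i++;
            if (src[i] != '\0') i += 2;
        } else if (strncmp(src + i, "$end", 4) == 0) {
            return (long)i;
        } else {
            i++;
        }
    }
    return -1;
}
```

Everything before the returned offset would be copied into the generated C unchanged, which is why the scanner has to understand just enough C to know where literals and comments begin and end.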
$PROJECT_DIR$/src/Program.wes
namespace \App;
use \Standard\Format;
class Program {
public static function main() void {
let foo int = 13;
let bar float = 0.37;
$pass foo as a, bar as b $end
$run
double result = leet(a, b);
$end
let result float = $get result as float $end;
Format::println(result.toString());
}
}

IMPORTANT: The Wes dereference operator `*` has higher binding power than in C. It always dereferences the variable next to it. `*foo.*bar` is equivalent to C `*((*foo).bar)`!

Similarly, the reference operator `&` differs from C in terms of syntax. `user.&name` is the syntax to access a property reference. It is equivalent to C `&(user.name)`.
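Spelled out in C, the two equivalences look as follows. The struct and function names are hypothetical; only the expressions in the comments come from the rules above. The last function mirrors a by-value copy next to a reference to the original.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long id; } user_t;
typedef struct { user_t *owner; } group_t;

/* Wes `*group.*owner` corresponds to C `*((*group).owner)`,
   which is the same as `*group->owner`. */
static user_t owner_of(group_t *group) {
    assert(group != NULL);
    return *((*group).owner);
}

/* Wes `user.&id` corresponds to C `&(user.id)`.
   `user` is a by-value copy, so the caller's struct is untouched. */
static long bump_id(user_t user) {
    long *id_ref = &(user.id);
    *id_ref += 1;
    return user.id;
}

/* A copy and a reference to the same original value. */
static void reference_demo(long *foo_out, long *bar_out) {
    long foo = 100;       /* let foo int = 100;      */
    long bar = foo;       /* let bar int = foo;      */
    long *fooref = &foo;  /* let fooref &int = &foo; */
    bar += 1;             /* changes only the copy   */
    *fooref += 2;         /* changes foo itself      */
    *foo_out = foo;
    *bar_out = bar;
}
```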
public static function main() void {
let foo int = 100;
let bar int = foo;
let fooref &int = &foo;
bar += 1;
*fooref += 2;
Format::print("Foo: {}\nBar: {}\nFooRef: {}\n".format(foo, bar, *fooref));
}

Foo: 102
Bar: 101
FooRef: 102

This example shows a &User having a &Group that also has array[&User].
public static function main() void {
let user &User = UserRepository::findUserById(123);
let admin &User|null = null;
// equivalent to C *user->group->users
// or C *(*(*user).group).users
for userInGroup in *user.*group.*users {
if *userInGroup.administrator == true {
admin = userInGroup;
break;
}
}
if admin == null {
Format::println("No administrator found!");
return;
}
Format::println("Administrator: {}".format(*admin.name));
}

let foo string = Whatever::generateStringOrNull() ?? "default";
let username string|null = UserRepository::findById(123)?.username;

There are no backed enums like in PHP, just `int`.
enum TokenType {
Illegal,
Eof,
Identifier,
Integer,
Decimal,
// ...
}
let tokenType TokenType = TokenType::Identifier;

A missing `default` branch will result in a null value at runtime and may cause unexpected behaviour or crashes if the type does not accept `null`. Always complete your matches or ensure the target type accepts `null`.

`match` expressions may mix code and value syntax. The resulting C code will always use a code syntax equivalent for value branches.
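As an illustration of value branches turning into code, a `match` over strings could lower to a plain `if`/`else` chain that assigns a tagged result. This is a hand-written sketch, not compiler output; `match_result` and `match_val` are hypothetical names, and the branch values mirror the `match` example in this README.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { MATCH_NULL, MATCH_INT, MATCH_STRING } match_tag;

/* Hypothetical representation of an `int|string|null` result. */
typedef struct {
    match_tag tag;
    long as_int;
    const char *as_string;
} match_result;

static match_result match_val(const char *val) {
    assert(val != NULL);
    /* Starts out as null: this is what a missing default leaves behind. */
    match_result r = { MATCH_NULL, 0, NULL };
    if (strcmp(val, "test") == 0) {
        r.tag = MATCH_INT;
        r.as_int = 1;
    } else if (strcmp(val, "foo") == 0 || strcmp(val, "bar") == 0 ||
               strcmp(val, "foobar") == 0) {
        r.tag = MATCH_INT;
        r.as_int = 2;
    } else if (strcmp(val, "thingy") == 0) {
        r.tag = MATCH_STRING;
        r.as_string = "stringy";
    } else {
        /* default => null: keep the initial null result */
    }
    return r;
}
```

The chain compares branch by branch in source order, so its cost grows with the number of branches; a jump table or hash lookup would be the obvious later optimisation.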
let test int|string|null = match val {
"test" => 1,
"foo", "bar", "foobar" => 2,
"thingy" => "stringy",
default => null,
};

The following result is not optimized at all; it doesn't use any proper lookup. That is fine for now.

Omitting `give` results in `null` being the given value of the branch.
let result int = match val {
0 => {
give A::do(val);
},
1, 2 => {
give B::do(1);
},
3, 4, 5 => {
C::do(val);
give D::do();
},
};

public function demo(name string, role Role = Role::User) void {
// ...
}

- named arguments
- static null safe accessor `?::`
- attributes (available through reflection)
- variadic functions
- ...
