Commit 2f305fe

progress on parser, still getting in the swing of things
1 parent c4982a9 commit 2f305fe

14 files changed, +1235 -0 lines changed

.eslintrc

+2
@@ -16,8 +16,10 @@
 "indent": [2, 4, {"SwitchCase": 1}],
 "linebreak-style": 0,
 "max-len": [2, 250],
+"no-cond-assign": 0,
 "no-continue": 0,
 "no-else-return": 0,
+"no-empty": 0,
 "no-loop-func": 0,
 "no-nested-ternary": 0,
 "no-param-reassign": 0,

README.md

+107
@@ -1,2 +1,109 @@
 # renlang
 The Ren Programming Language
+
+## Goals
+
+The goals of this language haven't been entirely solidified yet. As of now, the goals are:
+
+- Strong, static type system
+- Type inference
+- Object-oriented
+- Abstract data types (ADTs, including Monads, Functors, etc.)
+- Optional pure functional logic
+- Optional imperative logic
+- Hybrid quasi-pure logic
+- Function contexts (new concept based on several similar ideas)
+- Async programming
+- Generator functions
+- Destructuring/Pattern matching
+- Succinct data structure literals (lists, objects, maps, etc.)
+- Clear delimitation (braces, commas, and semicolons over whitespace/indentation)
+- Easy low-level programming with raw types (types that provide a direct mapping to memory)
+- Pointers (modelled after Rust)
+- and much more
+
+From a high level, this language strives to be practical by providing tons of syntax and features designed to make it easy to choose a solution for any given problem.
+
+The language is somewhat opinionated, but anything should be possible. It should be usable for any of the following scenarios and more:
+
+- Game development
+- Web development
+- Systems development
+- Application development
+- High-concurrency programming
+- Scripting
+- etc.
+
+It also comes with a suite of development tools built in:
+
+- Package manager
+- Build tool
+- Test framework
+- Profiling tools
+- Debugging tools
+- Code coverage tool
+- Linting tool
+
+The module system is heavily inspired by the ECMAScript module system, with a bunch of extra goodies added:
+
+- All module dependencies are file-system-relative, i.e. a module import will search in the file's directory for that module, and then ascend upward to the root directory until it finds the desired module.
+- A module file can be used to specify other relative files around it that it will export. This allows for encapsulation.
+- Module importing is string-based, to promote the idea that you are simply specifying a relative path. A module's "name" is simply its path.
+- Modules export named values, and an optional default value that will be imported when no name is specified.
+- Importing a directory as a module will check an 'index.ren' file for export information.
+- Modules control compiler settings for the code within them. These settings can also cover modules in sub-directories.
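
The upward search described in the module list could look roughly like the sketch below. It is not code from this commit: the function names, the injected `exists` predicate, and the lookup order (a `.ren` file first, then a directory's `index.ren`) are assumptions made for illustration. Only the `index.ren` file name comes from the README.

```javascript
const path = require("path").posix;

// Directories to search: the importing file's directory first, then each
// ancestor up to the file-system root.
function searchDirs(fileDir) {
    const dirs = [];
    let dir = path.resolve("/", fileDir);
    while (true) {
        dirs.push(dir);
        const parent = path.dirname(dir);
        if (parent === dir) break; // reached the root
        dir = parent;
    }
    return dirs;
}

// Resolve an import string to a file path, or null when nothing matches.
// `exists` is injected so the sketch does not touch a real file system.
function resolveModule(importPath, fileDir, exists) {
    for (const dir of searchDirs(fileDir)) {
        const base = path.join(dir, importPath);
        // Assumed lookup order: the module file itself, then a directory index.
        const candidates = [base + ".ren", path.join(base, "index.ren")];
        for (const candidate of candidates) {
            if (exists(candidate)) return candidate;
        }
    }
    return null;
}
```

With files `/proj/lib/util.ren` and `/proj/src/models/index.ren`, an import of `"lib/util"` from `/proj/src/app` would fall through two directories before matching under `/proj`, while `"models"` would match the directory index one level up.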
+
+## Progress
+
+This project is starting from scratch, and will build a full compiler using no external dependencies. This is the TODO list:
+
+- [ ] Grammar
+- [ ] AST logic
+- [ ] Lexical grammar
+- [ ] Lexer
+- [ ] Parser
+- [ ] Type checker
+- [ ] IR code generator
+- [ ] Runtime library
+- [ ] Interpreter
+- [ ] Self-hosting
+- [ ] ...backend
+- [ ] Fully self-hosting
+- [ ] More backends
+- [ ] Test framework/coverage tool
+- [ ] Build system
+- [ ] Debugger
+- [ ] Profiler
+- [ ] Linter
+- [ ] Package manager
+
+So the workflow is as follows:
+
+1. Come up with a formal grammar for the language.
+2. Create a simple AST library.
+3. Extract the lexical grammar from the formal grammar.
+4. Write the lexer (component that splits a source code string into tokens).
+5. Write the parser (component that converts a token sequence into an AST).
+6. Write the type checker (component that verifies that the AST is valid according to specified types)
+7. Write the IR code generator (component that spits out some simple middle-level language that can be easily executed)
+8. Write the runtime library (library of basic functionality that can be used to create programs, e.g. IO, threads, string operations, CLI argument parsing, data structures)
+9. Write an interpreter (a component that is capable of executing the IR for the time being, so that the next step works...)
+10. Make the language self-hosting (rewrite all logic (except the interpreter) in the language, now that it can be executed)
+11. Create the x86 backend (this will be split out into more steps, but I'm not sure what most of them are at the moment)
+12. Make the language fully self-hosting, so that all logic (including the interpreter) is written in the language
+13. Create more backends (more of a long-term process)
+    - JS backend (for web apps, ren.js)
+    - wasm backend (for web apps, ren.wasm)
+    - JVM backend (jren)
+    - LLVM backend
+    - ARM backend
+    - CLR backend (ren.net)
+    - GCC backend (cren)
+14. Build the test framework, convert tests to use it (a simpler one will have already been created prior)
+15. Build the build system (component that provides a mechanism for building applications in Ren)
+16. Build the debugger (component that allows halting and inspection of a program while it is running)
+17. Build the profiler (component that provides detailed statistics of runtime performance)
+18. Build the linter (component that allows developers to specify and enforce style rules)
+19. Build the package manager (component that allows developers to publish packages for usage by other developers)
+    - This will be interesting because it requires hosting. Luckily, we can probably just use GitHub to start out.
+    - I can have a separate branch of renlang that contains a package registry, and developers can publish packages via pull requests.
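
As a rough picture of step 4 in the workflow above, a lexer can be a loop that peels one token at a time off the front of the source string. The sketch below is illustrative only and is not the lexer from this commit. The token names (`IDENT`, `EQUALS`, `FAT_ARROW`, and so on) are borrowed from docs/grammar.txt, while the `INT` token, the keyword set, and the decision to skip all whitespace (ignoring the grammar's implicit breaks) are assumptions.

```javascript
// Keywords taken from the productions in docs/grammar.txt (partial list).
const KEYWORDS = new Set(["export", "default", "pure", "proc", "func", "type"]);

// Symbol table; multi-character symbols come first so "=>" wins over "=".
const SYMBOLS = [
    ["=>", "FAT_ARROW"],
    ["=", "EQUALS"], ["(", "LPAREN"], [")", "RPAREN"],
    ["[", "LBRACK"], ["]", "RBRACK"], ["{", "LBRACE"], ["}", "RBRACE"],
    ["<", "LT"], [">", "GT"], [",", "COMMA"], [":", "COLON"],
];

// Split a source string into an array of { type, text } tokens.
function tokenize(source) {
    const tokens = [];
    let pos = 0;
    while (pos < source.length) {
        const rest = source.slice(pos);
        let m;
        if ((m = /^\s+/.exec(rest))) {
            // Whitespace is skipped; break (newline/semicolon) handling is omitted.
        } else if ((m = /^[A-Za-z_][A-Za-z0-9_]*/.exec(rest))) {
            const text = m[0];
            tokens.push({ type: KEYWORDS.has(text) ? text.toUpperCase() : "IDENT", text });
        } else if ((m = /^[0-9]+/.exec(rest))) {
            tokens.push({ type: "INT", text: m[0] }); // INT is a hypothetical token name
        } else {
            const sym = SYMBOLS.find(([text]) => rest.startsWith(text));
            if (!sym) throw new SyntaxError(`Unexpected character '${rest[0]}' at offset ${pos}`);
            tokens.push({ type: sym[1], text: sym[0] });
            m = [sym[0]];
        }
        pos += m[0].length; // advance past whatever was just matched
    }
    return tokens;
}
```

The parser in step 5 would then consume this token array instead of raw characters, which is why the grammar is written in terms of token names.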

docs/grammar.txt

+118
@@ -0,0 +1,118 @@
+----
+Base Grammar
+----
+This will be the grammar that we start out with and gradually add features to. We need a base level that works because it's hard to enumerate all features.
+Here are the features that will be included:
+- functions
+    - functions are first-class
+- pointers (this will make certain things easier, later revisions will put restrictions on these)
+- integers (floating-point numbers will be added later)
+    - all typical sizes (8,16,32,64)
+- arrays
+- base types (ADTs)
+    - type expressions: | (union), & (intersection)
+    - generics
+    - product types (tuples, tagged tuples)
+- classes/structs (structs are raw and have different inheritance rules than classes)
+    - constructors
+    - methods
+    - fields
+    - inheritance
+    - static functions
+- pure functions (functions that are just expressions)
+    - function calls
+    - operators (infix functions)
+    - let-in expressions
+    - where clauses
+    - if-else expressions
+    - tail recursion optimization
+    - match expressions
+    - FOR NOW: pure functions can call imperative functions, as long as they aren't void
+- imperative functions (functions that are sequences of instructions)
+    - can do everything that pure functions can, plus...
+    - blocks/statements
+    - loops: for, foreach, while, do-while
+    - return
+    - throw, try-catch, try-catch-finally (EXCEPTIONS, only imperative functions can deal with exceptions for now)
+- quasi-pure functions (this is more complex but it is a central part of the language)
+    - pure functions that look like imperative functions
+    - assignments create new allocations, don't break old references
+    - loops not allowed
+    - any parameters modified will be implicitly returned, which leads to...
+- function contexts (again, complex but central)
+    - implicit parameters for quasi-pure functions, pure and imperative can access them explicitly
+    - examples: 'this' for classes/structs, 'global' for mutable state
+    - functions using contexts must explicitly declare them (except for 'this' and 'global'), callers will implicitly pass them
+    - modified contexts will be implicitly returned so the caller will hold a new reference to the modified context
+    - unmodified contexts
+    - 'this'
+        - built-in context that makes it easier to mutate instances from methods in a pure way
+        - always refers to an instance of the class containing the method
+        - '<inst>.<blah>()' will pass 'inst' as 'this' into 'blah'
+    - 'global'
+        - built-in context that provides access to impure code/state from pure code
+        - anytime 'global' is used to access impure code/state, it returns a modified instance of itself
+        - 'global' is passed implicitly into 'main', so it can be accessed anywhere in the application
+- modules
+    - modules' named exports can be imported individually
+    - default export is used when only the module name is imported
+    - modules can export named exports or default export
+    - each imported module is evaluated according to node-esque rules
+- 'main'
+    - for now, this is the top-level of all applications
+    - the specified module passed into the command line must contain 'main' so that the program can start
+----
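
One way to picture the quasi-pure and context rules listed above is as a desugaring: the context becomes an ordinary parameter, an "assignment" allocates a new value, and the modified context is handed back to the caller alongside the result. The sketch below shows that idea in plain JavaScript. The `deposit` function, the account shape, and the `{ context, result }` return convention are all hypothetical and are not taken from this commit.

```javascript
// A quasi-pure function after a hypothetical desugaring step: `acct` plays
// the role of a declared context, `amount` is an ordinary parameter.
function deposit(acct, amount) {
    // "Assignment" allocates a new object; the caller's old reference
    // still points at the untouched original.
    const updated = Object.freeze({ ...acct, balance: acct.balance + amount });
    // The modified context is returned so the caller can rebind to it.
    return { context: updated, result: undefined };
}

const before = Object.freeze({ owner: "ada", balance: 10 });
const { context: after } = deposit(before, 5);
// `before` is unchanged; `after` is the new reference the caller now holds.
```

Under this reading, 'this' and 'global' would be contexts the compiler threads through automatically, which is only an interpretation of the notes above.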
+
+# Breaks are implicit in this grammar to improve clarity.
+# A break is a new line or a semicolon, so semicolons are only required when putting multiple incompatible constructions on the same line
+
+# A Program is a list of top-level components, which can be:
+# - functions
+# - types
+# - classes
+# - structs
+# - import declarations
+# - export declarations
+# A stipulation is that import declarations must come before any other declaration
+Program ::= ImportDeclaration* FreeDeclaration*
+FreeDeclaration ::= ComponentDeclaration
+                  | ExportDeclaration
+
+ExportDeclaration ::= EXPORT DEFAULT Expression       -- default export
+                    | EXPORT IDENT EQUALS Expression  -- named inline export
+                    | EXPORT IDENT                    -- named export of already declared name
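
The three ExportDeclaration alternatives can be told apart with one token of lookahead after EXPORT, which a hand-written recursive-descent parser might handle as in the sketch below. This is an illustration, not the parser from this commit: the `{ type, text }` token shape, the AST node kinds, and the `parseExpression` stand-in (which consumes exactly one token) are all assumptions.

```javascript
// Parse one ExportDeclaration starting at tokens[pos].
// Returns the AST node and the position after the declaration.
function parseExport(tokens, pos = 0) {
    const peek = () => (tokens[pos] ? tokens[pos].type : "EOF");
    const expect = (type) => {
        if (peek() !== type) {
            throw new SyntaxError(`Expected ${type} but found ${peek()} at token ${pos}`);
        }
        return tokens[pos++];
    };
    // Stand-in for the real Expression rule: accepts any single token.
    const parseExpression = () => {
        if (peek() === "EOF") throw new SyntaxError("Expected an expression");
        return { kind: "Expression", text: tokens[pos++].text };
    };

    expect("EXPORT");
    if (peek() === "DEFAULT") {
        pos++;
        const value = parseExpression();
        return { node: { kind: "DefaultExport", value }, pos }; // EXPORT DEFAULT Expression
    }
    const name = expect("IDENT").text;
    if (peek() === "EQUALS") {
        pos++;
        const value = parseExpression();
        return { node: { kind: "NamedExport", name, value }, pos }; // EXPORT IDENT EQUALS Expression
    }
    return { node: { kind: "NamedExport", name, value: null }, pos }; // EXPORT IDENT
}
```

A `value` of `null` marks the third form, where the name refers to something declared elsewhere in the module.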
+
+ComponentDeclaration ::= FunctionDeclaration
+                       | TypeDeclaration
+                       | ClassDeclaration
+                       | StructDeclaration
+
+# pure <returnType> <name>\<<typeParameters>\>(<parameters>) => <expression>
+# proc <returnType> <name>\<<typeParameters>\>(<parameters>) => <block>
+# func <returnType> <name>\<<typeParameters>\>[<contextParameters>](<parameters>) => <quasiBlock>
+FunctionDeclaration ::= PURE Type IDENT TypeParameterList? ParameterList FAT_ARROW Expression
+                      | PROC Type IDENT TypeParameterList? ParameterList FAT_ARROW Block
+                      | FUNC Type IDENT TypeParameterList? ContextParameterList? ParameterList FAT_ARROW QuasiBlock
+
+TypeParameterList ::= LT GenericParam (COMMA GenericParam)* GT
+ContextParameterList ::= LBRACK Param (COMMA Param)* RBRACK
+ParameterList ::= LPAREN (Param (COMMA Param)*)? RPAREN
+
+Param ::= Type IDENT
+
+GenericParam ::= IDENT
+               | IDENT COLON Type
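
The `(Param (COMMA Param)*)?` shape in ParameterList maps directly onto an `if` wrapping a `while` loop in a recursive-descent parser. The sketch below illustrates that mapping under stated assumptions: tokens are `{ type, text }` objects, and `Type` is simplified to a single IDENT because the Type production is not part of this excerpt. It is not the parser from this commit.

```javascript
// ParameterList ::= LPAREN (Param (COMMA Param)*)? RPAREN
// Returns the parsed parameters and the position after RPAREN.
function parseParameterList(tokens, pos = 0) {
    const peek = () => (tokens[pos] ? tokens[pos].type : "EOF");
    const expect = (type) => {
        if (peek() !== type) {
            throw new SyntaxError(`Expected ${type} but found ${peek()} at token ${pos}`);
        }
        return tokens[pos++];
    };
    // Param ::= Type IDENT   (Type is reduced to one IDENT in this sketch)
    const parseParam = () => {
        const type = expect("IDENT").text;
        const name = expect("IDENT").text;
        return { type, name };
    };

    const params = [];
    expect("LPAREN");
    if (peek() !== "RPAREN") {          // the optional group: ( ... )?
        params.push(parseParam());
        while (peek() === "COMMA") {    // the repetition: (COMMA Param)*
            pos++;
            params.push(parseParam());
        }
    }
    expect("RPAREN");
    return { params, pos };
}
```

TypeParameterList and ContextParameterList follow the same pattern with different delimiters, except that their grammar requires at least one element, so the outer `if` would be dropped.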
+
+# type <name>
+# type <name> = <typeExpression>
+# type <name> { <typeComponentDeclarations> }
+# type <name>(<types>)
+# type <name>(<type> <fieldName>...)
+TypeDeclaration ::= TYPE IDENT
+                  | TYPE IDENT EQUALS Type
+                  | TYPE IDENT LBRACE TypeComponentDeclaration+ RBRACE
+                  | TYPE IDENT LPAREN Type+ RPAREN
+                  | TYPE IDENT LPAREN (Type IDENT)+ RPAREN
+
+TypeComponentDeclaration ::= IDENT
+                           | IDENT EQUALS
