I want AST generation in V compiler.
V compiler currently do many things (parsing, code generating, source code formatting) in just 1 file (parser.v). It makes V compiler's source code really complex and hard to read.
Once we have AST generation in V compiler, we can split these process into individual parts.
I hope Alex have a plan to implement AST generation.
Yes, this is mandatory, also for native code generation.
TCC generates machine code without AST, one of the reasons it can build a million lines of code per second.
I agree with you completely. But if we build a project with millions of lines of code, AST is going to require lots of extra CPU power and RAM.
Does it really make the code that complex? Everything is in one place. You need to modify an if statement, you go to if_statement(). Var declaration? var_decl().
With a more complex structure you always have to find your way in a web of function calls and abstractions.
In Go, for example, sometimes I want to have a look at how one method is implemented, but I often have to go 3-4 levels deep across various files to find the actual implementation.
Also, just like TCC, V's compiler is going to be simple and small, and without many changes.
TCC is Tiny C Compiler?
https://bellard.org/tcc/
I love to understand how V parser works for those without knowledge in AST and not someone else's messy code. Hint, a flow chart is necessary, Rust presentation does it.
Creating AST could be helpful when working on optimising later on, or is the goal to keep things that simple that it would be 'easy' enough to optimise without requiring the AST abstractions?
This is true. AST will most likely have to be implemented for optimization.
Optimization will be worked on after the 1.0 release.
@vertical222 it's explained here:
I read that the day you posted but I prefer to understand from the code.
Flowchart editor can help us visualize better.
https://ivanceras.github.io/svgbob-editor/
Does it really make the code that complex? Everything is in one place. You need to modify an if statement, you go to if_statement(). Var declaration? var_decl().
At least, It is complex for me...
functions in parser.v do too many things (parse code, register type, check type, generate code...) and it confused me.
AST does not only simplify the compiler's source code but also help us to analyze and optimize the given source code. Many optimization algorithms and analysis algorithms assume that we have AST (or IR). If we have AST, we can use these algorithms.
IMO, compilation speed is really important, but we should not stick to it because people do not select a
programming language only based on compilation speed.
For a simple compiler with a limited scope it is fine to skip AST or IR and generate code directly in my opinion.
But if the compiler grows with optimizations, multiple backends, language servers, IDE support, etc. it will get more complicated and/or require duplicated work if there are no any intermediate steps.
I see that an x64 backend is in the works, how is that implemented? A complete reimplementation/duplicate of parser.v with x64 codegen or is the parsing and type checking shared with the C backend?
If compilation speed is important it can still be beneficial to have cleanly separated steps that can be benchmarked and optimized individually.
I see that an x64 backend is in the works, how is that implemented? A complete reimplementation/duplicate of parser.v with x64 codegen or is the parsing and type checking shared with the C backend?
It's all in one file and it's a bit messy.
So yes, AST is unavoidable, because the plan is to build a strong optimizer.
Can't rely on GCC/Clang, they are too slow (~12k loc/sec with -O2).
Go team has proven that's it's possible to build a fast optimizer that produces good enough binaries.
But there will always be an option to use GCC/Clang with -O2.
I'm new to V, great work, I hope it goes great! I'm looking forward to contribute and I'm currently getting familiar with the compiler.
Alex I agree with you that having everything in one place (i.e. as you said, you want to look an if statement, you go to if_st) is easy to grasp, maintain and modify _if_ you plan to keep the language as small as it is. For example, I have worked on the D compiler (front-end) and if you plan to have such a complicated front-end (or less, but still not as simple as now), I think that the AST is a very wise choice.
But even if you plan to keep it small and setting aside the optimization, I would suggest that we add an AST for debugging purposes. A well-structured AST (as D's for example) for one, let's you print anything, which is invaluable. It's very good to be able to print expressions, statements etc. no matter how complicated. I have not found a way to do that in V.
One other thing is the error handling. Without an AST, I don't know how you could do anything but "stop on the first error". E.g. https://www.youtube.com/watch?v=l_96Crl998E&feature=youtu.be&t=2759
@baziotis That talk was very informative, thank you.
Thanks, you're welcome.
Now that I saw that again, here's some more info that you might find helpful since you found the above.
I happen to work in 2 things related to what already mentioned
First of all, for error reporting for syntax errors can be done without AST on some adequate level and if you take the "error recovery" road. In this case, you "guess" or "skip" tokens so both don't need ASTs. Especially for just expressions. The problem with that is that it seems it's not the best way to report errors (check Walter's talk I posted above). If instead you try to do what he calls "poisoning", then I think you need an AST.
As far as optimization is concerned, well most of it happens on a linear IR level (think of LLVM IR). And most of the work in the Go optimizer is to build SSA IR and then do transformations on it.
Not that transformations don't happen on the AST, but this is not the bulk of the work and one
may get away without it. Clang does basic stuff, e.g. some basic constant folding but not something amazing, e.g. here's an example for when an if condition is constant.
Here's "the same thing" in GCC. GCC is a little different in that in this fold-const.c it has some _generic_ constant folding routines (which I don't even understand, GCC is a mess for the most part). Anyway, I'm probably getting of track here. I hope that helped.
The tl;dr is that _maybe_ you can get away without an AST even if you want optimization. But I
don't think that in a language like V (or Go) which are simple, a well-optimized AST will be any real
bottle-neck.
@d2verb @baziotis @mvlootman
It has been decided to implement the AST.
The development will be taking place on the ast branch:
https://github.com/vlang/v/tree/ast
Contributions are welcome, I'll be able to join after the 0.2 release.
Is there any discussion thread about the design of the AST?
@baziotis Discord mostly.
AST parser is now complete. Thanks for your input :)
Most helpful comment
@d2verb @baziotis @mvlootman
It has been decided to implement the AST.
The development will be taking place on the
astbranch:https://github.com/vlang/v/tree/ast
Contributions are welcome, I'll be able to join after the 0.2 release.