This is a draft text, but before it gets lost or I forget to publish it, here it is:
Hi,
I think the current macro language is too weak. It is more of a
code-template language. This is fine for simple macros that basically are code
templates, but macros that contain complex logic get extremely complicated.
I think this should be changed. Here is an example
that is difficult to implement and hard to understand:
# flatten a tuple literal
macro flatten_tuple(t)
  {%
    queue = [] of ASTNode
    res = [] of ASTNode
  %}
  {% for e in t %}
    {% queue << e %}
  {% end %}
  {% for e in queue %}
    {% if e.class_name == "TupleLiteral" %}
      {% for n in e %}
        {% queue << n %}
      {% end %}
    {% else %}
      {% res << e %}
    {% end %}
  {% end %}
  { {{*res}} }
end
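For reference, here is how the macro above is meant to be invoked. This is a sketch of the intent; whether the actual expansion works out depends on the macro interpreter's semantics, e.g. whether `{% for e in queue %}` sees elements appended to the queue while the loop is running:

```crystal
# Hypothetical call site for the flatten_tuple macro above.
# The intent is that nested tuple literals get flattened into one tuple:
x = flatten_tuple({1, {2, 3}, {4, {5}}})
# intended expansion: x = {1, 2, 3, 4, 5}
```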
Problems:
Here are some "hacks" I use in order to achieve what I need. Maybe there are
better solutions, but since the documentation is not very extensive, this is
the best I could come up with.
This is my proposal for an IMO better macro language:
Here are some examples:
def array_reverse(array)
  # regular crystal implementation
end

macro foo(lit : AST::ArrayLiteral)
  lit.to_a.reverse.inspect # returning a string
end

macro flatten_tuple(exp : AST::TupleLiteral)
  result = TupleLiteral.new
  exp.args.each do |arg|
    case arg
    when AST::TupleLiteral then result.args += flatten_tuple(arg).args
    else result.args << arg
    end
  end
  result # returning an AST node
end
macro property(t : AST::TypeDeclaration)
  <<-TEMPLATE
    def #{t.name}
      @#{t.name}
    end

    def #{t.name}=(value : #{t.type})
      @#{t.name} = value
    end
  TEMPLATE
end
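To make the intent concrete, here is what a call to the proposed property macro would expand to. This is the proposed syntax from above, not something today's compiler accepts, and Person/name are invented for illustration:

```crystal
# Hypothetical usage of the proposed macro (invented class and field names):
class Person
  property name : String
end

# The heredoc template above would return this string, which the
# compiler would parse back into:
class Person
  def name
    @name
  end

  def name=(value : String)
    @name = value
  end
end
```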
For me, calling macros from macros and letting macros return a string or AST nodes is the biggest issue. I need that for my parser combinator framework (and probably two other projects of mine). Here is how it is supposed to work:
# Variant 1
class Parser < Syntaks::Parser
  include EBNF

  # rule(root, {assignment >> /[ \t]*\n/})
  rule(root, {assignment})
  rule(assignment, id >> /\s+/ >> "=" >> /\s+/ >> value)
  rule(id, /\w+/)
  rule(value, id | /\d+/)
end

def test_acceptance
  Parser.new.call("test = 15") as Success
end
# Variant 2
class Parser < Syntaks::Parser
  rules do
    root = call
    call = "method" >> /\s+/ >> id >> param_list
    param_list = "(" >> params >> ")"
    params = param >> {"," >> param}
    param = int_lit | name_lit
    int_lit = /\d+/
    name_lit = /\w+/
    id = /\w+/
  end
end

def test_acceptance
  assert Parser.new.call("method test(banana,1337,9001)").is_a?(Success)
  assert Parser.new.call("method a(1)").is_a?(Success)
  assert Parser.new.call("method test()").is_a?(Failure)
end
That actually works. The problems begin when generating the AST. I don't want to generate a parse tree, but an AST. Also, I want to change the structure by passing blocks in the definition:
rule(:method_definition, method_head >> method_body >> inline_ws_opt >> method_end) do |head, body, _, _|
  MethodDef.new(head, body)
end
The parser combinators generate nested objects for sequences, so the block would actually receive just one node with nested nodes for the sequence parts. But I want sequences to pass their results as multiple arguments to the block. For that I need to be able to call macros from macros.
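To illustrate the problem at runtime (the names below are invented, this is not the real Syntaks API): a sequence like `a >> b >> c` naturally yields left-nested pairs, and flattening those at runtime is easy, but it erases the static arity and types of the sequence.

```crystal
# Invented illustration, not the Syntaks API: a sequence `a >> b >> c`
# yields left-nested pairs like Pair(Pair(A, B), C).
record Pair(L, R), left : L, right : R

# Runtime flattening of the nested result:
def flatten(node : Pair) : Array(String)
  flatten(node.left) + flatten(node.right)
end

def flatten(node : String) : Array(String)
  [node]
end

nested = Pair.new(Pair.new("method", "test"), "(1337)")
flatten(nested) # => ["method", "test", "(1337)"]
```

Runtime flattening like this can only give me an untyped array, not a block with one typed argument per sequence element; that is exactly why I want the splatting to happen at the macro level.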
Will follow ideas here with curiosity!
Thanks for the very detailed explanation!
We actually discuss how to enhance macros from time to time, though many times we stumble upon the same problems, some of which you mention.
Our original idea was to compile macros down to programs and then invoke them, passing pointers to AST nodes that exist in memory in the current program. This has the issue you mention: we'd first need to compile the methods defined in the program before the macro invocation, but what if that in turn needs other macros? It's kind of recursive and not doable (I'm explaining it briefly because I don't remember all the details).
It's also curious that every programming language I know of that has compile-time features or macros uses an interpreter or a VM to expand them. Ours works similarly to a `run` macro call, only that the result is a string that is parsed back (though you can of course create AST nodes and then turn them into strings).
As a separate topic, many languages that allow very powerful macros almost always advise you: "Don't use macros! Only use them when you really need them! They are dangerous!".
In my opinion, macros should be used to avoid some (not all) boilerplate. Many will probably not like what I'll say now, but I actually like it that macros are kind of limited.
The problem with macros is that when you have a problem to solve, and you have super powerful macros, you stop and think "Hmm... how can I use macros to create a super awesome DSL that will let me solve this problem in a very elegant way?". Well, my problem with that is that you suddenly forget that you had a problem to solve _at runtime_. Macros only work _at compile-time_. Maybe without using macros you could have solved the problem in 5 minutes, maybe with some duplicated code. So with dumb macros you would first think about how to solve the problem at runtime, with methods and objects, and then, at the end, see if you can find a way to reduce some boilerplate with macros.
An example of the above is JSON.mapping. The whole macro is 100 lines of code (it could be shorter, but the macro allows for a lot of configurations). But the real code that solves JSON is the lexer, the parser and the pull parser. The macro merely generates a bit of code to use the pull parser, and to avoid some boilerplate. Maybe with a powerful macro you'd be tempted to create a specific JSON parser for each macro invocation, but then you'd only cover that use case, while the runtime pull parser covers a lot more cases.
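For reference, a minimal sketch of a JSON.mapping use, as the API looked at the time of this discussion (the Point type and its fields are invented for illustration):

```crystal
require "json"

# Invented example type: JSON.mapping generates the x/y accessors plus a
# `new(JSON::PullParser)` constructor that reads from the runtime pull parser.
class Point
  JSON.mapping(
    x: Int32,
    y: Int32,
  )
end

point = Point.from_json(%({"x": 1, "y": 2}))
point.x # => 1
```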
If we look at macros in the standard library we have:
- JSON.mapping: generates the new(JSON::PullParser) constructor you could define yourself (it's really simple not to use JSON.mapping and read the pull parser manually)
- Reference#to_s and Reference#inspect: automatically inject boilerplate code that inspects an object's instance variables
- spawn(call): avoid the boilerplate of creating a proc, invoking the call inside it, and then invoking the proc with the call's arguments
- Enum.flags: avoid a bit of duplication in Flag::One | Flag::Two | Flag::Three by letting you write Flag.flags(One, Two, Three)
- ecr: this one uses macro run, to avoid manually translating template code to Crystal

In other cases we use macros to loop over some types or expressions to define similar methods on similar types.
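For instance, the Enum.flags shorthand mentioned above looks like this (the Mode enum is invented for illustration):

```crystal
# Invented enum for illustration; @[Flags] gives it bitwise-or semantics.
@[Flags]
enum Mode
  One
  Two
  Three
end

# Enum.flags is just shorthand for or-ing the listed members:
Mode.flags(One, Three) == (Mode::One | Mode::Three) # => true
```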
I'd really like macros to be used in this way, as their use is simple and they are very easy to understand. It also keeps compile times low; more complex macros need more time to execute (especially because they are interpreted). But of course the current macro language is more powerful than those use cases, though not super powerful, so it kind of limits you (but I think this is good).
I recommend watching this excellent talk about what macros can do, and why they should be avoided: https://www.youtube.com/watch?v=o69H0MXCNxw
And, if you really want to do "whatever you want" at compile time, you can always use the run macro call, where there's no need to have an interpreter or a VM, it's just Crystal code that executes like any other program (so the implementation of it is also easy, but more powerful than an interpreter).
It's also worth noting that this is just my opinion, and I know that @waj and @bcardiff would like much more powerful and flexible macros (and I'm sure many more in the community too!), though of course we don't have a clear idea of how to achieve that.
Just to chip in with my own opinionated take: I totally agree with @asterite, also thinking macros should be just strong enough to avoid boilerplate. When one starts to go into DSL territory (which of course is fine for those use cases), run macros are fine, but it would be nice to be able to use them "more transparently", avoiding the run boilerplate for making run macros ;-) Fully "self-macroable linguistics" (there's probably a common term for this) feels more like a vanity thing.
We always have macros in our backlog as something to be improved, so this issue doesn't need to remain open. If we find a way to improve this situation, we will.