Solidity: Option to output the ast even before import resolution

Created on 22 Apr 2020 · 20Comments · Source: ethereum/solidity

We should provide the options to abort after certain steps:

parsing
import resolution
symbol resolution
type checking

etc.

-> only implement a switch a la stop-after: parsing for now, and only allow parsing for now.

feature

Source

chriseth

🎉1

Most helpful comment

Hey everyone! We, the Solidity team, would like to discuss this topic in a little language design discussion call with you next week to come to a conclusion about the implementation.

When?

Thursday, 25th of June, 5-6PM CEST (see time in your timezone here)

What?

Discuss issue #8739 - "output the AST even before import resolution"

How?

I will share a link to join the meeting in this issue 10mins before the start. You can also get a calendar invite with all information, if you send me your e-mail address as a DM on Gitter (@franzihei).

Please let me know if you're interested in participating. :)

franzihei on 17 Jun 2020

👍7

All 20 comments

Could add a stage field to the standard json which signals up to what point to run the CompilerStack: parse, importResolution, analysis, compilation.

axic on 6 May 2020

Alternative: more fine-grained output selection, i.e. being able to select "preImportResolutionAST" and nothing else would stop before import resolution.
Advantage: we would have to think about and specify what information actually is present for those output selections.

ekpyron on 6 May 2020

The main reason we are interested in this feature is to be able to extract imports reliably. Now, we are using the ANTLR grammar, which doesn't represent the exact same language, especially when a file fails/should fail to parse.

As I mentioned Chris in a call, we are ok with not getting an AST if the file fails to parse, but we'd love to get the same errors that solc would return.

I think anyone working on tools that process Solidity sources would benefit from this, so I'll tag a few of them: @iamdefinitelyahuman @cgewecke @fvictorio @frangio @jcaracciolo

alcuadrado on 9 May 2020

👍5

I have played around with this idea, and it seems is not that hard to accomplish. Here is a fork where i manage to get some functionality to work for it with some considerations (https://github.com/jcaracciolo/solidity/commit/bb4a0aa93d980038ecf9161addd78eb29a92ed65). I know it may probably break in most edge cases, but i try it out with a couple of contracts and worked out great.

There is an issue with the requirements to "analyze" the file when the AST JSON is generated. This is because the AST JSON contains much more data than just parsed information (such as scope, complex types and others) that require for more processing to be done. This will be trickier without the imports since some of the information required to "analyze" the file is present in other files, and the compiler will fail to execute if it cant find them.

There has to be some thought if all this extra information gathered after the parsing is required in the AST JSON when parsing only one file. If is not, then is trivial to do the change, just skip the this loop and double check the AST JSON generation wont throw a null pointer on missing data (which is basically the commit in my fork).

If in fact it is, then there are a lot more edge cases to take into account when trying to analyze a file without its imports.

jcaracciolo on 11 May 2020

The main reason we are interested in this feature is to be able to extract imports reliably.

I agree that this is much needed. We should consider changing the focus of this issue to import extraction exclusively. It might be interesting to get an AST with unresolved imports and symbols, but it might be best to discuss it as a separate feature.

Something interesting about import extraction is that it's recursive, so the user would need to request import extraction repeatedly. For this use case, requesting an entire AST is overkill and may have a performance impact.

frangio on 12 May 2020

By the way, when using javascript, this can be done "manually" using the "import file read callback" - and that is actually what it is meant to be used for.

chriseth on 12 May 2020

I think the import file callback is somewhat limited, as it only works in js, and doesn't work when using a native/docker version of solc. As we discussed in #3168 and in private, we think using a native solc would improve the development speed in most cases.

The benefits of exporting the parser aren't just extracting imports. That's why I tagged other people that deal with solc's last or build/use their own parsers. Can you please comment here if this would help you?

alcuadrado on 28 May 2020

I have two main uses for accessing an AST prior to compilation:

I can more effectively determine if a contract has changed, and from that infer which source files to recompile. It's unfortunate to have to run the compiler on an entire project because the user added an extra line in their SafeMath library.
I can better determine dependencies and associations. Knowing imports is useful, but a single source file can contain multiple contracts and a user might forget to remove an import statement for a contract they no longer require. The AST lets me build an effective dependency chain so I know exactly which contracts to recompile when I detect a change.

I'm also in agreement with @alcuadrado that this is useful in the native solc.. I'm working in Python so solcjs isn't an option.

iamdefinitelyahuman on 28 May 2020

To concur with @iamdefinitelyahuman and @alcuadrado, an exported parser would be very helpful. There's always a gap between the JS parser's accuracy and solc as new syntax is introduced and it's the source of a large number of issues in both solidity-coverage and eth-gas-reporter.

These tools rely on fast parsing either to avoid double-compilation (because they modify contracts) or strong coupling with the user's compilation strategy (e.g requiring they use the correct solc output selections and knowing how they store their comp artifacts locally)

cgewecke on 28 May 2020

Is there any update on where this issue is heading? I do agree there are much more cases in which an exported parser could be used for, other than extracting imports.

jcaracciolo on 10 Jun 2020

👍1

I still like @axic's suggestion most. Would we add it to settings or to the top level?

chriseth on 15 Jun 2020

Hey everyone! We, the Solidity team, would like to discuss this topic in a little language design discussion call with you next week to come to a conclusion about the implementation.

When?

Thursday, 25th of June, 5-6PM CEST (see time in your timezone here)

What?

Discuss issue #8739 - "output the AST even before import resolution"

How?

Please let me know if you're interested in participating. :)

franzihei on 17 Jun 2020

👍7

During the call we would like to know:

what the participants would use the feature for
what exactly is needed as output
if other stages to stop compilation would be useful

chriseth on 24 Jun 2020

Join the call in 15 mins here --> https://meet.google.com/mga-marp-oid.

franzihei on 25 Jun 2020

Thanks everybody for joining the call!

Really rough notes (please feel free to correct or add to it):

Attendees: Alex B, Aniket, Ben H, Chris R, Francisco G, Franco V, Franco Z, Franziska H, Nick D, Patricio P, Yann L.

Questions for tool builders:

what the participants would use the feature for
what exactly is needed as output
if other stages to stop compilation would be useful

Buidler: need relative path as it is written in Solidity source code

Truffle: we look to see which Sol files have been changed; min set of files that need to be compiled at a certain time; we don’t need absolute paths either, we just need relative paths upfront, list of sources is only used for the “profiler”

Solhint: needs to parse files to do static analysis, using JS parser, for prettier I just need the syntax, no need for types, for solhint, only syntax - works on file level (no cross file analysis), if we had a type AST that could probably be useful too (but not sure how yet)

Chris: let’s introduce a generic flag to stop after point x and at a later point in time we could still introduce other stages to stop?

Remix: will use features to have fast feedback and display more details quickly

Brownie: would it be possible to have an AST fully linked without static analysis?

Chris: yes, before type checker

Brownie: could I build a dependency tree with that? having an AST for just one file is useful; if I can build all the associations; build dependent chain without static analysis would allow me to fine tune what contracts I am compiling

Chris: I think the main problem is that the compiler is f*ing slow, we should also focus on that to increase compilation time

Truffle: but exposing a parser would be easier than making the compiler faster and it’s useful.

[...]

_We also discussed other tooling issues in this call, which are not relevant for this issue (hence I'm not putting the other notes here)._

franzihei on 25 Jun 2020

👍1

summary: Implement a switch e.g. stop-after: parsing that only handles a single file and does not even call the import callback. I think the absolute path can still be put in the annotations. For now, we only have 'parsing' - but we could extend it at some point in the future by defining other "checkpoints". The main problem is to check that such an incomplete AST can be exported without crashing.

chriseth on 25 Jun 2020

I created https://github.com/protofire/solhint/issues/232 to keep track of this in solhint. I'm referencing it here because I assume other people building on top of this future solc parser might have similar issues.

fvictorio on 25 Jun 2020

👍1

Is it safe to assume these things?

A parser for v0.4.26 can parse any v0.4.x file?
A parser for v0.5.17 can parse any v0.5.x file?

If so, can we consider backporting this change to each of those majors? This way we'd be able to parse any file with solc.

alcuadrado on 21 Jul 2020

👍1

We usually only backport critical bugfixes, but we can see how much work the task is overall.

chriseth on 22 Jul 2020

❤1

Backporting this to 0.4.26 and 0.5.17 would be a radical step in making this hugely valuable. The main objective behind this was to for the first time have a proper parser that works reliably, covers the entirety of the language and is always up to date, but if it's only available for the newer versions then parsing code for the older versions that still are relevant today remains a problem. This means Solidity parsing as a whole will remain an unsolved problem, making it hard to step up the tooling ecosystem.

This is super important for developer experience.

fzeoli on 28 Oct 2020

👍2

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Native solidity staticcall

leviadam · 4Comments

New ABI decoder fails on arrays of dynamic types as calldata

axic · 4Comments

Introduce "require()" and "assert()" into documentation

chriseth · 3Comments

Update documentation: change Julia to yul

chriseth · 3Comments

Documentation: Move `bytes` and `strings` out of "value types" section

kkagill · 4Comments