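What version of Go are you using (go version)?
$ go version
go version go1.15 darwin/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?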
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/m/Library/Caches/go-build"
GOENV="/Users/m/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/m/Go/pkg/mod"
GONOPROXY="github.com/matthewmueller"
GONOSUMDB="github.com/matthewmueller"
GOOS="darwin"
GOPATH="/Users/m/Go"
GOPRIVATE="github.com/matthewmueller"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/m/Go/src/github.com/matthewmueller/duo/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/4f/tcxcr6_55v9bp38d8g4hjlf80000gn/T/go-build905655404=/tmp/go-build -gno-record-gcc-switches -fno-common"
I'd like to be able to parse a file or package with only the type declarations and function signatures, without the function bodies. I find myself wanting this quite a bit for code generation, where you want the shapes, but you don't care about the implementations.
In go/parser/interface.go, you'll find the following modes:
// A Mode value is a set of flags (or 0).
// They control the amount of source code parsed and other optional
// parser functionality.
//
type Mode uint
const (
	PackageClauseOnly Mode = 1 << iota // stop parsing after package clause
	ImportsOnly                        // stop parsing after import declarations
	ParseComments                      // parse comments and add them to AST
	Trace                              // print a trace of parsed productions
	DeclarationErrors                  // report declaration errors
	SpuriousErrors                     // same as AllErrors, for backward-compatibility
	AllErrors = SpuriousErrors         // report all errors (not just the first 10 on different lines)
)
I'd love an additional mode, perhaps DeclarationsOnly. This would parse a bit more than ImportsOnly, but not the whole source tree.
Unfortunately, I couldn't find a way to parse at this specific granularity.
Which declarations would be included? only var, type, const, and funcs? at the file level?
That's a good question. For my use-case, it'd just be function signatures and struct declarations, but I could see all general declarations in that group as well as constants.
I think where I'd draw the line is parsing everything except function bodies and variables. That would still eliminate parsing large chunks of most files.
Parsing is generally really cheap. Typechecking is expensive, and go/types supports not typechecking function bodies. Also, the parser would still have to do the work to find out where the function body ends and ensure that there are no parsing errors inside the function body.
Have you determined that parsing is a bottleneck or performance issue for your application?
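To illustrate the go/types point: types.Config has an IgnoreFuncBodies field. A minimal sketch, assuming a single file with no imports (so no Importer is needed); the helper name checkShapes and the sample source are invented for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// checkShapes (hypothetical helper) parses src fully but type-checks only
// the package-level declarations, skipping every function body.
func checkShapes(src string) (*types.Package, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{IgnoreFuncBodies: true}
	// The final argument (*types.Info) is optional and omitted here.
	return conf.Check("demo", fset, []*ast.File{file}, nil)
}

func main() {
	// The body below is syntactically valid but would fail type-checking.
	// With IgnoreFuncBodies set, the checker never looks at it.
	src := `package demo

func Answer() int { return "not an int" }
`
	pkg, err := checkShapes(src)
	if err != nil {
		fmt.Println("check error:", err)
		return
	}
	fmt.Println(pkg.Scope().Lookup("Answer").Type())
}
```

Sources that import other packages would also need Config.Importer set, for example via go/importer.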
> Have you determined that parsing is a bottleneck or performance issue for your application?
I have not. My line of thinking was try to eliminate as much work as possible and since there are already other modes for eliminating parsing, maybe we could add one more.
I'm running this inside a file watching loop that parses, re-generates, then runs the code on each change, so I'm trying to reduce latency wherever possible.
> That's a good question. For my use-case, it'd just be function signatures and struct declarations, but I could see all general declarations in that group as well as constants.
> I think where I'd draw the line is parsing everything except function bodies and variables. That would still eliminate parsing large chunks of most files.
What about func, var, type, or const declarations inside a function?
> the parser would still have to do the work to find out where the function body ends
This is a great point @josharian. This may make my suggestion invalid since it seems like you'd still need to do some bookkeeping as you're scanning over tokens. Then at that point, why not parse it?
> What about func, var, type, or const declarations inside a function?
@davecheney my suggestion would be to ignore everything inside the function body, including those declarations. Pulling out some random code and re-purposing the diff syntax a bit. The red is what I'd expect to be ignored with this mode:
package convo
import (
"fmt"
"regexp"
"strings"
"github.com/matthewmueller/jack/internal/pogo/standupuser"
"github.com/dustin/go-humanize/english"
"github.com/matthewmueller/convo"
"github.com/matthewmueller/jack/internal/pogo/standup"
"github.com/matthewmueller/jack/internal/pogo/team"
"github.com/pkg/errors"
)
// StandupDeleteIntent intent
const StandupDeleteIntent = "standup_delete"
// StandupDeleteInput input
const StandupDeleteInput = "delete standup"
- // StandupDelete topic
- var StandupDelete = NewTopic(StandupDeleteIntent).
- Slot("standup_name", regexp.MustCompile(".*")).
- Slot("confirm_delete", regexp.MustCompile(".*")).
- Input(StandupDeleteInput).
- Topic(func(ctx *Context, intent convo.Intent) convo.Topic {
- return &standupDelete{ctx, intent}
- })
- // replies
- var (
- StandupDeleteIntro = Template(`
- Are you sure you want to delete {{.standup_name}}? You cannot undo this action.
- `)
-
- StandupDeleteIntroWithActiveUsers = Template(`
- Are you sure you want to delete *{{.standup_name}}*? This will also delete everyone's past reports for this standup. You cannot undo this action.
-
-
- Right now {{.user_list}} participate in standup.
- `)
-
- )
// standupDelete struct
type standupDelete struct {
*Context
convo.Intent
}
- var _ convo.Topic = (*standupDelete)(nil)
// Handle team onboard
func (c *standupDelete) Handle(r *convo.Request, w *convo.Response) {
- if nevermind(r.Text) {
- w.Text = Nevermind()
- return
- }
-
- findingStandup := r.Slots["standup_name"] != nil
- confirmingDelete := r.Slots["confirm_delete"] != nil
-
- switch {
- case findingStandup:
- c.FindingStandup(r, w)
- case confirmingDelete:
- c.ConfirmingDelete(r, w)
- default:
- c.Introduce(r, w)
- }
}
func (c *standupDelete) Introduce(r *convo.Request, w *convo.Response) {
- // get the bot token
- botToken := r.Meta["bot_access_token"]
- if botToken == "" {
- c.catch(r, w, errors.New("no bot access token found"))
- return
- }
-
- // get the team
- tm, err := team.FindByBotAccessToken(c.db, botToken)
- if err != nil {
- c.catch(r, w, errors.Wrap(err, "error getting the team"))
- return
- }
-
- // Find all the standups for the team
- // TODO: access control
- standups, err := standup.FindMany(c.db, standup.NewFilter().TeamID(tm.ID))
- if err != nil {
- c.catch(r, w, errors.Wrap(err, "error finding many standups"))
- return
- }
- return
}
// Finding the standup, because we received a name
func (c *standupDelete) FindingStandup(r *convo.Request, w *convo.Response) {
- standupName := strings.ToLower(*r.Slots["standup_name"])
-
- // get the bot token
- botToken := r.Meta["bot_access_token"]
- if botToken == "" {
- c.catch(r, w, errors.New("no bot access token found"))
- return
- }
}
func (c *standupDelete) ConfirmingDelete(r *convo.Request, w *convo.Response) {
- confirm := *r.Slots["confirm_delete"]
-
- // same as nevermind
- if no(confirm) {
- w.Text = Nevermind()
- return
- }
-
- // catch all
- if !yes(confirm) {
- w.Intent = StandupDeleteIntent
- w.Slot = "confirm_delete"
- w.Text = StandupDeleteConfirmCatchAll()
- return
- }
}
CC @griesemer
As @josharian has already observed, it's not trivial to "skip" over the contents of function bodies; i.e., at least if we don't know if they are correct. If we know that the code is correct, one could accelerate the parsing of function bodies by only counting opening/closing braces (they have to match up); but one would still have to read and tokenize the source text. One could do a little experiment and see what the performance difference would be, but it may not matter in the overall application (presumably, the result of parsing is used somehow).
@matthewmueller Is parsing a particularly time-consuming component of your application?
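The brace-counting idea can be tried outside the parser with go/scanner. This is only a rough sketch of the technique, not how go/parser works: the helper name blockEnd is hypothetical, and it assumes the input starts at the opening brace of a body. Every token is still read, which is the cost described above.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// blockEnd (hypothetical helper) tokenizes src, which is expected to begin
// with "{", and returns the byte offset of the matching "}". It returns -1
// if the braces never balance. Braces inside string literals and comments
// are not counted because the scanner never reports them as brace tokens.
func blockEnd(src []byte) int {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, src, nil, 0) // nil error handler; comments are skipped
	depth := 0
	for {
		pos, tok, _ := s.Scan()
		switch tok {
		case token.EOF:
			return -1
		case token.LBRACE:
			depth++
		case token.RBRACE:
			depth--
			if depth == 0 {
				return file.Offset(pos)
			}
		}
	}
}

func main() {
	body := []byte(`{ if x { s := "}" } /* } */ }`)
	fmt.Println("closing brace at offset", blockEnd(body), "of", len(body))
}
```

Timing this loop against a full parser.ParseFile call on the same source would be one way to run the experiment suggested above.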
> @matthewmueller Is parsing a particularly time-consuming component of your application?
I did some benchmarking of the parser in a more isolated environment and it's very fast. 2-4ms fast. 🙌
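For anyone who wants to repeat that kind of measurement, here is a rough timing sketch. It is not the benchmark that was actually run; the helper name timeParse and the tiny sample source are assumptions for illustration, and real numbers depend on file size and machine.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"time"
)

// timeParse (hypothetical helper) parses src the given number of times and
// returns the average wall-clock duration of one parser.ParseFile call.
func timeParse(src string, runs int) (time.Duration, error) {
	if runs <= 0 {
		return 0, fmt.Errorf("runs must be positive, got %d", runs)
	}
	start := time.Now()
	for i := 0; i < runs; i++ {
		fset := token.NewFileSet()
		if _, err := parser.ParseFile(fset, "demo.go", src, parser.ParseComments); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(runs), nil
}

func main() {
	// Tiny sample input; substitute the contents of a real file.
	src := `package demo

// Double doubles n.
func Double(n int) int {
	return n * 2
}
`
	avg, err := timeParse(src, 1000)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("average parse time:", avg)
}
```

For steadier numbers, the same loop could be moved into a testing.B benchmark and run with go test -bench.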
Closing. Thanks everyone!