go 🚀 - embed, cmd/go: add support for embedded files

Would it be an error to have a //go:embed directive without importing embed?

jimmyfrasche on 2 Sep 2020

@jimmyfrasche Yes, fifth-to-last bullet in list at https://go.googlesource.com/proposal/+/master/design/draft-embed.md#go_embed-directives.

rsc on 2 Sep 2020

👍1

@rsc Maybe I have missed it in the draft, but I don't see the ability to embed a single file that you mention in your comment.
Also, would you be able to embed a single file as a const string as well?
Thanks for this great proposal.

pierrec on 2 Sep 2020

@pierrec It's not in the draft doc (the "one addition" is the text in the comment above). Const strings can end up playing a role in deciding whether a program type checks, which would mean all type checkers would need to understand //go:embed'ed consts. In contrast, if we stick to vars, type checkers are none the wiser and can be left alone. Seems like we should probably stick to vars.

Is there a particular reason you wanted a const instead of a var? Using them should be about the same as far as efficiency. (References to const strings end up compiling to what amount to references to hidden vars anyway.)

rsc on 2 Sep 2020

👍3

Thanks for the explanation. I tend to embed static assets as const strings at the moment this is why I asked. I am fine with vars too!

pierrec on 2 Sep 2020

Interesting, so I could do something like:

//go:embed version.txt
var Version string

And potentially even have a //go:generate comment to generate version.txt. That would cut out a large use case for makefiles/ldflags.

carlmjohnson on 2 Sep 2020

👍30 ❤10 👀4

Is it an error if the file isn't found? If so, where technically is the error considered to occur? Link time?

carlmjohnson on 2 Sep 2020

👍1

Can we make sure that go:embed runs after go:generate so we can do things like easily generate versions, etc?

docmerlin on 2 Sep 2020

Can we make sure that go:embed runs after go:generate so we can do things like easily generate versions, etc?

From my understanding,go:generate will occur with go generate while go:embed would happen with go build.

tooolbox on 2 Sep 2020

👍7

@carlmjohnson Yes, it is always an error to say //go:embed foo where foo does not exist.
The error happens when compiling the source file containing that line.
(If you were to remove foo after compiling that source file, you still wouldn't get to a link step - the go command would notice that the package needs rebuilding because foo was deleted.)

rsc on 2 Sep 2020

👍6

I think that this proposal is not complete without saying something about ETag.
https://old.reddit.com/r/golang/comments/hv96ny/qa_goembed_draft_design/fzi0pok/

tv42 on 2 Sep 2020

@tv42, yes, we will make ETag work. I'm not sure what the shape of that is but we will.
(Also affirmed on https://github.com/golang/go/issues/35950#issuecomment-685845173.)

rsc on 2 Sep 2020

~~Two~~Three things I've noticed from working with mjibson/esc:

As go:embed doesn't need to generate go files for embedding as read-only filesystem, it would take away the pain of changing timestamps on the go:generateed files that defied git porcelain tests on CI- very nice
One thing I didn't find in the proposal but would need is the ability to live-reload the embedded files during development cycles. Using mjibson/esc I can currently do that instructing it to use the local filesystem (though it wouldn't pick up new files) and change the behaviour using build tags. I'm wondering what that could fit into the proposal?
Update Another thing I remember is that esc required to be able to transparently strip (parts of) the base path in order to e.g. export an assets folder as web root.

Afterthought: I guess the second point could be remedied in conjunction with the io/fs proposal where I would either use the embedded or the live filesystem for inclusion? Implement the path stripping as io/fs middleware?

andig on 3 Sep 2020

@andig You can already strip prefixes when serving a filesystem over HTTP. I agree that the live-reloading can be done by a third party library wrapping an io/fs.

Merovius on 3 Sep 2020

👍5

One more thing: if I understand correctly, embed will consider files locally to the package and forbids ... My current design has /assets and /server/ where the latter contains the server‘s code and today hosts the generated files. With this proposal the embed would need to move to the root folder as assets wouldn‘t be accessible from server. This imposes different accessibility constraints from normal imports. I was wondering if this is necessary for security reasons or if module-local embeds should be generally allowed.

andig on 3 Sep 2020

👍2

One more thing: if I understand correctly, embed will consider files locally to the package and forbids ... My current design has /assets and /server/ where the latter contains the server‘s code and today hosts the generated files. With this proposal the embed would need to move to the root folder as assets wouldn‘t be accessible from server. This imposes different accessibility constraints from normal imports. I was wondering if this is necessary for security reasons or if module-local embeds should be generally allowed.

You can create a emed.go file in your assets directory and make the assets available as it's own package to the rest of your program.

thomasf on 3 Sep 2020

👍9 😄1

Another explicit goal is to avoid a language change. To us, embedding static assets seems like a tooling issue, not a language issue.

Agreed. In my opinion, adding syntactical sugar in the language to support this tooling change is a language change. I'm sure this is obvious to others, but this is effectively comment-as-code.

I strongly feel that magic/sugar detracts from the simplicity and readability of the language; it is very easy to miss a magical comment that embeds a file. While a response to this could easily be "okay, then don't use it", this change means that a reviewer still has to be vigilant for others using this feature and has to remember that comments around variable declarations can break builds or fail at compile-time.

I believe this is going to add confusion, detract from language usability, and will result in opaque, large binaries without clear benefit (regarding the lattermost concern, this will even lead to an anti-pattern of re-building binaries due to plain file changes). If go mod allowed for a --withNonGoCodeAssets, I believe this would solve the needs of most developers that don't want to write more complex build pipelines (I assume end-user distribution is a smaller subset of the problem for users).

tristanfisher on 3 Sep 2020

👎11 👍5

@tristanfisher, I understand your point about language vs tooling change. It's certainly near the line. The reason that I consider it more a tooling change is that the language spec is unaffected - whether a program is valid does not change, the type checking process does not change. All that changes is the initial value of that variable following the comment. In this way it is a bit like the linker's -X flag, which can set the initial value of a top-level var of type string. It's fine for us to disagree; I just wanted to make my definition clear and explain the distinction I'm making.

As for bloat, I guess we'll have to see, but I don't anticipate programs getting much larger than they already are. People _already_ run tools that turn arbitrary files into Go code, check them into their repos, and make the compiler build them. The design removes some overhead from this process but does not enable anything new. Maybe people will abuse it now that it's easier to do, but on balance I don't expect that to be much of a problem. (And if some dependency embeds something so big that it bloats your binaries, you can always choose not to use that dependency.)

As for rebuilds due to plain file changes, the only files that can trigger rebuilds are the ones in your own top-level module, since dependencies are immutable. If you found rebuilds happening more often than you'd like, the only explanation is (1) you are embedding files and (2) you are modifying those files. You would be in complete control of doing something about either cause. (It would be another thing entirely if a dependency's choice of what to use was somehow forcing extra rebuilds or other expense on you. But that's not the case here.)

rsc on 3 Sep 2020

👍19

@rsc I agree that it's okay for us to disagree and I appreciate your response. My feeling is that if it's included by default in the standard tooling and comments can lead to an implicit initialization of a variable, then it's a language change. Outside of that debate, I guess my icky feeling is around more directives as "magic" comments that need to be memorized by (human) code readers. This could be taken to the absurd conclusion of adding new features via block comments that get handled at build time.

That said, if this gets added to the ecosystem, I will be thankful that importing embed is required -- that's better than nothing as a "hey, head's up" when auditing code. I think go mod allowing for non .go would solve a majority of use cases (I imagine most people are going to glob files for webservers) and would also live entirely in tooling.

I think your point regarding the linker is a good one. It also helps explain my feelings on this: if the very end user (e.g. not someone that simply imports a package) is making the decision, there's no way to be surprised by blobs of non-code coming along for the ride. My concerns are born out of reviewing/pairing-on others' work and "tech-leady" responsibilities, which is why I felt the need to respond.

I think "we'll have to see" sums it up well (I'm more cynical about bloat/misuse).

tristanfisher on 3 Sep 2020

👎7 👍1

I will read through the draft design tonight, so far it looks good from the TinyGo perspective.

I just wanted to clarify one thing:

On the other hand, projects like TinyGo and U-root target systems with more RAM than disk or flash. For those projects, compressing assets and using incremental decompression at runtime could provide significant savings.

I don't know about U-root, but for TinyGo the main targets are microcontrollers which normally have far more flash than RAM (usually a factor of 8 or 16). A quick look at the draft design seems to suggest the idea is to keep the files in read-only memory, which would work fine for these targets: the embedded files can be read directly from flash. It would most likely not be desirable for TinyGo targets to decompress files at runtime.

aykevl on 3 Sep 2020

👍5

The io/fs proposal on which this depends looks to be blocked on Readdir/FileInfo problems, under discussion in #41188 and previously #40352.

I've drafted an API to replace them in https://github.com/golang/go/issues/41188#issuecomment-686283661

networkimprov on 4 Sep 2020

@andig

One thing I didn't find in the proposal but would need is the ability to live-reload the embedded files during development cycles.

embed.Files implements fs.FS, hence all you need to do is to use dev vs !dev build tag to switch a variable between embed.Files and the real FS.

tv42 on 4 Sep 2020

👍4

I filed #41265. It offers a new ReadDir() API for io/fs.

networkimprov on 8 Sep 2020

I have similar concerns as @tristanfisher . Go has been using magic comments as compiler directives for a long time (since the beginning?), but they are meant for corner cases and it is rare for them to appear in code. Given the popularity of embedding static content in Go binaries, //go:embed is likely to be more common. Maybe it's time to consider a different syntax for compiler directives?

jonbodner on 10 Sep 2020

👍5

Just a reminder that changing the Go syntax has a very high cost. Pretty much every Go tool out there would need to be updated and/or fixed to support the new syntax, for example.

mvdan on 10 Sep 2020

👍4 👀1 ❤1

I don't consider them magic comments. Lines starting with //go: are directives and could be defined as such in the spec. There's not a lot of semantic difference between //go:embed, @embed, [[embed]] or any other number of syntax variations, except the //go: prefix is already treated as non-code by Go tools. (my editor highlights those lines differently for example)

iand on 10 Sep 2020

👍7

@mvdan If this proposal happens, Go syntax has changed. It's just changed in a way that doesn't break existing tooling. Maybe that seems pedantic.

@iand I'm not fussy about the specific syntax for compiler directives. I just think that it needs to be formalized at some point and the rules specified.

I think this proposal is a good idea. It solves a common problem. My concern is that the cost of adopting it should be made a little more explicit.

jonbodner on 10 Sep 2020

@jonbodner I share your concerns about magic comments. But to some extent the rules are specified by #37974.

ianlancetaylor on 10 Sep 2020

@networkimprov, this is not the io/fs proposal. Please stop commenting about ReadDir here.

rsc on 11 Sep 2020

👍6

@jonbodner

I'm not fussy about the specific syntax for compiler directives. I just think that it needs to be formalized at some point and the rules specified.

I would just point out that we made the decision to use //go: to mark Go toolchain directives when
we added (the limited use) //go:nointerface annotation back in 2012.
We added //go:noescape for assembly authors in 2013.
We added //go:generate in 2014.
We are likely adding //go:build in 2020-2021 as well.
There are others; that's just the highlights.
You can think of //go: as meaning #pragma from C if it helps.

At this point, the convention is very well established.
We chose that syntax back in 2012 because
(1) it's obviously not a comment for a person;
(2) tools that don't know about the comments will ignore them because they're comments; and
(3) it generalizes to other tools (s/go/yourtool/).

And as Ian said, #37974 formalized the exact generalized comment syntax, for what that's worth.

rsc on 11 Sep 2020

👍4 ❤3

Based on the discussion above, this seems like a likely accept.
(Again, assuming but separate from the FS proposal.)

rsc on 16 Sep 2020

👍20 🎉15

No change in consensus, so accepted.

rsc on 23 Sep 2020

🎉57 👍39 ❤19

I'm eager to get my hands on embed- can this already be tested on master or are there any plans to ship it as experiment during the 1.15 cycle?

andig on 30 Sep 2020

😄1 👎1

@andig, Go 1.15 is out already. I still hope this will be in Go 1.16 and landing in the development branch this month.

rsc on 1 Oct 2020

@rsc 1.16 available?

septs on 1 Oct 2020

👎3

@septs, no we are still working on Go 1.16. The code freeze is Oct 31, with a target release date of Feb 1.

rsc on 1 Oct 2020

fastest 2021Q1 or 2021Q2 release?

septs on 1 Oct 2020

👎5

@septs please stop asking questions about Go releases in this thread. Over twenty people follow it and get notified. See https://golang.org/wiki/Questions and https://github.com/golang/go/wiki/Go-Release-Cycle.

mvdan on 1 Oct 2020

👍15 ❤4

Change https://golang.org/cl/243941 mentions this issue: go/build: recognize and report //go:embed lines

gopherbot on 20 Oct 2020

Change https://golang.org/cl/243940 mentions this issue: go/build: refactor per-file info & reader

gopherbot on 20 Oct 2020

Change https://golang.org/cl/243942 mentions this issue: embed: implement Files

gopherbot on 20 Oct 2020

Change https://golang.org/cl/243944 mentions this issue: cmd/compile: add //go:embed support

gopherbot on 20 Oct 2020

Change https://golang.org/cl/243945 mentions this issue: cmd/go: add //go:embed support

gopherbot on 20 Oct 2020

One detail that came up in the implementation review is that "Files" as a singular noun is pretty awkward ("A Files holds ...").

The choice of embed.Files for the name predated both the io/fs proposal and also the support for string and []byte.
Given both of those developments, one seemingly sensible way to resolve the "A Files holds" problem is to call it an FS instead of a Files.

Then the three ways to embed & print data are:

import "embed"

//go:embed hello.txt
var s string
print(s)

//go:embed hello.txt
var b []byte
print(string(b))

//go:embed hello.txt
var f embed.FS
data, _ := f.ReadFile("hello.txt")
print(string(data))

That seems clearer about what you get: a string, a []byte, or an FS.
That is, most of the functionality of the embed.F* comes from it being an fs.FS, and calling it FS makes that clearer than calling it Files.

I've made that change in my latest draft of the CL implementing package embed, but I wanted to circle back here and see if there are any objections to the name change.

(A more radical change would be to do var f fs.FS instead of var f embed.FS, but that would preclude ever having any method on f other than Open. For example, above, having ReadFile is convenient and would not be possible. And in general we've learned that using a concrete type for something that might want to add methods later is good future-proofing compared to using an interface type directly.)

rsc on 22 Oct 2020

👍25

I think the rename is a good change.

In regards to the more radical change:

If we'd use fs.FS, would we even need the embed package anymore? I guess the dynamic value still must have some type, that lives in some package? I find the idea of not having to add a package a plus.
I don't find "we can't add methods" super convincing, because IMO f.ReadFile(…) isn't significantly ~~less~~more convenient than fs.ReadFile(f, …).
I agree that concrete types are better in general though, so that's a plus for keeping it embed.FS
Another question: Does embed.FS use pointer-receivers, or value-receivers? IMO having to pass around &f is awkward, using value receivers is slightly unexpected. We might also allow var f *embed.FS though. If the variable has an interface-type, this question goes away.

Overall, I still agree using the concrete embed.FS is better - if nothing else, then for documentation purposes.

Merovius on 22 Oct 2020

Now that you've mentioned it, I don't think I got this clear: we can embed directories right?

chabad360 on 22 Oct 2020

Yes as an embed.FS which implements fs.FS.

carlmjohnson on 23 Oct 2020

@Merovius, embed.FS uses value receivers. embed.FS is a one-word struct containing a single pointer, so there's no actual overhead to doing so, but it means you can assign them around and use them without worrying about *s and \&s everywhere.

@chabad360, yes, you can embed directories.

rsc on 23 Oct 2020

👍1

What about symlinks?

burik666 on 23 Oct 2020

@burik666, see https://golang.org/s/draft-embed-design for details, but no, you cannot embed a symlink.

rsc on 23 Oct 2020

👍1

will it be possible to embed and use dynamic C libraries? if so how would we use the embed path in #cgo headers such as: #cgo LDFLAGS: -L./lib -lmylib -Wl,-rpath=./lib ?

benitogf on 23 Oct 2020

👀4 👎1

@benitogf I assume the only real way to do this would be to write them to disk and use dlopen. I can't imagine how you could tell the dynamic loader how to find embedded files. Also, if you want to bundle in C code, static linkage would seem more appropriate anyway, no?

Merovius on 23 Oct 2020

@benitogf Embedding lets you put a file from disk into a []byte in your program conveniently, nothing more.
If you have a way to use a dynamic C library that is already in your program in the form of a []byte, then embedding will help you get a disk file there. Otherwise, no.

rsc on 28 Oct 2020

👍1

static linkage would seem more appropriate anyway, no?

@Merovius agreed, but I have several use cases working with vendors who will only provide dynamic libs

If you have a way to use a dynamic C library that is already in your program in the form of a []byte
the only real way to do this would be to write them to disk and use dlopen

writing the embedded library from the []byte to filesystem and using dlopen seems ok, though having the embedded files optionally "dumped" to filesystem on build/run so the #cgo header can access them would be useful, not only for cgo imho

benitogf on 29 Oct 2020

👎1

Trying this out now; one wart with the go:embed directive is that if I embed build/*, the filenames still have the prefix build/. If I want to then serve that directory via http.FS, there's no easy way to _add_ the prefix that's required to access them if needed (without writing a wrapper, which then hits the problem of needing to list out every potential method that the FS may have...).

e.g.:

//go:embed build/*
var buildDir embed.FS

// Serve some SPA build dir as the app; oops, needs to be build/index.html
http.Handle("/", http.FileServer(http.FS(buildDir)))

// or

//go:embed static/*
var staticDir embed.FS

// Oops; needs to have a static prefix.
http.Handle("/static/*, http.StripPrefix("/static", http.FileServer(http.FS(staticDir))))

// Could be this, but only because the prefix happens to match:
http.Handle("/static/*, http.FileServer(http.FS(staticDir)))

I know the intent is that one could write go:embed foo/* bar/* baz.ext and get all of those files, but I think it's going to be very common to simply embed a directory and serve it as static assets via the http package. I expect this to be a gotcha as people switch from things like http.Dir("static") or pkger.Dir("/internal/web/static") where the prefix is already handled, to the new embed.FS.

I'm not really sure how to file this, as it's sort of an interplay with embed, io/fs, and net/http.

zikaeroh on 30 Oct 2020

@zikaeroh Writing a http.Handler wrapper would also work there right? That's only ever one method and there's even http.HandlerFunc. Perhaps the standard library could even provide one to mirror http.StripPrefix (something like http.AddPrefix or http.ReplacePrefix).

tmthrgd on 30 Oct 2020

Potentially, though it feels a bit weird to be modifying the HTTP request to work around the FS implementation (as opposed to a generalized "give me an FS that is a subdir of another FS", which is not simple with optional methods). It wouldn't be the most efficient thing, stripping then adding another prefix again (given http.Request copies), but I'll give it a try later. It's at least not _different_ than the current scheme of things where you have to work with the request, I suppose.

I have a few other places I use static data not via the http package that I'll have to have to come up with a similar fix for.

zikaeroh on 30 Oct 2020

👍3

If I have to have a look at how it is being implemented where can I look. A branch where it is being implemented?

alwindoss on 30 Oct 2020

👎1

It was suggested before to embed the files in-place, i.e. do that in the build directory and then import it. That would strip the build prefix. Then use the handler to add the required prefix. I‘m unsure how to exclude the go file doing the embedding from embedding itself. See https://github.com/golang/go/issues/41191#issuecomment-686621090

andig on 30 Oct 2020

👍3 👎1

It was suggested before to embed the files in-place, i.e. do that in the build directory and then import it. That would strip the build prefix. Then use the handler to add the required prefix. I‘m unsure how to exclude the go file doing the embedding from embedding itself. See #41191 (comment)

That's unfortunately no good for directories produced by other tooling, e.g. the output of say a webpack build or CRA (where they're often cleaned beforehand, and not checked in). I'd rather hack the filenames.

zikaeroh on 30 Oct 2020

It was suggested before to embed the files in-place, i.e. do that in the build directory and then import it. That would strip the build prefix. Then use the handler to add the required prefix. I‘m unsure how to exclude the go file doing the embedding from embedding itself. See #41191 (comment)

That's unfortunately no good for directories produced by other tooling, e.g. the output of say a webpack build or CRA (where they're often cleaned beforehand, and not checked in). I'd rather hack the filenames.

If you use such a huge system of plugins as webpack it's easy to just install another webpack plugin to generate the embed.go along the assets themselves. If you are using something simpler with a makefile or shell script it's also easy to generate the .go file from there.

thomasf on 30 Oct 2020

@zikaeroh

as opposed to a generalized "give me an FS that is a subdir of another FS", which is not simple with optional methods

It's supposed to be simple. Handling the wrapping problem was part of the design process. In particular, the wrapper is supposed to implement all the optional methods, by just calling the appropriate helper function in fs. If that doesn't work, that's concerning and it would be great to get some details.

Also, IMO, an implementation of such a wrapper (that strips a prefix) should ultimately be provided by io/fs (analogous to io.LimitWriter, etc). I would assume the only reason it hasn't happen yet is time.

Merovius on 30 Oct 2020

👍1

@andig The problem with doing that is that the Go-file containing the embed-directive and the variable is then also visible from the FS (and would be served by HTTP or might get exposed in another manner).

Merovius on 30 Oct 2020

👍2

The problem with doing that is that the Go-file containing the embed-directive and the variable is then also visible from the FS (and would be served by HTTP or might get exposed in another manner).

One way to remedy that could be to add the ability to exclude specific files/folders from embedding (@rsc ?)

andig on 30 Oct 2020

One way to remedy that could be to add the ability to exclude specific files/folders from embedding (@rsc ?)

The proposal was accepted over a month ago and is already implemented; I don't think large design changes like being able to exclude paths are reasonable at this point. If you have a problem with the implemented design that you can't work around, I would suggest filing a separate bug report with details, which can then be tracked before the final 1.16 release.

mvdan on 30 Oct 2020

👍3 👎1

@Merovius

It's supposed to be simple. Handling the wrapping problem was part of the design process. In particular, the wrapper is supposed to implement all the optional methods, by just calling the appropriate helper function in fs. If that doesn't work, that's concerning and it would be great to get some details.

How would the prefix stripping work for Glob?

icholy on 30 Oct 2020

@icholy I assume something like

func (f *stripFS) Glob(pattern string) (matches []string, err error) {
    matches, err = fs.Glob(f.wrapped, path.Join(f.prefix, pattern))
    for i, m := range matches {
        matches[i] = strings.TrimPrefix(m, f.prefix+"/")
    }
    return matches, err
}

Maybe with some additional care.

Merovius on 30 Oct 2020

👍1

Playing with this on gotip, I notice that it will include .DS_Store files. This is pretty much inescapable, I guess, but I do worry that including dot files will lead to accidental file inclusion. Maybe the docs should have a strong warning about this?

carlmjohnson on 31 Oct 2020

👍2

My shell doesn't include dot-files in *, so if I want to include them, I have to use * .*. This might be a (perhaps equally surprising) way to give a level of control.

I'm not sure what to think - IMO, dot-files shouldn't really be treated differently by the pattern, but OTOH the .DS_Store example seems like something that really should be addressed.

Merovius on 31 Oct 2020

Why not just do git clean -dffx && go build? If the DS_Store files are in git, then they'll be included in the build. If they're not, they won't be. This will also work nicely with gitignore.

You should be building with a clean VCS checkout, anyway. If you add random temporary files, they might make it into the final build and you can't possibly know what files people will end up with. We might want to document this.

mvdan on 31 Oct 2020

👎6 👍2

@mvdan I agree in principle, but in practice, I worry that a lot of people will get burned by dirty builds if they are not warned. I don't want to see a secret leak someone didn't realize their .env file was embedded by mistake. I have seen tons of examples of an equivalent mistake with PHP hosting, even though it could have been easily prevented by telling Apache to exclude dot files.

Re: embeding http.FileServers

If you use such a huge system of plugins as webpack it's easy to just install another webpack plugin to generate the embed.go along the assets themselves.

That's true, but it's very fiddly. The point of embed.FS is to reduce the need for crazy Makefile workarounds. I think the simple solution is to have fs.WithPrefix(string) fs.FS that locks an FS into a subdirectory. I thought there was some discussion of this in the proposal, but I can't find it now. Maybe it was just mooted on Reddit or something.

carlmjohnson on 31 Oct 2020

👍3 👎1

One way to remedy that could be to add the ability to exclude specific files/folders from embedding (@rsc ?)

It could be something like

//go:embed static
//go:embed-exclude .*
var staticFiles embed.FS

The embed-exclude directive could just do a glob filter on the accepted files and remove any matches…

carlmjohnson on 31 Oct 2020

If we want to avoid adding more to this proposal it could also be a lint rule that checks embedded filesystems for potentially unexpected dotfiles and warns you so that you can fix your build to remove them.

SamWhited on 31 Oct 2020

Or exclude dot files by default unless they are specifically mentioned by adding .* or similar.

That still wouldn‘t take care of exposing an assets.go file. As for the proposal already having implemented, please note that the question about the asset generation was raised during the discussion phase. It‘s probably not dangerous to have an (otherwise empty, except for the embed directive) assets.go exposed but it would be cleaner to not have it. As usual there are all sorts of workarounds that can be applied.

andig on 31 Oct 2020

It‘s probably not dangerous to have an (otherwise empty, except for the embed directive) assets.go exposed but it would be cleaner to not have it.

I agree that this is very unlikely to be a problem, but I would hate to see any closed source code accidentally leaked due to misconfiguration if we can make it easy to configure things correctly.

carlmjohnson on 1 Nov 2020

👍1

I don't want to see a secret leak someone didn't realize their .env file was embedded by mistake.

If someone uses //go:embed static/* and there's a static/.env or static/.super-secret, wouldn't you say that the user really meant to include those files? Otherwise, why would they be in the static directory?

I get that this is up to what the user expects and what * means in most contexts, but I personally think that https://golang.org/pkg/path/filepath/#Glob semantics is our only good option. It's the simplest and what most Go users will be used to in the context of developing Go.

I think warning against the dangers of embedding * is a good idea in any case, though, because in a significant amount of cases one could reduce the chance of error by using more specific globs like *.png.

Also, I still encourage you to file a separate issue that can be tracked against the 1.16 release, written from the point of view of a bug report. This proposal is accepted and implemented, so I imagine it will be closed very soon. For example, you could phrase the bug report like: embedded file support easily leads to including unintended files (and give some examples).

mvdan on 1 Nov 2020

👍11 👎3

If someone uses //go:embed static/* and there's a static/.env or static/.super-secret, wouldn't you say that the user really meant to include those files? Otherwise, why would they be in the static directory?

There is a huge amount of corner cases, for example you opened an edit session with vim, but didn't close it, and it created .*.swp file with some contents that you would like nobody to see.

inliquid on 1 Nov 2020

👎4 👍2

Moving discussion to #42321.

carlmjohnson on 1 Nov 2020

This would be highly appreciated by the Prisma team for our Database Go client, since we use a query engine at runtime which is written in rust and needs to be in the built go binary somehow.

The way we're currently doing it is packing the binary into a .go file at go:generate time, but the file size is much higher than when it's a binary .gz file.

Native embeds would make this much better so we could directly embed a .gz file into the final go binary.

steebchen on 10 Nov 2020

If someone uses //go:embed static/* and there's a static/.env or static/.super-secret, wouldn't you say that the user really meant to include those files?

No, I would not.

$ mkdir z
$ touch z/.secret z/intended
$ ls z/*
z/intended
$ ls z
intended

tv42 on 10 Nov 2020

👍5 👎2

See my later comment in https://github.com/golang/go/issues/42328#issuecomment-720169922.

mvdan on 10 Nov 2020

👍1

I love the idea to make static files / templates embeddable which definitely can boost go dev productivity a lot.

But should we innovate another tag (e.g. @ or whatever) other than to reuse this //, which is supposed to be comments?

As of now, the // has been overused I guess, think of these:

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:webhook:verbs=create;update,path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,versions=v1,name=vcronjob.kb.io
// +optional
...

brightzheng100 on 24 Nov 2020

👎7 😕1

But should we innovate another tag (e.g. @ or whatever) other than to reuse this //, which is supposed to be comments?

That's a separate discussion and while it would be nice for directives not to be comments, a lot of go1 compat code already relies on that, so it's most likely not going to change.

ptman on 24 Nov 2020

👍2

But should we innovate another tag (e.g. @ or whatever) other than to reuse this //, which is supposed to be comments?

I tried finding it and failed, but I remember there being a decision to standardize on //go:<word> for these kinds of directives.
It doesn't seem like a good idea to change syntax around, given the decision to expressly converge on them.
(Also, of course, they are comments specifically so that the compiler ignores them - these directives are specific to the go tool, so they shouldn't leak into the language proper)

Merovius on 24 Nov 2020

I saw a mention on the io/fs issue that embed.FS supports ETag: https://github.com/golang/go/issues/41190#issuecomment-737702433

I tried running a test for it but I'm not seeing an ETag response header set. Maybe I'm misunderstanding the usage. Should I expect to see one here? https://play.golang.org/p/Wq5xU5blLUe

benbjohnson on 4 Dec 2020

I don't think it does. http.ServeContent (used by http.FileServer) inspects the ETag header, but it does not set it, AIUI.

In a comment above Russ has said that ETag will be made to work. The difficulty is how to have embed.FS communicate to http.FileServer the necessary information to set ETag or other caching headers. There probably should be a separate tracking issue for this.

Personally, I'd argue that embed.FS should use the time of the last commit of the relevant module as a ModTime. This would roughly correspond to debug.BuildInfo, so it wouldn't impact reproducibility. Though I'm unsure how that is set for commits that don't correspond to a tagged release and it would still leave the question for what to set it to for builds from dirty worktrees.

But, I trust @rsc has a good solution in mind :)

Merovius on 5 Dec 2020

👍1

I'm not sure I understand the comment about "commit time"; if module source is a zip file, there's no "commit". I don't see any times mentioned in debug.BuildInfo or debug.Module.

More importantly, I'd argue that any timestamp-based mechanism is strictly inferior to a proper (hash of content based) etag.

tv42 on 7 Dec 2020

👍2

@tv42 Every module version is either a) a semantic version derived from a tag (which points at a commit) or b) a pseudo-version containing a commit-hash. I think? At least in git. I may be misunderstanding something.

More importantly, I'd argue that any timestamp-based mechanism is strictly inferior to a proper (hash of content based) etag.

I'm not so sure. It either needs some side-channel to communicate the hash, or the server needs to compute the hash of the file on request (which seems pretty expensive). After all, net/http doesn't know, a priori, if the fs.FS contents may change. The end-result of a hash-based ETag might justify the cost of adding such a side-channel (like an optional interface), but it doesn't seem obviously strictly better.

Also, I would argue that supporting at least also a time-based approach would mean you can work with more clients. I don't have any data to support that, though (i.e. I don't know if and how many clients are out there that might support If-Modified-Since but not ETag and vice-versa).

But really, I don't care much which approach is chosen. I just wanted to mention the option of using the time a module version was tagged.

Merovius on 7 Dec 2020

@Merovius Go modules are specified in terms of zip files. The fact that Git is a common source used for constructing that doesn't change the spec. The zips have zero timestamps:

$ unzip -v ~/go/pkg/mod/cache/download/golang.org/x/crypto/@v/v0.0.0-20200510223506-06a226fb4e37.zip|head
Archive:  /home/tv/go/pkg/mod/cache/download/golang.org/x/crypto/@v/v0.0.0-20200510223506-06a226fb4e37.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
     345  Defl:N      233  33% 1980-00-00 00:00 237856c8  golang.org/x/[email protected]/.gitattributes

There does seem to be a timestamp in the accompanying *.info file, but I have no idea if that's reliable or available to the tooling.

$ cat ~/go/pkg/mod/cache/download/golang.org/x/crypto/@v/v0.0.0-20200510223506-06a226fb4e37.info; echo
{"Version":"v0.0.0-20200510223506-06a226fb4e37","Time":"2020-05-10T22:35:06Z"}

Even then, what timestamp should go:embed use in the main module (the one you run go build in)?

Personally, in the context of static files that are immutably embedded in the binary, hashing just seems superior to timestamps in every respect (except legacy support from before ETag existed, pre-HTTP/1.1, the 90s), and has no possible source of confusion in distributed systems and concurrent builds either. I seriously doubt clients that don't understand ETag exist anymore, and they sure won't make the HTTP/2 transition.

A side-channel to communicate the hash sounds like the right thing to me; an optional FS interface to return a hash of the file. (And maybe another to request a specific hash algorithm, if they happen to have it?. Some serving uses cases really specifically need sha256/sha1/md5, not just "some hash"). An FS that doesn't have a cheap enough answer can choose to not implement it.

(Though modern hashes are gigabytes/second/core even when cryptographically secure, tens of gigabytes/second for less secure ones, and easy to cache based on stat call. Just support ETag everywhere, people.)

tv42 on 8 Dec 2020

I've been thinking about this proposal today. I'm happy to see this functionality included in the Go toolchain, and I appreciate all the thought and effort that has gone into the proposal to date. Thanks for proposing this and driving this forward.

I have two questions, and then one suggestion (appreciating this is past the proposal discussion period, and already into release freeze):

Allowed types

I notice that the current code expects the variables to be declared as precisely ~embed.FS~, string, or []byte. In particular, these are not allowed: []uint8, ~FS after dot-import of "embed"~, or any other identical types constructed using type aliases. (Edit: I forgot how dot imports work within gc, and misread the code for detecting embed.FS.)

Is this intended or a bug?

Semantics of `[]byte`-typed variables.

I don't see any mention of what the expected identity semantics of []byte-typed variables are. In particular, for function-scoped variables. This isn't important for string- and embed.FS-typed variables, because they reference immutable data that can be safely deduplicated. But it's important to know the intended semantics for []byte-typed variables.

With the current implementation, the test program below prints false true (when foo.txt is non-empty). Is that intended/guaranteed?

package main

//go:embed foo.txt
var a []byte

//go:embed foo.txt
var b []byte

func f() *byte {
    //go:embed foo.txt
    var x []byte
    return &x[0]
}

func main() {
    println(&a[0] == &b[0], f() == f())
}

I think the more Go-like semantics for //go:embed variables would be those of composite literals: that each execution yields a new copy.

If there's no consensus on the proper semantics for this, we can always punt on it and make function-scoped, []byte-typed embeds an error for Go 1.16: users can still either declare a package-level variable if they want the current semantics (one byte slice per source declaration), or use a string-typed variable and convert to []byte if they want composite-literal semantics. We could then revisit later which behavior users would benefit from more.

Avoiding more `//go:` directives

I recommend against adding //go: directives for end-users that impact program semantics, and I don't find the arguments given for favoring //go:embed over normal Go syntax compelling. I respectfully encourage reconsidering this decision before the Go 1.16 release. (Again, I appreciate how late this request is.)

I'll start by pointing out I have a working proof-of-concept CL at CL 276835, which changes the embed API from:

//go:embed foo.txt bar.txt
var x embed.FS

to

var x = embed.Files("foo.txt", "bar.txt")

Similarly, functions embed.Bytes and embed.String are available to embed a single file and get it as a []byte or string value.

Responses

Similarly, an embed.Files variable can be a global or a local variable, depending on what is more convenient in context.

Having embed.Files, etc. allows also using them in expression contexts, which may be more convenient still.

It is an error to use //go:embed in a source file that does not import "embed" (the only way to violate this rule involves type alias trickery).

With embed.Files, etc., this is an error by virtue of the functions being exported by package embed.

Goimports (and gopls etc) can be taught this rule and automatically add the import in any file with a //go:embed as needed.

No special goimports or gopls logic is needed to be aware that importing "embed" is the proper way to fix uses of embed.Files, etc.

This approach fixes the type-checking problem—it is not a full language change—but it still has significant implementation complexity.

Notably, CL 276835 is a net removal of code. In particular, the compiler code (which will have to be reimplemented in gccgo and other compilers) is much simpler.

I also expect it will be easier to teach go/types about the nuanced semantics of embed.Files, etc. (i.e., that they only accept string literal arguments), than to teach it about //go:embed.

The go command would need to parse the entire Go source file to understand which files need to be made available for embedding. Today it only parses up to the import block, never full Go expressions

With CL 276835, the behavior is the same as at tip: the go command parses the entire Go source file for files that import package embed, and only the imports for files that do not.

Admittedly, for files that import embed, CL 276835 does a full parse and walk, whereas tip does a more efficient string scan for //go:embed comments. I think a more optimized, one-pass algorithm for finding embed.Files calls is doable though if desired.

It would also be unclear to users what constraints are placed on the arguments to these special calls: they look like ordinary Go calls but they can only take string literals, not strings computed by Go code, and probably not even named constants (or else the go command would need a full Go expression evaluator).

This doesn't seem substantially different from the //go:embed directive to me: users don't have any expectation what arguments they're allowed to use there until they use it the first time. Moreover, either way, users will get compiler error messages if they use it incorrectly, but IDEs and other tools will automatically provide better godocs and cross-references for embed.Files, etc.

mdempsky on 10 Dec 2020

👍6

@mdempsky FS after a dot-import of embed is the same type. So, that particular case seems a straight-forward "yes, that's fine" (tried it and it already works). Likewise, byte is an alias for uint8, so []uint8 is also the same type as []byte, though somewhat surprisingly, this doesn't work right now. However, I'd argue that the semantics as implemented are fine, for now - we can always allow more types and/or aliases and/or even "the same underlying type" later, if the need arises.

I think the more Go-like semantics for //go:embed variables would be those of composite literals: that each execution yields a new copy.

I tend to agree, that was my expectation as well. At first I thought this might've been an artifact of escape analysis and the compiler realizing you don't change the data, but no:

func f() {
    //go:embed foo.txt
    var x []byte
    fmt.Printf("%q\n", x)
    x[0] = 'x'
}

func main() {
    f()
    f()
}

However, "every var declaration creates a new variable" also has its problems, because it means code like this

func ServeIndex(w http.ResponseWriter, r *http.Request) {
    //go:embed "index.html"
    var x []byte
    http.ServeContent(w, r, "index.html", time.Now(), bytes.NewReader(x))
}

would needlessly allocate and copy. Perhaps that's fine and gets outweighed by the benefits of clearer semantics. But perhaps it's also a sign to forbid local []byte embeds as you mention.

I also expect it will be easier to teach go/types about the nuanced semantics of embed.Files, etc. (i.e., that they only accept string literal arguments), than to teach it about //go:embed.

Only string constants at least, can even be done in the Go type-system.

But, does go/types even need to know about embedded files, with //go:embed? The type of those variables are pretty clear, after all (with the exception of local []byte embeds, as discussed).

Merovius on 10 Dec 2020

👍1

//go:embed "foo"
var x []byte

Is this really meant to be mutable? I can't think of a single use case where I would want to embed a static asset, but mutate it at runtime, so I feel like that'll just enable bugs. I would have been happier shoving it in .rodata and panicking if something tries to mutate it.

(Prompted by x[0] = 'x' above.)

tv42 on 10 Dec 2020

@Merovius wrote:

@mdempsky FS after a dot-import of embed is the same type. So, that particular case seems a straight-forward "yes, that's fine" (tried it and it already works).

Thanks. I forgot how dot imports work within the compiler, so I misread the code. I should have tried first before including that example.

However, I'd argue that the semantics as implemented are fine, for now - we can always allow more types and/or aliases and/or even "the same underlying type" later, if the need arises.

[]byte and []uint8 are identical types under the Go spec, as are countless other types constructed using type aliases. If it's not acceptable for a compiler to treat embed.Files("foo" + ".txt") differently from embed.Files("foo.txt"), then the same argument should extend to not allowing it to treat aliased types differently.

But, does go/types even need to know about embedded files, with //go:embed?

No, it doesn't need to, just like it doesn't need to know about embed.Files's special semantics either. But for either syntax, there could still be value in go/types being able to warn about misuse.

--

@tv42 wrote:

Is this really meant to be mutable?

Evidently. The Go compiler specifically arranges for string and embed.FS embeds to be deduplicated and put in rodata, while []byte embeds are not:

https://github.com/golang/go/blob/56b783ad94f9a55163d494d5b34c783a9603d478/src/cmd/compile/internal/gc/embed.go#L224

However, it's true that the proposal does not seem to have addressed this point.

mdempsky on 10 Dec 2020

Thanks for the thoughtful comments. I've filed two issues to follow up:

#43216 - remove support for embedding directives on local variables
#43217 - define String and Bytes type aliases that must be used with //go:embed

I didn't file an issue for the //go:embed syntax itself. I wanted to say why here, so that it doesn't look to everyone like I'm just dismissing that.

Back when I blogged about the proposal process I spent a long time looking at ways to (and not to) handle decisions in large groups. One source I found that made a lot of sense to me was John Ousterhout's “Open Decision-Making” post. The whole post is worth reading, but I'll just quote here the section on Reconsideration:

The last rule for open decision-making is to make sure you don't reconsider a decision unless there is significant new information. This rule has two parts. First, you must be prepared to correct decisions that turn out to be wrong in a significant way. This is particularly true in startups: many decisions have to made without complete information and inevitably some of them will turn out to be wrong. A wrong decision that is corrected quickly causes little or no damage, but a wrong decision that is not corrected can be catastrophic.

On the other hand, you should not reconsider a decision unless significant new information has come to light since the original decision was made. If no new information is available, then reconsidering a decision will probably produce the same outcome as the original decision, wasting everyone's time. It's not unusual for employees to come to me weeks after a decision was made: "John, I voted against the decision to XYZ, and the more I think about it the more convinced I am that it was wrong; I really think we need to reconsider this." My answer is "What new information do you have?" If the answer is "none", we don't reconsider. In situations like this it helps to have a record of the arguments presented during the discussion, so you can verify that new information really is new. If you make it too easy to reopen decisions you end up vacillating back and forth where no decision is ever final and employees hesitate to implement decisions because they don't believe they are permanent.

If you want to make sure you don't reconsider very many decisions, be sure to gather input widely during the decision-making process. If you don't get enough input, you increase the likelihood that significant new input will arise after the decision, which means you'll have to reconsider. If you do a good job of collecting input it's much less likely that you will have to revisit your decisions.

This really resonated with me: the Go proposal process (like large open source projects generally) is a system with a far higher offered load than work capacity, so it's important that we (disagree if necessary and) commit and move on to the next decision.

string and []byte were added fairly late in the process of considering this issue, in response to initial feedback, and we clearly didn't think through all the implications of those. So I filed those two new issues, #43216 and #43217.

On the other hand, the //go:embed syntax was a core part of the original discussions and was extensively discussed, both pros and cons. I don't believe there's “significant new information” that should cause us to reconsider that syntax entirely, so in the interest of moving forward and keeping Ousterhout's advice on Reconsideration in mind, I'm leaving that to the side.

Thanks again for pointing out the string and []byte issues!

rsc on 16 Dec 2020

❤7

Evidently. The Go compiler specifically arranges for string and embed.FS embeds to be deduplicated and put in rodata, while []byte embeds are not:

@rsc did you think about this last point by any chance? With the current implementation, it might become "best practice" to use string instead of []byte for the sake of producing better binaries. Are we OK with that?

mvdan on 16 Dec 2020

I don't understand why that would be a "best practice". To me, that's akin to saying it's "best practice" to make sentinel errors constants, ergo we shouldn't allow package-scoped variables of type error - in that I disagree both with that being a good practice and with the additional restriction as a solution.

I could see an argument to only use string in local vars. But in package-scoped variables, the semantics are clear and well-defined and I wouldn't recommend against using a []byte embed anymore than I would recommend against using any other []byte variable.

Merovius on 16 Dec 2020

@mvdan, if folks bias toward using string instead of []byte as a best practice, I would consider that a good thing. string is the appropriate Go type for “an immutable string of bytes”, whereas []byte is the appropriate type for “a mutable string of bytes” or “a string of bytes of indeterminate mutability”.

For the cases where you have a value of type string (immutable) and want to use it as type []byte (indeterminate), you can already use unsafe to do that correctly. (See, for example, my unsafeslice.OfString). Perhaps we should add a supported standard library for that operation, but that seems like a separate proposal.

So it seems fine to always use the type string if you really intend for the value to be read-only.

bcmills on 16 Dec 2020

👍2 👎1

@Merovius @bcmills you raise good points, and to be clear I don't object. I just want to make sure the proposal designers think about this distinction before the final release.

mvdan on 16 Dec 2020

❤1

I really don't think deduplication is going to come up much in practice. (In what setting are multiple packages going to embed the same exact files?) People should use the form that they need and not worry about "string means my binaries are smaller", because in general I don't think that will be true.

rsc on 16 Dec 2020

A few people asked about ETags. We ran out of time for that, but I've filed a proposal at https://github.com/golang/go/issues/43223 and hopefully that will lead to a good idea that can go into Go 1.17. Sorry for not being able to get it in this round.

rsc on 16 Dec 2020

👍3

Thanks for filing #43216 and #43217. If those are accepted, they'll substantially address my outstanding concerns with the //go:embed proposal.

On the other hand, the //go:embed syntax was a core part of the original discussions and was extensively discussed, both pros and cons. I don't believe there's “significant new information” that should cause us to reconsider that syntax entirely, so in the interest of moving forward and keeping Ousterhout's advice on Reconsideration in mind, I'm leaving that to the side.

I can respect not wanting to revisit matters that have already been decided through extensive discussion. But after reviewing the discussion that took place on #35950, the Reddit thread, and here, I don't think they justify the decision to use //go:embed.

Here are the comments I found that touched on syntax for indicating what files to embed:

18 GitHub issue and Reddit comments

https://github.com/golang/go/issues/35950#issuecomment-561443566 "The //go:embed approach introduces another level of complexity too. You'd have to parse the magic comments in order to even typecheck the code. The "embed package" approach seems friendlier to static analysis." (Note: The revised //go:embed proposal does make it easier to type check code, but it's still non-trivial if an analyzer wanted to actually see the strings used, because they're not in go/ast.)
https://github.com/golang/go/issues/35950#issuecomment-561450136 "That's a very strong argument for using a package. It also makes it more readable & documentable, since we can document it all with regular godoc, rather than deep in cmd/go's docs." (Note: I think this is mostly addressed by #43217, as then the package embed types provide an easy reference point.)
https://github.com/golang/go/issues/35950#issuecomment-561840094 "Is there a reason it couldn't be part of go build / link... e.g. go build -embed example=./path/example.txt and some package that exposes access to it (e.g. embed.File("example"), instead of using go:embed?" (Suggests function syntax over //go: directive. Downvoted, but I suspect because it suggested go build flags.)
https://github.com/golang/go/issues/35950#issuecomment-561726107 "I'm not particularly a fan of a comment that compiles to code, but I also find the pseudo-package that affects compilation to be a bit strange as well. If the directory approach isn't used, maybe it might make a bit more sense to have some kind embed top-level declaration actually built into the language. It would work similarly to import, but would only support local paths and would require a name for it to be assigned to." (Doesn't like either //go: or new builtin functions, but still prefers code rather than comments.)
https://github.com/golang/go/issues/35950#issuecomment-561856033 "Also, since //go:generate directives don't run automatically on go build the behavior of go build may seem a bit inconsistent: //go:embed will work automatically but for //go:generate you have to run go generate manually. (//go:generate can already break the go get flow if it generates .go files needed for the build)." (Included for completeness, but we already have //go: directives that both do and do not affect go build behavior even without //go:embed, so I don't find this argument compelling.)
https://github.com/golang/go/issues/35950#issuecomment-562005821 "I’d vote for the new package as opposed to the directive. Much easier to get a grip on, easier to handle/manage and much easier to document and extend. e.g. Can you easily find the documentation for a Go directive such as “go:generate”? What about the documentation for the “fmt” package? Do you see where I am going with this?" (Again, mostly addressed by #43217.)
https://github.com/golang/go/issues/35950#issuecomment-562200553 "One argument in favor of a comment pragma (//go:embed) instead of a special directory name (static/): a comment lets us embed a file in the test archive for a package (or the xtest archive) but not the library under test. The comment just needs to appear in a _test.go file." (Note: Argument is against special directory, not for //go:embed specifically; same argument extends to embed.Files functions, etc.)
https://github.com/golang/go/issues/35950#issuecomment-562235108 "I like the package embed approach for its using Go syntax"
https://github.com/golang/go/issues/35950#issuecomment-562713786 "The import "C" approach has already set a precedent for "magic" import paths. IMO it has worked out pretty well."
https://github.com/golang/go/issues/35950#issuecomment-562768318 "Multiple people have talked about automatically providing an API to read the embedded files from, instead of needing another signal like a magic comment. I think that should be the way to go, as it offers a familiar, programmatic syntax to the approach. Going for a special package, possiblly runtime/embed as previously mentioned, would satisfy that and it would allow for easy extensibility in the future."
https://github.com/golang/go/issues/35950#issuecomment-562959330 "Instead of reusing comments (which end up scattered all over the place, and, on a personal note, just feel gross), [...]." (Goes on to suggest using go.mod files for enumerating embedded files. To be fair, also argues "This works around issues like having to restrict the magic package to only allow string literals as arguments, but does mean that it's harder to check if the embedded assets are actually used anywhere or end up being dead weight.")
https://github.com/golang/go/issues/35950#issuecomment-562966654 "Yet another idea: Instead of adding a new kind of verb embed in go.mod, we could introduce a new kind of package, a data package, which gets imported and used in go.mod in the usual way. Here's a straw man sketch." (Preference for code over comments.)
https://github.com/golang/go/issues/35950#issuecomment-563156010 "Here is something that have not yet seen mentioned: the API for Go source processing tools (compiler, static analyzers) is as important as the runtime API. This kind of API is a core value of Go that helps to grow the ecosystem (like go/ast / go/format and go mod edit). [...] In the case of a special package, I see nothing to change in go.mod parsing (go mod tools) or go/ast parser."
https://github.com/golang/go/issues/35950#issuecomment-601748774 Suggests go.res resource file listing. Uses code, not comments.
https://old.reddit.com/r/golang/comments/hv96ny/qa_goembed_draft_design/fyv9gxw/ "Why do we need the //go:embed comment e.g. binclude does binclude.Include(filename) to include a file/ directory what about fs.Embed(filename)?"
https://github.com/golang/go/issues/41191#issuecomment-686625127 "In my opinion, adding syntactical sugar in the language to support this tooling change is a language change. I'm sure this is obvious to others, but this is effectively comment-as-code. I strongly feel that magic/sugar detracts from the simplicity and readability of the language; it is very easy to miss a magical comment that embeds a file. "
https://github.com/golang/go/issues/41191#issuecomment-690423900 "I have similar concerns as @tristanfisher . Go has been using magic comments as compiler directives for a long time (since the beginning?), but they are meant for corner cases and it is rare for them to appear in code. Given the popularity of embedding static content in Go binaries, //go:embed is likely to be more common. Maybe it's time to consider a different syntax for compiler directives?"
https://github.com/golang/go/issues/41191#issuecomment-690522509 "I share your concerns about magic comments."

As I read the comments, they appear to always favor Go syntax, with the caveat of wanting to avoid changes that will break tools. I didn't see anyone arguing for //go:embed as the spelling, as much as just accepting it as is. Adding new compiler intrinsics with backwards compatibility stubs for legacy type checkers seem more in line with the expressed preference of discussion participants than more //go: directives.

mdempsky on 16 Dec 2020

👍2

In the style of the thumbs up / thumbs down polls in #43216 and #43217:

Thumbs up (👍) if you prefer new compiler intrinsics (e.g., CL 276835):
```
import "embed"

var website = embed.Files("logo.jpg", "static/*")
```
See embed_test.go for more usage examples.

(Note: The arguments must be string literals, not merely string values. That is, declared string constants like const logo = "logo.jpg" or constant string expressions like "logo" + ".jpg" are not allowed either. However, embed.Files, etc. may be used in any context, not only to initialize variables.)
Thumbs down (👎) if you prefer new compiler directives (i.e., the current proposal):
```
import "embed"

//go:embed logo.jpg static/*
var website embed.FS
```

mdempsky on 17 Dec 2020

👎18 👍15

@mdempsky I feel like Russ has been quite clear, what would be needed to justify a reversal of the decision to him - new information. I think gathering previous comments clearly isn't that. No offense intended.

Merovius on 17 Dec 2020

👍1

There's no precedent for these new kinds of functions, right? That is, something that looks like a normal package function but is actually a special kind of builtin that can only be called in a certain way?

You say "intrinsic" but the current intrinsics behave exactly like normal Go functions.

x := 10000
_ = bits.RotateLeft64(10, x)

Dressing up new directives to look like Go functions but have a different (more restrictive) syntax than Go functions seems like a poor option from where I'm sitting. I think that embed would need to be described in the spec, for one thing, unlike //go: directives.

(You can approximate the "only literal arguments allowed" in normal Go code by making a function which takes a non-exported type internalString string arguments, but I gather that's not what you're proposing since your CL makes changes to the parser and type checker.)

cespare on 17 Dec 2020

❤5

@Merovius According to the Go proposal process:

Consensus and Disagreement

The goal of the proposal process is to reach general consensus about the outcome in a timely manner.

If general consensus cannot be reached, the proposal review group decides the next step by reviewing and discussing the issue and reaching a consensus among themselves. If even consensus among the proposal review group cannot be reached (which would be exceedingly unusual), the arbiter (rsc@) reviews the discussion and decides the next step.

In my reading of comments, consensus seemed to favor Go code syntax. If consensus now is in fact to stick with //go:embed, I respect that. But I don't think the documented process justified the initial decision to move forward with //go:embed.

(At the moment, the poll results weakly favor new functions over new directives, but not by much. If thumbs up don't outnumber thumbs down at least 2:1, I'm fine with just dropping this.)

@cespare

There's no precedent for these new kinds of functions, right? That is, something that looks like a normal package function but is actually a special kind of builtin that can only be called in a certain way?

There's both the universe's predeclared functions and package unsafe's functions.

You can argue that package unsafe is documented in the Go spec, but I argue it doesn't need to be. Go used to support modes where package unsafe was unavailable to users, and even today package unsafe has functionality and restrictions that are only documented in the package docs, not in the Go spec.

There are also internal functions within the Go runtime that are implemented only by the Go compiler. E.g., getg, getcallerpc, getcallersp, and getclosureptr.

(You can approximate the "only literal arguments allowed" in normal Go code by making a function which takes a non-exported type internalString string arguments,

I think that would be a reasonable addition to further narrow down the behavior of legacy type checkers.

but I gather that's not what you're proposing since your CL makes changes to the parser and type checker.)

CL 276835 doesn't change the parser, except to remove the new parser code added for //go:embed. It does change the type-checker, but comparably to //go:embed did before.

It would be easy to extend go/types to be aware of package embed, but I chose not to for CL 276835 specifically to show that it still works (e.g., cmd/vet isn't failing on package embed's unit tests).

mdempsky on 17 Dec 2020

@mdempsky You may disagree about whether or not consensus was reached at that time. Just as you might disagree with the decision itself. I don't think that really changes things, though. At the end of the day, "there is consensus" is also a decision that was made. And exactly the same points about requiring new information for a reversal apply to that decision.

Needing to satisfy every single person is a DDoS-vector - both in regards to the decision and for the process it was made with.

FTR, the question about tooling vs. language change has been discussed, as has the tradeoff between "string-literal vs. string-constant" ("string-literal" requires magic in the type-checker, "string-constant" requires the go tool to do type-checking - comments need neither). So, again, there isn't really anything new here. You might disagree with the outcome of that decision or the process it was made with - but the arguments you mention have been considered when making the decision.

Merovius on 17 Dec 2020

Needing to satisfy every single person is a DDoS-vector - both in regards to the decision and for the process it was made with.

I'm not demanding that I be personally satisfied, and I find it insulting that you're characterizing my posts as such. Earlier you also talked down to me as though I'm not familiar with the Go language or the Go compiler. Please stop with the condescension.

I listed above numerous comments where people near-universally expressed a preference to not add new //go: directives, whereas no one had commented affirmatively in support of them. As such, the overwhelming expressed preference of the community appeared to me to favor Go syntax, and my comment was arguing in defense of their stated preferences.

However, as it stands, https://github.com/golang/go/issues/41191#issuecomment-747095807 has more thumbs-downs than thumbs-ups. That's surprising to me, because it seems inconsistent with all of the comments during earlier discussion. But I'm happy to accept that the question has been directly addressed, and (especially as someone who will be involved in long-term support for this feature within the Go compiler) I'm happier now to support //go:embed seeing that it is in fact the community's preference and not just the proposal authors' preference. This outcome wouldn't have been reached if discussion had been shut down, as you seem intent upon.

FTR, the question about tooling vs. language change has been discussed, as has the tradeoff between "string-literal vs. string-constant" ("string-literal" requires magic in the type-checker, "string-constant" requires the go tool to do type-checking - comments need neither).

That comment is irrelevant to the arguments I was making. The alternative spelling I prototyped in CL 276835 has exactly the same technical properties as the existing //go:embed spelling: tools that need to know what files are to be embedded will need updating to process Go source files differently (e.g., to error about misuse of //go:embed comments or the embed.Bytes builtin function), whereas existing tools can continue to process code reasonably without worrying about them (e.g., go/types will ignore //go:embed comments but won't detect if it's applied to incorrect variable types, and it can type-check embed.Bytes using the stub declarations but it won't know to reject all calls that use arguments other than string literals).

The question of whether either of these are a "language change" is more philosophical than technical.

(Russ's argument that "whether a program is valid does not change" under the //go:embed proposal is also incorrect. https://github.com/golang/go/issues/41191#issuecomment-747799509 gives an example of a package that was and is valid according to the Go spec and was also accepted by Go toolchain releases up through Go 1.15, but will no longer be valid in Go 1.16.)

mdempsky on 17 Dec 2020

As someone who gave a 👍 to Matt's proposal to use var website = embed.Files("logo.jpg", "static/*"), My concern with using the comment form (//go:embed logo.jpg static/*) is "ease of use".

For example, these 2 program samples would output 2 different things, just because an "import" was missed:

import (
    "fmt"
)

//go:embed sample.txt
var sample string

func main() {
    fmt.Println(sample) // will print a blank line
}

import (
    "embed"
    "fmt"
)

//go:embed sample.txt
var sample string

func main() {
    fmt.Println(sample) // will print contents of sample.txt
}

By forcing the developer to use the import through language semantics you minimize issues where the embedded files are not working as expected because they aren't initialized as expected.

kushieda-minori on 18 Dec 2020

@kushieda-minori While I think my proposal is easier to use as well, the first example already doesn't compile at tip:

./x.go:7:3: //go:embed only allowed in Go files that import "embed"

I think this issue is also further mitigated by #43217, as you need to import "embed" to declare variables of type embed.Bytes and embed.String anyway. And the compiler (but not go/types or cmd/vet) also already reports an error if you apply the //go:embed directive to a variable of an incorrect type.

mdempsky on 18 Dec 2020

👍3

@mdempsky , however, accidentally compiling on an older version of Go will not fail and could give a false sense that the embed worked.

kushieda-minori on 18 Dec 2020

Older versions of Go don't have package embed, so the import "embed" will fail. It's true there's a risk that users can write:

package p

//go:embed foo.txt
var foo []byte

and it will be silently accepted by Go 1.15 and older. But it won't be accepted by Go 1.16 and newer. There at least aren't any programs that will compile with both Go 1.15 and Go 1.16 and have different semantics (due to package embed, at least).

I think an appropriate (partial) fix here would be for cmd/compile to stop accepting unknown //go: directives. For example, another limitation of the current proposal relative to Go builtin syntax is that if you mistype the //go:embed directive (say //go:enbed, //go;embed, or // go:embed with a space), it will also silently be ignored. (Whereas typos like enbed.Bytes("foo.txt") would cause a type-checking error, even with unmodified go/types.)

mdempsky on 18 Dec 2020

Great point about non-validated go: directives. If that was enforced that would help alleviate hard to spot typo errors.

Another thought I had just now is that my tooling is set up to add/remove imports automatically as needed. If my tooling is out of date, do I have to fight it from removing the "unused" embed import? I realized it's solved if I use the embed.String etc, but using regular []byte] and string is supposed to be completely valid. This could be frustrating to a new gopher who is cherry-picking web snippets to get something to work if they didn't see the specific embed.* aliases.

kushieda-minori on 18 Dec 2020

but using regular []byte] and string is supposed to be completely valid.

They won't be if #43217 is accepted. I recommend reading up on #43216 and #43217. They've both received overwhelmingly positive support so far, and seem very likely to be accepted to me. (I am not on the proposal-review committee though.)

mdempsky on 18 Dec 2020

Thanks, when I read through #43217 the first time, I missed the key word
"have" for using the embed.* types.

I think my only concern left then is the last one you pointed out.

On Thu, Dec 17, 2020, 20:24 Matthew Dempsky notifications@github.com
wrote:

but using regular []byte] and string is supposed to be completely valid.

They won't be if #43217 https://github.com/golang/go/issues/43217 is
accepted. I recommend reading up on #43216
https://github.com/golang/go/issues/43216 and #43217
https://github.com/golang/go/issues/43217. They've both received
overwhelmingly positive support so far, and seem very likely to be accepted
to me. (I am not on the proposal-review committee though.)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/golang/go/issues/41191#issuecomment-747808153, or
unsubscribe
https://github.com/notifications/unsubscribe-auth/ADLZABJWBJX475BVYDVD6ODSVKVOTANCNFSM4QTHVTUA
.

kushieda-minori on 18 Dec 2020

Go: embed, cmd/go: add support for embedded files

Most helpful comment

All 114 comments

Allowed types

Semantics of `[]byte`-typed variables.

Avoiding more `//go:` directives

Responses

Related issues

Go: embed, cmd/go: add support for embedded files

Most helpful comment

All 114 comments

Allowed types

Semantics of []byte-typed variables.

Avoiding more //go: directives

Responses

Related issues

Semantics of `[]byte`-typed variables.

Avoiding more `//go:` directives