Go: x/build/cmd/release: reduce Go download sizes

Created on 22 Aug 2018 · 26 Comments · Source: golang/go

Go 1's source download was 11 MB.
Go 1.11rc1's source download is 20 MB.
Go 1.4.3's Linux download was 58 MB.
Go 1.11rc1's Linux download is 169 MB.

Clearly 169 MB is pretty ridiculous.

https://go-review.googlesource.com/c/build/+/129955 removed some useless *.a files to save 20 MB.

https://go-review.googlesource.com/c/build/+/130575 removed the blog to save 34MB.

We're now down to 116 MB for what will be the Go 1.11rc2 Linux download. Better, but not great.

This is a tracking bug for further improvements.

/cc @ianlancetaylor @andybons @dmitshur

Builders NeedsFix

All 26 comments

Untarring the release and gzip-compressing each file before doing du stats gives a reasonable way to measure the cost of removing bits:

$ find . -type f -not -name '*.gz' -exec gzip {} \;
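The command above rewrites every file in place. If you would rather leave the tree untouched, something like the following gives the same kind of number. This is only a sketch: `gz_size` is a made-up helper name, and it assumes `gzip`, `find`, `wc`, and `awk` are on the PATH.

```shell
# gz_size DIR: print the total size in bytes that the regular files under DIR
# would have if each were gzip-compressed separately. Nothing is modified.
# gz_size is a hypothetical helper, not part of any Go tooling.
gz_size() {
  find "$1" -type f -exec sh -c '
    for f in "$@"; do
      gzip -c "$f" | wc -c      # compressed size of this one file
    done
  ' sh {} + |
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

For example, `gz_size ./go/pkg` and `gz_size ./go/src` can be compared directly without a separate gzip pass over the release.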

Here are the top directories:

```
$ du -b -a | sort -n -r | grep -v gz$ | head -20
127731439 .
127727343 ./go
83486614 ./go/pkg
54847292 ./go/pkg/tool
54843196 ./go/pkg/tool/linux_amd64
21066018 ./go/src
16095151 ./go/bin
15354919 ./go/pkg/linux_amd64_race
13273773 ./go/pkg/linux_amd64
8094311 ./go/src/cmd
3127907 ./go/src/cmd/vendor
2732975 ./go/doc
2138255 ./go/src/runtime
2102037 ./go/doc/gopher
2051880 ./go/pkg/linux_amd64_race/net
1933218 ./go/pkg/linux_amd64_race/go
1773043 ./go/misc
1752589 ./go/src/cmd/vendor/golang.org
1750311 ./go/pkg/linux_amd64/net
1748493 ./go/src/cmd/vendor/golang.org/x
```

  • It turns out that the `go/pkg/linux_amd64_race` directory is not required. Go 1.11 will rebuild that stuff as needed. I verified with a Dockerfile (https://play.golang.org/p/O-IW_vVnol_g). That'll save us 15 MB.

  • Do we really need 2 MB of Gopher images? We need some for the local godoc perhaps.

Biggest binaries:

```
$ du -b -a | sort -n -r | grep gz$ | head -20
8881040 ./go/pkg/tool/linux_amd64/compile.gz
7957541 ./go/bin/godoc.gz
6467215 ./go/pkg/tool/linux_amd64/pprof.gz
6435168 ./go/bin/go.gz
5961018 ./go/pkg/tool/linux_amd64/tour.gz
5183556 ./go/pkg/tool/linux_amd64/trace.gz
3756217 ./go/pkg/tool/linux_amd64/vet.gz
2739853 ./go/pkg/tool/linux_amd64/link.gz
2451059 ./go/pkg/tool/linux_amd64/cover.gz
2257606 ./go/pkg/tool/linux_amd64/cgo.gz
2187914 ./go/pkg/tool/linux_amd64/doc.gz
2145237 ./go/pkg/tool/linux_amd64/asm.gz
2060780 ./go/pkg/tool/linux_amd64/objdump.gz
1863347 ./go/pkg/tool/linux_amd64/addr2line.gz
1842210 ./go/pkg/tool/linux_amd64/nm.gz
1739594 ./go/pkg/tool/linux_amd64/dist.gz
1698346 ./go/bin/gofmt.gz
1608565 ./go/pkg/tool/linux_amd64/fix.gz
1324476 ./go/pkg/tool/linux_amd64/test2json.gz
1295801 ./go/pkg/tool/linux_amd64/buildid.gz
```

  • Do we really need to include the tour?

  • Can any binaries be combined into one binary and have different behavior based on invocation name, busybox style?

We have 6.6 MB of testdata:

```
$ du -b -c $(find -name testdata) | tail -1
6664279 total
```

  • Do we need tests to pass for people who download binary distributions?

  • Perhaps we should have a full download and a minimal download? (Minimal could lack godoc, gopher images, tests, test data, prebuilt *.a files, etc)
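On the busybox-style question raised above: the idea is that one file is installed under several names and picks its behavior from the name it was invoked as. A toy sketch of the dispatch follows. The `multicall` function, the tool names chosen, and the `echo` bodies are all placeholders for illustration; this is not how the Go tools are built today.

```shell
# multicall NAME [ARGS...]: choose a behavior from the name the program was
# invoked under, busybox style. Everything here is a hypothetical stand-in.
multicall() {
  name=$(basename "$1")   # strip any directory part of the invocation name
  shift
  case "$name" in
    nm)        echo "nm: would list symbols in: $*" ;;
    addr2line) echo "addr2line: would map addresses in: $*" ;;
    objdump)   echo "objdump: would disassemble: $*" ;;
    *)         echo "unknown tool: $name" >&2; return 2 ;;
  esac
}

# In a real multi-call script this would be the last line, with nm,
# addr2line and objdump installed as symlinks to the same file:
#   multicall "$0" "$@"
```

The saving would come from the shared code (runtime, standard library) being linked once instead of once per tool; how much that amounts to for the binaries listed above has not been measured here.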

We ship three copies of Mark Twain's Tom Sawyer?

$ ls -l $(find . | grep -i twain)
-rw-r--r-- 1 bradfitz bradfitz 118509 Nov 22  2017 ./compress/bzip2/testdata/Mark.Twain-Tom.Sawyer.txt.bz2
-rw-r--r-- 1 bradfitz bradfitz 387851 Nov 22  2017 ./compress/testdata/Mark.Twain-Tom.Sawyer.txt
-rw-r--r-- 1 bradfitz bradfitz 387851 Nov 22  2017 ./net/testdata/Mark.Twain-Tom.Sawyer.txt
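Rather than grepping for one known file name, duplicates like these can be looked for by content. A rough sketch, assuming GNU coreutils (`sha256sum`) is available; `dup_files` is a hypothetical helper name.

```shell
# dup_files DIR: print the paths of files under DIR whose contents are
# byte-for-byte identical to at least one other file under DIR.
# Files with the same hash come out next to each other.
dup_files() {
  find "$1" -type f -exec sha256sum {} + |
    sort |
    awk '
      # Each input line is: 64 hex digits, two separator characters, the path.
      { hash = $1; path = substr($0, 67) }
      hash == prev { if (!shown) print first; print path; shown = 1; next }
      { prev = hash; first = path; shown = 0 }
    '
}
```

Run against the source tree, this would flag the two uncompressed `.txt` copies; the `.bz2` copy has different bytes, so a content hash does not catch it.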

About the pkg files that get rebuilt, a common issue in the past has been users not having write permissions on the system GOROOT, hence breaking or rebuilding every time. Will that be a problem?

@FiloSottile, that hasn't been a problem since at least Go 1.10. It writes to the build cache instead. You can experiment with the Dockerfile linked above.

> Do we really need to include the tour?

Currently we ship the tour so that one can run `go tool tour` to use it locally. Alas, that doesn't work if you built from source or installed Go using the OS package manager, so we also have to explain in the welcome that if `go tool tour` doesn't work you have to `go get` it from the x repo.

This is all quite convoluted, and I've seen reports of people confused by it. If we remove the tour from the binary releases we will also be able to simplify our instructions for running it locally: it's always "`go get` it, and then run the `gotour` command".

In that case I would like to remove the tour, maybe even for Go 1.11.

/cc @katiehockman

Change https://golang.org/cl/131156 mentions this issue: cmd/release: remove the tour from the releases

Change https://golang.org/cl/135495 mentions this issue: compress: reduce the testing copies of the decompressed book from two

Change https://golang.org/cl/138495 mentions this issue: compress: move benchmark text from src/ to src/compress/testdata

Change https://golang.org/cl/138737 mentions this issue: Revert "compress: move benchmark text from src/testdata to src/compress/testdata"

Change https://golang.org/cl/143537 mentions this issue: cmd/release: fix tour after rename, do less tour work for Go 1.12+ releases

Now that the tour is removed from cmd/release (for Go 1.12), and CLI support is removed from godoc so it's only a webserver, is there any value in shipping the godoc webserver with releases?

I think we should remove it and trim the size of releases further. Anybody who wants to read the docs offline can `go get golang.org/x/tools/cmd/godoc` first.

/cc @andybons @katiehockman @ianlancetaylor @FiloSottile @bcmills

(Braindump about whether we make "lite" releases.... I'm leaning towards no)

We currently ship 15 downloads with releases at https://golang.org/dl/ ....

  • 1 source tarball
  • 3 installers (Mac amd64, Windows 32-bit, Windows 64-bit)
  • 11 binary archives (zip or tar.gz)

The binary archives take about 112MB compressed, 329MB extracted (linux-amd64 sizes). People sometimes complain that these are too large, and they've been growing over time. (See meta bug #27151)

One thing we could do is ship "lite" versions without:

  • any test data (saves 31MB extracted)
  • any pre-built *.a files (saves 89MB extracted)
  • docs & gopher images (4.5MB)
  • godoc (16MB)

Maybe also:

  • the trace viewer (~15 MB)
  • pprof (~13 MB)
  • objdump (~4.4 MB)
  • the "fix" tool (3.2 MB)
  • api/* version history (6.5 MB)
  • zoneinfo
  • etc

It looks like we could halve the sizes of the downloads. (compressed & on-disk)
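As a sanity check on "halve", the figures from the two lists can simply be added up against the 329 MB extracted size. This is a back-of-the-envelope sketch: it assumes every figure is an on-disk size and that the items do not overlap, neither of which is stated above, and `pct_saved` is a made-up helper name.

```shell
# pct_saved TOTAL SAVING...: add up the savings and print them as a
# percentage of TOTAL, to one decimal place. All figures are in MB.
pct_saved() {
  total=$1
  shift
  printf '%s\n' "$@" | awk -v total="$total" '
    { sum += $1 }
    END { printf "%.1f\n", 100 * sum / total }
  '
}

# First list only: test data, *.a files, docs and gopher images, godoc.
pct_saved 329 31 89 4.5 16
# Both lists, leaving out zoneinfo and "etc", which have no figure above.
pct_saved 329 31 89 4.5 16 15 13 4.4 3.2 6.5
```

Under those assumptions the first list alone comes to roughly 43% of the extracted size, and both lists together to roughly 55%, which is in line with the "halve" estimate for on-disk size. It says nothing about the compressed download, where the items compress at different ratios.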

The question is whether we want to.

If we had "lite" versions, who is their intended audience?

If it's CI users doing builds in loops rather than humans, then one might counter argue that they'd benefit from pre-built *.a files (for faster builds) and the bandwidth isn't a concern for a cloud-connected CI system.

If it's for people building Docker images, one could counter argue that multi-stage Dockerfiles solve the problem: you should build your application in some heavy stage, then copy your final binary to a small image.

Go for it. I have no issue as long as there's a path to having offline docs (which one can get through `go doc`).

Change https://golang.org/cl/144281 mentions this issue: cmd/release: don't ship race detector *.syso for other platforms

Per https://github.com/golang/go/issues/29713, does it make sense to omit the test/ subdir for Go distributions? As far as I can tell, it's not needed for, e.g., a Linux distribution packaging Go.

@darkfeline, if there are some users who don't want to be able to run tests, yes. See discussion above about "lite" builds and who their audience might be.
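For a packager who does want to drop those pieces, repacking an extracted release is enough; nothing in cmd/release has to change. A sketch, assuming GNU tar's exclude semantics (patterns unanchored, an excluded directory is skipped with its contents). `make_lite` is a hypothetical name, and the exclusion list is just the items discussed in this thread, not an official "lite" definition.

```shell
# make_lite SRC_PARENT OUT: archive SRC_PARENT/go into OUT (a .tar.gz),
# leaving out testdata directories, the top-level test/ directory,
# and prebuilt *.a files. Assumes GNU tar.
make_lite() {
  tar -czf "$2" \
    --exclude='testdata' \
    --exclude='go/test' \
    --exclude='*.a' \
    -C "$1" go
}
```

For example, after `tar -xzf go1.12.4.linux-amd64.tar.gz -C /tmp/full`, running `make_lite /tmp/full /tmp/go-lite.tar.gz` would produce the trimmed archive. Tests that rely on the removed files will not pass from such a tree.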

The downstream bug report should be fixed.
https://git.archlinux.org/svntogit/community.git/commit/trunk/PKGBUILD?h=packages/go&id=dec13937b5b2df24e8015341caa6fada8a2d8957

https://bugs.archlinux.org/task/61585?dev=25983

We don't close Arch or other downstream bug reports. They can do whatever they'd like. They can ask for help if they need it or have questions, though.

> Now that the tour is removed from cmd/release (for Go 1.12), and CLI support is removed from godoc so it's only a webserver, is there any value in shipping the godoc webserver with releases?
>
> I think we should remove it and trim the size of releases further. Anybody who wants to read the docs offline can `go get golang.org/x/tools/cmd/godoc` first.

This sounds reasonable to me. There's a minor anticipated benefit of doing this. I expect that godoc will gain module support before the next major release, so when people run `go get -u golang.org/x/tools/cmd/godoc` to install it, there's less of a chance of the older godoc in GOROOT/bin interfering.

@bradfitz This hasn't happened yet. Is it too late for 1.12?

@dmitshur, yeah, too late. It would require doc/go1.12.html changes, etc. But I filed #30029 for that for Go 1.13.

@bradfitz I commented on the wrong issue :/ sorry about that.

Update to first comment:

Go 1.12.4 linux-amd64: 122 MB
Go 1.12.4 source: 21 MB

Change https://golang.org/cl/174322 mentions this issue: cmd/release: don't include godoc in releases in Go 1.13+

Why not offer downloads in more formats, for example xz?
It produces smaller files, though it takes more time to compress.

tar.bz created with the Linux command `tar -jcf`
tar.xz created with the Linux command `tar -Jcf`
tar.gz created with the Linux command `tar -zcf`

-rw-rw-r--  1 root root 107M Sep 16 21:26 go1.13.linux-amd64.tar.bz
-rw-rw-r--  1 root root 115M Sep 16 21:13 go1.13.linux-amd64.tar.gz
-rw-rw-r--  1 root root  84M Sep 16 21:18 go1.13.linux-amd64.tar.xz
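
Going by the listing above, the xz archive is roughly a quarter smaller than the gzip one (84M against 115M). To reproduce that kind of comparison on any tarball, something like this works as a sketch. `compare_formats` is a made-up helper; it uses each tool's default compression level and skips tools that are not installed.

```shell
# compare_formats FILE: print "TOOL BYTES" for FILE compressed with each of
# gzip, bzip2 and xz at default settings. FILE itself is left untouched.
compare_formats() {
  for tool in gzip bzip2 xz; do
    # Skip any compressor that is not installed on this machine.
    command -v "$tool" >/dev/null 2>&1 || continue
    printf '%s %s\n' "$tool" "$("$tool" -c "$1" | wc -c | tr -d ' ')"
  done
}
```

Feed it the uncompressed tar, for example `gzip -dc go1.13.linux-amd64.tar.gz > go.tar` followed by `compare_formats go.tar`. Timing each tool with `time` would show the compression-time cost mentioned above.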