Installation is noticeably slow. I believe there is redundant disk work during installation, associated with the transaction machinery.
I don't specifically know what is inefficient here, but somebody could do some stracing on Linux and see if there's anything obvious.
stat appears to get called 3 times for each file in set_file_perms:
needs_x https://github.com/rust-lang/rustup.rs/blob/67d740fca643f07b38618f42a28757fa988b5ba7/src/rustup-dist/src/component/package.rs#L131utils::is_directory https://github.com/rust-lang/rustup.rs/blob/67d740fca643f07b38618f42a28757fa988b5ba7/src/rustup-dist/src/component/package.rs#L141This could probably be reduced to just one stat call.
Given that the majority of filesystem operations which Rustup performs are reading and writing large (10s of MBs) datasets, I think the stat() inefficiency is by no means the most important thing to consider for speed.
I do, however, think that we should look at consider if there's anything we can do to improve install performance on Windows.
Installation on Linux filesystems is fast even on HDD so measuring performance there won't help much for Windows but still it'd be nice to make it even faster.
I don't know which one is more limiting NTFS or AV software but shipping less files would be the way to go here.
Nightly on my Linux machine consists of 17957 files, 16617 of them docs. It'd be significant improvement for Windows if they could be fused together into few bigger files.
If I understand correctly, the set_file_perms function walks the entire tree of files after they have been extracted to ensure their permissions are correct. This is especially costly in the docs, since there are a tremendous number of files there as @mati865 points out. I would expect tens of thousands of fewer system calls in an installation operation to be noticeable, especially e.g. on NFS.
I've done some basic analysis on Windows 10 Surface; since OS specific code paths can make Linux traces (while convenient to grab) misleading :). My test setup had Windows Defender running normally; controlled folder access was off, no defender exclusions defined.
The high level view is that for docs we spend:
2m20s extracting ~16K files
30 seconds copying from staging to final dir
26 seconds deleting from staging dir
sidebar: We are quite chatty in the OS syscalls we make - many extra opens than needed, but looking at the timing, that should be considered a microoptimisation, not a first-pass target. See the trace excerpts below for examples.
What should we be able to expect without getting into hairy levels of OS specific integration?
A trivial Python program can write a similar sized data set in 60seconds, and the archive should be in hot cache, and is tiny compared to RAM these days so - that should be a decent target.
Python benchmark
import os
import shutil
import time
now = time.time()
os.mkdir("root")
os.chdir("root")
written = 0
for dirname in range(80):
os.mkdir("%s" % dirname)
for fname in range(200):
with open("%s/%s"%(dirname, fname), "wt") as f:
written += f.write(("this is a small file %s\n" % fname)*512)
finish = time.time()
os.chdir("..")
shutil.rmtree("root")
print("created 16000 files in 80 directories in %0.2fs with %s bytes" % (finish-now, written))
Total time right now is 3m20s, of which 60 seconds is entirely dealing with making an unused copy of the content: the staging process is:
So we can shave a minute off if we unpack directly into the final directory. Would that be safe? If the archive has been validated during download, so we know its contents are intact, then there are no additional IO errors that can occur unpacking directly:
Alternatively, we could unpack to a staging directory but then rename into the final directory - that would require double-touching each file, but ensure we could fit all the files on disk before placing any of them into the target location (if thats important - I don't think it is :)).
That would still leave 2m20s to unpack vs 60s for a trivial driver; some of that may be decompression overhead, IO from the source archive etc- I don't have a breakdown for that yet.
Trace excerpts
7:23:52.10: extraction starts
7:24:52.8898672 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc SUCCESS Desired Access: Read Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
7:24:52.8898941 AM rustup.exe 4316 QueryInformationVolume C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc BUFFER OVERFLOW VolumeCreationTime: 15/04/2017 2:56:45 PM, VolumeSerialNumber: 92EB-4520, SupportsObjects: True, VolumeLabel: Loc艢
7:24:52.8899057 AM rustup.exe 4316 QueryAllInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc BUFFER OVERFLOW CreationTime: 9/04/2019 7:23:59 AM, LastAccessTime: 9/04/2019 7:24:31 AM, LastWriteTime: 9/04/2019 7:24:31 AM, ChangeTime: 9/04/2019 7:24:31 AM, FileAttributes: D, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: True, IndexNumber: 0x23000000062484, EaSize: 0, Access: Read Attributes, Synchronize, Position: 0, Mode: Synchronous IO Non-Alert, AlignmentRequirement: Long
7:24:52.8899204 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc SUCCESS
7:24:52.8900168 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html NAME NOT FOUND Desired Access: Read Attributes, Delete, Disposition: Open, Options: Non-Directory File, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
7:24:52.8901162 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Generic Write, Read Attributes, Disposition: OverwriteIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: 0, OpenResult: Created
7:24:52.8903866 AM rustup.exe 4316 WriteFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Offset: 0, Length: 8,192, Priority: Normal
7:24:52.8905696 AM rustup.exe 4316 WriteFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Offset: 8,192, Length: 7,309, Priority: Normal
7:24:52.8906167 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS
7:24:52.8973734 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Generic Write, Read Attributes, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
7:24:52.8974230 AM rustup.exe 4316 SetBasicInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 1/01/1601 12:00:00 PM, LastAccessTime: 8/04/2019 7:34:21 AM, LastWriteTime: 8/04/2019 7:34:21 AM, ChangeTime: 1/01/1601 12:00:00 PM, FileAttributes: n/a
7:24:52.8974922 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS
7:24:52.8976601 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Read Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
7:24:52.8976873 AM rustup.exe 4316 QueryInformationVolume C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html BUFFER OVERFLOW VolumeCreationTime: 15/04/2017 2:56:45 PM, VolumeSerialNumber: 92EB-4520, SupportsObjects: True, VolumeLabel: Loc艢
7:24:52.8977007 AM rustup.exe 4316 QueryAllInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html BUFFER OVERFLOW CreationTime: 9/04/2019 7:24:52 AM, LastAccessTime: 8/04/2019 7:34:21 AM, LastWriteTime: 8/04/2019 7:34:21 AM, ChangeTime: 9/04/2019 7:24:52 AM, FileAttributes: A, AllocationSize: 16,384, EndOfFile: 15,501, NumberOfLinks: 1, DeletePending: False, Directory: False, IndexNumber: 0x14000000064f02, EaSize: 0, Access: Read Attributes, Synchronize, Position: 0, Mode: Synchronous IO Non-Alert, AlignmentRequirement: Long
7:24:52.8977145 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS
7:24:52.8978130 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Write Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
7:24:52.8978458 AM rustup.exe 4316 SetBasicInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 1/01/1601 12:00:00 PM, LastAccessTime: 1/01/1601 12:00:00 PM, LastWriteTime: 1/01/1601 12:00:00 PM, ChangeTime: 1/01/1601 12:00:00 PM, FileAttributes: AN
7:24:52.8978630 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS
7:26:12.18 extraction finishes
-> 30 seconds of copy from staging dir to final dir:
7:26:12.1927218 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc SUCCESS Desired Access: Read Data/List Directory, Synchronize, Disposition: Create, Options: Directory, Synchronous IO Non-Alert, Open Reparse Point, Attributes: N, ShareMode: Read, Write, AllocationSize: 0, OpenResult: Created
7:26:12.1928832 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc SUCCESS
7:26:12.1930491 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc SUCCESS Desired Access: Read Data/List Directory, Synchronize, Disposition: Open, Options: Directory, Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
7:26:12.1930788 AM rustup.exe 4316 QueryDirectory C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\* SUCCESS Filter: *, 1: .
7:26:12.1931110 AM rustup.exe 4316 QueryDirectory C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc SUCCESS 0: .., 1: all.html, 2: alloc, 3: borrow, 4: boxed, 5: collections, 6: fmt, 7: format.m.html, 8: index.html, 9: macro.format!.html, 10: macro.format.html, 11: macro.vec!.html, 12: macro.vec.html, 13: prelude, 14: rc, 15: sidebar-items.js, 16: slice, 17: str, 18: string, 19: sync, 20: vec, 21: vec.m.html
7:26:12.1932216 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Generic Read, Disposition: Open, Options: Sequential Access, Synchronous IO Non-Alert, Non-Directory File, Open Reparse Point, Attributes: n/a, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened
7:26:12.1932645 AM rustup.exe 4316 QueryAttributeTagFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Attributes: A, ReparseTag: 0x0
7:26:12.1932783 AM rustup.exe 4316 QueryStandardInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS AllocationSize: 16,384, EndOfFile: 15,501, NumberOfLinks: 1, DeletePending: False, Directory: False
7:26:12.1932892 AM rustup.exe 4316 QueryBasicInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 9/04/2019 7:24:52 AM, LastAccessTime: 8/04/2019 7:34:21 AM, LastWriteTime: 8/04/2019 7:34:21 AM, ChangeTime: 9/04/2019 7:24:52 AM, FileAttributes: A
7:26:12.1933018 AM rustup.exe 4316 QueryStreamInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS 0: ::$DATA
7:26:12.1933412 AM rustup.exe 4316 QueryBasicInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 9/04/2019 7:24:52 AM, LastAccessTime: 8/04/2019 7:34:21 AM, LastWriteTime: 8/04/2019 7:34:21 AM, ChangeTime: 9/04/2019 7:24:52 AM, FileAttributes: A
7:26:12.1933611 AM rustup.exe 4316 QueryEaInformationFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS EaSize: 0
7:26:12.1934607 AM rustup.exe 4316 CreateFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS Desired Access: Generic Read/Write, Delete, Write DAC, Disposition: OverwriteIf, Options: Sequential Access, Synchronous IO Non-Alert, Non-Directory File, Attributes: A, ShareMode: None, AllocationSize: 15,501, OpenResult: Created
7:26:12.1936583 AM rustup.exe 4316 QueryAttributeInformationVolume C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS FileSystemAttributes: Case Preserved, Case Sensitive, Unicode, ACLs, Compression, Named Streams, EFS, Object IDs, Reparse Points, Sparse Files, Quotas, Transactions, 0x3c00600, MaximumComponentNameLength: 255, FileSystemName: NTFS
7:26:12.1936750 AM rustup.exe 4316 QueryBasicInformationFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 9/04/2019 7:26:12 AM, LastAccessTime: 9/04/2019 7:26:12 AM, LastWriteTime: 9/04/2019 7:26:12 AM, ChangeTime: 9/04/2019 7:26:12 AM, FileAttributes: A
7:26:12.1937029 AM rustup.exe 4316 QueryAttributeInformationVolume C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS FileSystemAttributes: Case Preserved, Case Sensitive, Unicode, ACLs, Compression, Named Streams, EFS, Object IDs, Reparse Points, Sparse Files, Quotas, Transactions, 0x3c00600, MaximumComponentNameLength: 255, FileSystemName: NTFS
7:26:12.1937204 AM rustup.exe 4316 QueryRemoteProtocolInformation C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html INVALID PARAMETER
7:26:12.1937353 AM rustup.exe 4316 QuerySecurityFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Information: Attribute
7:26:12.1937534 AM rustup.exe 4316 SetEndOfFileInformationFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS EndOfFile: 15,501
7:26:12.1938157 AM rustup.exe 4316 ReadFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS Offset: 0, Length: 15,501, Priority: Normal
7:26:12.1938545 AM rustup.exe 4316 WriteFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS Offset: 0, Length: 15,501, Priority: Normal
7:26:12.1939425 AM rustup.exe 4316 SetBasicInformationFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS CreationTime: 1/01/1601 12:00:00 PM, LastAccessTime: 1/01/1601 12:00:00 PM, LastWriteTime: 8/04/2019 7:34:21 AM, ChangeTime: 9/04/2019 7:24:52 AM, FileAttributes: n/a
7:26:12.1939951 AM rustup.exe 4316 QueryRemoteProtocolInformation C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html INVALID PARAMETER
7:26:12.1940088 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\share\doc\rust\html\alloc\all.html SUCCESS
7:26:12.1949163 AM rustup.exe 4316 CloseFile C:\Users\robertc\.rustup\tmp\mvrji9wpubejlhg2_dir\rust-docs\share\doc\rust\html\alloc\all.html SUCCESS
7:26:45.95 deletes start
7:27:11.35 deletes finish
-> 26 seconds deleting the staged package
(I will happily mail a tarball of the full trace in PMLformat to anyone that wants it)
Codewise this is layered like so:
Current performance quick metrics:
With defender on:
my perf script above: 1m8s
rustup master release build: 3m26
stat-per-dir rather than stat-per-file: 3m14s
8M IO buffer on the tar: 3m14s
less stats + buffer - 3m11
With a defender exclusion on ~/.rustup:
my perf script: 24s
rustup master release build: 1m6s
stat-per-dir: 1m3
8M IO buffer on the tar: 50s
less stats + buffer - 36s
So there is still a lot of fat we should be able to cut; I have some experiments (see above) but nothing that is as definitive as halving the IO as my first patch did :). The benchmarks are pretty noisy too :/.
A variable I have to isolate yet is the windows search index; that is also possibly double / triple working (once on initial write, once on the rename, and then removing from the index at the rename step as well). Setting the indexing service to pause reduces load while testing, but I haven't run a set of benchmarks with it enabled/off : and when it is enabled it dynamically throttles, so enables is not always enabled, adding complexity in generating robust numbers.
tl;dr: there is hope for more improvements; a lot of syscall noise still remains though changes deeper in e.g. tar and possibly rust itself will be needed to fix that (e.g. setting the mode on a file in windows should be done while it is open - chmod is sugar for fchmod).
Users wanting things to be as fast as possible today should do the following IMO:
I think this ticket should be closed: we're doing create, write, close now for each file, no extra copies. We do do the move-into-place thing but as docs is moved as a top level dir, its a single syscall.
I concur, thank you @rbtcollins