This is an umbrella issue for making the Roslyn compilers deterministic. See also Open Issues for Determinism.
There are a few issues:
The GetHashCode method of anonymous types is not deterministic.
@gafter Would you be so kind as to explain why making the compilers deterministic is important? You are probably envisioning scenarios I'm not aware of. I'm just curious :)
Not that I would claim to know the mind of @gafter, but consider a continuous build system which uses the output of one stage to know whether or not it has to recompile other source. If the same input always produces the exact same output, you can easily tell what binaries have _actually_ been affected by a change. If the binaries will always change anyway (because they include timestamps, random GUIDs etc) then you can't do this.
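To make that concrete, here is a minimal sketch of how a build system could use output digests to decide what has actually changed. The function names and the idea of keeping a digest per output file are assumptions for illustration, not how any particular build system does it.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a build output's bytes."""
    return hashlib.sha256(data).hexdigest()


def changed_outputs(previous: Dict[str, str], current: Dict[str, bytes]) -> List[str]:
    """Names of outputs whose digest differs from the one recorded last build.

    `previous` maps output name -> digest recorded after the last build
    (hypothetical bookkeeping); `current` maps output name -> fresh bytes.
    """
    return sorted(
        name for name, data in current.items()
        if previous.get(name) != digest(data)
    )
```

With a deterministic compiler, an unchanged input yields an unchanged digest, so only the outputs listed by changed_outputs need to trigger downstream work. With embedded timestamps or random GUIDs, every output would show up as changed on every build.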
Another important reason is that with a deterministic or reproducible build you can now verify that the binary you got from somewhere (e.g. NuGet) is really built from the source code you have access to and wasn't modified/tampered with.
Debian is doing a push to move most of its packages to be reproducible: https://wiki.debian.org/ReproducibleBuilds
The Tor project also put a lot of time into this: https://blog.torproject.org/blog/deterministic-builds-part-one-cyberwar-and-global-compromise
Thanks for sharing, greatly appreciated!
Yeah, what @jskeet and @akoeplinger said.
Just my 2 cents - based on the checklist, the msbuild flags /deterministic and /pathmap look like they are already implemented, but they are not documented anywhere. Any chance of fixing that?
https://github.com/MicrosoftDocs/visualstudio-docs/issues/361
Thanks @martinsuchan for pointing out those issues.
Documenting /pathmap is tracked by https://github.com/dotnet/docs/issues/1800 (do chime in there to voice your interest).
I filed another documentation issue just now for /deterministic at https://github.com/dotnet/docs/issues/3828
Why is the /deterministic flag optional in many compilers? Is there a penalty or cost attached that can be avoided by running csc without /deterministic? Otherwise, if it is all about bringing goodness, does it have to be optional, hidden behind a flag?
@kasper3 I think the only reason why determinism isn't the default is to avoid surprising customers, who have been using the compiler (without determinism) for a long time.
Deterministic assemblies have strange timestamps, which would be surprising to many if the default was changed. Also, determinism prohibits the use of wildcards in the assembly version.
That's why customers have to make an explicit choice to turn determinism on.
@kasper3 @jcouv BTW, /deterministic is the default for .NET Core SDK projects, so over time, it should be the default for more and more new projects.
@svick Good point.
Filed https://github.com/dotnet/project-system/issues/3438 to update the desktop templates (so at least newly created projects can use determinism).
@jcouv, the still-unchecked items in the first post can be revisited. Some of them are closed issues (some are discussion-going-nowhere open issues).
Wondering, are there any plans to make UWP app builds deterministic by default? Timestamps and wildcards in assembly versions are not really important when publishing apps to Store.
Last time I was testing deterministic builds for UWP apps it was not possible to produce identical packages because WinMDExp and MakePri.exe tools from the Windows SDK build pipeline don't support producing deterministic outputs yet.
Also since Microsoft Store uses differential updates for downloading new packages, having deterministic builds by default could save tons of Store traffic basically for free.
Tagging @MichalStrehovsky @sergiy-k for UWP question.
This came up in the context of .NET Native in the past (#23456) but I couldn't find the owners of the tools in question. Maybe @tarekgh would know for MakePri?
@axelandrejs can help answer the MakePri question.
MakePri is expected to produce identical output for identical input. Now, it is possible that some of the inputs change unexpectedly, since MakePri parses the file system. If you have two files that were produced over the same set of inputs but are not identical, can you share the files?
Note that if you are implementing your own build pipeline, we now have a programmatic equivalent to MakePri. See https://msdn.microsoft.com/en-us/library/windows/desktop/mt845690(v=vs.85).aspx . It avoids dependencies on the file system and as such takes out some of the potential non-determinism. Also, it's faster. We are developing a tool that supports specifying all the data needed to build a UWP resources file via a relatively simple XML, in case that would help.
Does anyone know if resgen.exe is deterministic?
@axelandrejs @tarekgh Reviving the discussion: neither MakePri nor WinMDExp produces deterministic outputs. I just checked again in VS 15.7.4. There is a simple repro solution available in #23456. Just build the solution twice in the same clean folder and extract all appxupload/appx files in it.
The expected result is identical content of both folders, the reality is differences in resources.pri and Winrt.Component.winmd.
Any chance to find the product owners of those tools and make them deterministic by default?
Files with different checksums:
AppxBlockMap.xml
AppxSignature.p7x
ReproTest_1.0.0.0_x86.appx
AppxMetadata\AppxBundleManifest.xml
ReproTest_1.0.0.0_x86\AppxBlockMap.xml
ReproTest_1.0.0.0_x86\AppxSignature.p7x
ReproTest_1.0.0.0_x86\resources.pri
ReproTest_1.0.0.0_x86\Winrt.Component.winmd
ReproTest_1.0.0.0_x86\AppxMetadata\CodeIntegrity.cat
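For anyone who wants to reproduce a list like this, here is a rough sketch of comparing two extracted build folders by checksum. The folder layout and function names are assumptions. It only hashes files on disk and knows nothing about the appx format itself.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def tree_digests(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(build1: Path, build2: Path) -> List[str]:
    """Relative paths that are missing on one side or whose contents differ."""
    a, b = tree_digests(build1), tree_digests(build2)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling differing_files on the two extracted folders (for example the hypothetical paths Path("build1") and Path("build2")) returns an empty list when the builds are byte-for-byte identical.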
@axelandrejs may be able to help better here.
The problem is not actually makepri. The problem is the XBF files, specifically App.xbf and MainPage.xbf. They are different between runs.
By default, XAML files are embedded as BLOBs into the PRI files to improve performance. This means that if the XBF files change, you will see the difference in the resources.pri file. You can see for yourself by dumping the contents of the resources.pri file via the following command:
makepri dump /dt detailed /if [the resources.pri file] /of [some .xml file name]
I did this for the two files you had in ReproTest.build1 and ReproTest.build2. You can see the App.xbf contents below.
@LarryOsterman can speak to the .winmd differences.
@jevansaks can speak to the .xbf differences.
WEJGAJYBAABaAAAAAgAAAAEAAAB4AAAAAAAAAHYBAAAAAAAAegEAAAAAAAB+AQAAAAAAAIIBAAAAAAAAhgEAAAAAAAAzNjZCNDlDODVFRjdEOTIyQkRERTQ4RTA2MzY3MUJCMQAICAAzAGEAEiodUrgWAIzg/iADAQAAAAAAAAAAAAAAAwAAADkAAABoAHQAdABwADoALwAvAHMAYwBoAGUAbQBhAHMALgBtAGkAYwByAG8AcwBvAGYAdAAuAGMAbwBtAC8AdwBpAG4AZgB4AC8AMgAwADAANgAvAHgAYQBtAGwALwBwAHIAZQBzAGUAbgB0AGEAdABpAG8AbgAAACwAAABoAHQAdABwADoALwAvAHMAYwBoAGUAbQBhAHMALgBtAGkAYwByAG8AcwBvAGYAdAAuAGMAbwBtAC8AdwBpAG4AZgB4AC8AMgAwADAANgAvAHgAYQBtAGwAAAAPAAAAdQBzAGkAbgBnADoAUgBlAHAAcgBvAFQAZQBzAHQAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAABAAAAAgAAAAEAAAAAAAAATgAAABIAAAAAAAADAQABAAAAeAADAgAFAAAAbABvAGMAYQBsAAsNAAAAUgBlAHAAcgBvAFQAZQBzAHQALgBBAHAAcAAXH4AaSIALQgIAAAAAIQ==
VERSUS
WEJGAJYBAABaAAAAAgAAAAEAAAB4AAAAAAAAAHYBAAAAAAAAegEAAAAAAAB+AQAAAAAAAIIBAAAAAAAAhgEAAAAAAAAzNjZCNDlDODVFRjdEOTIyQkRERTQ4RTA2MzY3MUJCMQAICAAAAAAAHQT6JAASAIpTAHkAcwB0AGUAbQAuAE8AAwAAADkAAABoAHQAdABwADoALwAvAHMAYwBoAGUAbQBhAHMALgBtAGkAYwByAG8AcwBvAGYAdAAuAGMAbwBtAC8AdwBpAG4AZgB4AC8AMgAwADAANgAvAHgAYQBtAGwALwBwAHIAZQBzAGUAbgB0AGEAdABpAG8AbgAAACwAAABoAHQAdABwADoALwAvAHMAYwBoAGUAbQBhAHMALgBtAGkAYwByAG8AcwBvAGYAdAAuAGMAbwBtAC8AdwBpAG4AZgB4AC8AMgAwADAANgAvAHgAYQBtAGwAAAAPAAAAdQBzAGkAbgBnADoAUgBlAHAAcgBvAFQAZQBzAHQAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAABAAAAAgAAAAEAAAAAAAAATgAAABIAAAAAAAADAQABAAAAeAADAgAFAAAAbABvAGMAYQBsAAsNAAAAUgBlAHAAcgBvAFQAZQBzAHQALgBBAHAAcAAXH4AaSIALQgIAAAAAIQ==
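If you want to locate where two such blobs diverge, a small sketch like the following decodes both base64 strings and reports the first differing byte offset. The function name is made up for illustration, and it makes no assumptions about the XBF format.

```python
import base64
from typing import Optional


def first_difference(blob_a: str, blob_b: str) -> Optional[int]:
    """Return the byte offset where two base64-encoded blobs first differ.

    Returns None when the decoded bytes are identical. If one blob is a
    strict prefix of the other, the offset is the length of the shorter one.
    """
    a = base64.b64decode(blob_a)
    b = base64.b64decode(blob_b)
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

Passing the two dumped App.xbf strings to first_difference gives the offset to start inspecting in a hex viewer.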
Added #375 to the list of issues.
@gafter, just as an FYI, decimal parsing should now respect all digits given in netcoreapp3.0 and later.
@tannergooding Thank you. That just confirms that different runtimes do it differently from each other.
Is there decimal parsing code that we could copy or adapt into Roslyn?
@gafter, yes. The logic lives here: https://source.dot.net/#System.Private.CoreLib/shared/System/Number.Parsing.cs,f156a872d71c54fd,references
Most of this is similar to the floating-point parsing code Roslyn already has. That is, TryStringToNumber converts the string into a digit buffer and scale (CoreFX also tracks the sign, but I believe Roslyn handles that separately).
It then converts that into the actual decimal metadata in TryNumberToDecimal, which is also where the rounding and additional digit considerations occur.
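As a rough illustration of that two-step shape (string to digit buffer plus scale, then buffer to decimal value), here is a simplified sketch. It is not the CoreFX code. The function names, the scale convention (number of digits after the decimal point), and the lack of any rounding or range checks are all assumptions made for the example.

```python
from decimal import Decimal
from typing import Tuple


def to_digits_and_scale(text: str) -> Tuple[bool, str, int]:
    """Split a plain literal like "-123.450" into (negative, digits, scale).

    Assumed convention: scale is the count of digits after the decimal
    point, so the magnitude equals int(digits) / 10**scale.
    """
    negative = text.startswith("-")
    body = text.lstrip("+-")
    whole, _, frac = body.partition(".")
    digits = (whole + frac).lstrip("0") or "0"
    return negative, digits, len(frac)


def to_decimal(negative: bool, digits: str, scale: int) -> Decimal:
    """Build the value exactly from the buffer, keeping every digit given."""
    sign = 1 if negative else 0
    return Decimal((sign, tuple(int(c) for c in digits), -scale))
```

The point of the split is that every digit in the source text survives into the buffer, so any rounding decision is made in one place (the second step) rather than depending on how a particular runtime parses the string.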
When I was first added to this issue I wasn't on GitHub. But I am now, and GitHub started notifying me of new comments. :) For the XBF issue, are you still seeing it? I can file an issue for the XAML compiler team to take a look.
@jevansaks I've added repro for deterministic UWP app building here https://github.com/dotnet/roslyn/issues/23456
I'm quite sure it wasn't fixed yet.