Roslyn: Turn on UseCommonOutputDirectory and move to deployment projects for deploying dependencies

Created on 20 Feb 2016 · 12 Comments · Source: dotnet/roslyn

Currently, there are lots of problems in the build as projects copy themselves and their dependencies (other projects, references, NuGet packages, etc) into the same output folder. This leads to races, bad behavior (where two projects might disagree on a version of a dependency) and broken PR builds.

This is what we should do to prevent this:

  • [ ] Product binaries should be deployed into a single folder and deploy only themselves. @agocke has an idea where we should split this based on deliverable, i.e. one folder for VS deliverables, one for NuGet, etc.
  • [ ] We should add test deployment projects that are responsible for deploying NuGet/VS dependencies/product dependencies into a separate folder for test reasons. See https://github.com/dotnet/roslyn/tree/future/src/VisualStudio/ProjectSystem/DeployTestDependencies for an example.
  • [ ] Turn UseCommonOutputDirectory on for all projects (make sure you rip the values from the samples project added in https://github.com/dotnet/roslyn/pull/8964).
  • [ ] Run BuildCop to make sure there are no more double-writes or write-after-reads.
  • [ ] Turn on parallel build on the signed builds
0 - Backlog Area-Infrastructure Feature Request

All 12 comments

@agocke I've added this to infra backlog, we'll tackle anything that you haven't already done around this.

Can you provide a pointer to information on UseCommonOutputDirectory? Not sure what that implies.

UseCommonOutputDirectory is a well-known MSBuild property that basically turns off the default copying of dependencies, including project references, binary references and NuGet references.

Hard to classify it as well-known though given that they have 0 documentation for it :smile:

It seems the only docs I can find are:

"well known" as in MSBuild is aware of it, in contrast to something specific to our build system.

Here's where MSBuild uses it:
http://referencesource.microsoft.com/#MSBuildProperty=UseCommonOutputDirectory

@jaredpar ultimately moved us away from a single output directory. This can be closed.

Is there a reference to the conversation that led you away from using a single output directory?

@taylorjonl No, just lots of pain and various other things. Single output directories work well if you:

  • Have the same set of dependencies across the entire tree
  • Turn on UseCommonOutputDirectory to have projects only copy themselves to the output directory and not dependencies (this prevents the races that occur when two projects attempt to deploy the same binary that they depend on)
  • Have dedicated "deployment" projects that are responsible for deploying your test dependencies to the output directory

You must make sure that _only one project can copy a given binary to the output directory_. For example, if I have two projects that reference JSON.NET, only one of them is allowed to copy to the output directory.
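The "only one deployer per binary" rule can be checked mechanically. What follows is a minimal sketch, not anything from the Roslyn build itself: the function name `find_multiple_deployers`, the `(project, binary)` input shape, and the project names are all hypothetical, and it assumes you can already extract the list of copy operations from your build.

```python
from collections import defaultdict


def find_multiple_deployers(copies):
    """Return {binary: [projects]} for binaries copied by more than one project.

    `copies` is an iterable of (project, binary) pairs, one pair per copy
    into the shared output directory (hypothetical input shape).
    """
    deployers = defaultdict(set)
    for project, binary in copies:
        deployers[binary].add(project)
    # Keep only the binaries that violate the "exactly one deployer" rule.
    return {
        binary: sorted(projects)
        for binary, projects in deployers.items()
        if len(projects) > 1
    }


# Hypothetical example: both projects reference JSON.NET and both copy it.
copies = [
    ("ProjectA", "ProjectA.dll"),
    ("ProjectA", "Newtonsoft.Json.dll"),
    ("ProjectB", "ProjectB.dll"),
    ("ProjectB", "Newtonsoft.Json.dll"),
]
print(find_multiple_deployers(copies))
# {'Newtonsoft.Json.dll': ['ProjectA', 'ProjectB']}
```

An empty result means every binary has a single owner, which is the condition the rules above are trying to guarantee.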

We use a common output directory[1] over on https://github.com/dotnet/roslyn-project-system/ and it works well, but we strictly follow the above rules. Roslyn, however, produces lots of different things with lots of different dependencies (for VS, .NET Core, etc) and it's a big product - so ultimately it doesn't work so well.

[1] Well, really two common output directories: one for "shipping binaries" and one for tests and their dependencies.

"You must make sure that only one project can copy a given binary to the output directory"

This is the key part. At a high level a build system can fit into three categories based on how files are written to the output directory:

  1. Provably correct: a given output file is written to exactly once.
  2. Correct: a given output file is written to more than once but always with the same content.
  3. Incorrect: a given output file is written to more than once with different content.

A build system which actually has the properties of 1 or 2 is fine and will produce predictable output. The problem with 2 though is it's basically impossible to differentiate from 3 without deep and costly inspection of the build. It's much simpler to put the effort into getting to 1 and use tools like structured logger to ensure you stay there going forward.
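To make the three categories concrete, here is a toy classifier. It is only a sketch under stated assumptions: it takes a hypothetical list of `(output_path, content)` writes rather than a real build log, and it is not how the structured logger works. Note that telling category 2 from category 3 requires hashing the content of every write, which hints at why that distinction is costly in a real build.

```python
import hashlib
from collections import defaultdict


def classify_writes(writes):
    """Map each output path to one of the three categories.

    `writes` is an iterable of (output_path, content_bytes) pairs,
    one pair per write performed by the build (hypothetical input shape).
    """
    digests = defaultdict(list)
    for path, content in writes:
        digests[path].append(hashlib.sha256(content).hexdigest())

    result = {}
    for path, seen in digests.items():
        if len(seen) == 1:
            result[path] = "provably correct"  # written exactly once
        elif len(set(seen)) == 1:
            result[path] = "correct"  # written repeatedly, same content
        else:
            result[path] = "incorrect"  # written repeatedly, different content
    return result


# Hypothetical writes into one shared output directory.
writes = [
    ("bin/A.dll", b"A v1"),
    ("bin/Json.dll", b"json 8.0"),
    ("bin/Json.dll", b"json 8.0"),
    ("bin/System.dll", b"desktop build"),
    ("bin/System.dll", b"coreclr build"),
]
for path, category in classify_writes(writes).items():
    print(path, "->", category)
```

Category 1 needs no content inspection at all: counting writes per path is enough, which is one way to read the advice to aim for it.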

"We use a common output directory[1] over on https://github.com/dotnet/roslyn-project-system/ and it works well, but we strictly follow the above rules."

Roslyn went one step further with this. I added a leg to our Jenkins CI system that uses structured logger to monitor the output and verify that we don't have any double writes. It's saved us from accidentally introducing them several times since we cleaned up the build.

"Roslyn, however, produces lots of different things with lots of different dependencies (for VS, .NET Core, etc) and it's a big product - so ultimately it doesn't work so well."

Exactly. One example of where this is a problem is NuGet packages. Even though two projects may reference the same NuGet package, their deployment environment may cause a different binary (of the same name) to be copied to the output directory. For example, a project targeting CoreClr and one targeting Desktop may get a different copy of System.dll in the binaries directory. Hence, even ensuring that only one of the projects copies the binary to the output directory isn't good enough; we fundamentally need different output directories in order to be correct.

Roslyn ended up dividing up projects into two categories:

  1. Projects for which deployment isn't necessary: simple DLLs for example. For those we don't copy anything other than build artifacts to the output directory.
  2. Projects which need deployment: EXEs, UnitTests and VSIX. Each of these outputs to its own directory and fully copies the dependencies.

"Projects which need deployment: EXEs, UnitTests and VSIX. Each of these outputs to its own directory and fully copies the dependencies."

Do the EXEs, UnitTests and VSIXes then have direct references to all the DLLs required by the projects they reference? (Even if they don't consume these DLLs directly?)

@JoshVarty

"Do the EXEs, UnitTests and VSIXes then have direct references to all the DLLs required by the projects they reference?"

Eventually this will be the case. Today though it is possible that projects pick up references transitively via project references. I'd like to move us to a place where it's explicitly listed though.
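As an illustration of the gap between "picked up transitively" and "explicitly listed", the sketch below computes which dependencies a deploying project reaches only through other projects. Everything here is hypothetical: the reference graph, the project names, and the helper names `transitive_closure` and `implicit_references` are made up for the example and do not reflect how Roslyn's build resolves references.

```python
def transitive_closure(project, references):
    """Return every dependency reachable from `project`.

    `references` maps a project name to the list of its direct references
    (hypothetical input shape).
    """
    seen = set()
    stack = list(references.get(project, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(references.get(dep, []))
    return seen


def implicit_references(project, references):
    """Dependencies that get deployed but are not listed directly."""
    direct = set(references.get(project, []))
    return sorted(transitive_closure(project, references) - direct)


# Hypothetical graph: App.exe lists only Core.dll explicitly.
references = {
    "App.exe": ["Core.dll"],
    "Core.dll": ["Utilities.dll", "Newtonsoft.Json.dll"],
    "Utilities.dll": [],
}
print(implicit_references("App.exe", references))
# ['Newtonsoft.Json.dll', 'Utilities.dll']
```

A deploying project that lists everything explicitly would produce an empty list here.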
