Is there a secure method of sandboxing Roslyn's code execution? I'm thinking of something along the lines of what dotnetfiddle.net is doing.
With such a huge .NET framework, how do you go about choosing what code is allowed to execute?
We currently don't have any means to do so in the interactive window but it is an interesting idea.
Run the code in an isolated process.
@tmat Things like the file system, P/Invoke and assembly loading need to be blocked, as well as a whole host of other things. Doing this securely is quite difficult. It would be great if there were some kind of curated list of APIs that Roslyn could be restricted to.
dotnetfiddle does this but is not open source. They have a UserVoice entry asking for it to be made open source.
This is a very old post that I wrote, but it gives a few ideas on how to approach this. Keep in mind that the API has changed a lot since I wrote it:
http://www.filipekberg.se/2011/12/08/hosted-execution-of-smaller-code-snippets-with-roslyn/
Haha, I was just reading that and posted this issue to your blog. I think your blog post provides a good start. There is another blog post from the creators of dotnetfiddle where they also talk about timing out the execution and setting limits for memory and things like that. Building a bulletproof sandbox is not very straightforward, it seems.
For those interested I put together a Roslyn.Sandbox project on GitHub with a very rudimentary implementation so far.
Roslyn seems to want a lot of permissions, so when I tried to use the Internet zone on my AppDomain I got a bunch of security exceptions. I had to switch to the My Computer zone for now, to get things working.
@fekberg I guess you did not come across this issue?
@RehanSaeed & @fekberg, thanks for the code you guys have put out. It seems like they will work on .NET Framework only? Can you please confirm if it is cross-platform compatible?
Hi, it would be much easier to implement a "light" sandbox if the ScriptOptions constructor were public.
Then we could restrict the list of assemblies and namespaces.
Could we change that?
Any updates on this issue, or on when it might be complete? I'd like to be able to allow my users to write and execute their own scripts while limiting things like memory usage, access to file IO, etc.
@mprothme this is not being worked on at the moment. Code contributions are welcome.
@gafter What about my suggestion of opening up ScriptOptions? I can make a PR.
@molinch https://github.com/dotnet/roslyn/pull/32668 is adding more ScriptOptions but we are not making the constructor public.
@molinch I don't actually see why you would need it to be public.
@mprothme As I suggested above, the _only_ way to isolate the script securely is to run it in a separate process with limited permissions. Any in-process sandbox can be circumvented and is not secure.
@tmat If you have your own API and want to limit execution to that defined surface area only, then we could leverage the script options to provide only our "surface area assembly" and nothing else.
Currently we cannot do that since we cannot change the references from ScriptOptions.
Would you be OK, if I submitted a new method to ScriptOptions: WithReferences ?
As an alternative we could also analyze the syntax tree and check that all calls are made to our surface API only, but I would rather keep it simple.
@molinch As far as I can see, WithReferences already exists, or did I miss something?
> As an alternative we could also analyze the syntax tree and check that all calls are made to our surface API only, but I would rather keep it simple.
This would be hard to accomplish. How would you ensure that reflection or dynamic code wouldn't call forbidden APIs?
> Would you be OK, if I submitted a new method to ScriptOptions: WithReferences ?
If you mean that you would pass in reference assemblies that have limited APIs exposed then no, that's not possible currently, since the scripting compilation assumes that the given references are implementation assemblies. We do not support reference assemblies. Although that is something we would definitely want to support at some point, it's a non-trivial change. Perhaps you could try using custom facade assemblies: assemblies that type-forward to the real implementation assemblies. That only allows you to limit the set of types though, not the set of members. Another problem is that we add a reference to mscorlib by default, as we need primitive types like System.Object and others to even construct the script code context. So you can't avoid mscorlib types being added, which means Reflection APIs are available and thus the script can really do pretty much anything.
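To illustrate the last point, here is a minimal sketch of how code that can see nothing but the core library could still reach an API by name through reflection. The class and method names are made up for the illustration, and System.Environment is used only because it comes up later in this thread.

```csharp
using System;
using System.Reflection;

// Hypothetical demo class, not part of Roslyn or of any code posted in this thread.
public static class ReflectionBypassDemo
{
    public static string ReadCommandLineViaReflection()
    {
        // Start from System.Object, which every script can see, and walk to its assembly.
        Assembly coreLibrary = typeof(object).Assembly;

        // Look the type up by name, so no direct reference to it appears in the code.
        Type environmentType = coreLibrary.GetType("System.Environment", throwOnError: true);

        // Read a static property late-bound, again only by name.
        PropertyInfo property = environmentType.GetProperty(
            "CommandLine", BindingFlags.Public | BindingFlags.Static);
        return (string)property.GetValue(null);
    }
}
```

Calling `ReflectionBypassDemo.ReadCommandLineViaReflection()` should return the same string as `Environment.CommandLine`, even though the type is only ever named inside a string literal. Restricting the reference list therefore does not help while the reflection types themselves stay reachable.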
I am actually very surprised that such a feature is not available.
At the end of the day, Roslyn is a fantastic tool for "metadevelopment" (development of development tools), but, out of the box, quite useless for business application development (unlike what is claimed in a Microsoft presentation where Roslyn is also introduced, in its 6th point, as a way to provide a rule engine). It is indeed not conceivable that a user can access the whole framework through a rule feature of their business application.
As .NET Core does not support sandboxing the way .NET Framework does, I ended up using the Roslyn semantic model to check EVERY function call and ensure that each one targets a function that belongs to a small set of allowed classes.
Here is an extract of what I did:
private bool CanExecute()
{
    // Compile the rule as a script without running it, so it can be inspected first.
    var comp = CSharpScript.Create<O>(this._rule,
        ScriptOptions.Default
            .WithImports(this._imports)
            .AddReferences(this._assemblyReferences),
        typeof(G)).GetCompilation();
    var actualTree = comp.SyntaxTrees.First();
    var model = comp.GetSemanticModel(actualTree);
    var root = (CompilationUnitSyntax)actualTree.GetRoot();
    var invocationExpressions = root.DescendantNodes().OfType<InvocationExpressionSyntax>();
    bool forbiddenCall = false;
    var allowedClassesCalls = _allowedTypes.Select(i => i.FullName).ToHashSet();
    foreach (var invocationExpression in invocationExpressions)
    {
        // Resolve the invocation itself, so calls that are not member accesses (such as F()) do not yield a null node.
        var symbolInfo = model.GetSymbolInfo(invocationExpression);
        if (symbolInfo.Symbol != null)
        {
            if (!allowedClassesCalls.Contains(symbolInfo.Symbol.ContainingSymbol.ToString()))
            {
                forbiddenCall = true;
                var location = invocationExpression.GetLocation(); // this is the way to get where the forbidden call is made
            }
        }
        else
        {
            // If the method is ambiguous, the candidates are here. Surprisingly, for the actual execution Roslyn has no problem choosing the right one.
            foreach (var symbol in symbolInfo.CandidateSymbols)
            {
                if (!allowedClassesCalls.Contains(symbol.ContainingSymbol.ToString()))
                {
                    forbiddenCall = true;
                    var location = invocationExpression.GetLocation(); // this is the way to get where the forbidden call is made
                }
            }
        }
    }
    return !forbiddenCall;
}
@paillave There's a potential flaw in your code. It only looks at invocation expressions, so a property access could circumvent it. For example, I changed your code to take in the code as a string, and then wrote this xunit test for it.
[Theory]
[InlineData(@"var s = Environment.GetEnvironmentVariable(""test"");")] // Passes
[InlineData("var s = Environment.CommandLine;")] // Fails
public void Test(string code)
{
    Assert.False(CanExecute(code));
}
CanExecute returned true for Environment.CommandLine even though System.Environment was not in my list of allowed types.
I wrote a Roslyn Analyzer that seems to do the trick. Check it out and let me know if it works for you.
https://gist.github.com/haacked/00de560d00692b7f4859336c747af10e
@paillave If your goal is to run untrusted code, you need to run it in a separate process. That's the only guaranteed security boundary. That's why .NET Core does not support Code Access Security: it has never worked well. Running in a separate process also allows you to guard your application against scripts that crash the process, whether accidentally or on purpose. The simplest way to crash the process without calling any API is to run this script:
void F() { F(); } F();
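A minimal sketch of the separate-process approach, assuming .NET Core 3.0 or later (for `Process.Kill(entireProcessTree: true)`) and a host executable of your own that evaluates the script. `IsolatedRunner` and its parameters are hypothetical names. Restricting the child's permissions is done at the OS level (a low-privilege account, a container, and so on) and is not shown here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: runs an executable as a child process with a time limit.
public static class IsolatedRunner
{
    public static (bool TimedOut, int? ExitCode, string Output) Run(
        string fileName, string arguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,   // required for output redirection
            CreateNoWindow = true,
        };

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Could not start the child process.");

        // Read asynchronously so a chatty child cannot stall by filling the pipe.
        var outputTask = process.StandardOutput.ReadToEndAsync();

        if (!process.WaitForExit((int)timeout.TotalMilliseconds))
        {
            try
            {
                // The script ran too long: kill the child and anything it spawned.
                process.Kill(entireProcessTree: true);
            }
            catch (InvalidOperationException)
            {
                // The child exited between the timeout and the kill; nothing to do.
            }
            process.WaitForExit();
            return (true, null, outputTask.GetAwaiter().GetResult());
        }

        // A crash, such as the stack overflow above, shows up as a non-zero
        // exit code here instead of taking the calling application down.
        return (false, process.ExitCode, outputTask.GetAwaiter().GetResult());
    }
}
```

For example, `IsolatedRunner.Run("dotnet", "--version", TimeSpan.FromSeconds(30))` starts the `dotnet` CLI as a stand-in for a real script host and returns its exit code and output.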
@haacked Sure... this is only an extract as an example that checks only method calls. Of course, constructors and properties must be checked as well!
@tmat you are right, nothing will ever be as safe as a sandbox or a different process. Nevertheless, if, as I mentioned above, you also check constructors and properties, your code gets extremely safe (but still less safe than a sandbox). With your solution of running something in a different process, you take on the complexity of communication between the processes, which can cause a lot of bugs as well. As a matter of fact, we all dream of a new equivalent of the AppDomain that existed in .NET Framework. It is by far the best we can dream of.
@haacked just checked your code. It is OK, but in security, as with firewalls for example, everything should be considered forbidden unless it is explicitly granted. From what I understood of your code, everything is granted except what is forbidden. If you ever make a mistake in your code, or if an unsafe API is ever added to what was granted before, your code becomes dangerous.
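A minimal sketch of the default-deny idea, assuming the Microsoft.CodeAnalysis.CSharp.Scripting package. It extends the extract above so that property, field, event and constructor uses are checked as well as method calls. `AllowListChecker` is a hypothetical name, not code from this thread or from the gist. It does not address reflection or process crashes, so it is not a security boundary on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Scripting;

// Hypothetical helper: rejects a script unless every member it uses is on the allow list.
public static class AllowListChecker
{
    public static bool CanExecute(string code, ISet<string> allowedTypes)
    {
        // Compile without executing, so the script can be inspected first.
        var compilation = CSharpScript
            .Create(code, ScriptOptions.Default.WithImports("System"))
            .GetCompilation();
        var tree = compilation.SyntaxTrees.First();
        var model = compilation.GetSemanticModel(tree);

        // Look at every expression, not only invocations.
        foreach (var expression in tree.GetRoot().DescendantNodes().OfType<ExpressionSyntax>())
        {
            var info = model.GetSymbolInfo(expression);

            // Treat ambiguous candidates like resolved symbols: deny if any is off the list.
            IEnumerable<ISymbol> symbols = info.CandidateSymbols;
            if (info.Symbol != null)
                symbols = new[] { info.Symbol };

            if (symbols.Any(symbol => !IsAllowed(symbol, allowedTypes)))
                return false;
        }
        return true;
    }

    private static bool IsAllowed(ISymbol symbol, ISet<string> allowedTypes)
    {
        // Only members are checked: methods (including constructors), properties, fields, events.
        if (!(symbol is IMethodSymbol || symbol is IPropertySymbol ||
              symbol is IFieldSymbol || symbol is IEventSymbol))
            return true;

        // Built-in operators such as int + int are synthesized by the compiler.
        if (symbol is IMethodSymbol method && method.MethodKind == MethodKind.BuiltinOperator)
            return true;

        // Anything the script declares itself (variables, lambdas, local functions) is fine.
        if (!symbol.DeclaringSyntaxReferences.IsEmpty)
            return true;

        // Default deny: the member's containing type must be explicitly listed.
        return symbol.ContainingType != null
            && allowedTypes.Contains(symbol.ContainingType.ToDisplayString());
    }
}
```

With `new HashSet<string> { "System.Math" }` as the allow list, `var x = Math.Abs(2);` should be accepted, while both `Environment.GetEnvironmentVariable("test")` and `Environment.CommandLine` should be rejected, because System.Environment is not listed.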
Looks like the Roslyn team has something that does what we want. https://github.com/dotnet/roslyn-analyzers/tree/master/src/Microsoft.CodeAnalysis.BannedApiAnalyzers
Very interesting indeed... But it seems to work by banning a few, instead of granting a few. Am I right?
Yes, but it would be easy to invert the logic.
@haacked Just checked the code that is in it... There is not too much, but it is very... intense! :D
Not impossible, but still, many hours of pain to invert the purpose I think. I'll consider taking on this challenge at some point I believe. Thanks for the tip!