Ref: https://github.com/aspnet/Home/issues/2774 for discovery point
.NET Core is missing suitable APIs for dealing with filesystems that contain symbolic links
FileInfo does not resolve symbolic links, leading to subtle bugs. There's no API that gives information about the link target. There's other problems with the existing APIs causing them to have trouble when you call Directory.EnumerateFileSystemEntries involving having to do not one but two stat calls for each node. Might as well resolve them all at once.
Proposed API surface:
```c#
// These are deliberately set to the *nix constant values where possible.
public enum FileTypes {
Missing = 0,
Fifo = 0010000,
CharacterDevice = 0020000,
Directory = 0040000,
BlockDevice = 0060000,
File = 010000,
SymbolicLink = 0120000,
Socket = 0140000,
SymbolicLinkMissingTarget = 0200000,
SymbolicLinkLoop = 0400000,
ReparsePoint = 01000000, // ReparsePoint attribute set but not a symbolic link
}
public struct FileNode {
public string Path { get; private set; }
public string FileName { get => System.IO.Path.GetFileName(Path); }
public DateTime LastAccessTime { get; private set; }
public DateTime LastAccessTimeUTC { get; private set; }
public DateTime LastWriteTime { get; private set; }
public DateTime LastWriteTimeUTC { get; private set; }
public DateTime LastChangeTime { get; private set; }
public DateTime LastChangeTimeUTC { get; private set; }
public FileAttributes Attributes { get; private set; }
public FileTypes FileNodeType { get; private set; }
public string SymbolicLinkTargetPath { get; private set; }
public bool Exists { get => FileNodeType != 0 }
FileNode(string path, bool resolvesymboliclink)
{
/* The general idea of this API is it doesn't throw; just gives the appropriate information You could probably put Cer.Success on it. */
Path = path;
bool statpermissiondenied;
if (resolvesymboliclink)
{
/* This code path would call CreateFile with only FILE_READ_ATTRIBUTES and then call GetFileInformationByHandle; on AccessDenied or PermissionDenied fall through below */
/* on unix this would be a stat() call */
}
/* This code path would call FindFirstFileEx to get the file information by name from the node attribute */
if (resolvesymboliclink && FileNodeType == FileTypes.SymbolicLink)
FileNodeType = 0;
}
/* Deserialization constructor */
FileNode(string path, FileTypes fileNodeType, FileAttributes Attributes, DateTime lastAccessTimeUTC, DateTime lastChangeTimeUTC, DateTime lastWriteTimeUTC, string symbolicLinkTargetPath);
}
public partial class File {
public static void CreateSymbolicLink(string path, string targetPath, bool targetisdirectory = false);
}
public partial class Directory {
// This one exists only code readability
// The idea is the programmer would normally only pass the third parameter if it was indirection from another layer of indirection, and otherwise would call File.CreateSymbolicLink to create a symbolic link to a file and Directory.CreateSymbolicLink to create a symbolic link to a directory
public static void CreateSymbolicLink(string path, string targetPath, bool targetisdirectory = true);
=> File.CreateSymbolicLink(path, targetPath, targetisdirectory);
}
```
Thanks for this proposal. Do you want to recocile this with the proposals made in https://github.com/dotnet/corefx/issues/25569 so we can have some all up proposal? It might make sense to move this there.
@carlreinke
cc @JeremyKuhne
@danmosemsft @JeremyKuhne I attempted a merge but didn't get too far. I updated my version with some pieces of his.
The main trouble is there's more possible file types than Attributes can handle. You really should be able to detect if you got a node of a strange type. In addition, it is a bad idea to assume that readlink() actually does the same thing as followlink() [the act of opening a link triggers followlink() which you can't call yourself]
I had actually hoped to be able to set obsolete on FileInfo on a per-project basis (with a code quality rule) so I could track down all places in legacy codebases that aren't prepared to handle symbolic links. But maybe it could be done on the constructor. It is vital to be able to actually declare on passing to FileInfo whether you want to follow the link or not.
My first inclination would be to extend FileInfo, is there a reason we can't do that?
We should look at exposing the reparse point type on Windows if we're going to go this far with file types. Additionally we should look at how we might expose character file types on Windows as well.
Note that I want to get this merged with @carlreinke's proposal dotnet/runtime#24271 if at all possible.
@JeremyKuhne : I've been over that path a few more times. Reusing FileInfo requires a commitment to auditing all users of FileInfo in .NET Core immediately as well as adding the full file type enumeration. Adding the type enumeration is easy but will break the System.Runtime -> mscorlib shim due to differing ABI surface. I don't care if that shim breaks but you do.
@jhudsoncedaron
requires a commitment to auditing all users of FileInfo in .NET Core immediately
If we don't change the default behavior, why would we have to audit?
Adding the type enumeration is easy but will break the System.Runtime -> mscorlib shim
I'm not sure I follow you. Adding new types and APIs to existing types doesn't break the shim- we have a ton of these in CoreFX already. The normal route is adding new things on CoreFX (1), porting to desktop (2), elevating to .NET Standard (3).
requires a commitment to auditing all users of FileInfo in .NET Core immediately
If we don't change the default behavior, why would we have to audit?
You have to audit 100% of callers anyway. I expect that something like 90% of them are not correct. I've already filed a bug in MVC with a root cause of incorrect use of FileInfo. The operative word is "immediately"; having a new type permits a staged audit as unaudited code can be found by noting its still using the old type.
You have to audit 100% of callers anyway.
Why? The current behavior may not be ideal, but it has sufficed. Adding constructors that specify link following behavior is pretty easy to audit if the behavior matters to you.
Depreciating the *Info classes would be a massive change that would require some pretty serious, compelling evidence for the API review team to consider it. I'm happy to explore the reasoning/justification, but I'm not seeing it right now- can you help clarify the scenarios that would back depreciating these?
Lots and lots of code involving FileInfo reduces to this non-working code blob:
var info = new FileInfo(path);
if (info.Exists && (info.Attributes & FileAttribtes.Directory) == 0 && info.Length > 0)
using (var f = new FileStream(path, ...))
{
// Do something with f
}
Raymond might be pointing out the race condition, but it's more fundamental. info.Exists can lie if you intended to do much of anything with that knowledge. It turns out almost all the time you care about the attributes of the target of the link rather than the link itself. This means everybody needs to search their codebases anyway to see what the code should do in the presence of an symbolic link.
It gets worse. If some API returns a FileInfo object, that API is most likely broken already.
So we face the horns. Either every single .NET API needs to have its behavior on encountering symbolic links set to a sane behavior immediately (and just blanket saying doesn't follow links doesn't cut it--this is effectively deprecating all the APIs that return FileInfo--but int he future you can't tell which APIs have been implicitly deprecated by this and which ones have not been) or you deprecate FileInfo so that broken APIs can be identified and avoided by their potential callers. Since I don't believe the first will happen ...
FileInfo, etc. are documented to work on the link itself. CreateFile doesn't open the symlink itself, it opens the target. While I agree that can be confusing, it works in most cases. Your example isn't safe even if they were aligned better. Even if you check Exists on the target there is a risk the file will be gone when you try to create the FileStream around it, let alone any number of other IO related errors (access issues, volume dismounts, etc.).
We could potentially add a TargetExists or something like that to FileInfo/DirectoryInfo. (Or maybe FileInfo Target { get; } that gives you back this or the link target if applicable.)
C#
var info = new FileInfo(path).Target;
if (info.Exists && (info.Attributes & FileAttribtes.Directory) == 0 && info.Length > 0)
try
{
using (var f = new FileStream(path, ...))
{
// Do something with f
}
}
catch (Exception e)
{
// handle IO errors
}
I absolutely agree that we need better symbolic link support, but I don't feel depreciating existing APIs is the way to do it.
I constructed that example because that's the kind of nonsense I keep on finding in .NET Core itself when code falls down in the presence of symbolic links. Most callers aren't calling it because they want the link's attributes. Most callers are calling it because they want the file's attributes and there's no other API to do so. Your "it works in most cases" is because people are largely not using symbolic links yet. What's it going to take to get all of the code audited?
It occurred to me that this api var info = new FileInfo(path).Target; is really bad as it results in a call to lstat() followed by stat() rather than just a call to stat() (on Windows a call to FindFirstFile() followed by GetFileInformationByHandle() rather than just GetFileInformationByHandle()).
Most callers aren't calling it because they want the link's attributes.
This is documented and follows the behavior of the underlying Win32 APIs. I personally would have probably designed things differently, but we now have decades of code written with this behavior. Breaking with that and depreciating a large swath of APIs because it is considered confusing isn't the right answer, imho. Making it more discoverable is.
there's no other API to do so.
There will be- I'm fully committed to getting symbolic link handling into the framework.
What's it going to take to get all of the code audited?
Open issues in the case of .NET code. Writing code that doesn't correctly follow the documentation or allow for the completely transient nature of IO happens, and would be a bug with or without symbolic links involved.
It occurred to me that this api var info = new FileInfo(path).Target; is really bad
How is it so bad? You still have to make multiple calls to go from path to finalPath. Can you give more detailed code samples?
Note that there is also a need to identify _exactly_ what type of reparse point is in play for Windows. Understanding whether the reparse point is actually a symbolic link is a critical first step, but digging deeper into reparse point types is needed. A specific scenario is that Windows 10 has a new type of reparse point for OneDrive Files On-Demand.
I'll open up an issue for how to expose platform specific data so we can discuss general approach. I'll link to this from there.
You still have to make multiple calls to go from path to finalPath. Can you give more detailed code samples?
No, you don't. In fact, trying to resolve final path yourself is a bug. readlink() might not resolve the same file as open().
You go directly to final path by making the appropriate API calls. Windows:
try {
} finally {
/* this code must not be split by async throw */
IntPtr MinusOne = IntPtr.Zero - 1;
IntPtr h = NativeMethods.CreateFile(pathname, 128 /* FILE_READ_ATTRIBUTES */, 7 /* FILE_SHAARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE */, IntPtr.Zero, 3 /* OPEN_EXISTING */, 0x00100000 /* FILE_FLAG_NO_RECALL */, MinusOne);
if (h != MinusOne) {
struct NativeMethods.BY_HANDLE_FILEINFORMATION nfi;
bool r = NativeMethods.GetFileInformationByHandle(h, &nfi);
NativeMethods.CloseHandle(h);
if (!r) throw new IOException(new System.ComponentModel.Win32Exception); /* should not happen */
/* Populate FileInfo from nfi */
} else {
/* File not found or no permission to resolve link -- check GetLastError() to find out which */
}
}
Linux & Mac OS (they would differ in the decl of the stat structure)
struct NativeMethods.stat statbuf;
if (stat(pathname, &statbuf)) {
/* file not found or link can't be followed -- check errno */
} else {
/* Populate FileInfo from statbuf */
}
I spent a little time looking for cases. I found a lot of cases where I could not track because the code spanned repos. I only found one case that appeared to work, and that by chance. It blows up on encountering a symlink to a non-extant directory when given a path within it but maybe that's intended.
Tries to set ACls on a symbolic link. Unix symbolic links don't have ACLs. Trying to do so would set the ACL on the target instead.
corefx/src/System.IO.FileSystem.AccessControl/src/System/IO/FileSystemAclExtensions.cs
Definitely does the wrong thing on encountering symbolic links; can be thrown into a stack overflow.
corefx/src/System.IO.Compression.ZipFile/src/System/IO/Compression/ZipFile.cs
private static void DoCreateFromDirectory(string sourceDirectoryName, string destinationArchiveFileName,
CompressionLevel? compressionLevel, bool includeBaseDirectory,
Encoding entryNameEncoding)
Code only makes sense if FileInfo were resolving link target paths; otherwith the Path functions do its job
coreclr/src/ToolBox/SOS/DacTableGen/main.cs
Certainly fails in the presence of symbolic links but hard to see what it should do
Razor/src/Microsoft.AspNetCore.Razor.Language/FileSystemRazorProject.cs
Tries to listen to symbolic links for changes. Hard to say if the code works or not.
corefx/src/System.Runtime.Caching/src/System/Runtime/Caching/FileChangeNotificationSystem.cs
In PowerShell we already dynamically add a Target property to FileInfo/DirectoryInfo objects.
Hello,
Is there any news on this ? at least a workaround to allow netcore application to follow symlinks ?
It seems to be a basic feature on linux environment.
Thanks
@xp-1000 nope, but perhaps we can pick this up in January. We do want this, and if we can get to an approved API shape, implementation can follow.
understood. thanks for the quick answer. I will continue to follow this issue.
I am more than half way through a real implementation of this suitable as an independent nuget package but I am stopped due to not being able to run personal Windows machines very well. I have specific ADA problems that makes it hard without on-site support.
@xp-1000 : Here's something to play with
https://github.com/joshudson/Emet/tree/master/FileSystems
I published Emet.FileSystems 0.0.1-alpha1 on nuget; it's still verifying right now.
@joshudson could you add some documentation to you library, is it possible to edit file attributes such as author.
i just ran into this. currently changing all my code from using FileInfo.Length to File.OpenRead().Length, ugh...
@Spongman : that leaks open handles. Either use a using block (possibly in a helper method) or try my library. Emet.Filesystems. It's released now.
that leaks open handles
thanks, i'm aware of how IDisposable work...
Duplicate of dotnet/runtime#24271
Most helpful comment
Hello,
Is there any news on this ? at least a workaround to allow netcore application to follow symlinks ?
It seems to be a basic feature on linux environment.
Thanks