In July, @robpike and I posted a draft design for file system interfaces. That doc links to a video, prototype code, and a Reddit discussion.
The feedback on that design has been almost entirely positive.
A few people raised concerns about the use of optional interfaces, but those are an established pattern in Go that we understand how to use well (informed in part by some earlier mistakes, such as optional interface like http.Hijacker with methods that cannot return an error to signal failure/unavailability).
A few people suggested radical redesigns of the os.File
interface itself, but for better or worse installed base and the weight of history cautions against such drastic changes.
I propose to adopt the file system interfaces draft design for Go 1.16.
Accepting this proposal would also let us land the embedded files draft design in Go 1.16, which I've proposed in #41191.
A few people raised concerns about the use of optional interfaces
This is my concern.
It's not that the alternative (attempting to define all FS methods in one huge interface which packages may then implement as no-op) is better. Rather, my perception is that the ergonomics of this approach are poor enough that it won't achieve broad community adoption and the level of composability and general success that interfaces like io.Reader
and io.Writer
have.
For example, it's clear that I will be able to pipe a zip file to text/template
, and that's good, but I'm concerned about more general composability of filesystems and files. I can wrap a stack of io.Reader
with confidence, but with io/fs
it seems like some middle layer may not have the right optional interfaces and I will lose access to functionality.
In spite of my concerns, it seems like the best approach available to Go at this time, and I anticipate it will be accepted given that the very exciting #41191 depends upon it.
However, I have this inkling that the advent of generics may allow a more powerful/robust/safe abstraction. Has any thought been given to this, or to how io/fs
could evolve in a backwards-compatible fashion if/when that occurs? Again, not to hold up this proposal, but I think I would be more excited if I knew what the future held.
The feedback page:
https://www.reddit.com/r/golang/comments/hv976o/qa_iofs_draft_design/?sort=new
I think this API looks promising... and would benefit from a prototype phase.
A lot of feedback was posted, but there's been rather light discussion of the comments, presumably because you can't subscribe to a Reddit thread, and/or many in Go's github-centered community don't frequent Reddit. It would help to see a review and analysis of feedback here, and perhaps a roadmap to likely future features.
Problems were identified with the FileInfo
interface, ~but not discussed~ and are in discussion #41188. Timeouts and/or interrupts bear consideration.
Landing a prototype in x/ seems like a logical step before stdlib. Go has long been deliberative and conservative about new features. Is this urgent somehow?
FWIW, my Go apps make heavy use of the filesystem, on Windows, MacOS, and Linux.
I think optional interfaces can work if there is a way to indicate that even though a method exists on a wrapper, it hasn't been implemented by the underlying wrapped type. Something like ReadFile(name string) ([]byte, error)
needs to be able to return ErrNotImplemented
so that the function calling it can say "Oh, well, then let me fallback to Open()
+ Read()
." The main sin of the existing optional interfaces is that there's no way to signal "I have this method just in case the thing I'm wrapping implements it." This shortcoming really needs to be addressed in the io/fs optional interfaces.
So, for the ReadFile
top level func, I am proposing this implementation:
func ReadFile(fsys FS, name string) ([]byte, error) {
if fsys, ok := fsys.(ReadFileFS); ok {
b, err := fsys.ReadFile(name)
if err != ErrNotImplemented { // Or errors.Is?
return b, err
}
}
file, err := fsys.Open(name)
if err != nil {
return nil, err
}
defer file.Close()
return io.ReadAll(file)
}
Discussion of ErrNotImplemented has moved to #41198.
I've marked @carlmjohnson's two comments above this one
as well as @randall77's comment below this one
as "off-topic" to try to funnel discussion over there.
On Thu, Sep 3, 2020 at 10:15 AM Carl Johnson notifications@github.com
wrote:
I think optional interfaces can work if there is a way to indicate that
even though a method exists on a wrapper, it hasn't been implemented by the
underlying wrapped type. Something like ReadFile(name string) ([]byte,
error) needs to be able to return ErrNotImplemented so that the function
calling it can say "Oh, well, then let me fallback to Open() + Read()."
The main sin of the existing optional interfaces is that there's no way to
signal "I have this method just in case the thing I'm wrapping implements
it." This shortcoming really needs to be addressed in the io/fs optional
interfaces.The other way to handle this is to have a factory for the wrapper that
returns a type with the correct methods on it.
type I interface {
Foo()
}
type Optional interface {
Bar()
}
func NewWrapper(i I) I {
if _, ok := i.(Optional); ok {
return &wrapperWithBar{i: i}
}
return &wrapperWithoutBar{i:i}
}
type wrapperWithoutBar struct {
i I
}
type wrapperWithBar struct {
i I
}
func (w *wrapperWithoutBar) Foo() { w.i.Foo() }
func (w *wrapperWithBar) Foo() { w.i.Foo() }
func (w *wrapperWithBar) Bar() { w.i.(Optional).Bar() }
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/golang/go/issues/41190#issuecomment-686632868, or
unsubscribe
https://github.com/notifications/unsubscribe-auth/ABUSAIFGSNLHB456YB5L6C3SD7FKNANCNFSM4QTHTZEA
.
@networkimprov, see https://go.googlesource.com/proposal/+/master/design/draft-iofs.md#why-not-in-golang_org_x.
This isn't worth much without the standard library integration. It also can't be used for embed.Files from x.
I'd feel more comfortable with this if it included basic combinatory file systems either in io/fs or somewhere official like golang.org/x. They would still have the issue with not understanding nonstandard optional methods but they would at least be guaranteed to keep up with official optional methods.
The two file systems I'm thinking of are an overlay fs and one that can "mount" other file systems in subdirectories of its root. With those two you could stitch multiple fs together easily.
@jimmyfrasche I don't understand the difference between "an overlay fs" and "one that can mount other file systems in subdirectories of its root." I agree we should provide something like that, and we intend to. But those sound like the same thing to me. :-)
I was thinking just:
func Overlay(fses ...FS) FS
for the former and the latter would satisfy
interface {
FS
Mount(dirName string, fs FS) error
}
and not have any files other than those mounted.
Got it, thanks @jimmyfrasche: union vs replace.
The io/fs integration with stdlib, and embed.Files can all be prototyped in x/
I wasn't suggesting x/ as the permanent home.
EDIT: Also, Readdir() & FileInfo have performance problems and missing features. The replacement APIs need prototypes. A draft is in https://github.com/golang/go/issues/41188#issuecomment-686283661
I have two comments. I found similar comments in the reddit thread, but didn't see a satisfying discussion/conclusion. Apologies if I missed previous conclusions.
I think we should consider an official mechanism to wrap fs.FS objects (and probably fs.File objects). For example, I want to wrap an fs.FS to track the total number of bytes read. I need to intercept calls to fsys.Open and calls to fsys.ReadFile, if implemented. I also don't want to lose any other optional interfaces such as fs.GlobFS. Based on my experience with http.ResponseWriter, this is commonly needed, but hard and tedious to do correctly
For a concrete idea to discuss, something like this:
type FSWrapper struct {
FS
OpenFunc func(name string) (File, error)
ReadFileFunc func(name string) ([]byte, error)
// ... all other extensions ...
}
func (w *FSWrapper) ReadFile(name string) ([]byte, error) [
rf, ok := w.FS.(ReadFileFS)
if !ok {
return nil, errors.ErrNotImplemented
}
if w.ReadFileFunc != nil {
return w.ReadFileFunc(name)
} else {
return rf.ReadFileFunc(name)
}
}
Granted there are cases where a generic wrapper would expose extensions you don't want to pass through. Anyway, I think at least the proposal would benefit from discussion or FAQ addressing wrapping.
It seems like we are starting with read-only because that is the simplest interface that enables the motivating embed feature. However, in some sense writing is more fundamental because you have to write before you can read (only half joking). I worry writing will be relegated to second class citizenship forever due to optional interfaces. For example, the hypothetical OpenFile extension:
func OpenFile(fsys FS, name string, flag int, perm os.FileMode) (File, error)
OpenFile returns an fs.File which has no Write method. It seems a bit strange to always have to type assert to get a writable file. I think the eternal friction between io.Writer and fs.File as proposed will be more painful than starting with a broader proposal.
In particular, I think we should consider:
Guys, PLEASE just add context everywhere! It costs nothing to ignore it or add context.TODO() for callers, but it will make life of network filesystem implementers and users much easier. In particular, it's needed for better contextual logging and cancellation.
You're all strictly opposed to thread-local variables, but then why are you designing APIs without a way to pass a context?!?
Deadlines and interrupts have been suggested as another way to solve the same problem, without affecting every function signature. It's unlikely that the os package will add dozens of new APIs with Context
, see also #41054.
I think the simplest way to solve this is to pass a context on filesystem creation. So instead of having a type implementing fs.FS
directly, it would have a method WithContext(context.Context) fs.FS
, which returns a child-instance bound to a given context.
Cancelling all pending ops by the stdlib file API (which will implement fs.FS) is not desirable. It probably isn't useful for other FS types, as well. The common case, in my experience, is interrupting any pending ops trying paths within a tree rooted at a certain path. An API for that looks like one of:
(f *MyFs) SetDeadline(t time.Time, basepath string) error // if deadline past, interruption is immediate
(f *MyFs) InterruptPending(basepath string) error
Note that os.File already supports Read & Write deadlines.
I doubt that io/fs
wants context
as a dependency. Where needed, you could easily wire context into an fs.FS implementation to do one of the above.
Gah. The deadlines/interrupts design is just horrible. No, it's seriously horrible. The whole idea for not including thread IDs in Golang was to make sure APIs are forced to deal with clients potentially running in multiple goroutines.
Introducing the per-FS state will defeat this purpose, making the FS object behave more like a TCP connection rather than a dispatcher for an underlying FS. And only one goroutine at a time would be able to use it, otherwise they might step on each others' toes with deadlines. Never mind the badness of introducing a hidden state where it arguably shouldn't even be in the first place.
What are the actual downsides of simply adding context.Context to every method?
I think the simplest way to solve this is to pass a context on filesystem creation. So instead of having a type implementing
fs.FS
directly, it would have a methodWithContext(context.Context) fs.FS
, which returns a child-instance bound to a given context.
This will require the FS implementation to be a thin wrapper that supplies context to the underlying implementation. Certainly doable, but still ugly.
And it will still introduce dependency on context.Context in the FS code.
@Cyberax All you need is f, err := fsys.WithContext(ctx).Open(p)
to make that one open file obey that one context. Easy sharing.
This has been before with x.IO(ctx)
returning io.Reader
or such, to keep the io.Reader
interface. It's a pretty simple layer.
So @tv42 that's (f *MyFs) WithContext(context.Context) *MyFs
? That's reasonable.
@Cyberax All you need is
f, err := fsys.WithContext(ctx).Open(p)
to make that one open file obey that one context. Easy sharing.
Not unless you want to do FS wrapping, but that's already been mentioned here. To expand this a bit, currently FS is supposed to consist of multiple optional interfaces (such as ReadFileFS
), and the WithContext
method's signature will have to use the most basic interface (FS
).
So your example will actually be: f, err := fsObject.WithContext(ctx).(ReadFileFS).ReadFile(name)
- it's NOT typesafe at all.
This has been before with
x.IO(ctx)
returningio.Reader
or such, to keep theio.Reader
interface. It's a pretty simple layer.
The whole TCP and the general IO layer in Go is a mess, so it's not at all a good example. Witness the number of questions on Google about cancelling IO operations on TCP connections (via SetDeadline).
And let me remind everybody about Go's own style guide: https://github.com/golang/go/wiki/CodeReviewComments#contexts
A function that is never request-specific may use context.Background(), but err on the side of
passing a Context even if you think you don't need to. The default case is to pass a Context;
only use context.Background() directly if you have a good reason why the alternative is a mistake.
Don't add a Context member to a struct type; instead add a ctx parameter to each method on
that type that needs to pass it along. The one exception is for methods whose signature must
match an interface in the standard library or in a third party library.
So your example will actually be: f, err := fsObject.WithContext(ctx).(ReadFileFS).ReadFile(name) - it's NOT typesafe at all.
I think that would be:
data, err := fs.ReadFile(fsys.WithContext(ctx), "name")
Also, what's preventing WithContext
from returning a concrete type?
Re Context
everywhere, the stdlib file API has dozens of functions, and even if you replicate them all to add a Context
argument, every package that calls any of them would have to replicate its own API to add Context
. It's just not viable.
I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError
in whatever goroutines invoked them.
I think that would be:
data, err := fs.ReadFile(fsys.WithContext(ctx), "name")
Sure, putting the code inside a helper function will help in this one particular case. But it won't help with the Stat
interface and other optional interfaces defined in the spec.
Also, what's preventing
WithContext
from returning a concrete type?
Because it's going to be defined in the interface Contexter
(or something like it) and it can't have the knowledge of the concrete type.
Because it's going to be defined in the interface Contexter (or something like it) and it can't have the knowledge of the concrete type.
Perhaps this is a silly suggestion, and I know this doesn't help us now, but as I mentioned near the start of this thread, would generics help with this particular issue?
Re
Context
everywhere, the stdlib file API has dozens of functions, and even if you replicate them all to add aContext
argument, every package that calls any of them would have to replicate its own API to addContext
. It's just not viable.
This is a _new_ API, so I don't see the problem with adding context to it. Existing stdlib code that needs to touch files can either scrounge up the context from somewhere (for example, the HTTP server code has it) or just put context.TODO().
I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError in whatever goroutines invoked them.
Your API is not thread-safe. It's perfectly possible to open the same file twice from two different goroutines, but your interruption code will affect both of them.
It's also completely incorrect on Unix, because name doesn't uniquely identify a file, so it's possible to do:
fl := fs.Open("file.a")
fs.Move("file.a", "file.b")
fl2 := fs.Open("file.a", O_CREATE)
...
fl.Read(...)
fl2.Read(...)
fs.Interrupt("file.a") // Should we interrupt both reads?
Perhaps this is a silly suggestion, and I know this doesn't help us now, but as I mentioned near the start of this thread, would generics help with this particular issue?
No, not in this particular case.
Re: the stdlib API argument. I've just searched all the uses of os.Open*
in the stdlib. Here are my findings:
Basically, if the standard library is refactored to use the new FS API, it can be done in minutes just by adding context.Background() to most of the places without changing any important semantics. And some code (e.g. HTTP client for multipart uploads) will actually be improved by adding full cancellation support.
I haven't heard a good argument for why it's wrong to ask pending file ops to return an InterruptError in whatever goroutines invoked them.
context.Context
argument.If you want existing packages to use fs.File types, Context
arguments aren't an option. But we still need to be able to interrupt ops on those files. It's fine if you don't like InterruptPending("/some/tree")
, But can we please hear alternatives that let existing code benefit?
It's perfectly possible to open the same file twice from two different goroutines, but your interruption code will affect both of them.
As intended! The InterruptPending()
API applies to a tree, not a file. You interrupt all file ops on a tree only when something's gone wrong, and the only good option is to stop them all -- because they won't return otherwise. A more flexible API could let you inspect a table of pending ops, and choose which to interrupt.
explicit is better than implicit
Ha, Go has a metric tonne of implicit behavior. That aside, apps calling packages that use today's file ops need a way to interrupt them implicitly, because explicit isn't an option.
every file operation has to pay the cost of preparing for and handling interrupts
I believe this is sufficient: if err != nil { return err }
. Only a module that calls InterruptPending()
must handle InterruptError
.
then we have to modify every file operation in the standard library anyhow
Haven't you conflated changing internals with changing APIs? I'm not categorically opposed to new APIs, but many third party packages won't get around to using them.
Still waiting to hear a good argument :-)
@rsc are there ways to let an app change the default FS used by os.Open()
for a specific file -- or all files?
It's common to pass filenames to package functions; we may also need to set the FS object where those filenames reside.
Perhaps a filename prefix starting with a character not allowed in filenames? Rather a hack, tho.
@networkimprov
The common case, in my experience, is interrupting any pending ops trying paths within a tree rooted at a certain path.
Do you have any data to support that assertion? Because I have never seen that use case and I can't think of a case where this would be useful. And in RPC servers it's certainly counterproductive, because you would break other, concurrent RPCs.
Still waiting to hear a good argument :-)
FTR, this isn't a very productive way to discuss.
are there ways to let an app change the default FS used by os.Open() for a specific file -- or all files? It's common to pass filenames to package functions
The goal of this proposal is to make that less common - or to pair it up with passing an fs.FS
if one isn't already known. Doing some implicit side-loading of fs.FS
dependencies seems counter to Go's spirit (see also Ian's "explicit is better than implicit").
I have never seen that use case and I can't think of a case where this would be useful.
A file tree rooted on a network filesystem will cease responding if the network or server goes down, a pretty commonplace event. A single file op could stall due to congestion, so perhaps InterruptPending(path)
is part of a broader API.
Still waiting to hear a good argument :-)
Apologies if that seemed pejorative. This debate has been going across three issues (see https://github.com/golang/go/issues/40846#issuecomment-676777624 and #41054) and no one's addressed the concerns I've raised re Context
arguments, nor carefully considered the idea of terminate/interrupt APIs. I'm not sure that anyone on the Go team even agrees that file ops should be interruptible, even tho at least three of us have stated cases for it, and it is a feature of Linux CIFS & FUSE and Windows I/O.
implicit side-loading of fs.FS dependencies seems counter to Go's spirit
I asked because anything that expected *os.File
could take fs.File
by changing a single type. It would be nice if changing os.Open()
to xfs.Open()
were similarly trivial. In any case, we may need type FileName struct { f FS; p string }
.
Floating another idea... the pending ops table:
type PendingFS interface {
FS
TrackPendingOps(bool)
ListPendingOps() []PendingOp
InterruptOp(op ...PendingOp)
}
type PendingFile interface {
File
ListPendingOps() []PendingOp
}
type PendingOp interface {
Op() string
Pathname() string
Params() []interface{}
String() string
}
If you want existing packages to use fs.File types, Context arguments aren't an option. But we still need to be able to interrupt ops on those files. It's fine if you don't like InterruptPending("/some/tree"), But can we please hear alternatives that let existing code benefit?
Existing code doesn't expect file operations to be interrupted. So I don't think there is a strong argument for providing some mechanism that works with existing code.
I agree that if we add Context parameters to io/fs, then we will, over time, want to add them to the standard file system operations as well. Or we'll want to add some other mechanism to permit interruption. This is not necessarily infeasible. Whether it is the right thing, I don't know. But to me it seems more likely to be the right thing to do than adding a global hammer that affects all inflight file system operations.
explicit is better than implicit
Ha, Go has a metric tonne of implicit behavior.
That is not a counter-argument.
That aside, apps calling packages that use today's file ops need a way to interrupt them implicitly, because explicit isn't an option.
I think that it is an option. I don't know if it is the right option, but I see no reason to reject it out of hand.
every file operation has to pay the cost of preparing for and handling interrupts
I believe this is sufficient: if err != nil { return err }. Only a module that calls InterruptPending() must handle InterruptError.
What I mean is that if we need some mechanism to interrupt every inflight file operation, then every file operation needs to somehow register itself before starting and then deregister itself when done. Otherwise the runtime will have no way to interrupt it. That is the cost that must be paid by every file operation. But, as I said earlier, even in the best case only a very small percentage of file operations could ever possibly be interrupted.
I'm not categorically opposed to new APIs, but many third party packages won't get around to using them.
Those third party packages already do not expect their file operations to be interrupted. Given that file operations fail in predictable ways, there is no special reason to expect that packages are prepared to correctly handle a new kind of file operation failure. I don't see supporting existing third party packages with no change as a priority.
no one's addressed the concerns I've raised re Context arguments, nor carefully considered the idea of terminate/interrupt APIs
I really can't agree with this. People have addressed those concerns, and they have considered those ideas. You don't agree with what people have said, but that doesn't mean that your thoughts haven't been considered.
A file tree rooted on a network filesystem will cease responding if the network or server goes down, a pretty commonplace event. A single file op could stall due to congestion, so perhaps
InterruptPending(path)
is part of a broader API.
I actually have code right now that has something similar. It uses an NFS disk as a cache in front of a DynamoDB table. The code first tries to read the data from this disk (using an NFS client in Go) and if the data doesn't come back within 2ms, it cancels the request and makes (an expensive) request to DynamoDB. Cancelling all requests in a subtree would not help a bit.
@networkimprov
no one's addressed the concerns I've raised re Context arguments
I'm not sure what they are. So far, I've seen a) it adds a dependency on context
to os
and b) *os.File
doesn't allow cancellation yet, as syscalls can't be interrupted.
But can we please hear alternatives that let existing code benefit?
I haven't seen that comment before. I already mentioned an alternative when you wrote that. It would allow supporting cancellation on a per-filesystem basise - that is, without any need to add context.Context
to existing os
implementations. Note that this solves both the concerns I mention above. It allows cancellable and not-yet-cancellable implementations to provide the same API. If a not-yet-cancellable implementation implements cancellation, it can add that support by exposing func WithContext(context.Context) *FooFS
.
I already mentioned an alternative [to
Context
in the FS API]
Axel, so you did, thanks for that. It seems that omitting Context
from the FS API is more popular than including it.
Existing code doesn't expect file operations to be interrupted. So I don't think there is a strong argument for providing some mechanism that works with existing code.
I don't see supporting existing third party packages with no change as a priority.
Ian, I'm sure some packages would need changes to accommodate interrupt errors initiated by a caller. In others, the std practice of returning the error is sufficient. In either case, a package should have to alter its APIs _as a last resort!_ It's surprising and disappointing to hear that you disagree.
[
Context
] seems more likely to be the right thing to do than adding a global hammer that affects all inflight file system operations
I just offered a scalpel :-) https://github.com/golang/go/issues/41190#issuecomment-687632101. I'll file a proposal for that in a day or so.
explicit is better than implicit
Much of the rationale of Go is that implicit behavior saves complexity in user code, so that assertion is not generally true.
every file operation needs to somehow register itself before starting and then deregister itself when done
Oh, yes. Since a blocking op may entail a thread creation and/or context switch, I guessed that maintaining a table of ops would be relatively inexpensive. And you wouldn't do it unless the app requested the ability to interrupt things.
no one's addressed the concerns I've raised re Context arguments, nor ... considered the idea of terminate/interrupt APIs
People have addressed those concerns, and they have considered those ideas.
Your opening argument to me on this topic was that _network filesystems are irrelevant for Go programs_ -- https://github.com/golang/go/issues/40846#issuecomment-678560324. That was startling, and even insulting; I don't know my end-users' environments? Have you changed your thinking? Is there a problem to be addressed here?
If so, can we try to support existing packages before banishing them from apps that depend on network filesystems?
Your opening argument to me on this topic was that network filesystems are irrelevant for Go programs -- #40846 (comment). That was startling, and even insulting; I don't know my end-users' environments? Have you changed your thinking? Is there a problem to be addressed here?
I apologize if that seemed insulting. That was certainly not my intent.
My position was and remains that if you know beforehand that your program is going to have to run on a network file system, you will be better off using a client server protocol rather than relying on the network file system protocol.
The point of a network file system is to hide the network. Decades of experience have shown us that the network is hard to hide. Better to embrace it rather than hide it.
My position was and remains that if you know beforehand that your program is going to have to run on a network file system, you will be better off using a client server protocol rather than relying on the network file system protocol.
If this is a reason not to support the context API (or in general a sane way of deadline/cancellation), then what about slow disk IO? Wouldn't those still be blocking and thus should still have some sort of deadline as well?
Quoting @bcmills from https://github.com/golang/go/issues/40846#issuecomment-679195944:
"If a program _does not know_ whether it may be running on a networked file system, then it must use ordinary file system calls (because it may be on an ordinary file system), but it must also provide for canceling stalled operations (because it may be on a networked filesystem)."
This describes any app whose authors don't control its deployment environments.
@diamondburned I'm fine with adding a context.Context
or a deadline to file operations.
I filed #41249. It offers a few ways to interrupt stalled file ops, including two that take context.Context
.
Would Context
arguments mean spawning a goroutine for any synchronous io/fs op, to wait on Context.Done()
? If not, how does the op obtain a cancellation notification?
If so, isn't a goroutine per op rather significant overhead for use of a filesystem?
@networkimprov, it's true that an external library accepting a context
would potentially need to start a goroutine per operation, but that library could then either be used as the prototype for a standard-library API (which can hook into the context
package internally to provide cancellation without a separate goroutine), or could provide solid evidence for a new exported hook in the context
package (as originally proposed in #28728).
This issue is about generalizing the FS API we have. It is not about designing a whole new API with bells and whistles for cancellation, deadlines, hung network file systems, and so on.
Unpopular opinion, maybe, but @ianlancetaylor is right: If your network service is not reliable enough to maintain the illusion of being like an ordinary on-disk file system, the solution is not to duplicate all the network concepts into the file system layer. Instead, the solution in that case is to use explicit network RPCs that already have all those concepts.
There are plenty of possible file system implementations that are reliable enough to maintain the illusion of an ordinary on-disk file system, and the generalized API in this proposal is targeted at those.
@networkimprov:
The io/fs integration with stdlib, and embed.Files can all be prototyped in x/
It cannot, because stdlib cannot import x/.
@rsc are there ways to let an app change the default FS used by os.Open() for a specific file -- or all files?
Package os is about os-provided files only. If you have an *os.File, you need to be sure that's what it really is - an operating system file. Otherwise you can't reliably do things like pass f.Name() to another process, or start another process with that file as one of its file descriptors, and so on.
It's common to pass filenames to package functions; we may also need to set the FS object where those filenames reside.
Yes, indeed. Package functions that want to work on arbitrary FS implementations need to be extended to take an FS, path pair, just as the text/template and html/template packages do in the prototype code.
Perhaps a filename prefix starting with a character not allowed in filenames? Rather a hack, tho.
That would imply some kind of a global registration table, which turns out to be a good thing to avoid. Just because I'm using a zip file as if it were an FS doesn't mean I want other parts of the program (or evaluation of command-line arguments) to be able to make up just the right path name and get at those same files.
Re context specifically, I spent a long time trying different possible ways to mention context anywhere in the FS proposal and ended up at the best choice being not to. Not because context is bad or should be ignored, but because the cleanest way to integrate context seems to be at the time of FS creation.
We can't pass context to methods like fs.File.Read and fs.File.Write, because then those methods would not satisfy io.Reader, io.Writer, and so on.
We could move up one level and pass the context to fs.FS.Open, having the context saved and applied to all operations on the resulting file. It sounds like many people would be happy with that level of control. A downside, though, is that you then either have to add a context parameter to all the helper methods and functions (Glob, ReadFile, ...) as well as any function that accepts an FS (template.ParseFS, ...). Nearly all the time, that context will be context.Background(), making the calls annoyingly verbose for no real benefit. And many functions accepting FS will choose _not_ to take a context and put in context.Background themselves. People who want to use context with those functions will be completely out of luck.
Or we could move up one more level, as @Merovius suggested, and set the context at the creation of the fs.FS and have it apply to all operations on the resulting file system. That avoids all the downsides of the previous paragraph. And all the people who would be happy with "context set during Open" should be equally happy with "context set during FS creation" provided that's a lightweight enough operation. They should also be happy that it is then _impossible_ for people to write FS-accepting code that fails to plumb a context through: using the FS object implies use of context.
That last option - which doesn't require mentioning context in package fs at all - is the one that I recommend here, both because it seems to be the only one that actually works in practice from an API standpoint (as just noted) and because when you start getting into very fine-grained control over individual underlying network operations, you're better off using RPCs, as @ianlancetaylor said and I repeated above.
Not because context is bad or should be ignored, but because the cleanest way to integrate context seems to be at the time of FS creation.
Thank you
@muirdm, thanks very much for bringing up wrapping and future extension methods. Your example is a really great one:
For example, I want to wrap an fs.FS to track the total number of bytes read. I need to intercept calls to fsys.Open and calls to fsys.ReadFile, if implemented. I also don't want to lose any other optional interfaces such as fs.GlobFS. Based on my experience with http.ResponseWriter, this is commonly needed, but hard and tedious to do correctly.
Absolutely true. The problem is, the wrapper-builder-helper approach more often than not replaces "hard to do completely" with "easy to do incorrectly".
As you noted, in the current proposal's API, to count bytes you need to intercept Open. That's all you need, really:
type countingFS struct {
total int64
real fs.FS
}
func (fsys *countingFS) Open(name string) (fs.File, error) { ... fsys.real.Open & return counting file ... }
With no changes, the current proposal makes this kind of wrapper "easy to do correctly".
The problem is it may not be as efficient as you might want, because maybe the underlying FS has a ReadFile that's more efficient in some important way compared to Open+Read+Close, and the countingFS is hiding that. So you have to add a counting version:
func (fsys *countingFS) ReadFile(file string) ([]byte, error) {
data, err := fs.ReadFile(fsys.real, file)
fsys.total += int64(len(data))
return data, err
}
Fair enough. But now suppose the underlying FS has a Glob that's more efficient too. Even though there's no counting involved, you need to write a wrapper just to preserve access to the fast implementation:
func (fsys *countingFS) Glob(pattern string) ([]string, error) {
return fs.Glob(fsys.real, pattern)
}
So now we're at "easy to do correctly, a bit tedious to do completely". Having an incomplete wrapper may mean that the returned fs is not as efficient at certain operations, or it may mean that it doesn't support them at all.
At this point we might well think: that Glob method is going to get written a lot, let's write a wrapper-helper to help, along the lines of the code you posted. Suppose that wrapper-helper exists and consider what happens when we forget about a particular extension method, say ReadFile:
The same concerns happen regardless of example. Another popular example is a sub-directory file system, defined by the property:
fs.Subdir(fsys, prefix).Open(name) == fsys.Open(path.Join(prefix, name)).
This is an even clearer example of the wrapper-helper failing. Because every method on FS is likely to take a name as an argument, any method forwarded automatically by the wrapper-helper will bypass the prefix addition and therefore be buggy. Maybe that's a case where it would be clear that you would avoid the wrapper-helper from the start. But maybe not. And most examples are not that clear. They're more like the counting FS example, where the wrapper-helper seems like a good idea until you realize that there's almost always some future method that might be added that your wrapper would absolutely need to adjust the behavior of. When that happens, it's much better not to have the method at all than to have a buggy one provided by the wrapper.
That is, the wrapper-helper is a good idea _only_ if you are very careful when you use it the first time and it never changes. But as soon as you add change into the mix—as soon as you start doing software engineering—the wrapper-helper ends up being a one-time convenience with a fine print guarantee of future buggy behavior.
@muirdm Thanks also for bringing up writable files. It's true that if we later want to add writing, OpenFile would still return File, which would mean a type assertion at an OpenFile call site when opening for writing. That seems OK to me, especially if we paired OpenFile with a helper func Create:
type WriteFile interface { io.Writer; File }
func Create(fsys FS, name string) (WriteFile, error) { ... }
This seems fine to me in part because I expect that most programs that just want a File with a Write method would use the Create helper function.
The alternative is a lot more dummy methods that have to be implemented. (Of course, you could use a wrapper-helper, but that has other problems, as noted in my comment above.)
Another reason that it seems OK to make writing a little more work is that I think this (half-joke) is not too accurate:
However, in some sense writing is more fundamental because you have to write before you can read (only half joking).
On average, files are read more than they are written: there's no point at all writing a file that will never be read, which makes the average reads per write > 1.
Writing may or may not be more fundamental but is certainly more _specific_. It involves giving a concrete answer to every possible kind of metadata supported by the underlying file storage - modification times, uids, compression algorithm in a zip file, and so on. If you use an abstracted writer, you give up the ability to set those and get whatever defaults are provided. Maybe that's fine, but maybe not. And you only get one chance to set all these: the time when the file is written.
In contrast, reading is easier to make _abstract_. A particular reader may well not care about any particular detail like modification times, uids, or compression algorithms. That's fine: it can ignore them. And another reader that does care can look at info.ModTime() or even use info.Sys() to get implementation-specific additional detail. Both can exist simultaneously for the same file: the abstract reader and the more specific reader. Using an abstract reader does not preclude also using a specific reader.
In contrast, using an abstract writer _does_ preclude using a specific reader, because the specific bits the reader is interested in won't be there at all. So code writing files inevitably trends toward handling all the extra metadata even if there's only one reader that cares. That makes any attempt at an abstract writing API inevitably trend toward complexity, because you only get to write a file once, one way.
But since you get to read a file repeatedly, many different ways, and because most readers only care about a few details, there's much more opportunity for a simple abstract API.
All this is to say I think the design decision of limiting this proposal to read-only operations is a good one. In fact it was the key insight (by @robpike) that unlocked years of being stuck and let us make any progress defining this interface at all.
Writing files is where the dragons are.
The io/fs integration with stdlib, and embed.Files can all be prototyped in x/
It cannot, because stdlib cannot import x/.
Sorry I wasn't clear. I meant you can duplicate to x/ any parts of stdlib necessary to prototype embed.Files etc in x/.
Thanks for your other responses.
@rsc
On average, files are read more than they are written: there's no point at all writing a file that will never be read, which makes the average reads per write > 1.
I find this statement hard to understand. Many programs write files that they never consume. A trivial example is logging. But there are also format converters, content downloaders and compression utilities as the first few that come to mind.
@rsc Thank you for responding.
Regarding wrapping, you give a lot of good reasons to avoid passing through optional interfaces by default. I officially withdraw my example wrapper.
As you point out in https://github.com/golang/go/issues/41198#issuecomment-686618566, "opt-in" wrapping is not difficult because the wrapper can unconditionally implement the extension interface and fall back to the default implementation if the underlying object doesn't implement the extension interface. That insight makes me feel better about wrapping.
Looking forward, though, as more disjoint extension interfaces (i.e. extensions with no fallback implementation) are introduced, we may yet have to grapple with the combinatorial explosion of extension interfaces in order to maintain required functionality when wrapping. Do you see that problem as avoidable? Solvable? Inevitable?
Looking forward, though, as more disjoint extension interfaces (i.e. extensions with no fallback implementation) are introduced, we may yet have to grapple with the combinatorial explosion of extension interfaces in order to maintain required functionality when wrapping. Do you see that problem as avoidable? Solvable? Inevitable?
To magnify this point a bit, I think there are two classes of optional interfaces:
When replacing the optional efficiency methods, it doesn't make sense to do naive wrapping for all the reasons explained above by Russ. But I think that for the capability methods there need to be separate mechanisms for a) testing the existence of the capability and b) wrapping the capability. The problem for http.ResponseWriter is there is no way to wrap an http.Flusher (b) without providing a Flush method _which implies the existence of a Flush capability_ (a). That's what leads to the combinametric explosion where you need to have WrapperWithFlusher, WrapperWithPusher, and WrapperWithFlusherAndPusher, etc. For the efficiency methods, there's no real need or reason to distinguish b from a because you can just do the inefficient thing if the method isn't there. For the capability interfaces, it's totally different.
I think in introducing new optional interfaces, this difference needs to be explicitly marked and thought through to make sure we don't have more Pusher/Flusher-style unwrappable interfaces. AFAICT, so far the fs interfaces are basically just efficiency methods, but we need to be careful that if any capabilities interfaces are added, they are added with a way to test for their existence besides just checking for the methods.
Reading through the proposal again, these interfaces (which are just being mooted and not seriously planned AFAICT) are capabilities:
type RenameFS interface {
FS
Rename(oldpath, newpath string) error
}
func Rename(fsys FS, oldpath, newpath string) error {
if fsys, ok := fsys.(RenameFS); ok {
return fsys.Rename(oldpath, newpath)
}
return fmt.Errorf("rename %s %s: operation not supported", oldpath, newpath)
}
type OpenFileFS interface {
fs.FS
OpenFile(name string, flag int, perm os.FileMode) (fs.File, error)
}
func OpenFile(fsys FS, name string, flag int, perm os.FileMode) (fs.File, error) {
if fsys, ok := fsys.(OpenFileFS); ok {
return fsys.OpenFile(name, flag, perm)
}
if flag == os.O_RDONLY {
return fs.Open(name)
}
return fmt.Errorf("open %s: operation not supported", name)
}
Fortunately, because both capabilities interfaces involve returning an error, there's a logical way to signal that you have the method but cannot actually use it: return some canonical error value if the fs.FS you're wrapping doesn't have the required capability.
Sorry I wasn't clear. I meant you can duplicate to x/ any parts of stdlib necessary to prototype embed.Files etc in x/.
Correct me if I'm wrong, but the issue with x/ is that the build
command has to be aware of embed.Files
. That's why x/ is out, because we can't have build
importing from there. I agree with you that a prototype/trial phase would be nice, but it's just not on the table for this reason.
Unless you mean prototyping with a different Go tool binary a la go2go
but I don't really see the draw in that.
Aren't some x/ packages vendored into the core repo for use by tools? 1.16 could provide a prototype of embedding that way.
This issue is about generalizing the FS API we have. It is not about designing a whole new API with bells and whistles for cancellation, deadlines, hung network file systems, and so on.
I'm not going to participate in the discussion further, but Go has this infuriating tendency: a lot of very good thought out interfaces with attention to detail and among them some brain-dead bogosities that are not extensible and have to be worked around.
Examples include: the SQL layer, deadlines in TCP, Readers/Writers, concrete classes in the log package, testing.T not being an interface, etc. Some of them are relics of the past before the modern practices have evolved, like passing a context everywhere, some are just designs that gave no thought to extensibility and evolution.
And here we have another such example. The FS API that is meant for a very narrow use-case and will be difficult to extend in future. Seriously, just pass the freaking context everywhere and be done with it. It's mentioned in the Go's own style guide for FSM's sake: https://github.com/golang/go/wiki/CodeReviewComments#contexts
If you personally don't need it NOW doesn't mean that other people won't need it later.
Not because context is bad or should be ignored, but because the cleanest way to integrate context seems to be at the time of FS creation.
No, it's not. I already gave an example: a filesystem with deadlines on Read() operations. Each read might have its own deadline, independent of anything else.
No, it's not. I already gave an example: a filesystem with deadlines on Read() operations. Each read might have its own deadline, independent of anything else.
For what it's worth, the existing os.File
type already provides a SetDeadline
method (https://pkg.go.dev/os?tab=doc#File.SetDeadline). It doesn't seem like a big stretch to add that as an io/fs extension for file systems for which that makes sense.
As a bystander - @Cyberax I don't agree with the way you've written multiple comments in this thread. If you intend to participate in any issue in the future, please remember the code of conduct - be respectful and thoughtful.
For what it's worth:
Go has this infuriating tendency: a lot of very good thought out interfaces with attention to detail and among them some brain-dead bogosities that are not extensible and have to be worked around.
This turns out to be the case for just about everything: some parts are great, some are not. The tough part is that people disagree about which parts are which.
@muirdm and @carlmjohnson both commented about the "combinatorial explosion of extension interfaces."
I think we understand how to avoid that now:
Consider these optional interfaces: http.Hijacker, http.Flusher, http.Pusher.
The mistake in this trio is Flusher, which doesn't say anything about failure.
It's nice that Pusher defines a specific error, but I don't think that's actually too important. Hijacker is fine too.
The fact that you can't have a Flush method that fails is what leads to Wrapper vs WrapperWithFlusher. On the other hand, you should never need to distinguish Wrapper vs WrapperWithPusher, nor WrapperWithFlusher vs WrapperWithFlusherAndPusher: having a Push method does not guarantee it works.
A wrapper can provide any extension methods the author knows about and forward them along to the wrapped implementation if available and otherwise implement the right fallback: for optimizations, call the generic code (fs.ReadFile, etc), and for new functionality, return an error - that functionality is not available.
The biggest anti-pattern is code assuming that the presence of a method means it is guaranteed never to fail. As long as every extension method has an error result, that problem should be avoided.
When we think about file systems, we expect every operation to be able to fail - if nothing else, there's always the possibility of "permission denied". So making the Flusher mistake here of not specifying failure behavior is pretty unlikely.
As long as we are careful to make sure all extension methods have an error result, it seems to me there is no possibility of a combinatorial explosion of wrappers. Each wrapper should implement what it knows about, and it is OK to update the wrappers as more are added. (A reasonable initial objection is "but the wrappers will need to be updated", but the answer to that is my earlier comment, perhaps supplemented with the observation that if you are depending on code that has no possibility of being updated, you may have other problems as well.)
P.S. Straying from the general case a bit into HTTP-specific details, but I'd argue that a no-op Flush method should be considered a valid implementation of http.Flusher. Given all the middleboxes that might sit between the Go HTTP code and the actual client, even flushing to the server's wire is no guarantee that the bits arrive on the client's wire, so it seems weird to be picky about the Go layers but not all the other parts along the connection. I would suggest that wrappers just always provide a Flush method, and make it a no-op if the ResponseWriter being wrapped doesn't have a Flush method.
In the discussion above, I see the concerns about wrapping and context, both of which I've replied to above and seem to have gotten positive responses. Are there other concerns about moving this proposal forward?
(Again, remember that we're not going to redo the os.File interface as part of this. That's off the table.)
FWIW, I am satisfied that my concerns about wrapping are addressed for now.
I'd like to see #41198 and #41188 settled first
In the discussion above, I see the concerns about wrapping and context, both of which I've replied to above and seem to have gotten positive responses. Are there other concerns about moving this proposal forward?
What are exactly the concerns that prevent Context from being just added explicitly as an argument for all methods? I haven't seen the explanation anywhere.
OK, sorry, won't comment in this thread again....
@Cyberax Among other things the fact that the *os.File
interface doesn't support cancellation and as was made clear, that API isn't going to change at this point. Given that this is the primary implementation of the proposed interface, it's definitely incorrect to add a context - it would be pretending a capability that doesn't exist.
Quoting https://github.com/golang/go/issues/41190#issuecomment-690814116: "There are plenty of possible file system implementations that are reliable enough to maintain the illusion of an ordinary on-disk file system, and the generalized API in this proposal is targeted at those."
IOW, in-memory data mostly.
I'd guess the Go team hasn't heard enough reports of stalled file ops in deployed apps to consider that a problem. It's most commonly seen by desktop apps on LANs with fileservers, where Go has little impact. I'm working on a desktop app, so am concerned about it; @Cyberax and @tv42 are the only others I've seen raise the issue. See also #41054.
@networkimprov I want to see file I/O cancellation if and when the OS supports it, but I don't see that needing much more here than fsys := fsys.WithContext(ctx)
in the right spot. The work to plumb that up resides elsewhere.
@Cyberax
What are exactly the concerns that prevent Context from being just added explicitly as an argument for all methods? I haven't seen the explanation anywhere.
I believe I addressed that in https://github.com/golang/go/issues/41190#issuecomment-690819876.
Waiting on #41467, which seems to be converging.
Based on the discussion above and the fact that #41467 is now a likely accept, this too seems like a likely accept.
No change in consensus, and #41467 is accepted, so accepting.
Change https://golang.org/cl/243909 mentions this issue: testing/iotest: add TestReader to test readers
Change https://golang.org/cl/243905 mentions this issue: go/build: allow io/fs to depend on time
Change https://golang.org/cl/243900 mentions this issue: os: use keyed literals for PathError
Change https://golang.org/cl/243907 mentions this issue: all: update references to symbols moved from os to io/fs
Change https://golang.org/cl/243906 mentions this issue: io/fs: move FileInfo, FileMode, PathError, ErrInvalid, ... from os to io/fs
Change https://golang.org/cl/263142 mentions this issue: all: update references to symbols moved from io/ioutil to io
Change https://golang.org/cl/243911 mentions this issue: os: add DirFS
Change https://golang.org/cl/243908 mentions this issue: io/fs: add FS, File, ReadDirFile; move DirEntry from os
Change https://golang.org/cl/243910 mentions this issue: testing/fstest: new package for testing file system code
Change https://golang.org/cl/243912 mentions this issue: io/fs: add ReadFile and ReadFileFS
Change https://golang.org/cl/243914 mentions this issue: io/fs: add ReadDir and ReadDirFS
Change https://golang.org/cl/243913 mentions this issue: io/fs: add Stat and StatFS
Change https://golang.org/cl/243915 mentions this issue: io/fs: add Glob and GlobFS
Change https://golang.org/cl/243916 mentions this issue: io/fs: add Walk
Change https://golang.org/cl/243939 mentions this issue: net/http: add FS to convert fs.FS to FileSystem
Change https://golang.org/cl/243938 mentions this issue: html/template, text/template: add ParseFS
Change https://golang.org/cl/243937 mentions this issue: archive/zip: make Reader implement fs.FS
Change https://golang.org/cl/264057 mentions this issue: cmd/go: fix TestScript/test_cache_inputs
The docs for io/fs don’t current refer to package io. It’s obvious to me that fs.File.Read is an io.Reader, but it should be in the docs for new Gophers. Is there any reason to not just define File as this?—
type File interface {
Stat() (FileInfo, error)
io.ReadCloser
}
This is such a fundamental interface for the package that it seems fine as is, but perhaps that is an old-fashioned perspective.
Whether it has an io.Reader or not is less important than making sure new Gophers know where to read the documentation for the method, which are kind of long and subtle (eg don’t retain p).
If that's your argument, shouldn't it be,
type File interface {
Stat() (FileInfo, error)
io.Reader
io.Closer
}
? I would expect after that to see a request to shorten it by a line by reverting to your version. And on it will go. So I say leave it.
In any case I'm not sure how many times people need to be told about what Read means. And if it's important, why isn't that information decorating os.File.Read and every other place?
How about documentation instead of trickery? Dress up the comment on the struct a little.
A File provides access to a single file that implements the methods as documented in the io and os packages.
Even that seems needless to me. What other methods would they be?
FWIW, the docs for os.File Read and ReadAt just succinctly explain the method, but ReadFrom says “ReadFrom implements io.ReaderFrom.” I suspect the difference is just that Read predates io.Reader and ReaderFrom postdates it.
Change https://golang.org/cl/268417 mentions this issue: io/fs: fix reference to WalkFunc
Change https://golang.org/cl/273946 mentions this issue: all: update references to symbols moved from io/ioutil to io
In contrast, reading is easier to make abstract. A particular reader may well not care about any particular detail like modification times, uids, or compression algorithms. That's fine: it can ignore them. And another reader that does care can look at info.ModTime() or even use info.Sys() to get implementation-specific additional detail. Both can exist simultaneously for the same file: the abstract reader and the more specific reader. Using an abstract reader does not preclude also using a specific reader.
Do you have plan to provide way to update ModTime in embed file? Current implementation of ModTime return always zero-time. So http.FS does not set Last-Modified header. i.e. browser does not cache the contents. I know it's possible to implement with adding ETag but not easy way, I think.
IIRC ETag is already implemented in http.FS
FS support ETag but it seems only scan. FS does not set ETag. I wrote small example code. But server does not set ETag header.
@mattn Wrong issue. You are looking for #41191
Ah, u are right.
Most helpful comment
Based on the discussion above and the fact that #41467 is now a likely accept, this too seems like a likely accept.