It's not possible without race conditions.
See #7849
Avoiding race conditions is one of several reasons this is necessary. This flag may be combined with O_EXLOCK or O_NOFOLLOW for various uses. Plenty of other examples can be found online.
Opening a file with a lock atomically lets you easily modify files without race conditions or risk of data loss. The atomic_write shard would benefit from this.
Example of atomically rewriting a file without race conditions:
File.open "file", O_EXLOCK | ... do |old_file|
  buf = modify old_file.gets_to_end
  File.open "file.tmp", O_EXCL | O_EXLOCK | ... do |new_file|
    new_file.print buf
  end
  # rename takes paths, not File objects
  File.rename "file.tmp", "file"
end
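
For comparison, here is a minimal sketch of the closest equivalent with the API that exists today. It assumes a POSIX system, uses a hypothetical scratch file, and substitutes `upcase` for the `modify` step. The lock is taken after the open call returns, which is exactly the window that an `O_EXLOCK` open flag would close.

```crystal
# Hypothetical scratch file for the demonstration.
path = File.tempname("rewrite_demo", ".txt")
tmp_path = "#{path}.tmp"
File.write(path, "old contents")

File.open(path, "r") do |old_file|
  # The lock is acquired after open returns, so another process can slip in
  # between the two calls. An O_EXLOCK open flag would remove that window.
  old_file.flock_exclusive do
    updated = old_file.gets_to_end.upcase # stand-in for `modify`
    File.write(tmp_path, updated)
    File.rename(tmp_path, path) # atomic replace on POSIX file systems
  end
end

result = File.read(path)
File.delete(path)
puts result
```
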
Existing applications already do this. This feature is required for compatibility with other applications as File#flock_* has unavoidable race conditions. Existing flock functions are still useful for upgradable locks or multi reader/writers on an existing file.
It seems like all major targeted platforms already support compatible open modes.
Linux/MacOS/FreeBSD/OpenBSD all support the following modes:
Windows
Creating an OS abstraction layer isn't needed as the necessary functions are widely supported.
Crystal claims to catch errors at compile time.
File.new("foo", "bar")
is a runtime error. This already bit me in development where a rarely used code path had a typo when porting a ruby application and changing O_CREAT to "e" instead of "w".
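
A small sketch of that failure mode, using only the current string-mode API. The scratch path is hypothetical. The point is that the invalid mode compiles and is only rejected when the call runs.

```crystal
# Hypothetical scratch path for the demonstration.
demo_path = File.tempname("mode_typo", ".txt")

# "e" is not a valid mode string, yet this compiles: the string is only
# parsed when the call actually executes.
error = begin
  File.open(demo_path, "e") { }
  nil
rescue ex
  ex
end

puts "raised at runtime: #{error.class}" if error
```

With an enum-typed argument, a misspelled member name would be rejected by the compiler instead.
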
Existing crystal file modes and their POSIX mappings:
| Crystal | POSIX |
| --- | --- |
| r | O_RDONLY |
| r+ | O_RDWR |
| w | O_WRONLY O_CREAT O_TRUNC |
| w+ | O_RDWR O_CREAT O_TRUNC |
| a | O_WRONLY O_CREAT O_APPEND |
| a+ | O_RDWR O_CREAT O_APPEND |
| rb | same as without b |
| wb | same as without b |
| ab | same as without b |
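
The table above can be written down as a lookup. This is a sketch only: `OpenFlag` and `flags_for` are hypothetical names, not part of Crystal's standard library, and stripping every `b` is a simplification of how the real parser treats the binary suffix.

```crystal
# Hypothetical flags enum; member names mirror the POSIX constants.
@[Flags]
enum OpenFlag
  WRONLY
  RDWR
  CREAT
  TRUNC
  APPEND
end

# Maps an fopen-style mode string to the flag set from the table.
def flags_for(mode : String) : OpenFlag
  case mode.delete('b') # "rb", "wb", "ab" behave like "r", "w", "a"
  when "r"  then OpenFlag::None # O_RDONLY is 0 on POSIX
  when "r+" then OpenFlag::RDWR
  when "w"  then OpenFlag::WRONLY | OpenFlag::CREAT | OpenFlag::TRUNC
  when "w+" then OpenFlag::RDWR | OpenFlag::CREAT | OpenFlag::TRUNC
  when "a"  then OpenFlag::WRONLY | OpenFlag::CREAT | OpenFlag::APPEND
  when "a+" then OpenFlag::RDWR | OpenFlag::CREAT | OpenFlag::APPEND
  else
    raise ArgumentError.new("Invalid file open mode: '#{mode}'")
  end
end

puts flags_for("w+")
```

Nothing in this mapping can express "create without truncating" or "fail if the file already exists", which is the gap the rest of this issue is about.
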
Common uses that are missing:
Any of the above may be combined with either of the 2 locking options.
Any of the above may be combined with O_NOFOLLOW.
So that's ~14 different options * binary_or_not * 2_locking_options_or_not * nofollow_or_not for applications I've come across or have developed. Security software is rather specific in its locking and race condition needs. There may be more exotic combinations for other apps.
That's way too many letters.
| Language | Type | Example |
| --- | --- | --- |
| C# | almost POSIX flags | File.Open("foo", FileMode.Open) |
| Go | POSIX flags | OpenFile("foo", os.O_RDWR, 0644) |
| Java | multiple. almost POSIX flags. additional methods with common options. 3rd party packages that provide native API with POSIX flags | options = new OpenOption[] { WRITE, CREATE_NEW }; FileSystemProvider.newInputStream("foo", options) |
| Nim | almost POSIX flags | open("foo", fmWrite) |
| Python | POSIX flags with extra work | os.fdopen(os.open("foo", os.O_RDWR), 'rb+') |
| Ruby | POSIX fopen or POSIX flags | File.new("foo", File::RDWR) |
| Rust | almost POSIX method chaining | OpenOptions::new().write(true).create_new(true).open("foo"); |
Many of the almost POSIX flags are renamed flags with 1 to 1 or 2 to 1 mappings. CREATE_NEW is often O_CREAT | O_EXCL.
How a file handle is specified as read/write also changes between languages.
For most languages it's part of the file flags.
Java specifies reader/writer using different methods with a separate file mode param.
With rust it's part of the method chain.
With python it's specified twice (when using specific file modes).
Most of the languages above have an option to not truncate existing files when creating, something that crystal lacks. Most of them also have options to not create new files if one already exists (java, rust, any POSIX, maybe more).
Opening a file for append, read, create, excl becomes:
Maybe that didn't work out so well.
File.create_new.append.new "foo"
File.new "foo", "w", create_exclusive: true, flock_exclusive: true
File.new "foo", File::Write | File::Create | File::FlockExclusive
File.new "foo", File::RDWR | File::CREAT
POSIX flags seem like the obvious choice for documentation clarity and increasing the chances of someone finding an answer to why you can't seek+write with O_APPEND vs searching for File::CrystalCustomName.
POSIX flags have other benefits for applications that take advantage of more platform specific open flags like O_DSYNC for WAL or O_TMPFILE for automatic temp file cleanup as additional flags are easy to define in a shard or PR.
Most POSIX flags are completely or mostly portable to Windows, which often has compatibility layers for them.
Also 100% compatibility with ruby porting.
Simplifying the flags down to a few options makes writing secure software impossible. (Without nonstandard extensions)
Let's not add a minigame for coming up with words out of a combination of letters.
I vote for POSIX flags (though usually I hate posix-isms). Keep using the old approach for most use cases, and for those using exotic cases -- you will know what you're doing...
With 2 different signatures it's backwards compatible too, yay.
Thanks for the very detailed RFC. Here is my opinion:
keep current letters: they're fopen relics, limited and not very explicit; I'm not sure it's worth keeping them (maybe deprecate);
kwargs are simple, and are an acceptable API, but will lead to a bunch of boolean arguments (maybe not a problem) whose value will almost always be true (impacts readability):
File.open("x.log", append: true)
File.open("x.log", write: true, create: true, exclusive: true)
enums with explicit names (Create, ReadWrite) are nice with 1 argument thanks to symbols mapping to enums but ugly with many:
File.open("x.log", :append)
File.open("x.log", File::Options.flags(Write, Create, Exclusive))
File.open("x.log", File::Options::Write | File::Options::Create | File::Options::Exclusive)
We can have delegations to remove some noise:
File.open("x.log", File.options(Write, Create, Exclusive))
File.open("x.log", File::Write | File::Create | File::Exclusive)
Of course it would be nice if crystal was mapping piped symbols to enum flags, but I'm not sure it would be practical to implement (@asterite ?):
File.open("x.log", :write | :create | :exclusive)
enums with POSIX abbreviations (CREAT, RDWR) are easier to search for help, but they impact readability, and I foresee lots of typos.
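
For reference, a short sketch of how the two enum spellings above behave with a `@[Flags]` enum. `Options` here is a hypothetical stand-in for the proposed `File::Options`; only the `flags` macro, `|` and `includes?` are existing Crystal features.

```crystal
# Hypothetical stand-in for the proposed File::Options enum.
@[Flags]
enum Options
  Read
  Write
  Create
  Exclusive
end

# The flags macro and the | operator build the same value.
via_macro = Options.flags(Write, Create, Exclusive)
via_pipes = Options::Write | Options::Create | Options::Exclusive

puts via_macro == via_pipes
puts via_macro.includes?(Options::Create)
puts via_macro.includes?(Options::Read)
```

A misspelled member such as `Options::Creat` fails to compile, which is the type-safety argument for enums over mode strings.
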
> Of course it would be nice if crystal was mapping piped symbols to enum flags, but I'm not sure it would be practical to implement (@asterite ?):
> ```crystal
> File.open("x.log", :write | :create | :exclusive)
> ```

Can that happen when the type is specified as an enum? Or when the symbol is capitalized to match the enum?
File.open("x.log", :Write | :Create | :Exclusive)
Or maybe drop the : for the special case of flags type?
File.open("x.log", Write | Create | Exclusive)
> * enums with POSIX abbreviations (CREAT, RDWR) are easier to search for help, but they impact readability, and I foresee lots of typos.

Those unfamiliar with POSIX may have lots of typos. CREAT is almost automatic for me. <-- Don't weight this too heavily. IDK much.
Detailed documentation stating what each enum maps to would help with searching.
It seems like everyone likes moving to either POSIX or a custom named enum. Considering POSIX is already used behind the scenes to provide the current file modes should I make a PR or wait?
Could do `File.open("x.log", :write, :create, :exclusive)` with what we have now, but that would incur the cost of a runtime reduce on these `*args`.
Hmm, can't be worse than parsing a string, right?
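
A sketch of what that runtime reduce could look like. `OpenFlags` and `combine` are hypothetical names, and the call spells out the enum members in full rather than assuming symbols are accepted for splat arguments.

```crystal
# Hypothetical flags enum for illustration.
@[Flags]
enum OpenFlags
  Write
  Create
  Exclusive
end

# Folds a splat of individual flags into one value at runtime.
def combine(*flags : OpenFlags) : OpenFlags
  flags.reduce(OpenFlags::None) { |acc, flag| acc | flag }
end

combined = combine(OpenFlags::Write, OpenFlags::Create, OpenFlags::Exclusive)
puts combined
```

The fold is a handful of integer ORs per call, so the cost is small next to the open system call itself.
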
I say I like POSIX, but then again, I couldn't stand seeing CREAT everywhere 😂
The flags enum method is my preferred method, then deprecating the old fopen-like API.
So create a PR with custom named POSIX flags? `Read`, `Write`, `Create`, `Exclusive`, etc.
I think someone should list all possible combinations and then we can aim for a better API. I actually really like Ruby's way with a String: it's not type-safe, but it's short and easy to write. It's also a bit intuitive ("r" for read, "w" for write, "a" for append, etc.).
I wouldn't mind having an API with enums or similar, but first:
Here's a partial list. I may have missed some. Also, this doesn't include any platform specific options.
Uhh I was gonna suggest a decision tree on these but that's unlikely to clarify things.
I really like the suggestion of "three enums". Or, again, perhaps just the clarity that they might bring in understanding the possible combinations.
@didactic-drunk Thank you!
Another thing worth looking at is like how it's done in Go: https://golang.org/pkg/os/#pkg-constants
It seems only one of read, write or read-write can be specified, and then the rest are or-ed. I can't see that working with an enum. Three enums might be good...
But thanks much for the list. I'm sure there's at least some way to shape it in a way that can be insightful to look at.
I'm thinking at least like "flags that never appear together" and "flags that are completely independent"
Here are some additional platform specific optimization options that can be safely ignored on other platforms that don't support them. This would provide a performance boost on the supported platforms and gracefully degrade on all others by defining the flag as 0.
| Flag | Platforms | Notes |
| --- | --- | --- |
| DIRECT | POSIX | Useful for databases or applications that do their own caching. |
| NOATIME | Linux | Fewer writes. |
| SEQUENTIAL | Windows | Optimize buffering for sequential reading. It may be possible to emulate this on some other platforms. |
| Several more |
Platform specific flags that may possibly be emulated or provide useful performance improvements over fsync when supported.
All of the flags above may be combined with any of the flags listed in the prior post.
There are a variety of other platform specific flags that may be useful but were not mentioned, some of which I already have plans on using. With POSIX flags either extensions could provide them or they could be marked as `:nodoc:`. Checking for support is as easy as a macro of either `defined` or `flag == 0`.
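
A sketch of the "define it as 0" idea. The `NOATIME` constant is hypothetical and not part of Crystal's standard library. The value `0o1000000` is what `O_NOATIME` is on x86-64 Linux, which is an assumption that would need checking for any other target. `base_flags` stands in for `O_RDWR`.

```crystal
# Hypothetical constant; 0o1000000 is assumed to be O_NOATIME on x86-64 Linux.
{% if flag?(:linux) && flag?(:x86_64) %}
  NOATIME = 0o1000000
{% else %}
  NOATIME = 0 # unsupported here: OR-ing in 0 changes nothing
{% end %}

base_flags = 0o2 # stand-in for O_RDWR
flags = base_flags | NOATIME

# Support can be checked by comparing against 0.
if NOATIME == 0
  puts "NOATIME unsupported on this target; flags unchanged"
else
  puts "NOATIME supported"
end
```
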
Basically every call requires one of `READ | WRITE | RDWR`, except when it doesn't for more rare uses.
`CREAT`, `APPEND` and `TRUNC` may be used with any other options.
`EXCL` optionally pairs with `CREAT`. This is the only real exclusive dependency.
`SHLOCK` and `EXLOCK` are mutually exclusive but may be combined with any other flags, including `RDONLY` by itself.
All other flags can be combined with any other combinations with the exception of the sync options. Normally you would choose one. I have no idea what happens if you supply more than one.
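
These rules are few enough to check in code. The sketch below is an assumption about how such a check might look: `OpenFlags` and `notes_for` are hypothetical names, and `EXCL` without `CREAT` is reported as a note rather than an error because corner cases exist.

```crystal
# Hypothetical flags enum covering the options discussed above.
@[Flags]
enum OpenFlags
  Read
  Write
  Create
  Append
  Truncate
  Exclusive
  SharedLock
  ExclusiveLock
end

# Returns human-readable notes about questionable combinations.
def notes_for(flags : OpenFlags) : Array(String)
  notes = [] of String
  if flags.includes?(OpenFlags::SharedLock) && flags.includes?(OpenFlags::ExclusiveLock)
    notes << "SHLOCK and EXLOCK are mutually exclusive"
  end
  if flags.includes?(OpenFlags::Exclusive) && !flags.includes?(OpenFlags::Create)
    notes << "EXCL normally pairs with CREAT"
  end
  notes
end

puts notes_for(OpenFlags::SharedLock | OpenFlags::ExclusiveLock)
puts notes_for(OpenFlags.flags(Write, Create, Exclusive)).empty?
```
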
The number of combinations is:
`DIRECT`, `SEQUENTIAL`, etc. are added, not including platform specific options.
The C# open options look like they were created to support the Windows `CreateFile*` functions, which specify additional parameters. (Created for Microsoft's platform.)
The go constants are straight POSIX.
I have a Ruby application I'm trying to convert that uses a no `READ | WRITE` open option, which is the reason I'm pushing for POSIX flags. That way I can define the flag myself if the main Crystal project won't accept extra or hidden flags to get my application working. Otherwise I need to implement my own open method, digging into some Crystal internals along the way.
@asterite You suggested 3 arguments. There may be a need for 4 based on reading the mono source code. Or maybe only 2. Or 3. Or 1.
# `Open()` contains 3 arguments.
FileMode [Append, Create, CreateNew, OpenOrCreate, Truncate]
FileAccess [Read, Write, ReadWrite]
FileShare [None, Read, Write, ReadWrite] # Windows specific.
# `Create` has a different argument and seems like it contains the optional extras.
# It's a subset of the file options available to `CreateFile`.
FileOptions [Sequential, RandomAccess, WriteThrough, ...] # Some of these map to POSIX functions.
Most of the options above are subsets of CreateFile arguments.
C# has separate arguments for Create/Append/Truncate/etc and Read/Write/ReadWrite. In C#'s POSIX implementation the 2 arguments are OR'd and passed to `open`. If you want to split the arguments it would make `File.new` calls longer and slightly less readable.
# 1 argument
File.new "foo", File::Options.flags(Write, Create, Exclusive)
# 2 arguments
File.new "foo", File::Mode::Write, File::Access.flags(Create, Exclusive)
`FileShare` is an optional argument for mandatory locking inherited from DOS. (See "The Tar Pit: Backwards Compatibility".)
It has no equivalent on POSIX. C# uses a default `FileShare` based on the passed in `FileAccess` to provide the expected application behavior on Windows. It does nothing on POSIX. A similar approach could be used in Crystal by providing an optional `FileShare` argument for applications that require specific Windows compatibility, almost if not entirely identical to the C# implementation, with the C# defaults when the argument is missing.
This would not affect any of the file flags listed in prior posts or help split them into multiple arguments, as `FileShare` is its own completely separate beast.
What choice you make for the default behavior without an explicit share mode comes down to two main choices:
C# chose 2. I think that goes against Crystal's goal of a unified abstract API between OSes.
C# uses Windows mandatory locking defaults on Windows and POSIX behavior on POSIX, as POSIX doesn't implement mandatory file locking. They could have used one of the 3-4 advisory locking APIs as an approximation but instead chose OS norms.
Consistent and saner behavior would be to never lock a file by default unless requested. Numerous articles outline the insanity of mandatory file locking and its problems.
Ultimately I don't care which approach is used since I'm developing POSIX software using features that Windows doesn't have and has no means to emulate.
If `FileMode` and `FileAccess` are split and `FileShare` is its own thing, then where do the other options go?
If you think they fit with Read/Write or Create/Truncate you're done and can stop reading. Otherwise a 4th option is needed.
# 4 arguments
File.new "foo", File::Mode.flags(Read, Write), File::Access.flags(Create, Exclusive), File::Options.flags(Sequential, SymNoFollow), File::Share::LockExclusive
# 1 argument
File.new "foo", File::Options.flags(Read, Write, Create, Exclusive, Sequential, SymNoFollow, LockExclusive)
Feel free to mix and match enums in order to make two or three argument version and see how they look.
What's easier for me to read is 1 argument.
For C# they created an API with 3 arguments for Windows compatibility and still don't have a place to put extra options. One argument is mandatory. The other 2 have defaults or are derived from the first if not supplied.
On POSIX they end up OR'd and passed as a single value to `open()`.
Adding enums adds noise and naming problems.
You could try to come up with more descriptive names but that comes down to taste. No name will meet everyone's expectations of where the enums should be split. That means mandatory documentation reading for everyone who didn't come up with the names, trying to figure out what to put where.
Do you still want to copy an API designed for Windows compatibility?
Writing this issue is 20x longer than the code to implement it. Tell me what goes where and I'll make the PR. If you left it to me I'd do POSIX flags just like Go and Ruby. (Which means Ruby programs would port without extra work.)
Mode (`Read`, `Write`, `ReadWrite`) and Flags enums based off POSIX, but with some thought towards how they map to Windows, would be my preference. They also need good defaults. I'm fine with one enum too; hopefully the compiler will recognise `:read | :write` as a literal enum in the future.
New PR #8011.
In fact, we could have File.open(name : String, read : Bool = true, write : Bool = false, *flags : File::Flags)
What do people think of this API?
Well, when writing, am I supposed to always pass `read: false, write: true` like a peasant?
In that case perhaps these two could work
File.open(name : String, *flags : File::Flags) # means read
File.open(name : String, *, write : Bool = false, read : Bool = false, *flags : File::Flags)
And I would be confused what read+write means alone, namely whether it would first truncate the file or not. And then the default initial position (beginning / end) can become uncertain as well.
I assumed the `String` modes were here to stay. Are they?
Does append get its own bool? Why or why not? What about truncate?
The most common operations in Crystal are `[Read]` and `[Write | Create | Trunc]` or possibly `[Write, CreateNew]`. Other common modes when there are more options available are probably `[Read, Write]`, `[Read, Write, Create]`, `[Write, Create, Append]`, `[Write, Create, Append, Trunc]`. This is not an exhaustive list.
With the above proposed API:
File.open(name, read: true)
File.open(name, write: true, File::Mode.flags(Create, Truncate))
File.open(name, write: true, File::Mode.flags(CreateNew))
File.open(name, read: true, write: true)
File.open(name, read: true, write: true, File::Mode::Create)
File.open(name, write: true, File::Mode.flags(Create, Append))
File.open(name, write: true, File::Mode.flags(Create, Append, Truncate))
With a single param API:
File.open(name, File::Mode.flags(Read))
File.open(name, File::Mode.flags(Write, Create, Truncate))
File.open(name, File::Mode.flags(Write, CreateNew))
File.open(name, File::Mode.flags(Read, Write))
File.open(name, File::Mode.flags(Read, Write, Create))
File.open(name, File::Mode.flags(Write, Create, Append))
File.open(name, File::Mode.flags(Write, Create, Append, Truncate))
If they can be shortened using a hypothetical %f:
File.open(name, %f(Read))
File.open(name, %f(Write, Create, Truncate))
File.open(name, %f(Write, CreateNew))
File.open(name, %f(Read, Write))
File.open(name, %f(Read, Write, Create))
File.open(name, %f(Write, Create, Append))
File.open(name, %f(Write, Create, Append, Truncate))
It doesn't seem like individual bool params come out ahead in terms of clarity or typing except for `[Read]`, which is the default and often not supplied.
What are the use cases of `WRONLY`? It seems very, very rare that anyone will use `read: false`. Just do:
File.open("foo", write: true, :append, :create)
for `a+`. I want to avoid a `:readwrite` flag, it's ugly.
It'd just be `read` and `write`, which are named arguments since they are the most basic permissions for the file; the rest are just optional flags.
> I assumed the `String` modes were here to stay. Are they?

For now.

> Does append get its own bool? Why or why not? What about truncate?

Because they're optional flags, but read, write, or both have to be specified.

> the rest are just optional flags.

Kind of. `Write` is often paired with `Create`, `CreateNew` and maybe `Append` and maybe `Truncate`. Since `Flags` will almost always be specified (`Read` is the default) I think the provided examples show clarity is improved by keeping them together.
Just look at the Crystal code base. Compare how many `open` calls use `Write` without `Create*`. `Create*` is the common case, which means supplying flags. But is it `Create` or `CreateNew`? That seems to go back and forth. A single param wouldn't work for that either.
Even if one or both of `[Read, Write]` was required I don't see the improvement, especially when reading another person's code or doing security auditing. I'd want them right next to each other and easily parseable for code audits.
> Because they're optional flags, but read, write, or both _have_ to be specified.

No they don't.
File.open "lockfile", File::Mode.flags(Create) do |file|
file.flock_exclusive do
# ...
end
end
This works on my #8011 branch right now.
From the GNU Man page:
A file access mode of zero is permissible; it allows no operations that do input or output to the file, but does allow other operations such as fchmod.
There are also corner cases for use of `O_EXCL` without `O_CREAT`.
> What are the use cases of `WRONLY`? It seems very, very rare that anyone will use `read: false`.

When I was working with secure log services `WRONLY` was paired with `APPEND`, allowing multiple processes to append to a file without reading it or overwriting each other. At most a process could attempt to append extra data but couldn't erase or read anything written.
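
A minimal sketch of that behaviour with today's string modes, where `"a"` maps to `O_WRONLY | O_CREAT | O_APPEND` per the table earlier in this issue. Two handles in one process stand in for two processes, and the scratch path is hypothetical.

```crystal
# Hypothetical scratch log file for the demonstration.
log_path = File.tempname("append_demo", ".log")

writer_a = File.open(log_path, "a")
writer_b = File.open(log_path, "a")

# Each write lands at the current end of file, so neither overwrites the other.
writer_a.puts "from a"
writer_a.flush
writer_b.puts "from b"
writer_b.flush

# The handle is write-only, so an attempt to read through it raises.
read_failed = begin
  writer_a.gets
  false
rescue IO::Error
  true
end

writer_a.close
writer_b.close

contents = File.read(log_path)
File.delete(log_path)
puts contents
puts "read rejected: #{read_failed}"
```
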
The program was setuid(loguser) to save files to an inaccessible location and used with a pipe.
> Just do:
> File.open("foo", write: true, :append, :create)
> for `a+`. I want to avoid a `:readwrite` flag, it's ugly.
> It'd just be `read` and `write`, which are named arguments since they are the most basic permissions for the file; the rest are just optional flags.

How is
File.open("foo", write: true, :append, :create)
Better than
File.open("foo", :write, :append, :create)
Having individual arguments for read and write is only extra typing.
To be honest, the reason is that I think `read: true` should be the default, and I don't want a `:no_read` flag. So that means that every call would have to be `File.open("foo", :read)`. But I guess that's fine.
`Append` was updated to imply `Write`. I haven't seen a single use of `Append` in > 20 years without `Write`, even for odd platform specific cases.
The API could be:
File.open("foo") # Read.
File.open("foo", :write)
File.open("foo", :readwrite) # Or :read, :write. Don't care.
File.open("foo", :append)
File.open("foo", :read, :append) # Outlier.
Error checking is a reason to not grant read for everything. A file opened for writing raises when read from. That catches programmer error from using the wrong variable name, memory corruption of a pointer or file descriptor value, or a malicious act when used with trusted programs.
I don't think `:read` should be implied for everything. Many of Crystal's own uses of "w" don't read from the file.
Does shorthand enum notation work with multiple enums? My examples don't use a shorthand notation because of varying requests to split `open` into multiple parameters.
Anticipated exponential change requests are the reasons I haven't touched locking or operating system specific arguments. I'm trying to get the basic features through then add the rest when there's less to argue over.
I think at this point, the various API proposals need to be collected up, with usage examples, and then voted on.
> shorthand enum notation work with multiple enums?

Depends on the type signature...
The reason for having multiple enums is to enforce one of `:read`, `:write` or `:rdwr`. Since we've decided that's not necessary, one flags enum is fine.
I'll leave this for 24h before working on a more robust implementation, and I'd appreciate :+1: on this comment to indicate people are happy with the API being:
`File.open(name : String)` for read-only
`File.open(name : String, *flags : File::Flags)` for other uses
What do you plan to do with the existing `File::Flags` that serves a completely different purpose?
class File
  # Represents the various behaviour-altering flags which can be set on files.
  # Not all flags will be supported on all platforms.
  @[Flags]
  enum Flags : UInt8
    SetUser
    SetGroup
    Sticky
  end
end
@didactic-drunk whoops, it'll be called `File::OpenFlags` then.
May I suggest moving or removing `File::Flags`, as its usefulness is dubious. Maybe rename to `File::Permissions::Flags` if there's a reason to keep it.
Moved to #8026.
Whilst we're deprecating things: what's the opinion on `File.new` and `File.open`? I'd rather have a protected `File.new(path, fd)`, then use `File.open` with and without a block, since `File.new` implies you're creating a file, when you're often not.
So I'd suggest deprecating all `File.new` overloads except the platform-specific one, and then having `File.open(filename, mode(s), *, permissions, encoding, invalid)` with and without a block.
Final layout:
| Method | Current | PR |
| --- | --- | --- |
| `File.new(path, fd, blocking, encoding, invalid)` | private | public? `:nodoc:`? |
| `File.{new,open}(filename, mode : String, perm, encoding, invalid)` | public | deprecated, use `File.open(filename, File::Mode)` |
| `File.open(filename, mode : String, perm, encoding, invalid, &block)` | public | deprecated, use `File.open(filename, File::Mode, &block)` |
| `File.open(filename, {mode,*modes} : File::Mode, *, perm, encoding, invalid)` | | public |
| `File.open(filename, {mode,*modes} : File::Mode, *, perm, encoding, invalid, &block)` | | public |
I see `Any.new` as creating a new object, not creating what the object refers to.
If you change the verb syntax on one object should the rest of the language change? `Array.alloc`, `Zip.open`, `Mysql.connect`, `Blockchain.idk`.
Doesn't it create more cognitive load by having exceptions? All other objects use `.new` except `File`.
`.open` is generally used when opening an (OS-level) resource, and `.new` is used everywhere else. I'd like to see `.new` or `.open` though, not both.
Just for reference: many `Socket` classes have both `.new` and `.open`, where the latter is called with a block. This might not necessarily apply identically to `File`, but maybe this is also an option worth considering.
On all accounts, I'd speak against `.new` with a block. `.new` is a constructor method, but when called with a block, it doesn't return an instance. Instead the instance is yielded to the block. For this, `.open` seems better. The non-yielding variant could be `.new`, which would match existing APIs (Socket), and this is IMO a clever solution. Yet, it's an additional method name. So `.open` for everything is fine, too.
> Just for reference: many `Socket` classes have both `.new` and `.open`, where the latter is called with a block. This might not necessarily apply identically to `File`, but maybe this is also an option worth considering.

This is the approach I've taken mostly based on convention.
> On all accounts, I'd speak against `.new` with a block. `.new` is a constructor method, but when called with a block, it doesn't return an instance. Instead the instance is yielded to the block.

Isn't that what `open` does? `File.open("test") { 2 } => 2`
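
To make the distinction concrete, here is a small sketch using the existing `File.open` overloads and a hypothetical scratch file. The block form returns the block's value and closes the file, while the non-block form returns the `File` instance.

```crystal
# Hypothetical scratch file for the demonstration.
path = File.tempname("open_demo", ".txt")
File.write(path, "hello")

# Block form: the file is yielded, closed afterwards, and the value of
# the block is what the call returns.
size = File.open(path) { |file| file.gets_to_end.size }

# Non-block form: the File instance itself is returned and the caller
# is responsible for closing it.
file = File.open(path)
was_open = !file.closed?
file.close

File.delete(path)
puts size
puts was_open
```
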