While working on something I discovered this bug: https://github.com/crystal-lang-tools/scry/pull/116, where we were getting a Too many open files
error. This is because we were using the iterator version of File.each_line
method.
It appears this method does not close the file, nor does there appear to be an obvious way to close a file that is opened via this way since an Iterator is returned.
Ah yeah, this is the same as I hit in #5623.
I vote for remove these methods - make people tie them into a specific File
instance which is (hopefully) tied to a File.open
block which mitigates the problem.
@RX14 makes sense for the iterator version, but why the yielding one, since it's useful and doesn't leak? EDIT: my bad, nvm.
@Sija i never said to remove the yielding one...
@asterite relevant but I don't think it interesects here. The problem with File.each_line
is that you're creating a File
instance that always leaks and you cannot close. Hence we must remove the method.
Yes, you can close it. The GC will close it once you are done with it. That's the whole point of a GC. And why I linked the issue: once that file isn't used, but the GC didn't run, and you try to create a new file and you run out of descriptors, run the GC to close those files.
Ruby works fine and has exactly the same method.
@asterite I consider the GC closing FDs as a crutch, and you can't manually close the iterator you have to wait for the GC.
Maybe I should reconsider... and maybe we should just run the GC before we report EMFILE
and copy ruby. But i don't like it.
The GC won't clean up and close the file handle if it's still referenced somewhere. Shouldn't rely on that.
Could this be fixed by adding a flag to File#each_line
to tell it to close itself when it reach the end?
Why don't we just provide a close method on file related iterators and return an iterator that calls that for class methods when at the end? Probably could even be a generic one. I think it's rather unlikely that you don't consume those to the end or even want to auto close it if you do do not.
@jhass but I think the difference between
File.each_line(foo).map { ... }
and
File.open(foo) do |file|
file.each_line.map { ... }
end
isn't much of a deal to type and the latter is so much clearer about the scope. So I'd prefer the latter instead of fixing the former.
I like having these available as single line or chain. That would become almost impossible then without heavy &.
usage.
File.each_line(foo)
vs
File.open(foo) &.each_line
They are fairly close. I also think that File.each_line
can be removed.
For what it adds, it doesn't worth the hassle.
each_line should be in an instance of a file
Most helpful comment
@asterite I consider the GC closing FDs as a crutch, and you can't manually close the iterator you have to wait for the GC.
Maybe I should reconsider... and maybe we should just run the GC before we report
EMFILE
and copy ruby. But i don't like it.