reading directory with files names using buffer as path does not work.
const fs = require('fs');
fs.readdir( Buffer.from( "."),{withFileTypes:true,encoding:"buffer"},(e,d)=>console.log("dir",d));
It works correct in version 12.14.0 but after upgrade it stopped to work.
using it without files types works:
// fs.readdir( Buffer.from( "."),{withFileTypes:false,encoding:"buffer"},(e,d)=>console.log("dir",d));
every time
get directory entries with file types i.e. array of Dirent objects
error message:
Uncaught:
TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received an instance of Buffer
at validateString (internal/validators.js:117:11)
at Object.join (path.js:1039:7)
at getDirents (internal/fs/utils.js:159:39)
at FSReqCallback.req.oncomplete (fs.js:858:7) {
code: 'ERR_INVALID_ARG_TYPE'
}
This bug is degradation because it doesn't exists in version 12.14.0 and present in versions 12.16.1 and 12.16.3 at least.
I use buffer instead of string because working with my old archives with non utf-8 names.
This seems like a real bug, although it would be helpful to know what kind of directory entries there are in your directory since that will determine whether an error occurs or not.
Sorry.
After you comment I've checked once again.
It doesn't work just on my UDF blue ray mounted file systems.
It work correct in other cases.
than it is possible not a degradation.
but in any case the code throws "uncaught" exception and my program fails.
Well, as I mentioned, it seems like a real bug for sure – but it would have been nice to have a reproduction available as a basis for a regression test. :+1:
Basically, the issue is that this line:
uses path.join, which can only handle strings, but path and/or name can be Buffer instances as well (so 4 combinations in total – string, string + Buffer, string + string, Buffer + Buffer, Buffer). This happens a few times in that file, all of which probably need to be updated – it might make sense to put that in a separate join function that takes any of the 4 argument combinations.
@vasyanewlifehere If you like, you are welcome to work on a PR?
I could help with this issue.
(UPD): will start working on it tommorow if there will be no answer from @vasyanewlifehere
@vasyanewlifehere are you planning to work on PR?
it would have been nice to have a reproduction available as a basis for a regression test
The following lines of code work if a dirent type is UV_DIRENT_UNKNOWN (which is crossplatform equivalent of DT_UNKNOWN).
https://github.com/nodejs/node/blob/8a6fab02adab2de05f6e864847f96b0924be0840/lib/internal/fs/utils.js#L188
https://github.com/nodejs/node/blob/8a6fab02adab2de05f6e864847f96b0924be0840/lib/internal/fs/utils.js#L217
As to create a reproduction we need to force OS to return DT_UNKNOWN. I spent an hour on researching, and still don't know how to achive this on macOS.
Currently, only some filesystems (among them: Btrfs, ext2, ext3, and ext4) have full support for returning the file type in d_type. All applications must properly handle a return of DT_UNKNOWN.
The paragraph above is from this manpage. It got me thinking, that for reporoducing the bug on CI we will need to mount such a FS that will cause OS to return UV_DIRENT_UNKNOWN.
Alternatively, according to this and that, we could find such a system where DT_UNKNOWN if undefined, so libuv will always return UV_DIRENT_UNKNOWN
All that sounds really tough). @addaleax Are there other ways to force OS to return DT_UNKNOWN?
Instead of having a reproduction, we could create unit tests for getDirents and getDirent. In the tests we might mock OS calls.
In code it will look something like
function getDirent(path, name, type, callback, loadFs = lazyLoadFs) {
if (typeof callback === 'function') {
if (type === UV_DIRENT_UNKNOWN) {
loadFs().lstat(pathModule.join(path, name), (err, stats) => {
// rest of the code
}
Tests
function callback(err) {
assert.strictlyEqual(err, null);
}
function lazyLoadFsMock() {
// return something
}
getDirent(Buffer('.'), 'foo', 'bar', callback, lazyLoadFsMock)
UPD: I created a reproduction on macOS. I definetely won't work on Windows. The test doesn't fail on Ubuntu.
'use strict';
const tmpdir = require('../common/tmpdir');
const fs = require('fs');
const common = require('../common');
const assert = require('assert');
fs.readdir(
Buffer.from("/dev"),
{withFileTypes: true, encoding: "buffer"},
common.mustCall((error,d) => {
assert.strictEqual(error, null)
})
)
@addaleax will it be OK if add unit tests only?
@addaleax Can we assume that name and path args are correct and buffers are in 'utf-8' when we get string + Buffer or Buffer + string or Buffer + Buffer combination?
If yes, then the join function will look like that
path: string, -> path
path: string, name: Buffer -> Buffer.concat([Buffer.from(path + path.sep)), name])
path: string, name: string -> path.join(string, string)
path: Buffer, name: Buffer -> Buffet.concat([path, Buffer.from(path.sep), name])
path: Buffer, name: string -> Buffer.concat([path, Buffer.from(path.sep + name)])
path: Buffer -> path
If not, it seems that we need to implement something similar to path.join for buffers.
> path.join('123/////', '////123')
'123/123'
> path.join('123/////', '////123/////')
'123/123/'
Considereing the fact that a buffer may contain data in different encodings, the algorythm will be quite complex. Also, it seems to me that we can't stringify buffers and use path.join since we don't know a Buffer's encoding.
I'd like to work on this issue. Has it been fixed?
@kkz13250 It's not been fixed. I started to work on this issue and created a reproduction here.
I think I know how to do it. We could use options.encoding field as to perform convertions from Buffer to string and vice versa. will continue working on PR ASAP.
@shackjjj Thanks for the ping – no, unfortunately we can’t assume that Buffers contain UTF-8, otherwise we wouldn’t need to have a Buffer variant in the fs API. We can, however, assume that / is the path separator, so joining two Buffer paths in this case here boils down to Buffer.concat([buf1, slashBuffer, buf2]).
Most helpful comment
Well, as I mentioned, it seems like a real bug for sure – but it would have been nice to have a reproduction available as a basis for a regression test. :+1:
Basically, the issue is that this line:
https://github.com/nodejs/node/blob/8a6fab02adab2de05f6e864847f96b0924be0840/lib/internal/fs/utils.js#L188
uses
path.join, which can only handle strings, butpathand/ornamecan beBufferinstances as well (so 4 combinations in total –string, string+Buffer, string+string, Buffer+Buffer, Buffer). This happens a few times in that file, all of which probably need to be updated – it might make sense to put that in a separatejoinfunction that takes any of the 4 argument combinations.@vasyanewlifehere If you like, you are welcome to work on a PR?