When I try to parse a large local UTF-8-encoded file, the job takes about 2 or 3 hours, so I need to monitor its progress. Because it is a UTF-8 file, I can't just read it and convert each chunk with chunk.toString("utf8"), since a chunk boundary can split a multi-byte character and corrupt it. So I call .setEncoding('utf8') first, but then chunk.length no longer gives the correct byte position in the file.
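A minimal demonstration of the corruption described above: if a read chunk ends in the middle of a multi-byte UTF-8 sequence, decoding each piece separately produces replacement characters instead of the original text.

```javascript
// "中" encodes to three UTF-8 bytes: e4 b8 ad
const buf = Buffer.from("中", "utf8");

// decoding the whole buffer at once is fine
console.log(buf.toString("utf8")); // 中

// but if a chunk boundary falls inside the character, decoding each
// piece separately mangles it into replacement characters
const broken = buf.slice(0, 2).toString("utf8") + buf.slice(2).toString("utf8");
console.log(broken === "中"); // false
```

This is why .setEncoding('utf8') (which buffers partial sequences across chunks) is needed in the first place.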
I have read some related discussions, but none of them address my need
(e.g., https://github.com/nodejs/node-v0.x-archive/issues/1527 http://stackoverflow.com/questions/13932967/how-do-i-do-random-access-reads-from-large-files-using-node-js )
If we had an "ftell" function like C's (see http://www.gnu.org/software/libc/manual/html_node/File-Positioning.html ), it would be possible to monitor progress. So, could we add an "ftell" function to the file stream?
FWIW, there is already an undocumented readStream.pos which gets updated if you pass a numeric start value in your fs.ReadStream options (e.g. fs.createReadStream('foo', { start: 0 })). I'm not sure whether merely documenting this would be the ideal fix or not...
Thank you. But I find that .pos is not accurate. For example, my file is 2372126 bytes (containing CJK Unified Ideographs), yet at the end .pos points to 2490368, which is bigger than the file size. I'm wondering how that can happen.
Perhaps we should move the line at https://github.com/nodejs/node/blob/master/lib/fs.js#L1902 :
this.pos += toRead;
into the onread(er, bytesRead) callback, and change it to:
this.pos += bytesRead;
Probably a dumb question: Can't you pipe it through a stream that counts how many bytes it saw and then convert the output stream to UTF8?
const { Transform } = require("stream");

class CounterTransform extends Transform { // yes, this works with ES6 classes
  constructor(options) {
    super(options); // super() must run before touching `this`
    this.bytes = 0;
  }
  _transform(chunk, encoding, cb) {
    this.bytes += chunk.length; // chunk is a Buffer here, so this counts bytes
    this.push(chunk);
    cb();
  }
}
Or, just keep track of chunk.length and aggregate it in your stream?
(I realize this might not give you the accurate number of UTF8 chars but you'll get an estimate of how far into the file you are)
You can use StringDecoder from the string_decoder module (which is what the stream uses internally) to convert chunks to strings while tracking how much of the file has been read so far.
Thank you, benjamingr, calvinmetcalf. Well, I have tried another approach, Buffer.byteLength(), and it works for me, although I'm still not satisfied.
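The Buffer.byteLength() approach works in the other direction: once .setEncoding('utf8') is set, 'data' events deliver strings, and Buffer.byteLength() recovers how many file bytes each string corresponds to. A minimal sketch, with made-up strings standing in for decoded chunks:

```javascript
// after .setEncoding('utf8'), chunk.length counts characters, not bytes;
// Buffer.byteLength() gives back the UTF-8 byte count of each string
let position = 0;
for (const s of ["hello ", "中文"]) { // stand-ins for decoded 'data' chunks
  position += Buffer.byteLength(s, "utf8");
}
console.log(position); // 12: six ASCII bytes plus two 3-byte CJK characters
```

One caveat that may explain the dissatisfaction: the internal decoder can hold back the trailing bytes of an incomplete character, so the byte total of the decoded strings can briefly lag the true file position mid-stream.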
https://github.com/kanasimi/CeJS/blob/6b1bc65a00bcec5840df579ffb6c520a83b19cb6/application/net/wiki.js#L5874
Given that there are plenty of ways this can be implemented in userland, I'm closing this. We can reopen it if folks feel that's necessary.