Nim: readLine is slow (and doesn't match Python's definition of a "line")

Created on 21 Sep 2018 · 13 comments · Source: nim-lang/Nim

Perhaps this should be three GitHub issues? But maybe fixing one problem will fix the others.

Summary

  1. streams.readLine is slow
  2. streams.readLine chokes on null bytes
  3. streams.readLine doesn't produce the same "lines" as either Python or wc (this is more surprising than it is a "bug")

Details

I'm porting a Node project to Nim and noticed that the Nim version is much slower (roughly twice as slow) than the Node version. So I ran some comparisons here.

To test what's going on, I made 4 scripts that read through each line of a file and print out the number of lines found.

  1. Node
  2. Nim (using builtin streams.readLine)
  3. Nim (using custom-made readLine proc)
  4. Python

Each script was run against two files:

  1. Mostly binary data (with many newlines)
  2. Only string data

Here is my custom-made readLine:

proc readLine(s: Stream, line: var TaintedString, delimiter = '\n'): bool =
  line.string.setLen(0)
  result = true
  while true:
    let c = readChar(s)
    if c == delimiter:
      break
    elif c == '\0' and s.atEnd():
      # End of stream: stop without appending the sentinel '\0'.
      result = false
      break
    line.string.add(c)
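A counting harness around this proc might look like the following (a sketch; the actual benchmark scripts aren't shown in this issue, so the structure and names here are assumed):

```nim
import os, streams

# Hypothetical harness, not the author's exact script: count the lines
# in the file named on the command line using the custom readLine above.
# The explicit delimiter argument selects the custom overload.
when isMainModule:
  let filename = paramStr(1)
  var s = newFileStream(filename, fmRead)
  var line: TaintedString
  var count = 0
  while s.readLine(line, '\n'):
    inc count
  s.close()
  echo filename, " ", count
```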

Accuracy Results

  1. wc, Python and my custom-made readLine proc all agree on what a "line" is (i.e. they print out the same number for "number of lines")
  2. Node reports more lines than wc
  3. streams.readLine reports more lines than wc on text-only files (probably because it counts \L too)
  4. streams.readLine chokes if there are null bytes
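Point 4 can be reproduced with a small StringStream test (a sketch based on the report; the exact behaviour depends on the Nim version):

```nim
import streams

# A stream whose data contains an embedded NUL byte. A correct readLine
# should treat '\0' as ordinary data and report two lines here; per the
# report, the streams.readLine of that era stopped at the NUL instead.
var s = newStringStream("abc\0def\nsecond line\n")
var line = ""
var n = 0
while s.readLine(line):
  inc n
echo n
```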

Speed Results

| program | input file | time (s) | line count |
|---|---|---|---|
| wc | hugebin.txt | | 1444473 |
| reader.js | hugebin.txt | 2.18 | 1844188 |
| Nim (custom proc) | hugebin.txt | 5.13 | 1444473 |
| streams.readLine | hugebin.txt | 0 | 279 |
| reader.py | hugebin.txt | 0.5 | 1444473 |
| wc | hugestr.txt | | 2078017 |
| reader.js | hugestr.txt | 0.65 | 3095841 |
| Nim (custom proc) | hugestr.txt | 5.11 | 2078017 |
| streams.readLine | hugestr.txt | 5.01 | 3085918 |
| reader.py | hugestr.txt | 0.64 | 2078017 |

Labels: Performance, Stdlib


All 13 comments

Indeed streams.readLine is slow, so I would recommend avoiding streams.nim whenever possible. Files are already streams, and there is very little point in wrapping them in a FileStream. (The other area where Nim doesn't shine compared to the competition is the string handling in strutils.nim; that's all rather slow too.)
Change your builtinreader.nim to something like:

import os

iterator readFile*(stream: File): int =
  var line = ""
  while stream.readLine(line):
    yield line.len

iterator readFile*(s: string): int =
  for x in readFile(open(s)):
    yield x

var count = 0
let filename = paramStr(1)
for x in readFile(open(filename)):
  inc count
echo filename, " ", count

and you will already gain much speed (and the line-separating behaviour you seem to be looking for).

If you are looking for maximum speed for a line/word/byte counting task, try something like:

import os
import memfiles
import strutils

proc wc(fn: string): tuple[linec, wordc, bytec: int] =
  var mf = memfiles.open(fn)
  defer: memfiles.close(mf)
  var linec, wordc, bytec: int
  var inWord: bool
  for slice in memSlices(mf):
    inc(linec)
    let cs = cast[cstring](slice.data)
    let length = slice.size
    inc(bytec, length)
    for i in 0 ..< length:
      if cs[i] in WhiteSpace:
        if inWord:
          inc(wordc)
          inWord = false
      else:
        inWord = true
    if inWord:
      # Count a word that runs to the end of the line.
      inc(wordc)
      inWord = false
  result.linec = linec
  result.wordc = wordc
  # memSlices strips the newline from each slice, so add one byte per line.
  result.bytec = bytec + linec

when isMainModule:
  let fn = paramStr(1)
  let r = wc(fn)
  echo r.linec, " ", r.wordc, " ", r.bytec, " ", fn

This is probably faster than anything the competition has to offer...

@skilchen You are correct, but we shouldn't be forced to work around this. Make Nim better, don't make hacks.
Most higher level languages wrap low level File streams fairly performantly. Nim should as well. There is no reason not to.

That being said, I implemented the direct File implementation and ran it with @iffy's test here:
https://gist.github.com/rayman22201/6b26e1a14b5e4f04776103de976cd8d7

It is obviously much faster. @skilchen your mmap solution will probably be even faster, but it is a different approach / doesn't really solve the problem that Nim Streams should probably be fixed.

There is a nice buffered, fgets-based implementation of readLine for Files, but that implementation is not used for generic streams.
https://github.com/nim-lang/Nim/blob/2745111fc333d58f663480639aae5b2e0e0e1636/lib/system/sysio.nim#L144

fgets should at least work for File streams and Sockets, the two main uses for Streams, so why not use it in the stream module instead of the naive implementation?
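As a sketch of that suggestion (illustrative only, not the actual patch): inside lib/pure/streams.nim, where the FileStream's underlying File field is accessible, the stream's readLine could delegate to the buffered File implementation. Outside the stdlib the same effect can be had by skipping the stream wrapper entirely; the file name here is an assumption:

```nim
# Sketch only. Inside streams.nim, where FileStream's private File
# field (f) is visible, the override could be roughly:
#
#   proc fsReadLine(s: Stream, line: var string): bool =
#     readLine(FileStream(s).f, line)
#
# In user code, reading the File directly already uses the buffered,
# fgets-based readLine from sysio.nim:
var f = open("input.txt")   # assumed to exist
var line = ""
var count = 0
while f.readLine(line):
  inc count
close(f)
echo count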

Also see the IRC logs for discussion
here: https://irclogs.nim-lang.org/21-09-2018.html#17:02:24
here: https://irclogs.nim-lang.org/21-09-2018.html#17:47:37,
and here: https://irclogs.nim-lang.org/21-09-2018.html#19:45:28

we shouldn't be forced to work around this. Make Nim better, don't make hacks.

The original problem and the solutions presented here can be replaced with this simple code (no hacks, no custom-made procs), which is both correct and fast:

import os

var count = 0
let filename = paramStr(1)
for x in lines(filename):
  inc count
echo filename, " ", count

That said, I've submitted a PR which makes streams.readLine correct at least, even though it is still very slow.

Well, there is still room for improvement: Python is still about 30% faster at reading files.

Nim version: 1.2.0
Python version: 3.8.2

Lines read: 2,240,432

Nim runtime: 2.11s
Python runtime: 1.623s

Python code:

import time

most = time.time()
sor = 0

f = open("C:\\Temp\\myfile.xml", "r", encoding="utf8")
for x in f:
  sor = sor + 1
  if sor == 1:
    print(x)

print(f"{sor:,}")
print(time.time()-most)

Nim code:

import times

var sor = 0
let most = cpuTime()
let filename = "C:\\Temp\\myfile.xml"
for x in lines(filename):
  inc sor
  if sor == 1:
    echo x
echo filename, " ", sor
echo "Time taken: ", cpuTime() - most

@boumboum did you compile with -d:danger?

Which OS? And why can't you use an XML parser? ;-)

@Danil: I compiled with -d:release --opt:speed (without this, the runtime was twice as long: 5.29s).

@Araq: The test was on Windows 10. Both. At first I wanted to make the simplest test possible before going any further, and this ticket was originally opened about the same issue. Besides, if Nim is behind Python even for the simplest file walk-through, why should XML parsing be any faster? And I do have some issues with the XML parser (see below).

Because for me to change from Python to Nim (what I'd love to) there are several prerequisites which should be given:

  • Speed (of course) - Here I'm questioning that
  • XML handling (to be honest, I'm struggling with the parsexml documentation. I know it is because of my limited brain capacity, but I cannot find proper examples of how to parse deeply nested, big (300+ MiB) XML files; on Google, the same HTML-parsing sample is always repeated. On Python I'm using lxml, which is fast, delivers all the functionality I need, and has tons of examples on the net.)
  • SOAP client and server. I know nobody should use SOAP any more, but the enterprise world (especially the railway industry) still relies heavily on it. (On Python I use zeep and spyne.)
  • FTP and SFTP (I couldn't find any SFTP package for Nim. On Python I use pysftp.)
  • File system operations
  • LDAP support
  • SMTP handling
  • Performant database (MSSQL) access (like pyodbc on Python)

@boumboum you should use -d:danger (or more complex options) for measuring performance; not -d:release --opt:speed

@Araq can we reopen this? There is something actionable here:

  • let's keep this issue focused on readLine; other topics (e.g. XML) should be discussed in a separate issue.
  • readLine is not very good and should be improved. It's an important function to optimize since it's rather fundamental.
    I'm able to get a ~2X speedup using fgetln; OSes that have it should use it.
  • readLine should gain an optional keepEOL parameter to choose whether to keep the EOL (instead of assuming it's "\n"), which matters when you need exact data transfer.
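The keepEOL idea could look like the following sketch, written on top of the per-character Stream API (the name readLineEx and the details are illustrative, not a real stdlib proc):

```nim
import streams

# Hypothetical keepEol variant: when keepEol is true, the "\n"
# terminator is kept in the returned line, so concatenating all
# returned lines reproduces the input byte-for-byte.
proc readLineEx(s: Stream, line: var string, keepEol = false): bool =
  line.setLen(0)
  if s.atEnd: return false
  while not s.atEnd:
    let c = s.readChar()
    if c == '\n':
      if keepEol: line.add c
      return true
    line.add c
  # Last line without a trailing newline.
  result = true

var s = newStringStream("a\nb")
var line = ""
while s.readLineEx(line, keepEol = true):
  stdout.write line   # writes "a\n" then "b"
```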

A quick-and-dirty experiment showing a 2X speedup (with -d:danger):

import times
let file = "/tmp/z03.log"

proc prepareData() =
  let t = cpuTime()
  let n = 1_000_000
  let f = open(file, fmWrite)
  defer: close(f)
  for i in 0 ..< n:
    for j in 0 ..< 100:
      write(f, $i)
    write(f, "\n")
  echo "Time taken: ", cpuTime() - t

proc fgetln(stream: File, len: var csize_t): cstring {.importc.}

proc lines2[Fun](file: string, fun: Fun) =
  #[
  TODO:
  use iterator
  customize whether to keep EOL (if any)
  ]#
  let f = open(file)
  defer: close(f)
  var s2: string
  while true:
    var n = 0.csize_t
    let s = fgetln(f, n)
    #[
    TODO: proper error handling
    > The fgetln() function does not distinguish between end-of-file and error; the routines feof(3) and ferror(3) must be used to determine which occurred.
    ]#
    if s == nil: break
    s2.setLen n
    for i in 0 ..< n: s2[i] = s[i]
    fun(s2)

proc readData() =
  let n = 10
  let t = cpuTime()
  var m = 0
  for i in 0 ..< n:
    proc fun(s: string) =
      # echo (s,)
      m += s.len
    when defined(case1):
      for x in lines(file):
        fun(x)
    elif defined(case2):
      lines2(file, fun)
    else:
      static: doAssert false
  let t2 = cpuTime() - t
  echo (file, m, t2)

proc main() =
  prepareData()
  readData()
main()

@timotheecour: with -d:danger the response time comes closer to Python's. I can now achieve 1.813s, a 15% improvement over -d:release, still worse than Python but quite close.
IMHO you should put that into the documentation. The name "danger" is somewhat strange, but never mind :-)

If the performance of readLine is important to you, you might want to check out our FastStreams library:

https://github.com/status-im/nim-faststreams/blob/master/faststreams/textio.nim#L129

Being 10% slower than Python for an I/O-bound problem isn't a bug. PRs are welcome anyway.

@Araq: There was no blame at all from my side; I just said there is room for improvement. I personally think (and I may be wrong) that a highly optimized compiled app (-d:danger) should be faster than a completely unoptimized interpreted Python script.
Unfortunately I'm too new to Nim to propose a PR, but I think some kind of buffered IO could be a big help. IO is an important, basic part of every language, since it underpins a lot of other packages, so it should be performant. But again, I might be wrong.

@zah

I tried FastStreams and I'm getting these numbers (seconds):

case1 4.37   # stdlib version
case2 2.241  # my version
case3 26.51  # FastStreams

Please advise.

  • I also found a bug: for x in textio.lines(f, keepEol = true) goes on forever