Nim: Async readAll in httpclient produces garbled output.

Created on 17 Sep 2018  Â·  28Comments  Â·  Source: nim-lang/Nim

When I run this code it crashes with json error about 25% of the time:

import httpclient, asyncdispatch, json

proc test*() {.async.} =
  var client = newAsyncHttpClient()
  var response = await client.get("https://raw.githubusercontent.com/treeform/chrono/master/tzdata/dstchanges.json")
  echo parseJson(await response.body)

waitFor test()

Json url endpoint is static, but if you print out the json body it looks like chunks are missing.

If you run the script do you get the same error?

I think some thing is wrong with readAll + async in https://github.com/nim-lang/Nim/blob/master/lib/pure/httpclient.nim#L196

Nim Compiler Version 0.18.1 [Windows: amd64]
Compiled at 2018-09-07
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: e81fe6d32f485f721fccd87e922addc5715a5fdd
active boot switches: -d:release
Regression Showstopper Stdlib

Most helpful comment

I think we should revert this to last working point like @ehmry said. We can't seem to figure out how to fix it. We can fix the streaming stuff later when the right person comes along...

Most people don't want to stream HTTP get requests, but everyone wants to get the correct data back.

All 28 comments

The missing data can happen at the end, but usually it appears to be in the middle, maybe the data is just out of order, its hard to tell.

I am trying to look into: if the async loop does read callbacks out of order it might case the data to be out of order.

It also fails on mac:

Nim Compiler Version 0.18.1 [MacOSX: amd64]
Compiled at 2018-09-07
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: 90025e242db1f4b2f2bca18

On mac it fails 100% of the time.

On mac it fails 100% of the time.

confirmed for devel branch
but it works without a problem on osx with 0.18.0; so it looks like some regression

@dm1try Have you tried running it over and over. It might have a small chance of failing?

I've just tried 50 times with repeat 50 ./app and there is no any fail =) on devel it fails 100% of the time. I can do some bisect a little bit later

Cool. Thank you for verifying!

This regression was introduced by:

Author: Emery Hemingway <[email protected]>
Date:   Tue Aug 7 17:36:56 2018 +0200

  AsyncHttpClient: return from requests before body completion

  Store the body completion future at the client and wait for it to
  complete before issuing additional requests. This allows the body
  FutureStream reader to drain the stream and read buffers to be freed
  asynchronously.

  Fix #8109

It seems clear that

  when client is AsyncHttpClient:
    if not client.parseBodyFut.isNil:
      # let the current operation finish before making another request
      await client.parseBodyFut
      client.parseBodyFut = nil

is at the wrong place in httpclient. If you perform only a single request (which you always should, but this might be controversial...) then this await client.parseBodyFut will not be executed.

If you move the block when client is AsyncHttpClient... below the assignment to result (ie. to the end of requestAux), then your issue should go away.

I don't know if this move reintroduces #8109 which @ehmry tried to fix.

Wow thank you for the detective work!

I think it mostly works with smaller bodies, this is why the bug has gone unnoticed. When the body is large and requires a ton of reads this happens.

Do you have a patch that fixes it?

@dom96 did the patch that fixed the prev bug https://github.com/nim-lang/Nim/issues/8109 and created this one https://github.com/nim-lang/Nim/pull/8559/files . I hope he can help us.

It's been a long day so I don't quite understand the cause of this as explained by @skilchen.

But the bug that was fixed has a repro so you should be able to verify whatever solution you come up with.

I don't quite understand the cause of this

+1

If you move the block when client is AsyncHttpClient...

this action will revert (in the weird way :) ) the previous work; in result you will have same blocking behaviour for parsing a body; so this solution is not so good

interesting thing that this is only reproducible with enabled ssl; any thoughts?(some limit values?)

maybe we could add something like:

proc responseCompleted*(client: AsyncHttpClient) {.async.} = 
  if not client.parseBodyFut.isNil:
    # let the client parse the complete body
    await client.parseBodyFut
    client.parseBodyFut = nil

to httpclient.nim. Then you would have to call:

await client.responseCompleted()

before doing anything with the response.body:

import httpclient, asyncdispatch, json

proc test*() {.async.} =
  var client = newAsyncHttpClient()
  var response = await client.get("https://raw.githubusercontent.com/treeform/chrono/master/tzdata/dstchanges.json")
  await client.responseCompleted()
  echo parseJson(await response.body)

waitFor test()

Something like this (adapted from #8109) would then continue to work:

import asyncdispatch
import httpclient
import strutils

proc test*() {.async.} =
  var client = newAsyncHttpClient()
  let r = await client.get("http://stream.meetup.com/2/rsvps")
  for i in 1..10:
    let b = await r.bodyStream.read()
    echo intToStr(i, 2), " ", b

waitFor test()

My bad, I say revert the commit for #8109 and nobody has to figure out what went wrong and we keep the old behavior, in the case of processing bodies asynchronously (the low-memory, large body situation) procedures may be provided to GET into streams prepared by the caller, see 72106d8a5f291b94701a50349ed9d580df14cf8b.

Could you please check, if adding

await sleepAsync(1)

after Line 1002 in httpclient.nim

fixes the garbling issue #8994 without reintroducing #8109?

It seems to me that the problem in theFutureStream implementation.
complete method assumes that he calls some "client" callback(?): https://github.com/nim-lang/Nim/blob/723d838559b26f87363f0530e03803b93e244f45/lib/pure/asyncstreams.nim#L30-L34
while "under pressure"(when a bunch of read callbacks still in a dispatcher queue) it calls some of scheduled read callback; in result the client code misses this peace of stream data(in our case this is 4KB of HTTP response body)
in the context of current issue FutureStream.complete proc is called by httpclient here

there is some controversial guard in read callback implementation: https://github.com/nim-lang/Nim/blob/723d838559b26f87363f0530e03803b93e244f45/lib/pure/asyncstreams.nim#L96-L97
which assumes that this callback can be called many times(?). while the comment above it says
https://github.com/nim-lang/Nim/blob/723d838559b26f87363f0530e03803b93e244f45/lib/pure/asyncstreams.nim#L84-L85
anyway by removing the guard that mentioned above you can check that some of read callbacks are called twice

Error: unhandled exception: An attempt was made to complete a Future more than once. Details:
  Future ID: 3802

# first completion
Traceback (most recent call last)
hey.nim(25)              hey
asyncdispatch.nim(1658)  waitFor
asyncdispatch.nim(1515)  poll
asyncdispatch.nim(1281)  runOnce
asyncdispatch.nim(189)   processPendingCallbacks
asyncmacro.nim(36)       parseBody_continue
httpclient.nim(1008)     parseBodyIter
asyncstreams.nim(35)     complete
asyncstreams.nim(47)     :anonymous
asyncstreams.nim(99)     :anonymous
asyncfutures.nim(137)    complete

# second completion
Traceback (most recent call last)
hey.nim(25)              hey
asyncdispatch.nim(1658)  waitFor
asyncdispatch.nim(1515)  poll
asyncdispatch.nim(1281)  runOnce
asyncdispatch.nim(189)   processPendingCallbacks
asyncstreams.nim(47)     :anonymous
asyncstreams.nim(99)     :anonymous
asyncfutures.nim(137)    complete

one call from FutureStream#complete, one call from pending dispatcher queue.

Is there any work-around at this time?

If not, was the previous bug a showstopper? This bug loses data non-deterministically for large-ish async requests, so it is definitely a showstopper for my usage of 0.19.0.

@mcintyrechris Sorry about this, but we intend to release 0.19.2 with the most pressing regressions fixed.

PR #9003 is a fix for this.

@mcintyrechris you could try if my proposal from above works as a temporary workaround until PR #9003 has been merged or another solution has been implemented:

add

await sleepAsync(1)

after Line 1002 in httpclient.nim so that you will have:

  when client is AsyncHttpClient:
    await sleepAsync(1)
    client.bodyStream.complete()
  else:
    client.bodyStream.setPosition(0)

@skilchen I gave it a shot (the await sleepAsync(1) line), but it did not fix the issue, I still got only a partial JSON blob for several requests. I also added a "sleeping".debugEcho just below the line above, to make sure it was actually using my code (I use choosenim so I wasn't sure I was editing in the right file location), and confirmed it was sleeping.

@Araq No worries, I can still use 0.18.0 (only disappointed I can't use the new features), I was just worried that with all the recent "newbies" to Nim from all the recent HN posts, I didn't want anyone to get frustrated right away and give up on Nim (this language is great for my usage). Personally, I think a non-deterministic data-loss bug is way worse than completely breaking "infinite streams" because I imagine more people will be using non-infinite requests most of the time, which means most people are possibly getting this bug and may not know it. For example, when donwloading a CSV blob, I am missing several lines but there is no way to know it except that my unit tests count the lines for a known request and tell me that I'm missing several. I do, however, understand that bugs are an expected part of new releases, so thanks to you and the team for working on them so quickly!

Yeah. This is pretty bad, I should have reverted the PR that introduced it.

We should release 0.19.2 asap to alleviate this.

On Fri, 28 Sep 2018, 09:32 Chris McIntyre, notifications@github.com wrote:

@skilchen https://github.com/skilchen I gave it a shot (the await
sleepAsync(1) line), but it did not fix the issue, I still got only a
partial JSON blob for several requests. I also added a
"sleeping".debugEcho just below the line above, to make sure it was
actually using my code (I use choosenim so I wasn't sure I was editing in
the right file location), and confirmed it was sleeping.

@Araq https://github.com/Araq No worries, I can still use 0.18.0, I was
just worried that with all the recent "newbies" to Nim from all the recent
HN posts, I didn't want anyone to get frustrated right away and give up on
Nim (this language is great for my usage). Personally, I think a
non-deterministic data-loss bug is way worse than completely breaking
"infinite streams" because I imagine more people will be using non-infinite
requests most of the time, which means most people are possibly getting
this bug and may not know it. For example, when donwloading a CSV blob, I
am missing several lines but there is no way to know it except that my unit
tests count the lines for a known request and tell me that I'm missing
several.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/nim-lang/Nim/issues/8994#issuecomment-425492641, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAPDe6d4L2geD4OYgQrkxgX9FXv5fPWHks5ufk8LgaJpZM4WsP0c
.

The right thing to do, would be to come up with a solution similar to Python's http.client, i.e. expose the body as a readable stream as soon as the headers from the request have been received. Then it would be possible to read from that stream in whatever mode the client prefers, e.g. per line, per fixed blocksize or all at once.

The right thing to do, would be to come up with a solution similar to Python's http.client, i.e. expose the body as a readable stream as soon as the headers from the request have been received.

Unless I'm terribly misunderstanding you, that's exactly what we have right now. The Response object contains a FutureStream which represents the body and you can read from it whenever you want. The only problem with this is that the stream gets filled even if nobody is reading from it.

I think we should revert this to last working point like @ehmry said. We can't seem to figure out how to fix it. We can fix the streaming stuff later when the right person comes along...

Most people don't want to stream HTTP get requests, but everyone wants to get the correct data back.

There is a fix at fd539d22f17c06ddd53c7fe8a05a72349923a577 / #9003 that uses a Deque for serializing read futures.

@treeform waiting for a right person is a good proposal =)

anyway I politely disagree with the suggestion about reverting. #9024 shows that the issue lies in the FutureStream implementation. It means that any client code that uses it will have the same problem.
ex. downloadFile will be broken. so the code based on the snippet that provided by you

import httpclient, asyncdispatch, json

proc test*() {.async.} =
  var client = newAsyncHttpClient()
  await client.downloadFile("https://raw.githubusercontent.com/treeform/chrono/master/tzdata/dstchanges.json", "my.json")
  echo parseJson(readFile("my.json"))

waitFor test()

will lead to the same error even on version 0.18

as a short-term solution, I suggest #9159 (or #9003 mentioned above many times)
in the long term, async core might be updated with some synchronization primitives(as example, some of them already implemented in asyncdispatch2)

Either I'm missing something fundamental or all of the attempted fixes so far managed to work around this bug instead of actually fixing it. See my fix here: https://github.com/nim-lang/Nim/pull/9183.

Wow your fix is amazing. Everything works.

Was this page helpful?
0 / 5 - 0 ratings