The program below would use 830 MB of memory (tested on both macOS and Linux). It consists of a simple async upload of a 100MB buffer to any server using TLS/SSL.
import asyncdispatch
import httpclient
import random

proc inefficientCopiesAllOver() {.async.} =
  # Generate 100 MB of random garbage
  echo "Generating random garbage ..."
  randomize()
  let rpool = int('A') .. int('z')
  var randomBuff = newString(100*1024*1024)
  for i in 0 .. (randomBuff.len - 1):
    randomBuff[i] = char rand(rpool)
  echo "Garbage ready ... sending to Google"

  # Send it to Google
  let client = newAsyncHttpClient("", proxy = nil)
  client.headers = newHttpHeaders({
    "Accept": "application/json",
    "Content-Type": "multipart/related; boundary=zgitdjizofpqamg",
    "Authorization": "Bearer Zfjfjii.fizjfomdmgfjefjzeifmf.fjzef",
    "Content-Length": $randomBuff.len
  })
  let url = "https://storage.googleapis.com/upload/storage/v1/b/myBucket/o?uploadType=multipart"
  let response = await client.request(url, "POST", randomBuff)
  echo await response.body

# Run async
waitFor inefficientCopiesAllOver()
shallow can help. And it does, but it has to be done manually, and it's not far-reaching; the effect is not enough. Looking at the httpclient module, it was not hard to spot. Any version of Nim is affected by this.
COW strings would help. But the stdlib could also use streams instead of strings...
Giving the problem serious consideration: it is a fact that this is a problem.
Yes, indeed. It is a problem. No excuses from my side.
I'm not excusing this either. FutureStreams are the solution here, but they remain sub-par. IIRC there was a PR somewhere which allowed FutureStreams to be used in the context of an HTTP POST.
COW strings would help
What is that, and how can I use it?
Guys, I really do want to help in this situation, but I have no clue where to go. I've asked on the chat where in the source code the logic responsible for detecting and inferring moves lives. But nobody seemed to care, and to this day I still don't know the answer to that question.
You know, I consider myself a newcomer, even though I was already involved in Nim and using it in production on OpenWRT routers in 2014 and 2015. As a newcomer, I'm interested in the set of advertised features of the language. But then the language falls short of my expectations because of some issues, and recently these have been really deep in the language.
Please give me directions on how I can help: URLs, documentation, ...
I've asked on the chat where in the source code the logic responsible for detecting and inferring moves lives. But nobody seemed to care, and to this day I still don't know the answer to that question.
Move inference is done inside compiler / injectdestructors.
Please give me directions on how I can help: URLs, documentation, ...
Well you're already on IRC, please be patient and forgive me when I'm not online 24h a day.
It is not new that httpclient uses a lot of memory. I opened issue 10825 myself back when it was version 0.19.4. It has not been fixed to this day: I ran the test just now and the example consumed 350 MB of memory.
And I remember that I even searched all issues, open and closed, on httpclient. It seems that it is a known problem, but that it is not given due attention. I believe that problems with the stdlib httpclient should be a high priority, given that it should be used a lot.
I recently had a case of high memory usage with httpclient, but with a synchronous client. I haven't had time to report it yet.
I believe that problems with the stdlib httpclient should be a high priority, given that it should be used a lot.
@rockcavera You're right, and it looks as if it is a deeper rooted problem in how strings are handled in async and closure contexts. So, the root of this problem (and yours) is string.
If I understand well, when you start an async proc or a closure, it needs to capture surrounding variables, and for that Nim seems to make deep copies. This is where the problem starts to grow: when you use async, typically every call in the chain that you make (or the stdlib makes) will also be async, so you end up with many deep copies of the same data filling up memory quickly.
While this is safe, I think it should be the responsibility of the programmer, not the compiler. With move, sink and shallow, the programmer can be explicit about it. Nim has taken good things from the likes of Golang, and for this particular problem Nim could also treat strings like Go does in any async context: it is the programmer's responsibility to ensure he doesn't change the original buffer while it is being used in deeper async procs, or, if he wishes to change it, he should be allowed to. This is the efficient route.
Another thing: it seems that shallow doesn't fully have the desired effect, so I suppose string slicing is another problem. I don't know if the stdlib makes use of string.substr() when slicing, but I think slicing a shallow string should produce a view without any copy instead of making one. I'll be surprised if that's not already the case.
Until this string handling problem is solved, Nim is not going to be a serious option for doing production async programs.
Personally I disagree that this is up to the programmer to solve. Even if all these strings are deep copied, they should eventually be released. The GC should do work to do that, if it's not then there is an issue with it.
They are released, but by then the memory consumption has grown considerably. A GC is not a magic wand: GC activity costs CPU cycles, and invoking the GC after every big allocation is too expensive, not to mention that usually there are stale pointers on the stack, meaning that the GC sees an imprecise picture and cannot free the objects quickly enough. Maybe so much effort was put into --gc:arc development for a reason, you know.
The GC should do work to do that, if it's not then there is an issue with it.
@dom96 Leaving the compiler to do a bunch of deep copies and letting the GC collect them afterwards is exactly what we're trying to solve here; it is better to not make the deep copies in the first place. It's not as if deep copies are the only way to go.
Let me give you some figures and a use case for why this needs solving. Say you get yourself a VPS with 1 GB RAM; in theory, it should be able to serve 9 people asynchronously where each person allocates 100 MB. At this time, that same server can only serve 1 person at any given moment, because it will currently allocate 830 MB, not even leaving enough for a second person.
Cost-wise, using Nim to serve 9 people would require a server with 8 GB RAM in this simple bug-report scenario. This needs addressing because people will be very disappointed in Nim: it promises efficiency, people want to adopt it, and then it doesn't live up to the expectation. For a language to be used and to become widespread, it also needs to stop disappointing.
I'm trying to impress "Efficiency" upon Nim here; I hope you still believe in that. You already know that Go has a GC, and I was referring to how things are treated in Go. Let me try to explain with this simple pseudo code:
connectedSocket := getConnectedAndWritableSocketSomehow()

// pseudo: pretend Write is attached to the socket
write := func(buffPassedByRef []byte) {
	// this is a rough idea of what's going to happen in a Write()
	waitForSocketToBeWritable()       // this could take a while depending on context
	time.Sleep(10 * time.Millisecond) // simulates delay in readiness
	writeToSocket(connectedSocket, buffPassedByRef)
}

myBuff := []byte("643210")
go write(myBuff)
myBuff[0] = '5'
time.Sleep(10 * time.Millisecond)
// "543210" will be written to the socket instead of the original "643210"
As you can see here, the []byte is passed by reference to the Write() function, and as long as the write hasn't occurred yet, the changes that I make to myBuff are what will be written to the socket. I would say that is a sensible and good assumption.
This is not the case in Nim because it will make a copy of myBuff as I call Write(), and changes that I make in myBuff will be invisible to Write().
And what I was saying is: it should be up to the programmer to make this copy if he wants to, not the compiler. This has nothing to do with the GC, because in either case myBuff will be GCed. The difference is that in Go I had only 1 buffer with 2 references, whereas in Nim I had 2 references pointing to 2 completely different things (and there will be many more deep copies as the calls go deeper).
Interesting. How does Go's writeToSocket function work? Does it also take strings by reference? Is this just a convention across all of Go's stdlib?
How do other languages handle this? It seems to me that the problem here is that Nim's default behaviour with regards to strings and closures is inefficient. You're implying that we need these new shallow, sink, move operators to solve it, but I am skeptical. After all, Python and many other GC'd and async languages somehow manage to do this right. Why is it up to the programmer to solve when this seems like a language problem?
Well we can easily use ref array in Nim too, or a stream. You don't have to use strings everywhere which then don't support laziness anyway. Why can't the stdlib use streams?
Interesting. How does Go's writeToSocket function work? Does it also take strings by reference? Is this just a convention across all of Go's stdlib?
I believe so ... []byte slices are passed by reference every time. If you need to make a copy, you have to call copy(...). Moreover, slicing the original slice will also give you a reference into the same slice, and any modification you make in the sub-slice will be reflected in the original.
Well we can easily use ref array in Nim too, or a stream. You don't have to use strings everywhere which then don't support laziness anyway
Yes, but I guess that all of the stdlib uses string, passing it around, which doesn't seem to be a problem in synchronous programs. I guess the amount of work to change that would be huge, but probably necessary to make Nim fully efficient.
Why is it up to the programmer to solve when this seems like a language problem?
I agree that it is a problem in the language. The way I see this is that it needs to be solved in two ways:
I already tried to see how this could be fixed, and I know it can be fixed at the compiler level, at least, partially. Unfortunately, I didn't have enough context information. This is what I tried:
In compiler/lambdalifting.nim::rawClosureCreation(), around line 619, I tried to check for (local.kind == skParam or local.kind == skLet) and fieldAccess.typ.kind == tyString and do a nkFastAsgn. skLet never matched, always skParam. I also tried --gc:markAndSweep and putting echo there, but debugging natively doesn't help either because it's not possible to see the values of variables or their types. And printf didn't help much.

Any progress on this?
Not that I'm aware of. Someone that's passionate about this needs to push for better async streams and make them first class citizens across the stdlib. Maybe you'd be interested in doing this?
I did try, and I faced blocking issues:
But that said, I am interested.
Yeah, I can sympathise with that. I'm happy to help explain the code as much as I can, but I also have mouths to feed :)
I believe you can get away with by simply understanding how Futures work (not just in Nim, although their specific implementation in Nim will help you). Once you have a grasp of them you should be able to design a better FutureStream.
What I thought of doing is disregard async/await completely, just use an existing pure C lib for the async stuff, and do things manually. But that's kind of dumb. So I guess that until this is solved, I'll keep using Golang.
Now that you're working full-time for Facebook (I kind of heard that in your Nim talk a few days ago), I'm guessing the chances of this getting solved are even slimmer, as @Araq will be extremely overloaded. :sigh:
Now that you're working full-time for Facebook (I kind of heard that in your Nim talk a few days ago)
For what it's worth, this isn't a recent thing, I've been at FB for the past (almost) two years.
If you don't have time to fix these kinds of issues then yes, use Golang.
Well, I don't think beginners like me would have any chance of fixing this. Given that the problem requires much deeper knowledge of the language, only core team members would have a chance of being able to move this forward.
And I personally think it should be a priority issue, given the fact that it just makes async basically broken. As before, the fact that async is currently being used in jester and some other libs isn't an indication that it's fine.
A few months back, even some core team members were not feeling confident to tackle this problem, so count me out; I'm far from being a capable core contributor, as I said before, just a beginner. Oh, and I also talked to @mratsim, and their team even decided to completely skip Nim's async and built their own library instead AFAIK; what a shame, and that is a good indication that this is not a problem for beginners to be able to solve.
Oh, and I also talked to @mratsim, and their team even decided to completely skip Nim's async and built their own library instead AFAIK; what a shame, and that is a good indication that this is not a problem for beginners to be able to solve.
To be fair we also have/had memory issues in Chronos due to closure iterators (the backbone of both async implementations) introducing cycles and timers holding resources.
However as we are gearing up for audit and for a very long running and high profile application that will very likely be DOS-ed we are pouring significant effort to optimize memory usage and fix them as recurrent crashes would be a red flag: