Use Ubuntu 16.04. Install mono-devel, mono-utils, mono-profiler, and whatever else is necessary to have the compiler and profiler. I doubt this is specific to the Linux version, however.
The program below simply loops over a set of popular websites and gets their homepages using HttpWebRequest and HttpWebResponse. Compile and run it under the profiler, e.g.
mono --gc=sgen --profile=log:heapshot=50gc TestMemGrow.exe
mprof-report output.mlpd > out.txt
grep System.Byte out.txt
You should see a list of allocations of System.Byte[] that grows increasingly larger. You can leave the program running and re-execute those two commands, to see the continued growth of memory usage as the profiler adds more heap shots to its data file.
I put in the GC calls to see what would happen, and to see if the problem was only in the large object space. It appears not to be limited to those objects. If you take the GC calls out, memory usage fluctuates quite a bit up and down as the collections take place, and possibly grows more rapidly. The line of code that actually reads the response stream and writes the web page data to a file is currently commented out. If you comment that back in, memory use grows more rapidly than without it.
CODE:
```C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Runtime;
using System.Threading;
namespace TestMemGrow
{
    class Program
    {
        static void Main()
        {
            List<string> websites = new List<string> {
"http://www.marketwatch.com", "http://www.bn.com", "http://www.newegg.com", "http://www.wsj.com", "http://www.arstechnica.com", "http://www.slashdot.org",
"http://www.mediaite.com", "http://www.disqus.com", "http://www.twitter.com", "http://www.snap.com", "http://www.facebook.com", "http://www.usps.com",
"http://www.ups.com", "http://www.techcrunch.com", "http://www.oracle.com", "http://www.java.com", "http://www.apple.com", "http://www.microsoft.com",
"http://www.ibm.com", "http://www.dell.com", "http://www.asus.com", "http://www.gigabyte.com", "http://www.intel.com", "http://www.crucial.com",
"http://www.westerndigital.com", "http://www.samsung.com", "http://www.sandisk.com", "http://www.brother.com", "http://www.hp.com",
"http://www.msn.com", "http://www.disney.com", "http://www.nintendo.com", "http://www.twitter.com", "http://www.youtube.com",
"http://www.instagram.com", "http://www.linkedin.com", "http://www.wordpress.org", "http://www.pinterest.com", "http://www.wikipedia.org",
"http://www.blogspot.com", "http://www.adobe.com", "http://www.tumblr.com", "http://www.vimeo.com", "http://www.flickr.com", "http://www.godaddy.com",
"http://www.buydomains.com", "http://www.reddit.com", "http://www.w3.org","http://www.nytimes.com", "http://www.statcounter.com",
"http://www.weebly.com","http://www.blogger.com","http://www.github.com", "http://www.jimdo.com", "http://www.myspace.com",
"http://www.mozilla.org", "http://www.gravatar.com", "http://www.theguardian.com", "http://www.bluehost.com", "http://www.cnn.com", "http://www.foxnews.com",
"http://www.msnbc.com", "http://www.wix.com", "http://www.paypal.com","http://www.stumbleupon.com", "http://www.digg.com","http://www.huffingtonpost.com",
"http://www.feedburner.com", "http://www.imdb.com","http://www.yelp.com","http://www.dropbox.com", "http://www.baidu.com","http://www.washingtonpost.com",
"http://www.slideshare.net","http://www.etsy.com","http://www.telegraph.co.uk", "http://www.about.com", "http://www.bing.com", "http://www.latimes.com",
"http://www.tripadvisor.com","http://www.opera.com", "http://www.live.com", "http://www.wired.com", "http://www.bandcamp.com"};
            while (true)
            {
                foreach (string website in websites)
                {
                    try
                    {
                        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
                        GC.Collect();
                        Console.WriteLine("Getting " + website);
                        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(website);
                        HttpWebResponse response = request.GetResponse() as HttpWebResponse;
                        StreamReader reader = new StreamReader(response.GetResponseStream());
                        // when this line is in, mem use grows even more quickly
                        //File.WriteAllText("florb.txt", reader.ReadToEnd());
                        reader.Close();
                        response.Close();
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine("Caught exception " + ex.ToString());
                    }
                    Thread.Sleep(1000);
                }
            }
        }
    }
}
```
## Current Behavior
Memory use grows and grows without limit.
## Expected Behavior
Garbage collection would stop the infinite growth of memory usage after a while.
### On which platforms did you notice this
[ ] macOS
[X] Linux
[ ] Windows
**Version Used**:
```
Mono JIT compiler version 5.10.0.140 (tarball Sat Feb 24 15:33:47 UTC 2018)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
TLS: __thread
SIGSEGV: altstack
Notifications: epoll
Architecture: amd64
Disabled: none
Misc: softdebug
Interpreter: yes
LLVM: supported, not enabled.
GC: sgen (concurrent by default)
```
Let me mention that I wonder if this bug has any connection with bug #6651, which also has to do with running HttpWebRequest / HttpWebResponse for a long period. The motivation for this report was that my web-scraping programs keep growing as they run, to the point that the systems they run on freeze up when they run out of memory and swap.
/cc @baulig
I can reproduce the leak when the test is run for a long time. When the delay in the sample is reduced to 10, it leaks about 10 MB per hour for me.
Yes that sounds at least in range of what I am seeing. On a 16 Gb Ubuntu machine running 200 or so processes at about one hit per 50 seconds each, thus 4 per second, I am running out of remaining memory (5 Gb) and 16 Gb swap in between two and three days, which (taking it as three days) would be more like 72 Mb/hr. Since you are probably only getting a page per second or so running the single process, my leak rate is going to be some multiple of what you observe, but as I say, we are in range of each other.
In production I am actually reading the response stream; my sample above has that commented out. Please comment it back in for at least some of your testing.
When is this issue being fixed? We have major memory leak issues because of this!
Does this affect older versions of Mono as well? Can someone confirm?
@baulig could you please investigate
I am sure this problem is in 5.8. I am pretty sure it was in 5.4.1. If I had to guess, I'd say the problem started with 5.4 or 5.4.1. I say this because it was sometime in late fall / early winter of 2017 that we had to start restarting our systems every couple of days to avoid problems.
I'm not sure but it could have something to do with the implementation of btls because I think the problem does not exist if MONO_TLS_PROVIDER=legacy is set!
Can someone test this?
I can verify that
export MONO_TLS_PROVIDER=legacy
does seem to eliminate the problem. Unfortunately numerous websites can no longer be communicated with by my test program if that setting is in place.
Well, this particular test case is really not ideal for testing this kind of stuff. For much more accurate results, your main loop should take at least ServicePointManager.MaxServicePointIdleTime (which defaults to 100 seconds) plus a sufficiently large buffer.
What I would suggest is to add a loop counter to your while (true) and once every 10 or 20 iterations or so, you sleep for 2 minutes. This will give the scheduler time to close all pending connections. Profiling data should then ideally be taken at those checkpoints.
Another thing which could help is adding a super long sleep - like 5+ minutes - after hitting a couple of such checkpoints, like for instance after running for half an hour.
You can also use request.AllowAutoRedirect = false to prevent some of those websites from redirecting you to their secure versions.
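The pacing suggested above can be sketched as a small helper that decides how long to pause after each full pass over the site list. This is only an illustration: `CheckpointPolicy`, its property names, and the concrete values (a checkpoint every 10 passes, a 125-second pause, a 5-minute pause once 30 minutes have gone by) are assumptions chosen to match the numbers mentioned in the comment, not anything defined by Mono.

```csharp
using System;

// Hypothetical helper (not part of Mono): decides how long the test loop
// should pause after each complete pass over the URL list.
public sealed class CheckpointPolicy
{
    // Assumed values. MaxServicePointIdleTime defaults to 100 seconds,
    // so a 125-second pause leaves some buffer on top of it.
    public int PassesPerCheckpoint { get; set; } = 10;
    public TimeSpan CheckpointPause { get; set; } = TimeSpan.FromSeconds(125);
    public TimeSpan LongPauseInterval { get; set; } = TimeSpan.FromMinutes(30);
    public TimeSpan LongPause { get; set; } = TimeSpan.FromMinutes(5);

    private TimeSpan lastLongPauseAt = TimeSpan.Zero;

    // passCount: completed passes so far; elapsed: time since the run started.
    public TimeSpan PauseAfterPass(int passCount, TimeSpan elapsed)
    {
        if (passCount <= 0 || passCount % PassesPerCheckpoint != 0)
            return TimeSpan.Zero; // not a checkpoint, keep going

        if (elapsed - lastLongPauseAt >= LongPauseInterval)
        {
            lastLongPauseAt = elapsed;
            return LongPause;     // occasional extra-long pause
        }
        return CheckpointPause;   // regular checkpoint pause
    }
}
```

In the test loop, the result would be passed to `Thread.Sleep` after each pass, with heap shots compared only at the points where the returned pause is non-zero.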
That being said, I spent some time playing around with this and it is definitely increasing in size after running for a long time. However, this doesn't seem to be related to TLS and the object it's leaking might be System.Threading.Timer.
My latest profiling data:
out.txt
Updated test case:
```C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Runtime;
using System.Threading;
namespace TestMemGrow
{
    class Program
    {
        static void Main ()
        {
            List<string> websites = new List<string> {
                "http://www.amazon.com", "http://www.google.com", "http://www.yahoo.com", "http://www.ebay.com", "http://www.overstock.com",
                "http://www.marketwatch.com", "http://www.bn.com", "http://www.newegg.com", "http://www.wsj.com", "http://www.arstechnica.com", "http://www.slashdot.org",
                "http://www.mediaite.com", "http://www.disqus.com", "http://www.twitter.com", "http://www.snap.com", "http://www.facebook.com", "http://www.usps.com",
                "http://www.ups.com", "http://www.techcrunch.com", "http://www.oracle.com", "http://www.java.com", "http://www.apple.com", "http://www.microsoft.com",
                "http://www.ibm.com", "http://www.dell.com", "http://www.asus.com", "http://www.gigabyte.com", "http://www.intel.com", "http://www.crucial.com",
                "http://www.westerndigital.com", "http://www.samsung.com", "http://www.sandisk.com", "http://www.brother.com", "http://www.hp.com",
                "http://www.msn.com", "http://www.disney.com", "http://www.nintendo.com", "http://www.twitter.com", "http://www.youtube.com",
                "http://www.instagram.com", "http://www.linkedin.com", "http://www.wordpress.org", "http://www.pinterest.com", "http://www.wikipedia.org",
                "http://www.blogspot.com", "http://www.adobe.com", "http://www.tumblr.com", "http://www.vimeo.com", "http://www.flickr.com", "http://www.godaddy.com",
                "http://www.buydomains.com", "http://www.reddit.com", "http://www.w3.org","http://www.nytimes.com", "http://www.statcounter.com",
                "http://www.weebly.com","http://www.blogger.com","http://www.github.com", "http://www.jimdo.com", "http://www.myspace.com",
                "http://www.mozilla.org", "http://www.gravatar.com", "http://www.theguardian.com", "http://www.bluehost.com", "http://www.cnn.com", "http://www.foxnews.com",
                "http://www.msnbc.com", "http://www.wix.com", "http://www.paypal.com","http://www.stumbleupon.com", "http://www.digg.com","http://www.huffingtonpost.com",
                "http://www.feedburner.com", "http://www.imdb.com","http://www.yelp.com","http://www.dropbox.com", "http://www.baidu.com","http://www.washingtonpost.com",
                "http://www.slideshare.net","http://www.etsy.com","http://www.telegraph.co.uk", "http://www.about.com", "http://www.bing.com", "http://www.latimes.com",
                "http://www.tripadvisor.com","http://www.opera.com", "http://www.live.com", "http://www.wired.com", "http://www.bandcamp.com",
                "http://www.mozilla.org", "http://www.apple.com"
            };
            int totalCount = 0;
            while (true) {
                // ServicePointManager.MaxServicePointIdleTime = 100;
                for (int i = 0; i < websites.Count; i++) {
                    var website = websites[i];
                    try {
                        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
                        GC.Collect ();
                        // MartinTest is a helper that is not included in this snippet.
                        MartinTest.Run ();
                        Console.WriteLine ($"Getting ({i}/{websites.Count}) {website}");
                        HttpWebRequest request = (HttpWebRequest)WebRequest.Create (website);
                        request.AllowAutoRedirect = false;
                        HttpWebResponse response = request.GetResponse () as HttpWebResponse;
                        StreamReader reader = new StreamReader (response.GetResponseStream ());
                        // when this line is in, mem use grows even more quickly
                        //File.WriteAllText("florb.txt", reader.ReadToEnd());
                        reader.Close ();
                        response.Close ();
                    } catch (Exception ex) {
                        Console.WriteLine ("Caught exception " + ex.ToString ());
                    }
                    Thread.Sleep (10);
                }
                ++totalCount;
                Console.WriteLine ($"LOOP DONE: {totalCount}!");
                // Only every 10th pass is a checkpoint.
                if ((totalCount % 10) != 0)
                    continue;
                MartinTest.Run ();
                // Wait longer than the default MaxServicePointIdleTime (100 seconds).
                Thread.Sleep (TimeSpan.FromSeconds (125));
                MartinTest.Run ();
                GC.Collect ();
                Thread.Sleep (TimeSpan.FromSeconds (10));
            }
        }
    }
}
```
What's interesting about this is that, if I interpret the profiler output correctly, we don't seem to leak any of:
- System.Net.ServicePoint
- System.Net.ServicePointScheduler
- System.Net.IPAddress
- System.Uri
- System.Uri.UriInfo
- System.Net.ServicePointScheduler.AsyncManualResetEvent.<WaitAsync>d__3
- System.Collections.Hashtable.bucket[]
- System.Byte[]

There obviously is a huge fluctuation in the number of System.Threading.Tasks.Task.DelayPromise and System.Threading.Timer instances between checkpoints, but while the 2-minute checkpoint wait seems to be releasing the vast majority of them, their number still doesn't go down completely.
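For readers unfamiliar with those two types: `Task.Delay` is backed by a timer, so every pending delay keeps a timer alive until it fires or is cancelled. One general way such objects can pile up is a timeout delay that is left pending after the real work has already finished. Whether that is what happens inside Mono's HTTP stack is not established in this thread; the sketch below only illustrates the general pattern, and `TimeoutHelper.WithTimeout` is a hypothetical name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not Mono code): waits for a task with a timeout and
// cancels the timeout delay as soon as the task wins the race.
public static class TimeoutHelper
{
    public static async Task<T> WithTimeout<T>(Task<T> work, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            Task delay = Task.Delay(timeout, cts.Token);
            Task first = await Task.WhenAny(work, delay).ConfigureAwait(false);
            if (first != work)
                throw new TimeoutException();

            // Cancel the pending delay so its timer is released now rather
            // than only when the full timeout would have expired.
            cts.Cancel();
            return await work.ConfigureAwait(false);
        }
    }
}
```

Without the `cts.Cancel()` call, a loop that issues one request every few milliseconds with a 100-second timeout would keep each delay and its timer reachable for the full 100 seconds, which would look like a fluctuating but non-zero population of such objects in a heap shot.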
Thanks for checking into this. I really appreciate it, as this is a matter of some urgency for us.
My goal was to give you the simplest test case I could that reflects our actual use case, which is to scrape data from websites with short pauses, over and over. I don’t have much idea what goes on behind these facades and I have to leave the deeper testing to someone who knows the internals. But my original test is a pretty good reflection / reduction of what we are actually doing.
I was surprised to hear that you are not seeing leaks of System.Byte[], since that is the main problem I am seeing. I am now running 5.10.1.20. I reran my original test and created -- well, an output file that I cannot figure out how to upload here -- anyway . . .
when I grep System.Byte on the output file I see this:
2301704 348 6614 System.Byte[]
3856376 567 6801 System.Byte[] (bytes: +1554672, count: +219)
3963800 582 6810 System.Byte[] (bytes: +107424, count: +15)
4096688 594 6896 System.Byte[] (bytes: +132888, count: +12)
4135224 598 6915 System.Byte[] (bytes: +38536, count: +4)
4268888 608 7021 System.Byte[] (bytes: +133664, count: +10)
4299392 612 7025 System.Byte[] (bytes: +30504, count: +4)
4433736 624 7105 System.Byte[] (bytes: +134344, count: +12)
4467152 628 7113 System.Byte[] (bytes: +33416, count: +4)
4569096 636 7184 System.Byte[] (bytes: +101944, count: +8)
4706240 648 7262 System.Byte[] (bytes: +137144, count: +12)
4769416 655 7281 System.Byte[] (bytes: +63176, count: +7)
4901640 665 7370 System.Byte[] (bytes: +132224, count: +10)
4935088 669 7376 System.Byte[] (bytes: +33448, count: +4)
5066056 682 7428 System.Byte[] (bytes: +130968, count: +13)
4993952 667 7487 System.Byte[] (bytes: -72104, count: -15)
4968344 662 7505 System.Byte[] (bytes: -25608, count: -5)
5370792 713 7532 System.Byte[] (bytes: +402448, count: +51)
5471824 721 7589 System.Byte[] (bytes: +101032, count: +8)
5605160 733 7646 System.Byte[] (bytes: +133336, count: +12)
5644184 741 7616 System.Byte[] (bytes: +39024, count: +8)
5769240 749 7702 System.Byte[] (bytes: +125056, count: +8)
I cannot tell what these System.Byte[] objects are related to, but they are increasing.
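To put a number on that growth, the grep output can be reduced to a net change between the first and last heap shot. The sketch below assumes the three leading columns are total bytes, instance count, and average size (the first row is consistent with that reading, since 2301704 / 348 is about 6614); `HeapShotGrowth` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for summarizing lines such as
//   "2301704 348 6614 System.Byte[]"
// Assumed column order: total bytes, instance count, average size, class name.
public static class HeapShotGrowth
{
    // Returns the total-bytes column for every line that reports className.
    public static List<long> ParseBytes(IEnumerable<string> lines, string className)
    {
        var totals = new List<long>();
        foreach (string line in lines)
        {
            string[] parts = line.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            long bytes;
            if (parts.Length >= 4 && parts[3] == className &&
                long.TryParse(parts[0], NumberStyles.None, CultureInfo.InvariantCulture, out bytes))
            {
                totals.Add(bytes);
            }
        }
        return totals;
    }

    // Net change in bytes between the first and the last heap shot.
    public static long NetGrowth(IList<long> totals)
    {
        return totals.Count < 2 ? 0 : totals[totals.Count - 1] - totals[0];
    }
}
```

Applied to the first and last rows listed above (2301704 and 5769240 bytes), `NetGrowth` gives 3467536 bytes, or roughly 3.3 MB of additional `System.Byte[]` data over the sampled heap shots.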
I can put in the long sleeps as you describe, but as that is not what we do, nor is it anything that I’m aware of that is documented as something one needs to do, I feel like it would change the test case to something that is not the problem at issue.
If my test case is somehow unreasonable (that is, is misusing these methods or otherwise doing something it should not be doing), please let me know. Similarly, if there is a workaround -- preferably something other than sleeping for two or five minutes – such as setting the ServicePointManager parameters to different values, I’d be happy to try it.
To be clear, we have been using Mono for this purpose and in this way, with hundreds of processes running 24/7, for at least six years, and only recently has this problem cropped up, which leads me to think something has changed under the hood.
I’ll give the long sleeps you mention a try. If they do show a change, however, that would itself seem to be an undesirable (for us, certainly!) variance from previous behavior. But they may constitute a workaround that would at least allow us to stop restarting our processes every 48 hours.
Had to try a different browser -- here is my out.txt file.
FYI I tried the "sleep 2 minutes after 20 requests / sleep 5 minutes every half hour" version, and it appears that memory does not increase in that case. However, it is also the case that under those strictures the program does only about 200 scrapes an hour instead of 3600, which would be a painful drop in performance to have to tolerate. (I'm also not clear on the validity of the test I did, now that I think about it -- given that it runs at about 1/18 the speed, the 12000 seconds of testing that I did is like testing the first version for only 11 minutes.)
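The "only 11 minutes" estimate can be checked with a little arithmetic, using the figures quoted above (about 200 scrapes per hour throttled versus 3600 unthrottled, and a 12000-second run). `ThrottledTestEquivalence` is a hypothetical name used only for this worked example.

```csharp
using System;

// Worked example: how long the unthrottled test would need to issue the same
// number of requests as a throttled run of a given length.
public static class ThrottledTestEquivalence
{
    public static TimeSpan EquivalentDuration(TimeSpan throttledRun, double throttledPerHour, double fullPerHour)
    {
        double slowdown = fullPerHour / throttledPerHour; // 3600 / 200 = 18
        return TimeSpan.FromSeconds(throttledRun.TotalSeconds / slowdown);
    }
}
```

With those inputs the slowdown factor is 18, and 12000 seconds divided by 18 is about 667 seconds, a little over 11 minutes, which matches the estimate in the comment.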
I hope you will be able to see, perhaps, what has changed since 5.4 or prior to now that is causing this issue in my original test.
I did a few more tweaks: disabled TLS, connection reuse and all timeouts. Then I let it run while making lunch. After about 275 iterations, the profiler doesn't show any leaked objects, but the size still keeps increasing.
This is from a long run:
out.txt
As you can see in the last heap shot:
Heap shot 496 at 2975.806 secs: size: 713888, object count: 8105, class count: 350, roots: 0
The size has been somewhere between 700k and 750k and the object count has been somewhere between 8000-8200.
However, on another terminal, I did top -pid XXX and it shows an increasing memory size:
92241 mono-sgen 6.4 02:24.32 18 1 188 61M+ 0B 0B 92241 21124 sleeping *0[1] 0.00000
There also seems to be a leak in the code that I disabled since results were a lot worse with that stuff still in place.
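The gap described here, a steady heap shot size next to a growing figure in `top`, can also be watched from inside the process by logging the managed heap size alongside the process's resident memory. This is a minimal sketch: `MemorySnapshot` is a hypothetical name, and the numbers it reports will not necessarily line up exactly with what `top` or the profiler shows.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: reports managed heap size next to process memory so
// that growth outside the managed heap becomes visible in the test's own log.
public static class MemorySnapshot
{
    public static string Describe()
    {
        // Bytes the garbage collector currently considers allocated.
        long managed = GC.GetTotalMemory(false);

        // Physical memory in use by the whole process, native allocations included.
        long resident;
        using (Process self = Process.GetCurrentProcess())
            resident = self.WorkingSet64;

        return string.Format("managed={0} KB resident={1} KB", managed / 1024, resident / 1024);
    }
}
```

Printing `MemorySnapshot.Describe()` at each checkpoint would show whether the resident figure keeps climbing while the managed figure stays flat, which would point toward native memory or runtime-internal structures rather than leaked managed objects.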
I know you had said earlier that you didn't think this issue was TLS-related, but now both of us have disabled TLS for testing and in doing so eliminated the leak, so it does make me wonder.
It is certainly possible that we have more than one leak. I am simply disabling things at the moment to narrow things down then fix one after the other.
I will spend some more time working on this and will hopefully be able to give you an update shortly.
Sounds good, thanks. If there's anything else I can try please let me know.
Good news - I may have a fix coming up shortly :-)
Ok, this probably won't quite fix your problem just yet, but I believe the leak will be a lot less severe with #8016.
I'm still working on this and will probably make another PR on top of this to finally solve this problem.
Another problem that I just stumbled upon - which your test case does not reveal - is that we are putting the ServicePoint instance into a hash table, but never remove it. So over time, we will leak ServicePoint instances. Because your test case is using the same websites over and over again, those instances will be reused.
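The general shape of that problem, a per-host table that only ever gains entries, and one possible remedy can be sketched as follows. This is not Mono's `ServicePointManager` code and not the fix that was made; `IdleEvictingCache` and its members are hypothetical names, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-host cache that drops entries once they have been idle
// for too long. Illustration only; not Mono's implementation.
public sealed class IdleEvictingCache<TValue>
{
    private sealed class Entry
    {
        public TValue Value;
        public DateTime LastUsed;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();

    public int Count { get { return entries.Count; } }

    // Returns the cached value for key, creating it on first use.
    public TValue GetOrAdd(string key, Func<TValue> create, DateTime now)
    {
        Entry entry;
        if (!entries.TryGetValue(key, out entry))
        {
            entry = new Entry { Value = create() };
            entries[key] = entry;
        }
        entry.LastUsed = now;
        return entry.Value;
    }

    // Removes entries idle for longer than maxIdle and returns how many were
    // removed. Without a step like this the table can only grow.
    public int Prune(DateTime now, TimeSpan maxIdle)
    {
        var stale = new List<string>();
        foreach (KeyValuePair<string, Entry> pair in entries)
        {
            if (now - pair.Value.LastUsed > maxIdle)
                stale.Add(pair.Key);
        }
        foreach (string key in stale)
            entries.Remove(key);
        return stale.Count;
    }
}
```

A workload that cycles through a fixed list of hosts never exposes the difference, because every entry is reused before it goes stale; a crawler that visits an open-ended set of hosts would.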
Interesting. Yes, we would never encounter that one; the set of sites we look at is both finite and small, so our set of ServicePoints would reach its maximum quickly and unnoticeably.
Glad to hear about the other fix #8016; it's like some sort of new class of issue, a "not-quite-a-leak", or perhaps "sticky-memory".
I appreciate the update, and I'm very happy to hear there's progress on this.
Well, I would probably classify it as a temporary or transient memory leak - even though it will eventually even out, the short to medium term effects are so severe that it should be handled like a "traditional" leak.
Thank you so much for filing this bug report, this was such an exciting experience working on it and I learned a lot about Tasks, timeouts and related memory issues. I need to thank you for giving me the opportunity to learn so much about these topics.
I don't know your procedures; what happens now? How long would it normally be until this fix finds its way into released code? Thanks.
@smr888 for which product, Mono?
Yes, the compiler / runtime for Linux, specifically.
That release (5.12) is already available via Nightly and should go to Stable soon.
/cc @directhex
Great, thanks!
My understanding is 5.12 is gonna be marked as stable by the runtime team Real Soon Now. As soon as it does, I'll switch building 5.10 to 5.12, and can have packages out in a day or two.
OK, cool. I think I'll wait for that, because it's a bunch of machines whose repository setups I'd have to mess with to get the nightlies. It's much easier for me to just let it come in when it comes in, much as I would like to get this fix in right now!
I want to upvote this issue. We recently upgraded to Mono 5.10.0.160, and suddenly our application started steadily increasing its used memory. Usually, it only uses 100 MB, but it went up to more than 20GB RSS in a couple of days. We investigated this, and it seems to be directly linked to HttpWebRequests made by our application, as described above.
Looking for something to do, I ran my test under the current Mono nightly, which is 5.15, and it did not show the memory leak. So that's good.
Given that the nightly has moved on from 5.12 to 5.15 over these two weeks, does that mean that 5.12 may become the stable release fairly soon?
5.12 is already in the Preview repo. Give that a shot, so we know whether it's fixed in 5.12?
Yup, did that, ran it for over an hour, memory is not growing. The amount showing allocated as System.Byte in the 5.12 version hovers around 1 Mb. In the 5.15 version it stays slightly less than 3 Mb. Not sure why that is since I can't see who allocated what, but neither version is growing / leaking, which is the desirable change.
Good news and bad news. The good news is that with the stable release of 5.12, I am seeing significantly slower memory leakage in our programs. The bad news is that there is still leakage present, as Martin had said there might be. I would say the leak speed has slowed from about 250 Mb/hr on one of our systems to maybe 100 Mb/hr. Which is great, because it means we can restart them every four or five days instead of every two (which for us, practically speaking, means one restart a week, because we shut down for maintenance every Sunday morning anyway). But of course I'd like to get back to not having to restart at all.
The test program above does not have this leak, so I'm trying to work up a test to pinpoint the cause of the remaining leak. If I can do so I'll open another bug.
Thanks again to everyone, and especially @baulig, for your help with this.
any news on this?