Cgeo: Do not build complete responses in memory

Created on 19 Jun 2012 · 5 Comments · Source: cgeo/cgeo

Many crashes are linked to OutOfMemoryError on low-end devices because we systematically store responses in strings, whereas we could process them as a stream. This is due to our all-regex approach, which could be replaced in many cases by a SAX-like parser such as the one from TagSoup. That would not require storing whole web pages in memory.

java.lang.OutOfMemoryError: (Heap Size=11719KB, Allocated=6276KB, Bitmap Size=7969KB)
at ch.boye.httpclientandroidlib.util.CharArrayBuffer.expand(CharArrayBuffer.java:63)
at ch.boye.httpclientandroidlib.util.CharArrayBuffer.append(CharArrayBuffer.java:93)
at android.support.v4.app.ActivityCompatHoneycomb.toString(RequiredFields.java:225)
at cgeo.geocaching.network.Network.getResponseDataNoError(Network.java:371)
at cgeo.geocaching.network.Network.getResponseData(Network.java:387)
at cgeo.geocaching.network.Network.getResponseData(Network.java:380)
at cgeo.geocaching.connector.gc.Login.switchToEnglish(Login.java:221)
at cgeo.geocaching.connector.gc.Login.login(Login.java:87)
at cgeo.geocaching.cgeo$firstLogin.run(cgeo.java:825)
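
The streaming approach proposed above could be sketched roughly as follows. This is only an illustration, not code from c:geo: it uses the SAX parser bundled with the JDK as a stand-in, which only accepts well-formed markup, whereas a parser such as TagSoup would be needed for real-world HTML. The class name `StreamingLinkExtractor`, the method `extractLinks`, and the sample page are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of c:geo.
class StreamingLinkExtractor {

    /**
     * Collects the href value of every <a> element while the parser reads
     * the stream. Only the extracted values are kept, never the whole page.
     */
    static List<String> extractLinks(InputStream in) throws Exception {
        final List<String> links = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("a".equalsIgnoreCase(qName)) {
                    String href = attrs.getValue("href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        };
        // JDK SAX parser used as a stand-in; it requires well-formed input.
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page, for illustration only.
        String page = "<html><body><a href=\"/cache/1\">One</a>"
                + "<p><a href=\"/cache/2\">Two</a></p></body></html>";
        InputStream in = new ByteArrayInputStream(page.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractLinks(in));
    }
}
```

In a real client, the `InputStream` would come directly from the HTTP response entity, so the page is never converted to a string first.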

Bug

samueltardieu

All 5 comments

I was always dreaming of applying regular expressions to streams as an alternative solution to the problem. However, I have no clue how well that works. A possible implementation is hinted at here: http://stackoverflow.com/questions/716927/applying-a-regular-expression-to-a-java-i-o-stream

Bananeweizen on 20 Jun 2012
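
One simple way to apply regular expressions to a stream, as suggested in the comment above, is to match line by line. This is a minimal sketch and a deliberately simpler variant, not the implementation from the linked Stack Overflow answer: it assumes that no match spans a line break, which is not guaranteed for arbitrary HTML. The class name `LineRegexScanner`, the `code=` pattern, and the sample input are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of c:geo.
class LineRegexScanner {

    /**
     * Reads the source one line at a time and returns group 1 of every match.
     * The pattern must define at least one capturing group.
     * Memory use is bounded by the longest line, not by the page size.
     */
    static List<String> findAll(Reader source, Pattern pattern) throws IOException {
        List<String> hits = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher matcher = pattern.matcher(line);
            while (matcher.find()) {
                hits.add(matcher.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Made-up input and pattern, for illustration only.
        String page = "<a href=\"?code=GC1ABC\">x</a>\n<a href=\"?code=GC2DEF\">y</a>\n";
        Pattern geocode = Pattern.compile("code=(GC[0-9A-Z]+)");
        System.out.println(findAll(new StringReader(page), geocode));
    }
}
```

Pages delivered as a single very long line would still be buffered almost entirely, so this only helps when the markup contains regular line breaks.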

Well, I think that we should not use regular expressions at all to parse HTML pages. We should use something like TagSoup instead: http://ccil.org/~cowan/XML/tagsoup/

samueltardieu on 20 Jun 2012

👍2

@samueltardieu This issue is rather old. Is it still relevant? As far as I understand, we have meanwhile started using things like JSoup and other methods, and perhaps this issue is much too generic to ever be solved?!

Lineflyer on 28 Aug 2017

We use JSoup in some places, but not everywhere, far from it. The issue is still relevant.

samueltardieu on 28 Aug 2017

Jsoup is only used in some (small) parts, and there are still many regexes in use :(. We should keep the issue and probably rename it to "Use a DOM parser instead of regex".