Notebook: Large notebook fails to save

Created on 24 Oct 2015 · 27 comments · Source: jupyter/notebook

When I was using the IPython notebook to analyze our experiment data, I noticed that I could not save the notebook.
The console (from which I started ipython notebook) showed:

[I 17:37:11.736 NotebookApp] Malformed HTTP message from ::1: Content-Length too long

So I guess this problem comes from the notebook size.
I was using the bokeh library to plot my data, and the notebook file was about 100 MB on disk.

To reproduce, I prepared a new notebook and made many plots to produce a notebook with a large file size.

[Screenshot of the reproduction notebook]
This produces a 30001-point plot repeatedly (e.g. 100 plots in the screenshot above).
I could not save that notebook either: when I saved repeatedly while increasing the number of plots, saving failed again (with the same console message) once the file grew past about 100 MB.

In a little more detail, I could save the notebook up to 88 plots, at which point the file size was 104756892 bytes (about 99.9 MiB), but I could not save it with 89 plots. The file size grew by about 1.1 MB per plot.
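
For reference, here is a minimal sketch of the kind of cell used for the reproduction (the exact code in the screenshot may differ; this assumes bokeh 0.10's output_notebook/figure/show API):

from bokeh.plotting import figure, show, output_notebook
import numpy as np

output_notebook()                  # embed bokeh output in the notebook itself

x = np.linspace(0, 10, 30001)      # 30001 points per plot, as described above
for i in range(100):               # around 100 plots pushes the .ipynb past 100 MB
    p = figure(plot_width=300, plot_height=300)
    p.line(x, np.sin(x + i))
    show(p)                        # each call stores the full data in the cell output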

I searched the issue list but could not find anything about this.
Is this limit intentional? Is there a workaround for this problem (other than removing cells from the notebook)?


My environment is:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': u'2d95975',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython',
 'ipython_version': '3.2.1',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
 'sys_platform': 'darwin',
 'sys_version': '2.7.10 (default, Aug 26 2015, 18:15:57) \n[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]'}

OS: Mac OS X 10.9.5 (Mavericks)
Browser: Safari 9.0 (9537.86.1.56.2)
matplotlib 1.4.3
numpy 1.9.2
bokeh 0.10.0

Notebook Bug

All 27 comments

Hi @glider-gun. Thank you for the detailed issue report. The details are really helpful to our developers.

I believe the scenario and error message you are seeing are hitting Tornado's default max_body_size limit (Tornado is a dependency of the Jupyter notebook): https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L537

@glider-gun I don't know if using Python 3 and a more recent version of IPython would have the same limitation. If you are able to test easily, please do. If not, no worries.

@minrk @Carreau Is there a way to work around the default max_body_size limit by chunking the body (https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L346) or by setting a different body_size limit (https://github.com/tornadoweb/tornado/blob/eaf34865a63460cdd64abd1ae2c8835b174c6e93/tornado/http1connection.py#L324)?
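
For context, tornado exposes both limits as keyword arguments on HTTPServer. A standalone sketch of a plain tornado app raising them (not notebook code, just an illustration of the server-level options):

import tornado.ioloop
import tornado.web
from tornado.httpserver import HTTPServer

class UploadHandler(tornado.web.RequestHandler):
    def put(self):
        # With the defaults, bodies over ~100 MB are rejected before reaching here.
        self.write("received %d bytes" % len(self.request.body))

app = tornado.web.Application([(r"/upload", UploadHandler)])
# Raise both the streaming buffer cap and the per-request body cap to 512 MiB.
server = HTTPServer(app,
                    max_buffer_size=512 * 1024 * 1024,
                    max_body_size=512 * 1024 * 1024)
server.listen(8888)
tornado.ioloop.IOLoop.current().start()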

We might want to look at this problem while working on https://github.com/jupyter/notebook/pull/536

Thank you for the quick reply!
I tried the same thing with IPython 4.0.0 on both Python 2.7.10 and 3.4.3.
Similarly, I could save with 88 plots but not with 89. The file sizes of the notebooks were similar.


Environment for Python 2.7.10 with IPython 4.0.0:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': u'f534027',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython',
 'ipython_version': '4.0.0',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python',
 'sys_platform': 'darwin',
 'sys_version': '2.7.10 (default, Aug 26 2015, 18:15:57) \n[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]'}

OS, browser, and all library versions are the same as in my first comment.


Environment for Python 3.4.3 with IPython 4.0.0:

$ python -c "import IPython; print(IPython.sys_info())"
{'commit_hash': 'f534027',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/Users/glidergun/.pyenv/versions/miniconda3-3.16.0/lib/python3.4/site-packages/IPython',
 'ipython_version': '4.0.0',
 'os_name': 'posix',
 'platform': 'Darwin-13.4.0-x86_64-i386-64bit',
 'sys_executable': '/Users/glidergun/.pyenv/versions/miniconda3-3.16.0/bin/python',
 'sys_platform': 'darwin',
 'sys_version': '3.4.3 |Continuum Analytics, Inc.| (default, Oct 20 2015, '
                '14:27:51) \n'
                '[GCC 4.2.1 (Apple Inc. build 5577)]'}

(the user name in this output has been replaced by hand)
OS, browser, and all library versions are the same, except numpy is 1.10.1.

@glider-gun Thanks for the additional info. For now, I recommend keeping an eye on #536 as suggested by @Carreau.

In the interim, I wonder if saving more frequently would be a reasonable workaround to the limitation.

I see, thank you.
I'm afraid it wouldn't work around the problem once I've noticed that I've already exceeded the limit, though it would save more cells from a browser crash or the like in that situation.

It looks like tornado imposes a maximum size of 100MB for HTTP requests by default, and I don't think we currently override that anywhere:
https://github.com/tornadoweb/tornado/blob/a97ec9569b1995d8aa3da0a7f499510bffc006a3/tornado/iostream.py#L154

In the long run, the fix will be to maintain notebook models on the server, so we don't have to send the whole notebook over HTTP at once. But we should probably increase that limit as an interim measure.

Any solution? I have a large notebook that I want to save.

@davidcortesortuno and I are also having this problem with HoloViews HoloMaps, where it's quite easy to go over 100 MB.

We temporarily fixed this by modifying the tornado/iostream.py file, as suggested above by @takluyver, for example by setting self.max_buffer_size = 1048576000.

Having the same problem here...

Shall we bump the limit up for 5.0? Does anyone have a guide as to what a sensible limit might be?

You should be able to set a larger max_body_size or max_buffer_size by providing the --NotebookApp.tornado_settings flag or by setting it in jupyter_notebook_config.py:

jupyter notebook --NotebookApp.tornado_settings="{'max_body_size': 104857600, 'max_buffer_size': 104857600}"
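
The equivalent in jupyter_notebook_config.py would look roughly like this (same values; 104857600 bytes is 100 MiB and can be raised further):

# jupyter_notebook_config.py
c = get_config()
c.NotebookApp.tornado_settings = {
    'max_body_size': 104857600,    # bytes (100 MiB)
    'max_buffer_size': 104857600,
}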

I don't have a good notebook to test with, but here's the rationale:

#2139 should increase the buffer sizes to 512 MiB, based on nothing much. It doesn't really fix the problem, but it pushes it further out.

@Horta @ryanpepper @2426021684 Can you try starting the notebook with the --NotebookApp.tornado_settings flag to test if that resolves your issue?

jupyter notebook --NotebookApp.tornado_settings="{'max_body_size': 104857600, 'max_buffer_size': 104857600}"

@gnestor That didn't help me; I'm still getting a Request Entity Too Large message.

Oh, also I'm doing this over HTTPS if that makes a difference.

@SamuelMarks Can you try upgrading to notebook 5.0.0rc2 (pip install notebook --force-reinstall --no-deps --ignore-installed --pre) and see if the new rate limits help?

@gnestor Weird, can't get it to work at all now.

Even tried in a new virtualenv:

$ pip install --pre jupyter[all] notebook

But still getting:

$ jupyter notebook
Error executing Jupyter command 'notebook': [Errno 2] No such file or directory

Edit: wait am I meant to use python3 -m notebook now instead?

Did you try just pip install --pre notebook?

@gnestor - Okay, got it to work with latest --pre of notebook.

Same Request Entity Too Large error.

We have increased the default limit in 5.0 (#2139), and it is possible to configure a still larger size.

@gnestor, you marked this for 5.1 - what do you want to do? Bump up the default limit still further? Make it easier to configure the limit?

@takluyver I think these default limits should suffice for now. Let's close this. For reference, if any users encounter this issue (not being able to save a notebook due to its file size), you can increase the limit by editing these lines: https://github.com/jupyter/notebook/blob/master/notebook/notebookapp.py#L237-L238

For anyone else finding this before #3829 is actually merged, the only solution in this thread that currently works in notebook 5.6.0 is to modify the tornado/iostream.py file, as suggested above by @takluyver and @davidcortesortuno, for example by setting self.max_buffer_size = 1048576000 (line 238).
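
To be explicit, that workaround is a manual edit of the installed tornado package rather than a supported configuration option; roughly (the exact line number and surrounding code vary between tornado releases):

# tornado/iostream.py, inside BaseIOStream.__init__
# (around line 238, as cited above; the location varies by tornado version).
# Replace the ~100 MB default buffer cap with a larger hard-coded value:
self.max_buffer_size = 1048576000  # roughly 1000 MiB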

Trying to pass the arguments to jupyter notebook when starting it doesn't work, nor does editing the notebookapp.py file.

@gnestor I've encountered a similar bug. Initially I got the Request Entity Too Large error, and after modifying iostream.py it worked fine for a while: I was able to keep working on the notebook and saving it (even though the notebook was over 100 MB). But as the notebook grew bigger (through plotting images), trying to save it simply crashed my browser without logging anything to the console.

After much debugging, I believe this is triggered by the large number of images I am saving. Once the number of images in the notebook goes above a certain threshold, everything shuts down without warning. Any ideas?

@kevinlu1211 Yeah, I basically have the same issue, but I think it's just the browser running out of memory, as Chrome only allocates up to 1.8 GB of memory per tab by default. Watch the memory usage when it runs; if it dies after growing to about that size, that's probably your problem. Fortunately, you can adjust this as described here, which has so far fixed my issue, though I suspect I will hit it again if the tab reaches ~3.5 GB.

@j-andrews7 I don't think it was my browser reaching the memory limit, but I did the fix regardless and it still didn't work. Any other ideas?

@kevinlu1211 nope, sorry mate. Maybe try a different browser?

You could maybe try:

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure(dpi=70)  # a lower dpi produces smaller embedded images

to reduce the resolution of the images?
