Streamlit: Ability to download data from Streamlit

Created on 14 Oct 2019 · 23 Comments · Source: streamlit/streamlit

[Edited by tvst, since we narrowed the focus of this feature request]

I'd love to be able to:

  • ingest data - csv file, etc - I see this is already being worked on.
  • do stuff with it - easy with streamlit, as users can tweak stuff till they get the results they are happy with.
  • present csv/excel file of the final (with results added) data for download - I can save this to the server using pandas for example but not sure how to present it nicely for the user to download within the streamlit app itself. It would be nice to have a streamlit widget thingamajig.

Implementation suggestion: The existing streamlit object which displays stuff on screen could have a download option for objects like dataframes. So st.write(data) or just st.dataframe(data) should have a download button somewhere.

Sort of like how some of the plotting tools in Python, like Plotly, display a download button for charts.

enhancement spec_needed

Most helpful comment

I think CSV download would be ideal as part of st.dataframe. I would really love to see a button in the top right of the dataframe display, for instance, that downloads a CSV.

All 23 comments

The upload part of this is being tracked under #120

Is the download part being worked on too?

Since "upload" is already covered by #120 (which is currently in progress!) I'll change this issue to track the download case.

As a temporary workaround, you can patch streamlit.lib.Server.TORNADO_SETTINGS to add a "static_path" entry that points to a folder you want to serve static files from.

I made a gist here: https://gist.github.com/tconkling/1e5ead87c796a82de7fa71fcc4a74777

An alternative to @tconkling's approach that I've been using for my prototype behind a firewall is just to run a separate, simple file server before I run streamlit.

python -m http.server 8502 --directory /app/storage/ &
streamlit run app.py

You can change the port (8502) and directory (/app/storage/) to something that suits you.
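
With a separate file server like this, the Streamlit app still has to show the user a link to each file. A minimal sketch of building that link, assuming the file server is reachable from the user's browser at the given host and port; the helper name file_server_link and the example file name are hypothetical:

```python
from urllib.parse import quote


def file_server_link(filename: str, host: str = "localhost", port: int = 8502) -> str:
    """Return a Markdown link to a file exposed by the separate file server.

    Assumes `filename` is a path relative to the directory passed to
    `python -m http.server --directory ...`.
    """
    # Percent-encode spaces and other unsafe characters; "/" is kept for subfolders.
    url = f"http://{host}:{port}/{quote(filename)}"
    return f"[Download {filename}]({url})"


print(file_server_link("results 2019.csv"))
```

In the app, the returned string can be passed to st.markdown(...). The host must be the address the browser uses to reach the server, which is not localhost once the app is deployed remotely; the browser then saves the file on the machine it runs on.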

Please, see here for more info about the TORNADO_SETTINGS hack.

Another option that adds an explicit route to a static file: https://gist.github.com/monchier/b3c200a002f8030db07fa72f8827f10f

Still, in the realm of short-term solutions, I like @MarcSkovMadsen's file server solution better. Similarly, this can be done very efficiently with a proxy like Nginx. See some docs here.

Hi All

I’ve discovered a workaround for downloading small files from streamlit.

I’ve added it to the gallery at awesome-streamlit.org. It’s called “File Download Workaround”. You can try it out and copy the code.

[image: file_download_workaround]

Hi Marc!
Thanks for the nice workaround. BTW, you can add the download attribute to the <a> tag to set a proper filename, so users don't have to save via the right-click menu. Like this:

href = f'<a href="data:file/csv;base64,{b64}" download="{filename}.csv">Download CSV File</a>'
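
For context, here is a minimal sketch of how the b64 and filename values in that line could be produced. It uses only the standard library; the helper name csv_download_link and the sample CSV text are hypothetical:

```python
import base64


def csv_download_link(csv_text: str, filename: str, label: str = "Download CSV File") -> str:
    """Build an HTML anchor that embeds the CSV as a base64 data URI."""
    # The data URI needs base64 text, so go str -> bytes -> base64 bytes -> str.
    b64 = base64.b64encode(csv_text.encode("utf-8")).decode("ascii")
    return f'<a href="data:file/csv;base64,{b64}" download="{filename}.csv">{label}</a>'


link = csv_download_link("col_1,col_2\n3,a\n2,b\n", "mydata")
print(link)
```

In a Streamlit app the CSV text could come from df.to_csv(index=False), and the link can be rendered with st.markdown(link, unsafe_allow_html=True). The whole file is embedded in the page, so this only suits small files.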

I think CSV download would be ideal as part of st.dataframe. I would really love to see a button in the top right of the dataframe display, for instance, that downloads a CSV.

Hey,
another simple use case for that download feature:

  • generating a plotly offline graph standalone html file and being able to download it seamlessly !

I implemented @tconkling's gist and downloads worked. However, I then got a "Loading CSS chunk failed" error at the point where the app is supposed to display a dataframe. I assumed this was because the download folder is named static and the CSS is also served from static, so the app could not find the CSS it needed. But when I used a different folder name instead of static in server.py, the same error appeared.

However, if I remove the patch, I don't see this issue anymore!

Looks like I messed up. I updated the config/server.py file while the server was running, which might have caused the issues. I stopped the server, redid everything, and it worked.

Sorry for the multiple comments. However, with the Tornado settings fix in Server.py for file downloads, the app randomly shows a blank page.

This is a cool hack, @MarcSkovMadsen. But do you know: if I deploy the app on AWS and access it from a local laptop, will the file be downloaded to the local laptop or to the AWS file system?

@vijaysaimutyala I tried @MarcSkovMadsen's solution and it works for me when the app is hosted on my local computer. I have yet to test whether I can deploy the app on AWS and still download the files to my local computer by accessing the app hosted there.

I noticed that streamlit already serves the front-end's static files; they are in site-packages/streamlit/static. I deploy via a custom Dockerfile (base image python:3.8) and, as a workaround, added the line:

COPY custom_static/ /usr/local/lib/python3.8/site-packages/streamlit/static/custom/

This works for files that already exist before the app runs, but you can probably also just write files to the site-packages/streamlit/static folder at runtime.

Error: Request failed with status code 400

The CSV is larger than 200 MB and I cannot change "server.maxUploadSize".
Does anyone know how I can solve this problem?

You can change it when running your script, e.g.:

streamlit run your_script.py --server.maxUploadSize=1028

Thanks to those in this thread!

Using an example from @pbouda, I created a few tools that may help others; a visual example of usage is below. Feel free to copy the functions as needed.

[image: download_examples]

(yes, until content layout is available, a hard-coded HTML table is created)

Here's a massive hack

import pathlib
import pandas as pd
import streamlit as st

# HACK This only works when we've installed streamlit with pipenv, so the
# permissions during install are the same as the running process
STREAMLIT_STATIC_PATH = pathlib.Path(st.__path__[0]) / 'static'
# We create a downloads directory within the streamlit static asset directory
# and we write output files to it
DOWNLOADS_PATH = (STREAMLIT_STATIC_PATH / "downloads")
if not DOWNLOADS_PATH.is_dir():
    DOWNLOADS_PATH.mkdir()

def main():
    st.markdown("Download from [downloads/mydata.csv](downloads/mydata.csv)")
    mydataframe = pd.DataFrame.from_dict({'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']})
    mydataframe.to_csv(str(DOWNLOADS_PATH / "mydata.csv"), index=False)

if __name__ == "__main__":
    main()

@pdxjohnny That's really awesome! Thank you. It seems to work really well on EC2. Even if I delete the file mid download (rm DOWNLOADS_PATH/file)- it continues downloading till the end. I'm working with files which are a few hundred MB.

Could you please explain how it works? I have no background understanding of this. Thanks!

@KKJSP Thanks! Glad it helps :)

First off, I wouldn't rely on being able to delete files halfway through a download. I haven't confirmed this, but a likely explanation is that streamlit reads the entire file into memory and then sends it to the client (web browser), which would be why you can delete the file mid-download and it still comes through. Either way, it looks like a side effect of something "under the hood" that is not guaranteed to stay the same, so take this as a word of warning.
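
A possible alternative explanation (an assumption, not confirmed against streamlit's server code): on Linux and other POSIX systems, deleting a file only removes its directory entry, and a process that already has the file open can keep reading it. A small standalone sketch of that behaviour, unrelated to streamlit itself:

```python
import os
import tempfile

# Create a file, open it for reading, then delete it while it is still open.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * 1024)

reader = open(path, "rb")
os.remove(path)                      # the directory entry is gone ...
still_exists = os.path.exists(path)
data = reader.read()                 # ... but the open handle still reads the data (POSIX only)
reader.close()

print(still_exists, len(data))
```

On Windows the os.remove call would typically fail instead, because the file is still open, so the behaviour also depends on the operating system.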

As for how this works:

If you were to download streamlit from PyPI: https://pypi.python.org/pypi/streamlit

$ pip download streamlit --no-dependencies
Collecting streamlit
  Using cached streamlit-0.70.0-py2.py3-none-any.whl (7.4 MB)
  Saved ./streamlit-0.70.0-py2.py3-none-any.whl
Successfully downloaded streamlit

You would see the static/ directory inside it

$ python -m zipfile -l streamlit-0.70.0-py2.py3-none-any.whl | grep static/ | grep -v static/static/
streamlit/static/asset-manifest.json           2020-10-28 20:34:22        11520
streamlit/static/favicon.png                   2020-10-28 20:28:54         1019
streamlit/static/index.html                    2020-10-28 20:34:22         5039
streamlit/static/precache-manifest.0aa236b608954c9f14e8c7b2e2323b3d.js 2020-10-28 20:34:22        16909
streamlit/static/service-worker.js             2020-10-28 20:34:22         1183
streamlit/static/assets/streamlit.css          2020-10-28 20:28:54        20637
streamlit/static/vendor/viz/viz-1.8.0.min.js   2020-10-28 20:28:54      7055772
streamlit/static/vendor/viz/viz.js-LICENSE.txt 2020-10-28 20:28:54         1063

I'm pretty sure this is created by the stuff in https://github.com/streamlit/streamlit/tree/1c3a3ebce32a200c7e64c1dcc311b698d8dc1268/frontend/public but I'm not sure

You'll notice the streamlit/static/index.html file.

streamlit/static/index.html                    2020-10-28 20:34:22         5039

If you open your streamlit app and then open the developer tools in your browser, you'll be looking at this index.html page. You'll notice that the paths of the static assets it loads are relative to this streamlit/static/ directory: index.html is one example, and vendor/viz/viz-1.8.0.min.js is another.

$ python -m zipfile -e streamlit-0.70.0-py2.py3-none-any.whl streamlit
$ python -c 'import sys; from bs4 import BeautifulSoup; print(BeautifulSoup(sys.stdin.read()).prettify())' < streamlit/static/index.html
<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8"/>
  <meta content="width=device-width,initial-scale=1,shrink-to-fit=no" name="viewport"/>
  <link href="./favicon.png" rel="shortcut icon"/>
  <title>
   Streamlit
  </title>
  <script src="./vendor/viz/viz-1.8.0.min.js" type="javascript/worker">
  </script>
  <link href="./static/css/8.3bee90c5.chunk.css" rel="stylesheet"/>
  <link href="./static/css/main.b971e4ac.chunk.css" rel="stylesheet"/>
 </head>
 <body>
  <noscript>
   You need to enable JavaScript to run this app.
  </noscript>
  <div id="root">
  </div>
  <script>
   !function(e){function t(t){for(var n,c,o=t[0],d=t[1],u=t[2],i=0,s=[];i<o.length;i++)c=o[i],Object.prototype.hasOwnProperty.call(a,c)&&a[c]&&s.push(a[c][0]),a[c]=0;for(n in d)Object.prototype.hasOwnProperty.call(d,n)&&(e[n]=d[n]);for(l&&l(t);s.length;)s.shift()();return f.push.apply(f,u||[]),r()}function r(){for(var e,t=0;t<f.length;t++){for(var r=f[t],n=!0,c=1;c<r.length;c++){var d=r[c];0!==a[d]&&(n=!1)}n&&(f.splice(t--,1),e=o(o.s=r[0]))}return e}var n={},c={7:0},a={7:0},f=[];function o(t){if(n[t])return n[t].exports;var r=n[t]={i:t,l:!1,exports:{}};return e[t].call(r.exports,r,r.exports,o),r.l=!0,r.exports}o.e=function(e){var t=[];c[e]?t.push(c[e]):0!==c[e]&&{9:1,16:1,17:1,18:1,19:1,20:1,21:1,23:1,24:1,25:1,26:1,27:1,28:1,29:1}[e]&&t.push(c[e]=new Promise((function(t,r){for(var n="static/css/"+({}[e]||e)+"."+{0:"31d6cfe0",1:"31d6cfe0",2:"31d6cfe0",3:"31d6cfe0",4:"31d6cfe0",5:"31d6cfe0",9:"0a5b19c0",10:"31d6cfe0",11:"31d6cfe0",12:"31d6cfe0",13:"31d6cfe0",14:"31d6cfe0",15:"31d6cfe0",16:"2e1a471f",17:"6b25d6a7",18:"94f6e3c8",19:"50b8cd3f",20:"4dca09ae",21:"50b8cd3f",22:"31d6cfe0",23:"1f27639d",24:"1f27639d",25:"1cea3eb4",26:"7a41d28c",27:"507ca017",28:"b85c2d30",29:"2f14b019",30:"31d6cfe0",31:"31d6cfe0",32:"31d6cfe0",33:"31d6cfe0",34:"31d6cfe0",35:"31d6cfe0",36:"31d6cfe0",37:"31d6cfe0",38:"31d6cfe0",39:"31d6cfe0",40:"31d6cfe0",41:"31d6cfe0",42:"31d6cfe0",43:"31d6cfe0"}[e]+".chunk.css",a=o.p+n,f=document.getElementsByTagName("link"),d=0;d<f.length;d++){var u=(l=f[d]).getAttribute("data-href")||l.getAttribute("href");if("stylesheet"===l.rel&&(u===n||u===a))return t()}var i=document.getElementsByTagName("style");for(d=0;d<i.length;d++){var l;if((u=(l=i[d]).getAttribute("data-href"))===n||u===a)return t()}var s=document.createElement("link");s.rel="stylesheet",s.type="text/css",s.onload=t,s.onerror=function(t){var n=t&&t.target&&t.target.src||a,f=new Error("Loading CSS chunk "+e+" failed.\n("+n+")");f.code="CSS_CHUNK_LOAD_FAILED",f.request=n,delete 
c[e],s.parentNode.removeChild(s),r(f)},s.href=a,document.getElementsByTagName("head")[0].appendChild(s)})).then((function(){c[e]=0})));var r=a[e];if(0!==r)if(r)t.push(r[2]);else{var n=new Promise((function(t,n){r=a[e]=[t,n]}));t.push(r[2]=n);var f,d=document.createElement("script");d.charset="utf-8",d.timeout=120,o.nc&&d.setAttribute("nonce",o.nc),d.src=function(e){return o.p+"static/js/"+({}[e]||e)+"."+{0:"0994e6a9",1:"7788d4b6",2:"bd36540a",3:"5c5e0805",4:"07435a63",5:"f9605ed7",9:"436e4d1f",10:"d8bbf586",11:"e2f43c51",12:"e206a8e1",13:"1e70f6d8",14:"4706eee6",15:"5b4370f7",16:"4c7d2529",17:"dfacdaa5",18:"01773fb2",19:"50612e2e",20:"7b1db496",21:"7d3c3404",22:"7810074f",23:"08ab35df",24:"84e0df72",25:"13efdd35",26:"422a3a43",27:"38828d79",28:"fa1fdf0c",29:"a0ab0154",30:"8399286d",31:"136b4dd2",32:"a43bb98b",33:"4fe3fc67",34:"c978d2c5",35:"8b5aee2e",36:"4d3ae505",37:"e6c4ef87",38:"7ec90964",39:"cace1437",40:"63e88d75",41:"a8f86bf1",42:"0f5766e1",43:"9cd3a241"}[e]+".chunk.js"}(e);var u=new Error;f=function(t){d.onerror=d.onload=null,clearTimeout(i);var r=a[e];if(0!==r){if(r){var n=t&&("load"===t.type?"missing":t.type),c=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+n+": "+c+")",u.name="ChunkLoadError",u.type=n,u.request=c,r[1](u)}a[e]=void 0}};var i=setTimeout((function(){f({type:"timeout",target:d})}),12e4);d.onerror=d.onload=f,document.head.appendChild(d)}return Promise.all(t)},o.m=e,o.c=n,o.d=function(e,t,r){o.o(e,t)||Object.defineProperty(e,t,{enumerable:!0,get:r})},o.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},o.t=function(e,t){if(1&t&&(e=o(e)),8&t)return e;if(4&t&&"object"==typeof e&&e&&e.__esModule)return e;var r=Object.create(null);if(o.r(r),Object.defineProperty(r,"default",{enumerable:!0,value:e}),2&t&&"string"!=typeof e)for(var n in e)o.d(r,n,function(t){return e[t]}.bind(null,n));return 
r},o.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return o.d(t,"a",t),t},o.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},o.p="./",o.oe=function(e){throw console.error(e),e};var d=this["webpackJsonpstreamlit-browser"]=this["webpackJsonpstreamlit-browser"]||[],u=d.push.bind(d);d.push=t,d=d.slice();for(var i=0;i<d.length;i++)t(d[i]);var l=u;r()}([])
  </script>
  <script src="./static/js/8.0e8edb57.chunk.js">
  </script>
  <script src="./static/js/main.29738804.chunk.js">
  </script>
 </body>
</html>

So now we know that if we put files in the static/ directory streamlit's webserver will send those files to the client (web browser).

The first thing we do is find the path where streamlit is installed by accessing the __path__ attribute of the streamlit module. __path__ is a list of strings, each an absolute path on the filesystem. I found that index 0 seems to be the directory containing the contents of the wheel extracted above, as placed there by pip install .... We use the pathlib module to create a Path object; the / operator means that we'd like to reference the 'static' directory under the directory where streamlit was installed. This is usually /something/something/pythonX.X/site-packages/streamlit.

STREAMLIT_STATIC_PATH = pathlib.Path(st.__path__[0]) / 'static'
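
The same lookup can also be done without importing the package. A minimal sketch using the standard library's importlib.util.find_spec; the helper name package_dir is hypothetical, and the standard-library package email stands in for streamlit so the example runs without streamlit installed:

```python
import importlib.util
import pathlib


def package_dir(package_name: str) -> pathlib.Path:
    """Return the first directory a package is loaded from, without importing it."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError(f"{package_name!r} is not an installed package")
    # submodule_search_locations holds the paths that become the package's __path__.
    return pathlib.Path(list(spec.submodule_search_locations)[0])


# "email" is used only so the example runs anywhere; it has no static/ directory.
print(package_dir("email") / "static")
```

Calling package_dir("streamlit") / "static" should give the same directory as the line above, assuming streamlit is installed as a regular package.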

The next thing we do is create a downloads/ directory within that top-level static/ directory (there is another directory named static/ under the top-level one; we've chosen not to use that one for downloads). This code runs whenever your streamlit app runs, so it will raise a permission error if you installed streamlit as root (i.e. ran pip install streamlit with sudo), which you should not do! Do not run streamlit as root either. Instead use something like venv, pipenv, poetry, or even conda (so long as you didn't install it with sudo). That way the following code runs with the same privileges (user/group) you had when you ran pip install .... The key is to not use sudo for installing or for running streamlit; then the following will not raise a permission error.

DOWNLOADS_PATH = (STREAMLIT_STATIC_PATH / "downloads")
if not DOWNLOADS_PATH.is_dir():
    DOWNLOADS_PATH.mkdir()

The final bit creates a Markdown link relative to the directory containing index.html, then creates a dataframe and writes it to a file named "mydata.csv" within the downloads/ directory, once again using pathlib.Path to build the correct file path.

    st.markdown("Download from [downloads/mydata.csv](downloads/mydata.csv)")
    mydataframe = pd.DataFrame.from_dict({'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']})
    mydataframe.to_csv(str(DOWNLOADS_PATH / "mydata.csv"), index=False)
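
The steps above can be collected into one helper. This is only a sketch under the same assumptions as the hack itself (the process may write into the static directory); the function name publish_for_download is hypothetical, and a temporary directory stands in for streamlit's static directory so the example runs on its own:

```python
import pathlib
import tempfile


def publish_for_download(static_dir: pathlib.Path, filename: str, data: bytes) -> str:
    """Write `data` under <static_dir>/downloads and return a Markdown link to it."""
    downloads = static_dir / "downloads"
    # exist_ok replaces the separate is_dir() check; parents covers a missing static_dir.
    downloads.mkdir(parents=True, exist_ok=True)
    (downloads / filename).write_bytes(data)
    # The link is relative to index.html, which sits at the top of static_dir.
    return f"[Download {filename}](downloads/{filename})"


# A temporary directory stands in for streamlit's static directory in this demo.
demo_static = pathlib.Path(tempfile.mkdtemp())
link = publish_for_download(demo_static, "mydata.csv", b"col_1,col_2\n3,a\n")
print(link)
```

In the app you would pass STREAMLIT_STATIC_PATH as static_dir and something like mydataframe.to_csv(index=False).encode() as data, then call st.markdown(link). Writing the file before returning the link means the link never points at a file that does not exist yet.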

Let me know if anything is still unclear :)

@pdxjohnny That was very clear! Thank you again.

I think I should learn html. It seems to be pretty useful when working with streamlit.
