Streamlit: Enable file_uploader widget to provide information on file name and/or type

Created on 26 Dec 2019 · 16 Comments · Source: streamlit/streamlit

Problem

If you allow your users to upload multiple types of files, like ["csv", "xlsx"] or ["png", "jpg"], then you have to develop some algorithm to determine the file type yourself. You don't know the file name either.
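To illustrate what "determining the type yourself" can look like, here is a minimal sketch that inspects the leading bytes of the upload. The function name `sniff_file_type` is hypothetical and only covers a few formats; it is not part of Streamlit.

```python
# Well-known leading byte sequences ("magic numbers") for a few formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_PREFIX = b"\xff\xd8\xff"
ZIP_PREFIX = b"PK\x03\x04"  # xlsx files are ZIP containers


def sniff_file_type(data: bytes) -> str:
    """Hypothetical helper: guess a type label from the first bytes of a file."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_PREFIX):
        return "jpg"
    if data.startswith(ZIP_PREFIX):
        return "zip"  # could be xlsx, but also any other ZIP archive
    return "unknown"


print(sniff_file_type(b"\x89PNG\r\n\x1a\n" + b"rest of the image"))
```

Text formats such as csv have no signature, so they end up as "unknown" here, which is exactly why having the file name would help.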

Solution

Return the filename together with the file object, either by adding it as a name attribute to the BytesIO or StringIO object or by returning a tuple of (file, name).
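A rough sketch of the first option, just to make the proposal concrete. `NamedBytesIO` and `file_type_from_name` are hypothetical names I made up for illustration; this is not how Streamlit is implemented.

```python
from io import BytesIO
from os.path import splitext


class NamedBytesIO(BytesIO):
    """Hypothetical: a BytesIO that also remembers the uploaded file's name."""

    def __init__(self, data: bytes, name: str):
        super().__init__(data)
        self.name = name


def file_type_from_name(file) -> str:
    """Return the lowercase extension without the dot, or '' if there is no name."""
    name = getattr(file, "name", "")
    return splitext(name)[1].lstrip(".").lower()


uploaded = NamedBytesIO(b"a,b\n1,2\n", name="data.CSV")
print(file_type_from_name(uploaded))
```

With something like this, the app could branch on the extension instead of guessing from the content.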

Additional context

I experienced this problem when developing the file_uploader example for awesome-streamlit.org.

file_uploader

"""Streamlit v. 0.52 ships with a first version of a **file uploader** widget. You can find the
**documentation**
[here](https://streamlit.io/docs/api.html?highlight=file%20upload#streamlit.file_uploader).

For reference I've implemented an example of file upload here. It's available in the gallery at
[awesome-streamlit.org](https://awesome-streamlit.org).
"""
from enum import Enum
from io import BytesIO, StringIO
from typing import Union

import pandas as pd
import streamlit as st

STYLE = """
<style>
img {
    max-width: 100%;
}
</style>
"""

FILE_TYPES = ["csv", "py", "png", "jpg"]


class FileType(Enum):
    """Used to distinguish between file types"""

    IMAGE = "Image"
    CSV = "csv"
    PYTHON = "Python"


def get_file_type(file: Union[BytesIO, StringIO]) -> FileType:
    """The file uploader widget does not provide information on the type of file uploaded so we have
    to guess using rules or ML

    I've implemented rules for now :-)

    Arguments:
        file {Union[BytesIO, StringIO]} -- The file uploaded

    Returns:
        FileType -- A best guess of the file type
    """

    if isinstance(file, BytesIO):
        return FileType.IMAGE
    content = file.getvalue()
    if (
        content.startswith('"""')
        or "import" in content
        or "from " in content
        or "def " in content
        or "class " in content
        or "print(" in content
    ):
        return FileType.PYTHON

    return FileType.CSV


def main():
    """Run this function to display the Streamlit app"""
    st.info(__doc__)
    st.markdown(STYLE, unsafe_allow_html=True)

    file = st.file_uploader("Upload file", type=FILE_TYPES)
    show_file = st.empty()
    if not file:
        show_file.info("Please upload a file of type: " + ", ".join(FILE_TYPES))
        return

    file_type = get_file_type(file)
    if file_type == FileType.IMAGE:
        show_file.image(file)
    elif file_type == FileType.PYTHON:
        st.code(file.getvalue())
    else:
        data = pd.read_csv(file)
        st.dataframe(data.head(10))

    file.close()


main()
enhancement file_uploader

All 16 comments

Agreed that we need some info on the uploaded file!

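For just uploading a df, as a workaround, I'm using a simple try/except and letting it throw the error in the last try:

```python
import pandas as pd
import streamlit as st

result = st.file_uploader("Upload", type=["csv", "xls", "xlsx"])


def try_read_df(f):
    """Try CSV first; fall back to Excel and let that error propagate."""
    try:
        return pd.read_csv(f)
    except Exception:
        f.seek(0)  # rewind the buffer before the second attempt
        return pd.read_excel(f)


if result:
    df = try_read_df(result)
    st.dataframe(df)
```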

These are all really great points. Thank you @MarcSkovMadsen and @svombergen!

For now I am just asking the user which type of file they are about to upload. Many thanks to @MarcSkovMadsen and @svombergen for suggesting other possible ways.

Thanks @MarcSkovMadsen for demonstrating such excellent cases for the problem and solution.
Thanks also to the Streamlit development team for creating such a magical utility for the community.

I am facing a similar issue, described below.

Problem
I cannot get the filename, so I cannot use it for the next related action.

Description
Sometimes we need the filename for a follow-up action, such as automatically mailing the analysis result to an admin after users upload their files.

Example of an automatically mailed result:

FileName | Has outlier?
file_abc.csv | yes
file_def.csv | no

To achieve this, we need to know the filename first, but I don't know how to get it through the st.file_uploader API.
@MarcSkovMadsen, can you recommend a hack to do it?
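To show why the filename matters for this kind of report, here is a small sketch that builds the table above once names are available. The helper names and the outlier rule (a simple z-score threshold) are my own assumptions for illustration, not something from the poster or from Streamlit.

```python
from statistics import mean, pstdev


def has_outlier(values, z_threshold=2.0):
    """Assumed rule: flag any value more than z_threshold std devs from the mean."""
    sigma = pstdev(values)
    if sigma == 0:
        return False
    mu = mean(values)
    return any(abs(v - mu) / sigma > z_threshold for v in values)


def build_report(files):
    """files maps a filename to its numeric values; returns printable report lines."""
    lines = ["FileName | Has outlier?"]
    for name, values in files.items():
        lines.append(f"{name} | {'yes' if has_outlier(values) else 'no'}")
    return lines


report = build_report({
    "file_abc.csv": [1, 1, 1, 1, 1, 1, 1, 1, 1, 100],
    "file_def.csv": [1, 2, 3, 4, 5],
})
print("\n".join(report))
```

Without the filename, the first column of such a report cannot be filled in, which is the gap this issue is about.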

One workaround I see right now is to upload the file separately and then provide buttons or checkboxes for "Outlier", or a text box for the filename.

We needed the filename earlier, so we forked it. Let me know if you want me to create a PR or if you would rather just copy and paste the minor changes.

Your solution is great :)). I'm very interested in your image demo, if you can share more about it. Many thanks, sir.

Hey @tatien777, in the timeline above I have a patch that provides the file contents and file name: https://github.com/sstm2/streamlit/commit/8db66ee9bc28303d9972ae5db088547382ee86be

Does Streamlit version 0.59.0 have an option to retrieve the filename from file_uploader()? If yes, could you please tell me how to get the filename of the uploaded file?

It would be nice to also have a parameter, e.g. pass_filename as a bool, to pass just the filename to the server. If false (the default behaviour), the file data would be passed to the server. This would allow local applications to handle the I/O of large data files themselves. I'm suggesting this given the network error that occurs when trying to send large files through the browser to Streamlit. I can create a PR for this feature if need be!

Thanks for the suggestion @Ashton-Sidhu! Just so we can get a better understanding, could you clarify your use case for just passing the filename over? If you're just looking to get the filename, would an input field or something like this work?

Regarding the network error when uploading large files, is that the bug identified in #1522 or a different one?

Unfortunately @MaryHelenaRose we do not have this functionality at this time. It is something that we'll definitely be doing in the next iteration.

Hi @karriebear, it stems more from Evah's issue here.

The ideal way (for my use case) would be to give the option to pick a local file that can be theoretically unbounded in size (realistically in the GBs) and then open the file with an engine like Spark, hence the need for only the filename.

In my project I do exactly as you suggested and just use a text box, because the file could be anywhere on the host. I just thought it would be a nice (and hopefully simple) change to be able to use the same file picker UI and return only the full path to the file on the local machine.
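For anyone wanting to try the text box workaround, a minimal sketch of the server-side part might look like the following. It assumes the app runs on the same machine as the file; `resolve_local_file` and `iter_chunks` are hypothetical helper names, and the path would come from whatever text input the app uses.

```python
from pathlib import Path


def resolve_local_file(path_text: str) -> Path:
    """Validate a user-typed path and return it as an absolute Path."""
    path = Path(path_text.strip()).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return path.resolve()


def iter_chunks(path: Path, chunk_size: int = 1024 * 1024):
    """Yield the file's bytes in chunks so it never has to fit in memory."""
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Since only the path crosses the browser, nothing large is uploaded; the file is read where it already lives.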

For reference here is the project/use case I am talking about: https://github.com/Ashton-Sidhu/sysmon-extract

@Ashton-Sidhu - thanks for the clarification.

Evah's issue has been fixed and is now available in 0.62!

Regarding your use case, we have issue #904 which looks to address this. Please feel free to follow/+1 that issue.

Hi karriebear!
Do you know when the next iteration will be? You are doing a great job, guys! Thank you!!

We're currently working on it and targeting early Q4!

Example use case for including the filename in a file upload: creating a simple image cropping tool to preprocess images

  • user uploads "filename.jpg"
  • user interacts with Streamlit to tune parameters
  • user clicks "Save/process image"
  • image is processed and saved as "filename-crop.jpg"
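The last step of that workflow only works if the original name is known. As a sketch, deriving the output name could be as simple as this (`cropped_name` is a hypothetical helper, not a Streamlit function):

```python
from pathlib import PurePath


def cropped_name(original_name: str, suffix: str = "-crop") -> str:
    """Insert a suffix before the extension; any directory part is dropped."""
    path = PurePath(original_name)
    return f"{path.stem}{suffix}{path.suffix}"


print(cropped_name("filename.jpg"))  # filename-crop.jpg
```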