Shiny: File upload taking a very long time using shiny 1.3.2 and R version 3.6.0

Created on 3 Jun 2019  ·  46 Comments  ·  Source: rstudio/shiny

Data uploaded: any data 35MB or bigger.

Code to reproduce:

options(shiny.maxRequestSize = 100*1024^2)

server <- function(input, output, session) {

  # Read the uploaded CSV whenever a new file is selected
  rawdata <- shiny::eventReactive(input$inFile, {
    data.table::fread(input$inFile$datapath, header = input$header, sep = ",", data.table = FALSE, verbose = FALSE)
  })

}

ui <- shiny::fluidPage(title = "Hello",
  shiny::navbarPage("Input", id = "allResults",
    shiny::tabPanel(value = 'inputData', title = 'Data Import',
      shiny::br(),

      shiny::h4("Import data"),

      shiny::fileInput(inputId = "inFile", "Choose a CSV File",
        accept = c(
          "text/csv",
          "text/comma-separated-values,text/plain",
          ".csv"
        )
      ),

      shiny::checkboxInput("header", "Header", TRUE)

    )
  )
)

shiny::shinyApp(ui, server)
Needs Repro

Most helpful comment

I think I've tracked down a potential cause of the problem.

The serviceApp() function calls httpuv::service() here: https://github.com/rstudio/shiny/blob/178872d/R/server.R#L509

service() in turn calls run_now(all=FALSE):
https://github.com/rstudio/httpuv/blob/f109c480/R/httpuv.R#L599

With all=FALSE, each time service() runs (and each time its caller, serviceApp(), runs), it executes only one callback from the event loop queue. Each time a chunk of data comes in for a file upload, it schedules one onBodyData callback. On my Windows machine, each chunk of data is 16kB. That means that the onBodyData callback is scheduled and executed thousands of times for the 60MB test file I've been using.

Because there's a fair amount of stuff going on in each call of serviceApp(), the whole process takes a large amount of time.
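As a rough back-of-the-envelope check of the reasoning above, the number of onBodyData callbacks can be estimated from the chunk size and file size mentioned in the comment. The per-iteration overhead values below are purely hypothetical and are only there to show how a small fixed cost per serviceApp() call could add up; they are not measurements.

```r
# Sizes taken from the comment: 16kB chunks, a test file of roughly 60MB.
chunk_bytes <- 16 * 1024
file_bytes  <- 60 * 1024^2

# One onBodyData callback per chunk, and (with all=FALSE) one callback
# per serviceApp() iteration.
n_callbacks <- ceiling(file_bytes / chunk_bytes)
n_callbacks

# Hypothetical fixed overhead per serviceApp() iteration, in milliseconds.
# These values are assumptions for illustration only.
overhead_ms <- c(1, 5, 10)
estimate <- data.frame(
  overhead_ms   = overhead_ms,
  total_seconds = n_callbacks * overhead_ms / 1000
)
estimate
```

Under these assumptions, the total delay scales linearly with both the file size and the per-iteration cost, which would be consistent with larger files taking disproportionately long to reach "Upload complete".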


I've created a branch of httpuv that tests this theory. I don't think that it's the right long-term solution, but if it helps your case, then that gives us confidence that this theory is correct.

To install it, restart R, then run:

devtools::install_github('rstudio/httpuv@wch-run-all')

Then please try the test app again and let us know how it performs.

All 46 comments

Hm, thanks for the report, we'll try to reproduce. This reminded me of a similar thread on Community.

What platform are you running on? Can you provide the output of sessionInfo(), or even better, sessioninfo::session_info()?

Thanks, both!

Here is the output:

  • Session info ---------------------------------------------------------------------------------------------------------------------------------------------------
    setting value
    version R version 3.6.0 (2019-04-26)
    os Windows 10 x64
    system x86_64, mingw32
    ui RStudio
    language (EN)
    collate English_United States.1252
    ctype English_United States.1252
    tz America/New_York
    date 2019-06-04

  • Packages -------------------------------------------------------------------------------------------------------------------------------------------------------
    package * version date lib source
    abind * 1.4-5 2016-07-21 [2] CRAN (R 3.6.0)
    assertthat 0.2.1 2019-03-21 [2] CRAN (R 3.6.0)
    bitops 1.0-6 2013-08-17 [2] CRAN (R 3.6.0)
    brew 1.0-6 2011-04-13 [2] CRAN (R 3.6.0)
    caTools 1.17.1.2 2019-03-06 [2] CRAN (R 3.6.0)
    class 7.3-15 2019-01-01 [2] CRAN (R 3.6.0)
    cli 1.1.0 2019-03-19 [2] CRAN (R 3.6.0)
    clipr 0.6.0 2019-04-15 [2] CRAN (R 3.6.0)
    cluster * 2.0.8 2019-04-05 [2] CRAN (R 3.6.0)
    codetools 0.2-16 2018-12-24 [2] CRAN (R 3.6.0)
    colorspace 1.4-1 2019-03-18 [2] CRAN (R 3.6.0)
    crayon 1.3.4 2017-09-16 [2] CRAN (R 3.6.0)
    crosstalk 1.0.0 2016-12-21 [2] CRAN (R 3.6.0)
    data.table * 1.12.2 2019-04-07 [2] CRAN (R 3.6.0)
    data.tree * 0.7.8 2018-09-24 [2] CRAN (R 3.6.0)
    dendextend 1.10.0 2019-03-15 [2] CRAN (R 3.6.0)
    DEoptimR 1.0-8 2016-11-19 [2] CRAN (R 3.6.0)
    DiagrammeR 1.0.1 2019-04-22 [2] CRAN (R 3.6.0)
    digest 0.6.18 2018-10-10 [2] CRAN (R 3.6.0)
    diptest 0.75-7 2016-12-05 [2] CRAN (R 3.6.0)
    downloader 0.4 2015-07-09 [2] CRAN (R 3.6.0)
    dplyr 0.8.0.1 2019-02-15 [2] CRAN (R 3.6.0)
    DT 0.5 2018-11-05 [2] CRAN (R 3.6.0)
    flexmix 2.3-15 2019-02-18 [2] CRAN (R 3.6.0)
    foreach 1.4.4 2017-12-12 [2] CRAN (R 3.6.0)
    fpc 2.1-11.2 2019-04-22 [2] CRAN (R 3.6.0)
    gclus 1.3.2 2019-01-07 [2] CRAN (R 3.6.0)
    gdata 2.18.0 2017-06-06 [2] CRAN (R 3.6.0)
    ggplot2 * 3.1.1 2019-04-07 [2] CRAN (R 3.6.0)
    glue 1.3.1 2019-03-12 [2] CRAN (R 3.6.0)
    gplots 3.0.1.1 2019-01-27 [2] CRAN (R 3.6.0)
    gridExtra 2.3 2017-09-09 [2] CRAN (R 3.6.0)
    gtable 0.3.0 2019-03-25 [2] CRAN (R 3.6.0)
    gtools 3.8.1 2018-06-26 [2] CRAN (R 3.6.0)
    hms 0.4.2 2018-03-10 [2] CRAN (R 3.6.0)
    htmltools 0.3.6 2017-04-28 [2] CRAN (R 3.6.0)
    htmlwidgets 1.3 2018-09-30 [2] CRAN (R 3.6.0)
    httpuv 1.5.1 2019-04-05 [2] CRAN (R 3.6.0)
    igraph 1.2.4.1 2019-04-22 [2] CRAN (R 3.6.0)
    influenceR 0.1.0 2015-09-03 [2] CRAN (R 3.6.0)
    iterators 1.0.10 2018-07-13 [2] CRAN (R 3.6.0)
    jsonlite 1.6 2018-12-07 [2] CRAN (R 3.6.0)
    kernlab 0.9-27 2018-08-10 [2] CRAN (R 3.6.0)
    KernSmooth 2.23-15 2015-06-29 [2] CRAN (R 3.6.0)
    labeling 0.3 2014-08-23 [2] CRAN (R 3.6.0)
    later 0.8.0 2019-02-11 [2] CRAN (R 3.6.0)
    lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.0)
    lazyeval 0.2.2 2019-03-15 [2] CRAN (R 3.6.0)
    lsr * 0.5 2015-03-02 [2] CRAN (R 3.6.0)
    magrittr 1.5 2014-11-22 [2] CRAN (R 3.6.0)
    MASS 7.3-51.4 2019-03-31 [2] CRAN (R 3.6.0)
    mclust 5.4.3 2019-03-14 [2] CRAN (R 3.6.0)
    mime 0.6 2018-10-05 [2] CRAN (R 3.6.0)
    modeltools 0.2-22 2018-07-16 [2] CRAN (R 3.6.0)
    munsell 0.5.0 2018-06-12 [2] CRAN (R 3.6.0)
    mvtnorm 1.0-10 2019-03-05 [2] CRAN (R 3.6.0)
    nnet 7.3-12 2016-02-02 [2] CRAN (R 3.6.0)
    openxlsx 4.1.0 2018-05-26 [2] CRAN (R 3.6.0)
    pillar 1.3.1 2018-12-15 [2] CRAN (R 3.6.0)
    pkgconfig 2.0.2 2018-08-16 [2] CRAN (R 3.6.0)
    plotrix * 3.7-5 2019-04-07 [2] CRAN (R 3.6.0)
    plyr * 1.8.4 2016-06-08 [2] CRAN (R 3.6.0)
    prabclus 2.2-7 2019-01-17 [2] CRAN (R 3.6.0)
    promises 1.0.1 2018-04-13 [2] CRAN (R 3.6.0)
    purrr 0.3.2 2019-03-15 [2] CRAN (R 3.6.0)
    R6 2.4.0 2019-02-14 [2] CRAN (R 3.6.0)
    RColorBrewer 1.1-2 2014-12-07 [2] CRAN (R 3.6.0)
    Rcpp 1.0.1 2019-03-17 [2] CRAN (R 3.6.0)
    readr 1.3.1 2018-12-21 [2] CRAN (R 3.6.0)
    registry 0.5-1 2019-03-05 [2] CRAN (R 3.6.0)
    reshape2 * 1.4.3 2017-12-11 [2] CRAN (R 3.6.0)
    rgexf 0.15.3 2015-03-24 [2] CRAN (R 3.6.0)
    rhandsontable * 0.3.7 2018-11-20 [2] CRAN (R 3.6.0)
    rlang 0.3.4 2019-04-07 [2] CRAN (R 3.6.0)
    robustbase 0.93-4 2019-03-19 [2] CRAN (R 3.6.0)
    Rook 1.1-1 2014-10-20 [2] CRAN (R 3.6.0)
    rstudioapi 0.10 2019-03-19 [2] CRAN (R 3.6.0)
    scales 1.0.0 2018-08-09 [2] CRAN (R 3.6.0)
    seriation * 1.2-3 2018-02-05 [2] CRAN (R 3.6.0)
    sessioninfo 1.1.1 2018-11-05 [2] CRAN (R 3.6.0)
    shiny * 1.2.0 2018-11-02 [1] CRAN (R 3.6.0)
    shinyAce 0.3.3 2019-01-03 [2] CRAN (R 3.6.0)
    shinyalert 1.0 2018-02-12 [2] CRAN (R 3.6.0)
    shinyjs * 1.0 2018-01-08 [2] CRAN (R 3.6.0)
    shinythemes 1.1.2 2018-11-06 [2] CRAN (R 3.6.0)
    stringi 1.4.3 2019-03-12 [2] CRAN (R 3.6.0)
    stringr 1.4.0 2019-02-10 [2] CRAN (R 3.6.0)
    tibble 2.1.1 2019-03-16 [2] CRAN (R 3.6.0)
    tidyr 0.8.3 2019-03-01 [2] CRAN (R 3.6.0)
    tidyselect 0.2.5 2018-10-11 [2] CRAN (R 3.6.0)
    trimcluster 0.1-2.1 2018-07-20 [2] CRAN (R 3.6.0)
    TSP 1.1-6 2018-04-30 [2] CRAN (R 3.6.0)
    viridis 0.5.1 2018-03-29 [2] CRAN (R 3.6.0)
    viridisLite 0.3.0 2018-02-01 [2] CRAN (R 3.6.0)
    visNetwork 2.0.6 2019-03-26 [2] CRAN (R 3.6.0)
    whisker 0.3-2 2013-04-28 [2] CRAN (R 3.6.0)
    withr 2.1.2 2018-03-15 [2] CRAN (R 3.6.0)
    XML 3.98-1.19 2019-03-06 [2] CRAN (R 3.6.0)
    xtable 1.8-4 2019-04-21 [2] CRAN (R 3.6.0)
    yaml 2.2.0 2018-07-25 [2] CRAN (R 3.6.0)
    zip 2.0.1 2019-03-11 [2] CRAN (R 3.6.0)

[1] C:/Users/Ketty Noonan/Documents/R/win-library/3.6
[2] C:/Users/Ketty Noonan/Documents/R/R-3.6.0/library

What version of RStudio are you using?

Version 1.2.1335

A couple quick comments:

  • I'm not able to reproduce the problem. When I run this app on my Windows machine (R 3.5.1, RStudio 1.2.1290), it takes about 5-10 seconds to upload a 60MB CSV, before it says Upload Complete.
  • The way the app is structured, the eventReactive() never actually fires, because there's no observer consuming the value. You can see this by adding a line like message("hello") inside the eventReactive -- it won't print it out.
  • In the profvis output (in the Community post), it would help to have a screenshot of the flame graph with internal function calls not hidden (which they are by default). See this page for how to do that.

Other possible causes of slowness that come to mind:

  • Virus scanner
  • VPN
  • A slow disk used for the R temp directory. You can run tempdir() to see what R's temp directory is, and then try copying a file there with file.copy() to see if it's slow.
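The temp-directory check suggested in the last bullet could be scripted roughly as follows. This is a minimal sketch using only base R; the helper name time_temp_copy() and the generated scratch file are assumptions made for illustration, and for a realistic test you would pass the path of the actual file you are uploading as src.

```r
# Hypothetical helper: time how long it takes to copy a file into tempdir().
time_temp_copy <- function(src = NULL, size_mb = 60) {
  if (is.null(src)) {
    # No file supplied: create a scratch file of the requested size.
    src <- tempfile(fileext = ".bin")
    writeBin(raw(size_mb * 1024^2), src)
    on.exit(unlink(src), add = TRUE)
  }
  dest <- file.path(tempdir(), paste0("copy-test-", basename(src)))
  on.exit(unlink(dest), add = TRUE)

  elapsed <- system.time(
    ok <- file.copy(src, dest, overwrite = TRUE)
  )[["elapsed"]]

  list(copied = ok, size_mb = file.size(dest) / 1024^2, seconds = elapsed)
}

tempdir()                          # where Shiny will write the upload
res <- time_temp_copy(size_mb = 60)
res$seconds
```

If the copy takes only a fraction of a second, a slow temp disk is unlikely to explain a delay of tens of seconds.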

Thanks, Winston.

Quick comments/questions:

  • Can you try testing using R 3.6.0 and shiny 1.3.2? My colleague couldn't reproduce when using R 3.5.0 and shiny 1.2.0 but got the same issue when using R 3.6.0 and shiny 1.3.2.
  • What's the correct way to structure the app? I use eventReactive in my other apps and the data is always read in.
  • Where can I find "Settings" to uncheck Settings -> Hide internal function calls?
  • I ran tempdir() and tried copying a file there with file.copy() and it's not slow.

> What's the correct way to structure the app? I use eventReactive in my other apps and the data is always read in.

The eventReactive() needs to be used by an observer (or a renderXx function) in order for it to fire. But if you get slowness with the test app that you've provided, then the eventReactive isn't the source of the problem. For the purposes of diagnosing the problem, you should just delete it from the server function.
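For reference, a minimal sketch of a server function in which the eventReactive() does fire, because an output consumes its value. The output id "nrows" is a hypothetical name, and the UI would need a matching verbatimTextOutput("nrows"); this is not needed for diagnosing the upload delay itself.

```r
# Sketch only: the reactive from the report, plus an output that pulls its value.
# Requires the shiny and data.table packages when the app is actually run.
server <- function(input, output, session) {
  rawdata <- shiny::eventReactive(input$inFile, {
    # This message is printed once an upload finishes, because output$nrows
    # depends on rawdata().
    message("Reading upload...")
    data.table::fread(input$inFile$datapath, header = input$header,
                      sep = ",", data.table = FALSE)
  })

  # "nrows" is an assumed output id, used here only for illustration.
  output$nrows <- shiny::renderText({
    paste("Number of rows:", nrow(rawdata()))
  })
}
```

Without the renderText() call (or an observer), nothing ever asks for rawdata(), so the lazy reactive is never evaluated.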

> Where can I find "Settings" to uncheck Settings -> Hide internal function calls?

It's "Options", not "Settings" -- there's a typo in the documentation which I'll fix.

The flame graph and data. Let me know if there's anything else that might be helpful. Thanks!

Oh hm, that is interesting. Do you mind saving that profile and emailing it to me, at [email protected]?

I just sent it to your e-mail address. Thanks!

Thanks for sending it. It appears that most of the time is spent by Shiny in Sys.sleep(), which means that Shiny is waiting for something to happen instead of being stuck actively doing work, but at this point I don't know what exactly the slow part is.

Thanks, Winston. Let me know if you find out more or if there's anything else I can send to you to help understand it. I will keep problem-solving it with my colleagues.

Hi,

Just FYI, I have the same problem on my side (my laptop is on Windows too).

I'll try to investigate too...

Dom

I have activated tracing with options(shiny.trace = TRUE), and here is what I get:

# ...
SEND {"recalculating":{"name":"table","status":"recalculated"}}
SEND {"busy":"idle"}
SEND {"errors":{"table":{"message":"input= must be a single character string containing a file name, a system command containing at least one space, a URL starting 'http[s]://', 'ftp[s]://' or 'file://', or, the input data itself containing at least one \\n or \\r","call":["data.table::fread(input = userFile()$datapath, header = input$header, ","    sep = input$sep, dec = input$dec, encoding = input$encoding, ","    quote = input$quote, showProgress = TRUE)"],"type":null}},"values":[],"inputMessages":[]}
RECV {"method":"uploadInit","args":[[{"name":"LA_TRANSITION_ECOLOGIQUE.csv","size":121157487,"type":"application/vnd.ms-excel"}]],"tag":0}
SEND {"response":{"tag":0,"value":{"jobId":"b1b0ebacf9965a37315e05a7","uploadUrl":"session/b6638bdaf6343375d00bc4bb09be7f7a/upload/b1b0ebacf9965a37315e05a7?w="}}}
#
# at this point the blue progress bar in the interface is at 100% (fully blue) but there are still "moving waves" on the bar and the message "Upload complete" has NOT appeared
#
# [... 40 seconds later]
#
# at this point the "Upload complete"  appeared
#
RECV {"method":"uploadEnd","args":["b1b0ebacf9965a37315e05a7","datafile-file"],"tag":1}
SEND {"progress":{"type":"binding","message":{"id":"table"}}}
SEND {"busy":"busy"}
SEND {"response":{"tag":1,"value":null}}
SEND {"recalculating":{"name":"table","status":"recalculating"}}

|--------------------------------------------------|
|==================================================|
# this is the print of data.table::fread, this means that 40 seconds have been spent before import...
SEND {"recalculating":{"name":"table","status":"recalculated"}}
# ...

I hope it may help to understand what it is happening...

Thanks, Dom.

Here is the output after I activated options(shiny.trace = TRUE). I also need help understanding it.

@wch , hope this provides additional information...

Listening on http://127.0.0.1:4630
SEND {"config":{"workerId":"","sessionId":"adbfd8fc76a337fa3d9e91fe7fc96313","user":null}}
RECV {"method":"init","data":{"inFile:shiny.file":null,"allResults":"inputData","header":true,".clientdata_pixelratio":1.5,".clientdata_url_protocol":"http:",".clientdata_url_hostname":"127.0.0.1",".clientdata_url_port":"4630",".clientdata_url_pathname":"/",".clientdata_url_search":"",".clientdata_url_hash_initial":"",".clientdata_url_hash":"",".clientdata_singletons":"",".clientdata_allowDataUriScheme":true}}
SEND {"errors":[],"values":[],"inputMessages":[]}
SEND {"config":{"workerId":"","sessionId":"94ded9ff47d2e6c29c1f7d60358c0cd9","user":null}}
RECV {"method":"init","data":{"inFile:shiny.file":null,"allResults":"inputData","header":true,".clientdata_pixelratio":1.5,".clientdata_url_protocol":"http:",".clientdata_url_hostname":"127.0.0.1",".clientdata_url_port":"4630",".clientdata_url_pathname":"/",".clientdata_url_search":"",".clientdata_url_hash_initial":"",".clientdata_url_hash":"",".clientdata_singletons":"",".clientdata_allowDataUriScheme":true}}
SEND {"errors":[],"values":[],"inputMessages":[]}

40 seconds later

"Upload Complete" appeared

RECV {"method":"uploadInit","args":[[{"name":"ABI (clean).csv","size":33396298,"type":"application/vnd.ms-excel"}]],"tag":0}
SEND {"response":{"tag":0,"value":{"jobId":"f4852167a85991c3bf080dd3","uploadUrl":"session/94ded9ff47d2e6c29c1f7d60358c0cd9/upload/f4852167a85991c3bf080dd3?w="}}}
RECV {"method":"uploadEnd","args":["f4852167a85991c3bf080dd3","inFile"],"tag":1}
SEND {"response":{"tag":1,"value":null}}

I have created a test application in an attempt to quantify the discrepancy between upload speeds on Windows vs on other platforms: https://github.com/alandipert/uploadtest

The test app uses a snippet of JavaScript to start the timer as soon as the upload is submitted via the browser's file selection dialog, and stops it when the file is available on the server.

I have included a 60 MB file in that repo (60mb.pdf) for testing.

Procedure

  1. Start the test app running
  2. Connect to the test app from the browser
  3. Upload the 60mb.pdf file
  4. Take note of the time under "Elapsed"

I restarted the app between every test.

Client/server were located on the same machine, though my Linux RStudio Pro runs in a VM on my Mac. I conducted Windows testing on a separate laptop.

Timings

Measurement in seconds, rounded to the nearest tenth of a second. Measurements in bold were conducted with the server and browser on the same machine.

|Client|Linux (VM, server)|Windows (server)|
|-|-|-|
|Mac/Firefox|1.9|8.2|
|Mac/Chrome|5.6|8.1|
|Mac/Safari|3.4|7.8|
|Windows/Firefox||4.1|
|Windows/Chrome||7.9|
|Windows/Edge||2.4|
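To make the timings easier to compare, they can be converted to an approximate throughput. The sketch below copies the numbers from the table and assumes a nominal file size of 60 MB (the size stated for 60mb.pdf); the column names are made up for illustration.

```r
# Timings in seconds, copied from the table above (NA = not measured).
timings <- data.frame(
  client     = c("Mac/Firefox", "Mac/Chrome", "Mac/Safari",
                 "Windows/Firefox", "Windows/Chrome", "Windows/Edge"),
  linux_vm_s = c(1.9, 5.6, 3.4, NA, NA, NA),
  windows_s  = c(8.2, 8.1, 7.8, 4.1, 7.9, 2.4)
)

file_mb <- 60  # nominal size of the test file; an approximation

# Approximate throughput in MB/s for each server platform.
timings$linux_vm_mb_per_s <- round(file_mb / timings$linux_vm_s, 1)
timings$windows_mb_per_s  <- round(file_mb / timings$windows_s, 1)
timings
```

Even the slowest measurement here corresponds to several MB/s, far faster than the multi-minute waits reported elsewhere in this thread.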

Observations

It would appear that Chrome is the slowest way to upload a file, on both server platforms. Are uploads faster for other people on Windows if using Edge or Firefox instead of Chrome?

Hi Alan,

Based on past experience, Chrome is the slowest way to upload a file.

I ran your app on Windows. This is what I found:

  1. "Elapsed" shown on Firefox is 2.59 s; on Chrome is 2.68 s.
  2. However, the actual elapsed time on both is 3-4 minutes.

What could be going on here?

Thanks for trying that example app. When you use it, do you see the "Elapsed" measurement appear after that 3-4 minute wait? Or does it appear, and then you find yourself waiting an additional 3-4 minutes?

Hi Alan, it's the former.

Hm: how odd. That does narrow possibilities down though, I'll keep investigating and follow up here with any discoveries.

Hi, I am facing the exact issue while using fileInput. I am building a sales dashboard using shiny. When I upload a 100 MB file on shiny server, it takes ~10-20 minutes to complete the process.

@ktanizar When you say "actual elapsed time" do you mean that:

  1. You see the progress bar animation for 3-4 minutes
  2. After 3-4 minutes you see both the "Uploaded" and "Elapsed" areas appear

Thanks in advance for clarifying.

@alandipert, thanks for looking into this.

The progress bar animation finishes in a few seconds. I see "Upload complete" (see attached) after 3-4 minutes.

Hope this clarifies!

Here's a working example app:

library(shiny)
library(data.table)
options(shiny.maxRequestSize = 100*1024^2)

server <- function(input, output, session) {
  output$txt <- renderText({
    req(input$inFile)
    message("Reading data...")
    rdata <- fread(input$inFile$datapath, header = input$header,
                   sep = ",", data.table = F, verbose = F)

    message("Done")

    paste("Number of rows: ", nrow(rdata))
  })
}

ui <- fluidPage(title = "Hello",
  navbarPage("Input", id = "allResults",
    tabPanel(value = 'inputData', title = 'Data Import',
      br(),

      h4("Import data"),

      fileInput(inputId = "inFile", "Choose a CSV File",
        accept = c(
          "text/csv",
          "text/comma-separated-values,text/plain",
          ".csv"
        )
      ),
      checkboxInput("header", "Header", TRUE),
      verbatimTextOutput("txt")

    )
  )
)

shinyApp(ui, server)

Note that I'm still unable to reproduce the problem. The output text is updated within 10 seconds of uploading the 500,000 row, 60MB file.

Here's my sessioninfo:

- Session info -------------------------------------------------------------
 setting  value                       
 version  R version 3.5.1 (2018-07-02)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       America/Chicago             
 date     2019-08-20                  

- Packages -----------------------------------------------------------------
 package     * version    date       lib source                           
 assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.5.3)                   
 cli           1.1.0      2019-03-19 [1] CRAN (R 3.5.3)                   
 crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.0)                   
 data.table  * 1.12.2     2019-04-07 [1] CRAN (R 3.5.3)                   
 digest        0.6.20     2019-07-04 [1] CRAN (R 3.5.3)                   
 fastmap       1.0.0      2019-07-28 [1] CRAN (R 3.5.3)                   
 htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.5.0)                   
 httpuv        1.5.1      2019-04-05 [1] CRAN (R 3.5.3)                   
 jsonlite      1.6        2018-12-07 [1] CRAN (R 3.5.2)                   
 later         0.8.0.9003 2019-08-06 [1] Github (r-lib/later@ae297fa)     
 magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.0)                   
 mime          0.7        2019-06-11 [1] CRAN (R 3.5.3)                   
 promises      1.0.1.9002 2019-08-06 [1] Github (rstudio/promises@9ebad6d)
 R6            2.4.0      2019-02-14 [1] CRAN (R 3.5.2)                   
 Rcpp          1.0.2      2019-07-25 [1] CRAN (R 3.5.3)                   
 rlang         0.4.0      2019-06-25 [1] CRAN (R 3.5.3)                   
 rstudioapi    0.10       2019-03-19 [1] CRAN (R 3.5.3)                   
 sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.1)                   
 shiny       * 1.3.2.9001 2019-08-20 [1] Github (rstudio/shiny@178872d)                           
 withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.0)                   
 xtable        1.8-4      2019-04-21 [1] CRAN (R 3.5.3)                   

Also tested on R 3.6.1, with the same result (~10 seconds):

- Session info -------------------------------------------------------------
 setting  value                       
 version  R version 3.6.1 (2019-07-05)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       America/Chicago             
 date     2019-08-20                  

- Packages -----------------------------------------------------------------
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.1)
 cli           1.1.0   2019-03-19 [1] CRAN (R 3.6.1)
 crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.1)
 data.table  * 1.12.2  2019-04-07 [1] CRAN (R 3.6.1)
 digest        0.6.20  2019-07-04 [1] CRAN (R 3.6.1)
 htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.6.1)
 httpuv        1.5.1   2019-04-05 [1] CRAN (R 3.6.1)
 jsonlite      1.6     2018-12-07 [1] CRAN (R 3.6.1)
 later         0.8.0   2019-02-11 [1] CRAN (R 3.6.1)
 magrittr      1.5     2014-11-22 [1] CRAN (R 3.6.1)
 mime          0.7     2019-06-11 [1] CRAN (R 3.6.0)
 promises      1.0.1   2018-04-13 [1] CRAN (R 3.6.1)
 R6            2.4.0   2019-02-14 [1] CRAN (R 3.6.1)
 Rcpp          1.0.2   2019-07-25 [1] CRAN (R 3.6.1)
 rlang         0.4.0   2019-06-25 [1] CRAN (R 3.6.1)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.1)
 shiny       * 1.3.2   2019-04-22 [1] CRAN (R 3.6.1)
 withr         2.1.2   2018-03-15 [1] CRAN (R 3.6.1)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 3.6.1)

With options(shiny.trace=T), I noticed that my output is slightly different from that of @ktanizar and @dominiqueemmanuel.

In their tests, the type is "application/vnd.ms-excel".

In my test, the type of the file is "". I get that when I use either Chrome or IE11 as the browser. I don't have MS Office installed on my system.

RECV {"method":"uploadInit","args":[[{"name":"500000 Sales Records.csv","size":62402019,"type":""}]],"tag":0}
SEND {"response":{"tag":0,"value":{"jobId":"5d13ccc2f92541292d307819","uploadUrl":"session/a4541220ff91e1b87d6002cc349596cc/upload/5d13ccc2f92541292d307819?w="}}}
RECV {"method":"uploadEnd","args":["5d13ccc2f92541292d307819","inFile"],"tag":1}
SEND {"progress":{"type":"binding","message":{"id":"txt"}}}
SEND {"busy":"busy"}
SEND {"response":{"tag":1,"value":null}}
SEND {"recalculating":{"name":"txt","status":"recalculating"}}
Reading data...
Done
SEND {"recalculating":{"name":"txt","status":"recalculated"}}
SEND {"busy":"idle"}
SEND {"errors":[],"values":{"txt":"Number of rows:  500000"},"inputMessages":[]}

I wonder if there's something weird going on with virus scanning.

@ktanizar What happens if you read in the data as text instead of as CSV? Here's a modified version of the app that treats the input as plain text:

library(shiny)
options(shiny.maxRequestSize = 100*1024^2)

server <- function(input, output, session) {
  output$txt <- renderText({
    req(input$inFile)
    message("Reading data...")
    rdata <- readLines(input$inFile$datapath)
    message("Done")

    paste("Number of lines: ", length(rdata))
  })
}

ui <- fluidPage(title = "Hello",
  navbarPage("Input", id = "allResults",
    tabPanel(value = 'inputData', title = 'Data Import',
      br(),

      h4("Import data"),

      fileInput(inputId = "inFile", "Choose a text file",
        accept = c(
          "text/plain",
          ".txt"
        )
      ),
      verbatimTextOutput("txt")

    )
  )
)

shinyApp(ui, server)

On my side, there is always a delay when I upload a CSV or TXT file...

Listening on http://127.0.0.1:4345
SEND {"config":{"workerId":"","sessionId":"fb92280d20c54aca3ac1fd5a5772e953","user":null}}
RECV {"method":"init","data":{"inFile:shiny.file":null,"allResults":"inputData",".clientdata_output_txt_hidden":false,".clientdata_pixelratio":2.25,".clientdata_url_protocol":"http:",".clientdata_url_hostname":"127.0.0.1",".clientdata_url_port":"4345",".clientdata_url_pathname":"/",".clientdata_url_search":"",".clientdata_url_hash_initial":"",".clientdata_url_hash":"",".clientdata_singletons":"",".clientdata_allowDataUriScheme":true}}
SEND {"busy":"busy"}
SEND {"recalculating":{"name":"txt","status":"recalculating"}}
SEND {"recalculating":{"name":"txt","status":"recalculated"}}
SEND {"busy":"idle"}
SEND {"errors":{"txt":{"message":"","call":"NULL","type":["shiny.silent.error","validation"]}},"values":[],"inputMessages":[]}
RECV {"method":"uploadInit","args":[[{"name":"LA_TRANSITION_ECOLOGIQUE.txt","size":121157487,"type":"text/plain"}]],"tag":0}
SEND {"response":{"tag":0,"value":{"jobId":"3c7b65320a118b1c3814650e","uploadUrl":"session/fb92280d20c54aca3ac1fd5a5772e953/upload/3c7b65320a118b1c3814650e?w="}}}

## relatively long delay....................................

RECV {"method":"uploadEnd","args":["3c7b65320a118b1c3814650e","inFile"],"tag":1}
SEND {"progress":{"type":"binding","message":{"id":"txt"}}}
SEND {"busy":"busy"}
SEND {"response":{"tag":1,"value":null}}
SEND {"recalculating":{"name":"txt","status":"recalculating"}}
Reading data...
Warning in readLines(input$inFile$datapath) :
  incomplete final line found on 'C:\Users\admin\AppData\Local\Temp\RtmpszBynX/3c7b65320a118b1c3814650e/0.txt'
Done
SEND {"recalculating":{"name":"txt","status":"recalculated"}}
SEND {"busy":"idle"}
SEND {"errors":[],"values":{"txt":"Number of lines:  268344"},"inputMessages":[]}

NB: my file is 120 MB, so I've changed options(shiny.maxRequestSize = 100*1024^2) to options(shiny.maxRequestSize = 200*1024^2).
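The need for the larger limit can be checked with a little arithmetic, using the file size reported in the uploadInit message of the trace (121157487 bytes). This is only an illustrative sketch; the variable names are made up.

```r
# File size in bytes, as reported in the uploadInit trace message.
file_bytes <- 121157487

# The two values of shiny.maxRequestSize mentioned in the thread.
limit_100 <- 100 * 1024^2
limit_200 <- 200 * 1024^2

exceeds_100 <- file_bytes > limit_100   # the file is too large for the 100 MB limit
fits_200    <- file_bytes <= limit_200  # but fits under the 200 MB limit

c(exceeds_100 = exceeds_100, fits_200 = fits_200)
```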

@dominiqueemmanuel How many seconds is the "relatively long delay"?

I just realized that the readLines might be adding a lot of time, so here's a version that just prints out the uploaded filename. How long is the delay for this one?

library(shiny)
options(shiny.maxRequestSize = 200*1024^2)

server <- function(input, output, session) {
  output$txt <- renderText({
    req(input$inFile)
    paste("Local file: ", input$inFile$datapath)
  })
}

ui <- fluidPage(title = "Hello",
  navbarPage("Input", id = "allResults",
    tabPanel(value = 'inputData', title = 'Data Import',
      br(),

      h4("Import data"),

      fileInput(inputId = "inFile", "Choose a text file",
        accept = c(
          "text/plain",
          "*"
        )
      ),
      verbatimTextOutput("txt")
    )
  )
)

shinyApp(ui, server)
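To judge separately how much time readLines() itself might add, it can be timed outside of Shiny. The sketch below uses a generated scratch file as a stand-in; the helper name time_read_lines() is hypothetical, and in practice you would point it at the file you are actually uploading.

```r
# Hypothetical helper: time readLines() on a file, independent of any upload.
time_read_lines <- function(path) {
  elapsed <- system.time(
    lines <- readLines(path, warn = FALSE)
  )[["elapsed"]]
  list(n_lines = length(lines), seconds = elapsed)
}

# Demo on a scratch file; replace `scratch` with the path to your own file.
scratch <- tempfile(fileext = ".txt")
writeLines(rep("a,b,c,1,2,3", 100000), scratch)

res <- time_read_lines(scratch)
unlink(scratch)

res$n_lines
res$seconds
```

If this time is small compared with the delay seen in the app, the read step is not the bottleneck.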

I have run two tests (on two different PCs): it takes between 16 s and 25 s...

Test 1 (PC 1):

# Listening on http://127.0.0.1:3551
# SEND {"config":{"workerId":"","sessionId":"7ee57638bbff88c7e6531945ca9ef84b","user":null}}
# RECV {"method":"init","data":{"inFile:shiny.file":null,"allResults":"inputData",".clientdata_output_txt_hidden":false,".clientdata_pixelratio":1.25,".clientdata_url_protocol":"http:",".clientdata_url_hostname":"127.0.0.1",".clientdata_url_port":"3551",".clientdata_url_pathname":"/",".clientdata_url_search":"",".clientdata_url_hash_initial":"",".clientdata_url_hash":"",".clientdata_singletons":"",".clientdata_allowDataUriScheme":true}}
# SEND {"busy":"busy"}
# SEND {"recalculating":{"name":"txt","status":"recalculating"}}
# SEND {"recalculating":{"name":"txt","status":"recalculated"}}
# SEND {"busy":"idle"}
# SEND {"errors":{"txt":{"message":"","call":"NULL","type":["shiny.silent.error","validation"]}},"values":[],"inputMessages":[]}
# RECV {"method":"uploadInit","args":[[{"name":"LA_TRANSITION_ECOLOGIQUE.txt","size":121157487,"type":"text/plain"}]],"tag":0}
# SEND {"response":{"tag":0,"value":{"jobId":"d0e663312772d026149b823e","uploadUrl":"session/7ee57638bbff88c7e6531945ca9ef84b/upload/d0e663312772d026149b823e?w="}}}
####################################  25 s later ###########################################
# RECV {"method":"uploadEnd","args":["d0e663312772d026149b823e","inFile"],"tag":1}
# SEND {"progress":{"type":"binding","message":{"id":"txt"}}}
# SEND {"busy":"busy"}
# SEND {"response":{"tag":1,"value":null}}
# SEND {"recalculating":{"name":"txt","status":"recalculating"}}
# SEND {"recalculating":{"name":"txt","status":"recalculated"}}
# SEND {"busy":"idle"}
# SEND {"errors":[],"values":{"txt":"Local file:  C:\\Users\\DOMINI~1\\AppData\\Local\\Temp\\RtmpwZtC2C/d0e663312772d026149b823e/0.txt"},"inputMessages":[]}


sessionInfo()
# R version 3.5.1 (2018-07-02)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 7 x64 (build 7601) Service Pack 1
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
# [4] LC_NUMERIC=C                   LC_TIME=French_France.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] shiny_1.3.2
# 
# loaded via a namespace (and not attached):
#   [1] compiler_3.5.1  magrittr_1.5    R6_2.4.0        promises_1.0.1  later_0.8.0     htmltools_0.3.6
# [7] tools_3.5.1     Rcpp_1.0.2      jsonlite_1.6    digest_0.6.20   xtable_1.8-4    httpuv_1.5.1   
# [13] mime_0.7        packrat_0.5.0   rlang_0.4.0 

Test 2 (PC 2):

# Listening on http://127.0.0.1:4345
# SEND {"config":{"workerId":"","sessionId":"d9a6fe8eab65b4cb5d33b2c65a648b78","user":null}}
# RECV {"method":"init","data":{"inFile:shiny.file":null,"allResults":"inputData",".clientdata_output_txt_hidden":false,".clientdata_pixelratio":2.25,".clientdata_url_protocol":"http:",".clientdata_url_hostname":"127.0.0.1",".clientdata_url_port":"4345",".clientdata_url_pathname":"/",".clientdata_url_search":"",".clientdata_url_hash_initial":"",".clientdata_url_hash":"",".clientdata_singletons":"",".clientdata_allowDataUriScheme":true}}
# SEND {"busy":"busy"}
# SEND {"recalculating":{"name":"txt","status":"recalculating"}}
# SEND {"recalculating":{"name":"txt","status":"recalculated"}}
# SEND {"busy":"idle"}
# SEND {"errors":{"txt":{"message":"","call":"NULL","type":["shiny.silent.error","validation"]}},"values":[],"inputMessages":[]}
# RECV {"method":"uploadInit","args":[[{"name":"LA_TRANSITION_ECOLOGIQUE.txt","size":121157487,"type":"text/plain"}]],"tag":0}
# SEND {"response":{"tag":0,"value":{"jobId":"f36877ed37d995618cfbd4cf","uploadUrl":"session/d9a6fe8eab65b4cb5d33b2c65a648b78/upload/f36877ed37d995618cfbd4cf?w="}}}
####################################  16 s later ###########################################
# RECV {"method":"uploadEnd","args":["f36877ed37d995618cfbd4cf","inFile"],"tag":1}
# SEND {"progress":{"type":"binding","message":{"id":"txt"}}}
# SEND {"busy":"busy"}
# SEND {"response":{"tag":1,"value":null}}
# SEND {"recalculating":{"name":"txt","status":"recalculating"}}
# SEND {"recalculating":{"name":"txt","status":"recalculated"}}
# SEND {"busy":"idle"}
# SEND {"errors":[],"values":{"txt":"Local file:  C:\\Users\\admin\\AppData\\Local\\Temp\\RtmpszBynX/f36877ed37d995618cfbd4cf/0.txt"},"inputMessages":[]}


sessionInfo()
# R version 3.5.2 (2018-12-20)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 17763)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
# [4] LC_NUMERIC=C                   LC_TIME=French_France.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] shiny_1.3.2
# 
# loaded via a namespace (and not attached):
#   [1] compiler_3.5.2  magrittr_1.5    R6_2.4.0        promises_1.0.1  later_0.8.0    
# [6] htmltools_0.3.6 tools_3.5.2     Rcpp_1.0.2      jsonlite_1.6    digest_0.6.20  
# [11] xtable_1.8-4    httpuv_1.5.1    mime_0.7        rlang_0.4.0    

Note that this delay (16 s or 25 s) happens after the file has been uploaded (it is not a delay due to the upload itself)...

@dominiqueemmanuel What event or information are you using to denote when the file has finished uploading? That is, how do you know it's not a delay due to the upload?

When I run the same app and upload a 60MB file, the upload itself takes about 8 seconds. In the Network panel of Chrome, during that time, the status is (pending); after it finishes, it changes to 200.

image

Do you have a virus scanner?

Update: One of our colleagues suggested using procmon to see if any indexing or virus scanning is happening during or after the file upload.

I measure the time between t1 (the blue bar is 100% complete but still animated, with dark blue stripes moving across it) and t2 (the stripe animation has stopped and "Upload complete" is printed on the bar). The time between t1 and t2 is about 23 s...

image

Screenshot at time t1 :
image

Screenshot at time t2 :
image

And here is a complete profile:
screenshot :
image

Json (I've changed the extension to .txt so that I can upload it):
Profile-20190822T104249.txt

I think I've tracked down a potential cause of the problem.

The serviceApp() function calls httpuv::service() here: https://github.com/rstudio/shiny/blob/178872d/R/server.R#L509

service() in turn calls run_now(all=FALSE):
https://github.com/rstudio/httpuv/blob/f109c480/R/httpuv.R#L599

With all=FALSE, each time service() is run (and each time its caller, serviceApp(), is run), it runs only one callback in the event loop queue. Each time a chunk of data comes in for a file upload, it schedules one onBodyData callback. On my Windows machine, each chunk of data is 16kB. That means that the onBodyData callback is scheduled and executed thousands of times for the 60MB test file I've been using.

Because there's a fair amount of stuff going on in each call of serviceApp(), the whole process takes a large amount of time.
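
To make the arithmetic concrete, here is a toy model in base R. It is only a sketch of the idea, not httpuv's or later's actual code: make_queue, service_once, and count_iterations are made-up names, the overhead argument is a stand-in for whatever fixed work each serviceApp() call does, and the 60MB and 16kB figures are treated as binary units.

```r
# Toy model of an event loop; NOT httpuv's or later's real implementation.

# Number of 16 kB chunks in a 60 MB upload (assuming binary units).
n_chunks <- (60 * 1024^2) / (16 * 1024)

# Build a queue holding one pending no-op callback per chunk.
make_queue <- function(n) {
  rep(list(function() invisible(NULL)), n)
}

# One service iteration: pay the fixed overhead once, then run either a
# single callback (all = FALSE) or every pending callback (all = TRUE).
# Returns the callbacks that are still pending afterwards.
service_once <- function(queue, all, overhead) {
  overhead()
  n_run <- if (all) length(queue) else min(1L, length(queue))
  for (callback in queue[seq_len(n_run)]) callback()
  utils::tail(queue, length(queue) - n_run)
}

# Drain the queue and count how many service iterations it took.
count_iterations <- function(n, all) {
  queue <- make_queue(n)
  iterations <- 0L
  overhead <- function() iterations <<- iterations + 1L
  while (length(queue) > 0) {
    queue <- service_once(queue, all = all, overhead = overhead)
  }
  iterations
}

iterations_one <- count_iterations(n_chunks, all = FALSE)
iterations_all <- count_iterations(n_chunks, all = TRUE)

cat("chunks:", n_chunks, "\n")
cat("iterations with all = FALSE:", iterations_one, "\n")
cat("iterations with all = TRUE: ", iterations_all, "\n")
```

In this model the per-iteration overhead is paid once per chunk when only one callback runs per iteration, and once in total when the whole queue is drained in a single iteration. In a real server new chunks keep arriving while the loop runs, so the real gap would be smaller than the toy suggests.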


I've created a branch of httpuv that tests this theory. I don't think that it's the right long-term solution, but if it helps your case, then that gives us confidence that this theory is correct.

To install it, restart R, then run:

devtools::install_github('rstudio/httpuv@wch-run-all')

Then please try the test app again and let us know how it performs.

Update: Sorry, I forgot to push my changes to the wch-run-all branch of httpuv. If you tried installing it and it didn't help, please re-install and try again.

I think you found the solution! Thank you!!

Now the time between t1 and t2 is negligible (<< 1 s): it works perfectly.

@wch It works now. The time to see "Upload complete" is < 1 s. Thank you!
Will this solution be implemented in next version of Shiny? How come some of my colleagues do not experience this problem? Thanks again!

We're probably going to go with a more conservative fix for the upcoming Shiny release, though it is probably not as fast. Can you please test it out and let us know how long the file upload takes?

devtools::install_github("rstudio/shiny@wch-fix-sleep")

In the future we'll have a higher-performance but more involved fix.

@wch With the new fix, it also takes < 1s to see "Upload complete."

@ktanizar, sorry, I forgot to mention that you should also install the CRAN version of httpuv when testing. Also, we just merged the wch-fix-sleep branch, so you can do the following:

# First, restart R
install.packages("httpuv")
devtools::install_github("rstudio/shiny")

After that, you can test the speed. Thanks!
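
Before timing anything, it may be worth confirming which versions actually ended up installed, since an old build can linger if R was not restarted. A small sketch using utils::packageVersion(); installed_version is a hypothetical helper name, and checking later as well is just an assumption based on it appearing in the sessionInfo() output earlier in the thread.

```r
# Return the installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
}

# Packages involved in the upload path discussed in this thread.
versions <- vapply(c("shiny", "httpuv", "later"), installed_version, character(1))
print(versions)
```

An NA entry means the package could not be found in the current library paths.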

@wch Just tested again after installing the CRAN version of httpuv.

It still only takes 1-2 seconds. Thanks!

@ktanizar That's good to hear! Thanks for testing.

Is this issue resolved in Shiny 1.4.0 and httpuv 1.5.2? When I upload a file from the UI, fileInput still takes a long time before the animation stops and "Upload complete" is shown...

I had this issue about 6 months ago and it's not resolved.

I just tried to upload a CSV using fileInput to our workplace shinyapps.io account and it took 5 minutes... When run locally it takes about 10 seconds.

Is there a definitive fix for this on the horizon for the packages in question? In the meantime is there a workaround?

PS: I'm using httpuv 1.5.4 and shiny 1.5.0.

Cheers!

@cjwoodfield If you have a paid account, can you open an issue at https://support.rstudio.com/hc/en-us/requests/new ?
