Drake: Errors store large amounts of data

Created on 16 Jun 2020 · 9 Comments · Source: ropensci/drake

Prework

  • [x] Read and abide by drake's code of conduct.
  • [x] Search for duplicates among the existing issues, both open and closed.
  • [ ] Advanced users: verify that the bug still persists in the current development version (i.e. remotes::install_github("ropensci/drake")) and mention the SHA-1 hash of the Git commit you install.

Description

I have an issue that may be identical to #1253. Unfortunately, I can't share the example as it is proprietary (and much like you, COVID-related work takes the time I would usually take to try to create a reprex).

Describe the bug clearly and concisely.

This comes up when I build a plan that is relatively large for me (the .drake directory is ~2.6GB). While using rmarkdown to build a report that loads much of the cache, a bug in the report caused an error.

After that error, I got "Repacking large object".

When I corrected the error, it ran without issue, and there was no "repacking large object".

My guess is that I'm having the same type of issue as in #1253. I'm running drake 7.12.2.

Reproducible example

Unfortunately, I can't readily make a reprex right now. My hope is that the description above and the link to #1253 will help with some troubleshooting.

Expected result

A quick error message.

As a thought from reading #1253, it seems like the stack trace from errors along with their environments are being stored, and that could be my issue. For errors, could the default behavior be not to store the environments associated with the error?

Session info

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pander_0.6.3              knitr_1.28                TopicLongTable_0.0.0.9010 arsenal_3.4.0             gdtools_0.2.2             rmarkdown_2.2            
 [7] cowplot_1.0.0             forcats_0.5.0             stringr_1.4.0             dplyr_1.0.0               purrr_0.3.4               readr_1.3.1              
[13] tidyr_1.1.0               tibble_3.0.1              tidyverse_1.3.0           drake_7.12.2              xpose_0.4.10              mrgsolve_0.10.1          
[19] Hmisc_4.4-0               ggplot2_3.3.1             Formula_1.2-3             survival_3.1-12           lattice_0.20-41           assertr_2.7              
[25] rio_0.5.16                truncnorm_1.0-8           caTools_1.18.0            bsd.report_0.0.0.9067    

loaded via a namespace (and not attached):
 [1] colorspace_1.4-1          ellipsis_0.3.1            htmlTable_1.13.3          RcppArmadillo_0.9.900.1.0 base64enc_0.1-3           fs_1.4.1                 
 [7] rstudioapi_0.11           farver_2.0.3              fansi_0.4.1               lubridate_1.7.9           xml2_1.3.2                splines_4.0.1            
[13] polyclip_1.10-0           jsonlite_1.6.1            broom_0.5.6               cluster_2.1.0             dbplyr_1.4.4              png_0.1-7                
[19] ggforce_0.3.1             compiler_4.0.1            httr_1.4.1                backports_1.1.7           assertthat_0.2.1          Matrix_1.2-18            
[25] cli_2.0.2                 tweenr_1.0.1              acepack_1.4.1             htmltools_0.4.0           prettyunits_1.1.1         tools_4.0.1              
[31] igraph_1.2.5              gtable_0.3.0              glue_1.4.1                Rcpp_1.0.4.6              cellranger_1.1.0          vctrs_0.3.0              
[37] svglite_1.2.3             nlme_3.1-148              xfun_0.14                 openxlsx_4.1.5            rvest_0.3.5               PKNCA_0.9.4              
[43] lifecycle_0.2.0           MASS_7.3-51.6             zoo_1.8-8                 scales_1.1.1              hms_0.5.3                 parallel_4.0.1           
[49] RColorBrewer_1.1-2        yaml_2.2.1                curl_4.3                  gridExtra_2.3             rpart_4.1-15              latticeExtra_0.6-29      
[55] stringi_1.4.6             highr_0.8                 checkmate_2.0.0           filelock_1.0.2            zip_2.0.4                 storr_1.2.1              
[61] rlang_0.4.6               pkgconfig_2.0.3           systemfonts_0.2.3         bitops_1.0-6              evaluate_0.14             labeling_0.3             
[67] htmlwidgets_1.5.1         tidyselect_1.1.0          magrittr_1.5              R6_2.4.1                  generics_0.0.2            base64url_1.4            
[73] txtq_0.2.0                DBI_1.1.0                 mgcv_1.8-31               pillar_1.4.4              haven_2.3.1               foreign_0.8-80           
[79] withr_2.2.0               nnet_7.3-14               modelr_0.1.8              crayon_1.3.4              utf8_1.1.4                jpeg_0.1-8.1             
[85] progress_1.2.2            grid_4.0.1                readxl_1.3.1              data.table_1.12.8         qpdf_1.1                  blob_1.2.1               
[91] reprex_0.3.0              digest_0.6.25             munsell_0.5.0             askpass_1.1   
bug

All 9 comments

Yeah, I sometimes get something like this as well when a large ggplot fails and the whole data.frame/environment is saved in the error report. I have not tested the latest version in which this is solved or mitigated, still running 7.12.0.

@vkehayas, tl;dr: Updating may fix your issue.

In #1253, it was discussed that this may be fixed in 7.12.2, but my instance of this still occurs here. For my case, it may not be a bug directly, but it may be a bottleneck for trying again in these scenarios.

This definitely sounds like #1253, which I thought I fixed. Traceback objects should no longer contain strange tagalong environments. However, there could still be another stowaway in the metadata list. How big is the list you get from diagnose(target_that_failed)? If it's bigger than a couple kilobytes, what is the inner-most large object you can find with pryr::object_size()?
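For anyone repeating this search, here is a minimal sketch of walking a nested list to find where the bulk sits. It is an illustration only: `serialized_size()` and `size_tree()` are hypothetical helpers, not part of drake or pryr, and the `fake` list merely stands in for what diagnose() might return. Serialized length is used as the size measure on the assumption that it approximates what ends up in the cache, and it includes environments attached to formulas and closures (except the global environment, which is serialized as a reference).

```r
# Hypothetical helper: size in bytes of the serialized object.
# Unlike utils::object.size(), this includes attached environments.
serialized_size <- function(x) length(serialize(x, connection = NULL))

# Hypothetical helper: one row per element of a nested list, with its size.
size_tree <- function(x, name = "x", depth = 0, max_depth = 4) {
  out <- data.frame(
    name = name, depth = depth, bytes = serialized_size(x),
    stringsAsFactors = FALSE
  )
  if (is.list(x) && length(x) > 0 && depth < max_depth) {
    nms <- names(x)
    if (is.null(nms)) nms <- rep("", length(x))
    for (i in seq_along(x)) {
      label <- if (nzchar(nms[i])) {
        paste0(name, "$", nms[i])
      } else {
        paste0(name, "[[", i, "]]")
      }
      out <- rbind(out, size_tree(x[[i]], label, depth + 1, max_depth))
    }
  }
  out
}

# Stand-in for a failed target's metadata: a small formula that drags
# along the environment it was created in, which holds a large vector.
make_fake_error <- function() {
  big <- rnorm(1e5)
  list(message = "boom", dots = list(figures = ~ length(big)))
}
fake <- list(target = "report", error = make_fake_error())

sizes <- size_tree(fake, "diagnose(report)")
print(sizes[order(-sizes$bytes), ])
```

With a real cache, the first argument would be the result of diagnose(target_that_failed); the rows that stay large all the way down point at the stowaway.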

I watched the memory use during the process, and here is what I noticed:

  • During normal running, the memory use was hovering around 2GB which makes sense for the data sizes used.
  • After the error occurred, I saw this message:
Quitting from lines 1627-1767 (filename.Rmd) 

x fail report

When the words "x fail report" showed up on the screen, memory usage went from ~2GB to ~5GB over the course of a few seconds.

I then saw the following message:

Error: target report failed.
diagnose(report)$error$message:
  Problem with `mutate()` input `figures`.
x Input `figures` must be a vector, not a `gg_list` object.
i Input `figures` is `as_gg_list(...)`.

(And some other stuff indicating the stack trace as normally shows up.)

When I run pryr::object_size(diagnose(report)), it is 1.28GB. As that is bigger than a couple kb, I investigated further. I'm showing the nesting of the object that I found:

diagnose(report): 1.28 GB
  $error: 1.28 GB
    $dots: 1.28 GB
      $figures: 1.28 GB
      $captions: 1.28 GB (yes, both of these are showing up as 1.28 GB after the parent showed up as the same size)

Oddly, the items within diagnose(report)$error$dots$figures (and $captions) are 56 B and 3.54 kB, not the 1.28 GB that the containing objects were. Both of these items are quosures. I can share what one looks like:

> diagnose(report)$error$dots$figures
<quosure>
expr: ^as_gg_list(pmap(.l = list(data = data, parameter = Parameter, assay = assay, allo_cl = allo_CL, allo_vc = allo_VC), .f = plotter, hline_values = adult_lines))
env:  0000023B83FA2408

It looks like the problem is the environment attached to the quosure:

> pryr::object_size(rlang::get_env(diagnose(report)$error$dots$figures))
1.28 GB
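The effect can be reproduced without drake or rlang. The sketch below uses a base R formula as a stand-in for a quosure (both carry an expression plus a reference to an environment); `capture()` and `serialized_size()` are hypothetical names introduced for illustration. It also suggests why $figures and $captions each report the full size: if both point at the same environment, each is measured with all of it.

```r
# Hypothetical helper: size in bytes of the serialized object,
# including any environment the object refers to.
serialized_size <- function(x) length(serialize(x, connection = NULL))

capture <- function() {
  big <- numeric(1e6)  # about 8 MB living in this function's environment
  ~ sum(big)           # the formula keeps a reference to that environment
}

f <- capture()

# Same expression, but pointing at the empty environment instead.
expr_only <- f
environment(expr_only) <- emptyenv()

serialized_size(f)          # dominated by `big`
serialized_size(expr_only)  # just the expression and its attributes
```

The expression itself is tiny in both cases; only the attached environment differs.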

Try 2295fc97fe35c0ceab8f56d1775c508aceb147d0. I gave up on language objects and decided to just have drake store tracebacks as character vectors. That ought to do it.
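As a rough illustration of the idea (not drake's actual implementation), a traceback can be captured as plain text by deparsing each call on the stack at the moment of the error. `run_with_text_traceback()` is a hypothetical helper written for this sketch.

```r
# Hypothetical helper: evaluate `expr` and, on error, return the message
# plus the call stack as a character vector instead of language objects.
run_with_text_traceback <- function(expr) {
  trace_text <- NULL
  tryCatch(
    withCallingHandlers(
      expr,
      error = function(e) {
        # sys.calls() still sees the failing frames inside a calling handler.
        trace_text <<- vapply(
          sys.calls(),
          function(cl) paste(deparse(cl, nlines = 1L), collapse = " "),
          character(1)
        )
      }
    ),
    error = function(e) {
      list(message = conditionMessage(e), calls = trace_text)
    }
  )
}

f <- function() stop("boom")
res <- run_with_text_traceback(f())
res$message
is.character(res$calls)
```

Character vectors cannot carry environments, so whatever the failing calls referred to stays out of the stored result.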

Wait, I just realized the traceback isn't the problem like it was for #1253. We'll have to cull the error object in other ways too.

Try adcefba554cbde08fccf76ae63879bd753cdf74d. That dots object shouldn't show up in the error object as of https://github.com/ropensci/drake/commit/af2bbbf04003005d8e2da2908a6c0e4be7be34da.
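For readers who want to see what culling an error object could look like, here is a minimal sketch under stated assumptions. It does not reproduce the linked commits: `cull_error()` and `make_heavy_error()` are hypothetical, and the fake condition only imitates an error that carries a `dots` field whose formula drags along a large environment.

```r
# Hypothetical helper: size in bytes of the serialized object.
serialized_size <- function(x) length(serialize(x, connection = NULL))

# Hypothetical helper: keep only plain-text fields of a condition.
cull_error <- function(e) {
  list(
    message = conditionMessage(e),
    call = paste(deparse(conditionCall(e)), collapse = "\n"),
    class = class(e)
  )
}

# Imitation of a heavy error: the `dots` field holds a formula whose
# environment contains a large vector.
make_heavy_error <- function() {
  big <- numeric(1e6)
  structure(
    list(
      message = "Problem with input `figures`.",
      call = quote(mutate(figures = as_gg_list())),
      dots = list(figures = ~ length(big))
    ),
    class = c("heavy_error", "error", "condition")
  )
}

heavy <- make_heavy_error()
light <- cull_error(heavy)

serialized_size(heavy)  # includes the environment behind `dots`
serialized_size(light)  # message, call text, and class only
```

The culled version keeps what is needed to read the error later while dropping every field that could reference an environment.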

adcefba fixes it for me!

After x fail report, there was no memory usage increase like there was last time. pryr::object_size(diagnose(report)) is 25.8 kB.

Great! Closing.
