One of the steps in my project requires the creation of a spatial data file in the geopackage file format (.gpkg).
Every time I make() the project plan this file gets re-written, which slows down the project considerably because it's a huge file.
This reprex shows the behavior. Essentially, the nc.gpkg target is never up-to-date:
library(drake)
library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.0, proj.4 4.9.3
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))
plan <- drake_plan(
  nc.gpkg = overwrite_gpkg(nc, "nc.gpkg")
)
overwrite_gpkg <- function(obj, dsn) {
  st_write(obj, dsn, layer_options = "OVERWRITE=true")
}
make(plan)
## cache C:\Users\lexi\AppData\Local\Temp\RtmpeUCpo1\.drake
## connect 3 imports: nc, overwrite_gpkg, plan
## connect 1 target: nc.gpkg
## Warning: missing input files:
## nc.gpkg
## check 3 items: 'nc.gpkg', nc, st_write
## Warning: File 'nc.gpkg' was built or processed,
## but the file itself does not exist.
## check 1 item: overwrite_gpkg
## check 1 item: nc.gpkg
## target nc.gpkg
## Writing layer `nc' to data source `nc.gpkg' using driver `GPKG'
## options: OVERWRITE=true
## features: 100
## fields: 14
## geometry type: Multi Polygon
# make the plan again - `nc.gpkg` _should_ be up-to-date
make(plan)
## cache C:/Users/lexi/AppData/Local/Temp/RtmpeUCpo1/.drake
## Unloading targets from environment:
## nc.gpkg
## connect 3 imports: nc, overwrite_gpkg, plan
## connect 1 target: nc.gpkg
## check 3 items: 'nc.gpkg', nc, st_write
## check 1 item: overwrite_gpkg
## check 1 item: nc.gpkg
## target nc.gpkg
## Updating layer `nc' to data source `C:\Users\lexi\AppData\Local\Temp\RtmpeUCpo1\nc.gpkg' using driver `GPKG'
## options: OVERWRITE=true
## features: 100
## fields: 14
## geometry type: Multi Polygon
# make the plan again ...
make(plan)
## cache C:/Users/lexi/AppData/Local/Temp/RtmpeUCpo1/.drake
## Unloading targets from environment:
## nc.gpkg
## connect 3 imports: nc, overwrite_gpkg, plan
## connect 1 target: nc.gpkg
## check 3 items: 'nc.gpkg', nc, st_write
## check 1 item: overwrite_gpkg
## check 1 item: nc.gpkg
## target nc.gpkg
## Updating layer `nc' to data source `C:\Users\lexi\AppData\Local\Temp\RtmpeUCpo1\nc.gpkg' using driver `GPKG'
## options: OVERWRITE=true
## features: 100
## fields: 14
## geometry type: Multi Polygon
Session info
devtools::session_info()
## Session info -------------------------------------------------------------
##  setting  value
##  version  R version 3.4.2 (2017-09-28)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  English_United States.1252
##  tz       America/Los_Angeles
##  date     2018-02-07
## Packages -----------------------------------------------------------------
##  package      * version    date       source
##  backports      1.1.1      2017-09-25 CRAN (R 3.4.1)
##  base         * 3.4.2      2017-09-28 local
##  class          7.3-14     2015-08-30 CRAN (R 3.4.2)
##  classInt       0.1-24     2017-04-16 CRAN (R 3.4.2)
##  codetools      0.2-15     2016-10-05 CRAN (R 3.4.2)
##  compiler       3.4.2      2017-09-28 local
##  crayon         1.3.4      2017-11-16 Github (r-lib/crayon@b5221ab)
##  datasets     * 3.4.2      2017-09-28 local
##  DBI            0.7        2017-06-18 CRAN (R 3.4.1)
##  devtools       1.13.4     2017-11-09 CRAN (R 3.4.2)
##  digest         0.6.15     2018-01-28 CRAN (R 3.4.3)
##  drake        * 5.0.1.9001 2018-02-07 Github (ropensci/drake@bcca469)
##  e1071          1.6-8      2017-02-02 CRAN (R 3.4.2)
##  evaluate       0.10.1     2017-06-24 CRAN (R 3.4.2)
##  formatR        1.5        2017-04-25 CRAN (R 3.4.3)
##  future         1.6.2      2017-10-16 CRAN (R 3.4.3)
##  future.apply   0.1.0      2018-01-15 CRAN (R 3.4.3)
##  globals        0.11.0     2018-01-10 CRAN (R 3.4.3)
##  graphics     * 3.4.2      2017-09-28 local
##  grDevices    * 3.4.2      2017-09-28 local
##  grid           3.4.2      2017-09-28 local
##  htmltools      0.3.6      2017-04-28 CRAN (R 3.4.1)
##  htmlwidgets    1.0        2018-01-20 CRAN (R 3.4.3)
##  igraph         1.1.2      2017-07-21 CRAN (R 3.4.3)
##  jsonlite       1.5        2017-06-01 CRAN (R 3.4.1)
##  knitr          1.19       2018-01-29 CRAN (R 3.4.3)
##  listenv        0.6.0      2015-12-28 CRAN (R 3.4.3)
##  lubridate      1.7.1      2017-11-03 CRAN (R 3.4.2)
##  magrittr       1.5        2014-11-22 CRAN (R 3.4.1)
##  memoise        1.1.0      2017-04-21 CRAN (R 3.4.2)
##  methods      * 3.4.2      2017-09-28 local
##  parallel       3.4.2      2017-09-28 local
##  pillar         1.1.0      2018-01-14 CRAN (R 3.4.3)
##  pkgconfig      2.0.1      2017-03-21 CRAN (R 3.4.1)
##  plyr           1.8.4      2016-06-08 CRAN (R 3.4.1)
##  R.methodsS3    1.7.1      2016-02-16 CRAN (R 3.4.1)
##  R.oo           1.21.0     2016-11-01 CRAN (R 3.4.1)
##  R.utils        2.6.0      2017-11-05 CRAN (R 3.4.2)
##  R6             2.2.2      2017-06-17 CRAN (R 3.4.1)
##  Rcpp           0.12.15    2018-01-20 CRAN (R 3.4.3)
##  rlang          0.1.6      2017-12-21 CRAN (R 3.4.3)
##  rmarkdown      1.8        2017-11-17 CRAN (R 3.4.2)
##  rprojroot      1.3-2      2018-01-03 CRAN (R 3.4.3)
##  sf           * 0.6-1      2018-01-23 Github (r-spatial/sf@349afa8)
##  stats        * 3.4.2      2017-09-28 local
##  storr          1.1.3      2017-12-15 CRAN (R 3.4.3)
##  stringi        1.1.6      2017-11-17 CRAN (R 3.4.2)
##  stringr        1.2.0      2017-02-18 CRAN (R 3.4.1)
##  testthat       2.0.0      2017-12-13 CRAN (R 3.4.3)
##  tibble         1.4.1.9000 2018-01-17 Github (tidyverse/tibble@64fedbd)
##  tools          3.4.2      2017-09-28 local
##  udunits2       0.13       2016-11-17 CRAN (R 3.4.1)
##  units          0.5-1      2018-01-08 CRAN (R 3.4.3)
##  utils        * 3.4.2      2017-09-28 local
##  visNetwork     2.0.3      2018-01-09 CRAN (R 3.4.3)
##  withr          2.1.1.9000 2018-01-17 Github (jimhester/withr@df18523)
##  yaml           2.1.14     2016-11-12 CRAN (R 3.4.1)
I expected this to arise sooner or later. Single quotes denote reproducibly-tracked files, and double quotes are literal strings. drake_plan() does not have total control over parsing, so it errs on the side of turning quotes into single quotes. In the plan you have, drake thinks nc.gpkg is an imported file, not an output file target.
plan <- drake_plan(
  nc.gpkg = overwrite_gpkg(nc, "nc.gpkg")
)
plan
##    target                       command
## 1 nc.gpkg overwrite_gpkg(nc, 'nc.gpkg')
vis_drake_graph(drake_config(plan)) # Squares are file targets/imports
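
A note on why the quote style is so easy to lose, offered as an illustration and not as drake's actual implementation: once R has parsed a command, the choice between single and double quotes is gone, so anything that receives the command as an expression sees only a string literal. The sketch below uses base R only. The `gsub()` step is an assumed stand-in for the quote conversion described above, not code taken from drake.

```r
# Two spellings of the same command, differing only in quote style.
double_quoted <- quote(overwrite_gpkg(nc, "nc.gpkg"))
single_quoted <- quote(overwrite_gpkg(nc, 'nc.gpkg'))

# The parsed calls are indistinguishable: the quote style is not stored.
same_call <- identical(double_quoted, single_quoted)

# Deparsing produces double quotes regardless of what was typed.
deparsed <- deparse(double_quoted)

# Hypothetical stand-in for the conversion step: swap every double quote
# for a single quote, so the string reads as a tracked file.
as_file_command <- gsub("\"", "'", deparsed, fixed = TRUE)

print(same_call)
print(deparsed)
print(as_file_command)
```

The last string has the same form as the command column printed for the plan above, where `'nc.gpkg'` is read as a tracked file and not as a literal string.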

I think what you want is this:
plan <- drake_plan(
  nc.gpkg = overwrite_gpkg(nc, "nc.gpkg"),
  file_targets = TRUE,
  strings_in_dots = "literals"
)
plan
##      target                       command
## 1 'nc.gpkg' overwrite_gpkg(nc, "nc.gpkg")
vis_drake_graph(drake_config(plan))

Please let me know if that works for you.
I know it's a weird interface. I did my best to document it, but there is still confusion. Things will improve in #233 and especially #232.
Your suggested fix works!
Now that I know what's going on it will be easy to avoid making the same mistake, at least until those improvements come online and it is no longer a concern.
A brief word of encouragement: I think you're providing a solution to a _major_ problem in many users' workflows. It has been a treat to watch the package go through a rapid evolution over the past few days. Keep up the great work and I'm sure you'll have many more grateful users in the coming months 👍
I am glad to hear the solution worked.
Your support means a lot to me. drake has been my favorite project ever since its inception, and it is so wonderful to see the uptake.