drake's code of conduct.drake and most likely not a user error. (If you run into an error and do not know the cause, please submit a "Trouble" issue instead.)After running drake::expose_imports(testDrake), functions imported by testDrake from other packages seem to be lost.
Say, I wrote a package with a custom plot function, which imports some functions from a CRAN R package, e.g. ggplot2.
#' Custom plot
#' @param plotData data.frame
#' @param x character
#' @param y character
#' @return ggplot object
#' @importFrom ggplot2 ggplot aes_string geom_point theme_light
#' @export
myPlot <- function(plotData, x, y) {
ggplot(plotData, aes_string(x = x, y = y)) +
geom_point() +
theme_light(base_size = 16)
}
Now, this custom plot function runs perfectly.
library(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
Next, I want to keep track of changes within the custom plot function which is under development. I do so via the drake::expose_imports(). However, after running this command, I get an error message that the ggplot functions could not be found anymore:
Error in ggplot(plotData, aes_string(x = x, y = y)) :
could not find function "ggplot"
library(drake)
expose_imports(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
I also tried loading testDrake again, which results in the same error
library(drake)
expose_imports(testDrake)
library(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
Of course, I can fix by loading ggplot2, but in practice, the package testDrake imports from more than 1 package and the number of imported packages can also change during development. So I was hoping for a neater solution.
I would expect the imported ggplot functions would not be lost.
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=en_GB.UTF-8
[9] LC_ADDRESS=en_GB.UTF-8 LC_TELEPHONE=en_GB.UTF-8
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] drake_7.12.2 testDrake_0.0.1 rj_4.0.1-2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 pillar_1.4.2 compiler_3.6.3 later_1.0.0
[5] prettyunits_1.0.2 progress_1.2.0 tools_3.6.3 digest_0.6.25
[9] rj.gd_4.0.1-0 tibble_2.1.3 gtable_0.3.0 pkgconfig_2.0.3
[13] rlang_0.4.5 igraph_1.2.4 cli_1.1.0 shiny_1.4.0
[17] filelock_1.0.2 parallel_3.6.3 fastmap_1.0.1 dplyr_0.8.3
[21] storr_1.2.1 hms_0.4.2 vctrs_0.3.1 grid_3.6.3
[25] tidyselect_0.2.5 glue_1.4.0 R6_2.4.1 base64url_1.4
[29] txtq_0.2.0 ggplot2_3.2.1 purrr_0.3.3 magrittr_1.5
[33] scales_1.0.0 backports_1.1.4 promises_1.1.0 htmltools_0.4.0
[37] assertthat_0.2.1 mime_0.9 colorspace_1.4-1 xtable_1.8-3
[41] httpuv_1.5.2 labeling_0.3 lazyeval_0.2.2 munsell_0.5.0
[45] crayon_1.3.4
An "import" in drake is not the same thing as an import in the context of package development. Here, imports are functions and other global objects drake sees in your environment and reproducibly tracks in order to decide whether targets are up to date. expose_imports() exists because functions and global objects from a package live in a different environment, so unless you use a namespaced call (e.g. testDrake::myPlot()) then drake will not see them and not react when the function changes. But if you call expose_imports(testDrake) before running the pipeline, drake will copy the functions from the package into the current environment where it can track them.
Thanks for the prompt reply. Still, I'm wondering why expose_imports() breaks my code.
This works:
library(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
This doesn't work:
library(drake)
expose_imports(testDrake)
library(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
Error in ggplot(plotData, aes_string(x = x, y = y)) :
could not find function "ggplot"
Which approach would you suggest if I want keep track of the functions in testDrake?
It looks like when you use expose_imports(), the functions that get sent to the global environment lose the effect of @importFrom because they are no longer in the package environment. That means you will need to use namespaced calls to external package functions. Sorry about that, this is something I did not anticipate.
Thanks for the feedback! I can confirm this approach works:
Explicitly state the package for functions from @importFrom calls.
#' Custom plot
#' @param plotData data.frame
#' @param x character
#' @param y character
#' @return ggplot object
#' @importFrom ggplot2 ggplot aes_string geom_point theme_light
#' @export
myPlot <- function(plotData, x, y) {
ggplot2::ggplot(plotData, ggplot2::aes_string(x = x, y = y)) +
ggplot2::geom_point() +
ggplot2::theme_light(base_size = 16)
}
Then, after expose_imports(), your custom functions work as before.
library(drake)
expose_imports(testDrake)
library(testDrake)
myPlot(plotData = cars, x = "speed", y = "dist")
Thanks for confirming. I just added a note in the help file of expose_imports() to point this out.
You know what? Instead of expose_imports(), maybe we should just recommend that users supply the package environment directly to the envir argument of make() and drake_config() and friends. If you put all the functions and globals in your analysis package, this should work much better. Even after all these years, it did not occur to me until just now.
Also cc @tiernanmartin.
Below, the envir = getNamespace() trick allows drake to find myPlot().
library(drake)
plan <- drake_plan(x = myPlot(plotData = cars, x = "speed", y = "dist"))
vis_drake_graph(plan, envir = getNamespace("myPackage"))

And the package code can still find ggplot2 functions.
make(plan, envir = getNamespace("myPackage"))
#> ▶ target x
readd(x)

Also cc @richardbayes
Just deprecated expose_imports() for this reason.
Thanks. If you have multiple analysis packages to track, however, I don't see how to implement it with the envir argument.
True, but I do not see this as very common. A single research project will typically have a single research compendium, and other packages should usually be locked down with renv rather than tracked by drake. Speaking broadly, packages are on the very edge of the scope of drake anyway.
This will work for my use case 👍
Thanks, Will!
This is great a great solution, @wlandau! I had problems with invalidating targets when executing my packaged drake workflow on other computers. These seem to be solved when providing the package's namespace as environment to make(). Plus the global environment isn't cluttered by expose_imports() :)
The limitation I see with envir = getNamespace("yourPackage") is it doesn't let you pull functions from a variety of sources. But I learned telegott in this post that the import package lets you do this with import::from(.into = ""). So let's spread the word. Example:
ls()
#> character(0)
import::from(dplyr, mutate, .into = "")
ls()
#> [1] "mutate"
library(drake)
plan <- drake_plan(x = mutate(mtcars, x = 1))
vis_drake_graph(plan)

Created on 2020-09-05 by the reprex package (v0.3.0)
@wlandau thanks for your answer, I provided another example which deals with self defined functions in the comment. As an alternative to import::from(.into = ""), it's possible to use import::here()
From the docs:
"The function import::here is short-hand for import::from with .into = "" which imports into the current environment."
Just added new package-tracking functionality in targets tar_option_set(imports = your_packages) (https://github.com/wlandau/targets/issues/239, https://github.com/wlandau/targets/issues/241). See https://wlandau.github.io/targets-manual/practice.html#packages for details.
Most helpful comment
You know what? Instead of
expose_imports(), maybe we should just recommend that users supply the package environment directly to theenvirargument ofmake()anddrake_config()and friends. If you put all the functions and globals in your analysis package, this should work much better. Even after all these years, it did not occur to me until just now.Also cc @tiernanmartin.
Below, the
envir = getNamespace()trick allowsdraketo findmyPlot().And the package code can still find
ggplot2functions.