I would like to use a cross() transform, but I don't want the complete cross product. I need a jagged version instead, e.g.:
plan <- drake_plan(
  s_load = target(
    load_csv(group, rep),
    transform = cross(
      group = c("G1", "G2"),
      rep = c("R1", "R2", "R3", "R4", "R5", "R6")
    )
  )
)
For example, my group G1 has reps R1-R6, but G2 only has R1-R4 (R5 and R6 are missing).
My function load_csv searches for input files to read, named like Gx_Ry.csv. Since I don't have G2_R5.csv or G2_R6.csv, it fails with "file not found" errors for those two targets.
Any recommendations would be appreciated, thanks!
Another nice one for the FAQ. Fortunately, this is straightforward if you create your own grid in advance and then use map().
library(drake)
library(tidyverse)
# Build the full cross product, then drop the combinations that have no file.
grid <- crossing(
  group = c("G1", "G2"),
  rep = c("R1", "R2", "R3", "R4", "R5", "R6")
) %>%
  filter(!(group == "G2" & rep %in% c("R5", "R6")))
# !! evaluates grid$group and grid$rep when the plan is created,
# so map() pairs them up row by row.
drake_plan(
  s_load = target(
    load_csv(group, rep),
    transform = map(
      group = !!grid$group,
      rep = !!grid$rep
    )
  )
)
#> # A tibble: 10 x 2
#>    target           command
#>    <chr>            <chr>
#>  1 s_load_.G1._.R1. "load_csv(\"G1\", \"R1\")"
#>  2 s_load_.G1._.R2. "load_csv(\"G1\", \"R2\")"
#>  3 s_load_.G1._.R3. "load_csv(\"G1\", \"R3\")"
#>  4 s_load_.G1._.R4. "load_csv(\"G1\", \"R4\")"
#>  5 s_load_.G1._.R5. "load_csv(\"G1\", \"R5\")"
#>  6 s_load_.G1._.R6. "load_csv(\"G1\", \"R6\")"
#>  7 s_load_.G2._.R1. "load_csv(\"G2\", \"R1\")"
#>  8 s_load_.G2._.R2. "load_csv(\"G2\", \"R2\")"
#>  9 s_load_.G2._.R3. "load_csv(\"G2\", \"R3\")"
#> 10 s_load_.G2._.R4. "load_csv(\"G2\", \"R4\")"
Created on 2019-01-31 by the reprex package (v0.2.1.9000)
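If filtering the full cross product gets awkward with more groups, one alternative is to write down the reps each group actually has and expand that into the grid. This is only a sketch in base R; the `reps` list is a hypothetical name introduced here, not something from drake or from the original question.

```r
# Hypothetical lookup: the reps that exist for each group.
reps <- list(
  G1 = paste0("R", 1:6),
  G2 = paste0("R", 1:4)
)

# One row per (group, rep) pair that really exists.
grid <- data.frame(
  group = rep(names(reps), times = lengths(reps)),
  rep = unlist(reps, use.names = FALSE),
  stringsAsFactors = FALSE
)

nrow(grid)
```

The resulting `grid` has the same two columns as the one built with crossing() and filter(), so it should be usable in map() in the same way.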
Nice! Thanks for the solution.
Another thought: can I make a target that finds all available files and then dynamically generates named targets accordingly (perhaps like yield in Python)?
Sounds like #685, which many people have requested. In drake the plan needs to be fully written out before you call make(), which may limit what I think you are describing.
But if the files you mention are all available before you write the plan, then yes, you can write a plan whose target names are automatically generated.
library(drake)
# The files must already exist when the plan is created.
files <- list.files("dir")
# One s_load target per file.
plan <- drake_plan(
  s_load = target(
    load_csv(file),
    transform = map(file = !!files)
  )
)
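If load_csv() still takes group and rep rather than a file name, the grid could be derived from the file names themselves. The sketch below assumes the Gx_Ry.csv naming from the question and uses a hard-coded `files` vector as a stand-in for list.files("dir"); it uses only base R and the tools package that ships with R.

```r
# Stand-in for list.files("dir"); the file names are illustrative only.
files <- c("G1_R1.csv", "G1_R2.csv", "G2_R1.csv")

# Drop directory and extension, then split "G1_R1" into c("G1", "R1").
stems <- tools::file_path_sans_ext(basename(files))
parts <- strsplit(stems, "_", fixed = TRUE)

# One row per file found, with the same columns as the hand-built grid.
grid <- data.frame(
  group = vapply(parts, `[`, character(1), 1),
  rep = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
```

Because the grid is computed from whatever files are present when the plan is written, missing combinations such as G2_R5 never become targets. The files still have to exist before the plan is created.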
library(drake)
library(tidyverse)
grid <- crossing(
  group = c("G1", "G2"),
  rep = c("R1", "R2", "R3", "R4", "R5", "R6")
) %>%
  filter(!(group == "G2" & rep %in% c("R5", "R6")))
drake_plan(
  s_load = target(
    load_csv(group, rep),
    transform = map(.data = !!grid)
  )
)
#> # A tibble: 10 x 2
#>    target           command
#>    <chr>            <expr>
#>  1 s_load_.G1._.R1. load_csv("G1", "R1")
#>  2 s_load_.G1._.R2. load_csv("G1", "R2")
#>  3 s_load_.G1._.R3. load_csv("G1", "R3")
#>  4 s_load_.G1._.R4. load_csv("G1", "R4")
#>  5 s_load_.G1._.R5. load_csv("G1", "R5")
#>  6 s_load_.G1._.R6. load_csv("G1", "R6")
#>  7 s_load_.G2._.R1. load_csv("G2", "R1")
#>  8 s_load_.G2._.R2. load_csv("G2", "R2")
#>  9 s_load_.G2._.R3. load_csv("G2", "R3")
#> 10 s_load_.G2._.R4. load_csv("G2", "R4")
Created on 2019-02-07 by the reprex package (v0.2.1.9000)