The following drake plan (in its most simplified form)
p=drake_plan( file_out(paste0("/tmp", "/" , "something" )))
failed first with a cryptic error message for me:
Error: The specified pathname is not a file: /
This by itself is maybe not a big problem, but to fix it I tried to clean the cache:
drake::clean()
which resulted in *wiping my complete home folder*.
So somehow the code executed "rm -rf ~/" or similar !!!
I have not really retried it with the drake plan above, but I could reproduce the same error message...
I think it happens as well with a plan like this:
p = drake_plan( file_out("/"))
make(p)
clean()
Be careful reproducing it, it might wipe all your files !!!
I tried to reproduce it with a different user account, but did not manage.
So the above code is "too simple" and does not reproduce it.
But I remember perfectly that it happened,
as the execution of drake::clean() took a while,
which surprised me a lot.
Switching to another console, I saw that my home folder had been completely emptied.
The error message "Error: The specified pathname is not a file: /" went away after I commented out this block from my plan:
other targets ...
{
  shp = "yearlymeanMin.shp"
  shpDir = paste0("AgrilusPlanipennis/output/", shp)
  dir.create(shpDir)
  st_as_sf(agroMet_mean_yearly_min.shp) %>%
    select(avg_yearly_min) %>%
    sf::write_sf(file_out(paste0(shpDir, "/", shp)))
  zip(file_out(paste0(shpDir, ".zip")),
      list.files(shpDir, full.names = TRUE), extras = "-j")
},
other targets
So maybe the problem was this strange block (an unnamed expression).
I see now that combining paste0(..) with file_out() / file_in(..) is maybe not supported either...
I looked at the code of drake:::clean_single_target
function (target, cache, namespaces, graph)
{
    files <- character(0)
    if (is_file(target)) {
        files <- target
    }
    if (target %in% igraph::V(graph)$name) {
        deps <- vertex_attr(graph = graph, name = "deps", index = target)[[1]]
        files <- sort(unique(deps$file_out))
    }
    unlink(drake_unquote(files), recursive = TRUE, force = TRUE)
    for (namespace in namespaces) {
        for (key in c(target, files)) {
            cache$del(key = key, namespace = namespace)
        }
    }
}
and I see that it removes the dependencies "recursively" and "forced".
So if somehow one of the dependencies becomes "/" .... then there might be a problem.
The documentation of unlink() suggests that it would execute even things like "rm -rf /".
It would not work completely, as a normal user has no permissions on big parts of the file system,
but it would continue without failing and finally remove every file on the disk that the user has permission to delete....
I believe exactly this happened. Somehow "/" was calculated as a file dependency given the code above (or similar).
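To make the failure mode concrete, here is a minimal sketch of a guard around unlink(). It is an illustration in base R only, not drake's code and not the fix that was eventually applied: `safe_unlink` is a hypothetical helper, and treating the filesystem root and the home directory as protected is my own assumption.

```r
# Hypothetical helper (not part of drake): skip anything that resolves to the
# filesystem root, the home directory, or any existing directory.
safe_unlink <- function(paths) {
  paths <- paths[nzchar(paths)]  # drop empty strings such as ""
  resolved <- normalizePath(paths, winslash = "/", mustWork = FALSE)
  # Assumption: these two locations must never be deleted.
  protected <- normalizePath(c("/", "~"), winslash = "/", mustWork = FALSE)
  ok <- !(resolved %in% protected | dir.exists(resolved))
  # recursive = FALSE (the default) never removes directories.
  unlink(resolved[ok], recursive = FALSE, force = FALSE)
  invisible(paths[!ok])  # return the paths that were skipped
}

# Usage: only the plain file is removed, the directories are skipped.
tmp <- tempfile(fileext = ".txt")
writeLines("x", tmp)
skipped <- safe_unlink(c("", "/", tempdir(), tmp))
file.exists(tmp)
dir.exists(tempdir())
```

The point of the sketch is that a call fed with c("", "/", "/tmp") only becomes destructive when recursive = TRUE; with the default, directories are left alone.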
So sorry to hear your files were deleted!!!
In hindsight, there is no reason for recursive or force to be TRUE in that call to unlink().
In file_in(), file_out(), and knitr_in(), file paths must be given literally. (Related: #353, https://stackoverflow.com/questions/50725436/r-drake-file-out-name-with-variable.) These functions do not evaluate the code in the arguments (such as paste0()). Their purpose is only to tell drake's static code analyzer to detect string literals and treat them as files.
library(drake)
deps_code(quote(file_out(paste0("/tmp", "/" , "something" ))))
#> $file_out
#> [1] "\"/tmp\"" "\"/\"" "\"something\""
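As a rough illustration of why the output above contains three separate "files", here is a simplified stand-in for a static code analyzer, written in base R. It is not drake's implementation: `collect_string_literals` is a hypothetical function that walks an unevaluated expression and collects every string literal, without ever calling paste0().

```r
# Hypothetical, simplified stand-in for static analysis (not drake's code):
# collect every string literal that appears in an unevaluated expression.
collect_string_literals <- function(expr) {
  if (is.character(expr)) {
    return(expr)
  }
  if (is.call(expr)) {
    # Recurse into the function name and each argument of the call.
    return(unlist(lapply(as.list(expr), collect_string_literals)))
  }
  character(0)  # symbols, numbers, etc. contribute nothing
}

# paste0() is never run, so its three arguments show up as three literals.
pieces <- collect_string_literals(quote(paste0("/tmp", "/", "something")))
pieces
#> [1] "/tmp"      "/"         "something"

# A literal path is seen as the single file that was intended.
whole <- collect_string_literals(quote(file_out("/tmp/something")))
whole
#> [1] "/tmp/something"
```

Under this reading, "/" ends up as a declared file simply because it is a string literal inside the call.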
See 99239e83bf6c72c44514009727ef14e4205e4d51. This should prevent whole folders from getting wiped out.
I will diagnose "Error: The specified pathname is not a file:" when I get back to work (ref: #102), and then unless you think I missed something, consider the issue solved.
I debugged the clean_single_target function by re-executing its steps,
and I can indeed come to the point where it calls 'unlink("/",recursive=TRUE,force=TRUE)'
based on the simple plan
```
p=drake_plan( file_in(paste0("/tmp", "/" , "" )))
make(p)
#> Error: The specified pathname is not a file: /
#> In addition: Warning message:
#> missing input files:
graph <- read_drake_graph(cache = get_cache())
graph
#> IGRAPH 85defe5 DN-- 4 3 --
#> + attr: name (v/c), deps (v/x), trigger (v/x), file (e/n)
#> + edges from 85defe5 (vertex names):
#> [1] "" ->drake_target_1 "/" ->drake_target_1 "/tmp"->drake_target_1
igraph::V(graph)$name
#> [1] "\"\"" "\"/\"" "\"/tmp\"" "drake_target_1"
deps <- igraph::vertex_attr(graph = graph, name = "deps", index = "drake_target_1")[[1]]
deps$file_in
#> [1] "\"/tmp\"" "\"/\"" "\"\""
files <- sort(unique(deps$file_in))
files
#> [1] "\"\"" "\"/\"" "\"/tmp\""
drake_unquote(files)
#> [1] "" "/" "/tmp"
# Deliberately NOT run; this is the call that clean_single_target() would reach:
# unlink(drake_unquote(files), recursive = TRUE, force = TRUE)
```
Thanks. Because of https://github.com/ropensci/drake/commit/99239e83bf6c72c44514009727ef14e4205e4d51, development drake now uses plain unlink() in clean_single_target(). recursive and force are now FALSE.
Update: "Error: The specified pathname is not a file:" was easy to fix. Please be sure any file_in(), file_out(), and knitr_in() files are not directories.
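A quick way to check for this before calling make() could look like the sketch below. It uses base R only; `find_directory_paths` is a hypothetical helper, not a drake function, and the example paths are made up for illustration.

```r
# Hypothetical pre-flight check (not a drake function): return the declared
# paths that point at existing directories rather than files.
find_directory_paths <- function(paths) {
  paths[dir.exists(paths)]
}

# Made-up example: one directory and one file path that does not exist yet.
declared <- c(tempdir(), file.path(tempdir(), "report.csv"))
offending <- find_directory_paths(declared)
offending  # only the directory is flagged
```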
@wlandau - firstly, thanks for the fantastic package.
Just wanted to report that I had a similar issue - I will try to reproduce it on another test project.
An entire directory ( ./analysis ) was deleted after using drake::clean(garbage_collection = TRUE), which I believe is undesired behavior.
I believe that the directory was potentially declared as a dependency after I used here::here() in combination with file_in() and file_out() in a plan (experimenting whether I could establish a dependence on figures for knitr::include_graphics() in a .Rmd report).
E.g.
my_plan <- drake_plan(
  plot1 = ggsave(
    file_out(here::here("analyses", "figures", "test_figure.png")),
    test_plot
  ),
  plot1_knit = file_in(here::here("analyses", "figures", "test_figure.png"))
)
I'll try to report back with more detail and a reprex during the weekend.
Cheers
You know what? Let's just make drake stop trying to remove file_out() files in clean(), even when garbage_collection = TRUE. The risks are not worth it, and clean() will run faster. @the-Hull, the issue should go away if you install d22e3b0c639cf2944f5d206a188a082ee814e9a9.
I very much appreciate your time and dedication to making drake such a convenient and useful package for the community. rOpenSci and drake have made my life so much easier!
You are welcome, I am so glad it is making an impact.