lag in dplyr clobbers lag in base R and also any lag methods. To make it worse, it clobbers it with a function that works differently from lag in R, reversing the notation of lead and lag.
lag in dplyr should be renamed to Lag or some other name which will not conflict with core R and other packages.
A warning is issued when dplyr is loaded but this inconsistency is so egregious that I don't think that that is sufficient.
Try the following and note the difference.
lag(ts(1:4))
library(dplyr)
lag(ts(1:4))
Note that zoo and dyn work consistently with R. lag.xts works the opposite way that R works but at least it just adds an xts method and does not interfere with other methods. dynlm provides L which is like lag but in the opposite direction which seems ok because it uses a different name.
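To make the "reversed notation" point concrete, here is a minimal base-R sketch. It assumes nothing beyond the stats package; `value_shift` is a hand-written stand-in for the value-shifting behaviour that dplyr's lag applies to a plain vector, not a call into dplyr itself.

```r
x <- ts(1:4)

# stats::lag() leaves the values alone and moves the time base back by k,
# so in calendar terms the series is shifted earlier.
index_shift <- stats::lag(x, 1)
tsp(index_shift)         # 0 3 1  (start, end, frequency)
as.numeric(index_shift)  # 1 2 3 4

# Emulation (an assumption for illustration) of a value-shifting lag on a
# plain vector: positions stay fixed and the values move one slot later.
value_shift <- c(NA, head(as.numeric(x), -1))
value_shift              # NA 1 2 3
```

The two results point in opposite directions relative to the original observations, which is why silently swapping one for the other changes results rather than just raising an error.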
This is a natural consequence of having lots of R packages. Just be explicit and use stats::lag or dplyr::lag.
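A short sketch of what being explicit looks like in a script. The dplyr call is guarded because it assumes dplyr is installed; the variable names are only illustrative.

```r
x <- ts(1:4)

# Fully qualified: always the base R generic, whatever is attached.
base_result <- stats::lag(x, 1)
tsp(base_result)  # 0 3 1

# Fully qualified dplyr version, applied to a plain vector.
# Runs only if dplyr happens to be installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  vec_result <- dplyr::lag(1:4)
  print(vec_result)
}
```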
I really don't agree at all. This is not the result of many packages. It's a conflict between R out of the box and dplyr. Even if there were no other packages this problem would exist. The names in R ought to be respected.
@ggrothendieck see the discussion on the commit https://github.com/hadley/dplyr/commit/f8a46e030b7b899900f2091f41071619d0a46288 for a more thoughtful line of reasoning. FWIW, I agree with you. The least-bad solution is: don't mask generics from base packages.
This just came up yet again on stackoverflow and the situation did not even involve lag in an obvious way although dplyr's lag was the culprit.
http://stackoverflow.com/questions/36991517/retain-zoo-class-after-lapply
The comments there refer to yet another SO post where it came up too showing how prevalent this problem is.
@hadley: It is easy for me to use stats::lag and dplyr::lag in my own code. But what about the many packages that use the base R lag function without explicitly writing stats::lag? How do I proceed here? Do I:
- Unload dplyr every time I want to use a function from a different package with which there is a conflict? (Note: this is not possible if I use tidyverse. In this case I would have to unload tidyverse/broom.)
- Define lag = function(...) { stats::lag(...) } and lag = function(...) { dplyr::lag(...) } whenever there is a conflict?
- Add stats:: to the function with which there is a conflict: trace("conflictFunction", edit=TRUE) + some search/replace?

While I haven't been able to search specifically for R packages that use the (unspecified) lag function, the 3,439 available code results give some indication of the scale of this problem. On the other hand, it is quite clear that very few packages are specific in terms of which lag function they use; stats::lag is used ~10 times and dplyr::lag is used ~7 times.
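For the wrapper idea above, a hedged sketch of two ways to keep lag pointing at stats::lag in a script. Option A assumes R >= 3.6.0 (where library() accepts an exclude argument) and an installed dplyr, and it has no effect if dplyr is already attached. Option B works in any session because the global environment is searched before attached packages.

```r
# Option A (assumption: R >= 3.6.0, dplyr installed, not yet attached):
# attach dplyr without putting its lag() on the search path.
if (getRversion() >= "3.6.0" && requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr, exclude = "lag", warn.conflicts = FALSE)
}

# Option B: pin the name in the script's own environment.
# This definition is found before anything attached with library().
lag <- function(...) stats::lag(...)

pinned <- lag(ts(1:4), 1)
tsp(pinned)  # 0 3 1
```

Either option only affects code evaluated from the script or console. It does not change which lag a package's own functions see.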
The namespaces of packages are encapsulated and fixed in advance. So other packages will always use the intended lag() function even if you attach dplyr. The packages attached with library() are sort of a namespace for your scripts (the full story is a little more complicated) so you only need to be careful about lag in your scripts.
The namespace encapsulation of packages is a very good reason to turn your utility functions into a package. This way they will always work the same way no matter what packages you've attached to the search path.
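The lookup rule behind this can be simulated without building a package. In the sketch below, `script_fn` and `pkg_fn` are hypothetical names, the global `lag` is a stand-in for a masking definition, and borrowing the stats namespace as the enclosing environment is only an illustration of where a package function starts its lookup.

```r
# Stand-in for a masking definition such as one attached by another package.
lag <- function(x, ...) stop("masked lag() was called")

# A script-level function resolves lag() from where it was defined,
# so it hits the mask.
script_fn <- function(x) lag(x, 1)

# Hypothetical package function: its enclosing environment is a namespace,
# so lookup starts there and never reaches the mask.
pkg_fn <- function(x) lag(x, 1)
environment(pkg_fn) <- asNamespace("stats")

x <- ts(1:4)
script_result <- tryCatch(script_fn(x), error = function(e) conditionMessage(e))
pkg_result <- pkg_fn(x)

script_result    # "masked lag() was called"
tsp(pkg_result)  # 0 3 1
```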
@lionel- I am experiencing the same problems as @jsekamane. I don't really understand your explanation of namespace encapsulation, but you seem to suggest that loading dplyr will not interfere with other packages' usage of lag(). That has not been my experience. I am forced to unload/load dplyr depending on the project I'm working on. I'll know right away if I've forgotten to do this because I get a crash with the message: "Error: n must be a nonnegative integer scalar, not double of length 1". Aha! forgot to unload dplyr!!
I love the tidyverse but it makes my whole development environment unstable.
What can we do?
@ljanissen I'd suggest asking on https://community.rstudio.com — it's likely you'll need a solid reprex to get a good answer, and I'd bet the process of creating a reprex will help you identify exactly where the problem occurs.
@ljanissen Are these CRAN packages? CRAN forces you to import functions explicitly in order to make your package completely encapsulated but if the package isn't on CRAN this might not be the case. If lag() isn't explicitly imported then the package is relying on the interactive search path to get the definition of lag. This is very brittle. Instead it should use the roxygen directive #' @importFrom stats lag. Imported functions will never be masked by something on the search path.
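As a rough sketch of that directive in a package source file: `shift_back` is a hypothetical function name, and the block assumes the package is documented with roxygen2, which turns the tag into an `importFrom(stats,lag)` line in NAMESPACE.

```r
#' Shift a time series back by one period
#'
#' @param x A time series object.
#' @return The series with its time base moved back by one period.
#' @importFrom stats lag
#' @export
shift_back <- function(x) {
  # Inside the built package, this bare lag() resolves to the imported
  # stats::lag, whatever the user has attached.
  lag(x, 1)
}
```

Writing `stats::lag(x, 1)` in the function body is an alternative that gives the same protection without the import tag.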
Thank you @hadley and @lionel- . I believe the problem is in quantstrat. I traced it (painfully) in the debugger which is how I worked out it was lag(), but I didn't pull out the code to create a reproducible example. I found some r-forge logs from 2015 that seem to be related, and that they feel they addressed the issue, so now I'm doubting myself. I think putting together a clear example is a good idea.
@ljanissen quantstrat never calls lag(), so it cannot be the problem. quantstrat depends on blotter, which does call lag(). But blotter imports stats::lag() to avoid this issue, so that can't be the problem either. I suspect the issue is with a function you're passing into the quantstrat workflow, but it's impossible to know for sure without a reproducible example.
@joshuaulrich thanks for the clarification. Yes, I think a reprex is called for.