Dplyr: filter() with two conditions freezes R

Created on 21 Dec 2018  Â·  14Comments  Â·  Source: tidyverse/dplyr

This

iris %>% filter(Sepal.Length > 7) %>% filter(Petal.Length < 6)

works fine (gives two rows), but this one:

iris %>% filter(Sepal.Length > 7, Petal.Length < 6)

freezes R...

I'm using dplyr v0.7.99.9000, installed from github just before test.

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.7.99.9000

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5   compiler_3.5.1     magrittr_1.5       assertthat_0.2.0   R6_2.3.0           tools_3.5.1       
 [7] pillar_1.3.1.9000  glue_1.3.0         rstudioapi_0.8     tibble_1.4.99.9006 crayon_1.3.4       Rcpp_1.0.0        
[13] packrat_0.5.0      pkgconfig_2.0.2    rlang_0.3.0.9002   purrr_0.2.5.9000  

Most helpful comment

Confirmed that this is the same symptom as reported by @DavisVaughan. The infloop has been introduced by https://github.com/r-lib/rlang/commit/6ebe1e2296513fe123e23f6a152ae33df2422903, but is ultimately due to dplyr bypassing rlang's tidy eval API.

All 14 comments

Shouldn't freeze R, but, FWIW, the filter() syntax uses an ampersand for multiple criteria as you're using it, I believe:

library(dplyr)
iris %>% filter(Sepal.Length > 7 & Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

Created on 2018-12-21 by the reprex package (v0.2.1.9000)

ampersand works. Stragne, somedays ago comma also worked fine

Works for me on a fresh install of bbd84bc. Can you please reinstall and try again?

Also works for me too. Can you run this through a reprex to make sure something else in the session is not interfering:

library(dplyr)

iris %>% 
  filter(Sepal.Length > 7, Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

iris %>% 
  filter(Sepal.Length > 7) %>% 
  filter(Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

Created on 2018-12-21 by the reprex package (v0.2.1.9000)

Still the same :( Upgraded R to 3.5.2, all packages, and nothing :( Console:

> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> 
> iris %>% 
+   filter(Sepal.Length > 7) %>% 
+   filter(Petal.Length < 6)
  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          7.1           3          5.9         2.1 virginica
2          7.2           3          5.8         1.6 virginica
> 
> iris %>% 
+   filter(Sepal.Length > 7 & Petal.Length < 6)
  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          7.1           3          5.9         2.1 virginica
2          7.2           3          5.8         1.6 virginica
> 
> iris %>% 
+   filter(Sepal.Length > 7, Petal.Length < 6)

and then it stops

This reproduces on my Windows with the current master (45894214a543e63c01f66e8b1fb4cc77a5978f40).

I can reproduce with level rlang, @lionel-

I'll see if that's the same bug reported by Davis.

Confirmed that this is the same symptom as reported by @DavisVaughan. The infloop has been introduced by https://github.com/r-lib/rlang/commit/6ebe1e2296513fe123e23f6a152ae33df2422903, but is ultimately due to dplyr bypassing rlang's tidy eval API.

Thanks. Will have a look in the next few days.

I got this far: https://github.com/tidyverse/dplyr/compare/master...lionel-:fix-tidyeval-infloop

We should:

  • Not touch the parent of the mask.
  • Pass quosures to eval_tidy whole.
  • And maybe in the future: pass the caller env to eval_tidy as well. This will help with nested data masks, but no immediate action is needed.

I think the overall approach is sound since that fixes the infloop and all mutate and summarise tests pass. However I get some failures in arrange() tests and elsewhere. Romain could you take a look please?

Edit: Will need to replace the Rcpp::unwindProtect() call by Rcpp::Rcpp_fast_eval(<rlang::eval_tidy(quote(<expr>), <env>>) on R < 3.5.

I found, that R freezes in case of use top_n() also. I am guessing that the problem is on the rlang side.

The problem is on the dplyr side, see previous message.

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

Was this page helpful?
0 / 5 - 0 ratings