Dplyr: Latest dev version of dbplyr `rename_all` error

Created on 28 Dec 2017  路  21Comments  路  Source: tidyverse/dplyr

I had the following query that worked fine with CRAN version of dbplyr. I then installed the dev version and it started erroring.

library(DBI)
library(odbc)
library(dbplyr)
library(tidyverse)
con<-dbConnect(odbc(),
                  driver="ODBC Driver 13 for SQL Server",
                  server="mhknbn2kdz.database.windows.net",
                  database="AdventureWorks2012",
                  uid="sqlfamily",
                  pwd="sqlf@m1ly")

con %>% 
  dbGetQuery("select * from information_schema.columns") %>% 
  rename_all(str_to_lower)

Error: table_catalog = TABLE_CATALOG must be a symbol or a string, not formula

Here's the traceback

20.
stop(cnd)
19.
.abort(text)
18.
glubort(fmt_named_calls(named_calls), ..., .envir = .envir)
17.
bad_named_calls(named_call, "must be a symbol or a string, not {actual_type}")
16.
(function (expr, name) { switch_type(expr, string = , symbol = return(as_string(expr)), language = if (is_data_pronoun(expr)) { ...
15.
mapply(FUN = f, ..., SIMPLIFY = FALSE)
14.
Map(.f, .x, .y, ...)
13.
map2(exprs, names(exprs), switch_rename)
12.
rename_vars(names(.data), !(!(!quos(...))))
11.
rename.data.frame(.tbl, !(!(!syms)))
10.
rename(.tbl, !(!(!syms)))
9.
rename_all(., str_to_lower)
8.
function_list[[k]](value)
7.
withVisible(function_list[[k]](value))
6.
freduce(value, `_function_list`)
5.
`_fseq`(`_lhs`)
4.
eval(quote(`_fseq`(`_lhs`)), env, env)
3.
eval(quote(`_fseq`(`_lhs`)), env, env)
2.
withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1.
con %>% dbGetQuery("select * from information_schema.columns") %>% rename_all(str_to_lower)

I was connected to SQL Server but the query is generic to any db that implements information_schema - had to provide a publicly available SQL Server db as sqlite does not support information_schema

Most helpful comment

Yes, thanks @lionel- !

And thank you @stephlocke for letting us know about this issue!

Also, thank you @batpigandme for the heads up when the issue was reported!

All 21 comments

Hi @stephlocke ! thank you for reporting this, would it be possible to test it this way?

tbl(con, in_schema(information_schema.columns)) %>%
  rename_all(str_to_lower)

I just want to make sure that the issue is not due priming the dplyr code with dbGetQuery()

Also, thanks for trying out str_to_lower()!

tbl(con, in_schema(information_schema.columns))

Error in paste0(schema, ".", table) :
object 'information_schema.columns' not found

Error in paste0(schema, ".", table) : object 'information_schema.columns' not found
9.
paste0(schema, ".", table)
8.
ident_q(paste0(schema, ".", table))
7.
in_schema(information_schema.columns)
6.
as.sql(from)
5.
tbl_sql("dbi", src = src, from = from)
4.
tbl.src_dbi(dbplyr::src_dbi(src), from = from, ...)
3.
tbl(dbplyr::src_dbi(src), from = from, ...)
2.
tbl.DBIConnection(con, in_schema(information_schema.columns))
1.
tbl(con, in_schema(information_schema.columns))

Ah, looking at in_schema I tweaked the code

tbl(con, in_schema("information_schema","columns")) %>%
  rename_all(str_to_lower)

got the original error Error: table_catalog = TABLE_CATALOG must be a symbol or a string, not formula

Ok cool, thank you, I'll look into this.

Sorry for giving you the wrong code earlier :)

No worries! Great to see in_schema coming along though 馃槃

So definitely an issue with the rename_all() as this works

con %>% 
  tbl(in_schema("information_schema","columns")) %>% 
  select_all(str_to_lower) 

Ok, so I just tested this with my local SQL express and it worked for me. I just realized that rename() is actually operating in R, so the str_to_lower() operation is the actual stringr function, not the dbplyr one :)

This works for me:

tbl(con, in_schema("information_schema","columns")) %>%
  rename_all(str_to_lower) 

Can you try this?

tbl(con, in_schema("information_schema","columns")) %>%
  rename_all(str_to_lower) %>%
  show_query()

It should return something like this:

<SQL>
SELECT "TABLE_CATALOG" AS "table_catalog", "TABLE_SCHEMA" AS "table_schema", "TABLE_NAME" AS "table_name", "COLUMN_NAME" AS "column_name", "ORDINAL_POSITION" AS "ordinal_position", "COLUMN_DEFAULT" AS "column_default", "IS_NULLABLE" AS "is_nullable", "DATA_TYPE" AS "data_type", "CHARACTER_MAXIMUM_LENGTH" AS "character_maximum_length", "CHARACTER_OCTET_LENGTH" AS "character_octet_length", "NUMERIC_PRECISION" AS "numeric_precision", "NUMERIC_PRECISION_RADIX" AS "numeric_precision_radix", "NUMERIC_SCALE" AS "numeric_scale", "DATETIME_PRECISION" AS "datetime_precision", "CHARACTER_SET_CATALOG" AS "character_set_catalog", "CHARACTER_SET_SCHEMA" AS "character_set_schema", "CHARACTER_SET_NAME" AS "character_set_name", "COLLATION_CATALOG" AS "collation_catalog", "COLLATION_SCHEMA" AS "collation_schema", "COLLATION_NAME" AS "collation_name", "DOMAIN_CATALOG" AS "domain_catalog", "DOMAIN_SCHEMA" AS "domain_schema", "DOMAIN_NAME" AS "domain_name"
FROM information_schema.columns

I now suspect that we may have some package version differences, here is my sessionInfo, can you compare it w yours to make sure we're at the same levels:

R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] stringr_1.2.0     dbplyr_1.1.0.9000 odbc_1.1.3        bindrcpp_0.2      dplyr_0.7.4       lubridate_1.7.1  
[7] RSQLite_2.0      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14     assertthat_0.2.0 digest_0.6.12    R6_2.2.2         DBI_0.7          magrittr_1.5     rlang_0.1.4     
 [8] stringi_1.1.5    blob_1.1.0       tools_3.4.2      bit64_0.9-7      glue_1.2.0       purrr_0.2.4      bit_1.1-12      
[15] hms_0.3          yaml_2.1.15      compiler_3.4.2   pkgconfig_2.0.1  memoise_1.1.0    tidyselect_0.2.3 bindr_0.1       
[22] tibble_1.3.4  
> tbl(con, in_schema("information_schema","columns")) %>%
+   rename_all(str_to_lower) %>%
+   show_query()
<SQL>
SELECT "TABLE_CATALOG" AS "table_catalog", "TABLE_SCHEMA" AS "table_schema", "TABLE_NAME" AS "table_name", "COLUMN_NAME" AS "column_name", "ORDINAL_POSITION" AS "ordinal_position", "COLUMN_DEFAULT" AS "column_default", "IS_NULLABLE" AS "is_nullable", "DATA_TYPE" AS "data_type", "CHARACTER_MAXIMUM_LENGTH" AS "character_maximum_length", "CHARACTER_OCTET_LENGTH" AS "character_octet_length", "NUMERIC_PRECISION" AS "numeric_precision", "NUMERIC_PRECISION_RADIX" AS "numeric_precision_radix", "NUMERIC_SCALE" AS "numeric_scale", "DATETIME_PRECISION" AS "datetime_precision", "CHARACTER_SET_CATALOG" AS "character_set_catalog", "CHARACTER_SET_SCHEMA" AS "character_set_schema", "CHARACTER_SET_NAME" AS "character_set_name", "COLLATION_CATALOG" AS "collation_catalog", "COLLATION_SCHEMA" AS "collation_schema", "COLLATION_NAME" AS "collation_name", "DOMAIN_CATALOG" AS "domain_catalog", "DOMAIN_SCHEMA" AS "domain_schema", "DOMAIN_NAME" AS "domain_name"
FROM information_schema.columns
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics 
[3] grDevices utils    
[5] datasets  methods  
[7] base     

other attached packages:
 [1] dbplyr_1.1.0.9000 
 [2] forcats_0.2.0     
 [3] stringr_1.2.0.9000
 [4] dplyr_0.7.4       
 [5] purrr_0.2.4       
 [6] readr_1.1.1       
 [7] tidyr_0.7.2       
 [8] tibble_1.3.4.9003 
 [9] ggplot2_2.2.1     
[10] tidyverse_1.2.1   
[11] odbc_1.1.3        
[12] DBI_0.7           

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14         
 [2] cellranger_1.1.0     
 [3] pillar_0.0.0.9000    
 [4] compiler_3.4.3       
 [5] plyr_1.8.4           
 [6] bindr_0.1            
 [7] tools_3.4.3          
 [8] bit_1.1-12           
 [9] lubridate_1.7.1      
[10] jsonlite_1.5         
[11] nlme_3.1-131         
[12] gtable_0.2.0         
[13] lattice_0.20-35      
[14] pkgconfig_2.0.1      
[15] rlang_0.1.4.9000     
[16] psych_1.7.8          
[17] cli_1.0.0            
[18] rstudioapi_0.7.0-9000
[19] yaml_2.1.16          
[20] parallel_3.4.3       
[21] haven_1.1.0          
[22] bindrcpp_0.2         
[23] xml2_1.1.1           
[24] httr_1.3.1           
[25] hms_0.4.0            
[26] bit64_0.9-7          
[27] grid_3.4.3           
[28] glue_1.2.0           
[29] R6_2.2.2             
[30] readxl_1.0.0         
[31] foreign_0.8-69       
[32] modelr_0.1.1         
[33] reshape2_1.4.3       
[34] blob_1.1.0           
[35] magrittr_1.5         
[36] scales_0.5.0.9000    
[37] rvest_0.3.2          
[38] assertthat_0.2.0     
[39] mnormt_1.5-5         
[40] colorspace_1.3-2     
[41] stringi_1.1.6        
[42] lazyeval_0.2.1       
[43] munsell_0.4.3        
[44] broom_0.4.3          
[45] crayon_1.3.4  

UPDATE: tried with CRAN stringr but same problem with just the rename_all

That is so weird! I just ran your resulting SQL query on my box and it worked! Not sure what's causing it to fail on yours

Can we try tolower() instead of str_to_lower() just to confirm is not trying to use dbplyr 's version of it (grasping at straws here )

Same issue with tolower() 馃

Try the azure sql db con details at the top - perhaps it's a compatibility issue?

good idea, I tried it and it worked for me

> tbl(con, in_schema("information_schema","columns")) %>%
+   rename_all(str_to_lower) 
# Source:   lazy query [?? x 23]
# Database: Microsoft SQL Server 12.00.1000[sqlfamily@mhknbn2kdz/AdventureWorks2012]
        table_catalog table_schema      table_name       column_name ordinal_position column_default is_nullable data_type
                <chr>        <chr>           <chr>             <chr>            <int>          <chr>       <chr>     <chr>
 1 AdventureWorks2012   Production BillOfMaterials BillOfMaterialsID                1           <NA>          NO       int
 2 AdventureWorks2012   Production BillOfMaterials ProductAssemblyID                2           <NA>         YES       int
 3 AdventureWorks2012   Production BillOfMaterials       ComponentID                3           <NA>          NO       int
 4 AdventureWorks2012   Production BillOfMaterials         StartDate                4    (getdate())          NO  datetime
 5 AdventureWorks2012   Production BillOfMaterials           EndDate                5           <NA>         YES  datetime
 6 AdventureWorks2012   Production BillOfMaterials   UnitMeasureCode                6           <NA>          NO     nchar
 7 AdventureWorks2012   Production BillOfMaterials          BOMLevel                7           <NA>          NO  smallint
 8 AdventureWorks2012   Production BillOfMaterials    PerAssemblyQty                8       ((1.00))          NO   decimal
 9 AdventureWorks2012   Production BillOfMaterials      ModifiedDate                9    (getdate())          NO  datetime
10 AdventureWorks2012   Production         Culture         CultureID                1           <NA>          NO     nchar
# ... with more rows, and 15 more variables: character_maximum_length <int>, character_octet_length <int>,
#   numeric_precision <int>, numeric_precision_radix <int>, numeric_scale <int>, datetime_precision <int>,
#   character_set_catalog <chr>, character_set_schema <chr>, character_set_name <chr>, collation_catalog <chr>,
#   collation_schema <chr>, collation_name <chr>, domain_catalog <chr>, domain_schema <chr>, domain_name <chr>

Ok, it looks like you're on a dev version of rlang and I'm not, can you try again with the CRAN version?

That's the badger! Good spot - using CRAN version of rlang and everything works OK again.

I grabbed the latest github ed of rlang and that errors too. I'm not sure how much this will be an issue for others or if I just hit on a crazy edge case that will hopefully go away as the dev versions of rlang and dbplyr progress but is there anything you'd recommend me do to help the teams pinpoint and resolve the issue?

Thanks @stephlocke, I will look into this.

Thanks @lionel- and thank you @edgararuiz for picking this up so quickly and working with me on the issue!

Yes, thanks @lionel- !

And thank you @stephlocke for letting us know about this issue!

Also, thank you @batpigandme for the heads up when the issue was reported!

Accidentally opened a duplicate in #3277 . The reprex there may provide an additional path with local data_frames if that is helpful at all, though!

Thanks for building a reprex @colearendt!

Are you still seeing the problems with the current dev version of rlang and with dbplyr 1.2.0?

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

Was this page helpful?
0 / 5 - 0 ratings