Dplyr: Dbplyr filter string with grepl - translation error MS SQL Server

Created on 7 Sep 2017  路  8Comments  路  Source: tidyverse/dplyr

This is a question about dbplyr.

I normally use filter with grepl in dplyr, but when using dbplyr I get an error that grepl is not a recognized function. My guess is that it can't translate to SQL server. What is a way around this with dbplyr? The SQL I am using is MS SQL. I would create an small reprex, but I am not sure how to do this for a MS SQL environment. One solution that I got from stackoverflow was

iris %>% do(mutate(., is_setosa = grepl('set', Species)))

I was wondering if this is the most clean/native/recommended way to do this.

Most helpful comment

I stumbled across this myself and found a solution. https://stackoverflow.com/a/47198795/7416441

library(dbplyr)
library(dplyr)
library(nycflights13)

## Working chunk
con <-DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "flights", flights)
DBI::dbGetQuery(con, "SELECT origin, flight 
                FROM flights WHERE origin like '%jf%'")
## End working chunk

# new working chunk
flights <- dplyr::tbl(con,"flights")
flights %>% 
  select(origin, flight) %>% 
  filter(origin %like% '%jf%')

All 8 comments

Here is a reproducible example of the error. It seems to persist in SQLite

```{r}
library(dbplyr)
library(nycflights13)

Working chunk

con <-DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "flights", flights)
DBI::dbGetQuery(con, "SELECT origin, flight
FROM flights WHERE origin like '%jf%'")

End working chunk

The below code does not work

flights <- tbl(con,"flights")
flights %>%
select(origin, flight) %>%
filter(grepl('jf', origin))
```

the error is Error in rsqlite_send_query(conn@ptr, statement) : no such function: GREPL

Hi @alexhallam , you should be able to use the STRPOS MSSQL command directly in the filter() verb, here is the reference link: https://msdn.microsoft.com/en-us/library/dd355417.aspx. It will return the number of the first instance of the search term inside the field, so if it returns more than 0, then we have a match. The syntax would be something like: filter(strpos(origin, 'jf') > 0).

I have an implementation of that function wrapped in stringr::str_detect(), if you'd like to try out the PR: https://github.com/tidyverse/dbplyr/pull/35

Thanks. I gave that a shot, but I am getting an error [Microsoft][ODBC SQL Server Driver][SQL Server]'STRPOS' is not a recognized built-in function name. I am guessing that the SQL Server we are on is not the most recent version. Do you know of any other options?

@edgararuiz Er, that's not a T-SQL help page.

The equivalent to STRPOS would be CHARINDEX: https://docs.microsoft.com/en-us/sql/t-sql/functions/charindex-transact-sql

@Hong-Revo - You're right! Thank you.

Sorry @alexhallam , I looked in the wrong spot in the code and though that was the MSSQL translation. Weird that STRPOS is actually used in a different MS product.

So yes, CHARINDEX, and it's the same that I had in my PR: https://github.com/edgararuiz/dbplyr/blob/d608708027fdb0ef206b671f5623aca4e3b35a3d/R/db-odbc-mssql.R#L82-L85

CHARINDEX seems to still have a problem. I will give it another shot in a few days. Maybe I will give your PR.

I stumbled across this myself and found a solution. https://stackoverflow.com/a/47198795/7416441

library(dbplyr)
library(dplyr)
library(nycflights13)

## Working chunk
con <-DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "flights", flights)
DBI::dbGetQuery(con, "SELECT origin, flight 
                FROM flights WHERE origin like '%jf%'")
## End working chunk

# new working chunk
flights <- dplyr::tbl(con,"flights")
flights %>% 
  select(origin, flight) %>% 
  filter(origin %like% '%jf%')

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

Was this page helpful?
0 / 5 - 0 ratings