This may be a similar issue to #2732, which was closed. I got some feedback from Oracle folks based on the discussion there and so decided to file a new issue.
Hadley's response to #2732 was that the ROracle "hasn't supplied the necessary methods to make a full dplyr backend." He recommended working with ROracle's maintainers. Hadley's answer may still be true but the info I got back from the ROracle maintainer was that the issue was on dplyr's end.
To explain the basic issue, I'm to use tbl()
to work with tables in an Oracle database. I've used connections established with both ROracle
and odbc
and experience errors with both.
What I hope to do is this to then use dplyr's select(), filter(), etc. with the data:
tbl(con, in_schema("my_schema", "my_table"))
The error I get when attempting this is:
Error: <SQL> 'SELECT *
FROM (my_schema.my_table)
FETCH FIRST 10 ROWS ONLY '
nanodbc/nanodbc.cpp:1587: HY000: [Oracle][ODBC][Ora]ORA-00933: SQL command not properly ended
This next chunk does work and brings the query results down into R as a tibble:
my_tbl <- con %>% tbl(sql("SELECT * FROM my_schema.my_table WHERE year = 2017")) %>% collect()
Oddly though, if I leave off collect() then I get the ORA-00933 error again,
Again, I would like to use filter() and select() instead of the SQL method. I get the ORA-00933 error even using collect() with those verbs.
The take of the ROracle maintainer is that there's a modification/tranformation of the SQL statement so that the Oracle DB does not recognize it. He had worked with another user and had a trace file showing this. He says ROracle will just pass the SQL text and so did not think it was their error.
I'm not a very sophisticated user and so am not sure where to go next... If it is on ROracle's end, are there specific methods for the back-end that I could suggest they look at building?
Thanks for all your work on dplyr and dbplyr.
This is a tentative guess, but I believe that Oracle does not permit using "AS" to alias table names in SQL, and dplyr is building all it's queries assuming that that is permitted.
So, for example
"select * from my_table a"
is allowed but
"select * from my_table as a"
is not.
The issue is definitely on ROracle's side, but if they don't want to provide the necessary SQL translations we can do it. More shortly...
@nilescbn / @joranE - Would you mind connecting via ROracle
, use this code this code at the top of the script, and then try dplyr
commands again?
sql_translate_env.OraConnection <- dbplyr:::sql_translate_env.Oracle
sql_select.OraConnection <- dbplyr:::sql_select.Oracle
sql_subquery.OraConnection <- dbplyr:::sql_subquery.Oracle
Thank you for the quick replies, I appreciate it.
I did as you suggested @edgararuiz. The results were mixed.
Doing this tbl(con, in_schema("my_schema", "my_table"))
, still gives me the ORA-00933 error. Here's the specific error message:
Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00933: SQL command not properly ended
I had some success using the dplyr
verbs though.This didn't work before but did this time:
my_tbl <- `tbl(con, in_schema("my_schema", "my_table")) %>%
filter(year == 2017) %>% collect()
As before though, if I remove the collect()
verb I get the ORA-00933 error. And as in my last post, using sql()
for the query gives the error when I don't use collect()
but works when I do use collect()
.
I wonder if this is an issue with how in_schema()
translates in Oracle. Is there a way to try it against the default schema?
@nilescbn -From looking at this article: https://docs.oracle.com/cd/B28359_01/server.111/b28310/general009.htm#ADMIN02101
Maybe passing this article will preset the default schema in your session:
ALTER SESSION SET CURRENT_SCHEMA = <schema name>
I can try that. It may take me a few days though because I have never worked with the default schema. I believe this database I'm working with provides a personalized schema for each user. And mine is empt. I can try copying something to it.
I have so little knowledge of how this all works, and so hesitate to even posit a hypothesis, yet why does the collect() function make the difference? The SQL translation seems to work fine when using collect()
.
Ok, then let's try changing the session schema using DBI
, my hope is that the subsequent commands will execute in the selected schema
DBI::dbExecute(con, "ALTER SESSION SET CURRENT_SCHEMA = <schema name>")
tbl(con, "<tablename>") %>%
filter(year == 2017) %>% collect()
I am not sure I understand the whole problem: is it an issue happening with obdc
connection AND ROracle
connection (which use OCI driver not odbc) ? or just one of them. I recently test both, they need different installation and they don't behave the same. Right now, I do not have an oracle database at hand but as soon as I have one, I will try too.
I just tried the code you sent for changing the schema @edgararuiz. Same behavior as before: it works when using collect()
and throws the ORA-00933 error without it.
And, @cderv, yes, the errors happen for me whether using odbc or ROracle. In my last two posts I was using ROrcale
. I just tried changing the schema with an odbc
connection. The error message is formated differently, but it's still an ORA-00933 error.
@cderv - Would you mind sending me what class(con)
returns for you?
I'm having the same issue. I've tried this on the standard schema, as well as with in_schema
, with the same results.
@edgararuiz : output of class(con)
below:
(More or less) reproducible example:
library(tidyverse)
source('~/credentials.R')
#> [1] "Credentials Loaded."
drv <- ROracle::Oracle()
con <- ROracle::dbConnect(
drv,
username = getOption("db.username"),
password = getOption("db.password"),
dbname = getOption("db.connectString"))
class(con)
#> [1] "OraConnection"
#> attr(,"package")
#> [1] "ROracle"
con %>% tbl("T_HTTP_STATUS")
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% collect()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200)
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200) %>% collect()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% head()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl(sql("select * from T_HTTP_STATUS"))
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl(sql("select * from T_HTTP_STATUS")) %>% collect()
#> # A tibble: 73 x 3
#> CODE ERROR_MSG
#> <int> <chr>
#> 1 100 Continue
#> 2 101 Switching Protocols
#> 3 102 Processing
#> 4 200 OK
#> 5 201 Created
#> 6 202 Accepted
#> 7 203 Non-Authoritative Information
#> 8 204 No Content
#> 9 205 Reset Content
#> 10 206 Partial Content
#> # ... with 63 more rows, and 1 more variables: ERROR_MSG_DESC <chr>
con %>% tbl(sql("select * from T_HTTP_STATUS")) %>% head()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00907: missing right parenthesis
# edgararuiz's suggestion -------------------------------------------------
sql_translate_env.OraConnection <- dbplyr:::sql_translate_env.Oracle
sql_select.OraConnection <- dbplyr:::sql_select.Oracle
sql_subquery.OraConnection <- dbplyr:::sql_subquery.Oracle
con %>% tbl("T_HTTP_STATUS")
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% collect()
#> # A tibble: 73 x 3
#> CODE ERROR_MSG
#> <int> <chr>
#> 1 100 Continue
#> 2 101 Switching Protocols
#> 3 102 Processing
#> 4 200 OK
#> 5 201 Created
#> 6 202 Accepted
#> 7 203 Non-Authoritative Information
#> 8 204 No Content
#> 9 205 Reset Content
#> 10 206 Partial Content
#> # ... with 63 more rows, and 1 more variables: ERROR_MSG_DESC <chr>
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200)
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200) %>% collect()
# A tibble: 69 x 3
#> CODE ERROR_MSG
#> <int> <chr>
#> 1 201 Created
#> 2 202 Accepted
#> 3 203 Non-Authoritative Information
#> 4 204 No Content
#> 5 205 Reset Content
#> 6 206 Partial Content
#> 7 207 Multi-Status
#> 8 208 Already Reported
#> 9 226 IM Used
#> 10 300 Multiple Choices
#> # ... with 59 more rows, and 1 more variables: ERROR_MSG_DESC <chr>
con %>% tbl("T_HTTP_STATUS") %>% head()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl(sql("select * from T_HTTP_STATUS"))
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00933: SQL command not properly ended
con %>% tbl(sql("select * from T_HTTP_STATUS")) %>% collect()
#> # A tibble: 73 x 3
#> CODE ERROR_MSG
#> <int> <chr>
#> 1 100 Continue
#> 2 101 Switching Protocols
#> 3 102 Processing
#> 4 200 OK
#> 5 201 Created
#> 6 202 Accepted
#> 7 203 Non-Authoritative Information
#> 8 204 No Content
#> 9 205 Reset Content
#> 10 206 Partial Content
#> # ... with 63 more rows, and 1 more variables: ERROR_MSG_DESC <chr>
con %>% tbl(sql("select * from T_HTTP_STATUS")) %>% head()
#> Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, : ORA-00907: missing right parenthesis
Hi @heidekrueger , thank you for the great write up!
Would it be possible to see the resulting query from one of the commands?
show_query({
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200)
})
@edgararuiz, here are the classes I see.
class(con)
returns:
[1] "OraConnection"
attr(,"package")
[1] "ROracle"
class(tbl(con, "<tablename>") %>% filter(year == 2017))
returns:
[1] "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"
class(tbl(con, "<tablename>") %>% filter(year == 2017)) %>% collect()
returns:
[1] "tbl_df" "tbl" "data.frame"
Using sql()
I see the same behavior as @heidekrueger. I just didn't print the tibbles.
Also, in case this might provide further clues of what's going on, I attempted a copy_to()
operation using a two column data.frame and got this error:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘dbBegin’ for signature ‘"OraConnection"’
I was able to copy the same data.frame to my default schema using ROracle's dbWriteTable()
.
Ok, I think the issue with with the limit
sql translation argument, it should translate to FETCH FIRST {no} ROWS ONLY
. So anything that uses 'head()' or in a preview, like what tibble does, it fails. I need to see what this returns in your environment:
show_query(head(tbl(con, "<TABLE NAME>")))
That returns this (I left the actual table name in this time):
<SQL>
SELECT *
FROM ("COMPREHENSIVE_NPAC")
FETCH FIRST 6 ROWS ONLY
Does this statement run without an error if passed via dbGetQuery
?
DBI::dbGetQuery(SELECT *
FROM (\"COMPREHENSIVE_NPAC\")
FETCH FIRST 6 ROWS ONLY
Is this the proper syntax?:
DBI::dbGetQuery(con, "select *
from COMPREHENSIVE_NPAC
fetch FIRST 6 ROWS ONLY")
If yes, it gives me the ORA-00933 error:
Error in .oci.GetQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00933: SQL command not properly ended
The same thing happens if I use ROracle::dbGetQuery()
in place of the DBI version.
Yup, that should work. Are you on Oracle 12c? From the research I'm doing, it looks like fetch first...
will work on 12c but not in a lower version.
Does this work?
DBI::dbGetQuery(con, "select *
from COMPREHENSIVE_NPAC
LIMIT 6")
Looks like it's Oracle 11g, running select * from V$VERSION
gives me this:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
"CORE 11.2.0.4.0 Production"
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
And nope, using LIMIT 6
instead of FETCH
throws the exact same error.
I was afraid of that, thank you so much for hanging with me on this for this long.
It looks like pre-12c limiting a query is done with a ROWNUM inside the WHERE clause.
if the following command works, I have an idea on how to handle this inside dbplyr
:
DBI::dbGetQuery(con, "SELECT *
FROM \"COMPREHENSIVE_NPAC\"
WHERE ROWNUM < 5 ")
Yes, that successfully printed the first 4 rows.
And, thanks to you for the help.
Great, I'm working on a PR that should handle backwards and forwards compatibility. Early tests are looking good. I will update you when it's ready.
Ok, tested the PR in my odbc
based connection, and it worked. I also added ROracle
support (I hope), so you shouldn't need to pass the translation lines of code I shared with you previously.
> show_query(tbl(con, "MTCARS7") %>% head)
<SQL>
SELECT * FROM (SELECT *
FROM ("MTCARS7") ) "zzz10" WHERE ROWNUM <= 6
You can install it this way:
devtools::install_github("edgararuiz/dbplyr", ref = "translation-roracle")
Great, thanks for your help!
I will test this in the morning (CEST) and report back
I tested this with ROracle
.
collect
).ORA-9033
error is gone, but they still fail with a different error:con %>% tbl("T_HTTP_STATUS")
#> Error in if (info$servername == "") "localhost" else info$servername :
#> argument is of length zero
con %>% tbl("T_HTTP_STATUS") %>% head()
#> Error in if (info$servername == "") "localhost" else info$servername :
#> argument is of length zero
show_query(tbl(con, "T_HTTP_STATUS") %>% head())
#> <SQL>
#> SELECT * FROM (SELECT *
#> FROM ("T_HTTP_STATUS") ) "zzz21" WHERE ROWNUM <= 6
ROracle::dbGetQuery(con,
'SELECT * FROM (SELECT * FROM ("T_HTTP_STATUS") ) "zzz21" WHERE ROWNUM <= 6' )
#> CODE ERROR_MSG
#> 1 100 Continue
#> 2 101 Switching Protocols
#> 3 102 Processing
#> 4 200 OK
#> 5 201 Created
#> 6 202 Accepted
It looks like the failure happens in the call to make_tbl
in the following statement
structure(list(...), class = c(subclass, "tbl"))
I don't know enough about the inner workings of dplyr, but my guess is that something is going wrong with the src
object, because ROracle::dbGetInfo(con)
has no servername
attribute:
names(ROracle::dbGetInfo(con))
#> [1] "username" "dbname" "serverVersion"
#> [4] "serverType" "resTotal" "resOpen"
#> [7] "prefetch" "bulk_read" "bulk_write"
#> [10] "stmt_cache" "results"
Running inside debugger inside structure
call from dplyr::make_tbl
:
# Running in debugger
Browse[3]> .Data
#> $src
#> Error in if (info$servername == "") "localhost" else info$servername :
#> argument is of length zero
Browse[3]> str(.Data)
#> List of 2
#> $ src:List of 2
#> ..$ con :Formal class 'OraConnection' [package "ROracle"] with 2 slots
#> .. .. ..@ handle :<externalptr>
#> .. .. ..@ timesten: logi FALSE
#> ..$ disco: NULL
#> ..- attr(*, "class")= chr [1:3] "src_dbi" "src_sql" "src"
#> $ ops:List of 2
#> ..$ x :Classes 'sql', 'character' chr "select * from T_HTTP_STATUS"
#> ..$ vars: chr [1:3] "CODE" "ERROR_MSG" "ERROR_MSG_DESC"
#> ..- attr(*, "class")= chr [1:3] "op_base_remote" "op_base" "op"
Outstanding work, thank you @heidekrueger . I removed the custom db_desc()
pointer, please try again and let me know. Thank you!
Looking good!
con %>% tbl("T_HTTP_STATUS")
%> # Source: table<T_HTTP_STATUS> [?? x 3]
%> # Database: OraConnection
%> CODE ERROR_MSG
%> <int> <chr>
%> 1 100 Continue
%> 2 101 Switching Protocols
%> 3 102 Processing
%> 4 200 OK
%> 5 201 Created
%> 6 202 Accepted
%> 7 203 Non-Authoritative Information
%> 8 204 No Content
%> 9 205 Reset Content
%> 10 206 Partial Content
%> # ... with more rows, and 1 more variables: ERROR_MSG_DESC <chr>
con %>% tbl("T_HTTP_STATUS") %>% filter(CODE > 200)
%> # Source: lazy query [?? x 3]
%> # Database: OraConnection
%> CODE ERROR_MSG
%> <int> <chr>
%> 1 201 Created
%> 2 202 Accepted
%> 3 203 Non-Authoritative Information
%> 4 204 No Content
%> 5 205 Reset Content
%> 6 206 Partial Content
%> 7 207 Multi-Status
%> 8 208 Already Reported
%> 9 226 IM Used
%> 10 300 Multiple Choices
%> # ... with more rows, and 1 more variables: ERROR_MSG_DESC <chr>
@edgararuiz sorry for the delay. It is not I do not want to help but I won't have access to my company oracle database before Monday. It Seems you're near to solve the issue. I will continue to follow and if you want me to still test everything, I could do next week with pleasure.
How can I test this? I have tried running devtools::install_github('tidyverse/dplyr')
and I have dbplyr
1.1.0 installed. Yet, when I try to get a table with my ROracle
connection
tbl(con, "my_table")
I just get the error
Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00933: SQL command not properly ended
If I add the three lines shown above I get
sql_translate_env.OraConnection <- dbplyr:::sql_translate_env.Oracle
sql_select.OraConnection <- dbplyr:::sql_select.Oracle
sql_subquery.OraConnection <- dbplyr:::sql_subquery.Oracle
tbl(con, "my_table")
Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00942: table or view does not exist
So I then try with the relevant schema
tbl(con, dbplyr::in_schema("my_schema", "my_table"))
but just the same error at the start
Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00933: SQL command not properly ended
I also tried changing the current schema with
res <- dbSendQuery(con, "ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA")
dbClearResult(res)
but still not properly ended
error.
Hi @cdeterman ,
A couple of things to try:
Can you run the dbSendQuery
command with equating it to a variable: dbSendQuery(con, "ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA")
?
Can you try the same query in your Oracle SQL client?ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA
Thanks
@edgararuiz
I'm not sure I understand your first point. I did set the dbSendQuery
as a variable.
res <- dbSendQuery(con, "ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA")
Your second point, I can confirm that it does work to change the schema.
Ok, sorry for the confusion, let's just try this verbatim:
DBI::dbExecute(con, "ALTER SESSION SET CURRENT_SCHEMA = MY_SCHEMA")
@edgararuiz that seems to do the trick, it returns my table now. Many thanks
That's great to hear! I'll note that in the Oracle page on the site: http://db.rstudio.com/oracle/
@edgararuiz one other thing, is there an eta on the translator getting the count
and n_distinct
functions from the warning?
Warning message:
Translator is missing window functions:
count, n_distinct
Yes, that's my bad. I already sent a PR to fix that. Thanks for letting me know
@edgararuiz excellent, do you have a link to that PR? Which package is this PR for that I will make a point of pulling from until it is on CRAN?
Sure thing, here you go: https://github.com/tidyverse/dbplyr/pull/34
@cdeterman - I decided to submit a small PR to fix just that issue, would you mind trying it out to confirm that the warning went away? https://github.com/tidyverse/dbplyr/pull/37
Thanks!
@edgararuiz The count
function appears to work but the warning message is still showing up though.
@edgararuiz Do you need more tests on your branch ? I have Oracle 12 at my company.
Hi @cderv - Thanks, I don't have anything specific at this time. Are you currently using odbc
or ROracle
? If ROracle
, test driving https://github.com/tidyverse/dbplyr/pull/27 would be awesome
@cdeterman - Thank you so much for testing. I'm not able to reproduce the Warning after switching to the version in the PR. Would it be possible to restart your R session and then installing the PR's version?
@edgararuiz I restarted my R session and installed the tidyverse/dbplyr#27. It appears to be working correctly but the warning message is still there.
@cdeterman - Thank you again. Would it be possible to create a small reprex to see what I'm missing?
@edgararuiz reprex??? I don't recognize this term.
So sorry about that, I'm just referring to a small reproducible code that I can run in my environment and see if I get the same results
@edgararuiz
Unfortunately I can't provide the exact code but if you have an oracle database to connect to it should work for any table you wish to query.
library(ROracle)
library(dplyr)
sql <- "
SELECT *
FROM my_table
"
my_tbl <- tbl(con, sql(sql))
my_tbl %>%
count()
@edgararuiz it may also be worth noting that I discovered that a case_when
call inside of a mutate
results in an invalid query when I wish to have an 'ELSE' clause (Note, it will run without the ELSE clause but will populate it with NA
instead. I could fix this after the fact by this seems like it should be possible.
my_tbl %>%
mutate( new_var =
case_when(my_flag == 'check' ~ 1,
TRUE ~ 0)) %>%
show_query()
This results in a query where the begging has the following near the top (where no ELSE statement is included)
WHEN ("MY_FALG" = 'check') THEN (1.0)
WHEN (TRUE) THEN (0.0)
END as "new_var"
This is not valid syntax for Oracle SQL and therefore I always receive the error:
Error in .oci.SendQuery(conn, statement, data = data, prefetch = prefetch, :
ORA-00920: invalid relational operator
Perhaps a separate issue? Not sure if this is across other database backends.
@edgararuiz when you have a minute or two, can you please summarise any actions that I need to take based on this thread?
@hadley - I honestly do not see anything else for you to do beyond merging https://github.com/tidyverse/dbplyr/pull/27. I did added the "Fixes" call in the comment, but it doesn't seem to have closed this issue.
@cdeterman Did the fixes by @edgararuiz solved your issue with case_when
translation ?
It seems I still have an error on my side but before opening any new issue here, I would like to have to your opinion as you had already one. Thanks
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/