I have a dataset uploaded to Spark, and I am trying to call some Dataset methods through `invoke`. I get an error when I use `invoke` with `select` or `groupBy`, but `invoke` on `drop` succeeds:
```
iris_tbl <- copy_to(sc, iris)
x <- spark_dataframe(iris_tbl)
```
Success:
```
x %>% invoke("drop", "Species")
<jobj[63]>
  class org.apache.spark.sql.Dataset
  [Sepal_Length: double, Sepal_Width: double ... 2 more fields]
```
Error:
```
x %>% invoke("select", "Species")
Error: java.lang.Exception: No matched method found for class org.apache.spark.sql.Dataset.select
	at sparklyr.Invoke$.invoke(invoke.scala:91)
	at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
	at sparklyr.StreamHandler$.read(stream.scala:55)
	at sparklyr.BackendHandler.channelRead0(handler.scala:49)
	at sparklyr.BackendHandler.channelRead0(handler.scala:14)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	...
```
@VadymBoikov currently, sparklyr does not support variadic parameters, which is the case for `select` (see http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset): it takes a `String` followed by a `String*`.

To work around this, you can pass an empty `list()` for the `String*` argument when calling `select()`, as follows:

```
x %>% invoke("select", "Species", list())
```
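The same trick should apply to `groupBy`, which failed above for the same reason: its Scala signature is likewise a `String` followed by a `String*`. A hedged sketch (column names taken from the `copy_to` output above):

```
x %>% invoke("groupBy", "Species", list())                 # group by one column
x %>% invoke("groupBy", "Species", list("Petal_Length"))   # group by two columns
```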
It works, thank you
@javierluraschi - how could I select multiple columns?

EDIT: Answer for anyone who ends up here. (Note that `copy_to` replaced the dots in the iris column names with underscores, as the `[Sepal_Length: double, ...]` output above shows.)

```
cols <- list("Species", "Petal_Length", "Petal_Width")
x %>% invoke("select", cols[[1]], cols[2:length(cols)])
```
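As a follow-up sketch (not from the original answer): the `jobj` that `invoke` returns can be wrapped back into a regular sparklyr table with `sdf_register`, so dplyr verbs work on the result again. Column names again assume the underscore renaming done by `copy_to`:

```
cols <- list("Species", "Petal_Length", "Petal_Width")
selected <- x %>% invoke("select", cols[[1]], cols[2:length(cols)])

# sdf_register wraps the returned Dataset jobj into a dplyr-compatible tbl
selected_tbl <- sdf_register(selected)
```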