Google-cloud-ruby: Bigquery Github documentation snippet bug

Created on 11 Jan 2017  路  12Comments  路  Source: googleapis/google-cloud-ruby

Issue

The following snippet lives in the Google-Cloud-Ruby Github documentation and errors when tested. The snippet can be found here. The error is shown below.

From a brief inspection the bug has to do with the query named parameter. When I swap the named parameter with the actual value the snippet works correctly.

# Error
franknatividad@franknatividad0:~/Tests/ruby-bigquery$ bundle exec ruby query.rb 
/usr/local/google/home/franknatividad/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/google-cloud-bigquery-0.23.0/lib/google/cloud/bigquery/service.rb:662:in `rescue in execute': invalidQuery: No matching signature for operator IN for argument types STRING and {ARRAY<STRING>} at [1:102] (Google::Cloud::InvalidArgumentError)
        from /usr/local/google/home/franknatividad/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/google-cloud-bigquery-0.23.0/lib/google/cloud/bigquery/service.rb:660:in `execute'
        from /usr/local/google/home/franknatividad/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/google-cloud-bigquery-0.23.0/lib/google/cloud/bigquery/service.rb:222:in `query'
        from /usr/local/google/home/franknatividad/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/google-cloud-bigquery-0.23.0/lib/google/cloud/bigquery/project.rb:422:in `query'
        from query.rb:10:in `<main>'

# Snippet
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare`" \
      "WHERE word IN (@words) GROUP BY word"
data = bigquery.query sql, params: { words: ['me', 'I', 'you'] }
# Gemfile

source 'https://rubygems.org'

gem 'google-cloud-bigquery'
# Gemfile.lock
GEM
  remote: https://rubygems.org/
  specs:
    addressable (2.5.0)
      public_suffix (~> 2.0, >= 2.0.2)
    faraday (0.10.1)
      multipart-post (>= 1.2, < 3)
    google-api-client (0.9.20)
      addressable (~> 2.3)
      googleauth (~> 0.5)
      httpclient (~> 2.7)
      hurley (~> 0.1)
      memoist (~> 0.11)
      mime-types (>= 1.6)
      representable (~> 2.3.0)
      retriable (~> 2.0)
    google-cloud-bigquery (0.23.0)
      google-api-client (~> 0.9.18)
      google-cloud-core (~> 0.21.0)
    google-cloud-core (0.21.1)
      googleauth (~> 0.5.1)
    googleauth (0.5.1)
      faraday (~> 0.9)
      jwt (~> 1.4)
      logging (~> 2.0)
      memoist (~> 0.12)
      multi_json (~> 1.11)
      os (~> 0.9)
      signet (~> 0.7)
    httpclient (2.8.3)
    hurley (0.2)
    jwt (1.5.6)
    little-plugger (1.1.4)
    logging (2.1.0)
      little-plugger (~> 1.1)
      multi_json (~> 1.10)
    memoist (0.15.0)
    mime-types (3.1)
      mime-types-data (~> 3.2015)
    mime-types-data (3.2016.0521)
    multi_json (1.12.1)
    multipart-post (2.0.0)
    os (0.9.6)
    public_suffix (2.0.5)
    representable (2.3.0)
      uber (~> 0.0.7)
    retriable (2.1.0)
    signet (0.7.3)
      addressable (~> 2.3)
      faraday (~> 0.9)
      jwt (~> 1.5)
      multi_json (~> 1.10)
    uber (0.0.15)

PLATFORMS
  ruby

DEPENDENCIES
  google-cloud-bigquery

BUNDLED WITH
   1.13.4
bigquery bug

Most helpful comment

I think the best solution is to add UNNEST to the snippet.

All 12 comments

Does the snippet use standard SQL syntax? You may have to set useLegacySql to false or prepend the query with a line that is just #standardSQL.

@tswast
Yes the snippet uses standard SQL syntax and here's a related quote from the documentation:

As demonstrated above, passing the params option will automatically set standard_sql to true.

A quick try explicitly using standard_sql results in the same error. Maybe that's where the problem exists?

# Code

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
        "FROM `bigquery-public-data.samples.shakespeare`" \
        "WHERE word IN (@words) GROUP BY word"


data = bigquery.query sql, standard_sql: true, params: { words: ['me', 'I', 'you'] }

Can you debug the request by printing out the JSON it is sending? It's also possible it's not being populated correctly. (It is a bit tricky)

Yes, I will try!

I suspect passing an array is simply an incorrect argument. According to Functions & Operators, the syntax is:

x IN (y, z, ...)

Notice that there is no array.

And replacing the array with a single string works:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare`" \
      "WHERE word IN (@words) GROUP BY word"
data = bigquery.query sql, params: { words: 'you' }
#=> [{"word"=>"you", "word_count"=>12527}]

It also works with the addition of UNNEST, per the syntax guide linked above:

The UNNEST form treats an array scan like UNNEST in the FROM clause:
x [NOT] IN UNNEST(<array expression>)
This form is often used with ARRAY parameters. For example:
x IN UNNEST(@array_parameter)

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare`" \
      "WHERE word IN UNNEST(@words) GROUP BY word"
data = bigquery.query sql, params: { words: ['me', 'I', 'you'] }
#=> [{"word"=>"I", "word_count"=>21028}, {"word"=>"me", "word_count"=>8030}, {"word"=>"you", "word_count"=>12527}]

I think the best solution is to add UNNEST to the snippet.

Thanks, Chris! I'll send a PR to add UNNEST.

I think that's the only option. I also tried embedding the elements in a string argument, but it didn't work, apparently the string argument is treated as a single element.

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare`" \
      "WHERE word IN UNNEST(@words) GROUP BY word"
data = bigquery.query sql, params: { words: "'me', 'I', 'you'" }
#=> []

Thanks for your work on this @quartzmo! I think adding UNNEST is a great solution!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

NirKamara picture NirKamara  路  4Comments

quartzmo picture quartzmo  路  3Comments

danicuki picture danicuki  路  4Comments

kirkbyo picture kirkbyo  路  4Comments

echan00 picture echan00  路  4Comments