Google-cloud-ruby: How can I patch a complex BigQuery schema?

Created on 24 Feb 2017 · 10 Comments · Source: googleapis/google-cloud-ruby

Given this schema, how can I add another field to the nested custom_data record?

[
    {
        "name": "action",
        "type": "STRING"
    },
    {
        "name": "properties",
        "type": "RECORD",
        "fields": [
            {
                "name": "revenue",
                "type": "FLOAT"
            },
            {
                "name": "custom_data",
                "type": "RECORD",
                "fields": [
                    {
                        "name": "store_name",
                        "type": "STRING"
                    }
                ]
            }
        ]
    }
]

[Update 2017/02/24, @quartzmo:]

Deliverables:

  • [ ] Create table schema with nested records
  • [x] The ability to get an existing field on a schema so users don't have to redefine the entire table schema

Labels: bigquery, p2, acknowledged, will not fix, question

All 10 comments

Hi @arthurbailao, does this example help?

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

Hi @quartzmo, thanks for the help!!

I can't figure out how to add a single field to the custom_data nested field. Do I need to reproduce the whole mapping just to add a single field?

Furthermore, I got the following error when trying to define a nested record type field:

table.schema do |schema|
  schema.record "properties" do |properties|
    properties.record "custom_data" do |custom_data|
      custom_data.string "store_name"
    end
  end
end
ArgumentError: nested RECORD type is not permitted

@blowmage Can you remember why this check is here? (Why is deeper nesting disallowed?)

@arthurbailao Regarding your original topic:

"How can I patch a complex BigQuery schema?"
"I can't figure out how to add a single field to the custom_data nested field. Do I need to reproduce the whole mapping just to add a single field?"

My understanding is that we need to supply the complete schema in the patch request, as explained in detail in this Stack Overflow answer. So this means that you must reproduce the whole mapping, unfortunately.
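To illustrate what "reproduce the whole mapping" means in practice, here is a minimal sketch that works on the schema as plain JSON, without calling the gem or the BigQuery API. It loads the schema from the original question, appends one field to the nested custom_data record, and prints the complete schema that a patch request would then have to carry. The helper name `nested_fields` and the new `store_id` INTEGER field are illustrative assumptions, not part of the library.

```ruby
require "json"

# Current schema, copied from the question above.
schema = JSON.parse(<<~SCHEMA)
  [
    { "name": "action", "type": "STRING" },
    { "name": "properties", "type": "RECORD", "fields": [
      { "name": "revenue", "type": "FLOAT" },
      { "name": "custom_data", "type": "RECORD", "fields": [
        { "name": "store_name", "type": "STRING" }
      ] }
    ] }
  ]
SCHEMA

# Hypothetical helper (not part of google-cloud-bigquery): follow a path of
# RECORD names and return the "fields" array of the innermost record.
def nested_fields(fields, path)
  path.reduce(fields) do |current, name|
    record = current.find { |f| f["name"] == name && f["type"] == "RECORD" }
    raise KeyError, "no RECORD field named #{name}" unless record
    record["fields"]
  end
end

# Add the single new field in place; every existing field stays untouched.
nested_fields(schema, %w[properties custom_data]) <<
  { "name" => "store_id", "type" => "INTEGER" }

# The full schema, old fields plus the new one, is what gets sent.
full_schema_json = JSON.pretty_generate(schema)
puts full_schema_json
```

The point of the sketch is that only one hash is added, yet the serialized result still contains every existing field, because the request body is the whole schema rather than a diff.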

We disallowed nested records because at the time BigQuery did not allow nested records and gave a pretty obscure error. So the ArgumentError was added. But this looks to be supported by BigQuery now, along with a bunch of other things.

We are actively working on BigQuery and hope to have a substantial update soon. Creating tables with nested records will be included in that. @quartzmo, I think we should add the ability to get an existing field on a schema so users don't have to redefine the entire table schema.

SGTM, I will update the description of this issue with a checklist.

Great guys!! Thank you!!

This is now covered by follow-up issues.

@arthurbailao Last Friday we released a new google-cloud-bigquery version that adds some convenience methods to make modifying an existing schema easier. Adding a store_id column to custom_data can now look like this:

table.schema do |s|
  s.field["properties"].field["custom_data"].integer "store_id"
end

Thanks for opening this issue! If you have any problems with this new release please open another issue.

@blowmage I tested the new version and it works perfectly for me, thanks!!! Just fixing a typo in your code snippet:

table.schema do |s|
  s.field("properties").field("custom_data").integer "store_id"
end
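
For readers wondering why the bracket form fails, here is a small self-contained sketch. It assumes, as the corrected snippet implies, that `field` is an ordinary method taking the field name as a required argument. `FakeSchema` is a hypothetical stand-in written for this illustration, not the gem's schema class.

```ruby
# Hypothetical stand-in for a schema object; NOT the google-cloud-bigquery class.
class FakeSchema
  def initialize(fields)
    @fields = fields
  end

  # Look up a child by name; the name is a required argument.
  def field(name)
    @fields.fetch(name)
  end
end

custom_data = FakeSchema.new({})
properties  = FakeSchema.new({ "custom_data" => custom_data })
schema      = FakeSchema.new({ "properties" => properties })

# Parentheses pass the name to `field`, so the lookups chain as intended.
found = schema.field("properties").field("custom_data")

# Brackets make Ruby call `field` with no arguments first and only then
# index the result, so the call fails before "properties" is ever used.
bracket_error =
  begin
    schema.field["properties"]
    nil
  rescue ArgumentError => e
    e.message
  end

puts found.equal?(custom_data)
puts bracket_error
```

In this stand-in the bracket form raises an ArgumentError about the missing argument, while the parenthesized form returns the nested object.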