Considering this schema, how can I add another custom_data field?
[
  {
    "name": "action",
    "type": "STRING"
  },
  {
    "name": "properties",
    "type": "RECORD",
    "fields": [
      {
        "name": "revenue",
        "type": "FLOAT"
      },
      {
        "name": "custom_data",
        "type": "RECORD",
        "fields": [
          {
            "name": "store_name",
            "type": "STRING"
          }
        ]
      }
    ]
  }
]
[Update 2017/02/24, @quartzmo:]
Deliverables:
Hi @arthurbailao, does this example help?
table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end
Hi @quartzmo, thanks for the help!!
I can't figure out how to add a single field to the custom_data nested field. Do I need to reproduce the whole mapping just to add a single field?
Furthermore, I got the following error when trying to define a nested record type field:
table.schema do |schema|
  schema.record "properties" do |properties|
    properties.record "custom_data" do |custom_data|
      custom_data.string "store_name"
    end
  end
end
ArgumentError: nested RECORD type is not permitted
@blowmage Can you remember why this check is here? (Why is deeper nesting disallowed?)
@arthurbailao Regarding your original topic:
How can I patch a complex BigQuery schema?
I can't figure out how to add a single field to the custom_data nested field. Do I need to reproduce the whole mapping just to add a single field?
My understanding is that we need to supply the complete schema in the patch request, as explained in detail in this Stack Overflow answer. So this means that you must reproduce the whole mapping, unfortunately.
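To illustrate what "supply the complete schema" means in practice, here is a minimal sketch in plain Ruby that works on the JSON schema from the original question. It does not call the google-cloud-bigquery gem or the BigQuery API at all; `add_nested_field` is a hypothetical helper written just for this example, and the `store_id` INTEGER field is an assumed addition. The idea is to copy the existing schema, append the new field under properties.custom_data, and end up with the whole mapping ready to be sent.

```ruby
require "json"

# Hypothetical helper (not part of google-cloud-bigquery): returns a complete
# copy of `schema` with `new_field` appended to the RECORD reached by `path`.
def add_nested_field(schema, path, new_field)
  # Deep-copy through JSON so the caller's schema is left untouched.
  updated = JSON.parse(JSON.generate(schema))

  fields = updated
  path.each do |name|
    record = fields.find { |f| f["name"] == name }
    raise ArgumentError, "no field named #{name}" unless record
    raise ArgumentError, "#{name} is not a RECORD" unless record["type"] == "RECORD"
    fields = (record["fields"] ||= [])
  end

  fields << new_field
  updated
end

# The schema from the original question.
schema = JSON.parse(<<~SCHEMA)
  [
    { "name": "action", "type": "STRING" },
    { "name": "properties", "type": "RECORD", "fields": [
      { "name": "revenue", "type": "FLOAT" },
      { "name": "custom_data", "type": "RECORD", "fields": [
        { "name": "store_name", "type": "STRING" }
      ] }
    ] }
  ]
SCHEMA

# Assumed new field: an INTEGER column named store_id.
full_schema = add_nested_field(
  schema,
  %w[properties custom_data],
  { "name" => "store_id", "type" => "INTEGER" }
)

# The result is the entire schema, not just the added field.
puts JSON.pretty_generate(full_schema)
```

The printed document still contains action, revenue, and store_name alongside the new field, which is the shape a full-schema patch would need.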
We disallowed nested records because at the time BigQuery did not allow nested records and gave a pretty obscure error. So the ArgumentError was added. But this looks to be supported by BigQuery now, along with a bunch of other things.
We are actively working on BigQuery and hope to have a substantial update soon. Creating tables with nested records will be included in that. @quartzmo, I think we should add the ability to get an existing field on a schema so users don't have to redefine the entire table schema.
SGTM, I will update the description of this issue with a checklist.
Great guys!! Thank you!!
This is now covered by follow-up issues.
@arthurbailao Last Friday we released a new google-cloud-bigquery version that adds some convenience methods that make modifying an existing schema easier. Adding a store_id column to custom_data can now look like this:
table.schema do |s|
  s.field["properties"].field["custom_data"].integer "store_id"
end
Thanks for opening this issue! If you have any problems with this new release please open another issue.
@blowmage I tested the new version and it works perfectly for me, thanks!!! Just fixing a typo in your code snippet:
table.schema do |s|
  s.field("properties").field("custom_data").integer "store_id"
end