BigQuery Go Client
Linux 4.15.11-1-ARCH SMP PREEMPT x86_64 GNU/Linux
go version go1.10.1 linux/amd64
I am trying to stream into BigQuery.
It should just insert the data without 404 table not found error since the table already exists.
404 table not found
Before running go run the package was updated with go get -u cloud.google.com/go/bigquery
go run checkTable.go
2018/04/19 14:38:20 googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable14, notFound
2018/04/19 14:38:20 Creating table too
panic: googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable14, notFound
goroutine 1 [running]:
main.main()
/home/core/go/src/gcf_hello_world/checkTable.go:53 +0x991
exit status 2
I am able to consitently reproduce it with the code below:
First make sure the dataset exists
In the below code, change the tableID to a table which does not exist yet.
It will first insert without creating table, which is expected to fail.
Then it will create new table
Finally, it will insert again but this time also we'll end up with 404 table not found error.
Here is my project ID: pureapp-199410 (API request logs might help here)
In checkTable.go I have:
package main
import (
"context"
"log"
"cloud.google.com/go/bigquery"
"google.golang.org/api/option"
)
const (
projectID = "pureapp-199410"
seperator = "."
)
type Item struct {
Name string
Count int
}
func main() {
ctx := context.Background()
client, err := bigquery.NewClient(ctx, projectID, option.WithCredentialsFile("creds.json"))
if err != nil {
panic(err)
}
datasetID := "dream4"
tableID := "sometable13"
items := []*Item{
// Item implements the ValueSaver interface.
{Name: "n1", Count: 7},
{Name: "n2", Count: 2},
{Name: "n3", Count: 1},
}
u := client.Dataset(datasetID).Table(tableID).Uploader()
if err := u.Put(ctx, items); err != nil {
log.Println(err)
} else {
log.Println("Insert successful")
}
log.Println("Creating table too")
table := client.Dataset(datasetID).Table(tableID)
ctx1 := context.Background()
if err := table.Create(ctx1, &bigquery.TableMetadata{Schema: schema}); err != nil {
panic(err)
}
u = client.Dataset(datasetID).Table(tableID).Uploader()
ctx2 := context.Background()
if err := u.Put(ctx2, items); err != nil {
panic(err)
} else {
log.Println("Insert successful")
}
}
var schema bigquery.Schema = []*bigquery.FieldSchema{
{Name: "Name", Type: bigquery.StringFieldType},
{Name: "Count", Type: bigquery.IntegerFieldType},
}
There is a delay between creating a table and its availability.
Test 1
go run run.go
2018/04/19 20:33:44 Insert retry attempt 1 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:44 Creating table only
2018/04/19 20:33:45 Table successfully created
2018/04/19 20:33:45 Waiting for 3ms before next attempt
2018/04/19 20:33:45 Insert retry attempt 2 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:45 Creating table only
2018/04/19 20:33:45 Table already exists
2018/04/19 20:33:45 Waiting for 11ms before next attempt
2018/04/19 20:33:46 Insert retry attempt 3 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:46 Creating table only
2018/04/19 20:33:46 Table already exists
2018/04/19 20:33:46 Waiting for 15ms before next attempt
2018/04/19 20:33:47 Insert retry attempt 4 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:47 Creating table only
2018/04/19 20:33:47 Table already exists
2018/04/19 20:33:47 Waiting for 25ms before next attempt
2018/04/19 20:33:48 Insert retry attempt 5 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:48 Creating table only
2018/04/19 20:33:48 Table already exists
2018/04/19 20:33:48 Waiting for 33ms before next attempt
2018/04/19 20:33:49 Insert retry attempt 6 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:49 Creating table only
2018/04/19 20:33:49 Table already exists
2018/04/19 20:33:49 Waiting for 72ms before next attempt
2018/04/19 20:33:50 Insert retry attempt 7 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:50 Creating table only
2018/04/19 20:33:50 Table already exists
2018/04/19 20:33:50 Waiting for 133ms before next attempt
2018/04/19 20:33:51 Insert retry attempt 8 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:51 Creating table only
2018/04/19 20:33:51 Table already exists
2018/04/19 20:33:51 Waiting for 256ms before next attempt
2018/04/19 20:33:52 Insert retry attempt 9 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:52 Creating table only
2018/04/19 20:33:53 Table already exists
2018/04/19 20:33:53 Waiting for 518ms before next attempt
2018/04/19 20:33:54 Insert retry attempt 10 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:54 Creating table only
2018/04/19 20:33:54 Table already exists
2018/04/19 20:33:54 Waiting for 1.024s before next attempt
2018/04/19 20:33:56 Insert retry attempt 11 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:56 Creating table only
2018/04/19 20:33:56 Table already exists
2018/04/19 20:33:56 Waiting for 2.052s before next attempt
2018/04/19 20:33:59 Insert retry attempt 12 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:33:59 Creating table only
2018/04/19 20:33:59 Table already exists
2018/04/19 20:33:59 Waiting for 4.097s before next attempt
2018/04/19 20:34:03 Insert retry attempt 13 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:34:03 Creating table only
2018/04/19 20:34:04 Table already exists
2018/04/19 20:34:04 Waiting for 8.194s before next attempt
2018/04/19 20:34:13 Insert retry attempt 14 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:34:13 Creating table only
2018/04/19 20:34:13 Table already exists
2018/04/19 20:34:13 Waiting for 16.393s before next attempt
2018/04/19 20:34:30 Insert retry attempt 15 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:34:30 Creating table only
2018/04/19 20:34:30 Table already exists
2018/04/19 20:34:30 Waiting for 32.776s before next attempt
2018/04/19 20:35:03 Insert retry attempt 16 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable32, notFound
2018/04/19 20:35:03 Creating table only
2018/04/19 20:35:04 Table already exists
2018/04/19 20:35:04 Waiting for 1m5.54s before next attempt
2018/04/19 20:36:10 Inserting data successful after 2m27.182891398s
Test2
go run run.go
2018/04/19 20:37:16 Insert retry attempt 1 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:16 Creating table only
2018/04/19 20:37:17 Table successfully created
2018/04/19 20:37:17 Waiting for 3ms before next attempt
2018/04/19 20:37:17 Insert retry attempt 2 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:17 Creating table only
2018/04/19 20:37:17 Table already exists
2018/04/19 20:37:17 Waiting for 11ms before next attempt
2018/04/19 20:37:18 Insert retry attempt 3 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:18 Creating table only
2018/04/19 20:37:19 Table already exists
2018/04/19 20:37:19 Waiting for 15ms before next attempt
2018/04/19 20:37:19 Insert retry attempt 4 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:19 Creating table only
2018/04/19 20:37:20 Table already exists
2018/04/19 20:37:20 Waiting for 25ms before next attempt
2018/04/19 20:37:20 Insert retry attempt 5 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:20 Creating table only
2018/04/19 20:37:21 Table already exists
2018/04/19 20:37:21 Waiting for 33ms before next attempt
2018/04/19 20:37:21 Insert retry attempt 6 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:21 Creating table only
2018/04/19 20:37:21 Table already exists
2018/04/19 20:37:21 Waiting for 72ms before next attempt
2018/04/19 20:37:22 Insert retry attempt 7 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:22 Creating table only
2018/04/19 20:37:22 Table already exists
2018/04/19 20:37:22 Waiting for 133ms before next attempt
2018/04/19 20:37:23 Insert retry attempt 8 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:23 Creating table only
2018/04/19 20:37:23 Table already exists
2018/04/19 20:37:23 Waiting for 256ms before next attempt
2018/04/19 20:37:24 Insert retry attempt 9 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:24 Creating table only
2018/04/19 20:37:24 Table already exists
2018/04/19 20:37:24 Waiting for 518ms before next attempt
2018/04/19 20:37:25 Insert retry attempt 10 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:25 Creating table only
2018/04/19 20:37:26 Table already exists
2018/04/19 20:37:26 Waiting for 1.024s before next attempt
2018/04/19 20:37:27 Insert retry attempt 11 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:27 Creating table only
2018/04/19 20:37:27 Table already exists
2018/04/19 20:37:27 Waiting for 2.052s before next attempt
2018/04/19 20:37:30 Insert retry attempt 12 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:30 Creating table only
2018/04/19 20:37:30 Table already exists
2018/04/19 20:37:30 Waiting for 4.097s before next attempt
2018/04/19 20:37:35 Insert retry attempt 13 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:35 Creating table only
2018/04/19 20:37:35 Table already exists
2018/04/19 20:37:35 Waiting for 8.194s before next attempt
2018/04/19 20:37:44 Insert retry attempt 14 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:37:44 Creating table only
2018/04/19 20:37:45 Table already exists
2018/04/19 20:37:45 Waiting for 16.393s before next attempt
2018/04/19 20:38:01 Insert retry attempt 15 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:38:01 Creating table only
2018/04/19 20:38:02 Table already exists
2018/04/19 20:38:02 Waiting for 32.776s before next attempt
2018/04/19 20:38:35 Insert retry attempt 16 error googleapi: Error 404: Not found: Table pureapp-199410:dream4.sometable33, notFound
2018/04/19 20:38:35 Creating table only
2018/04/19 20:38:35 Table already exists
2018/04/19 20:38:35 Waiting for 1m5.54s before next attempt
2018/04/19 20:39:41 Inserting data successful after 2m26.84197699s


I think you answered your own question. Creating a table is an eventually consistent operation: it may not appear to exist everywhere until after some time has passed. In the screenshots above, the second request, which happens 2 seconds after the first, may end up at a different location where the creation hasn't yet propagated.
Have you considered Uploader.TableTemplateSuffix as a way to create the table automatically?
I started with TableTemplateSuffix but I get the impression that it can't use a table from a dataset as a template for a new table in a different dataset. So, if I create a new dataset, there is no way to base my new table on anything and the only option I am left with is to create it from a schema. Please correct me if I am wrong here.
I am creating datasets and tables dynamically based on the fields of the event. If the insertAll request fails, I create either both dataset and table or just table based on the error response. (I parse the error to know what does not exist)
I am reading from Kinesis steams and plan to use Lambda for streaming into BQ.
Do you know what strategy Apache Beam uses to get around these limitations of dynamically creating datasets/tables and streaming data into them without waiting for 10 minutes and retrying?
Yes, BigQuery's streaming inserts definitely exhibit eventually consistent behaviors when you reactively create a table in response to a streaming notFound response. This is covered in the streaming section of the BigQuery troubleshooting docs.
However, its not clear to me from the comments whether you're dealing with constantly evolving schemas, or if there's a known set of messages/schemas and you're just inspecting and redirecting them based on message structure.
If its the latter case, having the tables created beforehand should likely suffice, as streaming to a table that already exists should encounter none of the eventual consistency issues you're observing with your current approach.
However, if the schemas are truly dynamic, one of the things you might consider is leveraging BigQuery's ability to use complex types, particularly arrays of structs.
Consider the following schema:
_ = bigquery.Schema{
{
Name: "event_time",
Type: bigquery.TimestampFieldType,
},
{
Name: "sender_id",
Type: bigquery.StringFieldType,
},
{
Name: "destination_id",
Type: bigquery.StringFieldType,
},
{
Name: "message_type",
Type: bigquery.StringFieldType,
},
{
Name: "metrics",
Type: bigquery.RecordFieldType,
Repeated: true,
Schema: bigquery.Schema{
{
Name: "key",
Type: bigquery.StringFieldType,
},
{
Name: "value",
Type: bigquery.StringFieldType,
},
{
Name: "annotation",
Type: bigquery.StringFieldType,
},
},
},
}
Within it, you have a set of known top level fields (event_time, sender, destination, etc), but "metrics" is an array of key/value structs, which can receive many unrelated values and does not require schema evolution. You can access/manipulate filter the arrays as needed using Standard SQL, which has its own topic in the query documentation.
Closing, but please reopen if you have more questions or concerns.
@jba Main concern is that nowhere in documentation it mentions anything like "table will eventually be there" and no recommended solutions offered either. Everyone has to figure this out on their own which is wasteful.
I spent a few hours yesterday thinking I was going mad. It would be nice if user experience was better in this regard.
Hello, I've got the same issue. Due to some reasons I can't use TableTemplateSuffix and I need to create table dynamically and stream data to it. I have tried to request TableMetadata and it response successfully but Inserter still returns 404 error right after table.Metadata() success.
There would be really useful some kind of Available() method on Table
@jba
Most helpful comment
@jba Main concern is that nowhere in documentation it mentions anything like "table will eventually be there" and no recommended solutions offered either. Everyone has to figure this out on their own which is wasteful.
I spent a few hours yesterday thinking I was going mad. It would be nice if user experience was better in this regard.