The schema needs to be updated so that the field is a STRING with mode REPEATED. But with this schema, when I try to load a CSV file, I get the following error:
google.cloud.exceptions.BadRequest: 400 Cannot load CSV data with a repeated field.
Anything I might be missing?
Related SO question: https://stackoverflow.com/questions/45315063/how-to-add-array-of-strings-as-a-schema-value-for-bigquery/45315110
This is a limitation of the CSV file format, which can't really represent multi-valued fields: each "column" corresponds to a single field.
The docs for nested / repeated fields say:
BigQuery supports loading and exporting nested and repeated data in the form of JSON and Avro files.
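One possible workaround, sketched here with the standard library only, is to convert the CSV to newline-delimited JSON before loading, so that the multi-valued column becomes a real JSON array. The column name `tags`, the `|` separator inside the cell, and the helper name `csv_to_ndjson` are all assumptions made for illustration; they do not come from the original question.

```python
import csv
import io
import json


def csv_to_ndjson(csv_text, repeated_column, separator="|"):
    """Convert CSV text to newline-delimited JSON.

    The cell in `repeated_column` is split on `separator` so that it
    becomes a JSON array (hypothetical helper, not part of any library).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        raw = row[repeated_column]
        # An empty cell becomes an empty list rather than [""].
        row[repeated_column] = raw.split(separator) if raw else []
        lines.append(json.dumps(row))
    return "\n".join(lines)


# Assumed sample data: one multi-valued column packed into a single CSV cell.
sample = "name,tags\nalice,red|green\nbob,\n"
ndjson = csv_to_ndjson(sample, "tags")
print(ndjson)
```

Each output line is one JSON object, so the result can be written to a file and loaded as JSON instead of CSV, which is one of the formats the docs list for repeated data.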
I have specified a column with type STRING and mode REPEATED, but when I push array data through Python, it creates multiple rows, each holding a different element of the array.
@dhruvluthra1996 mode='REPEATED' implies that you pass a list/sequence of strings for the value. The behavior you are seeing is because Python strings are themselves sequences (whose elements are one-character strings).
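A small illustration of that point, using plain Python only. The helper `as_repeated` is hypothetical and is just one way to guard against passing a bare string where a list of strings is expected:

```python
def as_repeated(value):
    """Hypothetical helper: normalise a value for a REPEATED STRING field."""
    # A bare string is itself a sequence of one-character strings,
    # so wrap it in a list instead of iterating over it.
    if isinstance(value, str):
        return [value]
    return list(value)


# Iterating over a string yields its characters.
chars = list("abc")
print(chars)

# Wrapping keeps the string intact as a single element.
wrapped = as_repeated("abc")
print(wrapped)

# Real sequences of strings pass through unchanged.
passed_through = as_repeated(("a", "b"))
print(passed_through)
```

So if a plain string is passed as the value of a repeated field, it is treated as a sequence of its characters; passing `["abc"]` keeps it as one value.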