Google-cloud-python: Excluded indexes from nested Array

Created on 25 Mar 2018 · 9 Comments · Source: googleapis/google-cloud-python

Hi team,
Currently I can't exclude properties inside an array from indexing when using google-cloud-python.

My model is quite complex: one flow contains many child flows, and each child can contain child flows of its own.

Flow model:

{
  title: 'name',
  description: 'LONG_TEXT_HERE',
  flow_children: [ 
      {
        title: 'name',
        description: 'LONG_TEXT_HERE',
        flow_children: []
    }
   ] 
}

I tried to follow the documentation and exclude the indexes in several ways, but none of them worked.

task = datastore.Entity(key=key, exclude_from_indexes=['consideration','description','flow_children','flow_children[].description'])

I still get the error below:
google.api_core.exceptions.BadRequest: 400 The value of property "description" is longer than 1500 bytes.

Then I tried using the Google Cloud console to update the entity's value and index setting directly, and that works.

{
  "values": [
    {
      "entityValue": {
        "properties": {
          "flow_children": {
            "arrayValue": {
              "values": [
                {
                  "entityValue": {
                    "properties": {
                      "description": {
                        "stringValue": "LONG_TEXT_HERE",
                        "excludeFromIndexes": true
                      },
                      "type": {
                        "stringValue": "flowstep"
                      },
                      "title": {
                        "stringValue": "Go to back yard"
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  ]
}

Could anyone help me with this case, please? :(

Thanks
My environment:
Python 3.6.4
google-cloud-datastore 1.4.0
macOS High Sierra

question datastore

All 9 comments

Hello, it seems like this is similar to your issue? Could you reproduce the problem with the code sample there?

Thanks for your reply @chemelnucfin

I already read the issue above, but it applies to embedded entities.
My case is an array.
Here is my sample code:

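import datetime
import uuid

# Flask is assumed here, based on the @app.route and jsonify calls below.
from flask import Flask, jsonify
from google.cloud import datastore

app = Flask(__name__)


@app.route('/')
def partner():
    ds = datastore.Client()

    data = {
        'timestamp': datetime.datetime.utcnow(),
        'test1': 'test 111',
        'test2': 'test 2',
        'long_text': 'LONG_TEXT_HERE',
        'flow_children': [
            {
                'long_text': 'LONG_TEXT_HERE'
            }
        ]
    }

    _id = str(uuid.uuid4())

    key = ds.key('Test', _id)
    # Three attempts at excluding the nested 'long_text' from indexing.
    entity = datastore.Entity(
        key=key,
        exclude_from_indexes=["long_text", "flow_children", "flow_children[].long_text"],
    )
    entity.update(data)
    ds.put(entity)
    data['partner_id'] = _id
    output = jsonify(data)
    return output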

By the way, does Datastore have an option to skip all indexes by default, so that an index is only created on demand (when I need it)? Currently indexing is automatic, which is quite annoying.

Haven't solved it yet, but here's reproducible code:
```
import unittest
import uuid

from google.cloud.datastore import entity

# clone_client and Config are assumed to be helpers from the datastore
# system tests; they are not defined in this snippet.


class TestLongBytes(unittest.TestCase):

    def test_long_bytes(self):
        # Case 1: long string at the top level and in a dict inside a list.
        client = clone_client(Config.CLIENT)
        s = 'a' * 1505
        _id = str(uuid.uuid4())
        key = client.key('Test', _id)
        data = {'long_t': s, 'long': [{'long_text': s}]}
        ent = entity.Entity(key, exclude_from_indexes=['long_t', 'long'])
        ent.update(data)
        client.put(ent)

    def test_long_bytes2(self):
        # Case 2: long string at the top level only, excluded from indexes.
        client = clone_client(Config.CLIENT)
        s = 'a' * 1505
        _id = str(uuid.uuid4())
        key = client.key('Test', _id)
        data = {'long_t': s}
        ent = entity.Entity(key, exclude_from_indexes=['long_t'])
        ent.update(data)
        client.put(ent)

    def test_long_bytes3(self):
        # Case 3: long string at the top level, nothing excluded.
        client = clone_client(Config.CLIENT)
        s = 'a' * 1505
        _id = str(uuid.uuid4())
        key = client.key('Test', _id)
        data = {'long_t': s}
        ent = entity.Entity(key)
        ent.update(data)
        client.put(ent)

```
Cases 1 and 3 fail; case 3 is expected to, since it excludes nothing.

@tseaver @jonparrott so it seems like exclude_from_indexes does not propagate down to dictionaries inside a list.

Do we want exclude_from_indexes to take the full path of a property (as in splitting on dots or something), or should just the property name match anywhere?
As in my first example, would you exclude 'long_text', 'long.long_text', long[long_text], or something else?

@duyphuong13 For nested values to be excluded, they must be created as Entity instances, rather than bare dicts. So, for your example, I wrote this gist.
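The gist itself is not reproduced on this page, so here is a minimal sketch of the same idea, not the original gist: every dict, including dicts inside lists, is wrapped in its own Entity carrying its own exclude_from_indexes. The helper name `to_entity` and its `entity_cls` parameter are hypothetical; only `datastore.Entity(exclude_from_indexes=...)` comes from the library.

```python
def to_entity(value, exclude=(), entity_cls=None):
    """Recursively wrap dicts (including dicts inside lists) in Entity objects.

    `exclude` lists property names to leave unindexed at every nesting level.
    `entity_cls` is a hypothetical hook; it defaults to datastore.Entity.
    """
    if entity_cls is None:
        # Imported lazily so the helper can be tried with a stand-in class.
        from google.cloud import datastore
        entity_cls = datastore.Entity

    if isinstance(value, dict):
        # Each level only names the excluded properties it actually has.
        names = [name for name in exclude if name in value]
        entity = entity_cls(exclude_from_indexes=names)
        for name, child in value.items():
            entity[name] = to_entity(child, exclude, entity_cls)
        return entity
    if isinstance(value, list):
        return [to_entity(item, exclude, entity_cls) for item in value]
    # Scalars (strings, numbers, datetimes, ...) are stored unchanged.
    return value
```

For the flow model above this would be called as `flow = to_entity(data, exclude=['description'])`, followed by `flow.key = client.key('Flow', flow_id)` and `client.put(flow)`, since only the top-level entity needs a key.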

I'm going to close preemptively. Please feel free to follow up / reopen if my answer above doesn't work for you.

@tseaver If I do as you mention, I get:

File "/env/lib/python3.6/site-packages/google/cloud/datastore/client.py", line 438, in put_multi
current.put(entity)
File "/env/lib/python3.6/site-packages/google/cloud/datastore/batch.py", line 188, in put
raise ValueError("Entity must have a key")
ValueError: Entity must have a key

What I'm doing is translatable to this code:

entity = datastore.Entity()
entity['small_val'] = small_val
entity['sub_dict'] = datastore.Entity()
entity['sub_dict'].update(has_large_values)

if len(entity['sub_dict']['maybe_large_value']) > large_value:
    entity['sub_dict'].exclude_from_indexes.add('maybe_large_value')

client.put(entity)

How do I solve this, then?
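One reading of the traceback, offered as a sketch rather than a confirmed fix: the error is raised for the entity handed to client.put(), which here is the outer `datastore.Entity()` created without a key, while the nested entity can stay keyless. The helper names below (`oversized_properties`, `build_entity`) are hypothetical, and the 1500-byte threshold is taken from the 400 error quoted earlier in the thread.

```python
INDEXED_VALUE_LIMIT = 1500  # bytes, per the 400 error quoted in this thread


def oversized_properties(properties, limit=INDEXED_VALUE_LIMIT):
    """Return the names of str/bytes values longer than `limit` bytes."""
    names = []
    for name, value in properties.items():
        if isinstance(value, str):
            size = len(value.encode("utf-8"))  # assumes the limit counts UTF-8 bytes
        elif isinstance(value, bytes):
            size = len(value)
        else:
            continue
        if size > limit:
            names.append(name)
    return names


def build_entity(client, kind, small_val, sub_dict):
    """Build a keyed parent holding a keyless nested entity (hypothetical helper)."""
    from google.cloud import datastore  # deferred: only needed when called

    # Only the entity passed to client.put() needs a key; a partial key is enough.
    parent = datastore.Entity(key=client.key(kind))
    parent["small_val"] = small_val

    # The nested entity carries its own exclusions and needs no key.
    child = datastore.Entity(exclude_from_indexes=oversized_properties(sub_dict))
    child.update(sub_dict)
    parent["sub_dict"] = child
    return parent
```

With this, `client.put(build_entity(client, 'MyKind', small_val, has_large_values))` would replace the snippet above; 'MyKind' is a placeholder kind name.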

For anyone else struggling with this one and landing on this page, much like myself, here's a snippet that is a bit more comprehensible than the one above IMO:

key = client.key("MyKind", f"{account_id}_{integration_id}")
# this is not enough to not index the secret
integration = datastore.Entity(key=key, exclude_from_indexes=["secret"])

# ensure that nested fields are not indexed - *le sigh*
keys = secret.keys()
secret_entity = datastore.Entity(exclude_from_indexes=list(keys))
secret_entity.update(secret)


record = {
    "integration_id": integration_id,
    "account_id": account_id,
    "secret": secret_entity,
    "created_at": datetime.utcnow(),
    "updated_at": datetime.utcnow(),
    "visible_fields": visible_fields or [],
}
integration.update(record)

client.put(integration)