Etcd: How to scan for list of keys under a prefix, scaling to 100k keys?

Created on 7 Aug 2019 · 8 comments · Source: etcd-io/etcd

Hi, this is a question, not a bug.

Any advice on how to handle a Range() request over a prefix when the number of keys under that prefix is very large (tens of thousands)? I am getting a gRPC error:

    _Rendezvous of RPC that terminated with:
        status = StatusCode.RESOURCE_EXHAUSTED
        details = "Received message larger than max (18053530 vs. 4194304)"
        debug_error_string = "{"created":"@1565191377.968980327","description":"Received message larger than max (18053530 vs. 4194304)","file":"src/core/ext/filters/message_size/message_size_filter.cc","file_line":190,"grpc_status":8}"
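For context, the 4194304 in that error is gRPC's default maximum receive message size on the client side, and the response here was roughly 18 MB. One stopgap (not a real fix, since responses can keep growing) is to raise the client's limit via channel options when the channel is created; the endpoint and the 32 MiB figure below are illustrative assumptions:

```python
# The default gRPC receive limit is 4 MiB; the response above was ~18 MB.
# When building the channel yourself, the limit can be raised with a
# channel option, e.g.:
#
#     channel = grpc.insecure_channel(
#         "localhost:2379",
#         options=[("grpc.max_receive_message_length", 32 * 1024 * 1024)],
#     )
#
# The numbers below simply restate the sizes from the error message.
DEFAULT_MAX = 4 * 1024 * 1024      # 4194304, the gRPC client default
RESPONSE_SIZE = 18053530           # size reported in the error
print(RESPONSE_SIZE > DEFAULT_MAX)
```

Raising the limit only postpones the problem; paginating the range, as discussed below, is the sturdier approach.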

It sounds like I should do this Range incrementally. And I notice that Range has a limit option:

  // limit is a limit on the number of keys returned for the request. When limit is set to 0,
  // it is treated as no limit.
  int64 limit = 3;

But it is not clear how, if I receive a response truncated by the limit, to "continue" the Range and get the next batch of keys.
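The continuation can be expressed without any client library: keep the prefix's range_end fixed (the prefix with its last byte incremented), and restart each page from the smallest key strictly greater than the last key returned, which is that key with a zero byte appended. A plain-Python sketch of both helpers (the same idea as etcd3.utils.increment_last_byte and clientv3.GetPrefixRangeEnd):

```python
def prefix_range_end(prefix: bytes) -> bytes:
    """Smallest key not covered by the prefix: last non-0xff byte + 1."""
    b = bytearray(prefix)
    for i in reversed(range(len(b))):
        if b[i] < 0xFF:
            b[i] += 1
            return bytes(b[: i + 1])
    return b"\x00"  # all-0xff prefix: etcd's marker for "end of keyspace"

def next_range_start(last_key: bytes) -> bytes:
    """Smallest key strictly greater than last_key."""
    return last_key + b"\x00"

print(prefix_range_end(b"/betmessagecache/meta"))  # b"/betmessagecache/metb"
print(next_range_start(b"/betmessagecache/meta/x"))
```

Appending a zero byte matters: incrementing the last byte of the last *returned* key would skip any keys that have it as a proper prefix.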

How do people usually deal with scanning lots of keys?

Labels: question, stale


All 8 comments

Here is an example of ranging keys with page size of 1000.

https://github.com/etcd-io/etcd/blob/master/clientv3/mirror/syncer.go

Thanks, but that uses the Go client; I was looking for a generic gRPC approach (since I am using a Python client). But I managed to figure out a reasonable solution:

My solution is to keep repeating Range requests for as long as response.more is True, keeping the same range_end (which bounds the prefix) but advancing range_start to just past the last key returned (the last key with a zero byte appended; incrementing its last byte could skip keys that share it as a prefix).

Code extract:

    # The prefix's range_end is the prefix with its last byte incremented,
    # so the range covers exactly the keys starting with the prefix.
    range_start = b"/betmessagecache/meta"
    range_end = etcd3.utils.increment_last_byte(range_start)
    while True:
        range_request = etcd3.etcdrpc.RangeRequest()
        range_request.key = range_start
        range_request.keys_only = False
        range_request.range_end = range_end
        range_request.sort_order = etcd3.etcdrpc.RangeRequest.ASCEND
        range_request.sort_target = etcd3.etcdrpc.RangeRequest.KEY
        range_request.serializable = True
        range_request.limit = 1000

        range_response = await etcd.kvstub.Range(
            range_request,
            etcd.timeout,
            credentials=etcd.call_credentials,
            metadata=etcd.metadata,
        )
        for kv in range_response.kvs:
            do_stuff_with_kv(kv)  # application-specific processing
        if not range_response.more:
            break
        # Continue from the smallest key after the last one returned.
        # Appending a zero byte is safe; incrementing the last byte
        # would skip keys such as last_key + b"x".
        range_start = range_response.kvs[-1].key + b"\x00"

I'm going to close this as the question has two answers: one in Go, one in Python. But please do reopen if you think I am doing something stupid.

Unfortunately, I later found that my solution above, while functional, performs terribly: it causes large CPU and latency spikes on the server side.

I suspect it is because I ask the server to sort keys (range_request.sort_order = etcd3.etcdrpc.RangeRequest.ASCEND): with a limit of 1000 and 100k keys, I need 100 requests to scan everything. In other words, the etcd server has to sort 100k keys 100 times.

Alas, I'm beginning to think etcd is the wrong tool for this particular job :(

I'm going to reopen this because I still don't have a satisfactory answer. I need:

  • Get a list of keys under a given prefix;
  • The solution must scale to 100k keys under that prefix;
  • Looking for an algorithm in terms of gRPC etcdv3 primitives, not Go client based;
  • It doesn't have to be fast;
  • It shouldn't cause latency spikes on the etcd cluster.
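The paging algorithm itself can be stated purely in terms of the etcd v3 Range semantics (key, range_end, limit, the `more` flag), independent of any client. Below is a minimal in-memory simulation; `fake_range` is a hypothetical stand-in for the real RPC, and the store contents are illustrative (the sketch assumes the prefix's last byte is below 0xff):

```python
# In-memory model of a sorted keyspace, standing in for etcd.
STORE = {f"/prefix/key{i:05d}".encode(): b"v" for i in range(2500)}

def fake_range(key: bytes, range_end: bytes, limit: int):
    """Return up to `limit` keys in [key, range_end), ascending, plus `more`."""
    hits = sorted(k for k in STORE if key <= k < range_end)
    return hits[:limit], len(hits) > limit

def scan_prefix(prefix: bytes, limit: int = 1000):
    end = prefix[:-1] + bytes([prefix[-1] + 1])  # prefix range end
    start, out = prefix, []
    while True:
        kvs, more = fake_range(start, end, limit)
        out.extend(kvs)
        if not more:
            return out
        start = kvs[-1] + b"\x00"  # continue just past the last key

keys = scan_prefix(b"/prefix/")
print(len(keys))  # 2500: every key visited exactly once, none twice
```

The simulation shows the continuation rule visits each key exactly once; whether a given server can serve the pages cheaply is a separate question.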

The solution I posted above causes huge CPU spikes server-side, and those CPU spikes in turn cause latency spikes.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@gjcarneiro Did you find a good solution for this problem? We are not using the etcd Go client either.

I'm afraid not. I'm still using the same solution, which is not ideal, but at least I only need to do this scan once on application start. After that, the application keeps a Watch on that prefix and from the Watch notifications builds a client-side model of the keys.

Thanks!!

We are using etcd as a distributed KV store in a microservice environment. Multiple apps run range requests against a large data store, and running sorted requests is probably not ideal.

Can we re-open this, just to keep the thread running?

