I am trying to understand when it is required to manually configure the session pool.
I read the session docs, and they appear to say that only creators of client libraries need to manage sessions manually. However, one way of interpreting that is that I should still set the session pool config appropriately, and the client library will then manage the sessions according to what I want to achieve.
In the recent changelog, I see that spanner.NewClient now creates the client with a minimum of 100 opened sessions by default.
Am I right to say that in many cases the default config will work well, or do I need to use my own config for better performance?
I also seem to remember Spanner documentation on how to configure the session pool, for example MaxOpened, but I can't find it anymore, and the godoc doesn't appear to explain it.
Hi @LiHaoTan
You normally do not have to manually configure the session pool. In most cases the default settings will work well, and you can safely use the spanner.NewClient method.
If you do need to supply a custom session pool configuration, or other Spanner client configuration, you should create your client using the spanner.NewClientWithConfig method (see https://godoc.org/cloud.google.com/go/spanner#NewClientWithConfig). Supplying a custom configuration, for example for MaxOpened, is done like this:
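// Custom client configuration that sets MaxOpened on the session pool.
config := spanner.ClientConfig{
	SessionPoolConfig: spanner.SessionPoolConfig{
		MaxOpened: 1000,
	},
}
formattedDatabase := fmt.Sprintf("projects/%s/instances/%s/databases/%s", "[PROJECT]", "[INSTANCE]", "[DATABASE]")
client, err := spanner.NewClientWithConfig(ctx, formattedDatabase, config)
if err != nil {
	// Handle the error; the client is not usable in this case.
}
defer client.Close()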
Thank you for your reply.
Also sorry for my unclear phrasing but with regards to configuring the session pool I was actually asking how we should tune the session pool.
For instance (a completely artificial example), assuming I am going to do 2,000 concurrent reads a second, and I am using 10 Kubernetes pods, should I then set MinOpened to 200? And what about things like how many HealthCheckWorkers I need? How fast is BatchCreateSessions and, related to that, what should I set MaxBurst to?
I understand that I should profile my application to figure things out but just wondering if there are some general guidelines.
Thanks for the clarification. As always with these things, the answer depends on the circumstances.
The most important general rule of thumb is: Set MinOpened to at least the number of concurrent transactions that your client will be executing (and you should round up and not down). If you are able to estimate a good value for that, the default values for the other session pool settings are good.
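As a rough illustration of this rule of thumb, the number of sessions a client needs at the same time can be estimated from its request rate and from how long each transaction holds a session. The sketch below assumes simple rate-times-duration arithmetic and an even split of traffic across clients. The helper name `estimateMinOpened` is hypothetical and not part of the Spanner client library, and real workloads should still be measured.

```go
package main

import (
	"fmt"
	"math"
)

// estimateMinOpened is a hypothetical helper, not part of the Spanner client.
// It estimates how many transactions one client has in flight at the same
// time, assuming traffic is spread evenly over all clients, and rounds up.
func estimateMinOpened(totalPerSecond, avgSeconds float64, clients int) uint64 {
	if clients < 1 {
		clients = 1
	}
	perClientRate := totalPerSecond / float64(clients)
	inFlight := perClientRate * avgSeconds // rate x duration = concurrency
	return uint64(math.Ceil(inFlight))
}

func main() {
	// 2,000 reads per second over 10 pods, each read holding a session ~1s.
	fmt.Println(estimateMinOpened(2000, 1.0, 10)) // 200
	// The same load with reads that take ~250ms needs far fewer sessions.
	fmt.Println(estimateMinOpened(2000, 0.25, 10)) // 50
}
```

The result is only a starting point for MinOpened; rounding up rather than down follows the advice above.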
Furthermore, the following applies:
- For your example, MinOpened=200 is a good choice.
- MaxBurst controls the maximum number of sessions that should be created concurrently on demand. If your application normally only needs 100 sessions, but there might be sudden bursts that increase that requirement to 300 sessions, you could set MinOpened=100 and MaxBurst=200. This will ensure that the Spanner client will create up to 200 sessions concurrently if the application is requesting more sessions than are in the pool at that moment. You should only do this if these bursts are uncommon. If it is normal that your application sometimes needs 300 sessions, it is better to set MinOpened=300.
- MaxIdle controls how many sessions beyond MinOpened the session pool will keep in the pool, even though they are considered 'idle' by the pool. It's best explained using an example:
  - Assume MinOpened=100, MaxOpened=400, MaxIdle=10.
  - After a burst, the session pool contains 150 sessions.
  - The session pool maintainer detects that the pool contains more than MinOpened sessions, but that these sessions are not needed. It will start to delete sessions until the session pool contains MinOpened+MaxIdle sessions. In this example the session pool maintainer will reduce the number of sessions in the pool to 110.
- The default values for HealthCheckWorkers and HealthCheckInterval are good for virtually all circumstances and there's no general rule of thumb when these should be changed.
- BatchCreateSessions will only be used to initialize the MinOpened sessions in the pool. This RPC executes in roughly the same time as a single CreateSession RPC, meaning that the total initialization time of a session pool with 100+ sessions can normally be measured in some hundreds of milliseconds (also depending on the network latency between your client and the server).
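The MaxIdle example above comes down to one line of arithmetic. The following is a simplified model of that description, not the maintainer code from the client library; the function name `idleShrinkTarget` is made up for illustration.

```go
package main

import "fmt"

// idleShrinkTarget is a simplified, hypothetical model of the behavior
// described above; it is not the library's maintainer code. Given the current
// pool size, it returns the size the pool would be reduced to once the extra
// sessions are no longer needed.
func idleShrinkTarget(current, minOpened, maxIdle uint64) uint64 {
	keep := minOpened + maxIdle
	if current > keep {
		return keep
	}
	return current // nothing to delete
}

func main() {
	// MinOpened=100, MaxIdle=10, pool grew to 150 during a burst.
	fmt.Println(idleShrinkTarget(150, 100, 10)) // 110
	// A pool that is already at or below MinOpened+MaxIdle is left alone.
	fmt.Println(idleShrinkTarget(105, 100, 10)) // 105
}
```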
Thank you so much for your explanation!
Closing this issue as hopefully the above note has provided the information you needed. Please feel free to reopen if something is not clear.
What is discussed in this issue is the meaning of the settings as of v1.1.0.
I would like to confirm the current state.
- The meaning of MinOpened is not changed and it is the most important setting.
- The meaning of MaxOpened is not changed, and some clients need to change it because the default value (400) is too big or too small for them.
- MaxBurst is deprecated (https://github.com/googleapis/google-cloud-go/pull/4115) by "spanner: increase sessions in batches" (v1.6.0).
- The health check has changed and there is no need to change HealthCheckInterval and HealthCheckWorkers.
- The meaning of MaxIdle is not changed since "spanner: keep better track of max sessions" (v1.1.0).
- The meaning of WriteSessions is not changed and it is meaningful in some situations.
  - If there are only ReadOnly transactions, the principal may not have the spanner.databases.beginOrRollbackReadWriteTransaction permission.

See inline replies.
What is discussed in this issue is the meaning of the settings as of v1.1.0.
I would like to confirm the current state.
- The meaning of MinOpened is not changed and it is the most important setting.
  - In my experience, clients such as CLIs or batch processes, which are short-lived and use few concurrent sessions, are not well served by the default value (100).
The meaning of MinOpened has not changed and is certainly one of the most important settings. For most cases, this value can be kept at the default value, or should be increased if your application is expected to execute a large number of queries / transactions in parallel. A smaller value can also be a good choice if your application is short-lived, as creating 100 sessions, executing one or only a few queries, and then deleting 100 sessions is inefficient. If your application is long-lived, but never does any parallel queries and thereby never really needs more than 1 session, the default value is also OK, as the overhead of creating and deleting sessions is low relative to the total application lifetime. Even in that last case, though, it can make sense to decrease it to a lower value.
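The guidance in this reply can be written down as a small decision function. This is only a sketch of the reasoning: `suggestMinOpened` is a hypothetical helper, and the constant 100 is the default mentioned in this thread rather than a value read from the library.

```go
package main

import "fmt"

// defaultMinOpened mirrors the default of 100 mentioned in this thread.
const defaultMinOpened = 100

// suggestMinOpened is a hypothetical helper that encodes the guidance above:
// short-lived programs size the pool to what they actually use, long-lived
// programs keep the default unless they need more parallelism.
func suggestMinOpened(shortLived bool, peakParallel uint64) uint64 {
	if peakParallel < 1 {
		peakParallel = 1
	}
	if shortLived {
		return peakParallel
	}
	if peakParallel > defaultMinOpened {
		return peakParallel
	}
	return defaultMinOpened
}

func main() {
	fmt.Println(suggestMinOpened(true, 2))    // CLI or batch job: 2
	fmt.Println(suggestMinOpened(false, 1))   // long-lived, serial: 100
	fmt.Println(suggestMinOpened(false, 500)) // long-lived, highly parallel: 500
}
```

For the long-lived serial case the sketch keeps the default, which the reply describes as acceptable; choosing a lower value there is equally valid.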
- The meaning of MaxOpened is not changed, and some clients need to change it because the default value (400) is too big or too small for them.
The meaning of MaxOpened has not changed. There are not many cases where the default value is too big, as this value has no impact on an application that uses fewer than MaxOpened sessions. The only scenario where it could make sense to lower it is if you suspect that your application has a session leak and you want to track it down more quickly. Lowering MaxOpened in combination with setting TrackSessionHandles to true ensures that the session pool is exhausted more quickly, and that it will return a stack dump of the sessions that are checked out at that moment.
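To show why a low cap combined with checkout tracking surfaces a leak quickly, here is a toy pool written with only the standard library. It is not the Spanner session pool and does not use any Spanner API; the type `toyPool` and its methods are invented for this illustration, and the real client reports full stack traces rather than a single file and line.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// toyPool is a deliberately tiny stand-in for a session pool. It is not the
// Spanner implementation; it only illustrates the idea of a hard cap plus
// per-checkout tracking.
type toyPool struct {
	maxOpened int
	nextID    int
	takenAt   map[int]string // session id -> "file:line" of the caller
}

func newToyPool(maxOpened int) *toyPool {
	return &toyPool{maxOpened: maxOpened, takenAt: map[int]string{}}
}

// take hands out a session id and records where it was checked out. When the
// cap is reached it fails and reports every outstanding checkout location.
func (p *toyPool) take() (int, error) {
	if len(p.takenAt) >= p.maxOpened {
		msg := "pool exhausted; sessions still checked out at:"
		for id, where := range p.takenAt {
			msg += fmt.Sprintf("\n  session %d: %s", id, where)
		}
		return 0, errors.New(msg)
	}
	_, file, line, _ := runtime.Caller(1)
	p.nextID++
	p.takenAt[p.nextID] = fmt.Sprintf("%s:%d", file, line)
	return p.nextID, nil
}

// release returns a session to the pool.
func (p *toyPool) release(id int) { delete(p.takenAt, id) }

func main() {
	pool := newToyPool(2) // a low cap makes the leak show up quickly
	a, _ := pool.take()
	pool.release(a) // correct usage
	pool.take()     // leaked: never released
	pool.take()     // leaked: never released
	if _, err := pool.take(); err != nil {
		fmt.Println(err) // lists the two leaked checkout locations
	}
}
```

With a cap of 400 the same leak would take far longer to exhaust the pool, which is the reason given above for lowering MaxOpened while hunting for it.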
- MaxBurst is deprecated (#4115) by "spanner: increase sessions in batches" (v1.6.0).
Correct. This setting does not have any function anymore, and is only kept around to prevent compilation failures in existing applications.
- The health check has changed and there is no need to change HealthCheckInterval and HealthCheckWorkers.
Correct.
spanner: update the health check interval (v1.4.0)
- It seems that the default value of HealthCheckInterval was changed to 50 minutes, but this is not reflected in the godoc.
Correct, and good spot on the missing change to the godoc. I'll update that.
spanner: switch the session keepalive method from GetSession to SELECT 1 (v1.5.0)
- This ping query properly prevents idle sessions from being dropped.
Correct. The change from GetSession to SELECT 1 should be considered an internal implementation detail, and was mainly done to be consistent with the client libraries for other languages.
- The meaning of MaxIdle is not changed since "spanner: keep better track of max sessions" (v1.1.0).
  - It is a minor setting (java-spanner deprecated the MaxIdleSessions configuration option).
Correct, the meaning and behavior of this option have not changed. It is a setting that is normally not needed, and in most cases it's better to just use a higher MinOpened value.
- The meaning of WriteSessions is not changed and it is meaningful in some situations.
  - The client library maintains a pool of write-prepared sessions and currently does not use inline begin transaction like java-spanner (googleapis/java-spanner#325).
    If the ratio of ReadOnly to ReadWrite transactions is very lopsided, it makes sense to set it.
    In particular, if there are only ReadOnly transactions, it is better to set it to 0.0.
  - In this case, the principal may not have the spanner.databases.beginOrRollbackReadWriteTransaction permission.
All the above is correct. This setting is still used in the Go client library to maintain a number of write-prepared sessions in the pool, and it can be useful to tweak this value if your application has a significantly different read/write ratio than reflected in this setting.
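One way to pick a starting value is to use the share of read/write transactions in the observed workload. That is an assumption made for this sketch, not documented library behavior, and `suggestWriteSessions` is a hypothetical helper name.

```go
package main

import "fmt"

// suggestWriteSessions is a hypothetical heuristic, not library behavior: it
// returns the share of read/write transactions in the observed workload as a
// candidate value for WriteSessions.
func suggestWriteSessions(readOnlyTx, readWriteTx uint64) float64 {
	total := readOnlyTx + readWriteTx
	if total == 0 || readWriteTx == 0 {
		return 0.0 // read-only workloads need no write-prepared sessions
	}
	return float64(readWriteTx) / float64(total)
}

func main() {
	fmt.Println(suggestWriteSessions(1000, 0))  // 0
	fmt.Println(suggestWriteSessions(750, 250)) // 0.25
}
```

A purely read-only workload yields 0.0, which matches the suggestion quoted above.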
Thanks for answering!