node_redis: 2.7.1
redis: 3.2
platform: Amazon linux
description:
When redis' snapshotting occurs it disconnects the redis client. The client reconnects and resubscribes, however, they no longer receive data. This can be reproduced by issuing a SAVE
command via the redis-cli.
JS
const redis = require('redis');
const redisClient = redis.createClient({
host: config.redisEndpoint,
port: config.redisPort,
});
redisClient.on('connect', () => {
console.log('Connected to Redis');
console.log('subscription_set:', redisClient.subscription_set);
});
redisClient.on('reconnecting', (stats) => {
console.log('Reconnecting to Redis', stats);
});
redisClient.on('error', (error) => {
console.log('Failed to connect to Redis: ', error);
});
redisClient.on('subscribe', (channel, count) => { // eslint-disable-line no-unused-vars
console.log('Subscribed to: ', channel, count);
});
redisClient.on('message', (channel, message) => { // eslint-disable-line no-unused-vars
console.log('Received from: ', channel, message);
});
redisClient.subscribe('test');
Logs
Initial connect
info: Connected to Redis
info: subscription_set:
info: Subscribed to: test 1
Issuing a SAVE command in redis-cli
info: Error: Redis connection to 34.213.3.19:6379 failed - read ECONNRESET
info: Reconnecting to Redis delay=983, attempt=4, code=ECONNRESET, errno=ECONNRESET, syscall=connect, address=127.0.0.1, port=6379, total_retry_time=1118, times_connected=1
info: Connected to Redis
info: subscription_set: subscribe_test=test
info: Subscribed to: test 1
It is at this point that the client will no longer receive any data published to the channel test. If you have any questions or need more info to reproduce please let me know.
@BridgeAR @iamjem Might be related to #1249
Thanks for your detailed description, however I have not been able to reproduce it yet. When I issue a SAVE
through redis-cli the connection is not reset.
When I manually stop the redis server and start it again it seems to receive messages fine.
I have been testing this on Mac, will try to reproduce it too on Linux
Hmmm... that's interesting. It was occurring with a totally stock redis installation built from source. The only option I changed was to allow remote connections. I'll take a look shortly and see if I can figure out what's up. @jdegger
@jdegger I'm not able to repro locally, only in AWS under super specific conditions. I can create a second copy of this AMI and provide code that will allow you to reproduce but it's seemingly something specific to running in AWS. Let me know if you're interested... otherwise I'd say go ahead and close the issue
Ok, finally able to reproduce reliably. There are a few requirements:
tcp-keepalive 0
What you will see is that every 2 hours (at least for me) is an ECONNRESET error. At this point, the socket appears to still be open in redis but maybe the node process has lost a reference to it and opened a new socket? Overnight I had like 1 client connected per 2 hour block plus the currently running process when issuing CLIENT LIST
from redis-cli.
It's important to note that the clients were all successfully reconnecting but not receiving data. I have a hunch that the data was being pushed out, but to the old connection not the new one... but that's totally a hunch.
Now, i'm not 100% sure if this is a node_redis bug or actually a bug with node, so if this is the wrong place to file just let me know. I'd be happy to try and debug/step through but have limited time.
The fix was to set tcp-keepalive 300
which keeps the connection active... but this feels wonky since the client should be able to successfully reconnect.
@shmendo
Our also shows that ECONNRESET error happens every 2 hours. But I checked tcp-keepalive, it is 300
CONFIG GET tcp-keepalive
1) "tcp-keepalive"
2) "300"
Most helpful comment
Ok, finally able to reproduce reliably. There are a few requirements:
tcp-keepalive 0
What you will see is that every 2 hours (at least for me) is an ECONNRESET error. At this point, the socket appears to still be open in redis but maybe the node process has lost a reference to it and opened a new socket? Overnight I had like 1 client connected per 2 hour block plus the currently running process when issuing
CLIENT LIST
from redis-cli.It's important to note that the clients were all successfully reconnecting but not receiving data. I have a hunch that the data was being pushed out, but to the old connection not the new one... but that's totally a hunch.
Now, i'm not 100% sure if this is a node_redis bug or actually a bug with node, so if this is the wrong place to file just let me know. I'd be happy to try and debug/step through but have limited time.
The fix was to set
tcp-keepalive 300
which keeps the connection active... but this feels wonky since the client should be able to successfully reconnect.