Node-redis: retry_strategy doesn't reset after flushing the error

Created on 18 Feb 2017  路  21Comments  路  Source: NodeRedis/node-redis

  • Version: node_redis 2.6.5, redis 3.2.7
  • Platform: Node.js 7.5.0 on Mac OS 10.12.3
  • Description:

I'm using the following code

const redis = require('redis');

const client = redis.createClient({
  url: 'redis://localhost:6379',
  retry_strategy: options => new Error('Fail')
});

setInterval(() => client.incr('some-key', console.log), 5000);

running against a local redis (docker run --rm -p 6379:6379 redis)

After I see the first output, I kill the redis to simulate disconnect. When I've seen the error come in I restart the redis. The connection doesn't come up again. I would expect that after flushing the error to the offline handlers, the client would try to reconnect on the next command. instead it stays in closed state. Also it's returning with AbortError instead of new Error('Fail').

Feature Request

Most helpful comment

Quick update here.
I added the enable_offline_queue = false to my createClient options, and it seems to have done what I wanted it to, which is:
"Continue to attempt to re-connect to my Redis server based on the retry_strategy function, but immediately throw errors in response to attempts to use the client in the meantime. Then, if able to reconnect, just start working again".

FYI: Thows the following error on a get call: AbortError: SET can't be processed. Stream not writeable.

For my usage as a straight layer between the DB and my app, this should be fine. Would be nice to know if there's a more "robust" solution possible where:

  • Client continues to attempt to reconnect based on retry_strategy
  • Calls/consumers/users of the Client don't hang while it's reconnecting
  • Commands can still be queued (conditionally?) to be executed if/when a connection is re-established...in case I'm incrementing things, etc.

Probably a lot of tough scenarios to consider here...

All 21 comments

Via the documentation for retry_strategy, it appears that node_redis will only attempt to reconnect if the function returns a number.

If you return a number from this function, the retry will happen exactly after that time in milliseconds.

In your case you are returning a non number, which does not satisfy the requirements for attempting to reconnect.

If you return a non-number, no further retry will happen and all offline commands are flushed with errors.

the flow I'm looking for is: on disconnect, retry connecting 3 times with 1 second interval. If at the third time it's still disconnect, it should return all outstanding commands with an error. I don't want it to stop trying to reconnect.

With a modified version of the example given in the README.md, you should be able to achieve what you are looking for. However, once the max attempts of 3 is exhausted, the client will not automatically attempt to reconnect on the next command. I believe you would need to manually call createClient() again.

const client = redis.createClient({
    url: 'redis://localhost:6379',
    retry_strategy: (options) => {
        if (options.times_connected >= 3) {
            // End reconnecting after a specific number of tries and flush all commands with a individual error
            return new Error('Retry attempts exhausted');
        }
        // reconnect after
        return 1000;
    }
});

Yes I left out the attempts since that doesn't make a difference here.

So basically you mean:

let client = createClient();
function createClient () {
  return redis.createClient({
    url: 'redis://localhost:6379',
    retry_strategy: (options) => {
      client = createClient();
      return new Error('Retry attempts exhausted');
    }
  });
}

Which implies I can't pass around my redis client in my code anymore?

You could create a light wrapper to manage a singleton instance of the redis client.

// redisClient.js

let redis = require("redis");

let client = null;

const options = {
    url: 'redis://localhost:6379',
    retry_strategy: (options) => {
        client = null;
        return new Error("Redis client connection dropped.");
    }
};

module.exports = {
    getClient: () => {
        if (client == null) {
            client = redis.createClient(options);
        }
        return client;
    }
};

To use:

// usage

let client = require("redisClient.js");

client.getClient().get("some key");
client.getClient().set("some key", "some value");

I suppose this is very similar to calling createClient() in the retry strategy, but i'm not a huge fan of that method.

If you wanted to reduce the need to call getClient() all the time, you could make a large wrapper that wraps all of the redis calls like get(), set(), etc, and call getClient() in each one of those methods. However, this method of getClient() is a pretty common method of managing connections lazily rather than eagerly.

sure, but then why does this library provides any retry strategy at all if you have to wrap it in an extra retry wrapper anyway?

I'm not the author so i'm not sure of the motivations behind the specific implementation of the retry_strategy. In my own uses of node_redis I have found the currently functionality of retry_strategy to be useful, but I agree it is lacking in the ability to automatically retry no matter what. In my uses I have consistently returned a number from the retry_strategy and never return a error, as I always want the connection attempt to be made. I do however log the errors from within the retry_strategy.

How about something like this? https://github.com/viglucci/node_redis/tree/feature-reconnect-after-flush#retry_strategy-example

var client = redis.createClient({
    retry_strategy: function (options) {
        if (options.error && options.error.code === 'ECONNREFUSED') {
            // End reconnecting on a specific error and flush all commands with a individual error
            return new Error('The server refused the connection');
        }
        if (options.total_retry_time > 1000 * 60 * 60) {
            // End reconnecting after a specific timeout and flush all commands with a individual error
            return new Error('Retry time exhausted');
        }
        // attempt reconnect after retry_delay, and flush any pending commands
        return {
            retry_delay: Math.min(options.attempt * 100, 3000),
            error: new Error('Your custom error.');
        }
    }
});

Makes a lot of sense conceptually. Another way of implementing this could be by splitting this into two strategies:

var client = redis.createClient({
  // this strategy handles the pending redis commands when the connection goes down
  flush_strategy (options) {
    if (options.attempt >= 3) {
      // flush all pending commands with this error
      return new Error('Redis unavailable')
    }
    // let the connection come up again on its own
    return null;
  },
  // this strategy handles the reconnect of a failing redis
  retry_strategy (options) {
    if (options.total_retry_time > 1000 * 60 * 60) {
      // The connection is never going to get up again
      // kill the client with the error event
      return new Error('Retry time exhausted');
    }
    // attempt reconnect after this delay
    return Math.min(options.attempt * 100, 3000)
  }
});

I don't really like the name flush_strategy but one could come up with something better.

Oh interesting. What you are suggesting here feels more like renaming the current retry_strategy to connection_strategy and using the existing retry_strategy as what you've referred to as the flush_strategy.

var client = redis.createClient({
    // this strategy handles reconnection
    connection_strategy (options) {
        if (options.total_retry_time > 1000 * 60 * 60) {
            // The connection is never going to get up again
            // kill the client with the error event
            return new Error('Retry time exhausted');
        }
        // attempt reconnect after this delay
        return Math.min(options.attempt * 100, 3000)
    },
    // this strategy handles the pending redis commands when the connection goes down
    retry_strategy (options) {
        if (options.attempt >= 3) {
            // flush all pending commands with this error
            return new Error('Redis unavailable');
        }
        // let the client attempt the commands once a connection is available
        return null;
    },
});

Implemented: https://github.com/viglucci/node_redis/tree/feature-connection-strategy#retry_strategy-example

@Janpot can you elaborate the use case where this is needed? I tried to lower the amount of options to make it simpler to use and the retry_strategy is already a powerful option.

@viglucci thanks for answering here and giving good feedback! I personally like your suggestion to return an object most in case this will be implemented. This would not really be a breaking change, even though I am planning on opening a v3 branch soon. Currently it's just to much hassle to implement many new features and removing all deprecated things will help a lot.

@BridgeAR It's pretty much that right now, if redis connection breaks for some reason, all my commands just hang. Suppose I have a fallback for the operation that redis is doing then that fallback also hangs. I'd rather want redis to return an error after a few retries, which I can catch and employ the fallback. That doesn't mean I don't want the client to stop trying to connect. Redis might come back up in a minute.

I just don't see the point of the current behavior. You return an error in the retry handler and then your client just dies forever. Who needs that sort of behavior unless you create a client for every command?

Or I might not see the use case for that ofcourse 馃槃

@BridgeAR Thanks. Well if you have a "road to V3" roadmap with simple tasks like remove support for configuration params / remove deprecation warnings, etc, ide be interested in submitting pull requests. Probably wouldn't be able to tackle large work like adding support for additional redis commands, but simple house cleaning things I could probably do if they were open for PR's once a V3 feature branch is available.

@viglucci I invited to as a collaborator. I'm going to make a plan sometime soon

+1 for a solution to this problem.

I'm using Express, and while my retry_strategy is still returning integers to try and connect again in the future, the commands (and web requests) continue to pile/back-up instead of throwing something so they can get on with their lives...and hopefully a connection is _eventually_ re-established before the strategy says to give up or whatever.

Can this be accomplished? @Janpot have you found anything currently available to solve this situation?

Maybe something with the enable_offline_queue and/or retry_unfulfilled_commands options can accomplish this?

Thanks to all for your hard work and help!

Quick update here.
I added the enable_offline_queue = false to my createClient options, and it seems to have done what I wanted it to, which is:
"Continue to attempt to re-connect to my Redis server based on the retry_strategy function, but immediately throw errors in response to attempts to use the client in the meantime. Then, if able to reconnect, just start working again".

FYI: Thows the following error on a get call: AbortError: SET can't be processed. Stream not writeable.

For my usage as a straight layer between the DB and my app, this should be fine. Would be nice to know if there's a more "robust" solution possible where:

  • Client continues to attempt to reconnect based on retry_strategy
  • Calls/consumers/users of the Client don't hang while it's reconnecting
  • Commands can still be queued (conditionally?) to be executed if/when a connection is re-established...in case I'm incrementing things, etc.

Probably a lot of tough scenarios to consider here...

@newhouse,

Thanks for mentioning this. We wanted the same behavior for our application and your solution worked for immediately returning an error to the caller, but allow us to retry to connect indefinitely.

Thank you @newhouse, I'm not sure I would have ever found this.

IMO this should be the default behavior, or at least explicitly called out in the docs around retry_strategy. Let me know if you think the latter is reasonable, and I will open a PR to add the docs.

@bwhitty a docs PR from anyone would be most welcome 馃憤

One gotcha with using enable_offline_queue = false is that commands sent right after calling createClient, but before the connection is actually opened will fail as well.

One gotcha with using enable_offline_queue = false is that commands sent right after calling createClient, but before the connection is actually opened will fail as well.

@jkirkwood Sounds like you'll want to delay trying to send your first commands until the ready event (or similar appropriate event) is emitted?

Was this page helpful?
0 / 5 - 0 ratings