Caddy: acme_server: error on reload

Created on 13 Nov 2020 · 16 comments · Source: caddyserver/caddy

I started with this config:

{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [
            ":80"
          ],
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "localhost"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "acme_server"
                }
              ]
            }
          ]
        }
      }
    }
  }
}

To which I only added logging:

{
  "logging": {
    "logs": {
      "log0": {
        "writer": {
          "output": "stderr"
        },
        "encoder": {
          "format": "json"
        }
      }
    }
  },
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "logs": {
            "logger_names": {
              "localhost": "log0"
            }
          },
          "listen": [
            ":80"
          ],
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "localhost"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "acme_server"
                }
              ]
            }
          ]
        }
      }
    }
  }
}

Ran caddy reload --config caddy.json and got the error:

ERROR   admin.api   request error   {"error": "loading config: loading new config: loading http app module: provision http: server srv0: setting up route handlers: route 0: loading handler modules: position 0: loading module 'acme_server': provision http.handlers.acme_server: initializing certificate authority: Error opening database of Type badger with source /Users/mohammed/Library/Application Support/Caddy/acme_server/db: error opening Badger database: Cannot acquire directory lock on \"/Users/mohammed/Library/Application Support/Caddy/acme_server/db\".  Another process is using this Badger database.: resource temporarily unavailable", "status_code": 400}

reload: sending configuration to instance: caddy responded with error: HTTP 400: {"error":"loading config: loading new config: loading http app module: provision http: server srv0: setting up route handlers: route 0: loading handler modules: position 0: loading module 'acme_server': provision http.handlers.acme_server: initializing certificate authority: Error opening database of Type badger with source /Users/mohammed/Library/Application Support/Caddy/acme_server/db: error opening Badger database: Cannot acquire directory lock on \"/Users/mohammed/Library/Application Support/Caddy/acme_server/db\".  Another process is using this Badger database.: resource temporarily unavailable"}

2020/11/13 21:22:08.725 INFO    admin   stopped previous server

Note the "stopped previous server" line at the end: the previous server was stopped even though the reload failed. I haven't checked whether the ACME endpoint still works.

bug

All 16 comments

Thanks for the report!

Honestly, between this issue and #3849, that underlying badger storage is becoming a pain. Is there any way we can get rid of it, @dopey, and use something a little less restrictive?

Could switch to boltdb: https://smallstep.com/docs/step-ca/configuration#boltdb

I'm happy to test that as an alternative to my PR.

Just pushed new PRs for switching to different databases here #3867 and #3868

Pulling discussion over from the other PR here to consolidate conversations on this issue.

As per @mholt's recommendation on https://github.com/caddyserver/caddy/pull/3868#issuecomment-728246450, I'm looking into using a UsagePool.

The example shared is pretty clear, however I believe using it here will be complicated by the fact that this is an actual file and we don't instantiate the database instance directly. It is done by the upstream packages.

I've found that the error is raised when creating the authority on this line: https://github.com/caddyserver/caddy/blob/05d12d49430d118cb337d92c3d3ba929c70cbeba/modules/caddypki/acmeserver/acmeserver.go#L145

This means that I'd need to generate some key using the caddypki.AuthorityConfig. This is not a problem. The problem is that if any of the values within the key change, we will encounter this error again due to an attempt to create a new database instance on top of the old one again.

Possible solutions I can see are:

  1. an upstream patch to make it possible to pass a reference to a database instance rather than just configuration
  2. using a unique database name based on the same key used in the UsagePool to avoid a collision and provide some cleanup of old databases.

The former would be better, but not sure how feasible it is. The latter feels kind of hacky, but is likely doable entirely within caddy.

> This means that I'd need to generate some key using the caddypki.AuthorityConfig. This is not a problem. The problem is that if any of the values within the key change, we will encounter this error again due to an attempt to create a new database instance on top of the old one again.

What if the key was just the path to the database file?

In that case, if any of the other config values in authorityConfig are changed (eg. the CA), the server would not be regenerated.

Ah, I see what you mean now.

Hmmm, I'd rather we do this properly... so it looks like the Smallstep dependency needs a way to take an initialized database, instead of being told how to initialize a database, did I understand you right? If so, I can open an issue upstream if you want, and see if we can work that out.
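The difference between "being told how to initialize a database" and "taking an initialized database" can be sketched as below. All names here (`Store`, `Config`, `Authority`, `NewAuthority`, `open`) are hypothetical and do not reflect the actual smallstep API; the sketch only assumes a config that accepts either a path or an already opened handle.

```go
package main

import "fmt"

// Store is a hypothetical minimal database interface.
type Store interface{ Name() string }

type memStore struct{ name string }

func (m *memStore) Name() string { return m.name }

// Config is hypothetical: either Path is set and the library opens
// the database itself, or DB carries an already opened handle.
type Config struct {
	Path string
	DB   Store
}

// Authority is a stand-in for a certificate authority that needs storage.
type Authority struct{ db Store }

var opens int // counts how many times a database is opened

// open simulates opening (and locking) a database directory.
func open(path string) Store {
	opens++
	return &memStore{name: path}
}

// NewAuthority prefers an injected handle and only opens a database
// itself when none was supplied.
func NewAuthority(cfg Config) *Authority {
	if cfg.DB != nil {
		return &Authority{db: cfg.DB}
	}
	return &Authority{db: open(cfg.Path)}
}

func main() {
	shared := open("acme_server/db")
	a := NewAuthority(Config{DB: shared}) // e.g. the old config
	b := NewAuthority(Config{DB: shared}) // e.g. the new config after a reload
	fmt.Println(a.db == b.db, opens)      // prints: true 1
}
```

With injection, the caller owns the handle's lifetime, so a reload can build a new authority without touching the directory lock.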

Yea. That's exactly what I was thinking. Although... is it possible to run multiple ACME servers with multiple CAs with Caddy? If so, it may actually be best to initialize the database based on the CA name provided anyway. That would resolve this issue from the Caddy side. If not, then I agree it makes sense to fix upstream.

> is it possible to run multiple ACME servers with multiple CAs with Caddy? If so, it may actually be best to initialize the database based on the CA name provided anyway.

Yes, it is, so that's a very good idea too. I'd need to look at the code for that to have a better sense of it but I'm backlogged a couple weeks with issues and PRs that piled up recently. If you want to continue helping out, I'd say go for it, it'd be really appreciated! I think your dev sense is really on-point so far. Otherwise, it'll just have to wait until I get around to it.

Actually... I think I found that upstream already has a way to specify an initialized database. I'll start with that. Having multiple databases may still be necessary when running multiple authorities anyway.

Just pushed a branch that apparently resolves this issue.

Sorry, late to the discussion. Boltdb is a good option if it fixes the issues folks were having.

We chose Badger as the default due to the larger and more recently engaged community.

The reason that we don't default Badger to v2 is that it's not backwards compatible. People who ran the old version would have to migrate their DB. We just opened an issue in our db integration layer (https://github.com/smallstep/nosql/issues/11) for defaulting to the right DB if one already exists, otherwise generating a new one.
Sounds like this wouldn't have fixed your issue on its own because you'd also have to set some DB configuration values.

Apparently Badger now has an in-memory mode, not sure how far along that is. @maraino did some quick sleuthing this morning.

Thanks @dopey.

The fix to this bug in particular is not dependent on Badger vs BoltDB, but instead managing db instances properly.

Switching to BoltDB was because of this issue: https://github.com/caddyserver/caddy/issues/3847.

The issue with BadgerV1 was actually due to the in-memory database. The issue was resolved by switching to either BadgerV2 or BoltDB. I actually had a branch for each, and we decided on BoltDB as we were going to break backwards compatibility anyway. It would not be hard to switch this to a BadgerV2 database.

We haven't tagged a stable release with these changes yet (we did tag v2.3.0-beta.1), so if we do need to make any changes, there's still a bit of time, technically.

I don't really have an opinion on which database should be used for the final release, but if someone feels strongly that we should move to BadgerV2, I'm happy to make that change.

I think that rolling with BoltDB is totally fine as long as it's working.

We've had to make a number of code changes to export various badger db configuration to allow compatibility for a wider array of operating systems / platforms. Hopefully using BoltDB helps to avoid that. We haven't heard any issues with BoltDB (although that's probably because users are defaulted to Badger 😉).
