Caddy: Allow placeholders in log filenames

Created on 2 Feb 2017 · 9 comments · Source: caddyserver/caddy

I recognize that the documentation does not claim to support placeholders in the filename portion of logs, so please consider this a feature request more than a bug report!

1. What version of Caddy are you running (caddy -version)?

Caddy 0.9.5

2. What are you trying to do?

I'd like to have multiple vhosts (in a WordPress multi-site setup) served by a single Caddy configuration block, but have each site's activities logged to vhost-specific log files.

3. What is your entire Caddyfile?

# for local testing purposes only
*:80
gzip
tls off
rewrite {
  if {path} not_match ^\/wp-admin
  to {path} {path}/ /index.php?_url={uri}
}
fastcgi / 127.0.0.1:9000 php
log /var/log/caddy/{hostonly}.log {
  rotate {
    size 100 # Rotate after 100 MB
    age 7 # Keep rotated log files for 7 days
    keep 10 # Keep at most 10 log files
  }
}
errors {
  log /var/log/caddy/{hostonly}-e.log {
    size 50 # Rotate after 50 MB
    age 7 # Keep rotated files for 7 days
    keep 5 # Keep at most 5 log files
  }
}

4. How did you run Caddy (give the full command and describe the execution environment)?

I copied the init/linux-systemd/caddy.service file to /etc/systemd/system, and ran sudo systemctl start caddy.service on an Ubuntu 16.04 VM running under VirtualBox.

5. What did you expect to see?

I expected to see /var/log/caddy populated with files that matched the names I might use to access the sites.

6. What did you see instead (give full error messages and/or log)?

I see files named /var/log/caddy/{hostonly}.log and /var/log/caddy/{hostonly}-e.log

7. How can someone who is starting from scratch reproduce this behavior as minimally as possible?

Ensure that the site running Caddy is accessible via multiple host names. If DNS is not set up for this, add multiple entries to /etc/hosts for the IP address at which Caddy is running. Then try to access each of those host names.

Strictly speaking even that's not necessary. Just try to use a placeholder in the filename portion of a log directive and observe that the placeholder is not replaced.
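Concretely, the /etc/hosts entries for the full reproduction might look like this (the site names are placeholders); after adding them, requesting each name, e.g. curl -H 'Host: site-one.test' http://127.0.0.1/, shows that a single literal {hostonly}.log file receives all entries:

```
# /etc/hosts — point several test names at the local Caddy instance
127.0.0.1   site-one.test
127.0.0.1   site-two.test
```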

Labels: discussion, feature request


All 9 comments

I have the start of a PR for this issue with commit https://github.com/mholt/caddy/commit/4efdf66abdb4fb01b44d9cbfeb387570e3ffce61

@mholt There are some comments inline if you have time to take a look.

Essentially I call replacer on the path for the log file.

If the log file returned is different from the currently open one, I close it and open the new log file. I have questions around:

  • Am I closing the existing logfile correctly?
  • would a better approach be to leave a file open for each different log file opened rather than closing and reopening?
  • Is this the desired behaviour? Are we happy for this issue to be implemented?
  • Is #1404 relevant to this at all?

The PR I created for this works, but #1404 completely changes how log files are opened and maintained, so I will need to look at it again. Fundamentally it is straightforward though.

I think this could become an attack vector, since it isn't rate limited. An attacker could just flood your server with requests for random names in the Host header and fill up your disk...

What if you just prefix each log entry with the hostname, then you can still filter all lines by the hostname you want?
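In the Caddyfile, that single-file approach might look like the following, using the log directive's optional format argument (hedged: check the log directive documentation for your Caddy version for the exact placeholders supported):

```
*:80 {
    # one shared log; each line is prefixed with the requested host
    log / /var/log/caddy/access.log "{hostonly} {combined}"
}
```

Filtering one vhost's traffic is then a grep '^thathost ' away.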

@skpy I think we will close this for the moment. While this is possible to do we are concerned about the performance and security implications of implementing this feature.

Looking at this again, the fundamental issue is that @mholt is concerned about its potential use in an attack: filling up the disk with requests for random hosts on a Caddy setup with wildcard domains, each of which would create a new log file on disk. For example:

:80 {
    log / c:\logs\{hostonly}.log
    root c:\website\
}

If it is a standard domain without a port, or with :443, then the rate limiting of Let's Encrypt will prevent this sort of attack.

A possible solution may be to rate limit the number of log file names that can be created in memory for a dynamic name, writing to a generic log file once the limit is reached. This limit could be quite high, say 1000 (although memory usage may be a factor), and could perhaps be set higher if required. It would be reset on Caddy restart.
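A sketch of that cap-and-fall-back idea in Go (logLimiter, the paths, and the overflow file name are mine for illustration — nothing here is shipped Caddy code):

```go
package main

import "fmt"

// logLimiter hands out per-host log file names until maxDynamic distinct
// hosts have been seen; any new host after that falls back to a shared,
// aggressively rotated catch-all file, bounding disk and memory use.
type logLimiter struct {
	seen       map[string]bool
	maxDynamic int
	fallback   string
}

func (l *logLimiter) pickLogFile(host string) string {
	if l.seen[host] || len(l.seen) < l.maxDynamic {
		l.seen[host] = true // known host, or still under the cap
		return "/var/log/caddy/" + host + ".log"
	}
	return l.fallback // cap reached: shared overflow log
}

func main() {
	l := &logLimiter{seen: map[string]bool{}, maxDynamic: 2,
		fallback: "/var/log/caddy/overflow.log"}
	fmt.Println(l.pickLogFile("a.example")) // gets its own file
	fmt.Println(l.pickLogFile("b.example")) // gets its own file
	fmt.Println(l.pickLogFile("c.example")) // falls back to overflow.log
}
```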

That's complicated. :( Is this feature important enough to assume the maintenance burden of rate limiting new log files being created? It also just slows down the attack, doesn't prevent it...

While it's true that this could be an attack vector, unless you advertise your specific configurations any attacker would have to try this specific attack and hope for the best. If you're facing an attacker dedicated enough to try this, they'll likely try plenty of other things to attack you, so this doesn't introduce much more risk in my mind.

On the one hand, I like to allow people to use the software in the way that makes the most sense for them; so if the user really wants to create an unlimited number of log files, then document the risks and let 'em do it. We can't presume to know every possible valid use case someone might have that we didn't anticipate.

On the other hand, I do prefer software that places real value on thinking things through and helps me not shoot myself in the foot.

For myself, I've solved my logging problem. There are plenty of ways to tackle this problem:

  • log to a single log file in a custom log format, and parse that file on a regular basis, as @mholt suggested
  • log to something like cronolog
  • lots of other solutions I haven't considered
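As a sketch of the first workaround, splitting a hostname-prefixed log back into per-host logs is only a few lines — shown here in Go, though a cron'd awk one-liner would do just as well. The prefix layout and output paths are assumptions:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// splitByHost groups the lines of a hostname-prefixed access log by the
// first whitespace-separated field. A periodic job would then append
// each group to /var/log/caddy/<host>.log.
func splitByHost(log string) map[string][]string {
	out := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		if fields := strings.Fields(sc.Text()); len(fields) > 0 {
			out[fields[0]] = append(out[fields[0]], sc.Text())
		}
	}
	return out
}

func main() {
	sample := "a.test GET /\nb.test GET /\na.test GET /x\n"
	fmt.Println(len(splitByHost(sample)["a.test"])) // 2
}
```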

@skpy Thanks for explaining the workarounds, that is very helpful!

I will just comment on this:

unless you advertise your specific configurations any attacker would have to try this specific attack and hope for the best.

And say that this is true of most Internet attacks. It's not uncommon for server logs to already be littered with requests for /wp-admin or even random hostnames (I see them all the time on my 5-legit-visitors-per-month sites).

So, I do think the risk is real, even if it means you have to be a target in practice (note, not "target practice" though you could be that too).

Let's go with one of your alternatives for now. :smile: We'll keep thinking on this; maybe there's a simple, elegant solution.

This option could be really useful in some cases. I'm building a service that will allow users to use a custom subdomain or their own domain to access the application, and I would like to keep the logs organized.
A log directory hierarchy of this format is what I'm looking for:
/var/log/caddy/{host}/(access|error).log

I had a couple ideas to solve the rate-limiting problem:

  • Option to set max_logs and then fall back to a common log file that is constantly rotated so it doesn't fill up the disk. When users need more log files (and therefore more hosts), they will just have to increase the value and then reload the configuration. This will obviously lead to a series of issues: for example, you may lose important logs from legitimate hosts.
  • Allow this only for explicitly specified hosts. Wildcard domains and port-only setups are the dangerous ones (if I understood the issue correctly), so for an attacker to fill up the disk they would also need a config entry to exist on the server for each targeted host.

If we consider the first option we can do something like this, which is nice and clean:

*.domain.tld {
    ...
    log /var/log/caddy/{host}/access.log 100
    errors /var/log/caddy/{host}/error.log 100
    ...
}

-- OR --

*.domain.tld {
    ...
    log {
        path /var/log/caddy/{host}/access.log
        max_files 100
        ...
    }
    errors {
        path /var/log/caddy/{host}/error.log
        max_files 100
        ...
    }
    ...
}

On the other hand, if we consider the second option, we can do something like this:

# Main config file
domain.tld {
    ...
    import log.conf
    ...
}

subdomain1.domain.tld {
    ...
    import log.conf
    ...
}
...

# log.conf
log /var/log/caddy/{host}/access.log
errors /var/log/caddy/{host}/error.log

This would increase productivity when configuring multiple hosts and would also distinguish hostnames in multi-host declarations (such as sub0.domain.tld, sub1.domain.tld {...}), which is, by the way, a problem you can already solve with a custom logging format, as you said multiple times.

The first option would be useful in situations where you don't know exactly how many hosts/subdomains you're hosting and the number may vary; the second option can be useful when you have plenty of different hosts but a fixed number of them, and the solution is then just a productivity enhancement while configuring.

Moreover, you should consider that in production scenarios you normally have a separate partition for log files, made exactly to solve this problem. I know that Caddy aims to be easy for zero-knowledge users, but it shouldn't limit advanced use cases, in my opinion. That said, it's also really unlikely that an inexperienced user needs this functionality, and as @skpy was saying, you can always state in the documentation, with a big red alert, that it is really unsafe to do this unless you know what you're doing, as you did for the tls settings.
