Core: HAproxy config can't be synced to secondary node - fails with "parse error. not well formed"

Created on 18 Dec 2020 · 94 Comments · Source: opnsense/core

Describe the bug
After upgrading to 20.7.7 - I'm unable to sync my primary to secondary node. Basic troubleshooting reveals that when turning off HAproxy in config sync, the sync continues.

Before the upgrade, things were running fine. Downgrading via opnsense-revert -r 20.7.6 opnsense doesn't change the behaviour. Note that the HAproxy config is valid and running fine.

To Reproduce
Upgrade from 20.7.6 to 20.7.7 with an elaborate haproxy config; following error is thrown on sync:
parse error. not well formed

Expected behavior
I expect the sync to work. I've already checked that no special characters are used in the descriptions of backends, rules, etc., and that isn't the case. After taking a Proxmox snapshot of the systems in question, I started "removing" parts of the HAProxy config; at some point I'm greeted with an API error and the problem might be resolved, but it's inconsistent as to when it's resolved (sometimes it's magically fixed when removing a rule, sometimes a backend or virtual service, ...).

Relevant log files

</struct></value>
</data></array></value></member>
</struct></value>
</data></array></value></member>
  <member><name>dnssec</name><value><string>1</string></value></member>
</struct></value></member>
</struct></value></param>
</params></methodCall>received >>>
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>>
parse error. not well formed
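
The -32700 reply above is a standard XML-RPC fault structure. As a minimal sketch of how such a reply decodes, assuming Python's standard xmlrpc.client module instead of the PHP IXR library that OPNsense ships (decode_reply is a hypothetical helper, and FAULT_XML is copied from the log above):

```python
import xmlrpc.client

# Fault reply copied from the log above.
FAULT_XML = """<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>"""


def decode_reply(xml_text):
    """Return ("ok", params) or ("fault", code, message) for an XML-RPC reply."""
    try:
        params, _method = xmlrpc.client.loads(xml_text)
        return ("ok", params)
    except xmlrpc.client.Fault as fault:
        # loads() raises Fault when the reply carries a <fault> element.
        return ("fault", fault.faultCode, fault.faultString)


print(decode_reply(FAULT_XML))
```

Note that the fault is generated by the receiving node: it is the secondary telling the primary that it could not parse the request body it received.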

Environment
OPNsense 20.7.7 / Proxmox / CARP setup

upstream

Source

Wireheadbe


All 94 comments

Is the sync URL http instead of https?

fichtner on 18 Dec 2020

In the config it's configured as "just" an IP; the error shows an http URL with that very same IP (http://10.0.0.2:8090/xmlrpc.php). Should I reconfigure it as https?

Wireheadbe on 18 Dec 2020

Worth a try. I can check out the candidate lighttpd tomorrow, but it seems that http->https redirect is broken.


fichtner on 18 Dec 2020

I upgraded from 20.7.3 to 20.7.7 last night and also synced (including HAProxy) without problems.
Please switch to https and try again.

mimugmail on 18 Dec 2020


But it's strange that it would only affect HAProxy, no? And yes, my management interface runs on port 8090, with no redirects.

Wireheadbe on 18 Dec 2020

And it runs unencrypted?

mimugmail on 18 Dec 2020

Yes. I'll try a packet capture in a couple of moments. I should see where the issue happens I guess, as I presume, the error I see on the primary node during the sync, is actually the response of the secondary, right?

Wireheadbe on 18 Dec 2020

So I see that the full config is being sent, but it's definitely on the receiving end that something is unable to be parsed. I might have to put some hack-job echo'ing in the xmlrpc.php to find this one..

Wireheadbe on 18 Dec 2020

https://twitter.com/opnsense/status/1339847119977533442

fichtner on 18 Dec 2020

Holy.. 👍 That did it.. Found the sneaky bugger? 🥇

Wireheadbe on 18 Dec 2020

Looks like the TLS engine is running on HTTP and now needs an explicit disable: adcade2fed8

We will issue a hotfix today, but half-expect that this is an uncaught regression in lighttpd.

fichtner on 18 Dec 2020

Still the same with Hotfix 20.7.7_1
Running: opnsense-revert -r 20.7.6 lighttpd && configctl webgui restart
... fixes it again.

Wireheadbe on 18 Dec 2020

acme-client for certs maybe?

fichtner on 18 Dec 2020

Yes - using the Let's encrypt with HA-proxy integration.

Wireheadbe on 19 Dec 2020

https://github.com/opnsense/plugins/issues/2126#issuecomment-742446531

fichtner on 19 Dec 2020

Not sure how this is related, as I already forcefully re-issued the certificates. It's purely the sync of the HAProxy config to the second node, running over HTTP XML-RPC, that fails.

Wireheadbe on 19 Dec 2020

I don't have the full picture so I have to ask and at the moment I have no further ideas.

Cheers,
Franco

fichtner on 19 Dec 2020

Sure, no problem Franco.. Is there a way to get xmlrpc.php to dump some debug output on the secondary node, so we can know _where_ the issue is? To somehow "pinpoint" the parse error?

Wireheadbe on 19 Dec 2020

Can you try to verify that the response is indeed incorrect (most likely empty)?

# curl -v -l http://10.0.0.2:8090/xmlrpc.php

HTTP by default redirects to HTTPS... Lighttpd emits an error, it should be about SSL/TLS:

# clog /var/log/lighttpd.log | tail

Something like

(mod_openssl.c.3133) SSL: error:14FFF0C3:SSL routines:(UNKNOWN)SSL_internal:null ssl ctx

Using curl on HTTPS should work...

fichtner on 20 Dec 2020

So my management portal on 8090 is running in plain old http - so for XMLRPC it should be the same:

*   Trying 10.0.0.2:8090...
* Connected to 10.0.0.2 (10.0.0.2) port 8090 (#0)
> GET /xmlrpc.php HTTP/1.1
> Host: 10.0.0.2:8090
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Connection: close
< Content-Length: 145
< Content-type: text/xml;charset=UTF-8
< Date: Sun, 20 Dec 2020 20:04:19 +0000
< Expires: Tue, 22 Dec 2020 22:04:19 GMT
< Cache-Control: max-age=180000
< Server: OPNsense
<
<?xml version="1.0"?>
<methodResponse>
<params>
    <param>
      <value>Authentication failed</value>
    </param>
  </params>
* Closing connection 0

After I've done the curl - here's the tail of the log on the secondary node:
Dec 20 11:10:50 opnsense-sec lighttpd[66622]: (server.c.1488) server started (lighttpd/1.4.55)

So no ssl errors - as I'm not using SSL. It's definitely something unparseable - but I've checked the config and no weird characters are being used.

Wireheadbe on 20 Dec 2020

Can you try 31dee2dfcc on the xmlrpc responder?

# opnsense-patch 31dee2dfcc

fichtner on 20 Dec 2020

It might not fix the issue. If lighttpd ignores the last line without EOL it's difficult to fix. Maybe we can stuff an extra PHP_EOL into the response instead.

fichtner on 20 Dec 2020

Alas, didn't fix it. I updated both nodes to the latest lighttpd again, and applied the patch on the second one:

Patching file www/xmlrpc.php using Plan A...
Hunk #1 succeeded at 69.
done
All patches have been applied successfully.  Have a nice day.
root@opnsense-sec:~ #

And then on the primary:

root@opnsense-pri:~ # /usr/local/etc/rc.filter_synchronize
[...]
</struct></value></member>
</struct></value></param>
</params></methodCall>received >>>
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>>
parse error. not well formed

(How should I revert the patch btw.. ?)

Wireheadbe on 20 Dec 2020

Run same command again to revert.

I am unsure what's going on... it must be the responder since that is the broken lighttpd (sender does not use it) so somehow lighttpd mangled the request payload in receive?

fichtner on 20 Dec 2020

Ok, reverted the patch, thanks. I don't have any config to sync at the moment so I can keep it in the "broken" state for further testing. The odd thing is, that it's only in the HA-proxy part. All other syncs are running fine (unbound etc ..). I've even made sure to change the auto-created acme client rules, so the quotes aren't in the description anymore. Same issue remains.

The strangest part is that when I start removing chunks from the HAProxy config, at "some" point it starts working again, BUT always at a different point, so it's not related to a certain entry :(

edit: Maybe my config is too big? Could that be an issue?

Wireheadbe on 20 Dec 2020

Just as a test, I changed the config so tcp/8090 is using SSL, and the issue also remains.

root@opnsense-pri:~ # curl -k -v -l https://10.0.0.2:8090/xmlrpc.php
*   Trying 10.0.0.2:8090...
* Connected to 10.0.0.2 (10.0.0.2) port 8090 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /usr/local/etc/ssl/cert.pem
*  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=NL; ST=Zuid-Holland; L=Middelharnis; O=OPNsense
*  start date: Dec 29 22:11:38 2019 GMT
*  expire date: Dec 28 22:11:38 2020 GMT
*  issuer: C=NL; ST=Zuid-Holland; L=Middelharnis; O=OPNsense
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
> GET /xmlrpc.php HTTP/1.1
> Host: 10.0.0.2:8090
> User-Agent: curl/7.74.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Connection: close
< Content-Length: 145
< Content-type: text/xml;charset=UTF-8
< Date: Mon, 21 Dec 2020 09:07:09 +0000
< Expires: Wed, 23 Dec 2020 11:07:09 GMT
< Cache-Control: max-age=180000
< Server: OPNsense
<
<?xml version="1.0"?>
<methodResponse>
<params>
    <param>
      <value>Authentication failed</value>
    </param>
  </params>
* Closing connection 0
* TLSv1.3 (OUT), TLS alert, close notify (256):

So the sync gives:

root@opnsense-pri:~ # /usr/local/etc/rc.filter_synchronize
[...]
</struct></value></member>
</struct></value></param>
</params></methodCall>received >>>
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>>
parse error. not well formed

Wireheadbe on 21 Dec 2020

I wonder if we're hitting something like this: https://core.trac.wordpress.org/ticket/7794

Wireheadbe on 21 Dec 2020

Ok, hacky test.. On line 380 in /usr/local/opnsense/contrib/IXR/IXR_Library.php
I changed
$this->error(-32700, 'parse error. not well formed');
to
$this->error(-32700, $data);

And I see that the content of $data is truncated:
</struct></value></member> <member><name>id</name><value><string>5fa94c54750657.53660063</string></value></member> <member><name>enabled</name><value><string>1</string></value></member> <member><name>name</name><value><string>KVM01-web</string></value></member> <member><name>description</name><value><string>KVM01 Web UI</string></value></member> <member><name>address</name><value><string>192.1root@opnsense-pri:~ #
Or is this expected?

Wireheadbe on 21 Dec 2020

Reverting lighttpd (opnsense-revert -r 20.7.6 lighttpd && configctl webgui restart); and with that hacky test, doesn't even hit that error condition, so $this->message = new IXR_Message($data); is working correctly on an older lighttpd

Wireheadbe on 21 Dec 2020

can you try https://github.com/opnsense/core/commit/8644af058c853ea30b60983965c2b56d521f3631 ? and then execute /usr/local/etc/rc.filter_synchronize debug it would show what's being exchanged with the other end.

on a clean install opnsense-patch 8644af0 would probably do the trick

AdSchellevis on 21 Dec 2020

error >>>
parse error. not well formed>>> send :
Host: 10.0.0.2
User-Agent: XML_RPC
Content-Type: text/xml
Content-Length: 117
Authorization: Basic 
<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.filter_configure</methodName>
<params>
</params></methodCall>

>>> received :
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
      <boolean>1</boolean>
      </value>
    </param>
  </params>
</methodResponse>

Wireheadbe on 21 Dec 2020

@Wireheadbe better remove the Authorization header from the post

AdSchellevis on 21 Dec 2020

the output doesn't make sense to me, the response looks like valid XML.

AdSchellevis on 21 Dec 2020

Do you guys want the full HA-proxy config?

Wireheadbe on 21 Dec 2020

This is occurring for me with trying to sync any config not just HA-proxy

maxfield-allison on 21 Dec 2020

@maxfield-allison same debug output? or another call?

AdSchellevis on 21 Dec 2020

</struct></value></member>
</struct></value></param>
</params></methodCall>received >>>
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>>
parse error. not well formed>>> send :
Host: 10.1.3.2
User-Agent: XML_RPC
Content-Type: text/xml
Content-Length: 117
Authorization: Basic *****************=
<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.filter_configure</methodName>
<params>
</params></methodCall>
>>> received :
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
      <boolean>1</boolean>
      </value>
    </param>
  </params>
</methodResponse>

Same output.

maxfield-allison on 21 Dec 2020

can you post the complete message including the "send :" text?

AdSchellevis on 21 Dec 2020

👍1

apologies, edited

maxfield-allison on 21 Dec 2020

better remove the auth header :)

AdSchellevis on 21 Dec 2020

👀1

to me it looks like your errors are above the visible range here. opnsense.filter_configure is properly executed, the debug output from the sync is not in your postings.... (it doesn't exit when it fails)

AdSchellevis on 21 Dec 2020

Is there anything else I can provide to try tracking down what's getting messed up? I'm unchecking all of the boxes and syncing items one at a time to try narrowing it down.

maxfield-allison on 21 Dec 2020

the output in this thread isn't complete: the script executes multiple commands, one fails and the last one doesn't (look for multiple "send :" tags and the responses it received).
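
To find out which call actually failed, the debug output can be scanned for each methodName and the reply that follows it. A rough sketch, assuming the log looks like the excerpts posted in this thread; SAMPLE_LOG is abbreviated from them and summarize_calls is a hypothetical helper, not part of OPNsense:

```python
import re

# Abbreviated sample of the debug output, shaped like the excerpts in this thread.
SAMPLE_LOG = """>>> send :
<methodCall>
<methodName>opnsense.restore_config_section</methodName>
<params>
...
</params></methodCall>received >>>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>>
parse error. not well formed>>> send :
<methodCall>
<methodName>opnsense.filter_configure</methodName>
<params>
</params></methodCall>
>>> received :
<methodResponse>
  <params>
    <param>
      <value>
      <boolean>1</boolean>
      </value>
    </param>
  </params>
</methodResponse>
"""

METHOD_RE = re.compile(r"<methodName>(.*?)</methodName>")
FAULT_RE = re.compile(r"<name>faultString</name>\s*<value><string>(.*?)</string>", re.S)


def summarize_calls(log_text):
    """Pair every <methodName> in the log with the fault string that follows it, if any."""
    calls = list(METHOD_RE.finditer(log_text))
    summary = []
    for index, call in enumerate(calls):
        # The reply to this call sits between it and the next <methodName>.
        end = calls[index + 1].start() if index + 1 < len(calls) else len(log_text)
        fault = FAULT_RE.search(log_text, call.end(), end)
        summary.append((call.group(1), fault.group(1) if fault else None))
    return summary


for method, fault in summarize_calls(SAMPLE_LOG):
    print(method, "->", fault or "ok")
```

Run against the full output of /usr/local/etc/rc.filter_synchronize debug, this should make it obvious when an earlier call faulted even though the last one on screen succeeded.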

AdSchellevis on 21 Dec 2020

👍1

>>> send : 
Host: 10.1.3.2

User-Agent: XML_RPC

Content-Type: text/xml

Content-Length: 117

Authorization: Basic 

<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.firmware_version</methodName>
<params>
</params></methodCall>
>>> received : 
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
      <struct>
  <member><name>base</name><value><struct>
  <member><name>version</name><value><string>20.7.6</string></value></member>
</struct></value></member>
  <member><name>firmware</name><value><struct>
  <member><name>version</name><value><string>20.7.7_1</string></value></member>
</struct></value></member>
  <member><name>kernel</name><value><struct>
  <member><name>version</name><value><string>20.7.6</string></value></member>
</struct></value></member>
</struct>
      </value>
    </param>
  </params>
</methodResponse>

>>> send : 
Host: 10.1.3.2

User-Agent: XML_RPC

Content-Type: text/xml

Content-Length: 411081

Authorization: Basic 

<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.restore_config_section</methodName>
<params>

...


</params></methodCall>received >>> 
<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>-32700</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>parse error. not well formed</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
error >>> 
parse error. not well formed>>> send : 
Host: 10.1.3.2

User-Agent: XML_RPC

Content-Type: text/xml

Content-Length: 117

Authorization: Basic 

<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.filter_configure</methodName>
<params>
</params></methodCall>
>>> received : 
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
      <boolean>1</boolean>
      </value>
    </param>
  </params>
</methodResponse>

maxfield-allison on 21 Dec 2020

it's complaining about the opnsense.restore_config_section call, which it thinks is not valid XML; just copy the contents and check with an XML parser (the content is omitted in this issue...):

<?xml version="1.0"?>
<methodCall>
<methodName>opnsense.restore_config_section</methodName>
<params>

...


</params></methodCall>

It should be valid xml, if it isn't, the issue is clear, could be some unescaped character or an encoding issue.
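
A quick local check is possible with any XML parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree; check_well_formed is a hypothetical helper and the two payloads are made-up stand-ins for the real request body:

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return (True, None) if the payload parses, else (False, (line, column, reason))."""
    try:
        ET.fromstring(xml_bytes)
        return (True, None)
    except ET.ParseError as err:
        # ParseError.position holds the (line, column) where the parser gave up.
        line, column = err.position
        return (False, (line, column, str(err)))


good = (
    b"<methodCall><methodName>opnsense.restore_config_section</methodName>"
    b"<params></params></methodCall>"
)
cut = good[:60]  # simulate a body that was cut off mid-document

print(check_well_formed(good))
print(check_well_formed(cut))
```

If the payload checks out on the sending side, as it does later in this thread, the damage must happen somewhere between the sender and the parser on the receiving node.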

AdSchellevis on 21 Dec 2020

Ran an external validator on the full xml and it found no issues.

maxfield-allison on 21 Dec 2020

After disabling sync of certain sections, it looks like I don't get an error with shaper deselected. I'm guessing there's something in that xml that's causing the issue on my instance. Also seeing the error when intrusion detection is selected.

maxfield-allison on 21 Dec 2020

I've put mine also in an external validator - checks out as valid.
I've disabled everything _but_ HA-proxy, still get the parsing error

Wireheadbe on 21 Dec 2020

The XML is about +- 5000 lines, just with HA-proxy (so yes, quite extensive) - could that be an issue? Maximum post size or something?

Wireheadbe on 21 Dec 2020

That's what my thought was. my total xml is 64k lines

maxfield-allison on 21 Dec 2020

@Wireheadbe does your sync function with HA Proxy removed from the options?

maxfield-allison on 21 Dec 2020

@maxfield-allison yes. Then it syncs everything fine.. But not that much defined in other parts.
Just by chance - are you using any ipv6 hosts in your ha-proxy definitions?

Wireheadbe on 21 Dec 2020

I'm not using HA-Proxy. I'm encountering the error with certain combinations of config xml sections.

maxfield-allison on 21 Dec 2020

looks like everything functions well until I include the Shaper config. Wondering now if the syntax is correct but we're missing some sort of end statement? or as you said, hitting a maximum post size. If I remove shaper and add several other sections I don't encounter the error.

maxfield-allison on 21 Dec 2020

only the shaper section is also ok?

AdSchellevis on 21 Dec 2020

👍1

Didn't we already establish latest lighttpd was broken somehow? I assume they have issues with request compression or chunking. Let me distribute a 1.4.57 version for you from last week...

fichtner on 21 Dec 2020

❤1

only the shaper section is also ok?

yep by itself shaper works.

maxfield-allison on 21 Dec 2020

if it's fixed with downgrading lighttpd, you probably just have to wait for an update, the only odd thing is that the xml content received by the client is ok, which doesn't make sense in that case.

AdSchellevis on 21 Dec 2020

👍1

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/lighttpd-1.4.57.txz && configctl webgui restart

fichtner on 21 Dec 2020

❤1

unfortunately, even with the version you've provided I'm still encountering the issue.

maxfield-allison on 21 Dec 2020

worth a try, thanks!

fichtner on 21 Dec 2020

👍1

@AdSchellevis stab in the dark, but since lighttpd changed mod_compress to mod_deflate and this is in... https://github.com/opnsense/core/blob/master/src/etc/inc/plugins.inc.d/webgui.inc#L440

fichtner on 21 Dec 2020

@fichtner I do agree it is suspicious; if compression were broken it might break the output... but I just can't believe the output received by xmlrpclib would be valid in that case, so either the console suppresses mangled characters or there's something else really odd going on. I haven't seen issues on my end, but I haven't sent very large files either

forget about this, the responses are small, the requests are large (for which we would need to debug the receiving end)

AdSchellevis on 21 Dec 2020

I can't reproduce this, on my end it pushes 120183 bytes to the other end without issues.

AdSchellevis on 21 Dec 2020

what is your setup? what categories are you syncing?

maxfield-allison on 21 Dec 2020

How could I debug on the receiving side?

maxfield-allison on 21 Dec 2020

the xmlrpc server doesn't really offer a lot of debug options, you could dump the data to a file in the library, somewhere around here:

https://github.com/opnsense/core/blob/945202230726af76582103680fa4c49114670752/contrib/IXR/IXR_Library.php#L377

AdSchellevis on 22 Dec 2020

Ok, so I did the following there:

$this->message = new IXR_Message($data);
$myfile = fopen("/tmp/newfile.txt", "w") or die("Unable to open file!");
fwrite($myfile, $data);
fclose($myfile);

On the old lighttpd I get the full XML file (about 7100 lines); on the new lighttpd it's truncated after about 3200 lines. It's even truncated midway through a closing tag, so it's not something encoding related.

Wireheadbe on 22 Dec 2020

and what's the size in bytes? just out of curiosity, we probably should wait for an upstream fix anyway...

AdSchellevis on 22 Dec 2020

It could be related to chunking..
If I look at the size of the file:

ls -la /tmp/newfile.txt
-rw-r-----  1 root  wheel  262144 Dec 22 09:52 /tmp/newfile.txt

The "old" lighttpd is about 700k
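
One thing worth noting about that number: 262144 is exactly 256 × 1024 bytes (2^18), the kind of round binary size a buffer or limit would produce, not a boundary in the XML itself. A small sketch of that arithmetic; describe_truncation is a hypothetical helper and 700_000 only stands in for the "about 700k" above:

```python
def describe_truncation(received_bytes, sent_bytes):
    """Summarize how much of a request body arrived and whether the cut is a round binary size."""
    kib, remainder = divmod(received_bytes, 1024)
    # A positive power of two has exactly one bit set.
    power_of_two = received_bytes > 0 and received_bytes & (received_bytes - 1) == 0
    return {
        "truncated": received_bytes < sent_bytes,
        "whole_kib": kib if remainder == 0 else None,
        "power_of_two": power_of_two,
    }


# 262144 is the file size reported above; 700_000 stands in for "about 700k".
print(describe_truncation(262144, 700_000))
# prints {'truncated': True, 'whole_kib': 256, 'power_of_two': True}
```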

Wireheadbe on 22 Dec 2020

Installing lighttpd-1.4.57.txz chops it at the same place.

Wireheadbe on 22 Dec 2020

I can confirm that in my case the config also seems to be truncated at 262144 bytes, resulting in a "not well formed" parse error.
(When using the old lighttpd from https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/MINT/20.7.5/OpenSSL/All/lighttpd-1.4.55_1.txz this issue is gone.)

fraenki on 23 Dec 2020

@fraenki is that 256k the xml content or including the headers? it's almost like some parameter (https://redmine.lighttpd.net/projects/lighttpd/wiki/Docs_ConfigurationOptions) is set to a default of 256k where it wasn't before. I can probably generate a test case in the days to come, it shouldn't be very difficult to trigger this behaviour. I haven't seen issues on their end yet explaining this behaviour.

AdSchellevis on 23 Dec 2020

It's the content of the XML message, not including headers.

Wireheadbe on 23 Dec 2020

Didn't follow the whole thread, sorry. Does this come from the lighttpd update in 20.7.7?
I did a HA Update to 20.7.7 but didn't experience this.

mimugmail on 23 Dec 2020

@mimugmail yes, only when the request body size seems to be above 256k

AdSchellevis on 23 Dec 2020

Lighttpd 1.4.58 is out now... please give it a go:

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/lighttpd-1.4.58.txz && configctl webgui restart

https://www.lighttpd.net/2020/12/27/1.4.58/

fichtner on 29 Dec 2020

@fichtner funny enough i was just about to reply here, as i'm hitting the same issue setting up a fresh HA pair

looks like the installation doesn't work

# pkg add -f https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/lighttpd-1.4.58.txz && configctl webgui restart
Fetching lighttpd-1.4.58.txz: 100%  315 KiB 323.1kB/s    00:01
Installing lighttpd-1.4.58...
package lighttpd is already installed, forced install
pkg: Missing dependency 'libressl'

Failed to install the following 1 package(s): https://pkg.opnsense.org/FreeBSD:12:amd64/20.7/misc/lighttpd-1.4.58.txz

marcquark on 29 Dec 2020

eh ok, I was trying to avoid that... hold on

fichtner on 29 Dec 2020

Sorry, doesn't work for OpenSSL flavour at the moment. Can't provide another package at the moment.

fichtner on 29 Dec 2020

@fichtner I did a quick test build of lighttpd-1.4.58 (OpenSSL) on my private poudriere. I can confirm that it fixes the issue. The build options are different, so I can't share the package with other people.

I've repeated the test multiple times with 1.4.56 and 1.4.58 – .56 always failed, .58 always OK.

fraenki on 29 Dec 2020

👍3

@fraenki thanks for confirming. this is good news 👍

fichtner on 29 Dec 2020

Indeed - awesome that it's fixed with upcoming 1.4.58

Wireheadbe on 29 Dec 2020

hotfix out in a couple of minutes

fichtner on 4 Jan 2021

🎉2 ❤1

@fichtner Hotfix works as expected, thank you!

fraenki on 4 Jan 2021

🎉1

lighttpd developer here: My sincere apologies for the trouble with lighttpd 1.4.56 and lighttpd 1.4.57.

lighttpd 1.4.56 was a major, major release with large rewrites for HTTP/2 and multiple TLS library options, as well as HTTP/1.1 chunked enhancement (which was the source of bugs here)

With lighttpd 1.4.58, I believe all the bugs have been shaken out. At least all the reported bugs have been addressed.

In the future, if you do notice regressions, please report them upstream on https://github.com/lighttpd/lighttpd1.4 or https://redmine.lighttpd.net/projects/lighttpd/issues

Thanks.

gstrauss on 5 Jan 2021

👍6

@gstrauss no worries, keep up the good work :)

Just as a side note we had to disable ssl engine on HTTP->HTTPS redirect to fix it... not sure if this was an intended change? https://github.com/opnsense/core/commit/adcade2fed89c0 But we don't really mind if it is just a configuration change that is necessary.

Cheers,
Franco

fichtner on 5 Jan 2021

If ssl.engine = "enable" is in the global scope in lighttpd.conf, then it is inherited by all $SERVER["socket"] conditions and you will need ssl.engine = "disable" on the ports on which you want HTTP (instead of HTTPS), as you have done.

Is that not the case?

gstrauss on 5 Jan 2021

@gstrauss well, yes, but it wasn't the case for years with < 1.4.56 that's why I'm asking

fichtner on 5 Jan 2021

Yes, the change to inherit from global scope was intentional so that you could write your ssl.* config in the global scope -- but without ssl.engine = "enable" since the default is server.port = 80 -- and then not have to repeat the ssl.* directives in multiple conditions if the only ssl.* directive in $SERVER["socket"] is ssl.engine = "enable", e.g.

$SERVER["socket"] == "*:443"     { ssl.engine = "enable" }
$SERVER["socket"] == "*[::]:443" { ssl.engine = "enable" }

gstrauss on 5 Jan 2021

@gstrauss ok thanks!

fichtner on 5 Jan 2021

FYI, while looking at src/etc/inc/plugins.inc.d/webgui.inc, here are a couple things you might want to do to simplify your config and improve performance:

server.event-handler  = "freebsd-kqueue"
server.network-backend  = "writev"

If you leave these out, lighttpd will prefer to use kqueue on systems where available, so that is the default.
If you leave out server.network-backend, then lighttpd will prefer the system sendfile if available. In the distant past, you may have needed to specify server.network-backend = "writev" on FreeBSD on embedded systems, since not all filesystems support sendfile, especially some flash filesystems. However, lighttpd has since been enhanced to fall back to writev if sendfile fails.

Most of the OPNsense ssl.* configuration directives are either the default or have been superseded. lighttpd 1.4.56 and later use "MinProtocol" => "TLSv1.2" by default. Also, when your cipher list is restricted to a set which only supports Perfect Forward Secrecy, it is beneficial (and not a reduction in security) to allow the client to choose its preferred cipher. This can greatly speed up mobile clients, which might prefer CHACHA20 over AESGCM since some mobile devices do not have hardware accelerated AESGCM. Much of this is documented at https://redmine.lighttpd.net/projects/lighttpd/wiki/Docs_SSL though it can be pretty dense reading, which is why simpler SSL configs are better.

ssl.pemfile = "..."
ssl.ca-file = "..."
ssl.openssl.ssl-conf-cmd = ("MinProtocol" => "TLSv1.2",
                            "Options" => "-ServerPreference",
                            "CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20:!SHA1:!SHA256:!SHA384")

$SERVER["socket"] == "*:443"     { ssl.engine = "enable" }
$SERVER["socket"] == "*[::]:443" { ssl.engine = "enable" }

With the above config, it is not necessary to set ssl.disable-client-renegotiation = "enable", which is the default in lighttpd, and it is not necessary to set ssl.dh-file since a proper set of parameters are used by (modern) TLS libraries. Similar for ssl.ec-curve.

lighttpd sets "HTTPS" => "on" in the environment when necessary, so you should not need to do so.

The above config "MinProtocol" => "TLSv1.2" supersedes ssl.use-sslv2 = "disable" and ssl.use-sslv3 = "disable" and, along with a restricted cipher list, changes ssl.honor-cipher-order. However, if a custom cipher list is configured by the OPNsense user, you may want to omit "Options" => "-ServerPreference" and have lighttpd use the lighttpd default which historically has been the equivalent of "Options" => "ServerPreference" (ssl.honor-cipher-order = "enable")

In addition, as you currently do, "Strict-Transport-Security" should be set when configured.

$HTTP["scheme"] == "https" {
            setenv.add-response-header = ("Strict-Transport-Security" => "max-age=15768000" )
}

https://redmine.lighttpd.net/projects/lighttpd/wiki/Docs_ModSetEnv#HTTP-Strict-Transport-Security-HSTS

gstrauss on 5 Jan 2021

❤3

@fichtner wrote in commit f29c0b9 :

We could support TLS 1.3 now but that seems to be an exclusive option without a fallback.

FYI: not exclusive. "MinProtocol" => "TLSv1.2" specifies the minimum protocol level, not the maximum. If the underlying TLS library (e.g. openssl) supports TLSv1.3, then "MinProtocol" => "TLSv1.2" tells the TLS library to permit TLSv1.2 and TLSv1.3, but not anything earlier than TLSv1.2

gstrauss on 10 Jan 2021

@gstrauss sorry, I have misread the info provided in your document:

# STRONGEST: As of Sep 2020, for use w/ modern clients only; not compat w/ older clients
#ssl.openssl.ssl-conf-cmd = ("MinProtocol" => "TLSv1.3",
#                            "Options" => "-ServerPreference")

Since we support LibreSSL (currently 3.1) as well as OpenSSL my TLS 1.3 tests are lacking when using LibreSSL variant.

Cheers,
Franco

fichtner on 10 Jan 2021
