Core: CARP: leaving maintenance mode on unit1 after reboot doesn't change advskew

Created on 26 Aug 2019 · 6 Comments · Source: opnsense/core

[X] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md

[X] I have searched the existing issues and I'm convinced that mine is new.

Describe the bug
As discussed via IRC, it seems that since patch https://github.com/opnsense/core/commit/0e9912c374eb0538194727f747ad681d38ab6c64, when you put unit 1 into maintenance mode, reboot, and then leave maintenance mode again, it stays BACKUP instead of becoming MASTER because advskew is still at 254.
@AdSchellevis's idea was to refactor https://github.com/opnsense/core/blob/c3ccc63fd184168e22822fe49aabf7e7e8b40d1a/src/etc/inc/interfaces.inc#L1711-L1716 since we now have demotion.
Reference: https://forum.opnsense.org/index.php?topic=13987.msg64276#new

OPNsense is on the latest version, 19.7.2.

bug

All 6 comments

Just as a note: as mentioned in https://www.thomas-krenn.com/en/wiki/OPNsense_HA_Cluster_configuration, after the reboot "net.inet.carp.demotion" on firewall 1 was 0 again (not 240).

So if the refactoring simply sets advskew back to 0, I assume "Enter Persistent CARP Maintenance Mode" would not persist across another reboot.

I think the desired behaviour of "Enter Persistent CARP Maintenance Mode" would be:

  1. When "Enter Persistent CARP Maintenance Mode" is selected on firewall 1, the other node (firewall 2) should become MASTER for all CARP IPs.
  2. When firewall 1 is rebooted while in "Persistent CARP Maintenance Mode", firewall 1 should stay BACKUP for all CARP IPs, while firewall 2 remains MASTER.
  3. When "Leave Persistent CARP Maintenance Mode" is clicked on firewall 1, firewall 1 should become MASTER again.

Would it be possible for demotion to stay at 240 on firewall 1 even after the reboot (when firewall 1 is in "Persistent CARP Maintenance Mode")?

@tk-wfischer the proposed change is to stop fiddling with advskew and set demotion on early boot, similar to what we're going to do for https://github.com/opnsense/core/issues/3636

opnsense-patch 28cc0dc
rm /usr/local/etc/rc.syshook.d/early/98_carp_maintenance.orig 

should do the trick.

@mimugmail @tk-wfischer can you check on your end?
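
For readers wondering why demotion alone is enough without touching advskew: as far as I understand FreeBSD's CARP, a node advertises its configured advskew plus net.inet.carp.demotion, capped at 240, and with equal advbase the node with the lower value wins. A minimal sketch under that assumption (the function names and the clamping at 0 are illustrative assumptions, this is not code from the patch):

```shell
#!/bin/sh
# Assumption: effective skew = advskew + demotion, clamped to 0..240.
CARP_MAXSKEW=240

effective_advskew() {
    # $1 = configured advskew, $2 = current net.inet.carp.demotion value
    sum=$(( $1 + $2 ))
    if [ "$sum" -gt "$CARP_MAXSKEW" ]; then
        sum=$CARP_MAXSKEW
    elif [ "$sum" -lt 0 ]; then
        sum=0
    fi
    echo "$sum"
}

preferred_master() {
    # $1 = fw1 effective skew, $2 = fw2 effective skew (same advbase assumed).
    # The lower skew advertises more frequently and wins the election.
    if [ "$1" -lt "$2" ]; then
        echo fw1
    elif [ "$1" -gt "$2" ]; then
        echo fw2
    else
        echo tie
    fi
}

# fw1 in maintenance mode (demotion 240), fw2 with advskew 100.
fw1=$(effective_advskew 0 240)
fw2=$(effective_advskew 100 0)
echo "maintenance: fw1=$fw1 fw2=$fw2 master=$(preferred_master "$fw1" "$fw2")"

# fw1 after leaving maintenance mode (demotion back to 0).
fw1=$(effective_advskew 0 0)
echo "normal:      fw1=$fw1 fw2=$fw2 master=$(preferred_master "$fw1" "$fw2")"
```

With the values from this cluster (fw1 advskew 0, fw2 advskew 100) this prints fw2 as master during maintenance and fw1 afterwards, so advskew itself never has to change.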

@AdSchellevis first of all, thank you very much for your friendly and kind support.

Now it works as expected. :-) Here is the full log:

fw1: default MASTER
fw2: default BACKUP

Initial state:
==============

fw1:
~~~~
root@fw1:~ # ifconfig | grep carp
    carp: MASTER vhid 3 advbase 1 advskew 0
    carp: MASTER vhid 1 advbase 1 advskew 0
root@fw1:~ # sysctl net.inet.carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1

fw2:
~~~~
root@fw2:~ # ifconfig | grep carp
    carp: BACKUP vhid 3 advbase 1 advskew 100
    carp: BACKUP vhid 1 advbase 1 advskew 100
root@fw2:~ # sysctl net.inet.carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1

Test 1: fw1 "Enter Persistent CARP Maintenance Mode"
====================================================

fw1:
~~~~
root@fw1:~ # clog -f /var/log/system.log
Aug 27 10:05:50 fw1 kernel: carp: demoted by 240 to 240 (sysctl)
Aug 27 10:05:50 fw1 kernel: carp: 1@igb1: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 10:05:50 fw1 kernel: ifa_maintain_loopback_route: deletion failed for interface igb1: 3
Aug 27 10:05:50 fw1 kernel: carp: 3@igb0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 10:05:50 fw1 kernel: ifa_maintain_loopback_route: deletion failed for interface igb0: 3
Aug 27 10:05:51 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.1.102.253 - Virtual WAN IP (1@igb1)" has resumed the state "BACKUP" for vhid 1 
Aug 27 10:05:51 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.1 - Virtual LAN IP (3@igb0)" has resumed the state "BACKUP" for vhid 3 

fw2:
~~~~
root@fw2:~ # clog -f /var/log/system.log
Aug 27 10:05:50 fw2 kernel: carp: 1@igb1: BACKUP -> MASTER (preempting a slower master)
Aug 27 10:05:50 fw2 kernel: carp: 3@igb0: BACKUP -> MASTER (preempting a slower master)
Aug 27 10:05:51 fw2 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.1.102.253 - Virtual WAN IP (1@igb1)" has resumed the state "MASTER" for vhid 1 
Aug 27 10:05:51 fw2 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.1 - Virtual LAN IP (3@igb0)" has resumed the state "MASTER" for vhid 3 


Test 2: Reboot of fw1:
======================

fw1:
~~~~
root@fw1:~ # clog /var/log/system.log | grep carp
[...]
Aug 27 10:08:18 fw1 kernel: carp: demoted by 240 to 240 (sysctl)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 480 (interface down)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 720 (pfsync bulk start)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 960 (interface down)
Aug 27 10:08:23 fw1 kernel: carp: 3@igb0: INIT -> BACKUP (initialization complete)
Aug 27 10:08:23 fw1 kernel: carp: demoted by -240 to 720 (interface up)
Aug 27 10:08:24 fw1 kernel: carp: 1@igb1: INIT -> BACKUP (initialization complete)
Aug 27 10:08:24 fw1 kernel: carp: demoted by -240 to 480 (interface up)
Aug 27 10:08:24 fw1 kernel: carp: demoted by -240 to 240 (pfsync bulk done)
Aug 27 10:08:24 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.1 - Virtual LAN IP (3@igb0)" has resumed the state "BACKUP" for vhid 3 
Aug 27 10:08:25 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.1.102.253 - Virtual WAN IP (1@igb1)" has resumed the state "BACKUP" for vhid 1 
root@fw1:~ # ifconfig | grep carp
    carp: BACKUP vhid 3 advbase 1 advskew 0
    carp: BACKUP vhid 1 advbase 1 advskew 0
root@fw1:~ # sysctl net.inet.carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 240
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1
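
The demotion counter in the reboot log above can be replayed by summing the "demoted by" deltas. A small sketch (the function name carp_demotion_replay is made up for illustration; it assumes the kernel log format shown above and a counter that starts at 0 after boot):

```shell
#!/bin/sh
# Replays "carp: demoted by N to M" kernel log lines read from stdin and
# prints the running total next to the value the kernel reported.
carp_demotion_replay() {
    awk '/carp: demoted by/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "by") { delta = $(i + 1); reported = $(i + 3) }
        }
        total += delta
        printf "%+d -> %d (log says %d)\n", delta, total, reported
    }'
}

# The seven demotion events from the Test 2 log.
carp_demotion_replay <<'EOF'
Aug 27 10:08:18 fw1 kernel: carp: demoted by 240 to 240 (sysctl)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 480 (interface down)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 720 (pfsync bulk start)
Aug 27 10:08:19 fw1 kernel: carp: demoted by 240 to 960 (interface down)
Aug 27 10:08:23 fw1 kernel: carp: demoted by -240 to 720 (interface up)
Aug 27 10:08:24 fw1 kernel: carp: demoted by -240 to 480 (interface up)
Aug 27 10:08:24 fw1 kernel: carp: demoted by -240 to 240 (pfsync bulk done)
EOF
```

The last line shows a total of 240: the temporary boot-time demotions (interface down, pfsync bulk) cancel out and only the 240 set via sysctl remains, which matches net.inet.carp.demotion: 240 above.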


Test 3: fw1 "Leave Persistent CARP Maintenance Mode"
====================================================

fw1:
~~~~
root@fw1:~ # clog -f /var/log/system.log
[...]
Aug 27 10:11:05 fw1 opnsense: /index.php: Successful login for user 'root' from: 192.168.1.127 
Aug 27 10:11:24 fw1 kernel: carp: demoted by -240 to 0 (sysctl)
Aug 27 10:11:24 fw1 kernel: carp: 1@igb1: BACKUP -> MASTER (preempting a slower master)
Aug 27 10:11:24 fw1 kernel: carp: 3@igb0: BACKUP -> MASTER (preempting a slower master)
Aug 27 10:11:25 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.1.102.253 - Virtual WAN IP (1@igb1)" has resumed the state "MASTER" for vhid 1 
Aug 27 10:11:25 fw1 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.1 - Virtual LAN IP (3@igb0)" has resumed the state "MASTER" for vhid 3 
root@fw1:~ # ifconfig | grep carp
    carp: MASTER vhid 3 advbase 1 advskew 0
    carp: MASTER vhid 1 advbase 1 advskew 0
root@fw1:~ # sysctl net.inet.carp
net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.allow: 1

fw2:
~~~~
root@fw2:~ # clog -f /var/log/system.log
Aug 27 10:11:24 fw2 kernel: carp: 1@igb1: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 10:11:24 fw2 kernel: ifa_maintain_loopback_route: deletion failed for interface igb1: 3
Aug 27 10:11:24 fw2 kernel: carp: 3@igb0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 10:11:24 fw2 kernel: ifa_maintain_loopback_route: deletion failed for interface igb0: 3
Aug 27 10:11:25 fw2 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "10.1.102.253 - Virtual WAN IP (1@igb1)" has resumed the state "BACKUP" for vhid 1 
Aug 27 10:11:25 fw2 opnsense: /usr/local/etc/rc.syshook.d/carp/20-openvpn: Carp cluster member "192.168.1.1 - Virtual LAN IP (3@igb0)" has resumed the state "BACKUP" for vhid 3 
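
In case someone wants to repeat these tests, the manual "ifconfig | grep carp" checks could be scripted roughly like this. It is only a sketch: the function name carp_all_in_state is made up, and it assumes the ifconfig line format shown in the output above.

```shell
#!/bin/sh
# Succeeds only if ifconfig output on stdin contains at least one "carp:"
# line and every such line reports the expected state.
carp_all_in_state() {
    # $1 = expected state, e.g. MASTER or BACKUP
    awk -v want="$1" '
        $1 == "carp:" { seen++; if ($2 != want) bad++ }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }'
}

# Demo with the fw1 output from Test 2 (after the reboot in maintenance mode).
if carp_all_in_state BACKUP <<'EOF'
    carp: BACKUP vhid 3 advbase 1 advskew 0
    carp: BACKUP vhid 1 advbase 1 advskew 0
EOF
then
    echo "fw1: all vhids are BACKUP"
else
    echo "fw1: unexpected CARP state"
fi
```

On a live firewall the same check would be "ifconfig | carp_all_in_state MASTER" (or BACKUP), run on each node after every step.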

The only thing I noticed is that the syshook has "openvpn" in its name (I did not configure OpenVPN):

  • /usr/local/etc/rc.syshook.d/carp/20-openvpn
    Is this just for historical reasons?

From a functional perspective, the patch fixes my problem :-) Thank you again for your fast support.

@awerner you're more than welcome, thanks for testing!

The openvpn hook is triggered on a CARP state change and reloads OpenVPN clients if needed. It's very practical: you can perform actions on these (devd) events. In core, only OpenVPN clients need this.

Confirmed, also working as expected on my end :)
