Core: interfaces: dhclient-script regression in 19.1

Created on 6 Feb 2019  路  92Comments  路  Source: opnsense/core

Just want to be sure the dev are aware of "our" issue.

After 10-20 minutes of uptime all incoming connections are beeing dropped!
So, OpenVPN tunnels are dropped too. It was fine on 18.7.10.

In the General-system log I can see this every 20 minutes:

Feb 5 13:00:55  opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address
Feb 5 13:00:55  opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: updating cache file /var/cache/dyndns.org_0.cache: ***
Feb 5 13:00:50  opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 5 13:00:50  opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns2'

Feb 5 13:00:49  kernel: ovpns2: link state changed to UP
Feb 5 13:00:44  kernel: ovpns2: link state changed to DOWN
Feb 5 13:00:44  opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.

Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv6 default route
Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway '***'
Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to ***
Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Feb 5 13:00:40  opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 5 13:00:39  opnsense: /usr/local/etc/rc.newwanip: On (IP address: ***) (interface: WAN[wan]) (real interface: re1_vlan10).
Feb 5 13:00:39  opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'

after every execution of rc.newwanip (even manually) all connection are dropped.

I watch TV over the internet and every 20 minutes the stream hangs, so I have to rewind to build up the stream again.

Here is the forum post:
https://forum.opnsense.org/index.php?topic=11456.0

Is there any solution? What could it be?
According to github, the last changes to rc.newwanip where 5 month ago (https://github.com/opnsense/core/commits/master/src/etc/rc.newwanip).
I think it has something to do with the switch to HardenedBSD, but I am absolutly not a unix guy...

bug

Most helpful comment

We don't know what the error is, so it will come back if we just revert fixes for other problems. Not to mention having issues back that have been fixed with this. Does that make sense?

All 92 comments

OPNsense 19.1.1-amd64
FreeBSD 11.2-RELEASE-p8-HBSD
OpenSSL 1.0.2q 20 Nov 2018

My OPNsense is running on a apu1d4 from PC Engines for some year without any issue.

could be driver (realtek), other connectivity issues. or when ips is enabled related to https://github.com/opnsense/core/issues/3175

IPS is not enabled, ipv6 is also disabled.
Did the realtek drivers change in 19.1.1?

The newwanip script is called every 20 minutes. after every run my connections are being droped.
Is something new in the dhcp? I'm just guessing

newwanip is likely triggered due to another event, maybe dmesg has more info (or there's data before IP renewal is starting on 're1_vlan10') assuming that 19.1 has the same issue (different kernel than 18.7.x series)

The data before your mentioned line is the end of my mentioned line 馃槃


Feb 6 16:50:19 | opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'
-- | --
Feb 6 16:30:42 | opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address


how can i output dmesg? I have ssh access.

just dmesg in a console

I see a lot more of:

ovpns2: link state changed to DOWN
ovpns2: link state changed to UP
ovpns2: link state changed to DOWN
ovpns2: link state changed to UP

you need the whole output?

cause and effect... ovpns2 is likely going down, because the interface it's bound to is gone...

so, yes, the logging provided so far doesn't tell a lot about your issue.

WAN connection is gone too, I cannot find anything in dmesg output.

Copyright (c) 2013-2018 The HardenedBSD Project.
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.2-RELEASE-p8-HBSD  31af16db12b(stable/19.1) amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
HardenedBSD: initialize and check features (__HardenedBSD_version 1100056 __FreeBSD_version 1102000).
CPU: AMD G-T40E Processor (1000.02-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x500f20  Family=0x14  Model=0x2  Stepping=0
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802209<SSE3,MON,SSSE3,CX16,POPCNT>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,IBS,SKINIT,WDT>
  SVM: NP,NRIP,NAsids=8
  TSC: P-state invariant, performance statistics
real memory  = 4815060992 (4592 MB)
avail memory = 4067450880 (3879 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <CORE   COREBOOT>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
ioapic0 <Version 2.1> irqs 0-23 on motherboard
SMP: AP CPU #1 Launched!
Timecounter "TSC" frequency 1000019124 Hz quality 800
random: entropy device external interface
wlan: mac acl policy registered
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff8113fe30, 0) error 19
kbd0 at kbdmux0
nexus0
cryptosoft0: <software crypto> on motherboard
acpi0: <CORE COREBOOT> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 450
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 4.0 on pci0
pci1: <ACPI PCI bus> on pcib1
re0: <Realtek PCIe GBE Family Controller> port 0x1000-0x10ff mem 0xf7a00000-0xf7a00fff,0xf7900000-0xf7903fff irq 16 at device 0.0 on pci1
re0: Using Memory Mapping!
re0: Using 1 MSI-X message
re0: ASPM disabled
re0: version:1.95.00
re0: Ethernet address: 00:0d:b9:3f:92:14

This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.
re0: Ethernet address: 00:0d:b9:3f:92:14
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 5.0 on pci0
pci2: <ACPI PCI bus> on pcib2
re1: <Realtek PCIe GBE Family Controller> port 0x2000-0x20ff mem 0xf7c00000-0xf7c00fff,0xf7b00000-0xf7b03fff irq 17 at device 0.0 on pci2
re1: Using Memory Mapping!
re1: Using 1 MSI-X message
re1: ASPM disabled
re1: version:1.95.00
re1: Ethernet address: 00:0d:b9:3f:92:15

This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.
re1: Ethernet address: 00:0d:b9:3f:92:15
pcib3: <ACPI PCI-PCI bridge> irq 18 at device 6.0 on pci0
pci3: <ACPI PCI bus> on pcib3
re2: <Realtek PCIe GBE Family Controller> port 0x3000-0x30ff mem 0xf7e00000-0xf7e00fff,0xf7d00000-0xf7d03fff irq 18 at device 0.0 on pci3
re2: Using Memory Mapping!
re2: Using 1 MSI-X message
re2: ASPM disabled
re2: version:1.95.00
re2: Ethernet address: 00:0d:b9:3f:92:16

This product is covered by one or more of the following patents:
US6,570,884, US6,115,776, and US6,327,625.
re2: Ethernet address: 00:0d:b9:3f:92:16
ahci0: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 0x4010-0x4017,0x4020-0x4023,0x4018-0x401f,0x4024-0x4027,0x4000-0x400f mem 0xf7f04000-0xf7f043ff irq 19 at device 17.0 on pci0
ahci0: AHCI v1.20 with 6 6Gbps ports, Port Multiplier supported
ahci0: quirks=0x22000<ATI_PMP_BUG,1MSI>
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich5: <AHCI channel> at channel 5 on ahci0
ohci0: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xf7f00000-0xf7f00fff irq 18 at device 18.0 on pci0
usbus0 on ohci0
usbus0: 12Mbps Full Speed USB v1.0
ehci0: <AMD SB7x0/SB8x0/SB9x0 USB 2.0 controller> mem 0xf7f04400-0xf7f044ff irq 17 at device 18.2 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
usbus1: 480Mbps High Speed USB v2.0
ohci1: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xf7f01000-0xf7f01fff irq 18 at device 19.0 on pci0
usbus2 on ohci1
usbus2: 12Mbps Full Speed USB v1.0
ehci1: <AMD SB7x0/SB8x0/SB9x0 USB 2.0 controller> mem 0xf7f04500-0xf7f045ff irq 17 at device 19.2 on pci0
usbus3: EHCI version 1.0
usbus3 on ehci1
usbus3: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
pcib4: <ACPI PCI-PCI bridge> at device 20.4 on pci0
pci4: <ACPI PCI bus> on pcib4
ohci2: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xf7f02000-0xf7f02fff irq 18 at device 20.5 on pci0
usbus4 on ohci2
usbus4: 12Mbps Full Speed USB v1.0
pcib5: <ACPI PCI-PCI bridge> at device 21.0 on pci0
pci5: <ACPI PCI bus> on pcib5
ohci3: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xf7f03000-0xf7f03fff at device 22.0 on pci0
usbus5 on ohci3
usbus5: 12Mbps Full Speed USB v1.0
ehci2: <AMD SB7x0/SB8x0/SB9x0 USB 2.0 controller> mem 0xf7f04600-0xf7f046ff at device 22.2 on pci0
usbus6: EHCI version 1.0
usbus6 on ehci2
usbus6: 480Mbps High Speed USB v2.0
acpi_button0: <Power Button> on acpi0
orm0: <ISA Option ROM> at iomem 0xee800-0xeffff on isa0
ppc0: cannot reserve I/O port range
uart0: <16550 or compatible> at port 0x3f8 irq 4 flags 0x10 on isa0
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> at port 0x2f8 irq 3 on isa0
Timecounters tick every 1.000 msec
ugen6.1: <ATI EHCI root HUB> at usbus6
ugen5.1: <ATI OHCI root HUB> at usbus5
uhub0: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
uhub1: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
ugen2.1: <ATI OHCI root HUB> at usbus2
ugen4.1: <ATI OHCI root HUB> at usbus4
uhub2: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <ATI EHCI root HUB> at usbus3
uhub3: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
uhub4: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
ugen0.1: <ATI OHCI root HUB> at usbus0
uhub5: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <ATI EHCI root HUB> at usbus1
uhub6: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <SATA SSD S9FM02.2> ACS-3 ATA SATA 3.x device
ada0: Serial Number 8EA807550B3200389438
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 15272MB (31277232 512 byte sectors)
Trying to mount root from ufs:/dev/ada0s1a [rw,noatime]...
uhub3: 2 ports with 2 removable, self powered
uhub1: 4 ports with 4 removable, self powered
uhub2: 5 ports with 5 removable, self powered
uhub5: 5 ports with 5 removable, self powered
random: unblocking device.
uhub0: 4 ports with 4 removable, self powered
uhub4: 5 ports with 5 removable, self powered
uhub6: 5 ports with 5 removable, self powered
ugen6.2: <Generic Flash Card ReaderWriter> at usbus6
umass0 on uhub0
umass0: <Generic Flash Card ReaderWriter, class 0/0, rev 2.01/1.00, addr 2> on usbus6
umass0:  SCSI over Bulk-Only; quirks = 0x4001
umass0:6:0: Attached to scbus6
da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
da0: <Multiple Card  Reader 1.00> Removable Direct Access SPC-2 SCSI device
da0: Serial Number 058F63666485
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da0: quirks=0x2<NO_6_BYTE>
amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb4
tun2: changing name to 'ovpns2'
re1: link state changed to UP
vlan0: changing name to 're1_vlan10'
re2: link state changed to UP
vlan1: changing name to 're2_vlan2'
re0: link state changed to UP
pflog0: promiscuous mode enabled
ovpns2: link state changed to UP
ovpns2: link state changed to DOWN
ovpns2: link state changed to UP
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to accept, logging disabled
DUMMYNET 0 with IPv6 initialized (100409)
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
load_dn_sched dn_sched FQ_CODEL loaded
load_dn_sched dn_sched FQ_PIE loaded
load_dn_aqm dn_aqm CODEL loaded
load_dn_aqm dn_aqm PIE loaded
ovpns2: link state changed to DOWN
ovpns2: link state changed to UP
ovpns2: link state changed to DOWN
ovpns2: link state changed to UP

Do you know where ovpns2 is listening on? WAN perhaps?

Yes WAN

Do you get a new IPv4 on WAN every time this happens?

No, the IP is always the same, although I dont have a fixed ip.

Please check /var/db/wan_cacheip and /var/db/wan_ip -- they shouldn't match, correct?

But they match O_o

It's not the driver issue, i have the same on different hardware. And i have static IP on WAN, but issued by DHCP.

Let's try to find it... this one looks like a good candidate... c83bb8d:

# opnsense-patch c83bb8d

Same issue here, other hardware (Intel NIC's). My ISP has had a DHCP renew time of 30 seconds and after upgrade to 19.1.1 I get every 30 seconds a drop of my OpenVPN interface.

In the meantime my ISP has changed the DHCP renew time to 12 hours => no issues so far.

Additionally the NTPD service get's also and exit signal during the rc.newwanip script:

ntpd[56403]: ntpd exiting on signal 15 (Terminated)

I have figured out how to easy manually reproduce the issue (VPN interface down/up + the NTPD exit signal in my case):

In the Web Interface go to "Interfaces > Overview" and just klick "Reload" on the WAN interface

Applied the patch c83bb8d and klicked "Reload" still causes my ovpns1 link DOWN/UP:

Feb 6 19:24:57  opnsense: /usr/local/etc/rc.dyndns: Dynamic DNS (SOMEWHAT.dyndns.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: skipping IPv6 default route
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: keeping current default gateway 'ISPGWIP'
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: setting IPv4 default route to ISPGWIP
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: no IPv6 default gateway set, assuming wan
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: IPv4 default gateway set to wan
Feb 6 19:24:56  opnsense: /usr/local/etc/rc.configure_interface: ROUTING: entering configure using 'wan'
Feb 6 19:24:55  opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS (SOMEWHAT.dyndns.org): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Feb 6 19:24:54  opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 6 19:24:54  opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Feb 6 19:24:54  kernel: ovpns1: link state changed to UP
Feb 6 19:24:54  kernel: ovpns1: link state changed to DOWN
Feb 6 19:24:54  opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv6 default route
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway 'ISPGWIP'
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to ISPGWIP
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: no IPv6 default gateway set, assuming wan
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: On (IP address: MYIPADDRESS) (interface: WAN[wan]) (real interface: em0).
Feb 6 19:24:53  sshlockout[11128]: sshlockout/webConfigurator v3.0 starting up
Feb 6 19:24:53  opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'em0'

and NTPD exit signal:

Feb 6 19:24:56 | ntpd[94413]: ntpd exiting on signal 15 (Terminated)

I had the same (or at least a similar) issue (IP renewal starting on WAN every 30 minutes), then applied c83bb8d about 75 minutes ago, and rebooted. Since then, no more IP renewal.

Patched again + reboot... still VPN link down/up during IP renewal.

After patching the tv stream didn't drop for over 1 hour. Tomorrow I can test if the vpn connection is stable again.

Thank you very much for your quick interaction.

I can't see any newwanip logs over the night. So it ran stable I think.

I know now why the connection dropped everytime... I had "Firewall: Settings: Advanced: Dynamic state reset" active (_Reset all states when a dynamic IP address changes.This option flushes the entire state table on IPv4 address changes in dynamic setups to e.g. allow VoIP servers to re-register._)
But the IP didn't change, so this trigger shouldn't have been triggered.

But thank you for the patch!

What happens to the patch if I install some updates through the web gui? Does it persist?

Got it, the "Reload" button forces the rc.newwanip script but the patch doesn't call it from dhclient if it's not a real IP change, right?

So the patch works also for me!

There was a DHCP triggered renewal try on my WAN without real IP change a few minutes ago (log entry in the dhcp log). But nothing to see about the rc.newwanip in general system log and also no exit signal in NTPD log.

So now my ISP can go back to stupid 30 second IP renewal time ;-)

Thanks a lot!!!

Same question as tbandixen: Will this patch persist or will it find the way into updates?

The patch will not persist into 19.1.2 and needs to be reapplied worst case. opnsense-patch was designed as a tool to test changes between versions and to troubleshoot.

As the change for has been proposed in issue #2542 for a long time for a good reason we now need to find the problem contained within the patch itself as a way to go forward with a better path than just to go back to what we had before.

I understand that, if I again run the # opnsense-patch c83bb8d it will revert the patch, right?

I'd like to test the behaviour without the active setting "Firewall: Settings: Advanced: Dynamic state reset".

Yes, that will revert it again. If the patching should go wrong for any reason (the utility would tell you) then you can also revert the whole opnsense package to its last known good release with this:

# opnsense-revert opnsense

PS: Thanks for your help!

Hello

Don't spam Enter please? I find this Issue curious

@fichtner I don't understand this fixed or what 'The patch will not persist 19.1.2'. First and foremost, I've been observing, this is active project. With that said, I don't know what the patch does? Is it a configuration problem?

Please take the time to read the posts in this issue and the patch itself and the originating issue. I am not here to give you a summary and it's quite impolite also to everyone else posting debug data here.

The VPN connections drops again after reverting the patch. With the same logging information.
Trying to analyze the patch and maybe I found a type?
https://github.com/opnsense/core/commit/c83bb8d286dbcd5ffcc257027a15782057efd11d#diff-0cf30baaca1e01958ae0680c8681bb6eR48
Shouldn't $new_hostname be $new_host_name

Well, you are looking at the revert, which means the bug was removed in 19.1, not introduced.

Better to review the original ticket: https://github.com/opnsense/core/commit/882ce283fd1210a106ca2d953fc169958987cfbe

It might be related.

You don't have to, I've read you enough to have idea the person you are @fichtner. To say I won't follow up, I asking me politely not to comment?

Right, I was confused... Looking at the original ticket now

I have the firewall installed, I need to know it's stable... Yes I am doing the follow up right now

@lhorace last warning: please stop spamming

Okay, I didn't know I was spamming, I will switch back to pfsense

You should banned now

Thanks, but you are indeed still spamming ;)

Yeah my history tells a different story

After reapplying the patch, I looked at the /var/log/dhcpd.log file.

If the patch is applied there are logs like this every dhcp cycle (20 minutes for me):

Feb  7 03:48:28 apu dhclient[70232]: DHCPREQUEST on re1_vlan10 to X.X.X.X port 67
Feb  7 03:48:28 apu dhclient[70232]: ip length 328 disagrees with bytes received 332.
Feb  7 03:48:28 apu dhclient[70232]: accepting packet with data after udp payload.
Feb  7 03:48:28 apu dhclient[70232]: DHCPACK from X.X.X.X
Feb  7 03:48:28 apu dhclient: RENEW
Feb  7 03:48:28 apu dhclient: Creating resolv.conf
Feb  7 03:48:28 apu dhclient[70232]: bound to Y.Y.Y.Y -- renewal in 1200 seconds.

Without the patch these logs are looking like this:

Feb  7 09:28:31 apu dhclient[70232]: DHCPREQUEST on re1_vlan10 to X.X.X.X port 67
Feb  7 09:28:32 apu dhclient[70232]: ip length 328 disagrees with bytes received 332.
Feb  7 09:28:32 apu dhclient[70232]: accepting packet with data after udp payload.
Feb  7 09:28:32 apu dhclient[70232]: DHCPACK from X.X.X.X
Feb  7 09:28:32 apu dhclient: Creating resolv.conf
Feb  7 09:28:57 apu dhclient[70232]: bound to Y.Y.Y.Y -- renewal in 1200 seconds.

The line dhclient: RENEW is missing, how comes?
Could that be the difference?

Where are these variables alias_ip_address, old_ip_address and new_ip_address (in the dhcpclient-script) being set/defined?

--- Edit
I think I found it:
https://www.freebsd.org/cgi/man.cgi?query=dhclient-script&sektion=8&manpath=freebsd-release-ports

It is set in dhclient binary before dispatching dhclient-script. Sorry, I will follow up later, stuck at my day job at the moment.

So RENEW is missing because of https://github.com/opnsense/core/commit/882ce283fd1210a106ca2d953fc169958987cfbe#diff-0cf30baaca1e01958ae0680c8681bb6eL311

Can you please apply 2e314c0 and save from system: settings: logging to move the dhclient (and dhcp6c) log to the system log? it will be easier to have all in one place.

# opnsense-patch 2e314c0

Not sure how useful this is, and whether it's a bug or intentional, but in /usr/local/etc/rc.newwanip line 83:

$configip = $config['interfaces'][$interface]['ipaddr'];

resolves to the string 'dhcp'. This causes the check in line 161:

if (!is_ipaddr($cacheip) || $ip != $cacheip || !is_ipaddr($configip)) {

to evaluate to 1/True (specifically !is_ipaddr($configip)), so that code block is entered every time rc.newwanip is run, which restarts everything AFAICT. I have no idea if any of that behavior has changed recently--it sounds like it hasn't--but for the time being I just removed

|| !is_ipaddr($configip)

from the check in line 161 to get the OpenVPN connection to stop dropping every 5 minutes. In hindsight I wish I'd found this ticket with the patch first, but maybe this will be useful.

I applied 2e314c0 (without c83bb8d from before), and I just saved "System: Settings: Logging" without any changes, thats right?

@tbandixen that's correct

@cejohnson thanks. that's been this way for a long time however ;(

I've reviewed the "bad" patch line by line and nothing really jumps out there that would explain all of it. We need more data points, hopefully a full log and the $LOGGER $reason addition will reveal what is going on.

Somehow there are missing entries?

Feb  8 11:18:06 apu dhclient: Creating resolv.conf
Feb  8 11:18:06 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'
Feb  8 11:18:06 apu opnsense: /usr/local/etc/rc.newwanip: On (IP address: X.X.X.X) (interface: WAN[wan]) (real interface: re1_vlan10).
Feb  8 11:18:06 apu kernel: stf0: changing name to 're1_vlan10_stf'
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to Y.Y.Y.Y
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway 'Y.Y.Y.Y'
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to Z.Z.Z.Z
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/re1_vlan10_stf_defaultgwv6
Feb  8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/re1_vlan10_stf_defaultgwv6 using 'Z.Z.Z.Z'
Feb  8 11:18:13 apu opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb  8 11:18:13 apu kernel: ovpns2: link state changed to DOWN
Feb  8 11:18:20 apu kernel: ovpns2: link state changed to UP
Feb  8 11:18:21 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns2'
Feb  8 11:18:21 apu opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb  8 11:18:28 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: updating cache file /var/cache/dyndns.org_0.cache: X.X.X.X
Feb  8 11:18:28 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address

@fichtner please review logs in #3200 based on save 2e314c0 (without c83bb8d from before). If you are under the impression that this issue is the same as #3200 should I concentrate debug logs here?

I have a similar behaviour. After upgrade to 19.1 (using 19.1.1 now) I get this:

gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP

My SSL VPN connection breaks like every 5 minutes, even though I have regen set to 7200 seconds.

And my SSL VPN doesn't go over the GIF, it goes over WAN, so I think it's odd.

This is from my general log:

```

Feb 8 17:00:32 | opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Error) Authentication failed
-- | --
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.filter_configure: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.filter_configure: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Feb 8 17:00:29 | kernel: ovpns1: link state changed to UP
Feb 8 17:00:28 | opnsense: /usr/local/etc/rc.filter_configure: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:28 | opnsense: /usr/local/etc/rc.filter_configure: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:27 | kernel: ovpns1: link state changed to DOWN
Feb 8 17:00:27 | opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Cannot switch while 0 inet gateways are up
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Adding static route for monitor 9.9.9.9 via 90.232.8.233
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Removing static route for monitor 9.9.9.9 via 90.232.8.233
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv6 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to 213.67.96.1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Clearing states for stale opt7 route on gif0
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
聽```

After doing opnsense-patch c83bb8d it's still up, and it's been 28 minutes now. Seems promising. :D

I'm wondering if its actually https://github.com/opnsense/core/commit/a1dbbb5e since that is new for 19.1 as well coming from FreeBSD directly but I don't really understand how it could cause such a disruptive behaviour...

# opnsense-revert opnsense
# opnsense-patch a1dbbb5e

A yes or no on that would be helpful. Thank you.

PS: There's also b20f71b to try but looks like low impact

I'm wondering if its actually a1dbbb5 since that is new for 19.1 as well coming from FreeBSD directly but I don't really understand how it could cause such a disruptive behaviour...

# opnsense-revert opnsense
# opnsense-patch a1dbbb5e

A yes or no on that would be helpful. Thank you.

It looks promising, I cant say yes, but maybe. The TV stream doesnt drop anymore, but now I'm not connected throught VPN, but the logfile looks like this:

Feb  9 12:38:16 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'
Feb  9 12:38:16 apu sshlockout[8895]: sshlockout/webConfigurator v3.0 starting up
Feb  9 12:38:16 apu opnsense: /usr/local/etc/rc.newwanip: On (IP address: X.X.X.X) (interface: WAN[wan]) (real interface: re1_vlan10).
Feb  9 12:38:17 apu kernel: stf0: changing name to 're1_vlan10_stf'
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to Y.Y.Y.Y
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway 'Y.Y.Y.Y'
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to Z.Z.Z.Z
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/re1_vlan10_stf_defaultgwv6
Feb  9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/re1_vlan10_stf_defaultgwv6 using 'Z.Z.Z.Z'
Feb  9 12:38:24 apu opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb  9 12:38:24 apu kernel: ovpns2: link state changed to DOWN
Feb  9 12:38:31 apu kernel: ovpns2: link state changed to UP
Feb  9 12:38:32 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns2'
Feb  9 12:38:32 apu opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb  9 12:38:40 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: updating cache file /var/cache/dyndns.org_0.cache: X.X.X.X
Feb  9 12:38:40 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address

It logs, that the VPN interface is down and up again, but I cant verify this. I can tell you on monday. But the wan interface didnt drop any connection. I didn't apply b20f71b, should I for your tests?

So I tested the VPN connection and unfortunately it drops. I apply b20f71b to test again, but I can't answer until tonight at the earliest.

The VPN connection drops also with the patch b20f71b applied. But WAN seams to be stable all the time.

My issues is fixed with patch c83bb8d anyway. Still no issues.

c83bb8d is not going to be an official fix

@fichtner Why not if it's working?

We don't know what the error is, so it will come back if we just revert fixes for other problems. Not to mention having issues back that have been fixed with this. Does that make sense?

Totally

The VPN connection drops also with the patch b20f71b applied. But WAN seams to be stable all the time.

Since saturday WAN is stable (but the VPN interface drops) with the following patches applied

# opnsense-revert opnsense
# opnsense-patch a1dbbb5e
# opnsense-patch b20f71b

So what is the current state? Which patches should be applied?
How can we help diagnosing the issue?

We have this on an APU3C4 with WAN IP provided by DHCP.

For now we disabled "Dynamic state reset | Reset all states when a dynamic IP address changes.This option flushes the entire state table on IPv4 address changes in dynamic setups to e.g. allow VoIP servers to re-register." which solved the problem with connections dropping and the IPSec Tunnel getting stuck because of dropped states.
But the rc.newwanip script still gets called every 1200s.

I'm out of the loop on this one and have just picked it up to read through. A LONG time ago, well it seems a long time ago now we had a similar issue across the road with IPv6 dhcp6 running newwanipv6 on every renew of the client. This was the reason for adding the RENEW and REQEUST reasons to the output of dhcp6c, if the response from dhcp6c was RENEW then we stopped it from calling newwanipv6 in the script as nothing would have changed, I have not looked so I may be talking out of my derriere but is something similar not possible with dhclient? It would solve an awful lot of problems in one go.

I have the same problem with 19.1.1 on two APU2C4 systems. Another system running 19.1.1 on APU1D4 doesn't show this problem.

Is there anything I can help to diagnoses the issue?

I have the same issue and started looking around in the log files. I believe the answer lies in System: Gateways: Log File and possibly with dpinger.

I see this in the log file repeating every 15 minutes:

dpinger: send_interval 1000ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xxx.xxx.xxx.xxx bind_addr xxx.xxx.xxx.xxx identifier "WAN_DHCP

I noticed that by going to Firewall: Settings: Advanced and checking the "Disable State Killing on Gateway Failure" option I was able to stop the routine dropping of connections.

I then noticed the following:

Feb 11 16:26:10 dpinger: GATEWAY ALARM: WAN_DHCP (Addr: xxx.xxx.xxx.xxx Alarm: 0 RTT: 66816ms RTTd: 213265ms Loss: 0%)
Feb 11 16:26:10 dpinger: WAN_DHCP xxx.xxx.xxx.xxx: Clear latency 66816us stddev 213265us loss 0%
Feb 11 16:25:58 dpinger: GATEWAY ALARM: WAN_DHCP (Addr: xxx.xxx.xxx.xxx Alarm: 1 RTT: 832201ms RTTd: 0ms Loss: 0%)
Feb 11 16:25:58 dpinger: WAN_DHCP xxx.xxx.xxx.xxx: Alarm latency 832201us stddev 0us loss 0%

Could this be the culprit?

what you see is dpinger reporting the very fact that there's a problem

I have the same problem with 19.1.1 on two APU2C4 systems. Another system running 19.1.1 on APU1D4 doesn't show this problem.

Are there any configuration difference between your systems on the APU2C4 and APU1D4 ?

I'm out of the loop on this one and have just picked it up to read through. A LONG time ago, well it seems a long time ago now we had a similar issue across the road with IPv6 dhcp6 running newwanipv6 on every renew of the client. This was the reason for adding the RENEW and REQEUST reasons to the output of dhcp6c, if the response from dhcp6c was RENEW then we stopped it from calling newwanipv6 in the script as nothing would have changed, I have not looked so I may be talking out of my derriere but is something similar not possible with dhclient? It would solve an awful lot of problems in one go.

Could this be the solution? Is someone willing to provide a patch that I can test?

No, we need to explain what happens first. Something changed, but the changes in the dhclient-script are too superficial to cause this so I'm afraid of later breakage if we don't pin this down.

I totally understand that. How can I provide further help?

Might be an idea to add some extra debug output to the dhclient-script and get @tbandixen to do some tests and give feedback. Unfortunately I'll not be able to add the extra bits myself until sometime next week!

@tbandixen - Little test for you. Copy the attached script to /usr/local/opnsense/scripts/interfaces . Backup the existing script first. See if it makes any difference at all. Pointers are needed to try and identify the issues so all this script does is filter out the RENEW response and bypass any actions.
dhclient-script.zip

Ok, I ignore the renew now (as provided in the script) lets see if the vpn is stable now. will report asap

Well, as expected, the connection remains stable. I will include the ignored lines step by step to identify the causing line.

How can I manually trigger this script so I dont have to wait 20minutes every time?

It's more a question of if the 'RENEW' Bypass helps, what has changed elsewhere that's causing the issue. Has something changed in dhclient itself perhaps? @fichtner can answer that one.

I think just bypassing the RENEW isnt that good, its there for a reason. But I dont know if behind the scenes something changed (BSD things maybe, I dont know the *nix subsystem at all). I will try to find the line that causes the drop and digg a bit further.

Actually bypassing RENEW is good, it means nothing has changed so do nothing.

The key is that the var 'changes' is changed to 'yes'. What I would do next is add a logger entry to each and every if... fi statement block to identify the culprit.

Something like this:-

    if [ "$old_ip_address" != "$new_ip_address" ]; then
        $LOGGER "old ip address != new ip address at line 348"
        delete_old_states
    fi

I think we should check if $alias_ip_address is set, otherwise if [ "$old_ip_address" != "$alias_ip_address" ]; then and if [ "$new_ip_address" != "$alias_ip_address" ]; then will be true always. My $alias_ip_address is empty... Maybe that changed?

I think setting changes="yes" because of [ "$old_ip_address" != "$alias_ip_address" ] is wrong. Because in delete_old_alias the alias will be removed IF $alias_ip_address its length is not zero, so maybe there is no change at all if the alias is empty, right? Same applies to [ "$new_ip_address" != "$alias_ip_address" ]

My VPN connection remains stable if I add [ -n "$alias_ip_address" ] && to the lines 351 and 369.
So far so good, but did that change in 19.1? I didn't change any config targeting ISP dhcp client settings.

@tbandixen thanks for the hint, the check changed as it tried to align with FreeBSD. It seems the issue is present there, but it is not critical because it doesn't have a use case maybe shrugs

# opnsense-revert opnsense
# opnsense-patch 90c0c395

That should be it then....

Its all stable now 馃槂

Thank you very much!

Ok, we'll wrap this up for 19.1.2. Thank you for the help. 鉂わ笍

Was this page helpful?
0 / 5 - 0 ratings