Gluon: autoupdater runs concurrently

Created on 6 Dec 2015 · 12 Comments · Source: freifunk-gluon/gluon

On one node with a high load I found multiple autoupdater processes running. Could we use a lockfile to prevent this?

root@somenode:~# uptime
 23:48:41 up 3 days,  9:09,  load average: 3.03, 2.52, 2.28
root@somenode:~# ps
  PID USER       VSZ STAT COMMAND
    1 root      1388 S    /sbin/procd
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqd/0]
    5 root         0 SW<  [kworker/0:0H]
    7 root         0 SW<  [khelper]
   62 root         0 SW<  [writeback]
   65 root         0 SW<  [bioset]
   67 root         0 SW<  [kblockd]
  100 root         0 SW   [kswapd0]
  147 root         0 SW   [fsnotify_mark]
  178 root         0 SW<  [ath79-spi]
  292 root         0 SW<  [deferwq]
  363 root         0 SWN  [jffs2_gcd_mtd3]
  446 root       892 S    /sbin/ubusd
  447 root       772 S    /sbin/askfirst ttyS0 /bin/ash --login
  603 root         0 SW<  [bat_events]
  695 root         0 SW<  [cfg80211]
  830 root      1036 S    /sbin/logd -S 16
  835 root      1600 S    /usr/sbin/haveged -w 1024 -d 32 -i 32 -v 1
  943 root      1608 S    /sbin/netifd
  974 root      1392 S    /usr/sbin/crond -f -c /etc/crontabs -l 5
  994 root      1152 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22 -K 300
 1003 root       788 S    /usr/sbin/gluon-crond /lib/gluon/cron
 1008 root      1108 S    /usr/sbin/gluon-radvd -i br-client -p 2001:bf7:540::/64
 1041 root      1132 S    /usr/sbin/uhttpd -f -h /lib/gluon/status-page/www -r somenode -x /c
 1051 root       916 S    /usr/sbin/dnsmasq -x /var/run/gluon-wan-dnsmasq.pid -u root -i lo -p 54 -h -r
 1125 root      3292 R    /usr/bin/fastd --config - --daemon --pid-file /var/run/fastd.mesh_vpn.pid
 1282 root      1392 S    /usr/sbin/ntpd -n -p 1.ntp.services.bremen.freifunk.net -p 2.ntp.services.brem
 1313 root      1112 S    /usr/sbin/alfred -i br-client -b bat0
 1334 root      1320 S    /usr/sbin/batadv-vis -i bat0 -s
 1340 root       812 S    odhcp6c -s /lib/netifd/dhcpv6.script -t120 br-client
 1386 root      1388 S    udhcpc -p /var/run/udhcpc-br-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i br-
 1434 root       812 S    odhcp6c -s /lib/netifd/dhcpv6.script -P0 -t120 br-wan
 1640 nobody     924 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k
 1902 root      2152 S    /usr/bin/respondd -g ff02::2:1001 -p 1001 -c return require("gluon.announced")
 2557 root         0 SW   [kworker/0:3]
 3010 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 3013 root      2080 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 3023 root         0 Z    [sh]
 3129 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 3131 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 3144 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 3145 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
 4772 root         0 Z    [sh]
 4826 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 4827 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
 5656 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 5658 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 5662 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 5663 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
 5753 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 5757 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 5759 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 5760 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
 8352 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 8354 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 8358 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 8359 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
 9126 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
 9128 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
 9132 root         0 Z    [sh]
 9183 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
 9184 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
10611 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
10616 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
10617 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
10618 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
11439 root      1388 S    /bin/sh -c /usr/sbin/autoupdater 
11444 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater
11445 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
11446 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
11462 root      1584 S    /usr/sbin/hostapd -P /var/run/wifi-phy0.pid -B /var/run/hostapd-phy0.conf
11503 root      1576 S    /usr/sbin/hostapd -P /var/run/wifi-phy1.pid -B /var/run/hostapd-phy1.conf
12031 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
12033 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
12038 root         0 Z    [sh]
12099 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
12100 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
14067 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
14072 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
14074 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
14075 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
14929 root         0 SW   [kworker/u2:2]
16847 root      1220 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22 -K 300
16944 root         0 SW   [kworker/0:0]
17185 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
17187 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
17191 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
17192 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
17311 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
17316 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
17317 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
17318 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
17435 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
17437 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
17441 root         0 Z    [sh]
17500 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
17501 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
18092 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
18094 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
18098 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
18099 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
18953 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
18960 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
18971 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
18972 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
19766 root      1168 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22 -K 300
21013 root         0 SW   [kworker/0:1]
21349 root      1508 S    {dhcpv6.script} /bin/sh /lib/netifd/dhcpv6.script br-client ra-updated
21350 root      1504 R    {dhcpv6.script} /bin/sh /lib/netifd/dhcpv6.script br-client ra-updated
21363 root      1392 S    -ash
21368 root      1388 R    ps
21369 root      1508 S    {dhcpv6.script} /bin/sh /lib/netifd/dhcpv6.script br-client ra-updated
21370 root      1036 R    jshn -w
22019 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
22024 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
22025 root         0 Z    [sh]
22045 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
22046 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
23324 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
23326 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
23330 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
23331 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
24215 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
24220 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
24221 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
24222 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
25814 root         0 DW   [kworker/0:2]
26316 root         0 SW   [kworker/u2:1]
26908 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
26911 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
26934 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
26937 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
29222 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
29227 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
29228 root         0 Z    [sh]
29300 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
29301 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
29338 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
29340 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
29344 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
29345 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
30421 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
30428 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
30437 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
30438 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
30490 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
30495 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
30496 root         0 Z    [sh]
30529 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
30530 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
30612 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
30617 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
30628 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
30629 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
31101 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
31106 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
31108 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
31109 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
31630 root         0 SW   [kworker/u2:0]
31905 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
31912 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
31922 root         0 Z    [sh]
31994 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
31995 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
32294 root      1388 S    /bin/sh -c /usr/sbin/autoupdater --fallback 
32299 root      2072 S    {autoupdater} /usr/bin/lua /usr/sbin/autoupdater --fallback
32300 root         0 Z    [sh]
32359 root      1388 S    sh -c wget -T 120 -O- 'http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/s
32360 root      1396 S    wget -T 120 -O- http://[2a03:b0c0:3:d0::19:4001]/ffhb-mirror/firmware/stable/s
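
The lockfile idea from the report could look roughly like the sketch below. It is not taken from the Gluon autoupdater: the lock path and the function names `acquire_lock` and `release_lock` are hypothetical, and it uses `mkdir` as the atomic primitive because that works in a plain POSIX shell.

```shell
#!/bin/sh
# Hypothetical lock location (an assumption, not a path used by Gluon).
LOCKDIR="${LOCKDIR:-/tmp/autoupdater.lock}"

# Try to take the lock. mkdir either creates the directory or fails,
# so only one caller can succeed at a time.
acquire_lock() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        # Record the owner so a human can see who holds the lock.
        echo "$$" > "$LOCKDIR/pid"
        return 0
    fi
    return 1
}

# Drop the lock again. Call this only from the process that acquired it.
release_lock() {
    rm -rf "$LOCKDIR"
}

# Usage in a hypothetical wrapper script:
#   acquire_lock || exit 0      # another instance is running, give up
#   trap release_lock EXIT      # release on normal exit
#   /usr/sbin/autoupdater "$@"
```

A lock like this only stops new instances from piling up. If the instance holding the lock hangs in wget, as in the process list above, no further update attempt would run until that instance is stopped, so it would likely need to be combined with a timeout.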
bug

All 12 comments

Weird, wget is blocking, but the -T 120 should kill it after 2 minutes. Is this a normal Gluon build? The default OpenWrt config for busybox's wget doesn't support -T (and ignores it), but on Gluon, it should work.

Yes, it's the build 2015.1.2+bremen2 / gluon-v2015.1.2:
https://github.com/FreifunkBremen/gluon-site-ffhb/tree/v2015.1.2+bremen2

The -T option seems to be built in:

wget --help
BusyBox v1.22.1 (2015-11-07 22:12:39 CET) multi-call binary.

Usage: wget [-c|--continue] [-s|--spider] [-q|--quiet] [-O|--output-document FILE]
    [--header 'header: value'] [-Y|--proxy on/off] [-P DIR]
    [-U|--user-agent AGENT] [-T SEC] URL...

Retrieve files via HTTP or FTP

    -s  Spider mode - only check file existence
    -c  Continue retrieval of aborted transfer
    -q  Quiet
    -P DIR  Save to DIR (default .)
    -T SEC  Network read timeout is SEC seconds
    -O FILE Save to FILE ('-' for stdout)
    -U STR  Use STR for User-Agent header
    -Y  Use proxy ('on' or 'off')

-T looks like a timeout between packets, not a time limit for the whole operation. You might want to run wget in a subshell and limit the runtime of that subshell, or switch to curl, which supports a time limit for the whole operation.

The full wget offers three timeout options:
--dns-timeout, --connect-timeout, and --read-timeout

@corny The full wget is not available here. Using a subshell is IMHO a good idea.

To be more precise: What kind of subshell usage are you thinking of? Using ulimit is probably no good, because it can only limit the CPU time, which probably isn't anywhere near 120 seconds even after hours of running wget.

I just tested something like the following:

( sleep 5 && pgrep -P $$ sleep > /dev/null && kill $(pgrep -P $$ sleep)) & sleep 50

This seems to do what we want: The last sleep is killed after 5 seconds if it wasn't terminated (by ^C) beforehand, in which case nothing happens after the first sleep finishes. (I used sleep as a long-running command because I couldn't reproduce the problem with the hanging autoupdater.)

If this is what you were thinking of, we could simply plug this into line 129 and line 192 of the autoupdater. But the fact that there are two call sites probably indicates that there should be a separate function, get_http(url, file) or something like that, which handles calling wget appropriately.

What do you think? If I'm not running in the totally wrong direction, I'll gladly prepare a PR.
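
As an illustration of the watchdog approach described in this comment, the one-liner can be wrapped in a small function. This is only a sketch: the name `run_with_timeout` is made up here, and the real autoupdater calls wget from Lua, so this shows the idea rather than a drop-in patch.

```shell
#!/bin/sh
# run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND and sends it SIGTERM if it is still alive after SECONDS.
# Returns the exit status of COMMAND (non-zero if it was killed).
run_with_timeout() {
    limit=$1
    shift

    # Start the real command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleep, then kill the command if it still exists.
    # Its output is discarded so it never holds the caller's stdout open.
    ( sleep "$limit"; kill "$cmd_pid" ) >/dev/null 2>&1 &
    dog_pid=$!

    # Wait for the command, whether it finished or was killed.
    status=0
    wait "$cmd_pid" || status=$?

    # Stop the watchdog so it cannot kill an unrelated process later.
    # Its sleep child may linger until the limit expires; that is harmless.
    kill "$dog_pid" 2>/dev/null || true
    wait "$dog_pid" 2>/dev/null || true

    return "$status"
}

# Example: give a download at most 120 seconds in total.
#   run_with_timeout 120 wget -T 120 -O- "$url"
```

Unlike `-T`, which only bounds the time between reads, the limit here covers the whole run of the command.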

I'd rather like to do something in Lua instead of adding Shell code (nixio has a fork function which could be used to replace popen with something more powerful, if necessary).

Mostly unrelated to this issue, I've thought about either adding 'exec' to all command calls, or even changing the code not to use /bin/sh at all, to avoid having these unnecessary shell processes...

I started implementing this in Lua, and the current version is 43 lines long. However, it only imitates io.popen() and not os.execute(). The latter actually seems to be harder to implement, because SIGCHLD is ignored by Lua and doesn't interrupt functions like nixio.nanosleep(timeout) or nixio.poll({}, timeout). My current implementation relies on having an intermediate process forward the stdout, but see for yourself if this solution would be acceptable.

To paraphrase @NeoRaider: As this isn't a regression, it won't be considered in 2016.1. After that, he wants to have the autoupdater rewritten in C. The approach with an additional Lua process is problematic, because we have no RAM to spare.

Couldn't we just killall the old autoupdater when a new one is called?
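
One way to picture this suggestion without a blanket killall is a PID file: each new instance terminates the previously recorded one and then records itself. This is a sketch under assumptions, not how the autoupdater works: the PID file path and the function name `replace_previous` are hypothetical.

```shell
#!/bin/sh
# Hypothetical PID file location (an assumption, not a Gluon path).
PIDFILE="${PIDFILE:-/tmp/autoupdater.pid}"

# Terminate the previously recorded instance, if any, then record ourselves.
replace_previous() {
    old=$(cat "$PIDFILE" 2>/dev/null || true)
    # Only signal a PID that is recorded, is not us, and still exists.
    if [ -n "$old" ] && [ "$old" != "$$" ] && kill -0 "$old" 2>/dev/null; then
        kill "$old" 2>/dev/null || true
    fi
    echo "$$" > "$PIDFILE"
}
```

Two caveats apply. Killing the recorded process does not necessarily stop its children, such as the wget processes started through `sh -c` in the listing above, and a stale PID file could point at an unrelated process that has since reused the PID.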

I've started working on an autoupdater rewrite that will address this issue.

Workarounds that should fix these issues have been added in freifunk-gluon/packages#147; a nicer fix will follow with the autoupdater rewrite.
