DietPi-Process_Tool | Netdata shows some failed apply items

Created on 6 Sep 2018  ·  13 Comments  ·  Source: MichaIng/DietPi

Creating a bug report/issue:

Hi. Two things. The minor issue is that occured is misspelled; should be occurred :)

The not-so-minor issue is well summarized at https://github.com/sparkysbc/sparky_linux_images/blob/master/Digione-dietpi-terminal.pdf but I haven't found it documented here after a quick, cursory check.

I have a new Allo DigiOne Signature and am seeing the two NetData failures along with the ominous "an issue has occured" message.

It doesn't appear to be affecting Roon, which is all I use this for, but I'm unsure if it's impacting anything else.

I have had a DigiOne since before v6 and have taken it through the upgrades to 6.14 without experiencing these failures. I'm unsure whether this is only a DigiOne Signature issue or whether it applies to any newly imaged DigiOne device, base or Signature model.

Thanks for your help Dan. I've got DietPi running on all my RPi based devices.

Required Information:

  • DietPi version | paste -sd '.' /DietPi/dietpi/.version
    6.14 (upgraded direct from 6.13 Allo image)
  • Distro version | echo $G_DISTRO_NAME or cat /etc/debian_version
    9.4
  • Kernel version | uname -a
    Linux DigiOneSignature 4.14.61-v7+ #1133 SMP Fri Aug 10 11:04:43 BST 2018 armv7l GNU/Linux
  • SBC device | echo $G_HW_MODEL_DESCRIPTION or (EG: RPi3)
    RPi 3 Model B+ (armv7l)
  • Power supply used | (EG: 5V 1A RAVpower)
    Allo supplied 5v / 3a
  • SDcard used | (EG: SanDisk ultra)
    Sandisk

Additional Information (if applicable):

  • Software title | (EG: Nextcloud)
  • Was the software title installed freshly or updated/migrated?
  • Can this issue be replicated on a fresh installation of DietPi?
  • dietpi-bugreport ID

Steps to reproduce:


happens on boot

Expected behaviour:


not see failures

Actual behaviour:


failure alerts

Extra details:

  • ...
Enhancement Information

All 13 comments

@blogabe
Your screenshots show v6.13, but the issue states v6.14, so I guess you tested after the dietpi-update as well?

Fixed the typo, many thanks! That one survived so long in such a prominent location 🤣: https://github.com/Fourdee/DietPi/commit/6194e297923cd456fa0c98e3274a514a590ed96a

About the process tool failure:

  • Nice and scheduler could not be applied to one of the netdata processes.
  • Could you identify it: ps aux | grep <PID> Replace <PID> with the number in parentheses after the related process name.
  • I tested it here with fresh netdata install on VM and it does not show any error. But that might depend on device and netdata setup, not sure.
  • But the failure is really minor and does not lead to any issue, especially since you did not adjust Nice/Schedule in dietpi-process_tool to a non-default value.

About the soundcard issue, I have no chance to investigate here. Not sure if we disable other hardware modules, e.g. when a certain soundcard is chosen. It would make sense for all of them to stay available when the Allo GUI is in use. This is something @Fourdee hopefully can check better 😃.

Correct, @MichaIng. The picture is from Allo on 6.13, but I still experience the same issue after updating to 6.14. Per Allo, this started to happen sometime between 6.9 and 6.11. With respect to the soundcard, I'm unsure whether this issue is preventing the Allo DigiOne from being the default soundcard or not. It makes sense that it should be, but Allo offers other soundcards as well. This may be a totally unrelated issue.

The challenge w/ identifying the PID is that Allo devices typically come headless and this only happens on boot. I went ahead and connected a screen to it and grabbed the information, but this is all I see:
root 2360 0.0 0.0 4372 572 pts/0 S+ 10:13 0:00 grep 1571

My PID on the failure alert shows as 1571.
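
The only line returned above is the grep command itself, which suggests that nothing with PID 1571 was running any more at that point. As a minimal sketch (assuming procps ps is available; describe_pid is a hypothetical helper name, not part of DietPi), a PID can be looked up directly so that grep never appears in the result:

```shell
#!/bin/bash
# Assumes procps "ps"; describe_pid is a hypothetical helper name.

# Print PID, command name and arguments for one PID.
# Selecting by PID with -p means no grep is involved, so only the requested
# process can show up. ps exits non-zero when the PID is not found.
# Each "-o name=" requests a column with an empty header.
describe_pid() {
    ps -p "$1" -o pid= -o comm= -o args= || {
        echo "PID $1 not found (the process has probably exited)"
        return 1
    }
}

describe_pid "$$" || true      # the current shell
describe_pid 1571 || true      # the PID from the failure alert
```

If the second call reports that the PID was not found, the process has already exited, which matches the grep-only output above.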

@blogabe

Thanks for the report 👍

I believe netdata starts a few threads during service start. Some of those threads finish before we can apply the settings to each PID.

As such, this is more of an "info" issue, as the program itself is still functional.

Leave it with us, we'll try to improve the output and recheck PIDs.
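
The race described above can be reproduced in isolation. The following is only an illustrative sketch, not the DietPi-Process_tool code: a short sleep stands in for a short-lived netdata start-up thread, and pid_alive is a hypothetical helper.

```shell
#!/bin/bash
# Sketch of the race: a PID is recorded while the process is alive,
# but the process has exited by the time a setting is applied.

# Hypothetical helper: succeeds only while the PID still exists.
# kill -0 sends no signal; it only checks that the PID can be signalled
# (as root, which the process tool requires, that means "it exists").
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

sleep 0.2 &            # stand-in for a short-lived start-up thread
short_pid=$!           # the PID list is collected here, like an early scan

pid_alive "$short_pid" && echo "scan:  $short_pid is running"

wait "$short_pid"      # the stand-in finishes before we get to it

if pid_alive "$short_pid"; then
    echo "apply: $short_pid still running, settings can be applied"
else
    echo "apply: $short_pid is gone, applying nice or scheduler would fail"
fi
```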

Yep, as I thought, if we delay the process apply by 3 seconds, 10400 no longer exists:

[  OK  ] DietPi-Services | restart : netdata
[  OK  ] DietPi-Services | restart : cron
[ SUB1 ] DietPi-Process_tool > Apply
[  OK  ] DietPi-Process_tool | Cron (10383) : Nice      0
[  OK  ] DietPi-Process_tool | Cron (10383) : Affinity  0-3
[  OK  ] DietPi-Process_tool | Cron (10383) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (10373) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (10373) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (10373) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (10389) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (10389) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (10389) : Scheduler SCHED_OTHER 0
[FAILED] DietPi-Process_tool | NetData (10400) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (10400) : Affinity  0-3
[FAILED] DietPi-Process_tool | NetData (10400) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (10411) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (10411) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (10411) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (695) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (695) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (695) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (4631) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (4631) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (4631) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (10233) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (10233) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (10233) : Scheduler SCHED_OTHER 0
[FAILED] DietPi-Process_tool | An issue has occurred

So we need to rerun the ps check during the apply process:

  • We can't use ps ax | grep, as this returns grep itself in the result
  • The solution is pgrep, but this isn't supported on Jessie?
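
To illustrate the difference between the two lookups, here is a small sketch assuming procps (ps and pgrep) is installed. The background sleep is only a stand-in process and find_pids is a hypothetical helper name.

```shell
#!/bin/bash
# Assumes procps (ps, pgrep). find_pids is a hypothetical helper name.

# List PIDs whose full command line matches a pattern.
# pgrep never reports itself, so no "grep -v grep" filtering is needed.
find_pids() {
    pgrep -f "$1"
}

sleep 30 &                 # stand-in process to search for
target=$!
sleep 0.2                  # give the stand-in time to start

ps ax | grep 'sleep 30'    # typically also lists the grep process itself
find_pids 'sleep 30'       # prints matching PIDs only, one per line

kill "$target"
```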

@MichaIng

~What was the issue with pgrep and Jessie?~
https://packages.debian.org/jessie/procps

pgrep -i is supported on Stretch, not Jessie.

👍

In that case should be fine, although needs testing on Jessie.

@MichaIng

If you have a Jessie system available, please could you check results of:

pgrep -f cron

@blogabe

Fix will be applied in the v6.15 update (hopefully released this week).

@Fourdee
Yep, this should do it. However, it slows down process tool execution. Perhaps it could be tested/compared on the slowest SBC with many installed software titles, or on a VM with very limited max CPU usage.

An alternative would be to stick with a single ps ax, and only recheck process existence in case of failure? It would basically be a balance between code complexity and performance.
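
A rough sketch of that alternative, under stated assumptions: apply_or_skip is a hypothetical wrapper, the [ INFO ] label is an assumed output format, /proc is assumed to be mounted, and renice merely stands in for whatever apply command the tool runs.

```shell
#!/bin/bash
# Sketch only: the [ INFO ] label and the function name are assumptions.

# Run an apply command for a PID. On failure, re-check whether the PID still
# exists and downgrade the message if the process has simply exited.
apply_or_skip() {
    local pid=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "[  OK  ] PID $pid"
    elif [ -d "/proc/$pid" ]; then
        echo "[FAILED] PID $pid"      # process exists, so the failure is real
        return 1
    else
        echo "[ INFO ] PID $pid exited before the setting could be applied"
    fi
}

# A live process (this shell) and one that has already finished.
sleep 0.1 &
gone=$!
wait "$gone"

apply_or_skip "$$" renice -n 0 -p "$$"
apply_or_skip "$gone" renice -n 0 -p "$gone"
```

The existence check only runs on the failure path, so the common case costs nothing extra.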

@MichaIng

"However it slows down process tool execution"

Hmmm, seemed faster for me with cron + netdata installed. We'll need to do some timed tests to verify.

"Hmmm, seemed faster for me with cron + netdata installed. We'll need to do some timed tests to verify."

Interesting, perhaps pgrep is not much slower than grep on the ps ax output string? I would have guessed the other way round, but yep, some simple tests are the best then 😄.
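
One simple way to run such a comparison is sketched below. It is not the test that was run in this thread; it assumes GNU date (for nanosecond timestamps) and procps, and time_lookup is a hypothetical helper.

```shell
#!/bin/bash
# Rough timing sketch, not the DietPi test procedure.
# Assumes GNU date (for %N nanoseconds) and procps (ps, pgrep).

runs=50

# Hypothetical helper: run a command $runs times and print elapsed milliseconds.
time_lookup() {
    local start end i
    start=$(date +%s%N)
    for ((i = 0; i < runs; i++)); do
        "$@" >/dev/null 2>&1
    done
    end=$(date +%s%N)
    echo "$(( (end - start) / 1000000 )) ms for $runs runs of: $*"
}

time_lookup pgrep -f cron
time_lookup sh -c 'ps ax | grep cron'
```

Results will vary with the number of running processes and with system load, so several repetitions on the target device give a fairer picture than a single run.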

Thanks @Fourdee and @MichaIng

@MichaIng

Old:

root@DietPi:~# time /DietPi/dietpi/dietpi-process_tool_old 1
[  OK  ] Root access verified.

 DietPi-Process_tool
─────────────────────────────────────────────────────
 Mode: Apply

[  OK  ] DietPi-Process_tool | Cron (6649) : Nice      0
[  OK  ] DietPi-Process_tool | Cron (6649) : Affinity  0-3
[  OK  ] DietPi-Process_tool | Cron (6649) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Nice      0
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Affinity  0-3
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6610) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6610) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6610) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6622) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6622) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6622) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6631) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6631) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6631) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | Completed

real    0m0.565s
user    0m0.150s
sys     0m0.040s

New:

root@DietPi:~# time /DietPi/dietpi/dietpi-process_tool 1
[  OK  ] Root access verified.

 DietPi-Process_tool
─────────────────────────────────────────────────────
 Mode: Apply

[  OK  ] DietPi-Process_tool | Cron (6649) : Nice      0
[  OK  ] DietPi-Process_tool | Cron (6649) : Affinity  0-3
[  OK  ] DietPi-Process_tool | Cron (6649) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Nice      0
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Affinity  0-3
[  OK  ] DietPi-Process_tool | DHCP Client (1083) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6610) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6610) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6610) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6622) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6622) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6622) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | NetData (6631) : Nice      0
[  OK  ] DietPi-Process_tool | NetData (6631) : Affinity  0-3
[  OK  ] DietPi-Process_tool | NetData (6631) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (783) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Nice      0
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Affinity  0-3
[  OK  ] DietPi-Process_tool | OpenSSH Server (3563) : Scheduler SCHED_OTHER 0
[  OK  ] DietPi-Process_tool | Completed

real    0m0.573s
user    0m0.130s
sys     0m0.080s

Old is "slightly" faster; however, running ps earlier (instead of running pgrep during the while loop) results in missing PIDs.

Will mark this as completed.

@Fourdee @MichaIng quick update on this fix from my end guys...

6.15 seems to have resolved the issue. Thank you. However, I've rebooted a few times and checked the screen output. While more often than not everything passes and completes without any error message, every now and then I still receive the 2 failures described above.

I think, to your point, there is no harm in these messages. Just letting you know that the fix shouldn't be assumed to work 100% of the time, in case someone else reaches out about it.

@blogabe

"6.15 seems to have resolved the issue. Thank you. However, I've rebooted a few times and checked the screen roll. While more often than not everything passes and completed without any error message, every now and then I still receive the 2 failures described above."

Thanks for testing and the info 👍

Yep, as this is outside our control (a threaded process finishes before all items are applied), the only things we could do are:

  • Change fail to info
  • Add additional re-checks to see if the PID is available before we apply each item. However, this would slow the script down and is somewhat unnecessary for general use. Even then, it could still fail if the apply after a check is delayed by other system processes/programs.

I believe for now, to ensure speed of the script, we'll leave it as is.
