Core: Random Segfaults

Created on 19 Sep 2016  ·  78 Comments  ·  Source: home-assistant/core

Make sure you are running the latest version of Home Assistant before reporting an issue.

You should only file an issue if you found a bug. Feature and enhancement requests should go in the Feature Requests section of our community forum:

Home Assistant release (hass --version):
0.29.0dev

Python release (python3 --version):
3.4.4

Component/platform:
asyncio

Description of problem:
Random segfaults. No pattern.

Expected:
Normal operation.

Problem-relevant configuration.yaml entries and steps to reproduce:

homeassistant:
  # Name of the location where Home Assistant is running
  name: 'Drømmehuset'
  # Location required to calculate the time the sun rises and sets
  latitude: !secret latitude
  longitude: !secret longitude
  elevation: 35
  # C for Celsius, F for Fahrenheit
  unit_system: metric
  # Pick yours from here: http://en.wikipedia.org/wiki/List_of_tz_database_time_zones
  time_zone: Europe/Oslo

##########################################################
# Personalization
##########################################################
  customize: !include customize.yaml

# Discover some devices automatically
# discovery:

# Pushbullet notification service
notify:
  platform: pushbullet
  api_key: !secret pushbullet_api_key
  name: 'pushmelding'

# Zwave settings
zwave:
  usb_path: /dev/ttyACM0
  config_path: /srv/hass/src/python-openzwave/openzwave/config
  polling_interval: 30000
  debug: 1
  autoheal: False

# RFXtrx433E settings
rfxtrx:
  device: /dev/serial/by-id/usb-RFXCOM_RFXtrx433_A1Z69C6G-if00-port0
  debug: True

# Enables support for tracking state changes over time.
history:
#  placeholder: true
recorder:
  purge_days: 14

# Track the sun
sun:

# Enables the frontend
frontend:

# View all events in a logbook
logbook:

logger:
  default: info
  logs:
    homeassistant.components.zwave: debug
    homeassistant.components.rfxtrx: debug

# Show links to resources in log and frontend
#introduction:

# Allows you to issue voice commands from the frontend
#conversation:

# Checks for available updates
# updater:

# Verisure Hub
verisure:
 username: !secret verisure_username
 password: !secret verisure_password
 code_digits: 6

http:
# development: 1
 api_password: !secret http_password

########################################################
# Device trackers
########################################################
device_tracker:
  - platform: ddwrt
    host: 10.0.0.138
    username: !secret ddwrt_username
    password: !secret ddwrt_password
    consider_home: 36
  - platform: ddwrt
    host: 10.0.0.88
    username: !secret ddwrt_username
    password: !secret ddwrt_password

########################################################
# Groups
########################################################
group: !include group.yaml

#########################################################
# HVAC
#########################################################
#hvac:
#  - platform: demo
#########################################################
# Sensors
#########################################################
sensor: !include sensor.yaml

#########################################################
# Switches
#########################################################
switch: !include switches.yaml
#########################################################
# Light switches
#########################################################
light: !include light.yaml
#########################################################
# Thermostats
#########################################################
climate: !include climate.yaml

#########################################################
# Automation
#########################################################
automation: !include automation.yaml

####################################
# Automation scripts
####################################
script: !include script.yaml
####################################
# Media players
####################################
media_player:
  - platform: cast

######################################
# Input_select
######################################
input_select: !include input_select.yaml

########################################
# Cameras
########################################
camera:
  - platform: ffmpeg
    input: !secret grillbua_input
    name: Grillbua
    ffmpeg_bin: /usr/bin/ffmpeg
  - platform: ffmpeg
    input: !secret parkering_input
    name: Parkering
    ffmpeg_bin: /usr/bin/ffmpeg
  - platform: ffmpeg
    input: !secret terrasse_input
    name: Terrasse
    ffmpeg_bin: /usr/bin/ffmpeg
  - platform: local_file
    name: 'Dørklokke'
    file_path: /home/hasss/.homeassistant/ringeklokke.jpeg
  - platform: ffmpeg
    input: !secret garasje_input
    name: Garasjen
####################################################
# Cover
####################################################
#cover:
#  - platform: demo

#################################################
# Panels
#################################################
#panel_iframe:
#  ozwcp:
#    title: 'Zwave Control Panel'
#    icon: 'mdi:router-wireless'
thermostat:
  - platform: heat_control
    name: 'Gulvvarme bad 1.etg test'
    heater: switch.aeotec_dsc18103_micro_smart_switch_2nd_edition_switch_2
    target_sensor: sensor.gulvtemp_bad_1etg
    min_temp: 20
    max_temp: 33
    target_temp: 28.5
  1. No action needed to replicate.

Traceback (if applicable):
http://hastebin.com/donuvejugu.pas
http://hastebin.com/ulicipebes.pas
http://hastebin.com/salopofaya.pas
http://hastebin.com/idereremil.pas
http://hastebin.com/jivipugibe.pas
http://hastebin.com/ixadobikul.pas
http://hastebin.com/qalozikawe.pas
http://hastebin.com/reduwuvewi.pas

Additional info:
Components and platforms:
device_tracker: ddwrt
notify: pushbullet
zwave: climate, switch, sensor, binary_sensor
rfxtrx: switch, light, sensor
verisure: alarm, sensor, switch, lock
media_player: cast
camera: ffmpeg
thermostat: heat_control
climate: generic_thermostat
switch: rfxtrx, zwave, command_line, template
light: rfxtrx
sensor: rfxtrx, template, command_line, systemmonitor, yr
Automations and scripts as well.

OS: Ubuntu 16.04 64-bit

Most helpful comment

I've pushed a branch that removes a possible issue with Python GC of the Task objects. If someone that has a segfault happen could give it a try and let me know if the segfaults persist that'd be great.

https://github.com/home-assistant/home-assistant/tree/fix/monkey-patch-asyncio

All 78 comments

I've been hitting a random segfault as well starting yesterday on dev.

Some tips I've seen for debugging:

To help debug async, you can add this under line 115 in core.py (tip from gitter):
self.loop.set_debug(True)
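
In standalone form, that tip amounts to the sketch below (in Home Assistant the loop already exists on the core object, so only the set_debug call is added; exporting PYTHONASYNCIODEBUG=1 before starting has the same effect):

import asyncio

loop = asyncio.get_event_loop()
loop.set_debug(True)  # log slow callbacks, never-retrieved exceptions, and non-threadsafe calls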

And also using gdb:

If you can run with gdb, the exact cause can be determined: https://wiki.python.org/moin/DebuggingWithGdb
You can attach to the running process and wait for it to segfault, or run hass under gdb and wait for it.
Pull info on the threads; py-bt gets a Python stack trace.
You'll need to run the command from a terminal (not a system-started script) so that when it stops, you can interact with it. (Also from gitter.)

Recent crash from tonight with faulthandler:
http://hastebin.com/qozubivewa.vbs
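
(For anyone else wanting a dump like this: faulthandler has been in the standard library since Python 3.3, and enabling it early in startup is enough. A minimal sketch, not necessarily how it was enabled here:)

import faulthandler
import signal

faulthandler.enable()  # dump tracebacks for all threads on SIGSEGV, SIGFPE, SIGABRT or SIGBUS
faulthandler.register(signal.SIGUSR1)  # optional, Unix only: dump on demand with kill -USR1 <pid>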

There are 47 threads running. I've highlighted some things from the log that could be the cause. If we could get a log from another user, we could compare.

16 callbacks are waiting for the event loop (search for run_callback_threadsafe).

After the segfault, the output from ping is printed to the command line, which means that a command line sensor was running.

  File "/usr/local/lib/python3.4/subprocess.py", line 491 in _eintr_retry_call
  File "/usr/local/lib/python3.4/subprocess.py", line 1520 in _try_wait

The recorder is writing a query:

File "/home/hasss/.homeassistant/deps/sqlalchemy/engine/default.py", line 450 in do_execute

Verisure was updating your lock (search for verisure/devices/lock.py)

RFXtrx is reading from the serial connection (search for serial/serialposix.py)


If possible, could you turn the following off one by one to see if it stops the segfaulting: command line switch, rfxtrx, recorder, verisure.

Here's my config, but it doesn't look like we have much in the way of component overlap:

homeassistant: !include homeassistant.yaml

zone: !include zone.yaml

group: !include groups.yaml
scene: !include scenes.yaml

logbook:
frontend:
history:
discovery:
zeroconf:
sun:
http: !include http.yaml

mqtt: !include mqtt.yaml

device_tracker: !include device_trackers.yaml
sensor onlinedness: !include sensor-onlinedness.yaml

switch harmony hub: !include harmony-switches.yaml
switch template: !include template_switches.yaml

sensor forecast: !include forecast.yaml 
sensor speedtest: !include speedtest.yaml

notify slack: !include notify-slack.yaml

light hue: !include hue.yaml
ecobee: !include ecobee.yaml
sleepiq: !include sleepiq.yaml
#zwave: !include zwave.yaml
input_boolean: !include input_booleans.yaml
emulated_hue: !include emulated_hue.yaml
vera: !include vera.yaml

script: !include_dir_named scripts
automation: !include_dir_merge_list automations

Home Assistant release (hass --version):
0.29.0dev

Python release (python3 --version):
3.4.2

Running on a Raspberry Pi 2 with Raspbian Jessie.

I didn't see any segfaults yesterday, but I have hass running under gdb now

Crash from this night; it took longer this time.
Now I have disabled all command line stuff, including systemmonitor.
http://hastebin.com/toxerobunu.vbs
Next up: disabling template sensors.

It's now been running without problems for a day and a half.
I think I will re-enable the command line stuff, and see what happens.

If the command line stuff is to blame, we should port it over to use asyncio subprocesses: https://docs.python.org/3/library/asyncio-subprocess.html
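
Roughly along these lines - a sketch in Python 3.5 syntax with a hypothetical helper name, not the actual port:

import asyncio

async def async_command_output(command, timeout=15):
    # Hypothetical helper: run a shell command on the event loop rather
    # than in a worker thread, so a slow subprocess can't block a worker.
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL)
    stdout, _ = await asyncio.wait_for(proc.communicate(), timeout)
    return stdout.decode().strip()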

Two things were disabled. First, with only the command line stuff disabled, it still crashed.
With the template stuff disabled as well, it's running fine.
Now I have re-enabled the command line stuff, and will report back ;)

Looks like I'm in the same boat; dmesg is showing me:

[91069.635106] python[6012]: segfault at 10 ip 00007fec2e6f52d0 sp 00007fec2822dd20 error 4 in libpython3.5m.so.1.0[7fec2e556000+47c000]
[94613.108242] python[331]: segfault at 10 ip 00007f05660fb2d0 sp 00007f05603750d0 error 4 in libpython3.5m.so.1.0[7f0565f5c000+47c000]
[179867.340974] python[26568]: segfault at 10 ip 00007f9fa35732d0 sp 00007f9f9dd59e90 error 4 in libpython3.5m.so.1.0[7f9fa33d4000+47c000]

Closing #3484 as it's a dupe.

@lwis : Which components are you using?

@Danielhiversen

mqtt
notify
logbook
frontend
updater
sun
history
ifttt
influxdb
emulated_hue
netgear
owntracks
kodi
braviatv
sonos
samsungtv
cast
proximity
fastdotcom
command_line
forecast
time_date
google_travel_time
shell_command
zones

Think I got all the components + platforms.

@lwis : Could you try to disable the command_line component?

@Danielhiversen sure, what's the thought behind why that would cause a segfault?

No need to disable command line. It seems to be traced to the template stuff.

Running gdb -ex r --args python -m homeassistant --config /config now.

Got a new somewhat different traceback: http://hastebin.com/ilepedefad.vbs

Finally caught one in the act: https://gist.github.com/technicalpickles/23e097e213fcd4beb2c83c0e8cf7e06b

It's at a gdb prompt. Anything I should grab while I have it? I tried py-bt but gdb said the command wasn't found.

If anyone with a segfault using the latest dev could pip install uvloop and restart HASS, let me know if the segfaults go away.
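
(If it isn't picked up automatically, the generic way to move a Python program onto uvloop is to install its event loop policy before any loop is created; a sketch:)

import asyncio

import uvloop

# Install uvloop's event loop policy before any loop is created; every
# loop made afterwards is libuv-based instead of the default selector loop.
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())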

$ pip install uvloop
Collecting uvloop
  Using cached uvloop-0.5.3.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-uyo_w28w/uvloop/setup.py", line 11, in <module>
        raise RuntimeError('uvloop requires Python 3.5 or greater')
    RuntimeError: uvloop requires Python 3.5 or greater

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-uyo_w28w/uvloop/

@bbangert it looks like uvloop requires Python 3.5. As far as I can tell, there isn't a Raspbian Jessie package for it. I can build from source to test, but wouldn't requiring it be a pretty significant change?

@technicalpickles awwww, bummer. Yes, it's not feasible to require it, since it'd raise the required Python version too far.

@technicalpickles we wouldn't make it mandatory, but it would help us narrow the segfaults down to the default event loop implementation.

@bbangert @balloob I use uvloop and still experience the segfaults.

Time to get a better overview of the segfault data thus far:

| Architecture | OS | Python | Loop | User | templates? | Command Line |
| --- | --- | --- | --- | --- | --- | --- |
| AMD64 | Ubuntu 16.04 | 3.4.4 | Default (Epoll?) | @turbokongen | V | V |
| AMD64 | Ubuntu 16.04 | 3.5.2 | uvloop | @turbokongen | V | V |
| AMD64 - Docker | Alpine | 3.5.2 | Default (Epoll?) | @lwis | V | V |
| AMD64 - Docker | Alpine | 3.5.2 | uvloop | @lwis | V | V |
| RPi (?) | Raspbian Jessie | 3.4.2 (?) | Default (Epoll?) | @technicalpickles | V | V |

I can uninstall uvloop if required.

I'm running an Alpine Docker image on Ubuntu 16.04, also using an amd64 machine.

Everyone with a segfault is using a component with the template platform? And disabling the template platform config bits makes the segfaults go away?

I've not tried disabling my template configuration, but I'm happy to branch my config to test.


For people experiencing this problem, try the following three things:

  1. Remove any use of templates and see if it crashes
  2. Enable template sensors again, but add import jinja2.debug to the top of helpers/template.py
  3. Remove the import and add a lock around ENV.from_string(…):
import threading
lock = threading.Lock()

def render(…):
    …
    with lock:
        try:
            return ENV.from_string(…)
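
For anyone applying step 3 by hand, here's a fuller (hypothetical) version of the same idea - the real helpers/template.py does more than this, so treat it purely as a sketch of the locking pattern:

import threading

import jinja2

ENV = jinja2.Environment()
lock = threading.Lock()

def render(template_str, variables):
    # Hypothetical stand-in for helpers/template.py's render(): serialize
    # template compilation (the suspected non-thread-safe step) while
    # leaving the actual rendering unlocked.
    with lock:
        compiled = ENV.from_string(template_str)
    return compiled.render(**variables)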

@balloob when you say template sensors, do you mean sensor.template? I don't use any of those; I do however use templates in other places, such as value_template in sensors.

You're right. I meant to say anything that uses templates.

@balloob starting now, didn't realise how much I used templates!

I didn't experience any crashes running without the templates, but I didn't feel it was exercising much functionality - most of my setup uses templates, so what I ended up with was quite minimal. I've been running for ~3 hours now with import jinja2.debug, and I've yet to see any instability; I will however keep this post updated.

Crash with the jinja2.debug import: http://hastebin.com/tekuxudeja.vbs

Just had a segfault also, switching to a locked render.

I just got this (running with the template render lock):

During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
No stack.

Not sure what went wrong.

It segfaults every time on startup?

@bbangert I've seen it on every startup when running with gdb. Doing continue seems to start up as expected after that.

I can make a new template system based on http://www.joonis.de/de/code/templite

On the most recent dev I'm no longer seeing segfaults, but I do get this SIGABRT pretty quickly after starting release_the_kraken.py:

16-09-25 23:00:01 INFO (CP Server Thread-15) [homeassistant.components.http] Serving /api/states/kraken.test to 127.0.0.1 (auth: True)
16-09-25 23:00:01 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=kraken.test, new_state=<state kraken.test=Run 249 @ 2016-09-25T23:00:01.545632-04:00>, old_state=<state kraken.test=Run 250 @ 2016-09-25T23:00:01.214649-04:00>>
16-09-25 23:00:01 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=kraken.test, new_state=<state kraken.test=Run 275 @ 2016-09-25T23:00:01.598311-04:00>, old_state=<state kraken.test=Run 249 @ 2016-09-25T23:00:01.545632-04:00>>
python3: ../Modules/gcmodule.c:364: update_refs: Assertion `((gc)->gc.gc_refs >> (1)) != 0' failed.

I've run this with gdb, but it ends up continuing each thread, so I haven't been able to grab py-bt.

Sadly I experienced 2 segfaults overnight with the fix :(

Potentially related: https://bugs.python.org/issue26617

I still have no segfaults after last night's kraken attacks. Running fine here.
I switched back to Python 3.4.4 though, where I don't have uvloop, and with #3515 applied.

For anyone running into this, https://gist.github.com/balloob/c364470d17c01615b763dd1772791d43 can be used to stress test it.

  • clone the gist
  • edit release_the_kraken.py if necessary (port number, password)
  • python release_the_kraken.py
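
The gist is the canonical version; in essence it spawns a bunch of threads ("tentacles") that hammer the HTTP API with rapid state updates, roughly like this sketch (host, password, and counts are placeholders):

import threading

from homeassistant import remote

# Host and password are placeholders; point them at your own instance.
api = remote.API('127.0.0.1', 'YOUR_API_PASSWORD')

def tentacle(runs):
    # Each "tentacle" hammers the HTTP API with rapid state updates.
    for i in range(runs):
        remote.set_state(api, 'kraken.test', 'Run {}'.format(i))

threads = [threading.Thread(target=tentacle, args=(1000,)) for _ in range(50)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()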

I still get segfaults as well:

           PID: 1252 (python)
           UID: 1001 (per)
           GID: 1001 (per)
        Signal: 11 (SEGV)
     Timestamp: Mon 2016-09-26 00:12:31 UTC (19h ago)
  Command Line: python -m homeassistant --daemon
    Executable: /usr/bin/python3.5
 Control Group: /user.slice/user-1001.slice/session-c3.scope
          Unit: session-c3.scope
         Slice: user-1001.slice
       Session: c3
     Owner UID: 1001 (per)
       Boot ID: 6447cfbb8a0c4b5d8cf2ac3a643f6d94
    Machine ID: 529280e30d1a42a2b88e3074fb0b13f8
      Hostname: alarmpi
      Coredump: /var/lib/systemd/coredump/core.python.1001.6447cfbb8a0c4b5d8cf2ac3a643f6d94.1252.1474848751000000.lz4
       Message: Process 1252 (python) of user 1001 dumped core.

(sorry about the close and comment 😳)

I was also getting fairly consistent segfaults with Python 3.6, but have not seen one since downgrading to Python 3.4.

I split out https://github.com/home-assistant/home-assistant/issues/3538 for the SIGABRT I was seeing. I tried upgrading to python 3.5.2 just in case, and kraken passed after 12000 iterations, but I left it running with my full config and got another segfault. Trying to get a proper bt, but I installed with pyenv and I'm not sure how to get a debug build with that yet.

@lwis got a repo going with two Dockers: https://github.com/lwis/test-kraken-hass

To repro it on my Mac:

  • changed docker-compose.yml to say python kraken.py 10000
  • changed hass/Dockerfile to say RUN pip install git+https://github.com/home-assistant/home-assistant.git@b3d67a7ed91030e9f2f42cc5b24a277ab8787c19
  • Updated kraken.py with this job function:
def job(i):
    # remote and api come from the surrounding kraken.py script
    try:
        remote.set_state(api, 'sensor.test_sensor_1', 'Run {}'.format(i))
        remote.set_state(api, 'sensor.test_sensor_2', 'Run {}'.format(i))
    except Exception:
        return

Kraken starts running after 20 seconds. I got a repro 50% of the time.

We should create a debug script based on this StackOverflow answer to debug things with docker-compose: http://stackoverflow.com/questions/30381410/debugging-linked-docker-containers-when-using-docker-compose/34322396#34322396

Looks like my catching of an exception doesn't work, prepare for a lot of output from Kraken while it tries but fails to finish the rest of its jobs.

Looks like there has been some confusing information in the chat about what happened to my install, so I'll clarify:
After applying #3515 to my install on Python 3.4.4 without uvloop, it ran completely without problems for 24 hours.
That was after 5 kraken script attacks (50 000 iterations) that evening. Last night I rebased off dev again, and it's still been running fine. Still no segfaults.

Guys, using this modified version of the kraken.py sensor, https://gist.github.com/joyrider3774/fe8816f8377a19393408ac5512db132c, I can make python3.exe crash on Windows. I only added the sensor to my existing config with 50 tentacles, so it seems Windows is affected as well?
The reason I modified the kraken.py sensor is that it uses the date command, which is specific to Linux, so I just replaced the seconds with a random float value. The crashes happen in less than a minute.
My Python version is 3.4.3, using the 32-bit executables on 64-bit Windows 10.

This is what I got on Windows when running through gdb; not sure if it's helpful and/or if I need some kind of debug build of Python:

Program received signal SIGSEGV, Segmentation fault.
0x690ce9e8 in python34!_PyWeakref_ClearRef ()
   from C:\WINDOWS\SYSTEM32\python34.dll
(gdb) bt
#0  0x690ce9e8 in python34!_PyWeakref_ClearRef ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#1  0x690a177e in python34!PyType_GenericNew ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#2  0x690742a1 in python34!PyMethod_New ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#3  0x690a14de in python34!PyType_GenericNew ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#4  0x690e6994 in python34!PyEval_EvalFrameEx ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#5  0x690e947f in python34!PyEval_EvalCodeEx ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#6  0x690e632c in python34!PyEval_GetGlobals ()
   from C:\WINDOWS\SYSTEM32\python34.dll
#7  0x00000000 in ?? ()

I had a crash now at 33 hours of running.
http://hastebin.com/oxiqavepil.vbs
Rebasing off the dev branch again now to pick up the max-10-threads change, and restarting.

Just got a segfault after about 20 minutes of running 0.29.1 :(

Another segfault after almost 2 hours, this time without uvloop. I'm not sure this is as stable as we thought.

I updated to 0.29.3 and left it running overnight with Python 3.5.2 and no uvloop. I disabled the nmap device tracker and template sensors, but I still had discovery and a template switch enabled. I had another segfault in the morning 😓

I'm trying to disable discovery and template switch next.

Updated to the latest today, and I still have random segfaults. I have disabled several things to try, but it makes no difference. The logs don't really indicate anything useful before a crash either.

I don't believe anything has been done to improve the situation yet; there was some discussion on collating information on all the segfaults to better understand the configurations and platforms people are experiencing them on.

It's a shame that the occasional brief period of stability is frequently a red herring.


I had a segfault overnight on 0.29.4. I've enabled core file saving here; let's see what GDB tells us.

I have compiled Python with debug flags and executed it through gdb. I managed to get the following information; maybe it's useful for someone with better knowledge of the Python garbage collector.

python3: Modules/gcmodule.c:364: update_refs: Assertion `_PyGCHead_REFS(gc) != 0' failed.

Line 364 of gcmodule.c is commented with:

        /* Python's cyclic gc should never see an incoming refcount
         * of 0:  if something decref'ed to 0, it should have been
         * deallocated immediately at that time.
         * Possible cause (if the assert triggers):  a tp_dealloc
         * routine left a gc-aware object tracked during its teardown
         * phase, and did something-- or allowed something to happen --
         * that called back into Python.  gc can trigger then, and may
         * see the still-tracked dying object.  Before this assert
         * was added, such mistakes went on to allow gc to try to
         * delete the object again.  In a debug build, that caused
         * a mysterious segfault, when _Py_ForgetReference tried
         * to remove the object from the doubly-linked list of all
         * objects a second time.  In a release build, an actual
         * double deallocation occurred, which leads to corruption
         * of the allocator's internal bookkeeping pointers.  That's
         * so serious that maybe this should be a release-build
         * check instead of an assert?

EDIT:
Backtrace: http://pastebin.com/YpMcXhra
Backtrace Full: http://pastebin.com/9X5fjCkC

I haven't seen any more segfaults since disabling discovery and template switch.

Is there a point at which it makes sense to revert the changes to the core?

I've pushed a branch that removes a possible issue with Python GC of the Task objects. If someone that has a segfault happen could give it a try and let me know if the segfaults persist that'd be great.

https://github.com/home-assistant/home-assistant/tree/fix/monkey-patch-asyncio
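
For context on what GC of Task objects can do (a general illustration, not necessarily what the branch changes): asyncio keeps only weak references to tasks, so a common defensive pattern is to hold a strong reference until each task completes:

import asyncio

PENDING_TASKS = set()

def create_tracked_task(loop, coro):
    # Hold a strong reference to the task until it finishes; the loop
    # itself only keeps weak references, so an otherwise-unreferenced
    # task can be garbage-collected while work on it is still pending.
    task = loop.create_task(coro)
    PENDING_TASKS.add(task)
    task.add_done_callback(PENDING_TASKS.discard)
    return task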

I have run every 0.29 release and had not had a segfault until today, when I upgraded to 0.29.5; since then I have had 2. I am running 30.dev now.

@mcradit if that still has segfaults, try my branch, which is based on the latest dev with one tweak to remove a possible GC issue.

Removed most of my template sensors and moved to the core version of wunderground.py (was still using the original in custom components) but still getting segfaults.

Running 0.29.5 on Python 3.4.2 Debian Jessie

● home-assistant.service - Home Assistant
   Loaded: loaded (/etc/systemd/system/home-assistant.service; enabled)
   Active: failed (Result: signal) since Fri 2016-09-30 17:07:00 EDT; 5h 16min ago
  Process: 14089 ExecStart=/srv/hass/hass_venv/bin/hass -c /home/hass/ (code=killed, signal=SEGV)
 Main PID: 14089 (code=killed, signal=SEGV)

@rpitera How long does it take before a seg-fault? Can you try my branch?

I haven't had any yet. I didn't change anything in my config. I have a few template sensors and Wunderground.

@bbangert It was up for at least 6-8 hours on 29.4, but only about 2-3 under 29.5.

I've never used anything besides releases so you'd have to nursemaid me through testing your branch.

@bbangert I just installed your branch and will let you know what I see.

Here's something I've noticed, though... I have an alias 'start' that I use all the time. It executes: "systemctl start home-assistant; journalctl -f -u home-assistant". As long as I keep that session open, I don't see any issues and my home-assistant instance keeps running. A couple of minutes after my session ends, or after I Ctrl-C out of journalctl, I lose HA...

@rpitera - I used: pip3 install git+git://github.com/home-assistant/home-assistant.git@fix/monkey-patch-asyncio

For the people experiencing segfaults, are you using the discovery component?

Yes I was, will remove it and start again.

@persandstrom please also use the patch by @bbangert

@bbangert I'm testing your branch monkey-patch-asyncio now. I'll post the results tomorrow.
thx

@balloob Disabling only discovery did not help. Trying to apply the patch now.

I'm using discovery. @bbangert's branch has been going solid for me for 9 hours.

Pretty serious issue... there should be no code changes until this is fixed.

@bbangert @balloob Good news!! No segfaults on my environment after running https://github.com/home-assistant/home-assistant/tree/fix/monkey-patch-asyncio Awesome!! :+1:

I updated to use https://github.com/home-assistant/home-assistant/tree/fix/monkey-patch-asyncio yesterday afternoon, left it running over night, and no segfaults 🎉
