Core: Threading issue with mcp23017 platform when used by both binary_sensor and switch integration

Created on 15 Feb 2020  ·  16Comments  ·  Source: home-assistant/core

The problems

When a mcp23017 platform is instantiated by e.g. a _binary_sensor_ and a _switch_ component (running in different threads):

  • there is no protection mechanism/mutex to protect non atomic operations, e.g. a read-modify-write sequences -> inconsistent results when both threads access hardware at the same time (typically when hass starts up and each thread runs its _setup_platform_)
  • platform is initialized twice and 2nd initialisation/thread may therefore overwrite settings already made by the first one, e.g. switch thread ahead and already configuring pin as output with binary_sensor thread behind and instantiating a new MCP23017 object (that resets all directions to input)

I fixed the issue in my view by adding (See Additional information section below):

  • a (class) mutex in the MCP23017 class instantiated by both binary_sensor and a switch components
  • thread synchronization when accessing MCP23017 using above mutex in both binary_sensor and switch components
  • a class boolean in the MCP23017 class indicating that initialization has already been performed

Note that issue may be generic/affect others components sharing a common platform

Environment

Problem-relevant configuration.yaml

binary_sensor:
  - platform: mcp23017
    i2c_address: 0x27
    scan_interval: 1
    invert_logic: true
    pins:
      8 : Button_0
      9 : Button_1
      10: Button_2
      11: Button_3
      12: Button_4
      13: Button_5
      14: Button_6
      15: Button_7

switch:
  - platform: mcp23017
    i2c_address: 0x27
    pins:
      0 : Sortie_0
      1 : Sortie_1
      2 : Sortie_2
      3 : Sortie_3
      4 : Sortie_4
      5 : Sortie_5
      6 : Sortie_6
      7 : Sortie_7

Traceback/Error logs

None

Additional information

/srv/homeassistant/lib/python3.7/site-packages/adafruit_mcp230xx.py:

  • Addition of :

    • mutex and isInit class variables

    • condition to prevent a second initialization

...
import threading

class MCP23017:
    """Initialize MCP23017 instance on specified I2C bus and optionally
    at the specified I2C address.
    """
    mutex  = threading.Lock()
    isInit = False

    def __init__(self, i2c, address=_MCP23017_ADDRESS):
        self._device = i2c_device.I2CDevice(i2c, address)
        if not MCP23017.isInit:
            MCP23017.isInit = True
            # Reset to all inputs with no pull-ups and no inverted polarity.
            self.iodir = 0xFFFF
            self.gppu = 0x0000
            self._write_u16le(_MCP23017_IPOLA, 0x0000)
...

/srv/homeassistant/lib/python3.7/site-packages/homeassistant/components/mcp23017/binary_sensor.py:

  • Protection of all MCP23017 accesses by a mutex
...

def setup_platform(hass, config, add_devices, discovery_info=None):
    """Set up the MCP23017 binary sensors."""
    pull_mode = config[CONF_PULL_MODE]
    invert_logic = config[CONF_INVERT_LOGIC]
    i2c_address = config[CONF_I2C_ADDRESS]

    i2c = busio.I2C(board.SCL, board.SDA)
    with adafruit_mcp230xx.MCP23017.mutex:
        mcp = adafruit_mcp230xx.MCP23017(i2c, address=i2c_address)

    binary_sensors = []
    pins = config[CONF_PINS]

    for pin_num, pin_name in pins.items():
        pin = mcp.get_pin(pin_num)
        binary_sensors.append(
            MCP23017BinarySensor(pin_name, pin, pull_mode, invert_logic)
        )

    add_devices(binary_sensors, True)

class MCP23017BinarySensor(BinarySensorDevice):
    """Represent a binary sensor that uses MCP23017."""

    def __init__(self, name, pin, pull_mode, invert_logic):
        """Initialize the MCP23017 binary sensor."""
        self._name = name or DEVICE_DEFAULT_NAME
        self._pin = pin
        self._pull_mode = pull_mode
        self._invert_logic = invert_logic
        self._state = None

        with adafruit_mcp230xx.MCP23017.mutex:
            self._pin.direction = digitalio.Direction.INPUT
            self._pin.pull = digitalio.Pull.UP

    @property
    def name(self):
        """Return the name of the sensor."""
        return self._name

    @property
    def is_on(self):
        """Return the state of the entity."""
        return self._state != self._invert_logic

    def update(self):
        """Update the GPIO state."""
        with adafruit_mcp230xx.MCP23017.mutex:
            self._state = self._pin.value

/srv/homeassistant/lib/python3.7/site-packages/homeassistant/components/mcp23017/switch.py:

  • Protection of all MCP23017 accesses by a mutex
def setup_platform(hass, config, add_entities, discovery_info=None):
    """Set up the MCP23017 devices."""
    invert_logic = config.get(CONF_INVERT_LOGIC)
    i2c_address = config.get(CONF_I2C_ADDRESS)

    i2c = busio.I2C(board.SCL, board.SDA)

    with adafruit_mcp230xx.MCP23017.mutex:
        mcp = adafruit_mcp230xx.MCP23017(i2c, address=i2c_address)

    switches = []
    pins = config.get(CONF_PINS)
    for pin_num, pin_name in pins.items():
        pin = mcp.get_pin(pin_num)
        switches.append(MCP23017Switch(pin_name, pin, invert_logic))
    add_entities(switches)


class MCP23017Switch(ToggleEntity):
    """Representation of a  MCP23017 output pin."""

    def __init__(self, name, pin, invert_logic):
        """Initialize the pin."""
        self._name = name or DEVICE_DEFAULT_NAME
        self._pin = pin
        self._invert_logic = invert_logic
        self._state = False

        with adafruit_mcp230xx.MCP23017.mutex:
            self._pin.direction = digitalio.Direction.OUTPUT
            self._pin.value = self._invert_logic

    @property
    def name(self):
        """Return the name of the switch."""
        return self._name

    @property
    def should_poll(self):
        """No polling needed."""
        return False

    @property
    def is_on(self):
        """Return true if device is on."""
        return self._state

    @property
    def assumed_state(self):
        """Return true if optimistic updates are used."""
        return True

    def turn_on(self, **kwargs):
        """Turn the device on."""
        with adafruit_mcp230xx.MCP23017.mutex:
            self._pin.value = not self._invert_logic
        self._state = True
        self.schedule_update_ha_state()

    def turn_off(self, **kwargs):
        """Turn the device off."""
        with adafruit_mcp230xx.MCP23017.mutex:
            self._pin.value = self._invert_logic
        self._state = False
        self.schedule_update_ha_state()
mcp23017

Most helpful comment

I still plan to work on this/fix this properly as soon as I can find some time
Cheers
jpc

All 16 comments

Hey there @jardiamj, mind taking a look at this issue as its been labeled with a integration (mcp23017) you are listed as a codeowner for? Thanks!

@jpcornil-git I'm working on PCF8574 integration, so your finding is very useful.
I'll wait to see what others have to say about this problem.
Do you have other hardware to test this problem? pcal9535a?

No I don't but can help eventually, e.g. add debug code to expose the issue/thread activities as I did for my MCP23017-based board (https://github.com/jpcornil-git/RPiHat_GPIO_Expander)
Cheers
jpc

@jpcornil-git thank you for the link!
Let's wait for others to look into that issue.

@StephenBeirlaen, this is most likely the reason we were having issues using the INTFLAG for implementing the Interrupt logic into the component.
In order to implement this changes to the Adafruit_CircuitPython_MCP230xx would need to be done through a PR first.
Adafruit's i2c_device implementation is designed so you can lock the i2c bus and device address: https://github.com/adafruit/Adafruit_CircuitPython_BusDevice/blob/master/adafruit_bus_device/i2c_device.py
But their implementation of the MCP230xx only locks the device address on every operation (read/write): https://github.com/adafruit/Adafruit_CircuitPython_MCP230xx/blob/master/adafruit_mcp230xx/mcp230xx.py
Some sort of refactoring of the library might be needed, but I haven't put that much thought into it yet. Any suggestions?

My changes aren't generic enough; I would protect/lock by I2C address ideally (when you have multiple identical devices at different addresses you don't want to block access for all of them), same with the initialization (you want to initialize all of them but only once not everytime a new instance of the same {device, address} is made)

Nice findings, it looks promising! I can take a closer look tomorrow.

Very nice and detailed post @jpcornil-git! I also found similar problems where multiple I2C signals appeared to 'collide'. I tried to document my findings (with a logic analyser measurement) here: https://github.com/jardiamj/homeassistant/issues/1#issuecomment-532432071

I now also see why multithreading could be the issue, previously I was confused by the locks already present in the Adafruit code. I could not explain how they weren't sufficient, since I lack some more Python experience to debug this.

My particular project uses switches to momentarily turn on outputs, and these outputs controls pulse relays (more or less, for simplification). The output of the relays are fed back into the MCP as inputs (binary_sensors), to check the current state of the pulse relay (since they are pulsed to toggle them on/off, I have to verify the current state). So it's very similar to RPiHat_GPIO_Expander.

Since the measurement of an input change happens directly after an output changed its state, this leads to simultaneous I2C traffic all the time. That means I can easily reproduce the issue :)

Now I am unsure how we should fix this (universally as @jpcornil-git means). As I understand it right now, the proposed changes above are a global lock, that blocks the I2C bus for every device, not per I2C address? So two I2C devices with different addresses cannot be communicated with in parallel?

I would protect/lock by I2C address ideally (when you have multiple identical devices at different addresses you don't want to block access for all of them)

Wouldn't allowing this make it possible for multiple processes to use the I2C bus at the same time, and potentially mix up two I2C transmissions?

The Adafruit code takes a lock of the I2C bus for a single R/W transaction (to protect that transaction) and this is fine at that level and not hurting (the bus is used for that transaction/can't be used for anything else as i2c is serial, i.e. you can't communicate in parallel).

The issue is when two threads are doing read-modify-write actions, e.g.:

In the MCP23017BinarySensor and MCP23017Switch __init__ code, you have:
self._pin.direction = digitalio.Direction.OUTPUT

The above calls the direction setter method from DigitalInOut class (defined in adafruit_mcp230xx/digital_inout.py)

self._mcp.iodir = _clear_bit(self._mcp.iodir, self._pin)

That line translates into two I2C transactions:

  • self._mcp.iodir (_clear_bit parameter) calls the iodir getter method from MCP23017 class (adafruit_mcp230xx/mcp23017.py)
    -> self._read_u16le(_MCP23017_IODIRA) [1st transaction]
  • [returned value is modified/one bit cleared]
  • self._mcp.iodir (assignation) calls the iodir setter method from MCP23017 class
    -> self._write_u16le(_MCP23017_IODIRA, val) [2nd transaction]

If between these two transactions, another thread is updating the same IODIRA register (because not waiting on a lock) it will be lost/overwritten by the 2nd transaction.

To solve the issue and make such read-modify-write transaction atomic, you have to take the lock on the device for the whole sequence -> need a lock per device instance or per address (one device instance per address).

Issue is not very visible because most functions do a single read or write (but not the case during the platform setup phase)

The second issue to solve is the initialization of the device when multiple instances of the same devices are created (in different threads), it should only happen once as second initialisation may overwrite configuration of the thread that is already ahead/configuring the device (above lock won't address that)

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@jpcornil-git any updates? Have problem. Thank you.

Hi Fumi,

I still plan to address this issue, just need to find time, hopefully
during this summer (the plan at least), as this is a complex one to
address/resolve.

Cheers,
jpc

Le dim. 24 mai 2020 à 11:46, Fumi Ju notifications@github.com a écrit :

@jpcornil-git https://github.com/jpcornil-git any updates? Have
problem. Thank you.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/home-assistant/core/issues/31867#issuecomment-633205640,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AJWC526BZPGEQ3EELKUC7M3RTDUHFANCNFSM4KV4FMSQ
.

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

I still plan to work on this/fix this properly as soon as I can find some time
Cheers
jpc

Looking forward to that :)

Thanks @jpcornil-git. For everyone else finding this thread, I'm still patching in what @jpcornil-git provided earlier, but I think I made a couple of tweaks because of library-related errors (can't remember what).

Here is the files that I'm currently using (synced as of HA Core 0.115.2):
mcp23017_patches.zip

There's a patch script in the zip that patches the homeassistant docker container (you need SSH access). It should ask you to vimdiff the files to confirm the changes (I'm using my mac with samba mounts to diff) then it will patch in the files. Diffs should be good for 0.115. There might be a syntax error with the patch script, good luck and your mileage may vary!

Was this page helpful?
0 / 5 - 0 ratings