I have an application that drives a number of I2C chips on the same bus, and I create one I2cDevice instance per chip. Sometimes the try-catch around I2cDevice.Write() logs an exception, and sometimes the whole panel dies and I have to power cycle it. So my question is: is the implementation of the Write method thread-safe, or do I have to ensure that manually?
I can't answer your questions for certain, but I'm pretty sure that it _should_ be thread safe, as long as you're only accessing the same device from a single thread. I have tried this myself and it seems to basically work. However, there's a known issue that sometimes, Write (or Read) causes an exception if the device does not answer in time, and maybe also if the bus is busy. A simple retry should help there.
After your panel is "killed", it fails to reply to any I2C commands? On any chip? Can you describe this situation a bit more closely, because that would be a serious problem.
That is a great question @danielloczi, thanks for asking. My answer assumes you are running on Linux; if you meant Windows, please let me know. The implementation on our side is thread safe: all we do is grab a pointer to the data of the message you want to send and pass it down to ioctl with the right flags, which actually sends the message to the I2C bus. That said, the I2C bus is a shared resource, so ideally you would only send or receive messages from one device at a time to avoid noise on the bus. The truth is that we haven't tested this extensively with many devices on the same bus and messages sent from different threads, but I can well believe that you would start getting a few errors when doing so. The bottom line: since we are still using a shared resource, my advice would be to make sure on your side (perhaps via a semaphore) that only one thread is using the bus at a time.
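The "one thread on the bus at a time" advice can be sketched as follows. This is a minimal Python illustration of the pattern only, not the library's API: `FakeBus`, `LockedDevice`, and the addresses are hypothetical stand-ins so the example runs without hardware. Every device wrapper shares one lock per bus and takes it before touching the bus.

```python
import threading
import time


class FakeBus:
    """Hypothetical stand-in for the physical bus; tracks overlapping transfers."""

    def __init__(self):
        self.active = 0
        self.max_active = 0
        self.transfers = 0
        self._stats_lock = threading.Lock()  # protects the counters only

    def transfer(self, address, data):
        with self._stats_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.0005)  # pretend the transfer takes some time on the wire
        with self._stats_lock:
            self.active -= 1
            self.transfers += 1


class LockedDevice:
    """One chip on the bus; every write takes the bus-wide lock first."""

    def __init__(self, bus, address, bus_lock):
        self._bus = bus
        self._address = address
        self._bus_lock = bus_lock

    def write(self, data):
        with self._bus_lock:
            self._bus.transfer(self._address, bytes(data))


bus = FakeBus()
bus_lock = threading.Lock()  # one lock per physical bus, shared by all devices
devices = [LockedDevice(bus, addr, bus_lock) for addr in (0x20, 0x21, 0x48)]


def worker(device):
    for i in range(20):
        device.write([i])


threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bus.transfers, bus.max_active)
```

With the shared lock in place, `max_active` never exceeds 1 even though three threads write concurrently. In .NET the same idea would typically be a shared `lock` object or a `SemaphoreSlim` around each bus access.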
I work with a Microchip SAMA5D27C-D1G-CU CPU, the operating system is Linux.
When I talk about killing the panel, I mean that my app and even the OS freeze, all connections to the panel are lost, and the heartbeat LED stops too. The only way to bring it back to life is to power cycle the panel. Yes, that doesn't sound so good. :)
So if I understand correctly: with a single I2cDevice instance, calling the Write() method from multiple threads is OK, because the method itself is thread-safe.
But when multiple I2cDevice instances access the same bus (that's my case), the device instances don't know about each other and there is no mechanism in the background that serializes the calls onto a single thread, so I need to implement something myself that queues the calls. Is that correct?
I'm going to implement a test app that sends a huge amount of data, like the real app does, but in a single-threaded, serialized manner inside a while(true) loop. I want to know whether I can kill the panel with this app too. If not, then queuing will solve my problem. I'll let you know the result.
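One way to queue calls from several device instances onto a single thread is a one-worker executor. The sketch below is a Python illustration under stated assumptions: `fake_write` and `write_serialized` are hypothetical names, and the fake write only records which thread ran it instead of touching hardware.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A single worker thread owns the bus; callers only enqueue work.
bus_executor = ThreadPoolExecutor(max_workers=1)
executed_on = []  # thread ids that actually performed a "write"


def fake_write(address, data):
    """Hypothetical bus write; only ever runs on the worker thread."""
    executed_on.append(threading.get_ident())
    return len(data)


def write_serialized(address, data):
    """Queue the write and block until the worker has completed it."""
    return bus_executor.submit(fake_write, address, data).result()


def caller(address):
    for i in range(50):
        write_serialized(address, bytes([i]))


threads = [threading.Thread(target=caller, args=(addr,)) for addr in (0x20, 0x21, 0x48)]
for t in threads:
    t.start()
for t in threads:
    t.join()
bus_executor.shutdown()

print(len(executed_on), len(set(executed_on)))
```

All 150 writes run on the same worker thread, in the order they were queued. Compared with a plain lock, a queue also makes it easy to add priorities or batching later, at the cost of an extra thread hop per call.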
Sounds like a good test. I have actually written a program that talks to several I2C devices from different threads (two ADCs and an LCD display) on my Raspberry Pi 4. I haven't run it for very long yet, but so far it has worked OK. It certainly never crashed the whole OS.
> So if I understand correctly if we are talking about a single I2cDevice instance, calling the Write() method from multiple threads is ok, because the method itself is thread-safe.
When I said that I2cDevice.Write was thread safe, I didn't mean that it serializes the calls to Write; it currently doesn't. What I meant is that the only state this method relies on is the parameters passed in to it, and it doesn't modify any state on the I2cDevice itself, so calling it from multiple threads won't alter state or change the execution of any single thread. However, the one shared resource the Write method uses is the call to the I2cBus to send the message, so that part might still cause issues when done from multiple threads, even on a single device. I am saying all of this because I'm not 100% sure whether the ioctl P/Invoke we use to send messages through the bus is thread-safe; it may very well be that this native call does the serialization for us, so I'm very interested in the results you get from your test. I did some research on how other frameworks deal with this and couldn't find any that do their own serialization of calls, so most do the same thing we do.
@danielloczi the bus should be thread-safe; the I2cDevice is arguable (the Write operation itself is fine, but you will most likely corrupt the protocol used by the device if you use it from multiple threads).
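The protocol-corruption concern is easiest to see with a chip that uses a register pointer: reading a register is a write (set the pointer) followed by a read, and another thread's write in between changes what the read returns. The Python sketch below uses a hypothetical `FakeRegisterChip` to show the usual remedy, which is to hold one lock for the whole multi-step transaction rather than per call.

```python
import threading


class FakeRegisterChip:
    """Hypothetical chip: write() sets a register pointer, read() returns that register."""

    def __init__(self, registers):
        self._registers = registers
        self._pointer = 0

    def write(self, data):
        self._pointer = data[0]

    def read(self):
        return self._registers[self._pointer]


chip = FakeRegisterChip({0x00: 0xAA, 0x01: 0xBB})
transaction_lock = threading.Lock()


def read_register(register):
    # The lock spans both steps, so no other thread can move the
    # pointer between "select register" and "read value".
    with transaction_lock:
        chip.write([register])
        return chip.read()


mismatches = []


def reader(register, expected):
    for _ in range(2000):
        value = read_register(register)
        if value != expected:
            mismatches.append((register, value))


threads = [
    threading.Thread(target=reader, args=(0x00, 0xAA)),
    threading.Thread(target=reader, args=(0x01, 0xBB)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(mismatches))
```

Without `transaction_lock`, each individual `write` and `read` would still complete without error, yet a reader could occasionally receive the other register's value. That is why a thread-safe `Write` alone does not make a device safe to share.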
What exception are you getting? Please see also https://github.com/dotnet/iot/issues/832 if this isn't related
> What exception are you getting?
I'm getting these two exceptions:
System.IO.IOException: Error 110 performing I2C data transfer.
System.IO.IOException: Error 121 performing I2C data transfer.
Well, 110 is something new, but 121 has been seen before, see #832. It is suspected to happen when either the bus is busy or the device doesn't answer in time. So far, a retry always seemed to help. Would be good if you could test that.
Of course, most tests have been done using Raspberry Pi's, so things might work (or not work) a bit differently on your hardware / operating system flavor.
There is already retry logic in the code, which tries 4 times by default. I log the exceptions to the console, and most of the time the 2nd or 3rd attempt is successful.
I'm testing which exception occurs just before the panel freezes. I guess it will be the 110.
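A retry wrapper of the kind described here might look like the following Python sketch. It is an illustration rather than the poster's actual code: `FlakyDevice` and `write_with_retry` are hypothetical, the default of 4 attempts mirrors the number mentioned above, and treating errors 110 and 121 as retryable is an assumption based on this thread.

```python
import time

# Error codes reported in this thread; assumed here to be worth retrying.
RETRYABLE_ERRNOS = {110, 121}


class FlakyDevice:
    """Hypothetical device that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def write(self, data):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise OSError(121, "Remote I/O error")
        return len(data)


def write_with_retry(device, data, max_attempts=4, delay_s=0.0):
    """Retry retryable I/O errors; re-raise anything else or the final failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return device.write(data)
        except OSError as exc:
            if exc.errno not in RETRYABLE_ERRNOS or attempt == max_attempts:
                raise
            time.sleep(delay_s)  # optional pause before the next attempt


device = FlakyDevice(failures_before_success=2)
written = write_with_retry(device, b"\x01\x02")
print(written, device.calls)
```

Here the write succeeds on the third attempt, matching the "2nd or 3rd attempt" pattern reported above. A small non-zero `delay_s` may help if the failures come from a busy bus, though that is untested here.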
We've had no reports of error 110 so far. The OS description for it is "The timeout for the connection has elapsed", which is not very useful :-(
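Assuming the numbers in these exception messages are raw Linux errno values (likely, but not confirmed in this thread), they map to `ETIMEDOUT` (110) and `EREMOTEIO` (121) on common Linux architectures such as ARM and x86. A small Python helper can make log lines more readable; the `describe` function and the hard-coded table are illustrative only.

```python
import os

# Linux errno names for the two codes seen in this thread (generic
# architectures such as ARM and x86; some platforms number them differently).
LINUX_I2C_ERRNOS = {
    110: "ETIMEDOUT",
    121: "EREMOTEIO",
}


def describe(code):
    """Return 'code (NAME): OS message' for logging."""
    name = LINUX_I2C_ERRNOS.get(code, "unknown")
    return f"{code} ({name}): {os.strerror(code)}"


for code in (110, 121):
    print(describe(code))
```

The message text after the colon comes from the operating system and varies by platform, so only the symbolic name should be relied on when comparing logs.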
I've created the test app, which drives all chips in a while(true) loop without delay. It works fine. I left the test app running for a long time, and didn't receive any exception. So communication looks good.
I also did a bunch of other tests, and it seems that the panel freeze issue is not related to the software stack. At this point I'm pretty sure that in my case the problem is caused by the input event driver of the operating system or by the hardware. I also tried these cases with Python, and the Python code behaves the same.
@danielloczi So this means there's no issue with _this_ library? Can this be closed as "problem of the hardware/os driver", then?
Yes, the case can be closed. Thank you guys!