Our automated tests for the tls-client example in mbed-os-example-tls fail with the following error message printed to the serial console (target UBLOX_EVK_ODIN_W2):
mbedtls_ssl_handshake() failed: -0x7780 (-30592): SSL - A fatal alert message was received from our peer
When we enable debug printing, we observe that the TLS connection terminates prematurely: the server sends the tls-client a fatal alert message because the MAC of a TLS record does not check out:
...
ssl_tls.c:3961: |2| got an alert message, type: [2:20]
ssl_tls.c:3969: |1| is a fatal alert message (msg 20)
ssl_tls.c:3744: |1| mbedtls_ssl_handle_message_type() returned -30592 (-0x7780)
ssl_cli.c:3184: |1| mbedtls_ssl_read_record() returned -30592 (-0x7780)
ssl_tls.c:6354: |2| <= handshake
mbedtls_ssl_handshake() failed: -0x7780 (-30592): SSL - A fatal alert message was received from our peer
...
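For readers decoding the log above: the pair printed as [2:20] is the alert level followed by the alert description. The following is a minimal sketch, assuming the numeric values from RFC 5246, section 7.2; the table and the function names are hypothetical helpers, not part of Mbed TLS, and the table covers only a subset of the defined alerts.

```c
#include <stddef.h>

/* Alert levels from RFC 5246, section 7.2. */
#define TLS_ALERT_LEVEL_WARNING 1
#define TLS_ALERT_LEVEL_FATAL   2

/* Hypothetical lookup table (not part of Mbed TLS) covering a subset of the
 * alert descriptions defined in RFC 5246. */
struct tls_alert_name {
    int code;
    const char *name;
};

static const struct tls_alert_name tls_alert_names[] = {
    {  0, "close_notify" },
    { 10, "unexpected_message" },
    { 20, "bad_record_mac" },
    { 40, "handshake_failure" },
    { 42, "bad_certificate" },
    { 48, "unknown_ca" },
    { 50, "decode_error" },
    { 51, "decrypt_error" },
    { 70, "protocol_version" },
    { 80, "internal_error" },
};

/* Returns the RFC name of an alert description, or "unknown" when the code
 * is not in the table above. */
const char *tls_alert_description(int code)
{
    size_t i;
    for (i = 0; i < sizeof(tls_alert_names) / sizeof(tls_alert_names[0]); i++) {
        if (tls_alert_names[i].code == code) {
            return tls_alert_names[i].name;
        }
    }
    return "unknown";
}

/* Returns 1 when the alert level is fatal, 0 otherwise. */
int tls_alert_is_fatal(int level)
{
    return level == TLS_ALERT_LEVEL_FATAL;
}
```

Calling tls_alert_is_fatal(2) and tls_alert_description(20) on the values from the log gives a fatal alert named bad_record_mac, which matches the failed MAC check described above.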
We investigated the problem and found that disabling the AES hardware acceleration code fixes it. To test this, we used the following diff:
diff --git a/features/mbedtls/targets/TARGET_STM/TARGET_STM32F4/TARGET_STM32F439xI/mbedtls_device.h b/features/mbedtls/targets/TARGET_STM/TARGET_STM32F4/TARGET_STM32F439xI/mbedtls_device.h
index dfbc820..2c2fff8 100644
--- a/features/mbedtls/targets/TARGET_STM/TARGET_STM32F4/TARGET_STM32F439xI/mbedtls_device.h
+++ b/features/mbedtls/targets/TARGET_STM/TARGET_STM32F4/TARGET_STM32F439xI/mbedtls_device.h
@@ -20,8 +20,6 @@
#ifndef MBEDTLS_DEVICE_H
#define MBEDTLS_DEVICE_H
-#define MBEDTLS_AES_ALT
-
#define MBEDTLS_SHA256_ALT
#define MBEDTLS_SHA1_ALT
Target
STM32F439xI family of devices with hardware acceleration enabled
Toolchain:
GCC_ARM
mbed-os sha:
Git tag mbed-os-5.5.5
Expected behavior
The tls-client example should succeed.
Actual behavior
The tls-client example fails with error:
mbedtls_ssl_handshake() failed: -0x7780 (-30592): SSL - A fatal alert message was received from our peer
Steps to reproduce
Run the tls-client example from the mbed-os-example-tls repository (with the mbed-os-5.5.4 tag) using the GCC_ARM toolchain on the UBLOX_EVK_ODIN_W2 target. The failure message can be observed in the serial output.
cc @RonEld @Patater @0xc0170
Hi @andresag01 Thanks for raising this. I don't understand the description: is the target device STM32F439xI or UBLOX_EVK_ODIN_W2?
Also, do you happen to know if the AES used here is AES192 by any chance?
@RonEld UBLOX_EVK_ODIN_W2 is a TARGET_STM32F439xI, so it is affected.
I see, thanks,
cc @andreaslarssonublox @andreaspeterssonublox
cc @adustm
This issue doesn't affect only u-blox targets. It affects at least all STM32F439xI-family targets that support AES hardware acceleration.
@RonEld: The ciphersuite used for this specific server and example is TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256. So I suppose it's AES-128.
Hello, Thanks for reporting. I have reproduced the issue and will look at it.
Seems that using the HW acceleration for crypto also breaks the SD-cards init.
Hello, I have a question. Is it possible that once the issue happens ('TLS handshake failure'), the server refuses a new connection from my IP address for a while?
It looks like it is difficult to reconnect when pressing the reset button several times in a row:
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Starting the TLS handshake...
mbedtls_ssl_handshake() failed: -0x7780 (-30592): SSL - A fatal alert message
was received from our peer
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Failed to connect
MBED: Socket Error: -3009
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Failed to connect
MBED: Socket Error: -3009
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Failed to connect
MBED: Socket Error: -3009
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Failed to connect
MBED: Socket Error: -3009
Using Ethernet LWIP
Client IP Address is 192.168.1.100
Connecting with developer.mbed.org
Failed to connect
MBED: Socket Error: -3009
Seems that using the HW acceleration for crypto also breaks the SD-cards init.
@JanneKiiskila Why would you suggest this? AES and SD/SPI should be completely unrelated. Do you have an application example of this failure?
@adustm: I looked up the error number -3009 and found this in mbed-os/features/netsocket/nsapi_types.h:
NSAPI_ERROR_DNS_FAILURE = -3009, /*!< DNS failed to complete successfully */
Also, from the tls-client app error message you got, it seems the failure was in line tls-client/main.cpp:205:
mbedtls_printf("Connecting with %s\r\n", _domain);
ret = _tcpsocket->connect(_domain, _port);
if (ret != NSAPI_ERROR_OK) {
    mbedtls_printf("Failed to connect\r\n");
    printf("MBED: Socket Error: %d\r\n", ret);
    _tcpsocket->close();
    return;
}
It looks to me like the device is unable to resolve the host name. Perhaps the device is not on the network, or the server is somehow unreachable? Or perhaps some network configuration causes your device to return this error when it is reset too many times in quick succession? Several functions in ./features/netsocket/nsapi_dns.h could return that specific error code, so you could try looking there.
I suppose that it is also possible for servers to refuse connections from the same IP in quick succession, but I would expect the error to have a different value. Of course, I could be wrong...
I just wanted to quickly ask if there were any updates regarding this issue...
Dear all,
I've tried disabling/enabling interrupts during the HW processing, removing the AES_FORCE_RESET in the aes_free function, and some other things. No clue at the moment...
Attached are the log files of the TLS handshake (teratermaes_sw.txt is the working version without AES HW acceleration, teraterm_aes_alt.txt is the failing version with AES HW acceleration). Both were captured with DEBUG LEVEL 4.
Could someone look at them?
At line 762 of the log files, we can see that the failing version receives a message length of 2 instead of 202.
teratermaes_sw.txt
teraterm_aes_alt.txt
Kind regards
Armelle
Hi @adustm As you can see, it's not only a different message length, it's a different message. The msgtype is 21 (alert message) instead of 22 (handshake message). The reason for the TLS failure is a fatal alert message received from the server. We need to investigate the reason for the alert, and why the server rejects the handshake when HW acceleration is used. I suggest you test AES-GCM with and without HW-accelerated AES; perhaps there is something wrong with this part of the message.
The SD-card issue for us is related to the fact that we encrypt the SD-card content, so it seems the HW crypto block doesn't work reliably. With the mbed-os-example-client we see the TLS failure.
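To make the msgtype and length comparison concrete, here is a minimal sketch of reading the 5-byte TLS record header, assuming the layout from RFC 5246, section 6.2.1 (content type, two version bytes, big-endian length). The function names are hypothetical and are not Mbed TLS APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Record content types from RFC 5246, section 6.2.1. */
#define TLS_CONTENT_TYPE_CHANGE_CIPHER_SPEC 20
#define TLS_CONTENT_TYPE_ALERT              21
#define TLS_CONTENT_TYPE_HANDSHAKE          22
#define TLS_CONTENT_TYPE_APPLICATION_DATA   23

/* A TLS record header is: type (1 byte), version (2 bytes), length (2 bytes). */
#define TLS_RECORD_HEADER_LEN 5

/* Returns the content type of the record, or -1 if the buffer is too short
 * to hold a full header. Hypothetical helper, not an Mbed TLS function. */
int tls_record_type(const uint8_t *hdr, size_t len)
{
    if (hdr == NULL || len < TLS_RECORD_HEADER_LEN) {
        return -1;
    }
    return hdr[0];
}

/* Returns the big-endian payload length of the record, or -1 if the buffer
 * is too short to hold a full header. */
int tls_record_length(const uint8_t *hdr, size_t len)
{
    if (hdr == NULL || len < TLS_RECORD_HEADER_LEN) {
        return -1;
    }
    return (hdr[3] << 8) | hdr[4];
}
```

An unencrypted alert carries exactly two payload bytes (level and description), so a header of the form 21, 3, 3, 0, 2 would be consistent with the length of 2 seen in the failing log, while a handshake record of length 202 would start with 22 and end with 0, 202. The version bytes here are only illustrative.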
Hi @adustm,
Could you check whether HAL_CRYP_AESECB_Encrypt has failed on this device, and since mbedtls_aes_encrypt doesn't return error, the driver's error wasn't surfaced up?
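The concern above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: fake_hal_encrypt_block mimics a driver call that can fail, and the two wrappers show how a void wrapper hides that failure while an int-returning wrapper propagates it. None of these names exist in the ST HAL or in Mbed TLS.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical status that the simulated driver will report (0 = success). */
static int fake_hal_status = 0;

/* Lets a caller inject a driver failure into the simulation. */
void fake_hal_set_status(int status)
{
    fake_hal_status = status;
}

/* Hypothetical stand-in for a vendor HAL call. On failure the output buffer
 * is left untouched and a non-zero status is returned. */
static int fake_hal_encrypt_block(const uint8_t in[16], uint8_t out[16])
{
    if (fake_hal_status != 0) {
        return fake_hal_status;
    }
    memcpy(out, in, 16); /* placeholder for the real cipher operation */
    return 0;
}

/* Void wrapper: the driver status is dropped, so the caller cannot tell
 * that the output buffer may still hold stale data. */
void wrapper_encrypt_void(const uint8_t in[16], uint8_t out[16])
{
    (void)fake_hal_encrypt_block(in, out);
}

/* Checked wrapper: the driver status is passed up to the caller. */
int wrapper_encrypt_checked(const uint8_t in[16], uint8_t out[16])
{
    return fake_hal_encrypt_block(in, out);
}
```

With the void variant a failed hardware operation would only show up later, for example as a MAC mismatch on the peer, which is why checking the driver status directly is a useful first diagnostic.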
Hi @adustm,
Code freeze for Mbed OS 5.5.6 is tomorrow (2017-08-24). Will a fix be ready by then? If not, could you please review https://github.com/ARMmbed/mbed-os/pull/4934 ?
Thanks
Hello @RonEld
Could you check whether HAL_CRYP_AESECB_Encrypt has failed on this device, and since mbedtls_aes_encrypt doesn't return error, the driver's error wasn't surfaced up?
No error was returned by HAL_CRYP_AESECB_Encrypt.
GCM selftest is also fine (tested with both master branch and mbed-os-5.5 branch).
test case: 'mbedtls_gcm_self_test' ........................................................... OK in 2.58 sec
Would you like to suggest another test?
Kind regards
Armelle
I have modified gcm.c so that it can test 2 ctx instances in parallel, and it's all OK. The AES hardware appears to save and restore the context correctly.
Would someone have a multi-threaded AES example that I could work on?
Kind regards
Armelle
Hi @adustm
The alert message that is received is MBEDTLS_SSL_ALERT_MSG_BAD_RECORD_MAC, so I am quite positive that the GCM result is not as expected. Probably the keys used on the two sides are different. Since GCM uses AES, I would focus on the AES part, as you are doing.
I think your direction on multi-threading is correct.
Regards,
Ron
Hello @RonEld
I have rewritten the gcm_selftest in order to launch 5 GCM threads in parallel (see the attached main.txt file; rename it to main.cpp if you want to test it).
It's all OK.
| target | platform_name | test suite | result | elapsed_time (sec) | copy_method |
+-----------------------+---------------+----------------------+--------+--------------------+-------------+
| NUCLEO_F439ZI-GCC_ARM | NUCLEO_F439ZI | tests-mbedtls-thread | OK | 19.89 | shell |
+-----------------------+---------------+----------------------+--------+--------------------+-------------+
Any other ideas?
@JanneKiiskila could I access your program to test it?
Hi @adustm At the moment, my best guess is that there was some preemption that caused the HW to load a different key. Perhaps it only shows up in a combined GCM + AES multi-threading scenario.
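The hypothesis above can be sketched with a deterministic simulation. This is only an illustration of the suspected failure mode, not the actual driver code or the eventual fix: the single key register, the XOR "cipher", and all names are assumptions made for the example.

```c
#include <stdint.h>

/* Simulated peripheral with a single key register shared by all contexts. */
static uint8_t hw_key_register;

static void hw_load_key(uint8_t key)
{
    hw_key_register = key;
}

/* Toy operation standing in for a block cipher: XOR with the loaded key.
 * This is NOT encryption, it only makes the key dependency visible. */
static uint8_t hw_process(uint8_t in)
{
    return (uint8_t)(in ^ hw_key_register);
}

/* Software context that remembers which key it was configured with. */
struct sim_ctx {
    uint8_t key;
};

void sim_setkey(struct sim_ctx *ctx, uint8_t key)
{
    ctx->key = key;
    hw_load_key(key);
}

/* When reload_key is zero, the context trusts that the hardware still holds
 * its key; when non-zero, it reloads the key before every operation. */
uint8_t sim_encrypt(const struct sim_ctx *ctx, uint8_t in, int reload_key)
{
    if (reload_key) {
        hw_load_key(ctx->key);
    }
    return hw_process(in);
}

/* Interleaves two contexts and returns 1 if context A produced a result
 * computed with the wrong key, 0 if the result is correct. */
int sim_interleaved_mismatch(int reload_key)
{
    struct sim_ctx a, b;
    const uint8_t input = 0x5A;
    uint8_t expected, actual;

    sim_setkey(&a, 0x11);
    expected = (uint8_t)(input ^ 0x11);
    sim_setkey(&b, 0x22); /* second context overwrites the shared register */
    actual = sim_encrypt(&a, input, reload_key);
    return actual != expected;
}
```

In this model sim_interleaved_mismatch(0) reports a mismatch and sim_interleaved_mismatch(1) does not. A real driver would additionally need to guard against preemption in the middle of an operation, for example with a mutex around the key load and the processing step.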
Hei,
@adustm - I know STM is a member of mbed Cloud Partners, so you have access to the repositories that contain the SW we are running.
Email was sent with a bit more details.
Can we raise this to blocker, please.
The fix will get CI once CI is back running.
@0xc0170 can you add a reference here to the fixing PR?
I think this is the PR: https://github.com/ARMmbed/mbed-os/pull/4934
That's not the fix, though, that's a workaround. I guess ST, maybe @adustm, is still fighting the problem?
Hello,
The fix is finally here, in PR #5018 (with explanations).
Kind regards
Armelle