Mbed-os: BLE: Security Manager doesn't restore saved security parameters

Created on 29 Oct 2019 · 12Comments · Source: ARMmbed/mbed-os

Description of defect

BLE SecurityManager fails to restore saved security parameters from non-volatile memory after reset.

Target(s) affected by this defect ?

Testing using EP_AGORA and NRF52840_DK

Toolchain(s) (name and version) displaying this defect ?

Testing with GCC ARM

What version of Mbed-os are you using (tag or sha) ?

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

How is this defect reproduced ?

Build a BLE app that approves all pairing requests and has an on-board flash/filesystem. I have tried with both LittleFS and FATFileSystem using an internal FlashIAPBlockDevice and an external SPIF/QSPIFBlockDevice. Neither worked as expected.

I initialize the security manager with a file path to use and tell it to save security information -- SecurityManager::preservceBondingStateOnReset()

I am using Nordic's nRFConnect utility to then connect to and initiate pairing with the Mbed BLE device. The pairing goes as expected.

Then I disconnect and reset the Mbed BLE device. On startup, I check the supposedly preserved bonding table:

ble.securityManager().generateWhitelistFromBondTable()

Once the whitelistFromBondTable() callback is executed, the whitelist size is always 0.

I stepped through the initialization of the security manager at this stage (after pairing and a power cycle) and found that it is reading back an inconsistent security database version number, which causes the SecurityManager to reinitialize (delete) the security database...

I am seeing this happen at this line:

https://github.com/ARMmbed/mbed-os/blob/f1392d6b6bf8b6829aed6fd4b2392b7927cd63bc/features/FEATURE_BLE/source/generic/FileSecurityDb.cpp#L108-L121

I would suggest using the BLE_SM example in mbed-os-example-ble to try to reproduce this issue, but that example is broken in other ways.

I will continue to debug what is happening. Am I missing a step?

CLOSED mirrored bug

Source

AGlass0fMilk

👀1

Most helpful comment

I think a better work around would be to make sure the security manager closes the database file when not actively reading/writing to it.

I will work on a PR.

AGlass0fMilk on 31 Oct 2019

👍2

All 12 comments

@loverdeg-ep @trowbridgec @paul-szczepanek-arm @maclobdell

AGlass0fMilk on 29 Oct 2019

Internal Jira reference: https://jira.arm.com/browse/MBOTRIAGE-2295

ciarmcom on 29 Oct 2019

Maybe the file is not there? That function will create a blank file if one isn't there so reading the version from the blank file will yield the wrong version. Can you check the file is there before the function is run?
also @desmond-blue

paul-szczepanek-arm on 30 Oct 2019

I have confirmed that the file exists using the retargeted C file IO calls after initializing the filesystem.

When I try to read it, the size of the file is 0 and so no bytes are read back. This is after the security manager has initialized a new database, I then pair with the Mbed BLE device, disconnect, reset the chip, and step-debug the file read/write operations at filesystem initialization. The file is still size 0.

It seems the security manager or retargeted file I/O calls aren't successfully writing to disk or something.

Code:


    // Check if the security database file exists
    FILE* db_file = fopen(pairing_file_name, "rb+");
    if(db_file == NULL) {
        printf("filesystem: ble security file not found!\r\n");
    } else {
        printf("filesystem: ble security file exists\r\n");
        if(fseek(db_file, 0, SEEK_SET)) {
            printf("filesystem: db seek failed\r\n");
        } else {
            printf("filesystem: db seek success\r\n");
        }
        uint16_t version = 0;
        size_t num_read = fread(&version, sizeof(version), 1, db_file);
        printf("filesystem: numread - %d, db version - %d\r\n",
                num_read, version);
        fclose(db_file);
    }

AGlass0fMilk on 30 Oct 2019

this sounds like the problem is with files and not ble

paul-szczepanek-arm on 30 Oct 2019

this sounds like the problem is with files and not ble

I'm writing a test to write/read back a file as the BLE SM does... results in a few.

AGlass0fMilk on 30 Oct 2019

The following test code appears to work fine:

    // Test read/write file
    FILE* file = fopen("/fs/test.dat", "rb+");
    if(file == NULL) {
        printf("filesystem: creating test file...\r\n");
        file = fopen("fs/test.dat", "wb+");
        fseek(file, 0, SEEK_SET);
    uint16_t version = 1;
    uint8_t test_string[] = "hello!";
    size_t num_written = fwrite(&version, sizeof(version), 1, file);
    printf("filesystem: writing version, num written: %d\r\n", num_written);
    num_written = fwrite(test_string, sizeof(uint8_t), 6, file);
    printf("filesystem: writing test string, num written: %d\r\n", num_written);
    } else {
        printf("filesyste: reading test file...\r\n");
        fseek(file, 0, SEEK_SET);
        uint16_t version = 0;
        uint8_t test_string[10];
        size_t num_read = fread(&version, sizeof(version), 1, file);
        printf("filesystem: read %d bytes, version: %d\r\n", num_read, version);
        num_read = fread(&test_string, sizeof(uint8_t), 10, file);
        printf("filesystem: read %d bytes, string: %s\r\n", num_read, test_string);
    }

    fclose(file);

The file persists through power cycles as expected and can be read back. Going to start looking at how BLE SM handles its file operations.

AGlass0fMilk on 30 Oct 2019

It appears fclose is never called on the database file during typical operation of the BLE SM. fclose is only called in FileSecurityDb::erase_db_file (when zeroing out the file) and in the FileSecurityDb destructor.

The BLE SM only appears to explicitly call delete on its FileSecurityDb instance (which is dynamically allocated with new) in the event of an initialization error. I have only looked at the ble::generic::GenericSecurityManager code at this point.

Even if the BLE SM destructor calls delete on its FileSecurityDb, since the BLE subsystem is a singleton it typically isn't ever destroyed during normal program operation as far as I know.

Not sure if this is the cause of the problem, but the lack of an fclose may be leaving the file on the disk in a "dirty" state or the file I/O subsystem never flushes the written data to non-volatile memory.

AGlass0fMilk on 30 Oct 2019

I added a call to delete _db; in GenericSecurityManager::reset_() and now the file appears to be successfully written to and read back from disk on reset:

I can see the MAC address of the Nordic BLE dongle I am pairing to is saved in the flash and restored on reset, however it does not retain any of the keys (all the encryption keys and security info are 0 in my debugger it seems).

Not sure where it would be best to put the call to delete the security db or just call fclose on the file.

@paul-szczepanek-arm @desmond-blue

I am going to look into why the keys aren't being saved/reloaded properly.

AGlass0fMilk on 30 Oct 2019

Alright, so I was mistaken in that the keys were also being stored. I wasn't setting the "role reversal" hint and so only one set of long term keys (LTKs) were being exchanged.

I believe the original issue has been solved by the fix in my previous comment.

However, now I think I am encountering another bug. I have not yet enabled the use of privacy in my application (baby steps at the moment). So when I try to generate a whitelist from the bond table, the SecurityDb implementation passes over the entry for the bonded nRF Connect device due to this line of code:

https://github.com/ARMmbed/mbed-os/blob/cbf9f0650b650e232cf336af8e733d9f2203ab60/features/FEATURE_BLE/ble/generic/SecurityDb.h#L591-L593

I am not sure why having an Identity Resolving Key (IRK) is required to be included in the whitelist generated by the bond table. @paul-szczepanek-arm, can you clarify the intent behind this check?

Shouldn't any peer in the bond table be reported as part of the whitelist regardless of whether privacy features were used during pairing?

AGlass0fMilk on 30 Oct 2019

I think a better work around would be to make sure the security manager closes the database file when not actively reading/writing to it.

I will work on a PR.

AGlass0fMilk on 31 Oct 2019

👍2

Thanks for your uphill effort @AGlass0fMilk