Synapse: New homeserver doesn't know about cross-signing keys created before it was set up

Created on 14 May 2020 · 18 Comments · Source: matrix-org/synapse

Not sure if this is related, but I recently set up a new homeserver B and am missing everyone's master_keys and self_signing_keys except for people who have since set up their E2E cross signing keys.

When I compared a sample user in device_lists_remote_extremeties I found that their stream_id was much larger (59005256) than on my long-running homeserver A (26308567). Unfortunately I can't check the other end (as it's matrix.org).

I checked out #7453 and inserted their user_id into device_lists_remote_resync. After the row was removed from the table I still didn't have master_keys or self_signing_keys for the user. However, I did notice a timeout entry for the user_id in the log:

2020-05-12 22:04:53,641 - synapse.http.matrixfederationclient - 408 - INFO -  - {GET-O-3986} [matrix.org] Sending request: GET matrix://matrix.org/_matrix/federation/v1/user/devices/<user_id>; timeout 30.000000s
2020-05-12 22:04:53,651 - synapse.handlers.presence - 343 - INFO - persist_presence_changes-1 - Persisting 3 unpersisted presence updates
2020-05-12 22:04:53,809 - synapse.http.matrixfederationclient - 164 - INFO -  - {GET-O-3986} [matrix.org] Completed: 200 OK
2020-05-12 22:04:53,810 - synapse.storage.database - 527 - WARNING -  - Starting db txn 'update_remote_device_list_cache' from sentinel context
2020-05-12 22:04:53,810 - synapse.storage.database - 566 - WARNING -  - Starting db connection from sentinel context: metrics will be lost
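For readers following along, the manual step described above (inserting a user_id into device_lists_remote_resync) can be sketched roughly as below. This is a minimal illustration, not Synapse's own code: it uses an in-memory SQLite database as a stand-in for the homeserver database, and the column names (user_id, added_ts) and the helper queue_device_list_resync are assumptions made for the example. Check the schema of your own Synapse version before running anything similar against a real database.

```python
import sqlite3
import time


def queue_device_list_resync(conn, user_id):
    """Hypothetical helper: flag a remote user's cached device list as stale.

    Assumes a table device_lists_remote_resync(user_id, added_ts); the column
    names are an assumption for this sketch.
    """
    now_ms = int(time.time() * 1000)  # timestamp in milliseconds
    conn.execute(
        "INSERT INTO device_lists_remote_resync (user_id, added_ts) VALUES (?, ?)",
        (user_id, now_ms),
    )
    conn.commit()


# Stand-in database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_lists_remote_resync ("
    "user_id TEXT NOT NULL, added_ts BIGINT NOT NULL)"
)

queue_device_list_resync(conn, "@alice:example.org")

# List the users currently waiting for a resync.
pending = [row[0] for row in conn.execute(
    "SELECT user_id FROM device_lists_remote_resync"
)]
print(pending)
```

As the thread goes on to explain, queuing a resync this way only helps once the homeserver actually stores the cross-signing keys returned by the resync.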

_Originally posted by @flackr in https://github.com/matrix-org/synapse/issues/7418#issuecomment-627725091_

bug

All 18 comments

@flackr also says:

I checked with one of my friends who runs a homeserver. My server does not exist in their device_lists_outbound_last_success table, but their server's users do exist in my device_lists_remote_extremeties, though their keys aren't on my homeserver.

So, it looks like t2bot.io is running into this (not a new homeserver, but it would meet the criteria of 'new' here). For background: only yesterday did t2bot.io stop dropping EDUs on the floor, which means it would have dropped all device list updates and such, so it is effectively starting from scratch.

I did hack in some support for a device list cache purge (https://github.com/t2bot/synapse/commit/41af03f58df5c35e9f2a462aa59440af5075d87b), though this doesn't appear to help when used.

For some people this is fine, like when I verified my own device a little while after turning support for EDUs back on. For most, this issue appears to come up.

I think I've managed to track this bug down, which is that we were not processing cross-signing keys when resyncing a device list. I've opened https://github.com/matrix-org/synapse/pull/7594 which should fix this.
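To illustrate what "processing cross-signing keys when resyncing" involves: per the Matrix federation spec, the response to GET /_matrix/federation/v1/user/devices/{userId} can carry master_key and self_signing_key fields next to the devices list. The sketch below is not the code from the PR; the function name split_device_list_response and the placeholder key strings are made up for the example. It only shows the shape of the data a resync has to handle.

```python
def split_device_list_response(response):
    """Separate the device list from any cross-signing keys in a response.

    Hypothetical helper for illustration; it does no signature checking.
    """
    devices = response.get("devices", [])
    cross_signing = {}
    for field in ("master_key", "self_signing_key"):
        key = response.get(field)
        if key is not None:  # both fields are optional in the response
            cross_signing[field] = key
    return devices, cross_signing


# Sample response body with placeholder key material.
sample = {
    "user_id": "@alice:example.org",
    "stream_id": 5,
    "devices": [{"device_id": "JLAFKJWSCS", "keys": {}}],
    "master_key": {
        "user_id": "@alice:example.org",
        "usage": ["master"],
        "keys": {"ed25519:placeholder+master+key": "placeholder+master+key"},
    },
    "self_signing_key": {
        "user_id": "@alice:example.org",
        "usage": ["self_signing"],
        "keys": {"ed25519:placeholder+ss+key": "placeholder+ss+key"},
    },
}

devices, cross_signing = split_device_list_response(sample)
print(len(devices), sorted(cross_signing))
```

A resync that keeps only the devices part and discards the rest would produce the symptom reported in this issue: the device list looks current, but master_keys and self_signing_keys stay empty.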

Very excited for this, thanks! So after this patch inserting into device_lists_remote_resync will sync the cross signing keys?

It should, yes. However, @turt2live told me that this patch might be malfunctioning; I'm going to investigate this today.

(the reason for Travis's issue was human error, and the patch seems to work for his HS)

(that human error was using the wrong database ftr. Terminal windows are hard to use.)

Thanks for fixing! I'm looking forward to being able to verify people after the next synapse update.

On a related note, I assume this only fixes the issue if you are the administrator of your synapse HS and can insert into device lists for remote resync, and it's a pretty confusing issue. Is there a plan for the HS to be able to automatically resolve missing keys?

I.e. if matrix.org hadn't picked up the cross signing keys for someone I wouldn't be able to fix it.

On a related note, I assume this only fixes the issue if you are the administrator of your synapse HS and can insert into device lists for remote resync

That's mostly true. If the remote server needs to resync for some reason (e.g. it missed an update) then it's going to save cross-signing keys. Though it doesn't mean it's going to catch up for every user it missed.

Is there a plan for the HS to be able to automatically resolve missing keys?

I'm afraid there is no realistic way that I know of for a server to figure out which users it's missing keys for, as it believes it has up-to-date device lists for every user (except where it knows it doesn't, in which case it retries them automatically). I don't really see how we could make Synapse fetch the missing keys without requesting the device list of every single user it knows about.
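The difficulty can be made concrete with a small sketch. It assumes simplified stand-ins for two tables, device_lists_remote_extremeties(user_id, stream_id) and e2e_cross_signing_keys(user_id, keytype, keydata, stream_id); the exact columns are assumptions for the example, and the user names are invented. Listing users with a cached device list but no stored master key is easy, but the result mixes users whose keys were missed with users who never set up cross-signing.

```python
import sqlite3

# In-memory stand-in for the homeserver database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device_lists_remote_extremeties (
    user_id TEXT NOT NULL, stream_id TEXT NOT NULL);
CREATE TABLE e2e_cross_signing_keys (
    user_id TEXT NOT NULL, keytype TEXT NOT NULL,
    keydata TEXT NOT NULL, stream_id BIGINT NOT NULL);
""")

conn.executemany(
    "INSERT INTO device_lists_remote_extremeties VALUES (?, ?)",
    [
        ("@has_keys:example.org", "10"),
        ("@keys_missed:example.org", "11"),   # has keys remotely, we missed them
        ("@never_set_up:example.org", "12"),  # has no cross-signing keys at all
    ],
)
conn.execute(
    "INSERT INTO e2e_cross_signing_keys VALUES (?, 'master', '{}', 1)",
    ("@has_keys:example.org",),
)

# Users with a cached device list but no stored master key.
candidates = [row[0] for row in conn.execute("""
    SELECT e.user_id
    FROM device_lists_remote_extremeties AS e
    WHERE NOT EXISTS (
        SELECT 1 FROM e2e_cross_signing_keys AS k
        WHERE k.user_id = e.user_id AND k.keytype = 'master'
    )
    ORDER BY e.user_id
""")]
print(candidates)
```

Both remaining users look identical from the local database, so the only way to tell them apart is to ask their homeserver, which is the "request the device list of every user" cost described above.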

I know this isn't an ideal answer. Hopefully we caught that bug early enough so that not too many people will be bitten by it.

I was thinking that the moment people send verification requests would be a great time to check the validity of current device lists / signatures. Perhaps the request could be signed, and if the HS can't verify the signature it tries to resync the device list?

I think this is a good time because in most cases this issue shows up when you try to verify someone on a homeserver that isn't aware of your keys yet, and if for some reason your keys / device lists get out of sync the likely action a user would take is to try to reverify that person who is no longer verified.

Any word on when 1.15 will be released (presumably with this fix)? 😇

RC1 got released today (https://github.com/matrix-org/synapse/releases/tag/v1.15.0rc1), so it should be out later this week or early next :)

I can confirm that resyncing the device lists of all of the users missing keys on 1.15 worked perfectly. Thanks again for fixing this!

Very glad to hear it!! :)

Just reacting to this now because I totally missed it then (sorry!):

I was thinking that the moment people send verification requests would be a great time to check the validity of current device lists / signatures

This would be a great idea, unfortunately it's not possible to act on that at the homeserver level given that in encrypted rooms verification requests are sent as m.room.encrypted. So the homeserver isn't able to differentiate a verification request from a simple message.
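A rough sketch of why the homeserver cannot tell the two apart: in an unencrypted room a verification request is an m.room.message event with msgtype m.key.verification.request, but in an encrypted room the server only sees an m.room.encrypted envelope. The event bodies below are trimmed placeholders and the function looks_like_verification_request is hypothetical, written only to show what a server-side check has to work with.

```python
def looks_like_verification_request(event):
    """Hypothetical server-side check using only fields the server can read."""
    content = event.get("content", {})
    return (
        event.get("type") == "m.room.message"
        and content.get("msgtype") == "m.key.verification.request"
    )


# Unencrypted room: the request is visible to the server.
plaintext_request = {
    "type": "m.room.message",
    "content": {"msgtype": "m.key.verification.request"},
}

# Encrypted room: a verification request and an ordinary chat message both
# arrive as the same kind of opaque envelope (ciphertext is a placeholder).
encrypted_request = {
    "type": "m.room.encrypted",
    "content": {"algorithm": "m.megolm.v1.aes-sha2", "ciphertext": "<opaque>"},
}
encrypted_chat = {
    "type": "m.room.encrypted",
    "content": {"algorithm": "m.megolm.v1.aes-sha2", "ciphertext": "<opaque>"},
}

results = [
    looks_like_verification_request(plaintext_request),
    looks_like_verification_request(encrypted_request),
    looks_like_verification_request(encrypted_chat),
]
print(results)
```

Only the plaintext event is recognised; the two encrypted events are indistinguishable to the server, so any trigger for a resync would have to come from the client.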

This would probably be a nice thing to have at the client level, though. It would require a client endpoint, so we'd need to spec it before we implement anything. I'm not saying it's not worth the trouble, because it definitely is, and iirc that feature has been requested for some time now (though I can't find the issue anymore); it's just not likely to land right away.

FYI, I've opened MSC2638 to fix the specific issue of not always being able to tell the homeserver to resync.

That's great, thanks for the update!
