Mastodon: Responding 401 Unauthorized to Relay hosts since upgrading to v3.2.1

Created on 20 Oct 2020 · 21 Comments · Source: tootsuite/mastodon

Expected behaviour

Posts from relays will work properly...

Actual behaviour

...since upgrading to v3.2.1, my Mastodon instance responds to all posts from relays with a 401 Unauthorized response. I wonder if this is related to the new HTTP Signatures changes somehow?

My logs are now full of errors like:

 Oct 20 14:59:39 mastodon-web mastodon[mastodon-mastodon-web-84f7857678-frz9n] [65c3bbb9-a60d-4275-a1a9-0eba4f5994a0] method=POST path=/inbox format=*/* controller=ActivityPub::InboxesController action=create status=401 duration=7.75 view=0.18 db=0.00 key=https://en.litepubrelay.resplendentwebservices.com/actor#main-key 

I've observed this with both the relays I was connected to (101010.pl and resplendentwebservices.com).
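
For admins who want to see which relay keys are being rejected, the 401 responses in the web logs can be tallied per signing key. The following is only a minimal sketch: it assumes log lines in the `key=value` format shown above, and the function name `rejected_keys` is made up for illustration.

```python
import re
from collections import Counter

# Matches the space-separated key=value pairs in a log line
PAIR = re.compile(r"(\w+)=(\S+)")


def rejected_keys(lines):
    """Count inbox deliveries answered with 401, grouped by signing key."""
    counts = Counter()
    for line in lines:
        fields = dict(PAIR.findall(line))
        if fields.get("path") == "/inbox" and fields.get("status") == "401":
            counts[fields.get("key", "(no key)")] += 1
    return counts
```

It accepts any iterable of lines, so an open log file or captured container output can be passed in directly.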

Steps to reproduce the problem

Have a working v3.2.0 install, using the tootsuite/mastodon Docker image
Configure federation relays (in my case, https://en.litepubrelay.resplendentwebservices.com/inbox and https://relay.101010.pl/inbox), have them working
Upgrade the Docker image to imageTag: v3.2.1
Watch sadly as your federated timeline dries up, and check the container logs for the above errors

Specifications

Environment: Kubernetes
Docker image: tootsuite/mastodon:v3.2.1
PostgreSQL: postgres:9.6.2
Redis: redis:6-alpine

Source

timwalls

Most helpful comment

Another pleroma maintainer stepped in and merged the pending MR (https://git.pleroma.social/pleroma/relay/-/merge_requests/23). Services running that software just have to update, they don't need to switch to another fork.

ThibG on 12 Nov 2020

🎉2 ❤1

All 21 comments

OK, so to answer my own question, the cause is this change: https://github.com/tootsuite/mastodon/pull/14556

This evidently breaks federation with the relay software on those servers.

Now that I've read the comments on that pull request, I see this is probably not entirely unexpected (I guess they're running on software too old to be loved). But was it really smart to break backward compatibility in a point release? That's not how semantic versioning normally works...

timwalls on 21 Oct 2020

👍1

As you suggest, the software hosting the relay itself is separate from Mastodon. The Pleroma Relay software I use to provide relays on a couple of my instances is rarely updated, and I recently had to switch to a different fork (incorporating this merge request) to get my relays to work properly with updated host headers. This predated Mastodon v3.2.1.

rodti on 21 Oct 2020

True enough. But I do think on a piece of software whose raison d'etre is federating, a release that _reduces_ the opportunities to federate - particularly with about the only two large relays in the Fediverse - probably deserves more than an "added support for latest HTTP signatures spec draft" footnote in a point release.

If it was more accurately labelled "Mandates latest HTTP signatures from a draft spec", one might say it doesn't belong outside a beta or bleeding edge release anyway...

(Being constructive - it should be feature flagged. Otherwise, anyone who wants to actually federate with anyone other than themselves is now prevented from installing future releases of Mastodon - this will not help ensure old versions of software are not being used in the Fediverse, it will encourage it.)

timwalls on 21 Oct 2020

👍1

There has never been a released version of the HTTP Signatures spec, it's always been a draft.

It's not really supporting the latest draft that breaks compatibility with the relay software, it's enforcing additional security properties. As it stands, the signatures from this relay software are not particularly useful: any person receiving a message from the relay can save the signature, and re-use it at any moment to impersonate the relay actor to any recipient and with very few restrictions on the signed payload.
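
To illustrate the replay concern, here is a minimal sketch. It is not code from Mastodon or the relay. It uses an HMAC with a shared secret as a stand-in for the RSA signatures used in practice, and the two coverage lists (`WEAK`, `STRONG`) are assumptions chosen for the demonstration, not the relay's actual header list.

```python
import base64
import hashlib
import hmac

SECRET = b"demo-shared-secret"  # hypothetical; real servers sign with RSA keys

WEAK = ["(request-target)", "date"]  # assumed weak coverage
STRONG = ["(request-target)", "host", "date", "digest"]


def digest_header(body: bytes) -> str:
    """Digest header value: base64 of the SHA-256 hash of the body."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode()


def signing_string(method, path, headers, covered):
    """Join the covered headers, in order, into the string that gets signed."""
    lines = []
    for name in covered:
        if name == "(request-target)":
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name}: {headers[name]}")
    return "\n".join(lines)


def sign(method, path, headers, covered):
    msg = signing_string(method, path, headers, covered).encode()
    return base64.b64encode(hmac.new(SECRET, msg, hashlib.sha256).digest()).decode()


def verify(method, path, headers, body, covered, signature):
    # A covered Digest must also match the body that was actually received
    if "digest" in covered and headers.get("digest") != digest_header(body):
        return False
    return hmac.compare_digest(sign(method, path, headers, covered), signature)


def replay_succeeds(covered):
    """Sign one request, then reuse the signature with a new body and host."""
    body = b'{"type": "Announce"}'
    headers = {
        "host": "a.example",
        "date": "Tue, 20 Oct 2020 14:59:39 GMT",
        "digest": digest_header(body),
    }
    signature = sign("POST", "/inbox", headers, covered)

    forged_body = b'{"type": "Undo"}'
    forged = dict(headers, host="b.example", digest=digest_header(forged_body))
    return verify("POST", "/inbox", forged, forged_body, covered, signature)
```

With `WEAK`, the signing string never mentions the host or the body digest, so `replay_succeeds(WEAK)` is true: the captured signature still verifies for a different payload sent to a different server. With `STRONG`, the same replay fails.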

ThibG on 21 Oct 2020

👍3

I don't doubt it. And "restrict federation from insecure relays" could even be a super nice thing to add in the release notes, because I imagine most people would at least not waste 2 days of their life working out why a point release made Mastodon essentially useless.

But anyway, it's not my project, so you do what feels right.

timwalls on 21 Oct 2020

👍1

Yeah, the release notes should probably have mentioned that. We just forgot about that issue when making the release.

ThibG on 21 Oct 2020

Yeah, this is actually very surprising for people writing software interacting with Mastodon.

For example I'm working on a client that does not sign the Host header (as the same activity is already being sent to multiple servers anyway) and this change silently broke the integration.

Is the requirement of having the Host header signed documented anywhere? I did check out https://tools.ietf.org/html/draft-cavage-http-signatures-12#section-2 but it didn't mention it, or I'm looking in the wrong place.

wiktor-k on 22 Oct 2020

It is not documented in the HTTP Signature draft (although the draft does say applications may require additional headers to be covered).

I'm afraid it's only documented in the PR, and in the body returned by Mastodon when it rejects a signature.
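
Anyone writing software that delivers to Mastodon could check their own `Signature` header before sending. The following is a minimal sketch, assuming the required set for POST requests is Host, Date and Digest as described in this thread; the authoritative rules are in the PR and in Mastodon's source. The names `REQUIRED_FOR_POST` and `missing_headers` are hypothetical.

```python
import re

# Assumption taken from this thread, not from Mastodon's source code
REQUIRED_FOR_POST = {"host", "date", "digest"}


def parse_signature_header(value: str) -> dict:
    """Parse the key="value" parameters of a Signature header."""
    return dict(re.findall(r'(\w+)="([^"]*)"', value))


def missing_headers(signature_header: str, required=REQUIRED_FOR_POST) -> set:
    """Return the required header names that the signature does not cover."""
    params = parse_signature_header(signature_header)
    # Simplification: the draft defines a default for an absent `headers`
    # parameter; this sketch treats it as covering none of the required names.
    covered = set(params.get("headers", "").lower().split())
    return set(required) - covered
```

An empty result means the signature covers every header in the assumed set; anything else lists what the sender would need to add.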

ThibG on 22 Oct 2020

Thanks for taking the time to explain it @ThibG! The rejection body was really helpful to pinpoint why the object was not processed.

Is there a specific security issue that you see in accepting non-Host-signed activities? As far as I understand the activity object is self-contained already (all links are absolute). (Not sure if this should be discussed in a new issue too... :shrug: ).

wiktor-k on 22 Oct 2020

The worry about not signing Host was actually about performing signed fetches on a different server than it was intended to.

But that only applies to GET requests and it seems difficult to exploit as far as Mastodon is concerned (since most fetch targets are identified by username and snowflake ids), so maybe that particular requirement could be relaxed.

EDIT: the fact that the fetch targets are identified by username and snowflake ids is irrelevant, as one could craft a URI and trick software into fetching it. It could also be an issue with POST if you want only some servers to see an a priori server-independent Activity, but that's a rather minor concern.

ThibG on 22 Oct 2020

👍1

"It is not documented in the HTTP Signature draft (although the draft does say applications may require additional headers to be covered). I'm afraid it's only documented in the PR, and in the body returned by Mastodon when it rejects a signature."

Documenting it in the body returned by Mastodon might be helpful for the person who runs the server at the other end (if they don't just at best ignore it, and at worst go "huh, that server returns errors, I'll blacklist it") - but it's no use for the people running the Mastodon servers affected. Even with debug logging enabled, there's zero feedback for them. It would be nice to at least log this somewhere a Mastodon admin can take note and take action.

timwalls on 22 Oct 2020

Hoping that there can be a resolution to this in an upcoming release. My instance is running on masto.host so I can't roll back to 3.2.0

MarkEEaton on 5 Nov 2020

@MarkEEaton the proper resolution would be for the relay software to be fixed (but it seems unmaintained) or the relay instances moving to a maintained fork.

ThibG on 6 Nov 2020

👍1

Pleroma relay ticket tracking this issue: https://git.pleroma.social/pleroma/relay/-/issues/10

The original PR (#14556) mentions:

"I have encountered no implementation not signing Host, Date, or Digest for POST requests. Therefore, this PR has the following, stricter requirements, that don't seem to impact backward compatibility with reasonably old fediverse software:"

Clearly, pleroma relay is one such implementation. Expecting other software to implement a provisional IETF draft - which may change in the future - appears to be unreasonable (to be fair, I am unaware of the level of consensus within httpbis on said draft). One could argue the merits of implementing an IETF draft and not making the feature opt in/opt out. Considering the status of the IETF draft (it expired a month ago, even though it is adopted by the WG), such a configuration flag would be warranted.

fvdnabee on 12 Nov 2020

👍1

Yes, this relay software is such software, and I missed it.

Again, it does not have to do with conforming to a new draft version (by the way, the previous implementation was based on another draft version, there is no definitive spec about how to do authentication in ActivityPub), but with not accepting signatures that enable attackers to usurp someone's identity by replaying signatures with swapped-out payloads and recipients.

ThibG on 12 Nov 2020

No one argues that the latter should be considered wanted behaviour. What is argued is whether protection against such replay attacks should come at the expense of interfacing with popular relays, which appears to be the case in 3.2.1.

One suggestion is to make this cost configurable for mastodon admins (e.g. by a feature flag). It is not clear to me whether you are in favour of such a change @ThibG? Clearly some folks in this ticket are.

Is there any documentation on the threat of replay attacks in the context of ActivityPub? As a mastodon admin, I currently notice the downsides of signature verification for my instance (no relaying); but maybe the upsides of signature verification are less noticeable to me as an admin? To me this appears to be a case of "Don't throw the baby out with the bathwater", but more likely I am ill-informed on the matter.

fvdnabee on 12 Nov 2020

👍1

We can't keep insecure behavior forever to remain compatible with unmaintained software (it hasn't seen any activity in months, a PR fixing this issue was opened over 2 months ago, the maintainer has been notified, and we haven't heard back from her), even if it is popular.

The insecure signatures mean that anyone getting their hands on them can send anyone any ActivityPub payload (such as follow request, messages, etc.) impersonating the account who made the signature.

In a relay's case, the harm this could cause is somewhat lowered as a relay is only expected to accept follow requests, reject them, or well, relay stuff. So practical attacks would be forcefully unsubscribing someone else from the relay, or relaying spam on behalf of the relay (without the actual relay's involvement).

However, lowering the security for relays without lowering the security for everything else would require special-casing a significant amount of stuff in the handling of ActivityPub messages. This is definitely possible, but it would make the code more error-prone and more difficult to maintain and understand; I would much prefer the insecure software to be patched.

ThibG on 12 Nov 2020

The other problem with the current approach is you're not forcing insecure software to be patched, you're forcing insecure software to stay in the wild. I at least had to downgrade to 3.2.0 because 3.2.1 is useless to me if it doesn't actually federate.

At the very least, adequate warning of a breaking change should be given, and people given suitable information and time to prepare.

I don't really have a dog in this fight because if necessary I'll just fork and patch around this, but a lot won't bother. If your long term aim is to have everyone using the latest software this take-it-or-leave-it approach is counterproductive.

(Of course if the ActivityPub specification did more than shrug and say "you should authenticate but we won't say how" this problem wouldn't arise...)

timwalls on 12 Nov 2020

"you're forcing insecure software to stay in the wild."

I am sincerely surprised you decided to put the blame of an unmaintained, unpatched service on Thib and not the person responsible for said unmaintained, unpatched service, who manifestly bears little to no interest in maintaining their service anymore.

Kleidukos on 12 Nov 2020

I am seemingly still having this issue, even after running my own relay

I eventually found a relay running Pleroma's relay software that worked (https://relay.libranet.de/inbo), which had me questioning why only some relays using that software worked. So I set up my own, and lo and behold, I'm stuck at "Waiting for relay's approval" on my own relay.

The systemd unit logs the following:

 Dec 15 23:41:46 nyc python3[2789964]: [2020-12-15 23:41:46,866] INFO: Signature validation failed for: 'https://zee.li/actor'

https://zee.li/actor shows manuallyApprovesFollowers:true for some reason?

Sidekiq returns this error when trying to use the relay:

 Mastodon::UnexpectedResponseError: https://relay.zee.li/inbox returned code 401