Mastodon: OTR for direct messages

Created on 6 Apr 2017 · 17 Comments · Source: tootsuite/mastodon

It'd be nice to have OTR messaging for Direct Messages. It could be enabled opportunistically when available (i.e., when both instances support it) and disabled otherwise.


  • [x] I searched or browsed the repo's other issues to ensure this is not a duplicate.
security suggestion


All 17 comments

@wxcafe do you have a link to an OTR library that might work within Ruby?

There seems to be a Ruby FFI one here https://github.com/sophsec/ffi-otr, but I don't know what to think of it. Maybe using something like OMEMO would be better, since it allows multiple receiving endpoints? IDK. This warrants a longer discussion, and it's clearly not a priority right now, but it'd be a nice improvement in the future IMO.

Adding OTR to Ruby would gain nothing. If Ruby (server) can read your message, it's not OTR (Off-the-Record). OTR must be implemented on the client-side. Perhaps there could be a client that does that over DMs. But web UI can't do this because running crypto in JavaScript is not much better than letting the server do the crypto.

That's a good point.
I hate myself for saying this but maybe we can have a good JS implementation of OTR/OMEMO? I don't know. Maybe once the backend and the frontend are more decoupled there could be other web frontends that implement it too yeah. I think it'd be cool if it could be enabled in the default 'distribution' though

I'm currently working on an e2e solution that's decoupled from any reliance on trusted services ('TootCrypt'). The prototype works in principle (sends encrypted toots with ECDH/AES-GCM); most of the work now is implementing key sharing/identity assurance that's distributed and secure. I'll update when I have something more concrete to show.
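As a rough illustration of the ECDH half of an ECDH/AES-GCM scheme, here is a minimal sketch. It is not TootCrypt's code: the curve (X25519, following the RFC 7748 ladder), the HKDF step, and the `info` label are all assumptions made for the example, and the pure-Python arithmetic is not constant-time, so it is only suitable for reading, not for real use. The derived key would then be handed to AES-GCM from a vetted library, which the standard library does not provide.

```python
import hashlib
import hmac
import os

P = 2**255 - 19                       # field prime for Curve25519
A24 = 121665                          # (486662 - 2) / 4, per RFC 7748
BASE_POINT = (9).to_bytes(32, "little")


def x25519(scalar: bytes, u: bytes) -> bytes:
    """Montgomery ladder from RFC 7748 (illustrative, NOT constant-time)."""
    k = bytearray(scalar)
    k[0] &= 248                       # clamp the scalar as the RFC requires
    k[31] &= 127
    k[31] |= 64
    k_int = int.from_bytes(k, "little")
    x1 = int.from_bytes(u, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        swap ^= bit
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


def hkdf_sha256(secret: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with an all-zero salt, used to turn the raw
    shared secret into a symmetric key of `length` bytes."""
    prk = hmac.new(b"\x00" * 32, secret, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


if __name__ == "__main__":
    alice_private, bob_private = os.urandom(32), os.urandom(32)
    alice_public = x25519(alice_private, BASE_POINT)
    bob_public = x25519(bob_private, BASE_POINT)
    # Each side combines its own private key with the other's public key.
    alice_key = hkdf_sha256(x25519(alice_private, bob_public), b"dm-example")
    bob_key = hkdf_sha256(x25519(bob_private, alice_public), b"dm-example")
    print(alice_key == bob_key)
```

Note that nothing here tells Alice that `bob_public` really belongs to Bob; that is the key sharing and identity assurance problem the comment describes as most of the remaining work.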

At some point there are some enhancements that I think will be relevant to improve the UX for e2e users, particularly a separate field in a profile for public keys to reside and be acquired from the API, but I'll open issues about them at the appropriate time/place.

I'm working on a client, and I'm definitely interested in any kind of crypto standard for DMs.

@DagAgren @Rushyo is the person who would know most about this. As far as I understand all you need from me is some kinda free-form no-validation field to store the public key in an arbitrary app-mandated blob format. I feel iffy about the idea of not validating the format at all (how would different apps ever agree on the same thing then..?) but it's possible

I'm plonking this here for now, but I'll probably move it over to a more appropriate bug in future. I'll also tackle the OTR question in another comment.

TL;DR: Key sharing is part of the crypto problem, so different crypto software has different key formats, and the interchange formats that exist are hopelessly overwrought. Mastodon isn't crypto software, so it shouldn't need to understand these formats. If it did, it would have to keep up to date with every format it subscribed to, which change regularly and rapidly, and it would break consumers badly if it ever falsely claimed a key was invalid. I recommend we keep Mastodon entirely out of the business of other software's crypto, and just use a format like this:

[ { "saltpack":"asaltpackhere" }, {"tootcryptmanifest":"atootcryptmanifest" } ]

It all depends on the needs of the crypto program. The simplest case is including some public keys and then leaving everything to the consumer. Even using the same crypto, those formats are fluid and subject to change. The established standards (x509, PKCS#X) are messy and fairly arbitrary. Long-standing formats often undergo 'updates' by new consumers, for example Keybase's 'salt' format, and protocols regularly need updating to new standards, or to handle the metadata needs of new consumers.

Salmon's "magic key" format, as a case study (since you mentioned it), is highly specific, only supporting RSA. It doesn't mandate any key lengths, so you could get a 1024-bit RSA key (insecure), or an 8192-bit key (obscure as h*ck). The format defines strict endianness (which can't be validated anyway), but for most metadata that a heavyweight crypto consumer would need, you need to further encode that format in a DER x509 certificate. I'm not going to go into the problems with x509 certificates, but suffice it to say: TootCrypt will not be touching them with a barge pole.
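To make the key-length point concrete, here is a small sketch of what a consumer has to do for itself. It assumes the `RSA.<modulus>.<exponent>` layout with base64url-encoded big-endian integers described in the Magic Signatures draft; the function names and the 2048-bit floor are hypothetical choices for this example, not anything the format mandates.

```python
import base64
from typing import Tuple


def _b64url_to_int(text: str) -> int:
    """Decode a base64url field (padding optional) as a big-endian integer."""
    padded = text + "=" * (-len(text) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


def parse_magic_key(magic_key: str) -> Tuple[int, int]:
    """Split an assumed 'RSA.<modulus>.<exponent>' string into (n, e)."""
    algorithm, modulus_b64, exponent_b64 = magic_key.split(".")
    if algorithm != "RSA":
        raise ValueError("magic keys only support RSA")
    return _b64url_to_int(modulus_b64), _b64url_to_int(exponent_b64)


def modulus_bits(magic_key: str) -> int:
    """Return the RSA modulus size in bits."""
    modulus, _exponent = parse_magic_key(magic_key)
    return modulus.bit_length()


def has_acceptable_length(magic_key: str, min_bits: int = 2048) -> bool:
    """Policy check the format leaves to the consumer (floor is hypothetical)."""
    return modulus_bits(magic_key) >= min_bits
```

The format itself will happily carry a 1024-bit modulus; any minimum has to be a policy decision in the consuming application.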

Another issue is that the public key itself is often not what's wanted, especially in terse formats that include minimal metadata. Different cryptographic implementations use different things. TLS is a mess of x509 certificate chains, OCSP, CRLs, and all manner of other mechanisms, which most consumers don't implement even close to properly. TootCrypt uses an XML keychain format, and that keychain can include revocation information, identity information, trust signage, etc. It then cross-verifies the sources of that information, rather than using, say, WoT, in such a way that more information is ALWAYS better, even if a consumer ignores half of it. The keys themselves use a format that carries really well over social media formats, focused on optimising for character space, but is of bugger all use to other cryptographic software, and validating them means having an ECDH parser to test the keys for correctness (and understanding all present and future curves that might be used). Keybase saltpack is far more complex than a simple OpenPGP key, and is intentionally tied to NaCl (aka Salt), which is a... fledgling ecosystem at best. OpenPGP is generally kinda easy to consume, but has lots of issues, even around the public key format.

There are also generally two parts to the e2e problem: the act of encryption, and the key sharing. Protocols like Signal completely hand-wave much of the key sharing problem: "If authentication [over a completely separate channel] is not performed, the parties receive no cryptographic guarantee as to who they are communicating with." (let's not even touch on OTR, which doesn't support asynchronous communication anyway). PGP uses Web o' Trust. It's messy, arbitrary, and not the duty of the medium to enforce. That's why PGP gets sent around in email sigs, stuck in bios, QR coded in profile pics, and otherwise chucked around. TootCrypt is complex on a fancy de-centralised key signing model, kinda simple on actual encryption. It uses XML for the keychain/manifest because signing manifests is a difficult problem and signing XML manifests is basically a solved problem; however XML is like kryptonite to many other cryptographic tools with different concerns (like running on platforms with hardly any memory).

I'm not saying this to write off the crypto as complicated. I'm pointing out that these are cryptographic decisions, made by the consuming applications, arbitrary and subject to change and improvement all the time, with huge variety as it currently stands. Trying to verify their formats is unnecessary and undesirable. It's outside the scope of the transporting medium.

I re-iterate my earlier position that it's up to the consuming application to understand what it is consuming, as part of their protocol. Mastodon cannot hope to meaningfully validate the sort of keys it might be used to transport without becoming a cryptographic package itself. It will invariably lag behind actual cryptographic consumers, who it will then break compatibility with. What a cryptographic consumer needs to know is which format of blob it can expect. For this, an array of non-unique key:value pairs would suffice:

[ { "saltpack":"asaltpackhere" }, {"tootcryptmanifest":"atootcryptmanifest" } ]

These protocols all invariably include version information anyway, so that would be redundant, although it could easily be encoded in the key string (e.g. "saltpack-2.0"). You could embellish it a bit, like including nested arrays of keys/manifests, but any more and you're almost certainly tying it to cryptographic implementations.
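From the consumer's side, reading such a field could look like the sketch below. The function names are hypothetical, and it assumes format names contain no hyphen other than the one separating an optional version suffix (as in "saltpack-2.0"). The blobs are passed through untouched; interpreting them remains the job of whichever crypto software asked for them.

```python
import json
from typing import List, Tuple


def split_format(name: str) -> Tuple[str, str]:
    """Split 'saltpack-2.0' into ('saltpack', '2.0'); version may be ''."""
    base, _sep, version = name.partition("-")
    return base, version


def blobs_for(field_json: str, wanted: str) -> List[Tuple[str, str]]:
    """Return (version, blob) for every entry whose format name matches.

    Names are non-unique, so several blobs of the same format may be
    present; unknown formats are skipped rather than rejected.
    """
    matches = []
    for entry in json.loads(field_json):
        for name, blob in entry.items():
            base, version = split_format(name)
            if base == wanted:
                matches.append((version, blob))
    return matches


if __name__ == "__main__":
    field = '[ { "saltpack":"asaltpackhere" }, {"tootcryptmanifest":"atootcryptmanifest" } ]'
    print(blobs_for(field, "tootcryptmanifest"))
```

The server never appears in this code path except as the place the string came from, which is the point: it stores and returns the field without needing to know what any blob means.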

This alone would be obscenely useful for most cryptographic consumers if consumable via API. It really doesn't need to go in to any of the other complexities of key sharing or standard formats to offer tons of value.

@DagAgren TootCrypt includes a protocol for tunnelling secure messages over social media to multiple recipients (largely frozen at this stage). It is focused around ECDH/AES-GCM (although it's pretty trivial to pivot to other cryptographic primitives) and isn't particularly exciting or difficult to implement in any language. It offers some guarantees but not others: it lacks FS, which is left to the key sharing to provide via ephemeral or temporal keys, and it leaks metadata like a sieve, because social networks need that metadata to know how to transport the message. There is also a separate protocol for key sharing exclusively over de-centralised sources, which is much more experimental (incorporating novel solutions to novel problems). I'm not sure how much will be of use to you, if any, but the intent is to make them open formats.

Right, so, regarding OTR:

OTR is a synchronous protocol. Direct messages are an asynchronous medium. OTR doesn't support asynchronous messaging, so it doesn't work here.

However, Signal (the next iteration of OTR, "v3.0") does. So let's move the discussion over to Signal.

Signal has been rejected as a candidate format for TootCrypt for a number of reasons beyond those affecting Mastodon itself. There are two extremely compelling reasons for keeping it away from Mastodon in principle, though, that are shared by federated environments in general.

The first is that it isn't designed for this purpose. In spite of supporting the bare minimum to work across this format, Signal's own authors admit that it is basically hostile in principle to a federated environment, and is designed around being a living platform that makes for a poor interoperable standard:

One of the controversial things we did with Signal early on was to build it as an unfederated service. Nothing about any of the protocols we've developed requires centralization; it's entirely possible to build a federated Signal Protocol based messenger, but I no longer believe that it is possible to build a competitive federated messenger at all.

-Moxie, https://whispersystems.org/blog/the-ecosystem-is-moving/

This comes off in every aspect of the protocol. It's not extensible, it's not well-standardised, and it tries to control all elements of the messaging through crypto in a way that limits expansion of the implementing software.

I'm not bringing this up as a value judgement of Signal for what it was designed for, but to point out that it is explicitly developed with a design philosophy that is opposed to this use case.

There's other major issues with it too for use on social media, but the second one that springs immediately to mind which I brought up above is:

It handwaves the crap out of the key sharing element.

If authentication [over a completely separate channel] is not performed, the parties receive no cryptographic guarantee as to who they are communicating with.

So the ultra important job of ensuring the identity of who you're sending to is who you expect, and not some MITM attacker, falls to.... the software, the instance, the primary threat actor in the federated model. This isn't only unacceptable and needs to be worked around, it's actively bad in a federated environment and creates loads of places in the model where it could break down, making any solution incredibly hard to achieve. Harder than creating a whole new channel, or at least a whole new protocol.
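For illustration, the out-of-band check that the quote refers to can be as simple as comparing key fingerprints over a channel the instance does not control (in person, by phone, on paper). The sketch below is a generic SHA-256 fingerprint comparison with hypothetical function names; it is not Signal's safety-number algorithm and not TootCrypt's mechanism.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> str:
    """SHA-256 of the key bytes as 16 groups of 4 hex digits, easy to read aloud."""
    digest = hashlib.sha256(public_key).hexdigest()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


def _normalise(text: str) -> str:
    """Ignore case and spacing differences introduced by manual transcription."""
    return "".join(text.lower().split())


def matches(public_key: bytes, expected_fingerprint: str) -> bool:
    """Compare the key fetched from the network against a fingerprint
    the user obtained over a separate channel."""
    actual = _normalise(fingerprint(public_key)).encode()
    expected = _normalise(expected_fingerprint).encode()
    return hmac.compare_digest(actual, expected)
```

If the only place a user can obtain `expected_fingerprint` is the same instance that served the key, the comparison proves nothing, which is the failure mode described above.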

This is not even touching on other issues with this proposal, already touched on, like including e2e in the same sandbox as the thing it's supposed to be protecting against, which is why TootCrypt is entirely separate from Mastodon. It's not really 'end-to-end' if the end and the middle are inter-dependent. TootCrypt, like any disconnected e2e client, simply uses the fediverse as a tunnel for messaging. This has the principal advantage of being able to treat the entire fediverse ecosystem as untrusted, from API to OStatus, which is a basic security goal of end-to-end. It massively reduces the bulk of the threat model to largely the protocols and software the end-users have placed their explicit trust in: the e2e software; the very trust model that makes end-to-end so promising as a means of protecting yourself.

This is why the discussion thus far has been largely about what Mastodon can do to facilitate e2e clients, rather than integrate one of its own. This lets the e2e clients gain usability benefits from interoperability (such as using Mastodon as an untrusted key sharing platform, or having machine-readable messages, hypothetically), without any integration that might compromise their security model. In essence, the best of both worlds.

Thanks for that amazing read! Any progress since then on your efforts, @Rushyo, or a repo to check out and contribute to? Sounds to me like the implementation is along the lines of an extension. A potential advantage of a project utilising standards (or in this case a current W3C recommendation towards that) via ActivityPub.

EDIT: With Eugen's roadmap of LDAP it feels, to me at least, that E2E is the obvious next quandary.

EDIT 2: Is it possible WebRTC might be key to this? Check out WebRTC's RTCPeerConnection and RTCDatachannel which do support E2E encryption, key exchange, and identity validation among other pretty snazzy stuff.

I've also just been passed the following links from the Internet Engineering Task Force (IETF) regarding Messaging Layer Security (MLS). It even provides you with some guidance regarding FEDERATION!

MLS Draft Architecture and MLS Draft Protocol

As we have seen another spike in interest in Mastodon, the issue of private messaging across instance lines has again become highly relevant, particularly since it represents a feature that users see as something that centralized services provide more readily (that is, the feature of being able to deterministically know which users are able to view one's content), albeit simply by virtue of lacking such server boundaries.

MLS indeed looks like the pipe dream, and looks like it has been recently assigned to a working group with new drafts and one reference implementation so far. We may be up to a year out from the protocol stabilizing, but it's promising, and I hope that this issue receives more attention in the coming months.

it represents a feature that users see as something that centralized services provide more readily

This is due to misinformation, because centralized services have the exact same issue with even less transparency (for example, do you know which of the Twitter employees are authorized to view DMs? do you know which spying agencies have access?).

This comment should not be mistaken as an argument against the implementation of cryptographically secure DMs (although certain arguments do exist re: reporting abusive DMs).

I'm well aware, as too I believe are most users of both Mastodon and Twitter, that site admins have unrestricted access to posts that aren't encrypted client-side. This is not what I mean when talking about the difference between "private" posts on centralized and decentralized services as they stand now: there is still a difference between what kind of privacy guarantees the service can give with regards to _unprivileged_ users (which in the case of a given Mastodon instance, includes admins of remote instances, for example). The most common abuse vectors are those between people one or two degrees of separation away from the victim, so it doesn't help to paint all forms of privacy violation with the same brush.

That said, the endeavor for end-to-end encryption is obviously the solution to both kinds of privacy violations. I'm not here to split hairs, I'm just here to echo a widespread interest in this feature.

The most common abuse vectors are those between people one or two degrees of separation away from the victim, so it doesn't help to paint all forms of privacy violation with the same brush.

Thank you for adding this mental model @bugQ as this factor is a primary concern to me.

If OTR isn't for tomorrow, what about instance-level encryption? https://github.com/tootsuite/mastodon/issues/9004

Why do you intend to use OTR/OMEMO and the ECDH algorithm?

  • according to NIST, this algorithm has been outdated since 2016; please use the McEliece algorithm (#13895)
  • OTR and Double Ratchet is a schematic protocol and you find the key ribbling off very easy
  • Key Streams or several keys or keys based on zero knowledge are much better (see Transformation of Cryptography (ISBN-13: 978-3749450749))
  • a crypto chat server for McEliece would be SmokeStack Chat Server. This can be addressed also over Python Crypto Web Programming into Mastodon.