Earlier today, Krita made an account on mastodon.art, and I followed @[email protected] from mastodon.social. The account followed me back. But later in my home column, I see a boost from @[email protected], and attempting to load it in the webapp seems to be treating it as a different account that is neither following me nor am I following it. However, loading the profile URLs mastodon.art/@krita or mastodon.art/@Krita both resolve to the same account.
[Screenshots: krita in webapp; Krita in webapp; krita in profile; Krita in profile]

They also appear as two different accounts in search.

@angristan reported the same from mstdn.io. This should not be possible at all: the DB constraint on the username column is case-insensitive, so duplicates should not be allowed.
I wonder if these accounts have different activitypub URIs. I don't think those are case insensitive. But they are also canonical and never expected to change. Was the username of the krita account changed after creation, resulting in a different URI?
I'm not sure that they should have different URIs. I don't think it was changed -- what I'm most sure of is that the following occurred:
1) Krita makes an account at mastodon.art/@Krita. They send this toot which shows up as such (lowercase): https://mastodon.art/@krita/99710831343843952
2) Someone on the federated timeline posts a toot to the effect of "oh hey, @krita is on mastodon now", with the mention being in lowercase.
3) I, and other users, open that link to @krita, which resolves fine to @Krita in the server response from mastodon.art but displays as @krita locally due to the way the mention was first typed.
4) Eventually, someone mentions or replies to a toot, which gets autopopulated as @Krita. (I'm not 100% on this.)
5) People now follow the link to @Krita, which for some reason registers locally as an entirely different account, despite resolving the exact same content on the remote server.
Alternatively,
1) Krita makes an account at mastodon.art/@krita. They toot normally and receive mentions in the federated timeline.
2) I, and other users, follow @krita@mastodon.art and engage with their toots.
3) The admin of mastodon.art, in between 2 and 3 of the previous list, changes @krita to @Krita in the mastodon.art database, but profile URLs are case-insensitive.
4) The above list continues as before, from point 3 above.
In short, the problem seems to lie exclusively with the webapp assuming that the canonical URL is case-sensitive, while the server makes no such assumption. But I'm not entirely sure how username validation occurs at signup. As per other conversations (https://mastodon.xyz/@garrett/99712887351612071), case sensitivity may already have been causing other silent issues for users.
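A minimal sketch of the suspected behaviour, not Mastodon's actual code: a toy local cache of remote accounts keyed by username and domain. The `AccountStore` class, its methods, and the two `/users/...` URIs are hypothetical names made up for illustration. With a case-sensitive key the same remote actor ends up as two local records; with a case-folded key it stays one.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RemoteAccount:
    username: str
    domain: str
    uri: str


class AccountStore:
    """Toy local cache of remote accounts (hypothetical, for illustration)."""

    def __init__(self, case_sensitive: bool) -> None:
        self.case_sensitive = case_sensitive
        self.records: dict[tuple[str, str], RemoteAccount] = {}

    def _key(self, username: str, domain: str) -> tuple[str, str]:
        # Domains are always folded; usernames only when case-insensitive.
        if self.case_sensitive:
            return (username, domain.lower())
        return (username.lower(), domain.lower())

    def find_or_create(self, username: str, domain: str, uri: str) -> RemoteAccount:
        key = self._key(username, domain)
        if key not in self.records:
            self.records[key] = RemoteAccount(username, domain, uri)
        return self.records[key]


def count_after_mentions(case_sensitive: bool) -> int:
    store = AccountStore(case_sensitive)
    # Same remote actor, seen first via a lowercase mention, then as "@Krita".
    # Both URIs are assumed values, not taken from mastodon.art.
    store.find_or_create("krita", "mastodon.art", "https://mastodon.art/users/krita")
    store.find_or_create("Krita", "mastodon.art", "https://mastodon.art/users/Krita")
    return len(store.records)


print(count_after_mentions(case_sensitive=True))   # 2
print(count_after_mentions(case_sensitive=False))  # 1
```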
One other thing to note: the JSON from https://mastodon.art/api/v1/statuses/99711824048258946 or https://mastodon.art/api/v1/statuses/99710831343843952, taken from two different statuses on mastodon.social, both resolve to user 55816 on mastodon.art.
I mean, Shane has confirmed that he changed the username casing in the database, so it's exactly what I described.
There is a tangentially related issue: #6667
Basically, the way there's 2 unique identifiers, username+domain and URI, means we get weirdness when either of them changes. In the ActivityPub spec, the URI is immutable... however, in Mastodon the WebFinger response has always been more authoritative (historically, too, for example it allowed accounts to upgrade from OStatus to ActivityPub in the first place)
I believe that maybe the code should be fixed in such a way that the URI is allowed to change for the same username+domain rather than creating duplicates.
However, not sure how to get rid of existing duplicates :thinking: Either as a migration (maintenance rake task) or as live code that checks for and removes duplicates in the ProcessAccountService or something...
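To make the "allow the URI to change" idea concrete, here is a rough sketch under stated assumptions. It is not the real ProcessAccountService: `process_account`, the in-memory `accounts` dict, and the example URIs are all hypothetical. The point is only that the lookup key is the case-folded username+domain, and an incoming profile with a new URI updates the existing record instead of inserting a second one.

```python
from __future__ import annotations

# Hypothetical in-memory stand-in for the accounts table.
accounts: dict[tuple[str, str], dict] = {}


def process_account(username: str, domain: str, uri: str) -> dict:
    """Create the account, or update its URI and display casing in place."""
    key = (username.casefold(), domain.casefold())
    existing = accounts.get(key)
    if existing is None:
        accounts[key] = {"username": username, "domain": domain, "uri": uri}
    else:
        # Same username+domain: treat a changed URI as an update, not a new row.
        existing["username"] = username
        existing["uri"] = uri
    return accounts[key]


first = process_account("krita", "mastodon.art", "https://mastodon.art/users/krita")
second = process_account("Krita", "mastodon.art", "https://mastodon.art/users/Krita")
print(len(accounts), second["uri"])
```

Whether silently accepting a changed URI is safe is exactly the open question in this thread; the sketch shows the mechanics only.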
cc @ThibG @akihikodaki @unarist
In the ActivityPub spec, the URI is immutable
In that case, the URI shouldn't be based on anything that isn't also immutable, correct? If I was designing an ActivityPub Server implementation from scratch, I would say that using UUIDs is more robust than a username -- and, by extension, would that not potentially allow usernames to be changed as well? For example, the URI of the Krita account would be something like mastodon.art/users/55816 instead of mastodon.art/users/krita. Since existing statuses from e.g. Pleroma (per my comment on #6839) already link to /users/username instead of /users/uuid, any fix will most likely break existing mentions in such software -- but the sooner the logic is refactored, the fewer future headaches it will cause.
I'm all for keeping instance.tld/@name as a pretty URL, but the API response probably shouldn't depend so heavily on this.
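A small sketch of that split, assuming a hypothetical `Account` model (this is not how Mastodon builds its URIs today): the ActivityPub URI is derived from the immutable id, while the pretty URL is derived from the username. Changing the username casing then leaves the URI untouched.

```python
from dataclasses import dataclass


@dataclass
class Account:
    id: int
    username: str
    domain: str

    @property
    def actor_uri(self) -> str:
        # Built only from the immutable id (assumed URL scheme).
        return f"https://{self.domain}/users/{self.id}"

    @property
    def pretty_url(self) -> str:
        # Human-facing URL; allowed to change with the username.
        return f"https://{self.domain}/@{self.username}"


account = Account(id=55816, username="krita", domain="mastodon.art")
uri_before = account.actor_uri
account.username = "Krita"  # e.g. a manual edit in the database
uri_after = account.actor_uri

print(uri_before == uri_after)  # True
print(account.pretty_url)       # https://mastodon.art/@Krita
```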
If I was designing an ActivityPub Server implementation from scratch
Yes if I was designing it from scratch now, I would also do it differently. But this is what we have and updating the way this works is not workable as far as I understand.
In that case, the URI shouldn't be based on anything that isn't also immutable, correct?
Updating usernames is not allowed. So they are immutable. In our case it was because of a manual edit in the database.
It would definitely break backward compatibility -- perhaps an API change or a major version update (3.0.0?), if any work is to be done on it in the future. But not totally unworkable... it would just be a very careful and heavy decision, not something to be done lightly. It almost certainly would break older mentions pre-update if a username changes, since mentions would try to resolve to a no-longer-existing name instead of a UUID.
Still, though, even though users don't have a way to change username, admins can still change the database as is the case here. It may be bad practice, but it is still a possibility. To be fair, an admin could hypothetically also change the user ID as well, but this would be less likely and even less desirable, though perhaps that's more an issue with using sequential numbers as a UUID.
As far as what can be done with minimal breakage right now, it seems to me that the best/easiest approach is a maintenance task to deduplicate existing case-differing usernames, and making usernames case-insensitive in the webapp going forward.
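A rough sketch of what the detection half of such a maintenance task could look like, under stated assumptions. The row shape, the `find_case_duplicates` and `merge_group` helpers, the follower names, and the "keep the lowest id" policy are all made up for illustration; a real task would have to repoint statuses, follows, and mentions in the database.

```python
from collections import defaultdict


def find_case_duplicates(accounts):
    """Group account rows whose username@domain differ only by letter case."""
    groups = defaultdict(list)
    for account in accounts:
        key = (account["username"].lower(), account["domain"].lower())
        groups[key].append(account)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


def merge_group(rows):
    """Keep the oldest row (lowest id, an assumed policy) and union followers."""
    survivor = min(rows, key=lambda row: row["id"])
    followers = set()
    for row in rows:
        followers |= row["followers"]
    return {**survivor, "followers": followers}


# Hypothetical sample rows; ids and followers are invented.
sample = [
    {"id": 1, "username": "krita", "domain": "mastodon.art", "followers": {"alice"}},
    {"id": 2, "username": "Krita", "domain": "mastodon.art", "followers": {"bob"}},
    {"id": 3, "username": "gargron", "domain": "mastodon.social", "followers": set()},
]

duplicates = find_case_duplicates(sample)
merged = [merge_group(rows) for rows in duplicates.values()]
print(sorted(duplicates))
print(merged[0]["id"], sorted(merged[0]["followers"]))
```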
Ok, I am extremely uneasy with not considering the ActivityPub URL as the authoritative URL. To be honest, I have no idea what the change you have made in 9381a7d9d55ea734d6c498a82d17d73fd02fbe87 entails.
Having two pairs of “unique identifiers” only loosely related—there is a somewhat sensible mapping procedure implemented in Mastodon, but it depends on external data and the mapping may be broken if anything changes on the remote end—is a huge pain in the ass, and I really think we have to find a sensible way to handle this mess. Unfortunately, I don't have any idea of what that solution could be, at the moment.
Anything we come up with should:
Unfortunately, with the possibility of multiple entries sharing the same ActivityPub URL, point 3 is currently an issue.
Sharing a conversation related to this issue.
Full context: https://sleeping.town/@calmbit/99823144384899376
Relevant toot 1: https://mastodon.social/@trwnh/99829868541293024
Relevant toot 2: https://sleeping.town/@calmbit/99829878145066182
Relevant toot 3: https://mastodon.social/@trwnh/99829929980443968
Probably the same problem: I am following @claudia (Claudia Hauff) with my mastodon.social account. But when I want to remote-follow here with an account from another instance, the account is @Claudia (Claudia Kardaras), see image.

@djoerd your report is a bit confusing, what do you mean by “remote-follow here with an account from another instance”? What did you do exactly to remote-follow that person and what is this remote instance?
Is Claudia Hauff supposed to be on mastodon.social?
If I was designing an ActivityPub Server implementation from scratch
Yes if I was designing it from scratch now, I would also do it differently. But this is what we have and updating the way this works is not workable as far as I understand.
Isn't there concern that by doing it this way in Mastodon, it may discourage other projects from taking this path in their applications in order to be "mastodon-compatible"?
I guess I would like to bring up the hypothetical but not-unlikely possibility that some other ActivityPub project without any technical debt decides to implement changeable usernames a la Twitter -- on such an application, it would not require an admin to manually change the database. Thus, we could start seeing cases like krita-Krita a lot more, and perhaps case-sensitivity will not be enough to deduplicate all of them...
Also, I suppose I should reword the title at this point, because "Usernames on webapp are case-sensitive, but not on profile URL" doesn't adequately describe the entire issue anymore. There are actually many closely-related issues:
At least with the hypothetical provided here, you can say that the ActivityPub URI should be something like "domain.tld/users/uid", which is still dependent on DNS but should remain immutable. Of course, loading this URI in a web browser should be masked by a pretty URL.
Not only that, but it closes off some of the most secure and interesting futures for activitypub
The main and only interface of people getting in touch with other people on Mastodon is by writing their username@domain in the body of their post (and search bar). As such, ActivityPub URIs, DIDs, etc, are all "behind the scenes" and cannot be authoritative over username@domain. Because, what are you going to do if someone enters alice@example.com, but there are 12 different alices on example.com, each with a different URI or DID or whatever? So I'm sorry, but this issue is not about that at all. This issue is about a broken index, which failed to guarantee an assumption we had about the database at the database-constraint level.
This issue is bad enough without thinking about DIDs, because it's not simple to get rid of duplicates in the database, and especially not simple to do it in a db:migrate step if it needs any admin control at all, and the index can only be fixed once there are no duplicates.
The index itself isn't really broken -- krita/Krita both resolve to mastodon.art/users/55816 and are presented as such when querying mastodon.art. It seems strange that this information is completely ignored when it should be a valid way of referring to a certain Profile. At some point, that translation should occur, or at least it should be possible -- it is occurring in lookup on mastodon.art, after all.
Query any status posted by Krita on mastodon.art and you should receive the following account in the JSON:
"account": {
  "id": "55816",
  "username": "Krita",
  "acct": "Krita",
  "display_name": "Krita",
  ...
  "url": "https://mastodon.art/@Krita",
  ...
}
By citing "usernames are authoritative over UID" as an issue, I mean that "id" seems to be ignored entirely. At no point during entry into the local database on e.g. mastodon.social was there any validation of the "id" field -- and if there was, that didn't stop the account from getting duplicated rather than updated. That's understandable if username@domain is the only representation of a Profile, but it's not (at least, internally and over ActivityPub).
ID and URI are supposed to be immutable. Since URI is currently dependent on username, that means that it's only immutable as long as the username is immutable. If you consider usernames as case-insensitive, then technically krita and Krita should never have been treated differently (as they are merely case swaps). There's no pressing reason to refactor account-handling logic to go by id instead of username just yet -- it just needs to be made case-insensitive for now, and a cleanup task should identify and possibly merge records for accounts that are found to have the same case-insensitive username.
technically krita and Krita should never have been treated differently
Yes that's the broken index
Is "id" not used in the database?
Scrolling up, I'm not seeing whether I described it anywhere, but the actual issue is that there exists a case-insensitive index on the username/domain tuple, but it's a non-unique index for some reason (a mistake). There exists a case-sensitive unique index on the same table as well. It's a mistake.
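The effect of that index mix-up can be reproduced in a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in (Mastodon itself runs on PostgreSQL), and the table layout and index names are assumptions, not the real schema. It shows that a case-sensitive unique index plus a non-unique case-insensitive index lets krita and Krita coexist, and that the unique case-insensitive index can only be created once the duplicate is gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, username TEXT NOT NULL, domain TEXT NOT NULL)"
)
# The situation described above: unique but case-sensitive...
conn.execute("CREATE UNIQUE INDEX idx_exact ON accounts (username, domain)")
# ...and case-insensitive but NOT unique.
conn.execute("CREATE INDEX idx_lower ON accounts (lower(username), lower(domain))")

conn.execute("INSERT INTO accounts (username, domain) VALUES ('krita', 'mastodon.art')")
conn.execute("INSERT INTO accounts (username, domain) VALUES ('Krita', 'mastodon.art')")
rows_before = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]


def try_create_unique_lower_index(connection) -> bool:
    """Attempt the fix; it fails while case-insensitive duplicates exist."""
    try:
        connection.execute(
            "CREATE UNIQUE INDEX idx_lower_unique "
            "ON accounts (lower(username), lower(domain))"
        )
        return True
    except sqlite3.IntegrityError:
        return False


created_with_duplicates = try_create_unique_lower_index(conn)

# Stand-in for a real deduplication step.
conn.execute("DELETE FROM accounts WHERE username = 'Krita'")
created_after_cleanup = try_create_unique_lower_index(conn)

# With the unique index in place, a case variant is now rejected.
try:
    conn.execute("INSERT INTO accounts (username, domain) VALUES ('KRITA', 'mastodon.art')")
    variant_rejected = False
except sqlite3.IntegrityError:
    variant_rejected = True

print(rows_before, created_with_duplicates, created_after_cleanup, variant_rejected)
```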
@Gargron I may be misunderstanding how things are done. There is a way in which this can be perfectly safe, and perhaps it is even happening currently. Let's abstract (and I know it's an over-abstraction) the process into two stages: the composition stage, and the federation stage. Let's say that at the composition stage we're allowing users to select recipients. Mastodon's "composition" UI, and composition handling backend, only support the addressing for composed activities using webfinger / usernames. Fine fine fine. But right before handing off to the federation side, Mastodon's composition system should "transform" the webfinger addresses into the canonical ID uris. If this is done, and the federation side "only" sees the IDs, then everything is fine!
The federation only sees URIs (e.g. https://mastodon.social/users/Gargron)
Ok... I guess we're good then :)
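For readers following along, the composition-to-federation handoff discussed above might be sketched like this. Everything here is an illustrative assumption: `KNOWN_ACTORS` stands in for a WebFinger lookup or local account table, `resolve_recipients` is a hypothetical helper, and the mention regex is deliberately simplified. Only the Gargron URI comes from the thread.

```python
import re

# Simplified pattern for @username@domain mentions (an assumption).
MENTION_RE = re.compile(r"@([A-Za-z0-9_]+)@([A-Za-z0-9.-]+)")

# Hypothetical stand-in for WebFinger resolution, keyed case-insensitively.
KNOWN_ACTORS = {
    ("gargron", "mastodon.social"): "https://mastodon.social/users/Gargron",
}


def resolve_recipients(text: str) -> list[str]:
    """Turn username@domain mentions into canonical actor URIs."""
    uris: list[str] = []
    for username, domain in MENTION_RE.findall(text):
        uri = KNOWN_ACTORS.get((username.lower(), domain.lower()))
        # Unknown mentions are skipped; duplicates collapse to one recipient.
        if uri is not None and uri not in uris:
            uris.append(uri)
    return uris


recipients = resolve_recipients(
    "cc @Gargron@mastodon.social and @gargron@mastodon.social please look"
)
print(recipients)  # ['https://mastodon.social/users/Gargron']
```

The federation side would then address activities using only the returned URIs.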
Something a bit odd on mastodon.social related to this: as of 4 days ago (2.4.0rc4 release?) the krita account has started receiving new statuses, and the Krita account has received none. Nothing has changed on mastodon.art's end.