Mastodon: Make account approval required by default, with improvements

Created on 17 Apr 2019 · 36 comments · Source: tootsuite/mastodon

Prologue

Related: #877, #8122, #10300

With the fediverse rising in popularity and gaining more attention, not all of that attention is positive. More users means more activity, and also a more valuable audience for spammers to target.

Spam attacks against the fediverse have already been numerous, and they will only increase in frequency as organic growth continues. It is therefore important to address the trend of spam signups before it becomes a major problem.

In previous federated systems, we can see that, given enough time, spam eventually becomes a serious problem for any community:

  • With email and the SMTP network, the rise of email spam has caused servers to require more and more anti-spam measures, to the point that it is effectively impossible to easily run your own mail server, due to the arms race between professional spammers and corporate email providers. DKIM, DMARC, blacklists, etc. all disadvantage the creation of new email servers, because the likely result is that any new server will find its mail undeliverable by default. This is a bad outcome.
  • With Jabber and the XMPP network, open signups and in-band registration are often disabled due to similar spam concerns. An XEP for designating an abuse contact is a de facto standard in spam prevention, but the much wider trend is that modern XMPP servers are largely self-hosted and therefore do not have open signups at all. For the remaining public servers, signups are usually requested via a webform. It can be said that the reason further anti-spam measures aren't implemented is that the open Jabber network has been largely drowned out by locked-down, non-federating XMPP chat solutions like WhatsApp. This is a bad outcome.

Rationale

Spam occurs because signing up and posting things have no cost. Given no cost, there is therefore no mechanism to prevent infinite signups or infinite posts. Therefore, to address spam, there must be some limit placed on registration or posting.

Limitations on posting are, in effect, targeting the effect of spam. If infinite posts can be made by spambots, then one way to limit spam would be to enforce rate limits based on what a human may reasonably post in a given period of time. But as #9766 / #9960 show, if the limit is too strict, then actual humans are disadvantaged, and if it is too lenient, then spammers will simply adapt to stay within it; this reduces the volume of spam, but does not solve the problem.
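
For illustration, the kind of posting limit being discussed is essentially a sliding-window counter per account. A minimal sketch in Python -- the cap and window values are hypothetical, not Mastodon's actual limits:

```python
# Sliding-window post rate limiter (sketch). The numbers are illustrative;
# as argued above, any fixed values either inconvenience humans or get
# matched by bots.
import time
from collections import deque

class PostRateLimiter:
    def __init__(self, max_posts=30, window_seconds=1800):
        self.max_posts = max_posts
        self.window = window_seconds
        self.history = {}  # account id -> deque of post timestamps

    def allow_post(self, account_id):
        now = time.monotonic()
        timestamps = self.history.setdefault(account_id, deque())
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_posts:
            return False  # over the cap: reject or defer the post
        timestamps.append(now)
        return True
```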

Limitations on registration are, in effect, targeting the cause of spam. If infinite registrations can be made by spambots, then one way to limit spam would be to enforce limits based on heuristics for detecting a "spammy" account. With no cost for signing up, a spam account can be banned as soon as it is detected, but detecting a spam account does not occur instantly, so the spammer is free to cause damage until caught. Also, the spammer can register another account immediately and begin spamming again. Thus far, Mastodon has implemented MX-based registration limits, with other limits being explored -- IP? Captcha? This, again, reduces the frequency of spam, but still does not solve the problem.
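
The MX-based limit mentioned above amounts to resolving the mail exchanger for the signup address's domain and checking it against a blocklist. A rough sketch of the idea, assuming the third-party dnspython package; the blocklist entry is a hypothetical example, and this is not Mastodon's actual implementation:

```python
# MX-based signup filtering (sketch): a throwaway domain can be renamed
# cheaply, but its mail still has to be handled by some real exchanger.
import dns.resolver

BLOCKED_MX_SUFFIXES = {"mail.spammy-host.example"}  # hypothetical entry

def email_domain_allowed(email):
    domain = email.rsplit("@", 1)[-1].lower()
    try:
        answers = dns.resolver.resolve(domain, "MX")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return False  # no mail exchanger at all: likely not a real mailbox
    for record in answers:
        mx_host = str(record.exchange).rstrip(".").lower()
        if any(mx_host.endswith(suffix) for suffix in BLOCKED_MX_SUFFIXES):
            return False  # mail handled by a blocked provider
    return True
```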

Pitch

Instead of entering an arms race with spammers time and time again, I propose instead solving this problem by imposing a social cost rather than a technological barrier. Technological barriers can be overcome, and are hard to develop. By contrast, social barriers are much harder for automata to overcome, and also much more easily spotted.

We already have a culture of human moderation teams responding to reports, instead of offloading moderation to automated algorithms. This is very effective at removing spam once it is detected and reported. So it is a natural extension to say that if a platform is responsible for removing spam, it should also be trusted to take responsibility for preventing it. The fundamental factor here is that platforms are ultimately hosting other people, and must therefore take some responsibility for what others post. To be a socially responsible member of the network, each Mastodon website must have certain rules and must enforce those rules via its moderation; otherwise, if it is too lenient, it faces the social cost of other sites choosing to silence/suspend it as an effectively unmoderated space, incompatible with the rules of the local website.

Analysis

Requiring manual approval is, in my view, the best long-term solution to this problem, because it is the most scalable and sustainable way to uphold the social responsibility of not flooding others with spam. It makes the fewest tradeoffs compared to other solutions, and it is easy to implement, because humans are very good at making judgements about context, while computers are not.

Implementation details

In practice, the recently-added "request an invite" mode is a much-needed step up from the binary of open/closed registration, but it is currently too limited. The main problem with this mode is that moderators have no way to effectively judge users before approval, so the act of approving users is not sufficient to block anything except the most obvious spam. The only proof-of-work for establishing good faith is a text box asking why the user wants to sign up, but users can put very little effort into this box, or give a canned answer that isn't genuine.

The specific improvements I propose are as follows (a rough sketch of the resulting permission gate follows the list):

1) Before approval, allow users to perform certain "safe" actions such as setting their avatar, banner, bio, or profile fields. This gives moderators more context and information on whether to approve the account.
2) Perhaps allow unapproved users to make posts, but do not show these posts in the public timelines until the profile is approved. Having a small post history also provides additional information to mods.
3) I am not sure whether unapproved users should be able to follow others, since bio/name spam is a thing. Also, following users from remote sites can cause undesirable content to start flooding the federated timeline, so that's another reason why allowing unrestricted follows might be a bad idea. There needs to be more thought/discussion around this point, though -- I'm not confident in giving an absolute yes/no answer w/r/t this.
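
To make the proposal concrete, here is a minimal sketch of the permission gate implied by points 1-3, in Python. The action names and the split between the two sets are illustrative -- point 3 is deliberately left as an open question above:

```python
# Pre-approval permission gate (sketch). Which actions are "safe" is the
# open policy question; this encoding is one possible answer, not a spec.
from enum import Enum, auto

class Action(Enum):
    EDIT_PROFILE = auto()   # avatar, banner, bio, fields (point 1)
    POST_UNLISTED = auto()  # posts hidden from public timelines (point 2)
    FOLLOW_LOCAL = auto()   # point 3: undecided; allowed here for the sketch
    FOLLOW_REMOTE = auto()  # point 3: risks pulling spam into federated TL
    POST_PUBLIC = auto()    # visible in public timelines

# Actions reserved for approved accounts under this sketch's policy.
APPROVED_ONLY = {Action.POST_PUBLIC, Action.FOLLOW_REMOTE}

def permitted(action, approved):
    return approved or action not in APPROVED_ONLY
```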

Label: suggestion

All 36 comments

I strongly disagree with allowing unapproved users to make posts, follow others, etc. Any interaction with other users should be blocked until they are approved, to prevent possible spammy actions -- and the restriction should be clearly communicated, so users don't think something is simply broken.

I'm also iffy on the idea of letting them set up their profile. If they put a lot of work into it and are not approved, that information is deleted, which could be very frustrating for someone.

I agree with making account approval the default, though. +1 to that!

My only concern here is whether human moderators can really be more effective in this respect. Bot accounts can be very convincing before they start flooding the instance -- fake anime avatars can be fetched anywhere, and fake names are not difficult to generate. And you can't expect "real" users to post anything or edit anything other than these before approval either; if they are not allowed to do so before approval, then it is even harder to distinguish between an innocent newcomer and a dormant spam bot.

Making approval mandatory may also open up denial-of-service attacks targeting human moderators. Since there is still zero cost to creating accounts in the "pending" state, bots can simply flood the approval queue with fake accounts. This would not affect the fediverse as a whole, but the instance can be rendered almost useless, because approvals and denials cannot be processed at such a rate, and real user registrations can be diluted to the point that they are no longer visible to any human moderator. Yes, you can rate-limit registration for the whole instance, but then these bots can still render your instance unusable to real users.

One solution to all these problems is to add limitations on the registration email address, i.e. restricting allowed addresses to a whitelist of domains configured by the moderators, and then depending on the verification mechanisms of those mail hosts. But to me this defeats the whole point of decentralization. I wouldn't say manual approval is a bad idea because of this; I'm just thinking that at least some computational cost should be introduced alongside manual approval, i.e. alongside the so-called "social cost".

@joycem137

I strongly disagree with allowing unapproved users to make posts, follow others, etc. Any interaction with other users should be blocked until they are approved

Making posts doesn't necessarily mean interacting with others. Defaulting to silence until approved allows the user to go ahead and familiarize themselves with the platform, and no one else will see their posts unless the user tells people to go to their profile manually. Their public profile can also show a banner saying something like "this account is currently awaiting human approval."

As for following others, again, I'm still unsure of how best to handle that. Perhaps giving existing users an option to require manual approval of unapproved accounts trying to follow them? Sort of like a step between locked accounts and unlocked accounts, as they currently exist.

If they put a lot of work into it and are not approved, that information is deleted and could be very frustrating for someone.

This is kind of the point. Setting up a profile is not a lot of work for a human to do, and if they are rejected, then it's a very small cost. By contrast, it is a lot of work to generate genuine-seeming profiles on a large scale, and there is no payoff until the account is approved, so most spammers won't put in the effort because it's not worth their time.


@PeterCxy

you can't expect "real" users to post anything or edit anything other than these before approval either; if they are not allowed to do so before approval, then it is even harder to distinguish between an innocent newcomer and a dormant spam bot.

It is easier to distinguish when you have more information. A dormant bot will, obviously, remain dormant. Posting genuine-seeming things can be a signal that increases the likelihood that the account is genuine. It is no guarantee, of course -- that will always remain up to human judgement. It's not a mark against someone if they choose not to post for a while; if it were that simple to decide, then an algorithm could be written to make the decisions based on human-defined criteria. It's identifying those criteria that's the hard part.

Making approval mandatory may also open up human-moderator-targeted Denial-of-Service attacks. Since there is still zero cost creating the accounts in "pending" state, bots can simply flood the list of accounts waiting for approval with fake accounts.

Not significantly different from actual spam acting as a DoS on moderators, and in fact, I'd argue it's much less terrible -- with moderation after-the-fact, the burden falls on everyone (and on multiple websites' moderation teams, too). With moderation before-the-fact, the burden falls only on that one site's moderation team. The "pending" phase can be combined with filtering tools to detect multiple signups from the same IP, or from a blacklisted email domain/MX, or whatever. Again, the role of technology is not to make decisions, but to make it easier for humans to do their job. It might be bad for users to be unable to sign up while registrations are under attack, but this is not as bad as the alternative, which is to have existing users unable to use their accounts because the public timelines are under attack.
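
As a sketch of what those filtering tools could look like: group the pending queue by shared signup IP and email domain, and surface any cluster above a threshold for bulk review. All names and the threshold are hypothetical:

```python
# Flag clusters of pending signups that share an IP or email domain (sketch),
# so a registration flood can be reviewed and denied in one action.
from collections import defaultdict

def flag_suspicious(pending, threshold=5):
    """pending: iterable of dicts like {"username", "ip", "email"}."""
    by_source = defaultdict(list)
    for signup in pending:
        domain = signup["email"].rsplit("@", 1)[-1].lower()
        by_source[("ip", signup["ip"])].append(signup["username"])
        by_source[("domain", domain)].append(signup["username"])
    flagged = set()
    for usernames in by_source.values():
        if len(usernames) >= threshold:  # many signups from one source
            flagged.update(usernames)
    return flagged
```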

It is easier to distinguish when you have more information.

That's what I'm thinking too, and the problem I had was whether we can encourage a real user to do this kind of thing before approval and make things easier for the admins. But this comes down to UX anyway.

Perhaps giving existing users an option to require manual approval of unapproved accounts trying to follow them?

I would say that following is not something as harmful as flooding the public timeline. Following might introduce two possible scenarios (as far as I can think of):

1) Spamming people with garbage new follows (thus filling their notification queue with useless information)
2) Causing the instance to pull data from a remote account, possibly also a spamming bot that has been approved by an unmoderated instance or simply by chance.

For the first scenario, we can just silence notifications from unapproved accounts as well. Or, even better, Mastodon can choose not to let the outside know anything about an unapproved account except the profile page, so that "follow" requests won't actually be sent until the account becomes approved -- "follow"-ing, in an unapproved state, would mean only adding the other user's public posts to the user's own timeline. This enables real users to pick accounts they are interested in before approval without actually disturbing the fediverse, rather than being faced with an empty timeline, which is not a great experience for a newcomer.

The second scenario seems even easier. Just prevent unapproved accounts from pulling in new remote accounts, i.e. disable remote follows of unknown remote accounts. This way, spam bots cannot cause the instance to pull any new information from any remote instance, but real users can still go around and follow people from the public federated timeline, thus getting familiar with the platform.

By doing so, we can provide a nearly full experience to real new users (everything except federation features) and also prevent spam bots from actually harming the entire fediverse. Real users can do anything they like in the pending state, and only when they get approved will their actions actually be broadcast via federation.
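
A minimal sketch of this buffer-until-approved model, assuming a hypothetical `federate` callback for outgoing deliveries (this is illustrative structure, not Mastodon's internals):

```python
# Pre-approval actions are recorded locally and only replayed over
# federation once a moderator approves the account (sketch).
from dataclasses import dataclass, field

@dataclass
class PendingAccount:
    username: str
    approved: bool = False
    outbox: list = field(default_factory=list)   # local-only posts
    follows: list = field(default_factory=list)  # local-only "follows"

    def post(self, text):
        self.outbox.append(text)  # stored, never federated while pending

    def follow(self, account):
        # Only adds the target's public posts to the local home timeline;
        # no Follow activity leaves the server yet.
        self.follows.append(account)

    def approve(self, federate):
        self.approved = True
        for account in self.follows:
            federate("Follow", account)  # real follow requests go out now
        # Whether the post backlog should also federate is a policy choice.
```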

Not significantly different from actual spam acting as a DoS on moderators

I would agree on this. I was just thinking that manual approval doesn't solve the problem of DoS on moderators, so some other verification mechanisms could be implemented alongside manual approval to actually help defend instances themselves from spam.

I think I saw somewhere that the application form could/would have a text box added to help with determining who's a human and who's a bot?

So I guess my suggestion is something like that, but the admin writes the question. That way the question varies from instance to instance. Maybe on one instance the question would be "what's your favourite flavour of ice cream?" and on another it'd be "what made you choose this instance over others?" A lot of bots would be very easily spotted if they have no way to predict the magic question when moving from instance to instance. (Do bots move from instance to instance like that?)

And admins could change the question whenever they want, so if a bot "got used to" a question then the admin could easily shift the question, making the non-human answers more obvious, since a human would answer the question in front of them rather than the one they've been programmed to answer.

Over time admins would find which questions are harder for a bot to answer, too.
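
Structurally this is simple: one free-text question per instance, with the question snapshotted into each application so the admin can rotate it at any time. A sketch with hypothetical field names:

```python
# Admin-defined signup question (sketch). The answer is never auto-judged;
# it only gives the reviewing human more context.
from dataclasses import dataclass

@dataclass
class InstanceSettings:
    signup_question: str = ""  # admin-editable, rotatable at will

@dataclass
class Application:
    username: str
    question_asked: str  # snapshot: old answers keep their original question
    answer: str

def submit_application(settings, username, answer):
    return Application(username, settings.signup_question, answer)
```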

Personally, as a user, I strongly _disagree_ with making accounts require approval. I would probably never have joined a Mastodon instance if I needed to ask someone to be let in.

Partially, it's not wanting to have to deal with it. But it's also about not wanting to bother an actual person about something that I really don't need to have.

Now, I'm all for having an _option_ to require manual approval. But it shouldn't be the _default_, because then good people simply won't join.

Having a way to detect bots, or simply many registrations in a short time period, does sound like a good idea. (Although please not Google, but that's a topic for another time.)

Facebook groups often ask people who want to join a series of questions, often to make sure they won't cause trouble. It's also, inadvertently, a good anti-spam measure.

@SilverWolf32

Personally, as a user, I strongly disagree with making accounts require approval. I would probably never have joined a Mastodon instance if I needed to ask someone to be let in.

I'd be interested to hear more about why you feel this way -- do you also not report spam because you don't want to bother moderators? What makes approving an account seem more personal than reporting an account? I think that, if you're joining a community, you should be at least comfortable with engaging with your moderation and administration team. Account approval doesn't have to be a formal and daunting thing.

Nonetheless, the proposal is simply for making approval-based registration the default, not the only option. Instances that wish to allow open registration are free to do so, if they feel capable of handling the extra responsibility.

Questions sound like they might work! Not too many, of course, because we don't want to scare people off, but some could be workable.

It might possibly also help the problem of "is this instance right for me?", depending on what the instance owners decide to ask.

_Edit: Of course this should definitely be configurable by the instance admins. Possibly blank by default, with no questions, and then have them add their own if they want to opt in._

@trwnh

I do report spam. Spam reporting is a little different, although I'm not quite sure why I feel that way. It's more of a "hey, here's some stuff that should probably be cleaned up", rather than a "could you please take time out of your day to do something for this random person you've never met, that they don't really need you to do, just because they feel like it?".

I don't have much of a problem talking to the admins, at least on the relatively small instance I'm on. But that doesn't mean I want to bother them about something that's not particularly important. (: And I'm still not clear on what I'd be able to do as a limited user anyway.

Also, this will massively discourage switching instances if you find out that the one you picked at first isn't for you, or having multiple accounts for multiple identities. It means you have to make _another_ request to someone you don't know, who doesn't know you, just so you can get started in the community.

On it being default: I realize that it can be turned off, but _most_ admins will probably leave it on. Just like _most_ users don't go tweaking every aspect of the way their computers work.

@SilverWolf32 Hmm. In that case, I'd say we need to address why some people might feel like approving an account takes time out of their day more significantly than handling reports would. It should be made as clear as possible that approval-based registrations are not a hassle. From the admin side, it's less liability on their shoulders. Approval should be as easy as possible.

re: defaults, though, that's the point. Most admins leaving it on is the intended effect. In security analysis terms, having open registrations is effectively an open attack vector. It's like granting write privileges to every new user on your system, instead of defaulting to read-only (leaving attackers free to cause damage by writing arbitrary data to your hard disk). It is much better to only grant write privileges to users that need it, and only in specific directories.

The issue of multiple accounts should be handled through other means -- splitting your content is orthogonal to security. From a security standpoint, you should not be able to make multiple accounts with no effort. From a content standpoint, you should be able to have several profiles per account, or streams per profile, etc, so that having multiple accounts is not strictly necessary. An account should represent your login credentials on one server, not your identity in a community.

Thinking about it again, I realized that I have similar feelings about approval-based registrations as @SilverWolf32 does. I can't really describe why this is the case, and I'm not sure if it is really "not wanting to cause a hassle", but requiring manual approval just makes me more reluctant to do something. Even when buying something like a VPS, one with automatic setup after payment is much more appealing to me than one requiring manual approval, even though in that case it can't even be counted as "bothering" someone, because it's the merchant's responsibility anyway. I guess it's just the human factor and the uncertainty of an unknown person that make me feel like this (e.g. there can be no guarantee for the processing time of approvals, but I'm depending on that uncertain approval to begin using the platform normally) -- while for some reason I have no problem reporting spam, just like @SilverWolf32 described, probably because whether one specific spam report is handled isn't something I depend on to continue using the platform.

An invitation-based registration scheme, on the other hand, is more comfortable to me, and it works extremely well for a lot of services other than Mastodon, but unfortunately it isn't suitable as a default for every instance. I guess my feelings about a manual-approval registration scheme could be fixed by a better user experience, as I mentioned in my last reply, and as we're all trying to work out in this issue. But I'm not sure how much I would like the scheme even with the improvements proposed here -- maybe we can try asking some users who haven't participated in this issue yet.

Another problem I've had since yesterday is how existing instances should be dealt with if this proposal makes it into a Mastodon release. If the old settings are not touched, then to me the change won't make much difference even on a 1-2 year scale (though it makes much more sense in the longer term); while forcibly changing the policy of every existing instance doesn't seem like a good thing for a large project to do.

Oh, one more thing. I'm not sure about this, but when considering the problem of uncertain response time from the perspective of a user, I came across the idea that mandatory approval might favor large and corporate instances over the current open registration and moderation-after-the-fact system. Those large instances often have much larger moderator teams compared to the amateur ones. When the response time determines how long (and how predictably) a user needs to wait before starting to use the platform normally, it seems to me that the ones with large moderator teams win naturally. I'm not saying it's not a problem in the current system -- after all, the response time to spam reports also largely influences usability and user experience, especially if bots are spamming the entire local timeline. However, since every user would now depend on this response time to begin their journey on the platform, rather than only when uncontrolled spamming happens (and even then, Mastodon is not completely unusable, because Mastodon is not just about local timelines, and users can choose to block content themselves anyway), it might extend the not-so-good experience to right after registration, which could be one more cause of users selecting corporate instances rather than smaller ones. I'm not sure this is good for new users or for the Fediverse.

Again, it's not a problem specific to making approval-based registration the default, but I'm just wondering whether such a move could favor large instances more than the current system does. I'd say that we still need more tools to assist moderation, especially for spam reports and approval-based registration. Automatic algorithms cannot replace human moderators, but it could be possible to reduce the gap between corporate instances and amateur ones with these tools. I'm not sure if I am going off-topic here, because the whole thing about huge instances versus amateur ones seems like an entirely different topic that has existed since the first day of Mastodon. I'm just a little concerned that making approval-based registration the default might exacerbate this problem.

...Mastodon is not just about local timelines...

This isn't a particularly important part, but I'd just like to point out that for some people, it _is_ just about local timelines. I completely ignore the federated timeline, only interacting with people on other instances when people I follow boost them. In those cases I might follow them, and so I do have several off-instance people I'm following, but I don't interact with the entire fediverse like I would if I used the federated timeline.

@PeterCxy

automatic setup [...] is much more appealing to me than those requiring manual approval

Hmm, this is interesting. I can understand feeling that manual approval introduces more friction, because that's the intention -- to introduce just enough friction to stop spam, but not to discourage legitimate signups. But I'm really interested in why you specifically said "more appealing" instead of "easier". This indicates a cultural expectation and not a utilitarian one, and I'd really like to know more about why such a cultural expectation exists -- and more to the point, how to overcome that expectation and reshape it into something more befitting a social community.

Removing the "human element" makes signups easier, but I don't think it should be removed. After all, is it really a successful conversion if they stop using their account after signup? Should we make signups so easy that anyone can not only sign up on a whim, but also leave as soon as that whim has expired? Or should we instead treat the registration phase as part of the onboarding experience, rather than some distinct thing that happens beforehand?

I guess my feelings on a manual approval registration scheme can be fixed by a better user experience [...] But I'm not sure how I would like the scheme even with the improvements proposed here -- maybe we can try to ask some users that haven't participated in this issue yet [...] we still need more tools to assist moderation

Yeah, the user experience is definitely lacking, and I also wouldn't ask for it to become the default until major improvements are made. But even so, we must ask: which improvements are needed? Which improvements would make people more comfortable with the idea? There are some proposals in this issue and also in #10597 that can make the UX better; the two issues are not mutually exclusive. Ultimately, the priority of all these suggestions is to reduce the load on small-time mods as much as possible.

mandatory approval might favor large and corporate instances over the current open registration and moderation-after-the-fact system. Those large instances often have much larger moderator teams compared to the amateur ones.

I heavily disagree with the first sentence, for exactly the reason you express in the second sentence. Moderation after-the-fact is MUCH more strenuous than approval before-the-fact. Moving toward a norm of approved signups actually decreases the load on the mod teams of smaller instances. This is heavily evidenced by how many small instances have closed signups entirely due to spam waves -- only the bigger instances can afford to keep registrations open. Some small instances do have open registration, but security-by-obscurity is not a dependable thing.


@SilverWolf32

I completely ignore the federated timeline [...] I don't interact with the entire fediverse like I would if I used the federated timeline.

The unfortunate fact is that in a lot of cases, the public timelines (both local and federated) generally become unusable due to noise, and spam greatly increases that noise. I use the federated timeline on mastodon.social, and it would be absolutely chaotic if I hadn't regularly put effort into blocking/muting/reporting spammers over the past 2+ years. For a new user, the federated timeline is practically unusable.

@trwnh

This indicates a cultural expectation and not a utilitarian one

Yeah, the whole concern here is just about the cultural expectation of new users -- it's not really about how much more time it would take, it's just some kind of reluctance toward an unknown human factor. But I am not sure if it is something widespread or just me. Like, why am I not reluctant to use social networks to interact with people I've never met, while being unwilling to "bother" moderators with registration? How the UX can be improved also depends on the answer to this question. But thinking about it, I can't really come up with a solid explanation.

Or it may have nothing to do with the time it takes and the uncertainty at all. It may be about the "moderation" thing itself -- simply the feeling that "I will be reviewed" could make someone retreat from registration, because that feeling is unpleasant at best. Or maybe it's about feeling distrusted by the platform. From the perspective of a platform, it is undeniable that no registration can be trusted by default, and this is not specific to any one or two users; but as a single user, the feeling is that the specific "I" am being distrusted, and "I" am being investigated and reviewed by an unmet person rather than a simple submit-then-good-to-go program. Again, none of this is really solid, but for now I can't think of more reasons why such expectations arise.

Moving toward a norm of approved signups actually decreases the load on the mod teams of smaller instances.

I do agree with this statement. What I was thinking about is not how much burden is placed on each mod team, but again how users expect things to be. When everyone is on the approval-before-onboarding scheme, users may naturally tend toward the instances where they expect to be approved fastest. It doesn't even matter whether that is really the case; it's the expectation that is in play. Currently, at least when considering the registration phase alone, no such tendency can form. But a lot of users already prefer large or corporate instances for a variety of other reasons, so what I was concerned about is whether this could add to those reasons.


@SilverWolf32

What I meant there was the user's home timeline, not the federated timeline.

Besides, I would argue that the problem of spam bots doesn't really stem from the registration scheme. Consider the email case: spamming over email isn't a problem of signup & registration at all. For most legitimate email servers, it's never zero-cost or even low-cost to create an account; no email spam bot is stupid enough to register on a legitimate email service like Gmail to send its spam to users of the same service, because those services already have strict registration policies; spammers either use SMTP directly from their own servers, or find some badly-managed email service. Now, for one single email server, it is not much of a hassle to filter out registrations from spam bots -- a well-designed CAPTCHA can work just fine for 90% of cases, and one can always do controlled registration. The real problem comes from cross-server spam bots, because there is simply no way for one server to ensure any registration policy is in place on another -- even if we assume email server implementations are the same by default. All it can do is check whether the origin is authentic, and that's all. Because the network is open, one can always implement one's own server to host one's own spam bots, and there is little to no cost to switching identities. The only way out is to just distrust them all, possibly even using whitelists instead of blacklists. This directly leads to the problem that new self-hosted mail servers are hardly trusted at all, as is also mentioned in the main issue.

For Mastodon, the case isn't much different. The real threat of spamming never comes from the local instance, and if local spamming happens, simple mechanisms like CAPTCHAs and challenge-response questions will work just fine -- or, if they don't work, just enable approval-before-onboarding. But the real problem comes from federation: a single poorly-managed instance can spread spam across the entire fediverse. The attacker is also perfectly capable of implementing their own ActivityPub server for this purpose. These methods work no matter what the default registration policy on Mastodon is -- as long as the protocol is open, you cannot expect the federation to be free of spam bots. Yes, instances must take responsibility, and instances should enforce a working anti-spam policy, but you cannot enforce this in the protocol, and there is simply no way instances can verify each other. One cannot and should not rely on the assumption that the default settings are in effect on all of the instances you federate with. Now we reach the same problems as in the email case -- would you trust an unknown instance not to accidentally (or deliberately) host a spam bot and fail to remove it?

For one instance, it's the admin's choice what policy to adopt. But if we want to solve the problem of spam for the Mastodon network, these problems cannot be overlooked. For cross-instance spamming, there can only ever be moderation-after-the-fact, otherwise it will turn into a whitelist. It only takes one account on one instance to disrupt the entire fediverse, and I don't see how the proposed policy changes could address this in their current state.

In a word, whether for email or for Mastodon, it is not zero-cost registration that encourages spam. Rather, it is the nature of the open protocol that leads to the problem. Changing the default registration policy might work for a while, but as the network continues to grow, bots targeting ActivityPub itself will appear, and they are not hard to implement in any sense. Changing default settings isn't bad, but it's never a solid basis for trust and reliance.

@PeterCxy This seems really important:

maybe it's about feeling distrusted by the platform [...] the specific "I" am being distrusted, and "I" am being investigated and reviewed by an unmet person rather than a simple submit-then-good-to-go program.

We need to frame this conversation around how we can convey to people that this is not a major process, that they are not being explicitly distrusted. It is the same as submitting a form and waiting for a response, like applying for ID or a passport. Perhaps there is some language that could be used to be more welcoming? It does seem like the current "Request an invite" is phrased in a way that is a bit too forward. Perhaps something like one of these would help take the edge off that?

  • "Submit application"?
  • "Apply for an account"?
  • or simply "Apply"?

Consider the email case: spamming over email isn't a problem of signup & registration at all. For most legitimate email servers, it's never zero-cost or even low-cost to create an account; no email spam bot is stupid enough to register on a legitimate email service like Gmail to send its spam to users of the same service, because those services already have strict registration policies

This is actually not true. All of the spam I receive comes from Gmail, Outlook, or Yahoo addresses these days. It's not costly at all to sign up for a free email account from one of the big three providers. Email spam has even evolved to the point where I get added to Google Shopping Lists or Google Family Groups by spammers with a spam link as their name. There is a false assumption that mail coming from one of the big three is more trustworthy than any given random domain, which is not always the case. There is no valid reason for Gmail to trust its own internal mail more than an external mail server by default; this is effectively a blind spot in their security model. Gmail cannot fully prevent spam signups on its own server; it can only attempt to apply the same algorithms that are applied elsewhere (track IP, enable captcha, request OAuth, etc).

Because the network is open, one can always implement one's own server to host one's own spam bots, and there is little to no cost to switching identities. The only way out is to just distrust them all, possibly even using whitelists instead of blacklists.

And this is exactly why I am advocating for approval-by-default: so that the public network doesn't effectively mandate whitelisting. Depending on blacklisting implies that you trust every origin by default. Depending on whitelisting implies that you distrust every origin by default. Therefore, in order to prevent the majority of the ecosystem from switching to whitelists as a natural outcome, spam needs to be fought as early as possible. In a hypothetical fediverse where most instances are whitelisted, it is more reasonable to whitelist a server with human-vetted accounts than a server where anything goes. If the norm were to have human-vetted accounts, then you could rest easier by just blacklisting the (relatively fewer) meganodes that have unvetted registrations.

And this is exactly why I am advocating for approval-by-default: so that the public network doesn't effectively mandate whitelisting.

All of the spam I receive comes from Gmail, Outlook, or Yahoo addresses these days.

This is _because_ the email system is whitelist-based. Approval-by-default wouldn't prevent spammers from setting up their own one-account servers to send spam from -- they would just approve their own bots. And we would still end up with whitelists. The two are orthogonal.

We need to frame this conversation around how we can convey to people that this is not a major process, that they are not being explicitly distrusted. It is the same as submitting a form and waiting for a response, like applying for ID or a passport.

No matter how you phrase it, you'll still have the "I am personally being reviewed" feeling. With an ID or a passport, you _want_ that. Of course you have to prove you are who you say you are. But that doesn't really apply with a social network like Mastodon.

There is a false assumption that mail coming from one of the big three is more trustworthy than any given random domain, which is not always the case. There is no valid reason for Gmail to trust its own internal mail more than an external mail server by default

Yes, there is. In fact, there is a reason for any node in any federated network to trust itself more than another instance -- it knows and controls its own policy, and no assumptions need to be made for its own policies to hold.

And this is exactly why I am advocating for approval-by-default: so that the public network doesn't effectively mandate whitelisting.

The problem here is that the protocol itself is not immune to spamming in the first place. Any limitations made in the Mastodon software are limited to Mastodon, and to legitimate Mastodon instances. Sure, for now, automated spam bots are mostly just using the Mastodon API, so requiring approval can block them very effectively. But what if they evolve to target ActivityPub directly? ActivityPub isn't rocket science, and as Mastodon grows, it becomes more and more appealing to target ActivityPub directly, because you can interact with Mastodon nodes without any of the complications imposed by the Mastodon software. You can trust all Mastodon instances by default because of approval-by-default, but there's nothing in the protocol preventing anyone from violating this assumption. Should all Mastodon instances blacklist every ActivityPub implementation other than Mastodon by default? I don't see that as much better than a whitelist-based fediverse.

Maybe I'm going a little bit too far ahead on this problem, in which case don't let this conversation be distracted by me :)

On the spamming issue, I personally still tend toward something like a proof-of-work mechanism for the federation protocol. I'll open another issue if I arrive at a general idea of how this should work.
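
For reference, the usual shape of such a mechanism is hashcash-style: the sender burns CPU time finding a nonce whose hash has N leading zero bits, and the receiver verifies it with a single hash. A minimal illustration (the difficulty value is arbitrary; this is not a worked-out federation proposal):

```python
# Hashcash-style proof-of-work (sketch): minting costs the sender about
# 2^difficulty hash attempts per message; verification costs one hash.
import hashlib

def verify(payload, nonce, difficulty=16):
    digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
    # Accept only if the top `difficulty` bits of the digest are zero.
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

def mint(payload, difficulty=16):
    nonce = 0
    while not verify(payload, nonce, difficulty):
        nonce += 1
    return nonce
```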

@SilverWolf32:

This is because of the whitelist-based email system. Approval-by-default wouldn't prevent spammers from setting up their own one-account servers to send spam from -- they would just approve their own bots. And we would still end up with whitelists.

How is internal spam due to whitelisting? It is internal and doesn't depend on federation at all. It happens even in centralized systems. I don't understand your point at all.

But that doesn't really apply with a social network like Mastodon.

Why not? Do you not want accounts on your platform to be good-faith actors? It's not about being "who you say you are", it's about being a good citizen within a community. The sad fact is that corporate networks do not care about this. Bad-faith behavior is tolerated because it's "engagement", and it means more eyeballs to see ads.

@PeterCxy:

it knows and controls its own policy, and no assumptions need to be made for its own policies to hold.

Any node can assume its own policies are true -- this is a tautology. But it cannot assume its policies are better or infallible. There is no valid reason to believe that Gmail's internal mail is more trustworthy than my private, hardened, registration-disabled server. It is erroneous to assume that an unvetted account is trustworthy simply because it came from the same origin. A lot of the spam on Gmail comes from Gmail. A lot of the spam on mastodon.social comes from mastodon.social.

the protocol itself is not immune to spamming in the first place. Any limitations made in the Mastodon software are limited to Mastodon,

And Mastodon currently has a majority of the userbase of the ActivityPub-based network, so any changes to Mastodon will have more impact than changes to other software. This is about establishing norms, not about technically enforcing an outcome on the entire network. Bad-faith actors could very well spin up a node of whatever they want, but it doesn't matter if they remain the minority.

How is internal spam due to whitelisting? It is internal and doesn't depend on federation at all. It happens even in centralized systems.

Oh, I thought you were referring to federated spam on other instances polluting the federated timeline, not local spam. That point absolutely doesn't apply locally.

It's not about being "who you say you are", it's about being a good citizen within a community.

How do you tell whether someone will be a good citizen, other than letting them try to be a citizen and seeing how they do?

With private servers where you only let your friends in, approval-by-default makes sense. But for public ones, where you _want_ people you don't know to join, it doesn't.

I don't know how you can get to know someone without letting them do things like post. And why start posting if nobody is going to see it anyway?

Of course, this all depends on what unapproved users are actually allowed to do. @trwnh, what were you thinking of?

it cannot assume its policies are better or infallible

The same thing can be said of the policies of nodes you don't own -- in fact, you cannot even know whether they have a policy at all. The only thing you can be sure of is your own policy. This is where the "self-trusting" comes from -- it's not about whether your stuff is better or others' stuff is worse, it's just that your own policy is the only thing you can be sure of.

so any changes to Mastodon will have more impact than changes to other software.

Can you have impact on an implementation of ActivityPub that is malicious by design? ActivityPub isn't rocket science, neither is Mastodon or Pleroma or anything else.

This is about establishing norms, not about technically enforcing an outcome on the entire network.

The problem is where the threats of spamming come from. Norms are norms, and norms work only for what they can affect -- in this case Mastodon, and possibly other legitimate implementations of ActivityPub. Even "do not spam" is a norm by itself, but the true threat of spamming does not come from Mastodon. I agree that such a norm can reduce the amount of spam in the current state, but since we are taking federation into account, we cannot ignore the fact that the protocol itself is even more subject to attack than Mastodon is. My concern is that while this change may reduce spam within Mastodon, nothing can be done about the federation protocol itself, and it is almost certain (as long as the fediverse grows at the current rate) that the protocol itself will soon become the new target. In that case, we still end up with whitelists, though now it's whitelist-by-implementation, and I'm not sure whether this is better than the email or XMPP case. I'm not even sure such a whitelist-by-implementation would even work, considering that changing the user-agent is a normal technique used by spammers these days. We may still end up with the same whitelists anyway.

It seems like the conversation is looping back to stuff addressed in the OP:

this all depends on what unapproved users are actually allowed to do.

Again, the proposal being made in this issue is to make improvements to the default policy and onboarding. Unapproved users currently can do nothing; I am saying that they should be able to perform certain actions in a limited context so that they can better establish good faith. And if enough improvements are made to the currently-very-lacking "require manual approval" mode, it can then be used with fewer qualms, and perhaps even set as the default policy, which would make the overall ecosystem less susceptible to spam -- effectively, servers with open and unapproved registrations are unhardened, because there is no access control at all. You can try to implement all sorts of imperfect technological barriers, but they will not be as simple or as effective as having a human look at a profile for 2 seconds. Deriving context is simply something that humans are much better at than machines. And, again,

it is almost certain (as long as the fediverse grows at the current rate)


@PeterCxy:

In this case, we still end up with whitelists, though now it's whitelist-by-implementation, and I'm not sure whether this is better than the email or XMPP case.

It is better to have a larger pool of compliant servers due to norms being enforced or encouraged by defaults in the most common software. User-agent spoofing is completely irrelevant; the only situation in which a trust-by-default ecosystem with blacklists can ever work is one where the majority of implementations adhere to a common level of trust. And unmoderated registrations are too wide an attack vector. So yes, having a human rubber-stamp each profile is vastly preferable to antipatterns like CAPTCHA.

Unapproved users currently can do nothing; I am saying that they should be able to perform certain actions in a limited context so that they can better establish good faith.

If new users can do nothing, why would they post anything if no one can see it? Are you proposing they be allowed to post to the local timeline, just not have their posts federate?

If this is not the case, how do admins know whether to admit a user if they're not posting anything?

User-agent spoofing is completely irrelevant

My argument is that it is relevant, because instances cannot even tell whether a request is made by one specific implementation of ActivityPub. Even if instances block non-compliant implementations by default (the whitelist-by-implementation situation I mentioned), the attacker's ability to send malicious messages directly via ActivityPub isn't hindered at all, because of how easy it is to spoof the user-agent. This makes whitelist-by-implementation unsound, and the fallback is again whitelists, as with email servers.

So the problem here is that we can only make sending spam non-zero-cost on Mastodon instances and possibly other implementations of ActivityPub; we cannot make sending spam any harder through ActivityPub directly by such changes in Mastodon, and thus we do not reduce the actual threat of spamming. We cannot have humans look at every request made in ActivityPub. Still, such changes can at least make Mastodon instances more "responsible citizens" of the fediverse. What concerns me is just that even with such measures, we can still end up in the situations mentioned in the OP, for the reasons above -- how easy it is to send spam directly via ActivityPub, and how easy it is to spoof the user-agent, making even whitelist-by-implementation (which could be better than whitelist-by-instance) infeasible, so that the only fallback is whitelist-by-instance again.

instances cannot even tell whether a request is made by one specific implementation of ActivityPub.

Instances can tell whether another instance has open signups or not by visiting their homepage. But this is really getting off-topic. ActivityPub is the protocol used between Mastodon servers, so it makes no sense to make claims about "direct" attacks on the protocol as if it were distinct.

Or put another way, the fact that spam can come from external sources should not mean ignoring the fact that spam's primary avenue is internal. If no remote user follows an account, the messages never reach their server. "Worry about your own house", as the proverb goes.

we cannot make sending spam any harder through ActivityPub directly by such changes in Mastodon, and thus we do not reduce the actual threat of spamming

It does reduce the threat! Not all spam is external. It just doesn't eliminate it. There is no perfect solution except to completely remove all incentives for spam and bad-faith behavior -- which is a societal issue, not a technological one.

how easy it is to send spam directly via ActivityPub, and how easy it is to spoof the user-agent, making even whitelist-by-implementation

I'm not advocating for programmatic whitelisting based on a user agent at all, so I'm not sure why it keeps coming up. Silence/suspend policies shouldn't depend on any self-reported information, ever.

the fact that spam can come from external sources should not mean ignoring the fact that spam's primary avenue is internal

Nobody is ignoring this fact. I mentioned in the very first comment where I brought this up that the solution proposed in this issue __does__ help a lot in the __current__ situation. The whole ActivityPub tangent is just a long-term concern that occurred to me -- the fact that the primary avenue is internal for now doesn't mean other avenues are infeasible or cost more than internal ones. I apologize if any misunderstanding arose from my wording in previous comments.

"direct" attacks on the protocol as if it were distinct

They are distinct. Mastodon's APIs are not ActivityPub APIs, nor vice versa. For now bots use the Mastodon API, but there is no reason they can't just implement ActivityPub and work around Mastodon entirely.

the messages never reach their server

If I understand correctly, at least replies and DMs can be actively pushed to a server. I can certainly receive DMs, mentions, and replies from accounts and instances I don't even know of.

Not all spam is external. It just doesn't eliminate it.

The problem, from the very beginning, is that when internal spam (internal as in internal to Mastodon, not internal to one instance) becomes infeasible, spammers will resort to external spam (external as in implementations that are designed to be malicious), and I can't see why that would be less feasible than internal spam, except that external spammers cannot reach the public timelines (but hey, email doesn't have a public timeline and spam still reaches and annoys almost everyone), or except if someone tries block-by-implementation (and this is why that hypothetical keeps coming up -- your next question). Nothing can eliminate spam, but the concern here is that the external route doesn't seem any more difficult than the internal one. Will the threat be reduced much when there is another, nearly equally easy way of spamming?

As I've said, the measures proposed here __do make Mastodon instances more responsible members of the fediverse__. In fact, what is proposed here can at least solve the problem of spam in the timelines -- both local and federated, which is one difference between external and internal spam. But spam is not just sent to public timelines, and here we essentially fall back to the model of email systems.

I brought this up just as another possible concern about spam, not to demand that the measures be perfect, and not to say that the measures proposed here are of no help at all. They do help; I just don't think they can help that much in terms of avoiding situations like the email case (which is what they intend to avoid), for the reasons mentioned above. I'm thinking the problem is not just about the Mastodon implementation itself or its moderation, but also about ActivityPub and all such federated protocols. All of these are so easy to spam on because of their design, not just because one implementation or one instance isn't enforcing enough moderation policy.

Anyway, I do agree that this could be going off-topic, so I'm withdrawing from this thread of the discussion. Let's get back to how to reduce the feeling of "being reviewed", which I haven't had many ideas about.

Oh, BTW, one of my friends pointed out that some sites that require invitations to join are actually perceived as better, just because requiring invitations makes it seem like there is a shortage in supply (I don't know if the wording is right, but it's something like that), and thus users are more motivated to join (they might even ask for invitations on public forums, etc.). I'm wondering if there is some way to give approval-before-onboarding a similar feel.

requiring invitations makes it seem like there is a shortage in supply ... and thus users are more motivated to join

Is this a good thing? Personally, I would probably be _less_ motivated to join if I felt that way, because I'd feel that I would be unlikely to get in.

@SilverWolf32 it's like, when something is in short supply, it seems like it must be precious or whatever... but that only works for some sites, not all of them. Just mentioning it because it's interesting :)

@PeterCxy Yeah, there's not much I can think of to make it impossible to send a spammy message to an arbitrary inbox -- but that's only really a long-term concern in the situation where account registration is unbounded. Eventually, every human that cares will have an account. I foresee instances getting blocked more liberally, at that point in time. But the outcome I am specifically trying to avoid is one where the ecosystem is largely reliant on technological barriers like DKIM and DMARC, which make setting up a server unnecessarily more difficult. The social cost of having vetted accounts is better than the technological cost of imposing config hell on every server by default, just to ensure that messages are delivered in the first place. It is prohibitive to expect every self-hoster to have to configure their server in a way that passes arbitrary heuristics that can still be sidestepped.

Outside of this specific issue, there are a lot of options to explore from an object-capability (OCAP) perspective -- you could require a signature from a capability endpoint, hand out revocable keys to your contacts that allow them to message you, and greylist or reject every unsigned message -- but this would be incompatible with the current iteration of the public fediverse. (See #8565 and https://github.com/w3c/activitypub/issues/319 for more about that.)
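
A toy sketch of that capability idea -- each contact is handed a revocable token, and deliveries that don't carry a valid one are rejected (or greylisted). Everything here is hypothetical and deliberately simplified; it is not how ActivityPub inboxes work today:

```python
# Capability-gated inbox (sketch): tokens are HMACs over the contact's id,
# so the inbox can verify them statelessly and revoke them individually.
import hashlib, hmac, secrets

class Inbox:
    def __init__(self):
        self.secret = secrets.token_bytes(32)
        self.revoked = set()

    def grant(self, contact):
        tag = hmac.new(self.secret, contact.encode(), hashlib.sha256).hexdigest()
        return f"{contact}:{tag}"  # hand this key to a trusted contact

    def revoke(self, token):
        self.revoked.add(token)

    def deliver(self, token, message):
        if token in self.revoked:
            return False  # capability was withdrawn
        contact, _, tag = token.partition(":")
        expected = hmac.new(self.secret, contact.encode(), hashlib.sha256).hexdigest()
        # Unsigned or forged deliveries would be greylisted/rejected here.
        return hmac.compare_digest(tag, expected)
```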

And of course, spam isn't the only malicious payload -- there are other forms of bad-faith participation, and it would certainly behoove website owners to be more mindful of who they are giving a platform, in general.

Let's get back to how to reduce the feeling of "being reviewed", which I haven't had many ideas about.

I can't really think of anything else except language, as in https://github.com/tootsuite/mastodon/issues/10590#issuecomment-485291571 -- the verb "request" is too imposing, IMO. It could make it seem like you are being a bit of a hassle by taking something from the server -- the moderators' time, etc. I'd really like feedback on whether something more impersonal like "Submit application", "Apply for an account", or simply "Apply" would work better. Or something else in that vein.
