Currently username can be any matching regex:/^[a-z0-9_-]+$/i and the other rules (eg, min: 3). A drawback of this regex is that it allows for numeric usernames. As a consequence consider:
User
Hitting
/api/user/1337 returns user 1337 for any other user.
What we have to do is disallow any numeric usernames, the UserValidator has to be modified. A consideration is what to do with existing users.
In my opinion, numeric usernames are allowed on most of software and disallow them is not the way.
Maybe, when someone build an extension to integrate login with service "A" and this service allow numeric usernames, what will happen?
The problem here, is not the numeric username, is how API works.
Open to discussing alternatives, I think the main question is what should the API call be to retrieve a single user by their username?
I guess it could be GET /users?filter[username]=Toby, although that endpoint returns a list (collection) of one users rather than a just a single resource.
Wouldn't be enough to obtain user details by its id?
Not always. Prime example is when you browse to a user page like http://discuss.flarum.org/u/Toby - we need to look up the user by their username!
Right now, we are generating duplicate content with http://discuss.flarum.org/u/Toby and http://discuss.flarum.org/u/1. Shouldn't one of them redirect to the other one or at least a canonical URL direct search engines to the version to be indexed?
Now my question changes to: Wouldn't be enough to obtain user details by its username?
(Thanks for your patience)
Sometimes you need to look up a user by their ID too. eg. when the API returns discussions/posts, they have relationships with their authors which specify a user ID.
There could be a separate endpoint to lookup by username ? like /api/users/by-username/{value} ? There's no real need for the same endpoint to work with both IDs and usernames ?
Or tweak the existing endpoint to accept the username query with a different syntax. For example if you query /api/users/username, it acts like the GET/PUT/DELETE endpoints (or just GET) except you also have to pass a username / filter[username] GET/POST attribute to tell which username you look for. As long as it doesn't collide with the way you pass new values to PUT/DELETE it would be fine. With this solution you just have to blacklist a single token that will switch the endpoint from id-based to username-based.
It seems only profile page makes use of the endpoint with a username ?
GET /api/users/by-username/{value} is the nicest option so far – is it compatible with the JSON-API specification? If possible we'd like to follow that as closely as possible.
Not sure if this is the right place, but I think it's the closest we have:
Ugh, I just gave this great article a good read - turns out there are some more things we need to consider in terms of username validation. And now is the best time to do so, because we can still break backwards compatibility. :grin:
If you really think about breaking the backwards compatiblity please consider replacing the numeric IDs with something different - like UUIDs.
It´s no good idea to have a public available system and to be able to identify all users in a row by guessing IDs. Also you can guess admin-IDs (will be the first ones). like:
I could now write a script and grab all user ID´s & names. Currently this is not a huge problem - but consider flarum grows and the userpage will eventually someday show personal data like email-adress or real name or so....
That´s the reason why bigger sites switched from numeric IDs to random characters to not have their user base guessable by ID-probing.
Perhaps you could for now just disable access by ID? Because.... what is it good for?
Oh... and we really would later need a user-flag like login-error-count to disable after too many password tries.
I´ll lock down my installation with fail2ban and use an external auth provider... but currently there is no protection of flooding password guesses. Ok,... that is an indirect feature-request ;-)
+1 for disable member profile access by user ID link.
+1 for GET /api/users/by-username/{value}
I would prefer access by UUID instead of username... because even "admin" etc are all guessable...
@cljk I don't think this issue is about hiding usernames. You do want usernames to be searchable on a forum (in particular they are the only identification in the profile url). The question is just how to offer both id and username based queries...
I thought about this again and in my opinion the only good json:api way of doing this would be to use @tobscure suggestion and query GET /users?filter[username]=Toby when displaying the user profile. It doesn't have any limitation as all single-resource parameters (includes) can also be used with collections. Plus this doesn't create any additional endpoint.
@clarkwinkelmann Alright I'm sold let's go with that 👍
I actually prefer the new endpoint, as filters seem to be the wrong tool, when they will only ever return one result (at most). :see_no_evil:
I have been in doubt as well mostly for the same reason as @franzliedke mentions. Also using the filter on the users endpoint could have the following unwanted side effect;
users:
Result of the query will be two users not one.
Oh I forgot it was only partially matching. What about adding a new filter for exact matching to the existing endpoint ? /users?filter[exact-username]=Toby. This still sounds like a json:api compliant solution.
Wait, I didn't realize /users/by-username would do a partial match. Why would we need that for the user profile URL?
@franzliedke i think we're talking about the /users endpoint not the by-username one?
@clarkwinkelmann i'd like to propose the following then:
/users?filter[username]=Toby&filter[exact]=1
I don´t like the idea to present an API to NON-Admins that is able to SEARCH in a userbase. I find it problematic enough that the user-IDs ARE inremental (!) numeric and are guessable AND are presented in an API
Thanks god the API response doesn´t seem to contain email addresses. Then it would be a 10-liner to grab all email-adresses in you discussion-forum.
To come back to the issue.... which was "usernames cannot be numeric" ... okay ...
For what do you need the user-API? Admin or user? You need both lookups? Username and ID?
Why not just split-up then... an API for username and one for ID? Then numeric usernames would be no problem.
Using numeric IDs everywhere would definitely be the easiest I guess, but it's not as good looking :sparkles:
As the reason both need to be usable via the API: clean urls with username as key, and numeric ID for all other API calls to improve performance.
Regarding user enumeration I don't think numeric vs uuids vs whatever is relevant. As soon as user listing is enabled for guests or members, you can easily scan through every member and check their role anyway. And even if user listing is disabled you can also just crawl discussions via the API and check which users have posted something.
@cljk if you don't like the information the API exposes you can easily alter it via an extension :wink:
@clarkwinkelmann That has nothing to do with "liking an API" but with the problem of a possible data leak. And I don´t want to tweak a system to become secure. But I see... I´m the only one seeing this problem. So don´t care about me.
Even if we used uuids for user IDs, you could still mine users from a forum by going through all of its discussions/posts. Regardless, that is not the issue at hand - that can be discussed elsewhere.
The issue at hand is how to structure the API endpoint so that you can look up users by both their ID and their username.
Currently we have /api/users?filter[q]=tob to do partial username matching. This is used when you're using the forum's search box to search for a user, or when you type the @ character to mention someone. The q filter also allows gambits, just like for discussion searching.
As per the JSON-API spec, /api/users/[id] should look up a user by their ID and nothing else.
I still support /api/users?filter[username]=Toby as the endpoint to look up a user by their exact username. Note that this username filter would find an exact match, whereas the q filter can be used for partial matching. This solution is clearly compliant with the JSON-API spec.
Whereas /api/users/by-username/{value} is a bit less clear – the JSON-API spec has nothing to say about this. If we did GET /api/users/by-username/{value}, would we also need/want to support PATCH and DELETE on the same endpoint? I think it's cleaner to simply require a ?filter[username] lookup in the first place to retrieve the user ID, and then from there it's the same as for every other resource.
Good enough for me.
Just an observation from today:
If you view the profile of the user 010101 at discuss.flarum.org, you will see his posts as expected. The discussions and mentions are shown correctly as well. The given URL is https://discuss.flarum.org/u/010101. If you, however, reload this page the same URL shows the user profile of the user dkio whose user ID seems to be 10101.
As I understand it, the api call for the user profile must be different when called from within the application (URL is updated by the application) compared to loading the user profile URL directly.
If you view the profile of the user 010101 at discuss.flarum.org, you will see his posts as expected. The discussions and mentions are shown correctly as well. The given URL is https://discuss.flarum.org/u/010101. If you, however, reload this page the same URL shows the user profile of the user dkio whose user ID seems to be 10101.
I experienced something similar to this. If you search for 010101 via search and go to his profile, it will launch, but clicking on its profile picture via the discussion list will give the red error dialog.
@pflstr @TurboProgramming indeed that's the case. The behaviour you both mention was the reason for investigating the cause and this confirmed issue in the first place.
I just encountered this problem on my forum and found the discussion here. If I understand correctly, the proposed solution would be to disable the possibility to view a profile page using /u/{id}, correct? In that case, I guess the title of the issue should be updated
Maybe a good compromise is usernames cannot start with numeric symbols (thinking about variable name restrictions in programming languages)? That should be enough to efficiently distinguish numeric strings and text strings? (Looking at the whole string should be better, but more complicated to implement perhaps?)
Maybe a good compromise is usernames cannot start with numeric symbols (thinking about variable name restrictions in programming languages)? That should be enough to efficiently distinguish numeric strings and text strings? (Looking at the whole string should be better, but more complicated to implement perhaps?)
@luceos you removed needs-discussion, are we going with shangjiaxuan's idea?
This is the final solution: https://github.com/flarum/core/issues/1356#issuecomment-375854441
Clark's PR is a good solution for now.
Most helpful comment
There could be a separate endpoint to lookup by username ? like
/api/users/by-username/{value}? There's no real need for the same endpoint to work with both IDs and usernames ?Or tweak the existing endpoint to accept the username query with a different syntax. For example if you query
/api/users/username, it acts like the GET/PUT/DELETE endpoints (or just GET) except you also have to pass ausername/filter[username]GET/POST attribute to tell which username you look for. As long as it doesn't collide with the way you pass new values to PUT/DELETE it would be fine. With this solution you just have to blacklist a single token that will switch the endpoint from id-based to username-based.It seems only profile page makes use of the endpoint with a username ?