I made a test database in the kolibri shell using a CSV of downloaded random names.
One name on there ended up being Alie L'estrange and I created the username by joining the names with underscores and lower-casing them.
So the username that I successfully saved to the database with FacilityUser.objects.create() was alie_l'estrange.
However, when I go to log in with that user I get an error about which characters are acceptable.
If we validate for formatting on the front-end we should validate for it when we save/create/update the model record itself to avoid saving invalid records to the DB.
I know we're working on CSV user import through the UI for an upcoming feature so we want to be sure that we're not able to save users with invalid usernames.
Open kolibri shell
from kolibri.core.auth.models import FacilityUser
FacilityUser.objects.create(fullname="whatever", username="something'invalid", facility=FacilityUser.objects.all()[0].facility)
develop (0.14.x)
Yeah, this validator will flag ' as invalid punctuation, which will prevent creating a a user like alie_l'estrange on the UI, but there is absolutely no such validation on the actual FacilityUser model
I'm not sure what the "right" answer here is.
I'm not a fan of arbitrary restrictions, but for usernames there's a slight benefit to restriction because it makes them easier to transcribe. Therefore my inclination is to make the server validation stricter to match client-side validation.
@jredrejo would you mind addressing this in the course of your PF work?
@indirectlylit "^[A-Za-z0-9-_ ]*$" is the validator I was using:, only letters, digits , _ and - allowed.
Actually, this is the complete code of the validator I have coded (it has some options to be used too when validating classes/groups names):
def valid_name(spaces=False, null=False):
def checker(v):
if null and v is None:
return checker
match = "^[A-Za-z0-9-_ ]*$" if spaces else "^[A-Za-z0-9-_]*$"
if not re.match(match, v):
raise ValueError(v)
return checker
Note that this validation is too strict as it prevents Unicode usernames, which are important for users in languages with non-Latin character sets.
Should use the regex here instead to allow Unicode usernames and be consistent with the frontend https://github.com/learningequality/kolibri/blob/develop/kolibri/core/assets/src/validators.js#L16
Should use the regex here instead to allow Unicode usernames and be consistent with the frontend https://github.com/learningequality/kolibri/blob/develop/kolibri/core/assets/src/validators.js#L16
Validator to be used with usernames and class names:
def valid_name(username=True, null=False):
def checker(v):
if null and v is None:
return checker
has_punc = "/[\s`~!@#$%^&*()\-+={}\[\]\|\\\/:;\"'<>,\.\?]/"
if not username:
has_punc = "/[`~!@#$%^&*()\+={}\[\]\|\\\/:;\"'<>\.\?]/"
if re.match(has_punc, v):
raise ValueError(v)
return checker
Sounds good to me to use the same validation for class names as for user names
Any reason to exclude punctuation from class names? If people want to call a class Wow! Science? A journey into the heart of Learning; who are we to stop them?
true
Any reason to exclude punctuation from class names? If people want to call a class
Wow! Science? A journey into the heart of Learning;who are we to stop them?
I agree. Then, for class names , should we set any limitation (appart from the maximum of 100 chars that's defined in the model) ?
@indirectlylit at least, comma should be banned if we need to import a list of classrooms in a csv field, unless we set a different separator.
I see what you're saying, but I feel like it should be the responsibility of the CSV importing to handle that edge case as opposed to pushing the responsibility to the validation.
Another option: we document a "known limitation" that CSV import does not support class names with commas?
I see what you're saying, but I feel like it should be the responsibility of the CSV importing to handle that edge case as opposed to pushing the responsibility to the validation.
Another option: we document a "known limitation" that CSV import does not support class names with commas?
Either that or limit to one classroom per person in CSV import, whatever is more practical for the case.
I don't think I've used it in any of the logic so far, but the Python csv module does define a Sniffer class (https://docs.python.org/3/library/csv.html#csv.Sniffer) that allows for determining the format of the CSV file - which would give us more flexibility in allowed delimiters, but it may be more trouble than it is worth.
Either that or limit to one classroom per person in CSV import, whatever is more practical for the case.
It's definitely more important that we allow multiple classes, than that we allow single classes with commas.
more flexibility in allowed delimiters
Even if we switched delimiters, that doesn't address the case being discussed here where commas are used to separate a list of classes _within_ a single column of the CSV, not to separate columns of the CSV. For example:
Username, Full name, User type, Assigned to teach class, Enrolled in class
teach4life,Mr. Rogers,Coach,"Geometry 1, Algebra",
student4now,Alice,Learner,,Algebra
@indirectlylit I think we can close this after #6674 and #6716 and their tests at #6798 are merged.
great, thank you