I'd be interested in taking this. I've contributed before, though, so if you'd prefer someone new to take it, I totally understand.
@DCtheTall -- no worries, please take it up.
I've been looking into this, and while I have an idea of where to start, it turns out the definition of xsd:anyURI strings is quite broad. This might be more complex than I anticipated.
This link also provides some useful documentation including what ranges of Unicode values are allowed for this type.
So far here are some "base case" rules I have learned:
xsd:anyURI strings ignore any leading or trailing whitespace, so the first thing we should do after asserting the input is a string is call trim().
An empty string is valid, so we check for that. As far as I can tell from the docs, there is no upper bound on the length.
We can then check that each character is in the ranges specified in the link above.
Any URL is valid, so we can use isURL to check for those.
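To make the base-case rules above concrete, here is a rough sketch of how they might fit together. The name `isAnyURISketch` and both injected callbacks are hypothetical, not existing validator functions. The allowed code point ranges come from the linked documentation and are deliberately not hard-coded here, and accepting a string once every character passes is my assumption, since the research is still open.

```javascript
// Hypothetical sketch of the base-case rules; not an existing validator API.
// `isAllowedChar` should encode the Unicode ranges from the linked docs
// (left as a caller-supplied placeholder here), and `isURL` would be the
// library's existing URL check.
function isAnyURISketch(input, options = {}) {
  const isAllowedChar = options.isAllowedChar || (() => true);
  const isURL = options.isURL || (() => false);

  if (typeof input !== 'string') {
    throw new TypeError('Expected a string');
  }

  // Leading and trailing whitespace is ignored. Note that trim() strips a
  // broader set of whitespace than XML's space, tab, CR and LF.
  const value = input.trim();

  // An empty string is valid, and there is no upper bound on length.
  if (value === '') return true;

  // Any URL is valid.
  if (isURL(value)) return true;

  // Otherwise check each code point against the allowed ranges.
  for (const ch of value) {
    if (!isAllowedChar(ch.codePointAt(0))) return false;
  }

  // Assumption: nothing beyond the character check is required.
  return true;
}
```

In the library itself, `isURL` would presumably be the existing validator, and `isAllowedChar` would be replaced by a real range check once the ranges are pinned down.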
Still doing some research, but I would definitely appreciate any pointers. Like I said, it turns out the definition for these strings is actually pretty broad.
One alternative we could consider is adding a function which validates Internationalized Resource Identifiers (IRIs), which are also mentioned in the link I provided.
@profnandaa any ideas on how to go forward with this? I admit I am a bit stuck given how broad the definition of this string type is.
@DCtheTall -- I'm very sorry I missed this reply... Perhaps we could just add an enhancement/option on isURL to support this?
Hi @profnandaa, I am new here and I see this hasn't been resolved yet. Could I try to fix it? I will use the suggestions already mentioned.