Hello,
I was trying to match some UTF-8 characters, but it doesn't seem to work.
/[[:alnum:]]/x.match "à" # => nil
It works in Ruby and seems compliant with PCRE. Is there a bug in Crystal, or is there a subtlety I didn't notice here?
Note: I also made a mistake that did not raise any error: I wrote /[class]]/. Maybe it should be mandatory to escape the ']', whether it appears inside or outside the class.
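For reference, here is a small sketch of how the stray ']' is interpreted. It is written in Ruby only because the regex literal syntax is shared; the PCRE documentation describes the same rule (a closing bracket on its own is not special), but Crystal's behavior is not verified here. The variable names are illustrative.

```ruby
# Patterns are built from strings; Ruby may still print a warning
# about the unescaped "]" on stderr, which does not affect matching.
stray   = Regexp.new("[class]]")    # class [class] followed by a literal "]"
escaped = Regexp.new("[class]\\]")  # same meaning, with the "]" escaped

stray_match   = stray.match("a]")    # matches "a]"
stray_alone   = stray.match("a")     # nil: the trailing "]" is required
escaped_match = escaped.match("a]")  # matches "a]"
```

So the pattern is accepted because it is valid, just not what was intended: it means "one of c, l, a, s, then a literal ]".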
This behavior is related to a missing option:
class Regex
  @[Flags]
  enum Options
    # ... existing options ...
    PCRE_UCP = 0x20000000
  end
end
But this option requires a PCRE build with UCP support.
The pcre brew formula on macOS contains the '--enable-unicode-properties' option, but for some reason PCRE_UCP failed. This needs more investigation.
See also https://gist.github.com/jweyrich/9803969
P.S.: Ruby works fine because it is built with a special Oniguruma branch (not PCRE).
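A quick sketch of what Ruby's engine does with the same input, for comparison. It assumes a UTF-8 string; the variable names are illustrative. In Ruby, POSIX bracket classes and \p{...} properties cover non-ASCII characters, while \w stays ASCII-only.

```ruby
s = "\u00E0"  # "à", written as an escape to avoid source-encoding issues

posix = /[[:alnum:]]/x.match(s)  # MatchData: POSIX brackets are Unicode-aware
prop  = /\p{Alnum}/.match(s)     # MatchData: Unicode property class
word  = /\w/.match(s)            # nil: \w only covers [a-zA-Z0-9_]
```

This is roughly the behavior that PCRE_UCP is meant to enable for PCRE's POSIX classes.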
Looks like somebody should replace PCRE with Onigmo.
I think we could at least provide Regex::Options::UCP so people can use it if necessary. We would just need to document that this option requires a compatible libpcre.