Hi, shouldn't the language code for Serbian be sr rather than rs (according to ISO 639-1)?
And since Serbian can be written in either the Cyrillic or the Latin alphabet, is there any need to distinguish between the two writing systems in spaCy (e.g. sr-Cyrl for Serbian Cyrillic and sr-Latn for Serbian Latin, using the ISO 15924 script codes) to make things clearer? The newly added stop words for Serbian in #4078 use the Cyrillic alphabet.
Similar issues: #2339 (Norwegian Bokmål and Norwegian Nynorsk), #1308, Simplified Chinese and Traditional Chinese
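To illustrate what distinguishing the two scripts could look like, here is a minimal sketch using only the Python standard library. It is not part of spaCy: `serbian_script_tag` is a hypothetical helper, and the majority-vote rule is just an assumption for the example.

```python
import unicodedata


def serbian_script_tag(text: str) -> str:
    """Guess a tag for Serbian text: 'sr-Cyrl', 'sr-Latn', or plain 'sr'.

    Hypothetical helper for illustration only, not a spaCy API.
    """
    cyrillic = 0
    latin = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        # Unicode character names start with the script name,
        # e.g. "CYRILLIC CAPITAL LETTER DE" or "LATIN SMALL LETTER A".
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            cyrillic += 1
        elif name.startswith("LATIN"):
            latin += 1
    if cyrillic > latin:
        return "sr-Cyrl"
    if latin > cyrillic:
        return "sr-Latn"
    # No letters, or an even mix: fall back to the bare language code.
    return "sr"


print(serbian_script_tag("Добар дан"))
print(serbian_script_tag("Dobar dan"))
```

A check like this only looks at the script, so it can't tell Serbian apart from other languages written in the same alphabet.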
Hm, I think you're right, it should be "sr" (in fact @ines got it right with the issue labels!)
This is only a recent addition (cf. PR #4078), so we can probably still change this without breaking too many people's code.
CC @Pavle992
I agree that it should be sr; I was using the usual abbreviation for the Republic of Serbia. I will create a new update ASAP.
Thanks for fixing this so quickly with PR #4159!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.