Njs: String.prototype.split() fails to split a Unicode string correctly

Created on 14 Feb 2019 · 3 Comments · Source: nginx/njs

Hi!

>> '\u0431\u0435,\u043b\u0438,\u0431\u0435\u0440,\u0434\u0430'.split(',')
[
 'бе',
 'ли',
 'бер',
 'да'
]
>> '\u0431\u0435,\u043b\u0438,\u0431\u0435\u0440,\u0434\u0430'.split('')
[
 '�',
 '�',
 '�',
 '�',
 ',',
 '�',
 '�',
 '�',
 '�',
 ',',
 '�',
 '�',
 '�',
 '�',
 '�',
 '�',
 ',',
 '�',
 '�',
 '�',
 '�'
]
>>

bug


drsm

All 3 comments

Since our Unicode strings are not UTF-16 strings, I think we should not break surrogate pairs here, which is what a literal reading of the spec would require.
More on this:
https://stackoverflow.com/questions/4547609/how-do-you-get-a-string-to-a-character-array-in-javascript/34717402#34717402
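For reference, a minimal sketch of the behaviour the comment above alludes to, assuming a spec-conforming engine with UTF-16 strings (for example Node.js) rather than njs itself. The variable names are illustrative only. It shows that split('') cuts between code units, while the string iterator used by Array.from keeps a surrogate pair together:

```javascript
// Assumption: this runs in an engine whose strings are UTF-16 (e.g. Node.js),
// not in njs, which stores strings differently.
const thumbsUp = '\u{1F44D}'; // one code point, encoded as two UTF-16 code units

// split('') separates UTF-16 code units, so the surrogate pair is broken apart.
const codeUnits = thumbsUp.split('');

// Array.from uses the string iterator, which walks code points instead.
const codePoints = Array.from(thumbsUp);

console.log(codeUnits.length);  // 2
console.log(codePoints.length); // 1
```

The patched njs output later in this thread matches the code-point behaviour, not the code-unit one.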

drsm on 14 Feb 2019

@drsm Thank you for the report.

Please try the patch below:
https://gist.github.com/xeioex/35d9cc06fb9559ca32ce1e085c7f2d92


>> 'αβγ'.split('')
[
 'α',
 'β',
 'γ'
]

>> '囲碁織'.split('')
[
 '囲',
 '碁',
 '織'
]

>> '𝟘𝟙𝟚𝟛'.split('')
[
 '𝟘',
 '𝟙',
 '𝟚',
 '𝟛'
]

>> 'яαяαяα'.split('α')
[
 'я',
 'я',
 'я',
 ''
]

xeioex on 14 Feb 2019


@xeioex
The patch works fine for me, thanks!

>> 'фыва asdf 👍'.split('')
[
 'ф',
 'ы',
 'в',
 'а',
 ' ',
 'a',
 's',
 'd',
 'f',
 ' ',
 '👍'
]

drsm on 14 Feb 2019

