The trim filter corrupts a UTF-8 string when it is applied to the specific character "л":
{{ 'ж'|trim('»') }} – ok, outputs "ж"
{{ 'л'|trim('»') }} – not ok, outputs "�"
Here's a fiddle https://twigfiddle.com/z0thnd
Because » is a multibyte char (\xC2\xBB in UTF-8) and the parameter of trim is a char list, which PHP processes byte by byte.
And ж is \xD0\xB6 and л is \xD0\xBB.
So what you are doing is actually
<?php
// Double quotes are needed so that PHP interprets the \x escapes.
trim("\xD0\xB6", "\xC2\xBB"); // trim('ж', '»');
trim("\xD0\xBB", "\xC2\xBB"); // trim('л', '»');
which means any leading or trailing \xC2 and \xBB bytes would be removed. Resulting in
"\xD0\xB6" // trim('ж', '»') => 'ж'
"\xD0" // trim('л', '»') => invalid UTF-8 sequence
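The byte-level behaviour can be checked directly in PHP. This is a small sketch, assuming the file is saved as UTF-8; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literals hold UTF-8 bytes.
$bytes = [];
foreach (['ж', 'л', '»'] as $char) {
    // bin2hex() shows the raw bytes behind each character.
    $bytes[$char] = bin2hex($char);
    echo $char, ' = ', $bytes[$char], "\n";
}

// trim() removes any byte of the mask from both ends of the string.
$trimmedOk = trim('ж', '»');  // neither byte of ж is in the mask
$trimmedBad = trim('л', '»'); // the trailing \xBB byte of л is in the mask

echo bin2hex($trimmedOk), "\n";  // d0b6
echo bin2hex($trimmedBad), "\n"; // d0

// preg_match() with the /u modifier fails on invalid UTF-8 subjects,
// which makes it usable as a simple validity check.
$stillValid = preg_match('//u', $trimmedBad) === 1;
var_dump($stillValid); // bool(false)
```

Only л is affected here because its last byte happens to be \xBB, the same as the last byte of ».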
If you want to trim the actual » character, an mb_* or preg_* function should be used.
Not sure if there is a built-in Twig way to deal with it though.
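One possible preg_* approach is sketched below. The function name trim_utf8 is hypothetical (it is not part of PHP or Twig), and the sketch assumes both arguments are valid UTF-8.

```php
<?php
// Hypothetical helper, not part of PHP or Twig: strips whole UTF-8
// characters (not single bytes) from both ends of $text.
function trim_utf8(string $text, string $chars): string
{
    if ($chars === '') {
        return $text; // nothing to strip; also avoids an empty character class
    }
    // Escape the characters so they are safe inside a regex character class.
    $class = preg_quote($chars, '/');
    // The /u modifier makes PCRE treat pattern and subject as UTF-8,
    // so "»" is matched as one code point instead of two separate bytes.
    $pattern = '/\A[' . $class . ']+|[' . $class . ']+\z/u';
    $result = preg_replace($pattern, '', $text);
    // preg_replace() returns null on error (e.g. invalid UTF-8 input);
    // in that case the input is returned unchanged.
    return $result ?? $text;
}

echo trim_utf8('л', '»'), "\n";   // л
echo trim_utf8('»л»', '»'), "\n"; // л
echo trim_utf8('»ж»', '»'), "\n"; // ж
```

A function like this could presumably be exposed to templates as a custom Twig filter, but that wiring is not shown here.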
Reference:
Closing as @jfcherng gave the answer