After https://github.com/FriendsOfPHP/PHP-CS-Fixer/pull/3272 was released, many of my CI builds on legacy apps started failing because the sources are not UTF-8, and the preg functions and/or this project's code raise exceptions on something like:
/**
* NON-UTF8 file ending with this char: à
*/
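For illustration, here is a minimal sketch of how a non-UTF-8 byte trips a preg call that uses the `/u` modifier. The pattern and the ISO-8859-1 encoding of the sample are assumptions picked for the example, not the exact call that fails inside PHP-CS-Fixer.

```php
<?php
// Assumption: the legacy file is ISO-8859-1, where "à" is the single byte 0xE0.
// On its own, 0xE0 is not a valid UTF-8 sequence.
$source = "/**\n * NON-UTF8 file ending with this char: \xE0\n */";

// With the /u modifier, PCRE validates the subject as UTF-8 before matching.
// On invalid input preg_match() returns false instead of 0 or 1.
$result = preg_match('/\s+/u', $source);

var_dump($result);                                   // bool(false)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)
```

Code that treats a `false` return from a preg function as a hard error will throw at this point.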
After https://github.com/FriendsOfPHP/PHP-CS-Fixer/issues/3291 and https://github.com/FriendsOfPHP/PHP-CS-Fixer/pull/3296 I expect this issue to occur more frequently.
The issue is not with PHP-CS-Fixer, since it's OK to expect UTF-8 sources, but I can't have CI builds failing on valid code either.
We can't change the files' encoding.
May I open a PR that adds code like https://github.com/FriendsOfPHP/PHP-CS-Fixer/blob/v2.8.3/src/Runner/FileFilterIterator.php#L70-L86 to skip, in a non-blocking way, files that can't be properly fixed (i.e. files that aren't UTF-8)?
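A rough sketch of the kind of check such a PR could rely on. `isValidUtf8()`, the file names, and the `$skipped` list are hypothetical and not part of PHP-CS-Fixer; how a skipped file would actually be reported by the runner is left out.

```php
<?php
/**
 * Hypothetical helper, not part of PHP-CS-Fixer.
 * Returns true when $content is a valid UTF-8 byte sequence.
 */
function isValidUtf8(string $content): bool
{
    // An empty pattern with /u matches any valid UTF-8 subject;
    // preg_match() returns false when the subject is not valid UTF-8.
    return 1 === preg_match('//u', $content);
}

// Hypothetical in-memory "files": the same text in UTF-8 and in ISO-8859-1.
$files = [
    'utf8.php'   => "<?php // caf\xC3\xA9\n",
    'latin1.php' => "<?php // caf\xE9\n",
];

$skipped = [];
foreach ($files as $name => $content) {
    if (!isValidUtf8($content)) {
        // Record the file instead of failing the whole run.
        $skipped[] = $name;
    }
}

var_dump($skipped);
```

Here only `latin1.php` ends up in `$skipped`; the UTF-8 file would go on to be fixed as usual.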
Skipping the whole file is a no-go.
Instead, focus on tracking down where the exception you are seeing is raised, and we can think about how to fix it rather than skipping the file.
I'll open a PR with a bunch of test cases soon.
I have no idea how we could fix sources in different charsets without rewriting a lot of code. I think we'll need something like WhitespacesAwareFixerInterface: a fixer would ask for, or be configured with, what it needs (when needed) in order to handle token contents in different charsets efficiently.
Starting with a PR that adds a failing case would be great. With that in hand, we will figure out how to handle it nicely.
Closing by merge of #3446