Related to #1195 and #3361 / Found with https://github.com/chocolatey/choco/issues/1225.
Following up on an issue we had addressed locally, we found another, possibly more serious issue. If you create a UTF-8 (without BOM) file and sign it, all is well. When you add a Unicode character, such as a ©, then sign and run the file, it does not work. Remove the Unicode character and it works again. Convert the file to UTF-8 with BOM and it works.
For reference, the error is 'The contents of file filepath may have been tampered because the hash of the file does not match the hash stored in the digital signature.' or 'The contents of file filepath might have been changed by an unauthorized user or process, because the hash of the file does not match the hash stored in the digital signature.' (different versions of PowerShell)
UTF8FileWithNoUnicode.ps1:
Set-ExecutionPolicy AllSigned
Get-AuthenticodeSignature .\UTF8FileWithNoUnicode.ps1
.\UTF8FileWithNoUnicode.ps1

UTF8FileWithUnicode.ps1:
Set-ExecutionPolicy AllSigned
Get-AuthenticodeSignature .\UTF8FileWithUnicode.ps1
.\UTF8FileWithUnicode.ps1

UTF8BOMFileWithUnicode.ps1:
Set-ExecutionPolicy AllSigned
Get-AuthenticodeSignature .\UTF8BOMFileWithUnicode.ps1
.\UTF8BOMFileWithUnicode.ps1

All scenarios should work. It should run the script as it has not been modified since it was signed.
It fails the UTF8 (no BOM) with Unicode scenario because it believes the file has been modified.
Seems like the validation incorrectly treats UTF-8 without a BOM as ASCII and fails when it encounters the Unicode character.
That's the conclusion I came to as well.
Encoding is always so much fun...
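The suspicion voiced above (a no-BOM UTF-8 file being read as a single-byte encoding) can be illustrated with a minimal Python sketch. This is not how PowerShell is known to read files; the fallback to `cp1252`, the helper name `decode_like_legacy_reader`, and the script bodies are all assumptions made for illustration. It only shows why the three test files could behave differently under such a reader.

```python
import codecs

# Hypothetical script bodies; the real test files are not reproduced here.
ASCII_SCRIPT = "Write-Output 'Copyright (c) Example'\n"
UNICODE_SCRIPT = "Write-Output 'Copyright \u00a9 Example'\n"  # contains U+00A9


def decode_like_legacy_reader(data: bytes) -> str:
    """Honor a UTF-8 BOM if present; otherwise ASSUME a legacy code page.

    cp1252 is an assumption standing in for "whatever single-byte
    encoding a reader falls back to when there is no BOM".
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode("cp1252")


# The three scenarios from the report, as raw bytes on disk.
files = {
    "UTF8FileWithNoUnicode.ps1": (ASCII_SCRIPT, ASCII_SCRIPT.encode("utf-8")),
    "UTF8FileWithUnicode.ps1": (UNICODE_SCRIPT, UNICODE_SCRIPT.encode("utf-8")),
    "UTF8BOMFileWithUnicode.ps1": (
        UNICODE_SCRIPT,
        codecs.BOM_UTF8 + UNICODE_SCRIPT.encode("utf-8"),
    ),
}

# True means the reader recovered exactly the text the author wrote.
results = {
    name: decode_like_legacy_reader(data) == intended
    for name, (intended, data) in files.items()
}

for name, ok in results.items():
    print(f"{name}: decoded as intended = {ok}")
```

Under these assumptions only the no-BOM file with the non-ASCII character is misread: its two UTF-8 bytes `C2 A9` come back as two separate characters. The ASCII-only file survives because ASCII bytes mean the same thing in both encodings, and the BOM file survives because the BOM tells the reader which encoding to use.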
Continuing our conversation from here:
Arguably, there's a coupling here that shouldn't exist (I hope I have the big picture right - do tell me if I'm wrong):
_Signing_ is (commendably) character-encoding agnostic and is purely based on the script file's _bytes_.
By contrast, _verifying_ the signature seems to rely on the engine already having interpreted the character encoding of the script correctly.
I don't know what the performance implications are, but if _signing_ is purely byte sequence-based, so should _verifying_ be.
On the flip side, you could consider the current behavior a blessing: that the verification breaks is indirectly telling you that the script's encoding is being misinterpreted - though that would only be helpful if the error message specifically indicated that condition, and I'm not sure if that could be distinguished from, say, actual tampering.
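The coupling described above can be sketched in Python. This is a toy model, not Authenticode: a plain SHA-256 digest stands in for the signature, and the function names, the `cp1252` fallback, and the script body are assumptions. It contrasts a check over the raw bytes with a check that first passes the file through a (possibly wrong) text decoding.

```python
import hashlib


def digest_of_bytes(data: bytes) -> str:
    """Digest computed directly over the file's bytes (encoding-agnostic)."""
    return hashlib.sha256(data).hexdigest()


def digest_after_text_round_trip(data: bytes, assumed_encoding: str) -> str:
    """Digest computed after decoding with an assumed encoding and
    re-encoding as UTF-8. Only matches if the assumption was right."""
    text = data.decode(assumed_encoding)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical UTF-8 (no BOM) script containing U+00A9.
script_bytes = "Write-Output 'Copyright \u00a9 Example'\n".encode("utf-8")

# "Signing": record a digest of the bytes as they are on disk.
signed_digest = digest_of_bytes(script_bytes)

# "Verifying" purely on bytes always agrees with signing.
byte_based_ok = digest_of_bytes(script_bytes) == signed_digest

# "Verifying" through a wrong encoding guess changes the bytes being hashed.
misread_ok = digest_after_text_round_trip(script_bytes, "cp1252") == signed_digest

# With the right encoding the round trip is lossless and the digests agree.
correct_read_ok = digest_after_text_round_trip(script_bytes, "utf-8") == signed_digest

# Actual tampering also yields a plain mismatch, with nothing to tell it
# apart from the misread case above.
tampered_ok = digest_of_bytes(script_bytes + b"# injected\n") == signed_digest

print(byte_based_ok, misread_ok, correct_read_ok, tampered_ok)
```

In this model the misread file and the tampered file both simply fail the comparison, which is why a hash mismatch alone could not tell the user that the encoding, rather than the content, is the problem.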
I had an internal customer report an almost identical issue. With the same character.