Platformio-core: 'ascii' codec can't decode byte

Created on 16 Jul 2019  ·  64 Comments  ·  Source: platformio/platformio-core

'ascii' codec can't decode byte 0xe1 in position 24: ordinal not in range(128)

help wanted home

All 64 comments

Do you use the latest PIO Core 4.0?

pio --version

Yes, it's PlatformIO, version 4.0.0.
It happens on the PIO Home screen when it loads the recent projects.





What is your locale? Operating system?

I'm using OSX 10.14.5 with the Korean locale.





@ychoquet Please open PlatformIO IDE Terminal (see icon in the status bar) and type:

python -c "import sys; print(sys.getdefaultencoding(), sys.getfilesystemencoding())"

Please provide output.

python -c "import sys; print(sys.getdefaultencoding(), sys.getfilesystemencoding())"
('ascii', 'utf-8')
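
For context: an 'ascii' default encoding is what Python 2 reports, while Python 3 always reports 'utf-8' there. The two values above are also only part of the picture. A minimal Python 3 sketch (not part of PlatformIO; the helper name report_encodings is made up for illustration) that additionally prints the locale's preferred encoding, which open() normally uses when no encoding argument is given:

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python uses for implicit text conversions."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        # open() without an explicit encoding normally falls back to this value
        "locale_preferred": locale.getpreferredencoding(False),
        # may be None when stdout was replaced by an object without .encoding
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in report_encodings().items():
        print("%-17s %s" % (name, value))
```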

@ychoquet What is your OS?

OSX 10.13.6

Do you see any non-ASCII characters in the recentProjects section of ~/.platformio/homestate.json?

Or, could you send me this file to [email protected]? Thanks!

There is a French character, é, in it. If I delete the project name that contains the French character, that solves the problem.

Thanks!

Stop! :) This is a bug. I need this file with this character. Could you send me a broken homestate.json to [email protected]? Thanks!

Done!
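
As an illustration of why a single é can break loading, here is a minimal sketch. The JSON layout and the project path are invented for the example and are not the real homestate.json schema.

```python
import json

# Hypothetical state: one recent project whose name contains "é".
state = {"recentProjects": ["/Users/me/projets/Température"]}

# ensure_ascii=False keeps "é" as a real character, so the UTF-8 bytes
# written to disk contain values above 0x7F.
raw = json.dumps(state, ensure_ascii=False).encode("utf-8")

try:
    raw.decode("ascii")  # what an ASCII-only reader effectively does
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the encoding the data was written in works.
loaded = json.loads(raw.decode("utf-8"))
```

Deleting the project entry only hides the problem; the reader still assumes ASCII.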

I seem to have the same issue. It seems there are some comments in Russian in a library that I added to the project, and now it won't build.
After the line 'Dependency Graph' I get this:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 20-25: ordinal not in range(128):
  File "C:\users\vladi\.platformio\penv\lib\site-packages\platformio\builder\main.py", line 126:
    env.SConscript("$BUILD_SCRIPT")
  File "C:\Users\vladi\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 541:
    return _SConscript(self.fs, *files, **subst_kw)
  File "C:\Users\vladi\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 250:
    exec _file_ in call_stack[-1].globals
  File "C:\users\vladi\.platformio\platforms\atmelavr\builder\main.py", line 155:
    target_elf = env.BuildProgram()
  File "C:\Users\vladi\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\users\vladi\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 122:
    _build_project_deps(env)
  File "C:\users\vladi\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 47:
    project_lib_builder = env.ConfigureProjectLibBuilder()
  File "C:\Users\vladi\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\users\vladi\.platformio\penv\lib\site-packages\platformio\builder\tools\piolib.py", line 1056:
    _print_deps_tree(project)
  File "C:\users\vladi\.platformio\penv\lib\site-packages\platformio\builder\tools\piolib.py", line 1025:
    sys.stdout.write("%s|-- %s" % (margin, title))
  File "C:\Users\vladi\.platformio\packages\tool-scons\script\..\engine\SCons\Util.py", line 1323:
    self.file.write(arg)
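
The last frames show that the failure happens when the dependency tree is written to stdout, not when the library is compiled. A minimal sketch of that failure mode in Python 3, using an in-memory stream in place of the real console (the title string is made up for the example):

```python
import io

title = "Библиотека <1.0.0>"  # made-up dependency title with Cyrillic text

# A text stream that can only encode ASCII, similar to a console whose
# encoding was detected as "ascii".
ascii_out = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
try:
    ascii_out.write("|-- %s" % title)
    ascii_out.flush()
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# The same write succeeds once the stream uses UTF-8.
utf8_buffer = io.BytesIO()
utf8_out = io.TextIOWrapper(utf8_buffer, encoding="utf-8")
utf8_out.write("|-- %s" % title)
utf8_out.flush()
```

Setting the PYTHONIOENCODING environment variable to utf-8 before Python starts is one standard way to force the stream encoding; whether that helps in a given PlatformIO setup is untested here.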

@vovane Did you set UTF-8 encoding for this file with Russian comments? You can try the following:
1) Close VSCode.
2) Uninstall all Python versions from the system.
3) Remove the ~/.platformio folder, which is located in the user home directory.
4) Install the latest Python 3.7.
5) Open VSCode.

Now everything should work.

@ivankravets Thanks, but when I did everything as you instructed nothing has changed. I indeed had Python 2.7 installed, which I deleted and later installed the latest Python 3.7, and the error's still there. Anything else I can do?

I had to re-install PlatformIO and chose to install Python 3.8, and now I'm hit with a similar UnicodeEncodeError too, but I have no idea which file I should look at.

PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 2.20502.0 (2.5.2), tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1671335-1671336: character maps to <undefined>:
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\main.py", line 126:
    env.SConscript("$BUILD_SCRIPT")
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 605:
    return _SConscript(self.fs, *files, **subst_kw)
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "C:\users\gijs\.platformio\platforms\espressif8266\builder\main.py", line 203:
    target_elf = env.BuildProgram()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 110:
    env.BuildFrameworks(env.get("PIOFRAMEWORK"))
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 288:
    env.ConvertInoToCpp()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 198:
    out_file = c.convert(ino_nodes)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 57:
    return self.process(contents)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 80:
    assert self._gcc_preprocess(contents, out_file)
  File "C:\users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 90:
    fp.write(contents)
  File "c:\users\gijs\.platformio\penv\lib\encodings\cp1252.py", line 19:
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]

I'm trying to build my own ESPEasy project in PlatformIO for Windows.
The same code used to build just fine last night when I tried to build it.

N.B. The re-install was needed because VSCode could no longer find Python after it was restarted.

Does it work with Python 3.7?

What is the encoding of your INO file? Is it UTF-8? Does this INO file contain non-ASCII characters?

The bottom bar of VScode does state it is UTF-8, with Windows line endings.
How can I quickly check if any file has non-ASCII characters?
It should not be so, but it could be.

Have not yet checked with Python 3.7.5.

Could you attach an archive of the PlatformIO project here? I'll try to reproduce it.

I do see some non-ASCII characters, but it looks like they are mostly (only?) in comments.
I used this in Ubuntu on Windows:

 grep --color='auto' -P -n '[^\x00-\x7F]' -R *
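
For anyone without grep at hand, roughly the same search can be sketched in Python. The function name find_non_ascii and the default .ino suffix filter are choices made for this example.

```python
import os


def find_non_ascii(root, suffixes=(".ino",)):
    """Yield (path, line_number, line_bytes) for lines with bytes >= 0x80."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes so that no decoding can fail during the scan.
            with open(path, "rb") as fp:
                for lineno, line in enumerate(fp, start=1):
                    if any(byte > 0x7F for byte in line):
                        yield path, lineno, line.rstrip(b"\r\n")


if __name__ == "__main__":
    for path, lineno, line in find_non_ascii("src"):
        print("%s:%d: %r" % (path, lineno, line))
```

Because the files are opened in binary mode, the scan itself cannot raise a decode error, whatever the system locale is.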

I made a RAR from the project. ESPEasy_platformIO_issue2796.rar

Do you still want me to test with Python 3.7.x, or is it clearly a bug independent of the Python version?

We are testing it now. I'll be back with updates soon.

Does it appear to be a bigger problem than it first looked to be, or did it raise other issues as well?

@valeros tried different examples yesterday and couldn't reproduce this issue. Let's see whether someone else runs into the same issue.

Someone else on our forum did report the same issue yesterday. ESPEasy Forum - Custom build error VS-code and atom

Edit:
Maybe extending the reported error would also be helpful, for example by echoing the line with the character and/or the file name.
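
A sketch of what such an extended error report could look like. This is not PlatformIO code; describe_decode_error and read_text are hypothetical helpers that rely only on the standard attributes of UnicodeDecodeError (object, start, encoding).

```python
def describe_decode_error(path, exc):
    """Turn a UnicodeDecodeError into a message with file name and line."""
    data = exc.object  # the bytes that failed to decode
    lineno = data.count(b"\n", 0, exc.start) + 1
    line_start = data.rfind(b"\n", 0, exc.start) + 1
    line_end = data.find(b"\n", exc.start)
    if line_end == -1:
        line_end = len(data)
    line = data[line_start:line_end].rstrip(b"\r")
    return "%s:%d: cannot decode byte 0x%02x as %s: %r" % (
        path, lineno, data[exc.start], exc.encoding, line)


def read_text(path, encoding="utf-8"):
    """Read a file, re-raising decode errors with a more helpful message."""
    # Reading the whole file as bytes keeps exc.start relative to the file.
    with open(path, "rb") as fp:
        raw = fp.read()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        raise ValueError(describe_decode_error(path, exc)) from exc
```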

@TD-er I tried all environments from your project on Win8, Win10 with Python 3.7, 3.8 and still cannot reproduce the issue. I see it fails with esp8266 platform, what environment exactly?

I tried a number of them, just to be sure the problem was in my code and not in the latest esp8266/Arduino code.

The ones I tried for sure are:

  • custom_ESP8266_4M1M uses a newer core lib
  • normal_core_241_ESP8266_4M1M uses an older core

@TD-er Could I ask you to run the following script:

from os import listdir
from os.path import join

import chardet

PROJECT_SRC_DIR = "e:\\Temp\\espeasy\\src"


def is_ascii(s):
    return all(ord(c) < 128 for c in s)


for f in listdir(PROJECT_SRC_DIR):
    if not f.endswith(".ino"):
        continue
    print("File: %s" % f)
    with open(join(PROJECT_SRC_DIR, f)) as fp:
        data = fp.read()
        encoding = chardet.detect(str.encode(data))['encoding']
        print("Encoding: %s" % encoding)
        if encoding == "ascii" or encoding is None:
            assert(is_ascii(data))

It depends on a special package, so you need to install it: pip install chardet
and then you can run python script.py (change PROJECT_SRC_DIR to your path)

I have removed Python 2.7.16 again, installed Python 3.8.0 (64 bit) and tested with the custom env I named before.
It still does not build. => check.

The output of your script:

PS C:\GitHub\TD-er\ESPEasy> pip install chardet
Requirement already satisfied: chardet in c:\users\gijs\.platformio\penv\lib\site-packages (3.0.4)
WARNING: You are using pip version 19.2.3, however version 19.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
PS C:\GitHub\TD-er\ESPEasy> python .\testissue_2796.py
File: Command.ino, encoding: ascii
File: Controller.ino, encoding: ascii
File: Convert.ino, encoding: ascii
File: ESPEasy.ino, encoding: utf-8
File: ESPeasyControllerCache.ino, encoding: ascii
File: ESPeasyGPIO.ino, encoding: None
File: ESPEasyRTC.ino, encoding: ascii
File: ESPEasyRules.ino, encoding: ascii
File: ESPEasyStatistics.ino, encoding: ascii
File: ESPEasyStorage.ino, encoding: ascii
File: ESPEasyWifi.ino, encoding: ascii
File: ESPEasyWiFi_credentials.ino, encoding: ascii
File: ESPEasyWifi_ProcessEvent.ino, encoding: ascii
File: ESPEasy_checks.ino, encoding: ascii
File: ESPEasy_Log.ino, encoding: ascii
File: Hardware.ino, encoding: ascii
File: I2C.ino, encoding: ascii
File: Misc.ino, encoding: ascii
File: Modbus.ino, encoding: ascii
File: Modbus_RTU.ino, encoding: ascii
File: Networking.ino, encoding: ascii
File: Scheduler.ino, encoding: ascii
File: Serial.ino, encoding: ascii
File: StringConverter.ino, encoding: ascii
File: StringProvider.ino, encoding: ascii
File: TimeESPeasy.ino, encoding: ascii
File: TimeZoneESPeasy.ino, encoding: ascii
File: WebServer.ino, encoding: ascii
File: WebServer_404.ino, encoding: ascii
File: WebServer_AccessControl.ino, encoding: ascii
File: WebServer_AdvancedConfigPage.ino, encoding: ascii
File: WebServer_CacheControllerPages.ino, encoding: ascii
File: WebServer_ConfigPage.ino, encoding: ascii
File: WebServer_ControllerPage.ino, encoding: ascii
File: WebServer_ControlPage.ino, encoding: ascii
File: WebServer_CustomPage.ino, encoding: ascii
File: WebServer_DevicesPage.ino, encoding: ascii
File: WebServer_DownloadPage.ino, encoding: ascii
File: WebServer_FactoryResetPage.ino, encoding: ascii
File: WebServer_Favicon.ino, encoding: ascii
File: WebServer_FileList.ino, encoding: ascii
File: WebServer_HardwarePage.ino, encoding: ascii
File: WebServer_HTML_wrappers.ino, encoding: ascii
File: WebServer_I2C_Scanner.ino, encoding: ascii
File: WebServer_JSON.ino, encoding: ascii
File: WebServer_LoadFromFS.ino, encoding: ascii
File: WebServer_Log.ino, encoding: ascii
File: WebServer_login.ino, encoding: ascii
File: WebServer_Markup.ino, encoding: ascii
File: WebServer_Markup_Buttons.ino, encoding: ascii
File: WebServer_Markup_Forms.ino, encoding: ascii
File: WebServer_NotificationPage.ino, encoding: ascii
File: WebServer_PinStates.ino, encoding: ascii
File: WebServer_RootPage.ino, encoding: ascii
File: WebServer_Rules.ino, encoding: ascii
File: WebServer_SettingsArchive.ino, encoding: ascii
File: WebServer_SetupPage.ino, encoding: ascii
File: WebServer_SysInfoPage.ino, encoding: ascii
File: WebServer_SysVarPage.ino, encoding: ascii
File: WebServer_TimingStats.ino, encoding: ascii
File: WebServer_ToolsPage.ino, encoding: ascii
File: WebServer_UploadPage.ino, encoding: ascii
File: WebServer_WiFiScanner.ino, encoding: ascii
File: _C001.ino, encoding: ascii
File: _C002.ino, encoding: ascii
File: _C003.ino, encoding: ascii
File: _C004.ino, encoding: ascii
File: _C005.ino, encoding: ascii
File: _C006.ino, encoding: ascii
File: _C007.ino, encoding: ascii
File: _C008.ino, encoding: ascii
File: _C009.ino, encoding: ascii
File: _C010.ino, encoding: ascii
File: _C011.ino, encoding: ascii
File: _C012.ino, encoding: ascii
File: _C013.ino, encoding: ascii
File: _C014.ino, encoding: utf-8
File: _C015.ino, encoding: utf-8
File: _C016.ino, encoding: ascii
File: _C017.ino, encoding: ascii
File: _C018.ino, encoding: ascii
File: _C019.ino, encoding: ascii
File: _CPlugin_DomoticzHelper.ino, encoding: ascii
File: _CPlugin_Helper_webform.ino, encoding: ascii
File: _CPlugin_LoRa_TTN_helper.ino, encoding: ascii
File: _CPlugin_SensorTypeHelper.ino, encoding: ascii
File: _N001_Email.ino, encoding: ascii
File: _N002_Buzzer.ino, encoding: ascii
File: _P001_Switch.ino, encoding: ascii
File: _P002_ADC.ino, encoding: ascii
File: _P003_Pulse.ino, encoding: ascii
File: _P004_Dallas.ino, encoding: ascii
File: _P005_DHT.ino, encoding: ascii
File: _P006_BMP085.ino, encoding: ascii
File: _P007_PCF8591.ino, encoding: ascii
File: _P008_RFID.ino, encoding: ascii
File: _P009_MCP.ino, encoding: ascii
File: _P010_BH1750.ino, encoding: ascii
File: _P011_PME.ino, encoding: ascii
File: _P012_LCD.ino, encoding: ascii
File: _P013_HCSR04.ino, encoding: ascii
File: _P014_SI7021.ino, encoding: ascii
File: _P015_TSL2561.ino, encoding: ascii
File: _P016_IR.ino, encoding: ascii
File: _P017_PN532.ino, encoding: ascii
File: _P018_Dust.ino, encoding: ascii
File: _P019_PCF8574.ino, encoding: ascii
File: _P020_Ser2Net.ino, encoding: ascii
File: _P021_Level.ino, encoding: ascii
File: _P022_PCA9685.ino, encoding: ascii
File: _P023_OLED.ino, encoding: ascii
File: _P024_MLX90614.ino, encoding: ascii
File: _P025_ADS1115.ino, encoding: ascii
File: _P026_Sysinfo.ino, encoding: ascii
File: _P027_INA219.ino, encoding: ascii
File: _P028_BME280.ino, encoding: ascii
File: _P029_Output.ino, encoding: ascii
File: _P030_BMP280.ino, encoding: ascii
File: _P031_SHT1X.ino, encoding: ascii
File: _P032_MS5611.ino, encoding: ascii
File: _P033_Dummy.ino, encoding: ascii
File: _P034_DHT12.ino, encoding: ascii
File: _P035_IRTX.ino, encoding: ascii
File: _P036_FrameOLED.ino, encoding: ascii
File: _P037_MQTTImport.ino, encoding: ascii
File: _P038_NeoPixel.ino, encoding: utf-8
File: _P039_Thermocouple.ino, encoding: utf-8
File: _P040_ID12.ino, encoding: ascii
File: _P041_NeoClock.ino, encoding: ascii
File: _P042_Candle.ino, encoding: utf-8
File: _P043_ClkOutput.ino, encoding: ascii
File: _P044_P1WifiGateway.ino, encoding: ascii
File: _P045_MPU6050.ino, encoding: ascii
File: _P046_VentusW266.ino, encoding: ascii
File: _P047_i2c-soil-moisture-sensor.ino, encoding: ascii
File: _P048_Motorshield_v2.ino, encoding: ascii
File: _P049_MHZ19.ino, encoding: ascii
File: _P050_TCS34725.ino, encoding: ascii
File: _P051_AM2320.ino, encoding: ascii
Traceback (most recent call last):
  File ".\testissue_2796.py", line 18, in <module>
    data = fp.read()
  File "C:\Users\gijs\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2990: character maps to <undefined>
PS C:\GitHub\TD-er\ESPEasy> 

I updated the script, could you please run it again?

Found the culprit in _P052_SenseAir.ino, line 66:

#define P052_EEPROM_ADDR_LOGGER_STRUCTURE_ADDRESS  0x200 // 16b Described in “BLG_ELG Logger Structure”

These quotes are non-ASCII.
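
The byte in the traceback matches these quotes. A short sketch of the arithmetic, using only the standard codecs: the closing quote U+201D is stored as the UTF-8 bytes E2 80 9D, and 0x9D is one of the few bytes that Windows code page 1252 leaves undefined, so decoding the file with cp1252 fails at that byte.

```python
closing_quote = "\u201d"  # the right double quotation mark
utf8_bytes = closing_quote.encode("utf-8")

try:
    utf8_bytes.decode("cp1252")  # what a cp1252 default effectively does
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = exc

# Decoding with the encoding the file was saved in round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")
```

The opening quote U+201C ends in byte 0x9C, which cp1252 does define, so it is the closing quote that triggers the error.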

Still, this part of the code has been there for a while already.

What is the encoding of this file?

Notepad++ as well as VSCode claim it has UTF-8 encoding.

The updated script fails on the same file.

File: _P047_i2c-soil-moisture-sensor.ino
Encoding: ascii
File: _P048_Motorshield_v2.ino
Encoding: ascii
File: _P049_MHZ19.ino
Encoding: ascii
File: _P050_TCS34725.ino
Encoding: ascii
File: _P051_AM2320.ino
Encoding: ascii
File: _P052_SenseAir.ino
Traceback (most recent call last):
  File ".\testissue_2796.py", line 17, in <module>
    data = fp.read()
  File "C:\Users\gijs\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2990: character maps to <undefined>
PS C:\GitHub\TD-er\ESPEasy> 

@TD-er Thanks, could you please try this one:

from os import listdir
from os.path import join
import chardet

PROJECT_SRC_DIR = "e:\\Temp\\espeasy\\src"


def get_file_contents(path):
    try:
        with open(path) as f:
            return f.read()
    except UnicodeDecodeError:
        with open(path, encoding="latin-1") as f:
            return f.read()


def is_ascii(s):
    return all(ord(c) < 128 for c in s)


for f in listdir(PROJECT_SRC_DIR):
    if not f.endswith(".ino"):
        continue

    data = get_file_contents(join(PROJECT_SRC_DIR, f))

    encoding = chardet.detect(str.encode(data))['encoding']
    print("Encoding: %s" % encoding)
    if encoding == "ascii" or encoding is None:
        assert(is_ascii(data))

The file _P051_AM2320.ino has ASCII encoding according to the script, but VScode claims it is UTF-8.

Another difference between P051 and P052 is that 051 uses LF and 052 uses CRLF.

If I move the "bad line" of code from P052 to P051 file, then the script does crash on that file too.
So it really does seem to be the special quotes in the comment.

> has ASCII encoding according

Yes, this is actually a bug in PlatformIO. We read the file in ASCII if it fails, but save it in UTF-8. Let me push a fix.
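
A minimal sketch of the general remedy: pass an explicit encoding both when reading and when writing, so the result no longer depends on the system locale. This is not the actual PlatformIO patch; the helper names and the Latin-1 fallback are assumptions made for illustration (the fallback mirrors the diagnostic script above).

```python
import io


def get_file_contents(path, encoding="utf-8"):
    """Read text as UTF-8, falling back to Latin-1 for undecodable files."""
    try:
        with io.open(path, encoding=encoding) as fp:
            return fp.read()
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail,
        # but files in other encodings will come back as mojibake.
        with io.open(path, encoding="latin-1") as fp:
            return fp.read()


def write_file_contents(path, contents, encoding="utf-8"):
    """Write text with an explicit encoding instead of the locale default."""
    with io.open(path, "w", encoding=encoding) as fp:
        return fp.write(contents)
```

With both helpers defaulting to UTF-8, a comment containing curly quotes survives a read/write round trip on a cp1252 system as well.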

I ran the last script and added a line to print the file name too (after it ran successfully):

PS C:\GitHub\TD-er\ESPEasy> python .\testissue_2796.py
File: Command.ino
Encoding: ascii
File: Controller.ino
Encoding: ascii
File: Convert.ino
Encoding: ascii
File: ESPEasy.ino
Encoding: utf-8
File: ESPeasyControllerCache.ino
Encoding: ascii
File: ESPeasyGPIO.ino
Encoding: None
File: ESPEasyRTC.ino
Encoding: ascii
File: ESPEasyRules.ino
Encoding: ascii
File: ESPEasyStatistics.ino
Encoding: ascii
File: ESPEasyStorage.ino
Encoding: ascii
File: ESPEasyWifi.ino
Encoding: ascii
File: ESPEasyWiFi_credentials.ino
Encoding: ascii
File: ESPEasyWifi_ProcessEvent.ino
Encoding: ascii
File: ESPEasy_checks.ino
Encoding: ascii
File: ESPEasy_Log.ino
Encoding: ascii
File: Hardware.ino
Encoding: ascii
File: I2C.ino
Encoding: ascii
File: Misc.ino
Encoding: ascii
File: Modbus.ino
Encoding: ascii
File: Modbus_RTU.ino
Encoding: ascii
File: Networking.ino
Encoding: ascii
File: Scheduler.ino
Encoding: ascii
File: Serial.ino
Encoding: ascii
File: StringConverter.ino
Encoding: ascii
File: StringProvider.ino
Encoding: ascii
File: TimeESPeasy.ino
Encoding: ascii
File: TimeZoneESPeasy.ino
Encoding: ascii
File: WebServer.ino
Encoding: ascii
File: WebServer_404.ino
Encoding: ascii
File: WebServer_AccessControl.ino
Encoding: ascii
File: WebServer_AdvancedConfigPage.ino
Encoding: ascii
File: WebServer_CacheControllerPages.ino
Encoding: ascii
File: WebServer_ConfigPage.ino
Encoding: ascii
File: WebServer_ControllerPage.ino
Encoding: ascii
File: WebServer_ControlPage.ino
Encoding: ascii
File: WebServer_CustomPage.ino
Encoding: ascii
File: WebServer_DevicesPage.ino
Encoding: ascii
File: WebServer_DownloadPage.ino
Encoding: ascii
File: WebServer_FactoryResetPage.ino
Encoding: ascii
File: WebServer_Favicon.ino
Encoding: ascii
File: WebServer_FileList.ino
Encoding: ascii
File: WebServer_HardwarePage.ino
Encoding: ascii
File: WebServer_HTML_wrappers.ino
Encoding: ascii
File: WebServer_I2C_Scanner.ino
Encoding: ascii
File: WebServer_JSON.ino
Encoding: ascii
File: WebServer_LoadFromFS.ino
Encoding: ascii
File: WebServer_Log.ino
Encoding: ascii
File: WebServer_login.ino
Encoding: ascii
File: WebServer_Markup.ino
Encoding: ascii
File: WebServer_Markup_Buttons.ino
Encoding: ascii
File: WebServer_Markup_Forms.ino
Encoding: ascii
File: WebServer_NotificationPage.ino
Encoding: ascii
File: WebServer_PinStates.ino
Encoding: ascii
File: WebServer_RootPage.ino
Encoding: ascii
File: WebServer_Rules.ino
Encoding: ascii
File: WebServer_SettingsArchive.ino
Encoding: ascii
File: WebServer_SetupPage.ino
Encoding: ascii
File: WebServer_SysInfoPage.ino
Encoding: ascii
File: WebServer_SysVarPage.ino
Encoding: ascii
File: WebServer_TimingStats.ino
Encoding: ascii
File: WebServer_ToolsPage.ino
Encoding: ascii
File: WebServer_UploadPage.ino
Encoding: ascii
File: WebServer_WiFiScanner.ino
Encoding: ascii
File: _C001.ino
Encoding: ascii
File: _C002.ino
Encoding: ascii
File: _C003.ino
Encoding: ascii
File: _C004.ino
Encoding: ascii
File: _C005.ino
Encoding: ascii
File: _C006.ino
Encoding: ascii
File: _C007.ino
Encoding: ascii
File: _C008.ino
Encoding: ascii
File: _C009.ino
Encoding: ascii
File: _C010.ino
Encoding: ascii
File: _C011.ino
Encoding: ascii
File: _C012.ino
Encoding: ascii
File: _C013.ino
Encoding: ascii
File: _C014.ino
Encoding: utf-8
File: _C015.ino
Encoding: utf-8
File: _C016.ino
Encoding: ascii
File: _C017.ino
Encoding: ascii
File: _C018.ino
Encoding: ascii
File: _C019.ino
Encoding: ascii
File: _CPlugin_DomoticzHelper.ino
Encoding: ascii
File: _CPlugin_Helper_webform.ino
Encoding: ascii
File: _CPlugin_LoRa_TTN_helper.ino
Encoding: ascii
File: _CPlugin_SensorTypeHelper.ino
Encoding: ascii
File: _N001_Email.ino
Encoding: ascii
File: _N002_Buzzer.ino
Encoding: ascii
File: _P001_Switch.ino
Encoding: ascii
File: _P002_ADC.ino
Encoding: ascii
File: _P003_Pulse.ino
Encoding: ascii
File: _P004_Dallas.ino
Encoding: ascii
File: _P005_DHT.ino
Encoding: ascii
File: _P006_BMP085.ino
Encoding: ascii
File: _P007_PCF8591.ino
Encoding: ascii
File: _P008_RFID.ino
Encoding: ascii
File: _P009_MCP.ino
Encoding: ascii
File: _P010_BH1750.ino
Encoding: ascii
File: _P011_PME.ino
Encoding: ascii
File: _P012_LCD.ino
Encoding: ascii
File: _P013_HCSR04.ino
Encoding: ascii
File: _P014_SI7021.ino
Encoding: ascii
File: _P015_TSL2561.ino
Encoding: ascii
File: _P016_IR.ino
Encoding: ascii
File: _P017_PN532.ino
Encoding: ascii
File: _P018_Dust.ino
Encoding: ascii
File: _P019_PCF8574.ino
Encoding: ascii
File: _P020_Ser2Net.ino
Encoding: ascii
File: _P021_Level.ino
Encoding: ascii
File: _P022_PCA9685.ino
Encoding: ascii
File: _P023_OLED.ino
Encoding: ascii
File: _P024_MLX90614.ino
Encoding: ascii
File: _P025_ADS1115.ino
Encoding: ascii
File: _P026_Sysinfo.ino
Encoding: ascii
File: _P027_INA219.ino
Encoding: ascii
File: _P028_BME280.ino
Encoding: ascii
File: _P029_Output.ino
Encoding: ascii
File: _P030_BMP280.ino
Encoding: ascii
File: _P031_SHT1X.ino
Encoding: ascii
File: _P032_MS5611.ino
Encoding: ascii
File: _P033_Dummy.ino
Encoding: ascii
File: _P034_DHT12.ino
Encoding: ascii
File: _P035_IRTX.ino
Encoding: ascii
File: _P036_FrameOLED.ino
Encoding: ascii
File: _P037_MQTTImport.ino
Encoding: ascii
File: _P038_NeoPixel.ino
Encoding: utf-8
File: _P039_Thermocouple.ino
Encoding: utf-8
File: _P040_ID12.ino
Encoding: ascii
File: _P041_NeoClock.ino
Encoding: ascii
File: _P042_Candle.ino
Encoding: utf-8
File: _P043_ClkOutput.ino
Encoding: ascii
File: _P044_P1WifiGateway.ino
Encoding: ascii
File: _P045_MPU6050.ino
Encoding: ascii
File: _P046_VentusW266.ino
Encoding: ascii
File: _P047_i2c-soil-moisture-sensor.ino
Encoding: ascii
File: _P048_Motorshield_v2.ino
Encoding: ascii
File: _P049_MHZ19.ino
Encoding: ascii
File: _P050_TCS34725.ino
Encoding: ascii
File: _P051_AM2320.ino
Encoding: utf-8
File: _P052_SenseAir.ino
Encoding: utf-8
File: _P053_PMSx003.ino
Encoding: ascii
File: _P054_DMX512.ino
Encoding: utf-8
File: _P055_Chiming.ino
Encoding: ascii
File: _P056_SDS011-Dust.ino
Encoding: utf-8
File: _P057_HT16K33_LED.ino
Encoding: ascii
File: _P058_HT16K33_KeyPad.ino
Encoding: ascii
File: _P059_Encoder.ino
Encoding: ascii
File: _P060_MCP3221.ino
Encoding: ascii
File: _P061_KeyPad.ino
Encoding: utf-8
File: _P062_MPR121_KeyPad.ino
Encoding: ascii
File: _P063_TTP229_KeyPad.ino
Encoding: ascii
File: _P064_APDS9960.ino
Encoding: ascii
File: _P065_DRF0299_MP3.ino
Encoding: ascii
File: _P066_VEML6040.ino
Encoding: ascii
File: _P067_HX711_Load_Cell.ino
Encoding: ascii
File: _P068_SHT3x.ino
Encoding: ascii
File: _P069_LM75A.ino
Encoding: ascii
File: _P070_NeoPixel_Clock.ino
Encoding: ascii
File: _P071_Kamstrup401.ino
Encoding: ascii
File: _P072_HDC1080.ino
Encoding: ascii
File: _P073_7DGT.ino
Encoding: utf-8
File: _P074_TSL2591.ino
Encoding: ascii
File: _P075_Nextion.ino
Encoding: ascii
File: _P076_HLW8012.ino
Encoding: ascii
File: _P077_CSE7766.ino
Encoding: utf-8
File: _P078_Eastron.ino
Encoding: ascii
File: _P079_Wemos_Motorshield.ino
Encoding: ascii
File: _P080_DallasIButton.ino
Encoding: ascii
File: _P081_Cron.ino
Encoding: ascii
File: _P082_GPS.ino
Encoding: ascii
File: _P083_SGP30.ino
Encoding: ascii
File: _P084_VEML6070.ino
Encoding: ascii
File: _P085_AcuDC243.ino
Encoding: ascii
File: _P086_Homie.ino
Encoding: ascii
File: _P087_SerialProxy.ino
Encoding: ascii
File: _P088_HeatpumpIR.ino
Encoding: ascii
File: _Plugin_Helper_serial.ino
Encoding: ascii
File: _Pxxx_PluginTemplate.ino
Encoding: ascii
File: _Reporting.ino
Encoding: ascii
File: __CPlugin.ino
Encoding: ascii
File: __NPlugin.ino
Encoding: ascii
File: __Plugin.ino
Encoding: ascii
File: __ReleaseNotes.ino
Encoding: ascii

Thanks for the tests! Let me make a quick fix.

Please re-test with pio upgrade --dev.

Does it work now?

P.S: @vovane, your issue was fixed

Nope, still cannot build it.

> Executing task: C:\Users\gijs\.platformio\penv\Scripts\platformio.exe run --environment custom_ESP8266_4M1M <

Processing custom_ESP8266_4M1M (platform: https://github.com/platformio/platform-espressif8266.git#feature/stage; board: esp12e; framework: arduino)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
Mkdir("C:\GitHub\TD-er\ESPEasy\.pio\build\custom_ESP8266_4M1M")
['PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22y', 'CONTROLLER_SET_ALL', 'NOTIFIER_SET_NONE', 'PLUGIN_SET_ONLY_SWITCH', 'USES_P001', 'USES_P002', 'USES_P004', 'USES_P028', 'USES_P036', 'USES_P049', 'USES_P052', 'USES_P056', 'USES_P059', 'USES_P082', 'USES_P085', 'USES_P087', 'USES_C016', 'USES_C018', 'USE_SETTINGS_ARCHIVE']
CONFIGURATION: https://docs.platformio.org/page/boards/espressif8266/esp12e.html
PLATFORM: Espressif 8266 (Stage) 2.3.0-alpha.2 #3500fb2 > Espressif ESP8266 ESP-12E
HARDWARE: ESP8266 80MHz, 80KB RAM, 4MB Flash
PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 45d71ae, tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1671335-1671336: character maps to <undefined>:
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\main.py", line 148:
    env.SConscript("$BUILD_SCRIPT")
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 605:
    return _SConscript(self.fs, *files, **subst_kw)
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "C:\Users\gijs\.platformio\platforms\espressif8266@src-d2f6a4ecb96f34425e5e701de09dc0a9\builder\main.py", line 203:
    target_elf = env.BuildProgram()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 114:
    env.BuildFrameworks(env.get("PIOFRAMEWORK"))
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 295:
    env.ConvertInoToCpp()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 205:
    out_file = c.convert(ino_nodes)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 59:
    return self.process(contents)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 80:
    assert self._gcc_preprocess(contents, out_file)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 88:
    fs.write_file_contents(tmp_path, contents)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\fs.py", line 64:
    return fp.write(contents)
  File "C:\Users\gijs\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19:
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
PS C:\GitHub\TD-er\ESPEasy> pio --version
PlatformIO, version 4.1.0b4
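
One possible explanation for the two-character range in this error, sketched under an assumption that is not confirmed in the thread: if the UTF-8 file was decoded as Latin-1 somewhere upstream, each curly quote turns into three characters, two of which (C1 control characters) have no cp1252 encoding, so a later write with the cp1252 default fails.

```python
utf8_bytes = "\u201c".encode("utf-8")  # an opening curly quote, 3 bytes

# Assumed upstream step: the bytes are decoded with the wrong codec,
# giving the three characters U+00E2, U+0080, U+009C.
mojibake = utf8_bytes.decode("latin-1")

try:
    mojibake.encode("cp1252")  # implicit encoding on this Windows setup
    write_error = None
except UnicodeEncodeError as exc:
    write_error = exc

# Encoding correctly decoded text with an explicit UTF-8 codec works.
safe_bytes = utf8_bytes.decode("utf-8").encode("utf-8")
```

If that assumption holds, writing the temporary file with an explicit UTF-8 encoding would avoid the error regardless of how the text was decoded.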

Sorry, that was my typo. Please run pio upgrade --dev and try again.


@ivankravets Thanks! PlatformIO updated automatically when I launched VSCode and now my project builds. Although the warnings still have garbled text in them:

In file included from lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt.h:21:0,
from src\main.cpp:3:
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h: In member function 'virtual bool iarduino_I2C::readBytes(uint8_t, uint8_t, uint8_t*, uint8_t)':
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h:145:24: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
if (sum) { if(TWSR&0xF8!=0x50) { i=0;}} // Если после чтения очередного байта пакета значение регистра состояния шины I2C Arduino TWSR с маской 0xF8 не равно 0x50 значит произошла ошибка при чтении
^
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h:146:21: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
else { if(TWSR&0xF8!=0x58) { i=0;}} // Если после чтения последного байта пакета значение регистра состояния шины I2C Arduino TWSR с маской 0xF8 не равно 0x58 значит произошла ошибка при чтении
^
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h: In member function 'virtual bool iarduino_I2C::readBytes(uint8_t, uint8_t*, uint8_t)':
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h:168:24: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
if (sum) { if(TWSR&0xF8!=0x50) { i=0;}} // Если после чтения очередного байта пакета значение регистра состояния шины I2C Arduino TWSR с маской 0xF8 не равно 0x50 значит произошла ошибка при чтении
^
lib\iarduino_OLED_txt-1.1.0\src/iarduino_OLED_txt_I2C.h:169:21: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
else { if(TWSR&0xF8!=0x58) { i=0;}} // Если после чтения последного байта пакета значение регистра состояния шины I2C Arduino TWSR с маской 0xF8 не равно 0x58 значит произошла ошибка при чтении
^
lib\iarduino_DHT-master\src\iarduino_DHT.cpp: In member function 'int8_t iarduino_DHT::readSDA()':
lib\iarduino_DHT-master\src\iarduino_DHT.cpp:31:50: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
if( (reply[0]==0) || (reply[1]!=0) || (reply[2]&0x80>0) ) { model=22; } // Датчик определён как DHT22, так как DHT11 возвращает влажность более 10%, без десятых долей, а температуру без отрицательных значений.

Might have something to do with VSCode itself or its CPP extension, though. Not sure.

@vovane "iarduino_OLED_txt.h" - does it contain non-ascii chars?

Sorry, this was my typo. Please run pio upgrade --dev and try again.

Still doesn't work. The PIO version number does seem to be the same.

PS C:\GitHub\TD-er\ESPEasy> pio upgrade --dev
Please wait while upgrading PlatformIO ...
PlatformIO has been successfully upgraded to 4.1.0b4
Release notes: https://docs.platformio.org/en/latest/history.html
> Executing task: C:\Users\gijs\.platformio\penv\Scripts\platformio.exe run --environment custom_ESP8266_4M1M <

Processing custom_ESP8266_4M1M (platform: https://github.com/platformio/platform-espressif8266.git#feature/stage; board: esp12e; framework: arduino)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
['PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22y', 'CONTROLLER_SET_ALL', 'NOTIFIER_SET_NONE', 'PLUGIN_SET_ONLY_SWITCH', 'USES_P001', 'USES_P002', 'USES_P004', 'USES_P028', 'USES_P036', 'USES_P049', 'USES_P052', 'USES_P056', 'USES_P059', 'USES_P082', 'USES_P085', 'USES_P087', 'USES_C016', 'USES_C018', 'USE_SETTINGS_ARCHIVE']
CONFIGURATION: https://docs.platformio.org/page/boards/espressif8266/esp12e.html
PLATFORM: Espressif 8266 (Stage) 2.3.0-alpha.2 #3500fb2 > Espressif ESP8266 ESP-12E
HARDWARE: ESP8266 80MHz, 80KB RAM, 4MB Flash
PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 45d71ae, tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 850839-850840: ordinal not in range(256):
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\main.py", line 148:
    env.SConscript("$BUILD_SCRIPT")
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 605:
    return _SConscript(self.fs, *files, **subst_kw)
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Script\SConscript.py", line 286:
    exec(compile(scriptdata, scriptname, 'exec'), call_stack[-1].globals)
  File "C:\Users\gijs\.platformio\platforms\espressif8266@src-d2f6a4ecb96f34425e5e701de09dc0a9\builder\main.py", line 203:
    target_elf = env.BuildProgram()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 113:
    env.BuildFrameworks(env.get("PIOFRAMEWORK"))
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\platformio.py", line 301:
    env.ConvertInoToCpp()
  File "C:\Users\gijs\.platformio\packages\tool-scons\script\..\engine\SCons\Environment.py", line 224:
    return self.method(*nargs, **kwargs)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 205:
    out_file = c.convert(ino_nodes)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 59:
    return self.process(contents)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 80:
    assert self._gcc_preprocess(contents, out_file)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\builder\tools\piomisc.py", line 88:
    fs.write_file_contents(tmp_path, contents)
  File "C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\fs.py", line 67:
    return fp.write(contents)

@TD-er I see you have experience with Python. Could you try to play with that file on your machine? Which encoding is valid for reading the broken file and writing it back to the file system?

What Python file do you mean?
This one C:\Users\gijs\.platformio\penv\lib\site-packages\platformio\fs.py ?

This part?

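import io  # needed for the io.open fallback below


def write_file_contents(path, contents):
    try:
        # First attempt: the platform's default text encoding.
        with open(path, "w") as fp:
            return fp.write(contents)
    except UnicodeEncodeError:
        # Fallback: latin-1, which still fails for characters above U+00FF.
        with io.open(path, "w", encoding="latin-1") as fp:
            return fp.write(contents)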

Have not tested it in Python, but the characters seem to be part of
Western European (ISO) (code page 28591, iso-8859-1)

I tried it online to decode/encode the faulty line with these settings:
Encoding

  • from Western European (ISO) (code page 28591, iso-8859-1)
  • to Unicode (UTF-8) (code page 65001, utf-8)
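
The same conversion can be tried locally instead of in an online tool. This is a minimal sketch that assumes the input really is ISO-8859-1; the helper name reencode_latin1_to_utf8 and the sample bytes are made up for illustration and are not taken from the project.

```python
def reencode_latin1_to_utf8(raw: bytes) -> bytes:
    """Interpret raw bytes as ISO-8859-1 and return them re-encoded as UTF-8."""
    # ISO-8859-1 maps every byte 0x00-0xFF to the code point with the same
    # number, so this decode step can never raise.
    text = raw.decode("iso-8859-1")
    return text.encode("utf-8")


# Hypothetical sample: "café" stored as ISO-8859-1 (é is the single byte 0xE9).
sample = b"caf\xe9"
converted = reencode_latin1_to_utf8(sample)
print(converted)  # b'caf\xc3\xa9'
```

If the input is in fact already UTF-8, this conversion will not fail; it will silently produce double-encoded text, so it only helps when the source encoding is known.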

The faulty line is in _p052_SenseAir.ino at line 66, in case the encoding gets lost when copy/pasting it here:
#define P052_EEPROM_ADDR_LOGGER_STRUCTURE_ADDRESS 0x200 // 16b Described in “BLG_ELG Logger Structure”

No, just write your own script. Try using Python 3 to read this INO file and then write it back to the file system under test.ino. Do you see errors?

First conclusion:
The chardet package cannot detect the right encoding.
Specifying the encoding explicitly when saving the file does not fix it either.
Reading the file already mangles the data, even when an encoding is provided.
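
One way such mangling can arise, sketched under two assumptions the thread does not confirm: that the quotes in the faulty line are UTF-8 curly quotes, and that the file is read with a single-byte codec such as ISO-8859-1 or the Windows code page cp1252. The variable names are illustrative only.

```python
# The comment from the faulty line, with the curly quotes written as escapes.
line = "// 16b Described in \u201cBLG_ELG Logger Structure\u201d"
raw = line.encode("utf-8")

# Decoding UTF-8 bytes as ISO-8859-1 never fails, but each curly quote
# (three bytes in UTF-8) turns into three unrelated characters.
as_latin1 = raw.decode("iso-8859-1")
print(len(line), len(as_latin1))

# cp1252 leaves byte 0x9D undefined, and the closing quote U+201D is
# E2 80 9D in UTF-8, so a strict cp1252 decode raises instead.
failing_offset = None
try:
    raw.decode("cp1252")
except UnicodeDecodeError as exc:
    failing_offset = exc.start
    print("cp1252 failed at byte offset", failing_offset)
```

So the same bytes can either decode silently into wrong characters or raise, depending only on which codec is applied to them.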

I have here the test Python code + a few files included in the attached zip.

  • testwrongEncoding.ino has just the string where it fails in my code. (UTF-8 according to Notepad++)
  • iso-8859-1_Encoding.ino has the same string, saved from Notepad++ as ISO 8859-1 encoding

PIO_issue_2796.zip

The Python program tries to read these and uses encoding="iso-8859-1" as a fallback when reading.
As you can see in the generated files (test is prepended to the file name when saving, so multiple runs generate testtest... etc.), the one that was saved with the right encoding is now correctly converted to ASCII and continues to be read and saved in a well-defined way. The special quotes are converted into normal quotes, but that's not really a problem, I guess.
Not sure what happens with intentionally used special characters in some programming languages, though.

The wrongEncoding.ino file does get mangled and it remains mangled. (keeps generating decode errors)

The output of my script:

PS C:\GitHub\TD-er\ESPEasy> python .\testissue_2796.py
----
File: iso-8859-1_Encoding.ino
Encoding: {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}
Fileout: testiso-8859-1_Encoding.ino
----
File: testiso-8859-1_Encoding.ino
Encoding: {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}
Fileout: testtestiso-8859-1_Encoding.ino
----
File: testwrongEncoding.ino
UnicodeDecodeError
Encoding: {'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
Fileout: testtestwrongEncoding.ino
----
File: wrongEncoding.ino
UnicodeDecodeError
Encoding: {'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
Fileout: testwrongEncoding.ino
PS C:\GitHub\TD-er\ESPEasy>

I hope this can help you further a bit?
My Python experience is not that extensive. For example if StackOverflow is offline, my experience is also offline ;)

I think this describes exactly the problem we're facing here.
In my specific use case the backslashreplace error handler could be useful?
I will do some tests with it.

But I think the real fix is not to have a silent read fallback that lets parsing continue on a file that was not decoded with the correct encoding table.
It should instead fail with a very descriptive error saying which file is failing and, if possible, at which line.
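
A rough sketch of what such a descriptive failure could look like. The function name read_text_strict and the demo file are hypothetical; this is not PlatformIO code, just one way to turn a UnicodeDecodeError into a message with file name and line number.

```python
import os
import tempfile


def read_text_strict(path, encoding="utf-8"):
    """Read a text file; on bad bytes, raise an error naming the file and line."""
    with open(path, "rb") as fp:
        raw = fp.read()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset of the first undecodable byte;
        # counting the newlines before it gives a 1-based line number.
        line_no = raw.count(b"\n", 0, exc.start) + 1
        raise ValueError(
            "%s: line %d: byte 0x%02x is not valid %s"
            % (path, line_no, raw[exc.start], encoding)
        ) from exc


# Demo with a made-up file whose third line contains a stray 0x93 byte.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.ino")
    with open(demo_path, "wb") as fp:
        fp.write(b"int a;\nint b;\n// bad byte \x93 here\n")
    try:
        read_text_strict(demo_path)
        message = None
    except ValueError as err:
        message = str(err)

print(message)
```

Reading in binary mode and decoding afterwards keeps the byte offsets intact, which is what makes the line number computable.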

This does fix my specific file issue:

def get_file_contents(path):
    try:
        with open(path) as f:
            return f.read()
    except UnicodeDecodeError:
        print('UnicodeDecodeError')
        with open(path, encoding="utf8", errors="backslashreplace") as f:
            return f.read()


def write_file_contents(path, contents, encoding):
    try:
        with open(path, "w", encoding=encoding) as fp:
            return fp.write(contents)
    except UnicodeEncodeError:
        print('UnicodeEncodeError')
        with open(path, "w", encoding="utf8", errors="backslashreplace") as fp:
            return fp.write(contents)

When saving it, it will save it the same as it was, so it will also have the same decoder error when reading the generated file again.
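
For reference, a small sketch of what errors="backslashreplace" does when decoding bytes that are not valid UTF-8 (the handler is supported for decoding since Python 3.5). The sample bytes are made up and are not taken from the files in this thread.

```python
# 0x93 and 0x94 on their own are not valid UTF-8 sequences.
raw = b"// quote: \x93text\x94\n"

# Each undecodable byte becomes a literal backslash escape in the text.
text = raw.decode("utf-8", errors="backslashreplace")
print(text)

# Writing the escaped text back yields plain ASCII, not the original bytes.
round_trip = text.encode("utf-8")
print(round_trip == raw)
```

In this sketch the escapes are ordinary characters once decoded, so the handler hides the invalid bytes rather than preserving them.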

PS C:\GitHub\TD-er\ESPEasy> python .\testissue_2796.py
----
File: iso-8859-1_Encoding.ino
Encoding: {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}
Fileout: testiso-8859-1_Encoding.ino
----
File: wrongEncoding.ino
UnicodeDecodeError
Encoding: {'encoding': 'utf-8', 'confidence': 0.7525, 'language': ''}
Fileout: testwrongEncoding.ino
PS C:\GitHub\TD-er\ESPEasy> python .\testissue_2796.py
----
File: iso-8859-1_Encoding.ino
Encoding: {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}
Fileout: testiso-8859-1_Encoding.ino
----
File: testiso-8859-1_Encoding.ino
Encoding: {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}
Fileout: testtestiso-8859-1_Encoding.ino
----
File: testwrongEncoding.ino
UnicodeDecodeError
Encoding: {'encoding': 'utf-8', 'confidence': 0.7525, 'language': ''}
Fileout: testtestwrongEncoding.ino
----
File: wrongEncoding.ino
UnicodeDecodeError
Encoding: {'encoding': 'utf-8', 'confidence': 0.7525, 'language': ''}
Fileout: testwrongEncoding.ino
PS C:\GitHub\TD-er\ESPEasy>

See also this StackOverflow reply

Just curious by the way, why does PlatformIO read and write the files?
Is it to generate the Arduino .cpp file?
If so, it may also be a good idea to set the newline parameter to None so that CRLF or any other newline form is converted into \n and the compiler reports useful line numbers in its errors.
See: Python docs - open
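
A quick way to see what the newline parameter of open() does, using a throwaway file with mixed line endings. The file names are made up; note that newline=None is already the default when reading, so passing it explicitly only documents the intent.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "crlf.ino")
    with open(src, "wb") as fp:
        fp.write(b"line1\r\nline2\rline3\n")

    # newline=None: universal newlines, every \r\n and \r becomes \n on read.
    with open(src, newline=None, encoding="ascii") as fp:
        translated = fp.read()

    # newline="": line endings are returned exactly as stored.
    with open(src, newline="", encoding="ascii") as fp:
        untouched = fp.read()

    # newline="\n" on write: \n is written as-is, also on Windows.
    dst = os.path.join(tmp, "lf.ino")
    with open(dst, "w", newline="\n", encoding="ascii") as fp:
        fp.write(translated)
    with open(dst, "rb") as fp:
        written = fp.read()

print(repr(translated))
print(repr(untouched))
print(written)
```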

I have now this in my fs.py from PlatformIO:


def get_file_contents(path):
    try:
        with open(path) as f:
            return f.read()
    except UnicodeDecodeError:
        print('UnicodeDecodeError in %s' % path)
        with open(path, encoding="utf8", errors="backslashreplace", newline=None) as f:
            return f.read()


def write_file_contents(path, contents):
    try:
        with open(path, "w") as fp:
            return fp.write(contents)
    except UnicodeEncodeError:
        print('UnicodeEncodeError in %s' % path)
        with open(path, "w", encoding="utf8", errors="backslashreplace", newline="\n") as fp:
            return fp.write(contents)

It can now build my project again and gives at least some indication on files with escaped unknown characters:

> Executing task: C:\Users\gijs\.platformio\penv\Scripts\platformio.exe run --environment custom_ESP8266_4M1M <

Processing custom_ESP8266_4M1M (platform: https://github.com/platformio/platform-espressif8266.git#feature/stage; board: esp12e; framework: arduino)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
['PIO_FRAMEWORK_ARDUINO_ESPRESSIF_SDK22y', 'CONTROLLER_SET_ALL', 'NOTIFIER_SET_NONE', 'PLUGIN_SET_ONLY_SWITCH', 'USES_P001', 'USES_P002', 'USES_P004', 'USES_P028', 'USES_P036', 'USES_P049', 'USES_P052', 'USES_P056', 'USES_P059', 'USES_P082', 'USES_P085', 'USES_P087', 'USES_C016', 'USES_C018', 'USE_SETTINGS_ARCHIVE']
CONFIGURATION: https://docs.platformio.org/page/boards/espressif8266/esp12e.html
PLATFORM: Espressif 8266 (Stage) 2.3.0-alpha.2 #3500fb2 > Espressif ESP8266 ESP-12E
HARDWARE: ESP8266 80MHz, 80KB RAM, 4MB Flash
PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 45d71ae, tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
UnicodeDecodeError in C:\GitHub\TD-er\ESPEasy\src\_P052_SenseAir.ino
UnicodeDecodeError in C:\GitHub\TD-er\ESPEasy\src\testtestwrongEncoding.ino
UnicodeDecodeError in C:\GitHub\TD-er\ESPEasy\src\testwrongEncoding.ino
UnicodeDecodeError in C:\GitHub\TD-er\ESPEasy\src\wrongEncoding.ino
Converting ESPEasy.ino
LDF: Library Dependency Finder -> http://bit.ly/configure-pio-ldf

, errors="backslashreplace"

This is not a good solution. However, I don't see another option. The problem is a mix of encodings in one file. You have ASCII + something strange (neither ASCII nor UTF-8). Python and other tools can't detect the encoding of this file. That symbol is not a valid ASCII/UTF-8 symbol.

In any case, I added this "backslashreplace" hook. Indeed, it does not resolve the problem; it just escapes invalid characters. In your case it works because these symbols are in a comment. But imagine these symbols being required in code. If they are part of the code, then you first need to resolve the root of the problem: remove the non-UTF-8 chars from a file that is going to be encoded as UTF-8.

Just curious by the way, why does PlatformIO read and write the files? Is it to generate the Arduino .cpp file?

Yes, this is one of the reasons why we ask developers not to use the INO format. It is bad practice to write C/C++ without doing real C/C++ coding because some converter (Arduino IDE, PlatformIO, etc.) tries to generate the final C/C++ code automatically for you.

I dream of the day when we drop INO support and provide a tool that converts an INO project to CPP. After this step, people will use C/C++ directly and forget about all these problems. Arduino IDE users have other problems with this conversion stage.

Can you add the "logging" like I showed?
It will be a great help to find strange files.

def get_file_contents(path):
    try:
        with open(path, newline=None) as f:
            return f.read()
    except UnicodeDecodeError:
        print('UnicodeDecodeError in %s' % path)
        with open(path, encoding="latin-1", errors="backslashreplace", newline=None) as f:
            return f.read()


def write_file_contents(path, contents):
    try:
        with open(path, "w", newline="\n") as fp:
            return fp.write(contents)
    except UnicodeEncodeError:
        print('UnicodeEncodeError in %s' % path)
        with open(path, "w", encoding="latin-1", errors="backslashreplace", newline="\n") as fp:
            return fp.write(contents)
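
One caveat about a latin-1 fallback is worth checking before relying on it. The sketch below uses only built-in codecs and made-up sample data: decoding as latin-1 accepts every possible byte, so the error handler never gets a chance to fire on read, while encoding to latin-1 fails for any character above U+00FF unless an error handler is given.

```python
# Every byte value decodes under latin-1, one character per byte,
# so errors="backslashreplace" has nothing to replace here.
every_byte = bytes(range(256))
decoded = every_byte.decode("latin-1", errors="backslashreplace")
print(len(decoded))

# Encoding is different: a curly quote (U+201C) is outside latin-1.
try:
    "\u201c".encode("latin-1")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# With backslashreplace the character is written as an ASCII escape.
escaped = "\u201c".encode("latin-1", errors="backslashreplace")
print(strict_encode_failed, escaped)
```

This matches the shape of the 'latin-1' codec can't encode error in the traceback earlier in the thread, and it suggests the read-side fallback will accept a file without reporting which characters were misread.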

@TD-er Thanks for all help! Please re-test with pio upgrade --dev.

or any other newline form into \n so the compiler will output useful line

How to reproduce this issue?

I have not yet tested the change for it with my Python changes, but in the past I did change the behavior of my Git client just for this purpose.

About your fix.
You only output a warning on writing the file, but I think opening may be more useful.

See:

PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 45d71ae, tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
Converting ESPEasy.ino
Warning! There is a problem with contents encoding, please remove invalid characters (non-ASCII or non-UT8) in C:\Users\gijs\AppData\Local\Temp\tmp75_lv9b6
LDF: Library Dependency Finder -> http://bit.ly/configure-pio-ldf

@TD-er Thanks for pointing that out. Please run pio upgrade --dev :)

Works OK now :)

PACKAGES: toolchain-xtensa 2.40802.190218 (4.8.2), framework-arduinoespressif8266 45d71ae, tool-esptool 1.413.0 (4.13), tool-esptoolpy 1.20600.0 (2.6.0)
Unicode decode error has occurred, please remove invalid (non-ASCII or non-UTF8) characters from C:\GitHub\TD-er\ESPEasy\src\_P052_SenseAir.ino file
Unicode decode error has occurred, please remove invalid (non-ASCII or non-UTF8) characters from C:\GitHub\TD-er\ESPEasy\src\testtestwrongEncoding.ino file
Unicode decode error has occurred, please remove invalid (non-ASCII or non-UTF8) characters from C:\GitHub\TD-er\ESPEasy\src\testwrongEncoding.ino file
Unicode decode error has occurred, please remove invalid (non-ASCII or non-UTF8) characters from C:\GitHub\TD-er\ESPEasy\src\wrongEncoding.ino file
Converting ESPEasy.ino
LDF: Library Dependency Finder -> http://bit.ly/configure-pio-ldf

Great! ;)
