Protobuf: [Python] string representation of proto message incorrectly displays UTF-8 Chinese characters

Created on 23 May 2017 · 8 Comments · Source: protocolbuffers/protobuf

Hello,

Here I'm building a proto message:

import myProto.pb as pb
mess = pb.Mess()
mess.oerrmsg = "当前状态禁止此项操作"

and here is what I see when I call print(mess):

oerrmsg: "\345\275\223\345\211\215\347\212\266\346\200\201\347\246\201\346\255\242\346\255\244\351\241\271\346\223\215\344\275\234"

I'd expect to see this:

oerrmsg: "当前状态禁止此项操作"

I suppose the __str__ and __repr__ methods should be adjusted...?

Thank you for your help,

Note: Ubuntu 16.04, Python 3.6, libprotoc 3.1.0
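
For reference, the escaped output above appears to be the UTF-8 encoding of the assigned string, with each byte written as a three-digit octal escape. Below is a minimal sketch that reverses this for output that has already been printed. The helper name unescape_octal_utf8 is hypothetical and not part of protobuf. It assumes the input is ASCII and contains only \ooo escapes (other text-format escapes such as \n or \" are not handled).

```python
import re


# Hypothetical helper, not part of protobuf.
def unescape_octal_utf8(escaped: str) -> str:
    """Convert '\\345\\275\\223'-style octal escapes back to text.

    Assumes the input is ASCII and the escaped bytes are valid UTF-8.
    """
    data = re.sub(
        rb"\\([0-3][0-7]{2})",                  # one byte written as \ooo
        lambda m: bytes([int(m.group(1), 8)]),  # octal digits -> single byte
        escaped.encode("ascii"),
    )
    return data.decode("utf-8")


printed = r"\345\275\223\345\211\215"  # first six escapes from the output above
print(unescape_octal_utf8(printed))    # 当前
```

This only helps when the escaped text is all that is available (for example in a log file). When the message object itself is at hand, formatting it directly is simpler.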

customer issue python

sulliwane

All 8 comments

The default print (__str__) does not output UTF-8. You can import text_format from google.protobuf and use:
text_format.MessageToString(mess, as_utf8=True)
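
A minimal sketch of this workaround. Since myProto.pb is not shown in the thread, it assumes the well-known wrapper type StringValue (from google.protobuf.wrappers_pb2) as a stand-in for pb.Mess, and that the protobuf package is installed:

```python
from google.protobuf import text_format
from google.protobuf.wrappers_pb2 import StringValue  # stand-in for pb.Mess (assumption)

msg = StringValue(value="当前状态禁止此项操作")

# Default string form, as reported in this issue: non-ASCII bytes come out
# as octal escapes.
escaped = str(msg)

# as_utf8=True keeps the original characters in the returned str.
readable = text_format.MessageToString(msg, as_utf8=True)

print(escaped)
print(readable)
```

Whether the second line displays correctly still depends on the terminal or file encoding being UTF-8.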

anandolee on 15 Dec 2017

The default print does not print as utf8. You can import text_format and use:
text_format.MessageToString(mess, as_utf8=True)

That does not solve the problem. The string still looks garbled after I use the text_format.MessageToString(mess, as_utf8=True) function:

æ´ªæ³°åºé

sherry255 on 17 May 2018

Any update or workaround for this issue?

han4wluc on 14 Feb 2019

Same problem

taoxinyi on 12 Mar 2019

Any update?

eshijia on 5 Aug 2019

Any update?

innerNULL on 26 Feb 2020

hi
The default print does not print as utf8. You can import text_format and use:
text_format.MessageToString(mess, as_utf8=True)
That does not solve the problem. The string still looks garbled after I use the text_format.MessageToString(mess, as_utf8=True) function:
æ´ªæ³°åºé

I found that this is not related to MessageToString. If you open the file in vim, you can type :set encoding=utf8 and it will display the correct characters. @innerNULL @sherry255
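
To illustrate why this looks like a display-encoding problem: the start of the garbled output quoted above (æ´ªæ³°) is exactly what the UTF-8 bytes of 洪泰 look like when read as Latin-1. A minimal sketch follows. It assumes the viewer decoded the bytes as Latin-1 (or a similar single-byte encoding), which the thread does not confirm.

```python
text = "洪泰"

utf8_bytes = text.encode("utf-8")         # b'\xe6\xb4\xaa\xe6\xb3\xb0'

# Reading UTF-8 bytes with a single-byte encoding yields one wrong
# character per byte (mojibake).
mojibake = utf8_bytes.decode("latin-1")   # 'æ´ªæ³°'

# The bytes themselves are intact, so decoding them as UTF-8 recovers the text.
recovered = mojibake.encode("latin-1").decode("utf-8")

print(mojibake)
print(recovered)
```

The data produced by MessageToString is therefore likely fine. What needs fixing is the encoding used by the program that displays it.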

eshijia on 28 Feb 2020

I'm dissatisfied with this being closed. I would also expect __repr__ to coerce bytestrings into human-readable strings -- whether it is Chinese, Japanese, English, whatever. This is really unintuitive and I've easily lost hours of my life fighting with this.