Protobuf: [Python] string representation of proto message incorrectly displays UTF-8 Chinese characters

Created on 23 May 2017  ·  8 Comments  ·  Source: protocolbuffers/protobuf

Hello,

Here I'm building a proto message:

import myProto.pb as pb
mess = pb.Mess()
mess.oerrmsg = "当前状态禁止此项操作"

and here is what I see when I call print(mess):

oerrmsg: "\345\275\223\345\211\215\347\212\266\346\200\201\347\246\201\346\255\242\346\255\244\351\241\271\346\223\215\344\275\234"

I'd expect to see this instead:

oerrmsg: "当前状态禁止此项操作"

I suppose the __str__ and __repr__ methods should be adjusted...?

Thank you for your help,

Note: Ubuntu 16.04, Python 3.6, libprotoc 3.1.0

customer issue python

All 8 comments

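The default print (which goes through __str__) does not output UTF-8; it escapes non-ASCII bytes as octal. You can import text_format and use:
from google.protobuf import text_format
text_format.MessageToString(mess, as_utf8=True)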
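
To make the escaping concrete, here is a minimal standard-library sketch of what the octal output in the original report represents. The helpers `octal_escape` and `octal_unescape` are hypothetical names written for this illustration and are not protobuf's own code; the sketch only assumes that each `\ooo` group stands for one byte of the UTF-8 encoding of the field value.

```python
import re


def octal_escape(text: str) -> str:
    """Hypothetical helper: keep printable ASCII, write every other UTF-8 byte as \\ooo."""
    out = []
    for byte in text.encode("utf-8"):
        # Quote (0x22) and backslash (0x5C) are escaped too, so the result round-trips.
        if 0x20 <= byte < 0x7F and byte not in (0x22, 0x5C):
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return "".join(out)


def octal_unescape(escaped: str) -> str:
    """Hypothetical helper: turn \\ooo groups back into bytes and decode them as UTF-8."""
    # Each octal group becomes one code point in 0..255, i.e. one raw byte.
    raw = re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), escaped)
    return raw.encode("latin-1").decode("utf-8")


# The value printed in the original report.
printed = r"\345\275\223\345\211\215\347\212\266\346\200\201\347\246\201\346\255\242\346\255\244\351\241\271\346\223\215\344\275\234"

print(octal_unescape(printed))  # 当前状态禁止此项操作
```

So the printed form loses no data: it is the same UTF-8 bytes written in an ASCII-safe way, which is why passing as_utf8=True only changes how the value is displayed.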

hi

The default print does not print as utf8. You can import text_format and use:
text_format.MessageToString(mess, as_utf8=True)

This does not solve the problem. The string still looks weird after I use the text_format.MessageToString(mess, as_utf8=True) function:

洪泰基金

any update or workaround for this issue?

Same problem

Any update?

any update?

I found this was not related to MessageToString. If you open the file in vim, you can just type :set encoding=utf8 and it will display the correct characters. @innerNULL @sherry255
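
A small sketch of why the viewer, not MessageToString, can be the culprit: correct UTF-8 bytes turn into garbled text when something decodes them with a single-byte code page. The choice of cp1252 here is an assumption about the viewer's encoding, picked because it reproduces the garbled string quoted earlier in the thread; the variable names are illustrative only.

```python
# Text that was written out correctly as UTF-8.
original = "洪泰基金"
utf8_bytes = original.encode("utf-8")

# Assumption: the viewer decodes those bytes as cp1252 instead of UTF-8.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)

# Reversing the wrong decoding recovers the text, so the bytes were intact all along.
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired)  # 洪泰基金
```

If the round trip recovers the text like this, the fix belongs in the terminal or editor encoding settings rather than in the protobuf call.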

I'm dissatisfied with this being closed. I would also expect __repr__ to render byte strings as human-readable text, whether it is Chinese, Japanese, English, whatever. This is really unintuitive and I've easily lost hours of my life fighting with this.
