Protobuf: How to serialize string fields containing Chinese text in C# and C++

Created on 6 Apr 2018  ·  4 Comments  ·  Source: protocolbuffers/protobuf

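In C# and C++, serializing a protobuf string field that contains Chinese characters reports an error and produces garbled text. How can this be fixed?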

question

All 4 comments

The content of a string field must be UTF-8 encoded. I'd suggest using an
encoding conversion library when interacting with string fields, or
changing the field type to bytes instead.
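
Because a string field must hold UTF-8, it can help to check the bytes before calling a setter. The following is a minimal sketch and not part of protobuf; the function name is_valid_utf8 is an assumption made for illustration. It rejects overlong encodings, UTF-16 surrogates, and code points above U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a protobuf API): returns true if `s` is
// well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (i + len > n) return false;  // sequence cut off at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }

        assert(len >= 1 && len <= 4);
        if (cp < min_cp[len]) return false;              // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += len;
    }
    return true;
}
```

For example, the GBK bytes for 中文 (D6 D0 CE C4) fail this check, while the UTF-8 bytes for the same text (E4 B8 AD E6 96 87) pass. Text that fails would need converting first, or could go into a bytes field, which accepts arbitrary bytes.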

@pherl
The problem has been solved. Thank you for your help.

How did you solve it?

In C++, strings must be converted to UTF-8 before serialization. The conversion is as follows:

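#include <cstdlib>   // wcstombs_s, mbstowcs_s (Microsoft CRT), system
#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <iostream>
#include <locale>    // std::locale, std::wstring_convert
#include <string>
#include <vector>
// Also include the header that protoc generated for the .proto file that
// defines TGS::MyInFo (its file name is not given in this thread).

using namespace std;
#pragma comment(lib, "libprotobuf.lib")  // MSVC only: link against libprotobuf

// Wide string -> narrow string in the system default locale (e.g. GBK).
string ws2s(const wstring& ws)
{
    locale old_loc = locale::global(locale(""));
    // Allow up to 4 bytes per wide character, plus the terminating null.
    vector<char> dst(ws.size() * 4 + 1, '\0');
    size_t converted = 0;
    wcstombs_s(&converted, dst.data(), dst.size(), ws.c_str(), _TRUNCATE);
    locale::global(old_loc);
    return string(dst.data());
}

// Narrow string in the system default locale -> wide string.
wstring s2ws(const string& s)
{
    locale old_loc = locale::global(locale(""));
    // At most one wide character per input byte, plus the terminating null.
    vector<wchar_t> dst(s.size() + 1, L'\0');
    size_t converted = 0;
    mbstowcs_s(&converted, dst.data(), dst.size(), s.c_str(), _TRUNCATE);
    locale::global(old_loc);
    return wstring(dst.data());
}

// Wide string -> UTF-8 bytes.
string ws2utf8(const wstring& src)
{
    wstring_convert<codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(src);
}

// UTF-8 bytes -> wide string.
wstring utf8_2_ws(const string& src)
{
    wstring_convert<codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(src);
}

// Usage: convert each string to UTF-8 before storing it in the message.
int main()
{
    TGS::MyInFo info;
    string name = "姓名:TGS";
    info.set_mingzi(ws2utf8(s2ws(name)));
    info.set_nianling(24);
    string dh = "123456789";
    info.set_dianhua(ws2utf8(s2ws(dh)));

    // Serialize exactly ByteSizeLong() bytes. The output is binary, so it is
    // not printed as text.
    char a[400] = {};
    const size_t size = info.ByteSizeLong();
    if (size > sizeof(a) || !info.SerializeToArray(a, static_cast<int>(size)))
    {
        cerr << "serialization failed" << endl;
        return 1;
    }
    cout << "serialized " << size << " bytes" << endl;

    // Parse only the bytes that were written; passing sizeof(a) would feed
    // the trailing zero padding to the parser.
    TGS::MyInFo tt;
    if (!tt.ParseFromArray(a, static_cast<int>(size)))
    {
        cerr << "parsing failed" << endl;
        return 1;
    }
    // Convert UTF-8 back to the local encoding before printing to the console.
    cout << ws2s(utf8_2_ws(tt.mingzi())) << " " << tt.nianling() << " " << ws2s(utf8_2_ws(tt.dianhua())) << endl;
    system("pause");  // Windows only: keep the console window open
    return 0;
}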
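
The wstring_convert and codecvt_utf8 facilities used above are deprecated since C++17, and wcstombs_s and mbstowcs_s are specific to Microsoft's C runtime. As a rough illustration of what the UTF-8 encoding step does, here is a hand-written encoder that works from UTF-32 code points. The helper name utf32_to_utf8 is hypothetical and is not part of this thread or of protobuf. It is only a sketch of the encoding step and does not cover converting from a local code page such as GBK.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration): encodes a sequence
// of Unicode code points as UTF-8 bytes.
std::string utf32_to_utf8(const std::u32string& src) {
    std::string out;
    out.reserve(src.size() * 4);  // worst case: 4 bytes per code point
    for (char32_t cp : src) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("not a Unicode scalar value");
        }
        if (cp < 0x80) {            // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Under these assumptions, utf32_to_utf8(U"\u59D3\u540D") (the characters 姓名) yields the six bytes E5 A7 93 E5 90 8D, which is the form a protobuf string field expects.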