HeidiSQL: EEncodingError: No mapping for the Unicode character in target multi-byte code page

Created on 29 Jan 2019  ·  39 Comments  ·  Source: HeidiSQL/HeidiSQL

Steps to reproduce this issue

  1. Import data from a large (>300 MB) SQL file: in the Query tab, choose Load SQL file
  2. Click "Run files directly"
  3. Then I get...
    bugreport.txt

Current behavior


The program crashes. When "Continue" is clicked a loading window appears, saying "Reading next chunk from file", but it never does anything.

Expected behavior


The file should be imported.

Environment

  • HeidiSQL version: 10.1.0.5464
  • Operating system: Windows 10 x64 build 17763
bug confirmed

All 39 comments

I am having exactly the same problem and I do not know how to solve it.
When I do the restoration through Workbench it finishes correctly.
But I would like to do in HeidiSQL.

Hi! Exactly the same issue.

Could you please tell which encoding you selected in the file-open dialog?

I've got exactly the same issue.
I select the file only. "Encoding" drop-down has "UTF-8" already selected.
The file was created by HeidiSQL when I wanted to export SQL dump of the database.

Investigation results
It happens when the first byte of a multi-byte UTF-8 character sits at position 20971519 of the SQL file, which is the last position of the first 20 megabytes of data. Does HeidiSQL split input SQL files into 20-megabyte chunks?! WHY??! I mean, it's 2019, computers have tens of gigabytes of memory (mine has 32 GB + huge swap file). Why not load the entire SQL file into memory?
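A minimal sketch of why a chunk that ends on the first byte of a multi-byte character cannot be decoded. It uses Java's strict UTF-8 decoder, not HeidiSQL's Delphi code, and the class and method names are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ChunkSplitDemo {
    // Returns true if the bytes form complete, valid UTF-8 (hypothetical helper).
    static boolean decodesCleanly(byte[] chunk) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(chunk));
            return true;
        } catch (CharacterCodingException e) {
            // A truncated trailing sequence is reported as malformed input
            return false;
        }
    }

    public static void main(String[] args) {
        // "/*т*/" in UTF-8; 0xd1 0x82 is one two-byte character
        byte[] data = {0x2f, 0x2a, (byte) 0xd1, (byte) 0x82, 0x2a, 0x2f};
        byte[] cutInsideChar = Arrays.copyOf(data, 3); // chunk ends on the lead byte 0xd1
        byte[] cutAfterChar = Arrays.copyOf(data, 4);  // chunk includes the continuation byte
        System.out.println(decodesCleanly(cutInsideChar)); // false
        System.out.println(decodesCleanly(cutAfterChar));  // true
    }
}
```

The same bytes decode fine once the chunk is one byte longer, which is why the position of the chunk boundary matters and not the file content as a whole.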

Thanks for the file attachment. I'll test that.

Did you check which encoding you selected in the file-open dialog?

Yes. UTF-8.

I just stumbled across a UTF-8 issue in HeidiSQL, on files with only one character in them. I realized Delphi's TEncoding.UTF8 expects a preamble/BOM, which I just worked around with an overwritten encoding. I was hoping this fix heals this issue here. Please report back after testing the new nightly build. Thanks!

I have hit the same issue. While importing the SQL file I got: EEncodingError: No mapping for the Unicode character in target multi-byte code page. I had manually selected UTF8 in the import dialog.

Tested with nightly build: Revision 10.1.0.5526 from 13 Apr 2019 10:52
If you are interested I can share the test file. It is 1.5 GB, but it is no problem to share it if needed.

From my investigation, the problem could be related to splitting the big file, as izogfif said, but I am not 100% sure about this ...

v10.1.0.5526 includes an attempt to fix this. Could you please update and try again?

I tested yesterday with yesterday's nightly build: Revision 10.1.0.5526 from 13 Apr 2019 10:52
Should I test with today's build, Revision 10.1.0.5527? I think not ...

I overlooked that you had already tested with that version. I will run some more tests. It would be good to have a test file, even if it's big. Could you compress it and make it available to me somehow?

Here is the test file compressed using 7zip: https://we.tl/t-eKD2XA9e6j

thank you very much for your effort, will donate for sure this time ...
i am using heidisql for a couple of years :)

Thanks for the testfile - I just downloaded it, so you may delete it if you'd like to.

I just spent two hours testing with these 1.5 GB of international strings. HeidiSQL repeatedly crashed with the above message when loading the 20 MB after query 214. My earlier approach for such read errors was to increase the chunk size by 4 bytes and try again, up to 10 times, hoping that broken multi-byte characters would then be read correctly. But this strategy failed for many files. I just changed two things for the next nightly build:

  • increase that padding from 4 Bytes to 1 MB
  • catch the above crash and show an error dialog instead

For me, your file now imports correctly.

By the way, if your import is very slow, like mine, then I recommend these settings in Preferences > Logging:
[screenshot of the Logging preferences]

Hello .. sorry for the big test file with complicated structures of tables .. Anyway it is working fine for me for now! The whole file was imported now to the database with no Errors! Also no Error dialog after finishing ...

Tested with version: Revision 10.1.0.5528 from 14 Apr 2019 19:54
Thank you very much for your hard work. Have a nice rest of weekend.

I am closing this now. Just shout if you have a file which does not yet import correctly with the latest build.

Hi, I just experienced this on the latest nightly build

@IainCoSource please post the encoding of that file, and the one you selected in the file-open dialog.
Also, if you can, mail me that file if it does not contain critical information.

Will send after i remove the sensitive data

The problem has been fixed after I updated my HeidiSql.
Thank you Ansgar.

Hi
This is still happening on the latest version. Please advise
no

I'm still experiencing this issue too

@soopermouse and @lareeth can you please verify you are selecting the right encoding in the file-open-dialog.

@ansgarbecker
TL;DR:

Edit source/apphelpers.pas
Change

BufferPadding = SIZE_MB;

to

BufferPadding = 1;

Full explanation:

If you check the changes for this issue, you'll be able to figure what's going on:

  • SQL dump is read in chunks.
  • When chunk boundary splits multi-byte Unicode character into pieces, the chunk contains an invalid character sequence.
  • An attempt to parse chunk using the specified encoding is made.
  • Since the chunk contains only a piece of multi-byte Unicode character, string decoding fails.
  • Chunk size is increased by 1 megabyte, then entire process is repeated. Up to 10 times.

Now guess what happens if there are multi-byte characters every 1048576 bytes (1 MB) in the SQL file? No matter how many times you increase your chunk size by 1048576 bytes, you'll keep getting an error until you have read the file to the very end.
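The failure mode described above can be reproduced at a small scale. The following is a sketch under stated assumptions, not HeidiSQL's actual Pascal code: the sizes are scaled down (16 bytes stand in for 1 MB), the retry count of 10 comes from the thread, and all class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PaddingRetryDemo {
    static final int MAX_RETRIES = 10; // retry count mentioned in the thread

    // Strictly decodes the first `length` bytes as UTF-8.
    static boolean decodesCleanly(byte[] data, int length) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data, 0, length));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Grows the chunk by `padding` bytes per retry.
    // Returns the first length that decodes, or -1 if every attempt fails.
    static int findChunkLength(byte[] data, int chunkSize, int padding) {
        for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
            int length = Math.min(data.length, chunkSize + attempt * padding);
            if (decodesCleanly(data, length)) return length;
            if (length == data.length) break; // nothing left to read
        }
        return -1;
    }

    // Spaces, with a two-byte character straddling every multiple of `spacing`.
    static byte[] buildData(int size, int spacing) {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) ' ');
        for (int pos = spacing - 1; pos + 1 < size; pos += spacing) {
            data[pos] = (byte) 0xd1;     // lead byte just before the boundary
            data[pos + 1] = (byte) 0x82; // continuation byte just after it
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] data = buildData(1024, 16);
        System.out.println(findChunkLength(data, 64, 16)); // -1: every retry lands on a split character
        System.out.println(findChunkLength(data, 64, 1));  // 65: one extra byte completes the character
    }
}
```

When the padding equals the spacing of the split characters, every enlarged chunk ends on another lead byte, so all retries fail. A padding of 1 byte moves the boundary past the continuation byte on the first retry.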

Here is the file with the test case: https://drive.google.com/open?id=1T56Kw32gObgKLRkaylHkx-jrZrhTS0Ad

It was generated via this Java program:
```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class Main {
    public static void main(String[] args) throws Throwable {
        final int FILE_SIZE = 31 * 1024 * 1024;
        final int CHUNK_SIZE = 1024;
        byte[] bytes = new byte[FILE_SIZE];
        // "/*т*/": an SQL comment around the two-byte UTF-8 character 0xd1 0x82
        byte[] invalidSequence = {0x2f, 0x2a, (byte) 0xd1, (byte) 0x82, 0x2a, 0x2f};
        // Fill the whole file with spaces
        Arrays.fill(bytes, (byte) 32);
        // Place the sequence so that 0xd1 is the last byte before every CHUNK_SIZE boundary
        for (int pos = CHUNK_SIZE - 3; pos + invalidSequence.length < bytes.length; pos += CHUNK_SIZE) {
            System.arraycopy(invalidSequence, 0, bytes, pos, invalidSequence.length);
        }
        // Backslashes must be escaped in Java string literals
        Files.write(Paths.get("C:\\temp\\heidi-invalid.sql"), bytes, StandardOpenOption.CREATE);
    }
}
```

Proposal: set BufferPadding in source/apphelpers.pas to 1 byte. It's highly unlikely that there is a multi-byte encoding having maximum length of encoded Unicode character larger than 10 bytes (10 is the amount of attempts). So it should work.
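For UTF-8 specifically, the bound is easy to check: a code point takes at most 4 bytes, so a chunk that ends inside a character is at most 3 bytes short and 10 one-byte retries are more than enough. A small sketch (the class and method names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes UTF-8 needs to encode a single code point.
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length('A'));     // 1
        System.out.println(utf8Length(0x0442));  // 2 (Cyrillic "т", the character used in the test file)
        System.out.println(utf8Length(0x20AC));  // 3 (euro sign)
        System.out.println(utf8Length(0x1F600)); // 4 (emoji outside the Basic Multilingual Plane)
    }
}
```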

I have the same issue

Have the same issue on version 10.2.0.5599 and 4.13 GB database dump.
Selected UTF-8.

@izogfif I just followed your advice using just 1 byte incrementation for BufferPadding. Sounds logical though I don't expect that to work. It's more a question which encoding the user selected in the file-open dialog, which some of the reporters did not tell in their comments. However, please update to the latest build in half an hour and try again.

@ansgarbecker yes this fixes the issue. You can check it yourself using the sample SQL file I mentioned previously. It only contains comments, so it should be safe to import as UTF-8 on any database. Or generate such file yourself.

Great! Thanks for your help and feedback.

Hi,

I've the same issue on latest version downloaded today. I selected UTF8 in the dialog box.
Can someone assist please

@roughed this is an old and closed thread. Please file a new issue, but first make sure you are using the latest HeidiSQL build. In that new issue, it may be helpful to see some details about your loaded file, e.g. the encoding, size and the charset of the target tables.

Also having the same issue. This is not fixed

@Nottt this is a closed issue - if you really have this issue with the latest HeidiSQL build, then please attach a sample file here.

@Nottt make sure to use the same database collation/charset.

@ansgarbecker Maybe an idea here would be, to make the error message here a little bit more easy to understand?

Like in the case where a user is trying to import a database from one collation/charset into another collation/charset - he will get the error stated in the issue's title. Maybe pointing out that the charset/collation of the target database needs to be the same as the one used for the export would help?

the error is still observed in the build 11.0.0.6096, please reopen the issue

Please look at the "new" report from @roughed : #1048 .

For me this error was being caused by trying to import a more recent backup from server to my desktop. It was a minor difference...like mariadb 10.4.2 to mariadb 10.4.1 or something like this but still caused issues.

So if someone is having this, make sure the versions of the databases you are running are identical down to the last number

I have the same
