Player can't find files with extended Latin characters?

Created on 20 Mar 2016 · 9 Comments · Source: EasyRPG/Player

It seems like the Player has problems when loading some files with extended Latin characters.
This is what happens in Blue Star Story:

[2016-03-20 18:32:48] Debug: Loading Map Map0121.lmu
[2016-03-20 18:32:48] Debug: Tree: PigVillage < Mapa Swiata
[2016-03-20 18:32:48] Debug: Cannot find: ChipSet/WioskaŚwiniakówBezDrzew (!)
[2016-03-20 18:32:48] Warning: Image not found: ChipSet/WioskaŚwiniakówBezDrzew

RPG_RT can load it just fine.
I've been struggling to figure out whether it's something on my end, the Player, or the game's database.

FileFinder Patch available

All 9 comments

Rename ChipSet/WioskaświniakówBezDrzew to ChipSet/WioskaŚwiniakówBezDrzew (capital Ś)
Marking this as a file finder bug

This would require Unicode case-insensitive string comparison with u_strcasecmp() from ICU (ustring.h) and something equivalent for the iconv fallback.

Or u_strToLower

Windows 7 64-bit, Player64.

Debug: Detected encoding: ibm-5346_P100-1998

I had it set to 1250 in RPG_RT.ini before, changing to 1251 and 1252 didn't do anything. The file actually uses the lower case "ś" character by default.
Does that mean the Player ignores files unless it gets a perfect match with what's in the database?

The Player does a case-insensitive match, but this currently only works for ASCII, not for extended characters.
The matching exists to support platforms that are case sensitive (all except Windows).
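To illustrate why ASCII-only matching misses this file, here is a minimal sketch. `AsciiLower` is a hypothetical stand-in, not the Player's actual `Utils::LowerCase`; it only assumes that lowercasing is applied byte by byte to UTF-8 text.

```cpp
#include <algorithm>
#include <string>

// Hypothetical ASCII-only lowercasing applied byte by byte.
// Bytes >= 0x80, i.e. every byte of a multi-byte UTF-8 sequence,
// pass through unchanged.
std::string AsciiLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + 32)
                                      : static_cast<char>(c);
    });
    return s;
}

// True when two names compare equal after ASCII-only lowercasing.
bool AsciiCaseInsensitiveEqual(const std::string& a, const std::string& b) {
    return AsciiLower(a) == AsciiLower(b);
}
```

"Ś" is U+015A (UTF-8 bytes C5 9A) and "ś" is U+015B (C5 9B). Neither byte is touched by the function above, so the name stored in the database and the name on disk still differ after lowercasing, while a pure ASCII pair such as "ChipSet" and "CHIPSET" matches.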

I have a similar problem with some projects (not all of them) in the Android app.
Some files whose names contain "é", "è", or "à" aren't found.

If I retype the filename in ES Explorer so that it matches exactly, it works.
Hint: when I delete an "é" character, I have to press backspace twice; after the first press it becomes an "e". So I suppose it's a similar problem: encoding.

BlisterBoy, it seems Android does not normalize filenames.
It probably stores them as e + combining grave accent (`) instead of the precomposed è.

In theory this should be solvable by asking ICU to normalize the Unicode. But I never saw this behaviour under Windows, so it's hard to debug, because debugging native code under Android sucks :/
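A small sketch of what "not normalized" means at the byte level. The three-entry table and the name `ComposeKnownSequences` are hypothetical and only cover the characters mentioned in this thread; real normalization (for example NFC via ICU) handles far more cases.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replaces a few decomposed UTF-8 sequences (base letter followed by a
// combining accent) with their precomposed form. Illustration only.
std::string ComposeKnownSequences(const std::string& in) {
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"e\xCC\x81", "\xC3\xA9"},  // e + U+0301 -> U+00E9 (e with acute)
        {"e\xCC\x80", "\xC3\xA8"},  // e + U+0300 -> U+00E8 (e with grave)
        {"a\xCC\x80", "\xC3\xA0"},  // a + U+0300 -> U+00E0 (a with grave)
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool replaced = false;
        for (const auto& entry : table) {
            // compare() clips at the end of the string, so a truncated
            // sequence simply fails to match.
            if (in.compare(i, entry.first.size(), entry.first) == 0) {
                out += entry.second;
                i += entry.first.size();
                replaced = true;
                break;
            }
        }
        if (!replaced) {
            out += in[i++];
        }
    }
    return out;
}
```

A decomposed "é" is three bytes (65 CC 81) while the precomposed one is two (C3 A9), so a plain string comparison between the two forms fails even though they render identically. That would also fit the "press backspace twice" observation: the first press removes only the combining accent.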

carstene1ns proposes a Unicode lowercase solution in gentools: EasyRPG/Tools/pull/22
The same can be used here.
BlisterBoy's issue can be resolved using "unorm2_normalize" https://ssl.icu-project.org/apiref/icu4c/unorm2_8h.html#a0a596802db767da410b4b04cb75cbc53
Both should be added to the FileFinder, too.
Also open for discussion is whether iconv supports this (of course ICU is superior, but some ports use iconv only).
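For ports without ICU, one conceivable stopgap is a hand-written lowercase mapping for the ranges that matter here. The sketch below is an assumption, not something proposed in the thread: `LowerLatin` and `LowerLatinUtf8` are hypothetical names, and only ASCII, Latin-1 Supplement and Latin Extended-A are covered. It does no normalization, so it would not help with decomposed filenames.

```cpp
#include <cstddef>
#include <string>

// Lowercases a single code point in ASCII, Latin-1 Supplement and
// Latin Extended-A. Everything else is returned unchanged.
char32_t LowerLatin(char32_t c) {
    if (c >= U'A' && c <= U'Z') return c + 32;
    // U+00C0..U+00DE are uppercase, except U+00D7 (multiplication sign).
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7) return c + 0x20;
    if (c == 0x178) return 0xFF;  // Y with diaeresis
    if (c == 0x130) return c;     // dotted capital I: special case, left as-is
    // Latin Extended-A pairs alternate; the uppercase parity depends on the range.
    if ((c >= 0x100 && c <= 0x137) || (c >= 0x14A && c <= 0x177))
        return (c % 2 == 0) ? c + 1 : c;
    if ((c >= 0x139 && c <= 0x148) || (c >= 0x179 && c <= 0x17E))
        return (c % 2 == 1) ? c + 1 : c;
    return c;
}

// Applies LowerLatin to ASCII bytes and 2-byte UTF-8 sequences.
// Longer sequences and malformed bytes are copied through untouched.
std::string LowerLatinUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {
            out += static_cast<char>(LowerLatin(b0));
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            const unsigned char b1 = static_cast<unsigned char>(in[i + 1]);
            const char32_t cp = LowerLatin(((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu));
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
            ++i;  // the continuation byte has been consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

With this, "WioskaŚwiniakówBezDrzew" and "WioskaświniakówBezDrzew" lowercase to the same byte string. A table like this has to be maintained by hand and stops at U+017F, which is why ICU remains the better option wherever it is available.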

Possible solution: we would just need to replace all "Utils::LowerCase" calls in FileFinder with ReaderUtil::Normalize. (I put this in ReaderUtil because ICU is linked into liblcf, and it may be useful for other tools that need this.)

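// Lowercases a UTF-8 string, then normalizes it to NFKC (compatibility composition)
std::string ReaderUtil::Normalize(const std::string &str) {
#ifdef LCF_SUPPORT_ICU
    // Requires <unicode/unistr.h> and <unicode/normalizer2.h>.
    // Decode explicitly as UTF-8 and keep a named object, so that toLower()
    // does not hand back a reference into a destroyed temporary.
    icu::UnicodeString uni = icu::UnicodeString::fromUTF8(str);
    uni.toLower();
    UErrorCode err = U_ZERO_ERROR;
    const icu::Normalizer2* norm = icu::Normalizer2::getNFKCInstance(err);
    if (U_FAILURE(err)) {
        return str; // normalizer unavailable: leave the name untouched
    }
    icu::UnicodeString f = norm->normalize(uni, err);
    if (U_FAILURE(err)) {
        return str;
    }
    std::string res;
    f.toUTF8String(res);
    return res;
#else
    // Fallback without ICU: ASCII-only lowercasing (requires <algorithm> and <cctype>)
    std::string result = str;
    std::transform(result.begin(), result.end(), result.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return result;
#endif
}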