Retroarch: [Import] Scanner cannot real dual-layer Wii discs

Created on 8 Apr 2018 · 25Comments · Source: libretro/RetroArch

Description

When attempting to add dual layer Wii discs to the playlist via the Scan Directory/Scan File function, the process will fail to add them to the list due to an unknown bug in the file detection system.

Expected behavior

The serial number would get read from the top of the ISO, recognized by the database, and added to the Wii playlist.

Actual behavior

From what I can tell, we don't get as far as the serial detection. The serial is valid and exists in the database, but typically when you read in a Wii disc you get something like this in the log...

[INFO] Found disk label 'RM3E01'

... but for DL discs that never happens. So presumably this function never trips...

https://github.com/libretro/RetroArch/blob/0ee124582c8f2b60cb71ab2537811dd0aad2fcc3/tasks/task_database.c#L166-L172

... which means this doesn't happen...

https://github.com/libretro/RetroArch/blob/0ee124582c8f2b60cb71ab2537811dd0aad2fcc3/tasks/task_database.c#L603

... so for some reason these files in particular don't get iterated on by the database task. The only significant difference between SL and DL for ISO files is the size; SL discs are 4699979776 bytes while DL discs are 8511160320 bytes. Perhaps the file is just too big...? Unfortunately since I'm using the Flatpak version I can't step through and see exactly where things go wrong.

Steps to reproduce the bug

Import > Scan File...
Select Wii DL ISO, should attempt to scan and say "done"
No new entry!

Tested files are Metroid Prime Trilogy, Super Smash Bros. Brawl, Metroid: Other M, and Xenoblade Chronicles.

Version/Commit

RetroArch: Flatpak 1.7.1

Environment information

OS: Fedora 27 x86_64

Source

flibitijibibo

Most helpful comment

OK, I successfully converted this into a standalone program @flibitijibibo -

go to samples/tasks/database, and run make.

Basic usage -

$ ./database_task.exe

Usage: database_task.exe {database dir} {core dir} {core info dir} {input dir} {playlist dir}

database dir - path to the RDB database directory
core dir - path to the libretro core directory
core info dir - path to the libretro core info file directory
input dir - the directory that you want to scan
playlist dir - the playlist directory - this will be used to save the newly generated playlist file to upon program completion

Here is a test invocation of the program -

$ ./database_task.exe "C:/Games/RetroArch/database/rdb" "C:/Games/RetroArch/cores" "C:/Games/RetroArch/info" "C:/Users/libre/Downloads/Games/Saturn" "C:/Games/RetroArch/playlists"

I turned verbose logging on so that is why it might be slower than usual.

I hope that it will compile and work out of the box on both Windows and Linux/OSX. So far I tested it on Windows and it worked. And it compiled, linked and ran at least on OSX, so fingers crossed.

twinaphex on 10 Apr 2018

👍3

All 25 comments

Could you provide the CRC of the four tests you provided?

RobLoach on 8 Apr 2018

8a4b439d Metroid Prime Trilogy
29da327a Metroid: Other M
854b2889 Super Smash Bros. Brawl
472ffed2 Xenoblade Chronicles

flibitijibibo on 9 Apr 2018

It's actually not only dual layer discs. And has nothing to do with CRCs (my example is a redump file anyway), because for the Wii it's scanning by serial.

It's not reading:

game (
name "Another Code: R - A Journey into Lost Memories (Europe)"
serial "RNOP01"
developer "Cing"
publisher "Nintendo"
releaseyear 2009
releasemonth 6
releaseday 26
users 1
rom (
serial "RNOP01"
)
)

in spite of the data being on the Wii dat and redump dat. It finds the serial and bails out:

[INFO] Comparing with known magic numbers...
[INFO] Found disk label 'RNOP01'
[INFO] Comparing with known magic numbers...
[INFO] Found disk label 'SL2P01'
[INFO] Written to playlist file: /media/i30817/Huggin/Documents/retroarch/playlists/Nintendo - Wii.lpl

So i think there is something wrong with int detect_serial_ascii_game(intfstream_t *fd, char *game_id) at task_database_cue.c. Should be relatively easy to solve.

i30817 on 9 Apr 2018

It actually fails at if (detect_system(fd, &system_name) < 0) called by static int intfstream_get_serial(intfstream_t *fd, char *serial) in task_database.c and then detects a 'gc' disc instead of the wii it should be.

It's detecting at least 'Another Code R' (and several others in my game list) erroneously as gamecube games because detect_system can find a gamecube signature in a wii game. Then it skips the wii detection which assumes the Wii discs have no detected system and enters gamecube detection.

Now, i'm wondering if there isn't a better way to detect Wii discs than 'didn't detect'.

i30817 on 9 Apr 2018

Ok, i have a idea. In task_database_cue.c there is this table:

struct magic_entry
{
   int32_t offset;
   const char *system_name;
   const char *magic;
};

static struct magic_entry MAGIC_NUMBERS[] = {
   { 0,        "ps1",    "\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x02\x00\x02\x00"},
   { 0x838840, "pcecd",  "\x82\xb1\x82\xcc\x83\x76\x83\x8d\x83\x4f\x83\x89\x83\x80\x82\xcc\x92"},
   { 0,        "scd",    "\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x02\x00\x01\x53"},
   { 0x000400, "gc",     "\x00\x01\xC7\x04\x80\x28\x00\x60\x00\x00\x00\x00\x00\x00\x00\x00\x00"},
   { 0,        NULL,     NULL}
};

(ps1, scd and gc start with \00)

The function searching for the signatures uses this code:
if (string_is_equal(MAGIC_NUMBERS[i].magic, magic))

Now imagine that the offset of the disc (that was read into magic) started with \00

string \00 is equal to \00 right?

Basically, don't use string_is_equal to compare fixed size file offsets? Let's see the definition of it on ./libretro-common/include/string/stdstring.h:

static INLINE bool string_is_equal(const char *a, const char *b)
{
   if (!a || !b)
      return false;
   while(*a && (*a == *b))
      a++, b++;
   return (*(const unsigned char*)a - *(const unsigned char*)b) == 0;
}

And a test:

      if(string_is_equal("\x00\x01", "\x00\xff")){
        printf("WRONG!\n");
      }

Prints WRONG! because the 'string' ie: the characters until the first inequality or \00, are indeed equal and their subtraction == to 0. Needless to say, this should check the size of the magic bytes...

the only reason it's not saying it's a 'scd' game instead of gamecube is that offset zero of the iso starts with the serial. The only reason not all wii isos aren't 'gc' is if you get lucky and the ofset of the gc magic starts on a non-null.
I'm actually surprised that ps1 serial search works because it appears it should just consume everything that starts with \00. Maybe i'm missing something.
edit: ps1 mystery is 'resolved'.... apparently things do get falsely detected as ps1, but when it tries to parse the actual serial it fails and then falls-back to checksum checks.

There isn't actually a string_is_equal(const char * a, const char * b, size_t size) on ./libretro-common/include/string/stdstring.h to fix this yet.

I'd also check other uses of string_is_equal to see if someone screwed up on its use on other places.

i30817 on 9 Apr 2018

BTW segacd 'search' is fragile too, because the offset is only taking into account 'MODE1/2352' cds.

MODE1/2048 (ie: iso) is also a perfectly viable format even on a cue/iso and it's often used for translations or hacks of segacd or saturn or dreamcast (with the tracks separated so needs a cue).

You should find a better way to figure out cds for scd because of this, something like this maybe:

static struct magic_entry MAGIC_NUMBERS[] = {
   { 0,        "ps1",    "\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x02\x00\x02\x00"},
   { 0x838840, "pcecd",  "\x82\xb1\x82\xcc\x83\x76\x83\x8d\x83\x4f\x83\x89\x83\x80\x82\xcc\x92"},
   { 0,        "scd",    "\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x02\x00\x01\x53"},
   { 0,        "scd",    "\x53\x45\x47\x41\x44\x49\x53\x43"}, //ie: segadisc
   { 0x000400, "gc",     "\x00\x01\xC7\x04\x80\x28\x00\x60\x00\x00\x00\x00\x00\x00\x00\x00\x00"},
   { 0,        NULL,     NULL}
};

ps1 doesn't need this because to convert to iso as bchunk says:

-p PSX mode for MODE2/2352: write 2336 bytes from offset 24
(default MODE2/2352 mode writes 2048 bytes from offset 24)

ie: mode2/2352 cds always keep the first 24 bytes when converting to iso, it's only MODE1/2352 (which is the redump format for scd, dreamcast and saturn) that have to worry about getting the first 16 bytes cut off

I'm also surprised you don't try to search for serials from saturn or dreamcast.

i30817 on 9 Apr 2018

Replacing that if by if (memcmp(MAGIC_NUMBERS[i].magic, magic, MAGIC_LEN) == 0) is not enough apparently. Wii discs get no system (as they're supposed to) but several still don't parse. That some parse indicates there is a fallback somewhere above in the callers but i haven't figured it out yet.

i30817 on 9 Apr 2018

Here's a question: How hard is it to isolate the database task into a separate program for the purposes of unit testing? Normally I'm not a TDD guy but if there's a way for me to just load up a filename and send it straight to the task without loading up all of RA I'd totally be up for stepping through this line-by-line. (Gotta do something while my REDACTED devkit is in limbo...)

(Also argh, that strcmp - if nothing else comes out of this, let's at least turn that into a memcmp!)

flibitijibibo on 9 Apr 2018

How hard is it to isolate the database task into a separate program for the purposes of unit testing?

I'm totally a TDD guy, and while I haven't quite been able to set up an easy testing environment for RetroArch's database tasks, there is libretro-db/c_converter.c and the other libretro-db tools, which allow interacting with the .rdb files.

In the RetroArch code, I've found good ol' RARCH_LOG() logging the most helpful :cry:

Also argh, that strcmp

Haha, yes... I miss std::string, and C#'s String Class a lot when working on legacy-C codebases.

RobLoach on 9 Apr 2018

Here's a question: How hard is it to isolate the database task into a separate program for the purposes of unit testing?

Guess we'd need to make a standalone program then that still implements the task system. This could actually be feasible since I think the task-based code is fairly well compartmentalized from the rest of the RetroArch code.

twinaphex on 9 Apr 2018

This could actually be feasible since I think the task-based code is fairly well compartmentalized from the rest of the RetroArch code.

Definitely! I did notice that the whole task was isolated in its own world in the source, but I wasn't sure that it could build as a lazy a.out file right away. If anyone has tried this, I'd be open to trying a ghetto Makefile to test this thoroughly.

Haha, yes... I miss std::string, and C#'s String Class a lot when working on legacy-C codebases.

Maybe it's all the FNA experience, but any time I can replace "string" classes with plain ol' C operations is welcome to me >_<

flibitijibibo on 9 Apr 2018

👍1

I will try to take some stabs at making the database scanning a standalone program as well using just the plain libretro-common/RetroArch code parts. You will need to be running at least a shell in order to use it though, it will be a commandline program only.

twinaphex on 9 Apr 2018

👍3

That sounds great! If this works out just ping me when the commits are in and I'll tear through this ASAP.

flibitijibibo on 9 Apr 2018

BTW, import of wii dual layer discs fails at a filesize check when searching for serial here:

https://github.com/libretro/RetroArch/blob/master/tasks/task_database.c#L226

at a guess it would also fail in the exact same check here (for crc):

https://github.com/libretro/RetroArch/blob/master/tasks/task_database.c#L378

it seems that 'intfstream_tell' return result has to accommodate really large results. And the rest of the interface probably. Unless you feel like making a specific stream just for dual layer dvds.

i30817 on 10 Apr 2018

OK, I successfully converted this into a standalone program @flibitijibibo -

go to samples/tasks/database, and run make.

Basic usage -

$ ./database_task.exe

Usage: database_task.exe {database dir} {core dir} {core info dir} {input dir} {playlist dir}

Here is a test invocation of the program -

$ ./database_task.exe "C:/Games/RetroArch/database/rdb" "C:/Games/RetroArch/cores" "C:/Games/RetroArch/info" "C:/Users/libre/Downloads/Games/Saturn" "C:/Games/RetroArch/playlists"

I turned verbose logging on so that is why it might be slower than usual.

I hope that it will compile and work out of the box on both Windows and Linux/OSX. So far I tested it on Windows and it worked. And it compiled, linked and ran at least on OSX, so fingers crossed.

twinaphex on 10 Apr 2018

👍3

@i30817 Can you try this patch? http://p.0bl.net/182560

bparker06 on 10 Apr 2018

Well, good news and bad news.

Good news is that it made the Last Story (one of the two 'dual layer' discs i have) scan. Bad news is that it didn't work for Metroid Prime Trilogy or the 'Another Code R' (this one isn't dual layer).

Seems like there are other bugs interfering. I'll add the memcmp fix back and see if that's it.
edit: nope.

In master none of the 3 scan (and it isn't that other problem of the the dots or they not being redump either because their filenames or paths have no dots and they're verified redump).

So there is at least one more bug in addition to the dot, the memcmp and this 64 bit file offset thing.

i30817 on 10 Apr 2018

I should have mentioned that I edited the post earlier with an updated patch, maybe just double check that you're using the same one that's currently in the post. Otherwise, not sure what else is wrong at the moment.

bparker06 on 10 Apr 2018

Your patch should still be applied @bparker06, since it fixes the scan of at least one game, you should open a PR. Also could you please add a commit for the memcmp fix too?

edit: you could add my open commit about the scanner missing directories with '.' in the name too.

i30817 on 10 Apr 2018

edit: groaaan. Try to apply the patch, delete your database/rdb directory and update the database again, because there was bug about that @flibitijibibo

That fixed the other problems for me (along with the dot PR because one of the games has '(v1.01)' in the name of its directory). I simply had a outdated database.

At least two bug fixes came from this wild goose chase.

i30817 on 10 Apr 2018

Stepped through and can confirm @bparker06's patch resolves this issue, the file was indeed too large!

flibitijibibo on 10 Apr 2018

I'm going to close this now if the current nightly indeed resolves this.

twinaphex on 24 Apr 2018

Do you still happen to have the patch from http://p.0bl.net/182560 ? Unsure if it was pushed up in a PR.

RobLoach on 24 Apr 2018

Filesize is int64_t in my link above now so i'd expect so (though maybe not a PR and just pushed directly). In fact, git blame from github interface makes it easy to see the commit: https://github.com/libretro/RetroArch/commit/1751f4a0afbc1c2ab21977413aac983a83279221

i30817 on 24 Apr 2018

Oh great! Glad it got pushed. Thanks a lot, Twin, i3 + flibitijibibo.

RobLoach on 24 Apr 2018

👍1

Was this page helpful?

0 / 5 - 0 ratings