Magisk: Early mount race condition with external sd card

Created on 25 Feb 2020  路  10Comments  路  Source: topjohnwu/Magisk

I'm running magisk on an ODROID device, which boots directly from an external sd card.

Unfortunately, it randomly fails to start up property due to a race condition between magiskinit early mounts and the sd card being fully probed and available (i.e., before /sys/dev/block is fully populated).

I've attached two kernel outputs (from dmesg) for comparison, one where magisk works and one where it doesn't. In the failing one you can see both system and vendor failed to early mount because magiskinit was trying to mount them before the partitions were available.

dmesg_magisk_working.txt

dmesg_magisk_failing.txt

regression

Most helpful comment

Good news! I tested the latest magisk (20419) and it works great! I restarted the device 10 times in a row and it came up fine every time. (Previously with version 20300 it was only working 60% of the time.)

Even when magiskinit starts going before meson-mmc has read the partitions, it somehow always manages to wait for that to happen right after Restoring /init before it gets to Setup vendor, Setup system, etc. One example taken from dmesg below:

[    1.727948] magiskinit: Device info:
[    1.727954] magiskinit: skip_initramfs=[0]
[    1.727957] magiskinit: force_normal_boot=[0]
[    1.727960] magiskinit: slot=[]
[    1.727963] magiskinit: dt_dir=[/proc/device-tree/firmware/android]
[    1.727966] magiskinit: hardware=[odroidn2]
[    1.727969] magiskinit: hardware.platform=[]
[    1.727995] magiskinit: RootFSInit
[    1.728325] magiskinit: Restoring /init
[    1.728509] meson-mmc: sd: rx_tuning_result[3] = 40
[    1.730245] meson-mmc: sd: rx_tuning_result[4] = 40
[    1.730248] meson-mmc: sd: best_win_start =1, best_win_size =4
[    1.730251] meson-mmc: sd: sd_emmc_regs->gclock=0x10000245,sd_emmc_regs->gadjust=0x32000
[    1.730261] sd: new ultra high speed SDR104 SDHC card at address e624
[    1.730264] sd: clock 199999997, 4-bit-bus-width\x0a
[    1.734655] mmcblk1: sd:e624 SM32G 29.7 GiB
[    1.735547] mmcblk1: p1 p2
[    1.736127] meson-mmc: Enter aml_emmc_partition_ops
[    1.736532] meson-mmc: Delete invalid mbr partition part ffffffc0c9318000, part->partno 1
[    1.736609] meson-mmc: Delete invalid mbr partition part ffffffc0c9318400, part->partno 2
[    1.737630] meson-mmc: [mmc_read_partition_tbl] mmc read partition OK!
[    1.737632] meson-mmc: add_emmc_partition
[    1.737758] meson-mmc: [mmcblk1p01]           bootloader  offset 0x000000000001, size 0x00000000077f
[    1.737839] meson-mmc: [mmcblk1p02]                  env  offset 0x000000000780, size 0x000000000080
[    1.737922] meson-mmc: [mmcblk1p03]               ptable  offset 0x000000000800, size 0x000000000008
[    1.738002] meson-mmc: [mmcblk1p04]                 misc  offset 0x000000000808, size 0x000000000008
[    1.738079] meson-mmc: [mmcblk1p05]                 logo  offset 0x000000000810, size 0x000000001000
[    1.738161] meson-mmc: [mmcblk1p06]                 dtbs  offset 0x000000001810, size 0x000000000100
[    1.738262] meson-mmc: [mmcblk1p07]                 boot  offset 0x000000001910, size 0x000000008000
[    1.738342] meson-mmc: [mmcblk1p08]             recovery  offset 0x000000009910, size 0x00000000c000
[    1.738422] meson-mmc: [mmcblk1p09]                cache  offset 0x000000015910, size 0x000000200000
[    1.738503] meson-mmc: [mmcblk1p10]                  odm  offset 0x000000215910, size 0x000000010000
[    1.738605] meson-mmc: [mmcblk1p11]               system  offset 0x000000225910, size 0x000000380000
[    1.738704] meson-mmc: [mmcblk1p12]               vendor  offset 0x0000005a5910, size 0x000000100000
[    1.738785] meson-mmc: [mmcblk1p13]              product  offset 0x0000006a5910, size 0x000000010000
[    1.738869] meson-mmc: [mmcblk1p14]                param  offset 0x0000006b5910, size 0x000000008000
[    1.738951] meson-mmc: [mmcblk1p15]             cri_data  offset 0x0000006bd910, size 0x000000004000
[    1.739052] meson-mmc: [mmcblk1p16]                 data  offset 0x0000006c1910, size 0x0000034b0af0
[    1.739063] card key: card_blk_probe.
[    1.741371] magiskinit: Setup vendor: [vendor] (179, 12)
[    1.746982] EXT4-fs (mmcblk1p12): mounted filesystem without journal. Opts: (null)
[    1.747017] magiskinit: Setup system: [system] (179, 11)
[    1.751778] EXT4-fs (mmcblk1p11): mounted filesystem without journal. Opts: (null)
[    1.751801] magiskinit: Setup product: [product] (179, 13)
[    1.753805] EXT4-fs (mmcblk1p13): mounted filesystem without journal. Opts: (null)
[    1.753828] magiskinit: Setup odm: [odm] (179, 10)
[    1.803995] magiskinit: Setup cache: [cache] (179, 9)
[    1.828305] EXT4-fs (mmcblk1p9): recovery complete
[    1.829433] EXT4-fs (mmcblk1p9): mounted filesystem with ordered data mode. Opts: (null)
[    1.833992] magiskinit: sepol: split policy
[    1.871906] amlogic-new-usb2-v2 ffe09000.usb2phy: ---Set port(0) tuning for host cf(xhci_hub_control)--
[    1.931901] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[    1.940620] magiskinit: cil_add [/system/etc/selinux/plat_sepolicy.cil]
[    1.950362] magiskinit: cil_add [/system/etc/selinux/mapping/28.0.cil]
[    2.014623] magiskinit: cil_add [/vendor/etc/selinux/plat_pub_versioned.cil]
[    2.034109] magiskinit: cil_add [/vendor/etc/selinux/vendor_sepolicy.cil]
[    2.088078] hub 1-1:1.0: USB hub found
[    2.088302] hub 1-1:1.0: 4 ports detected
[    2.192018] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[    2.216122] hub 2-1:1.0: USB hub found
[    2.216417] hub 2-1:1.0: 4 ports detected
[    2.431902] usb 1-1.2: new low-speed USB device number 3 using xhci-hcd
[    2.571176] input: Logitech USB-PS/2 Optical Mouse as /devices/platform/ff500000.dwc3/xhci-hcd.0.auto/usb1/1-1/1-1.2/1-1.2:1.0/0003:046D:C00E.0001/input/input3
[    2.571430] hid-generic 0003:046D:C00E.0001: input,hidraw0: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-xhci-hcd.0.auto-1.2/input0
[    2.691764] magiskinit: Replace [/system/etc/selinux/plat_sepolicy.cil] -> [xxx]
[    2.710994] magiskinit: Remove flash_recovery
[    2.711062] magiskinit: Inject magisk services: [s6gksCTWd] [UnVZCaTpbej] [32FNFEQvSxi1y]
[    2.735893] magiskinit: Unmount [/sys/fs/selinux]
[    2.761572] magiskinit: Unmount [/dev/mnt/persist]
[    2.775959] magiskinit: Unmount [/odm]
[    2.800129] magiskinit: Unmount [/product]
[    2.816938] magiskinit: Unmount [/system]
[    2.832766] magiskinit: Unmount [/vendor]
[    2.867956] magiskinit: Unmount [/proc]
[    2.892759] magiskinit: Unmount [/sys]
[    2.900842] init: init first stage started!
[    2.901124] init: Using Android DT directory /proc/device-tree/firmware/android/

Feel free to close this issue now.

All 10 comments

I've worked around the problem for now using a tiny program called sleep_then_init.c that looks like this:

#include <unistd.h>
int main(int argc, char* argv[])
{
    sleep(3);
    execv("/init", argv);
    return 0;
}

I built that for ARM (linking statically), then placed it inside the initrd inside the boot image, then also modified the kernel cmdline in the inside the boot image config to include rdinit=/sleep_then_init.

But it would be nice if magisk could handle this situation itself, e.g., inside collect_devices() in init/mount.cpp if it doesn't have all the needed devices for early boot (system, vendor, product, odm), wait up to 3 seconds for them to existing, checking 10 times a second.

I've worked around the problem for now using a tiny program called sleep_then_init.c that looks like this:

#include <unistd.h>
int main(int argc, char* argv[])
{
    sleep(3);
    execv("/init", argv);
    return 0;
}

I built that for ARM (linking statically), then placed it inside the initrd inside the boot image, then also modified the kernel cmdline in the inside the boot image config to include rdinit=/sleep_then_init.

But it would be nice if magisk could handle this situation itself, e.g., inside collect_devices() in init/mount.cpp if it doesn't have all the needed devices for early boot (system, vendor, product, odm), wait up to 3 seconds for them to existing, checking 10 times a second.

That particular solution might actually solve a lot of problems, eg with gsi not detecting certain partitions or other weird devices

@sshock Any change with the new init logic in 20405?

@osm0sis sorry I didn't even see your message until just now; I haven't tried any newer Magisk since whatever was the latest in Feb (20.3?). Were changes made since then to the logic for early mounts that may have address this?

Yes. Like I said, 20405 has new init logic. Please check with the latest Canary.

Ok, cool; I'll try to test it soon, hopefully sometime next week.

Can someone remind me how to get magiskinit LOGD messages to show up? I already added printk.devkmsg=on to the cmdline inside the boot image config, but for some reason they are still not showing up in dmesg, but I know I had logs working before back in Feb...

Can someone remind me how to get magiskinit LOGD messages to show up? I already added printk.devkmsg=on to the cmdline inside the boot image config, but for some reason they are still not showing up in dmesg, but I know I had logs working before back in Feb...

Nevermind; logs work fine as long as I actually use a debug build of magisk.

Good news! I tested the latest magisk (20419) and it works great! I restarted the device 10 times in a row and it came up fine every time. (Previously with version 20300 it was only working 60% of the time.)

Even when magiskinit starts going before meson-mmc has read the partitions, it somehow always manages to wait for that to happen right after Restoring /init before it gets to Setup vendor, Setup system, etc. One example taken from dmesg below:

[    1.727948] magiskinit: Device info:
[    1.727954] magiskinit: skip_initramfs=[0]
[    1.727957] magiskinit: force_normal_boot=[0]
[    1.727960] magiskinit: slot=[]
[    1.727963] magiskinit: dt_dir=[/proc/device-tree/firmware/android]
[    1.727966] magiskinit: hardware=[odroidn2]
[    1.727969] magiskinit: hardware.platform=[]
[    1.727995] magiskinit: RootFSInit
[    1.728325] magiskinit: Restoring /init
[    1.728509] meson-mmc: sd: rx_tuning_result[3] = 40
[    1.730245] meson-mmc: sd: rx_tuning_result[4] = 40
[    1.730248] meson-mmc: sd: best_win_start =1, best_win_size =4
[    1.730251] meson-mmc: sd: sd_emmc_regs->gclock=0x10000245,sd_emmc_regs->gadjust=0x32000
[    1.730261] sd: new ultra high speed SDR104 SDHC card at address e624
[    1.730264] sd: clock 199999997, 4-bit-bus-width\x0a
[    1.734655] mmcblk1: sd:e624 SM32G 29.7 GiB
[    1.735547] mmcblk1: p1 p2
[    1.736127] meson-mmc: Enter aml_emmc_partition_ops
[    1.736532] meson-mmc: Delete invalid mbr partition part ffffffc0c9318000, part->partno 1
[    1.736609] meson-mmc: Delete invalid mbr partition part ffffffc0c9318400, part->partno 2
[    1.737630] meson-mmc: [mmc_read_partition_tbl] mmc read partition OK!
[    1.737632] meson-mmc: add_emmc_partition
[    1.737758] meson-mmc: [mmcblk1p01]           bootloader  offset 0x000000000001, size 0x00000000077f
[    1.737839] meson-mmc: [mmcblk1p02]                  env  offset 0x000000000780, size 0x000000000080
[    1.737922] meson-mmc: [mmcblk1p03]               ptable  offset 0x000000000800, size 0x000000000008
[    1.738002] meson-mmc: [mmcblk1p04]                 misc  offset 0x000000000808, size 0x000000000008
[    1.738079] meson-mmc: [mmcblk1p05]                 logo  offset 0x000000000810, size 0x000000001000
[    1.738161] meson-mmc: [mmcblk1p06]                 dtbs  offset 0x000000001810, size 0x000000000100
[    1.738262] meson-mmc: [mmcblk1p07]                 boot  offset 0x000000001910, size 0x000000008000
[    1.738342] meson-mmc: [mmcblk1p08]             recovery  offset 0x000000009910, size 0x00000000c000
[    1.738422] meson-mmc: [mmcblk1p09]                cache  offset 0x000000015910, size 0x000000200000
[    1.738503] meson-mmc: [mmcblk1p10]                  odm  offset 0x000000215910, size 0x000000010000
[    1.738605] meson-mmc: [mmcblk1p11]               system  offset 0x000000225910, size 0x000000380000
[    1.738704] meson-mmc: [mmcblk1p12]               vendor  offset 0x0000005a5910, size 0x000000100000
[    1.738785] meson-mmc: [mmcblk1p13]              product  offset 0x0000006a5910, size 0x000000010000
[    1.738869] meson-mmc: [mmcblk1p14]                param  offset 0x0000006b5910, size 0x000000008000
[    1.738951] meson-mmc: [mmcblk1p15]             cri_data  offset 0x0000006bd910, size 0x000000004000
[    1.739052] meson-mmc: [mmcblk1p16]                 data  offset 0x0000006c1910, size 0x0000034b0af0
[    1.739063] card key: card_blk_probe.
[    1.741371] magiskinit: Setup vendor: [vendor] (179, 12)
[    1.746982] EXT4-fs (mmcblk1p12): mounted filesystem without journal. Opts: (null)
[    1.747017] magiskinit: Setup system: [system] (179, 11)
[    1.751778] EXT4-fs (mmcblk1p11): mounted filesystem without journal. Opts: (null)
[    1.751801] magiskinit: Setup product: [product] (179, 13)
[    1.753805] EXT4-fs (mmcblk1p13): mounted filesystem without journal. Opts: (null)
[    1.753828] magiskinit: Setup odm: [odm] (179, 10)
[    1.803995] magiskinit: Setup cache: [cache] (179, 9)
[    1.828305] EXT4-fs (mmcblk1p9): recovery complete
[    1.829433] EXT4-fs (mmcblk1p9): mounted filesystem with ordered data mode. Opts: (null)
[    1.833992] magiskinit: sepol: split policy
[    1.871906] amlogic-new-usb2-v2 ffe09000.usb2phy: ---Set port(0) tuning for host cf(xhci_hub_control)--
[    1.931901] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[    1.940620] magiskinit: cil_add [/system/etc/selinux/plat_sepolicy.cil]
[    1.950362] magiskinit: cil_add [/system/etc/selinux/mapping/28.0.cil]
[    2.014623] magiskinit: cil_add [/vendor/etc/selinux/plat_pub_versioned.cil]
[    2.034109] magiskinit: cil_add [/vendor/etc/selinux/vendor_sepolicy.cil]
[    2.088078] hub 1-1:1.0: USB hub found
[    2.088302] hub 1-1:1.0: 4 ports detected
[    2.192018] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[    2.216122] hub 2-1:1.0: USB hub found
[    2.216417] hub 2-1:1.0: 4 ports detected
[    2.431902] usb 1-1.2: new low-speed USB device number 3 using xhci-hcd
[    2.571176] input: Logitech USB-PS/2 Optical Mouse as /devices/platform/ff500000.dwc3/xhci-hcd.0.auto/usb1/1-1/1-1.2/1-1.2:1.0/0003:046D:C00E.0001/input/input3
[    2.571430] hid-generic 0003:046D:C00E.0001: input,hidraw0: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-xhci-hcd.0.auto-1.2/input0
[    2.691764] magiskinit: Replace [/system/etc/selinux/plat_sepolicy.cil] -> [xxx]
[    2.710994] magiskinit: Remove flash_recovery
[    2.711062] magiskinit: Inject magisk services: [s6gksCTWd] [UnVZCaTpbej] [32FNFEQvSxi1y]
[    2.735893] magiskinit: Unmount [/sys/fs/selinux]
[    2.761572] magiskinit: Unmount [/dev/mnt/persist]
[    2.775959] magiskinit: Unmount [/odm]
[    2.800129] magiskinit: Unmount [/product]
[    2.816938] magiskinit: Unmount [/system]
[    2.832766] magiskinit: Unmount [/vendor]
[    2.867956] magiskinit: Unmount [/proc]
[    2.892759] magiskinit: Unmount [/sys]
[    2.900842] init: init first stage started!
[    2.901124] init: Using Android DT directory /proc/device-tree/firmware/android/

Feel free to close this issue now.

Great news! Let us know if the issue ever recurs. 馃憤

Was this page helpful?
0 / 5 - 0 ratings