Magisk: Honor View 10 boot crash and erecovery with 20303+

Created on 2 Apr 2020 · 13 Comments · Source: topjohnwu/Magisk

Device: berkeley (Honor View 10)
Android-Version: 9 (BKL-L09C432 9.0.0.233)
LineageOS-Version: 16.0.20200213
Kernel-Version: 4.9.97
Magisk / Manager: 20.3 / 7.5.1 (force encryption, AVB 2.0/dm-verity, Recovery mode)

Installed Magisk v20.4 using each of these methods:

  • direct patch with MM
  • select and patch file with MM -> flashed image with fastboot
  • with TWRP

Every method works, but the device crashes on reboot (to recovery) and boots to eRecovery. I unpacked the patched recovery ramdisks with AIK and compared the ramdisk directories. The only difference in v20.4 is the file "init", which is identical to arm/magiskinit64 from Magisk-v20.4.zip. There are no logs in cache; I have already tried the canary debug version.
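
For anyone who wants to repeat the ramdisk comparison, here is a minimal Python sketch. It is only an illustration and not the tool actually used above. It assumes both ramdisks are already unpacked into two directories; the function names are hypothetical. It compares regular files by content hash and ignores symlinks, permissions and ownership.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map each regular file's path (relative to root) to its content digest."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files
            index[os.path.relpath(full, root)] = file_digest(full)
    return index


def compare_ramdisks(old_root, new_root):
    """Return (only_in_old, only_in_new, changed) as sorted lists of relative paths."""
    old, new = index_tree(old_root), index_tree(new_root)
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return only_old, only_new, changed
```

Calling `compare_ramdisks("ramdisk-20.3", "ramdisk-20.4")` (hypothetical directory names) returns three lists: files present only in the first tree, files present only in the second, and files whose contents differ.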

regression

All 13 comments

Any way to get a last_kmsg / console-ramoops after the reboot to see if it captured the issue?

Got some logs from /sys/fs/pstore:

  • enabled ADB root
  • removed old logs
  • installed v20.4 with MM -> Reboot
  • crash -> forced boot to eRecovery -> Reboot
  • standard boot with no root
  • adb root -> pulled files

console-ramoops-0.zip
dmesg-ramoops-0.zip

Unfortunately, it looks like eRecovery also generates a console-ramoops, so these logs do not appear to be from the same boot as the crash.

Best I can suggest is work through the canaries leading up to 20.4 to try and figure out when the issue started.

https://github.com/topjohnwu/magisk_files/commits/canary
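
Working through the canaries does not have to be linear. If the builds are ordered oldest to newest, the oldest one boots and the newest one crashes, a bisection needs only about log2(n) flash-and-reboot cycles. A minimal Python sketch of the idea follows. The function name `first_bad` and the callback `is_bad` are hypothetical; in practice `is_bad` would be a manual step (flash the build, reboot, note whether it crashes). It also assumes that once a build is bad, every later build is bad too.

```python
def first_bad(builds, is_bad):
    """Return the index of the first build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, builds[0] is known good,
    builds[-1] is known bad, and every build after the first bad one is bad.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return hi
```

For example, `is_bad` could prompt the tester with `input()` after each flash and return `True` when the device crashed.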

The problem starts with version "0dc9f5c3" (20303). I tried another approach to get logs, and now magiskinit shows up in the dmesg log. Maybe this time the log is more useful:

  • replaced eRecovery with TWRP
  • flashed the prepared ramdisk image (canary debug version "0dc9f5c3") with fastboot
  • reboot to recovery
  • crash -> forced boot to twrp
  • adb pull /sys/fs/pstore

console-ramoops-0.zip
dmesg-ramoops-0.zip

Quickly looking at the commits between b39f4075 (20302) on Jan 10, 2020 and 0dc9f5c3 (20303) on Jan 22, 2020, there are very few commits to magiskinit that could be causing the regression.

Mainly these 2 are suspect:
https://github.com/topjohnwu/Magisk/commit/836bfbdd028c19d22e3dbb11a478ea397b7a7ca4
https://github.com/topjohnwu/Magisk/commit/ba55e2bc3288963d093b5fbd88f18c4622e3a43c

CC: @topjohnwu

That dmesg looks like it contains the crash, a kernel panic after sepolicy patching, so the SELinux updates on Jan 20, 2020 could be suspect as well. :+1:

CC: @topjohnwu

@HansGeiz 20405 has a rewrite of all the init logic so might be worth trying again to confirm this is still an issue and give us some fresh logs. :+1:

No change, it keeps crashing.
console-ramoops-0.zip
dmesg-ramoops-0.zip

I can confirm that I observe the same behavior with stable 20.4 on an Honor View 10 running Pixel Experience.

I can confirm the same issue on a Huawei Honor 10 (COL-L29, similar to Honor View 10) with EMUI 9 vendor and AEX Beta 1 P1 custom ROM from openkirin.net. Magisk 20.3 works, while Magisk 20.4 leads to a bootloop (the same with latest canary). Inspecting the commits between Magisk 20.3 and 20.4, I found that b2ddba4cbfd5f589c3b4e74332154845fade40a0 is the latest working version. So commit fb60bea6597700cfd322ad2c7a4cef96822090fd dealing with an SELinux update seems to be the culprit.

I went through the SELinux commits relevant for the above-mentioned update (2.9 -> 3.0) and found that commit dc4e54126bf25dea4d51820922ccd1959be68fbc "libsepol: Make an unknown permission an error in CIL" causes the kernel panic and reboot. Hence, it seems that the Huawei system contains some buggy policies. After reverting this commit I could build a working version of Magisk 20.4 (375ab93ee304a4aeb1e7d906e1272f57b2cf2b44, March 23). It is still possible to revert the harmful SELinux commit in version 3.1 so that even the latest GitHub commit of Magisk (fc67c0195f3d07f7be7ec62107f341b55f7dda3a) can be built with this fix. For some reason I have some trouble with this most recent Magisk version since the phone boots to the original recovery instead of Magisk, but that's most likely a different issue.

I am not sure how to proceed: should the harmful SELinux commit be reverted permanently or should the error thrown by SELinux be handled at a higher level?

@topjohnwu, some definite progress here, and a proposed revert which could resolve this issue. Thoughts?

https://cn.ui.vmall.com/thread-21969811-1-1-7119.html
Using the fixed img file from this thread, the device can boot normally.
