mono_jit_init on macOS 10.14 has graphics corruption due to mprotect invocation

Created on 25 Sep 2018 · 46 Comments · Source: mono/mono

This issue comes from the investigation of: https://github.com/xamarin/xamarin-macios/issues/4509

There is a regression in macOS 10.14 where if all the following are true for me:

  • Building with Xcode10GM
  • Running on macOS 10.14
  • Launching from Finder or via "open" on the command line
  • In a project with a xib _not_ a storyboard as the main entry point

AppKit starts drawing corrupted graphics.

If any of those preconditions is false (say, launching from Xcode instead of Finder\open), everything appears to draw fine, but sometimes strange drawing behavior creeps in later anyway.

I've tracked it down to a single mprotect call in startup after @lemonmojo did some amazing work and tracked it down to just mono_jit_init ().

I've filed radar://44763485 on this, but mono may need to implement some sort of workaround.

Steps to Reproduce

  1. Unzip MProtectMadness_WithMono.zip
  2. cd MProtectMadness
    2 (a). You may need to open MProtectMadness.xcodeproj and change signing keys
  3. rm -fr build/ && xcodebuild -project MProtectMadness.xcodeproj/
  4. open ./build/Release/MProtectMadness.app

The built in screenshot tool doesn't correctly capture what's on the screen, but you should see a ghost outline of a modal dialog that is not painting correctly.

Hitting escape will break you out of it and let you close the main window \ quit.

A single mono_jit_init call or mprotect with some math I stole from the mono startup sequence both trigger it exactly the same. See USE_MONO define in sample.
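
The sample's exact math is not reproduced in this thread. As a rough sketch only, the following shows the kind of page-alignment arithmetic a guard-page setup typically needs, assuming a downward-growing stack whose guard sits at its low end. The names `page_align_up`, `guard_page_candidate`, and `current_stack_low` are hypothetical and are not taken from mono.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#if defined(__APPLE__)
#include <pthread.h>
#endif

/* Round addr up to the next multiple of page (page must be a power of two). */
uintptr_t page_align_up(uintptr_t addr, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (addr + page - 1) & ~((uintptr_t)page - 1);
}

/* Hypothetical helper: given the lowest address of a downward-growing stack,
 * return the start of the first whole page inside it. That page is the
 * candidate a guard-page scheme would pass to mprotect(). */
uintptr_t guard_page_candidate(uintptr_t stack_low, size_t page)
{
    return page_align_up(stack_low, page);
}

#if defined(__APPLE__)
/* macOS only: lowest address of the calling thread's stack.
 * pthread_get_stackaddr_np() reports the high end of the stack. */
uintptr_t current_stack_low(void)
{
    pthread_t self = pthread_self();
    uintptr_t high = (uintptr_t)pthread_get_stackaddr_np(self);
    return high - pthread_get_stacksize_np(self);
}
#endif
```

On macOS, `guard_page_candidate(current_stack_low(), (size_t)sysconf(_SC_PAGESIZE))` would give an address of the sort discussed here. The sketch deliberately stops short of calling mprotect on it, since that call on memory the process did not map itself is what this issue is about.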

MProtectMadness_WithMono.zip

Current Behavior

Trigger strange AppKit drawing bug.

Expected Behavior

Work around said brokenness

On which platforms did you notice this

[x] macOS
[ ] Linux
[ ] Windows

Version Used:

I've reproduced this on mono 5.12, 5.16, and 5.19 (master 3 commits ago today), and a local mono build at a90249fba1827172fe7b9085f8927b215bab9f09.

area-Runtime os-macOS

All 46 comments

@luhenry could you please take over this issue

http://man7.org/linux/man-pages/man2/mprotect.2.html

mprotect(): POSIX.1-2001, POSIX.1-2008, SVr4. POSIX says that the behavior of mprotect() is unspecified if it is applied to a region of memory that was not obtained via mmap(2).

Isn't that the case here?

Also see https://stackoverflow.com/a/50691513

So, as suspected if I mmap the memory with r/w before calling mprotect, none of the weirdness happens. At least in @chamons repro case. Here's an updated version of that with mmap added in:
MProtectMadnessLM.zip

See this rust code for reference: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/thread.rs#L308

Quick update: Just installed 10.14.1 beta and can still reproduce the issue.

Are we likely to get an update pretty soon for a workaround ?

@chamons @marek-safar @luhenry Any news from you guys or from Apple?

We can disable that mprotect call if needed on osx, the side effect is that stack overflows will not be caught on the main thread.

@vargaz Well, that doesn't sound very desirable. Did you see my comment on mmap'ing the memory before calling mprotect?

@lemonmojo Can you mmap the already allocated stack?

I'm not sure what the effect of that would be. Stack overflow handling is niche functionality which doesn't even work very well, so it's less important than mono apps working on Mojave.

@jaykrell I only tried what's present in the project I linked to: https://github.com/mono/mono/files/2416840/MProtectMadnessLM.zip.
As previously mentioned, the rust guys do something similar: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/thread.rs#L308

I can't reproduce this issue. Given, from what I understood, that it might be related to calling mprotect over memory that was not explicitly mmaped, does this call reversal happen to fix the issue ? @chamons

https://gist.github.com/BrzVlad/dcee172c24cc532a02d887b96cd4704c

I'm working with @BrzVlad now to test the mmap "fix" since he's having trouble reproducing locally (maybe hardware dependent?)

@chamons At some level it's definitely hardware dependent as I've never been able to reproduce it in a VM.

I can however reproduce it now consistently on 3 different machines: iMac Pro, 2016 MBP and 2018 MBP.

Hmm, I can still reproduce it (obviously), but your mmap fix doesn't fix things for me, either with your sample or when adding:

mmap (addr, length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);

to mine.

This is still under active investigation (by the awesome @BrzVlad ) but so far I've only been able to "fix" this by effectively commenting out the mprotect. That makes us lose stack overflow detection, which is obviously :(

@chamons Any news?

I just tested this using Xcode 9.4.1 and can still reproduce some weirdness when the mprotect call is not commented out (even the mmap workaround doesn't work for me in this case).

Here's what this looks like for me with Xcode 9 (note the title bar of the NSAlert):
[Screenshot: monomojavexcode9]

When this happens, Console.app shows the same old Context ID mismatch warning.

So for me this all but confirms that this is not a bug in the SDK but rather the OS itself. This also confirms that it's not related to an alternative code path Apple's taking for apps that are linked to the 10.14 SDK.

So to sum up: Downgrading to Xcode 9 is NOT a workaround unfortunately.

I have heard _zero_ from Apple so far, which is disappointing as you can imagine.

The runtime team is actively working on it, and I've been told there should be a PR today related to it.

@chamons Yes, that is indeed disappointing. At this point I'd suggest opening a TSI. Even if the Apple support engineer can't provide a workaround, it might help promote the radar. I would've requested a TSI already but fear that they would just point to mono instead so I think it makes sense that the TSI is opened by MS.

Note that what the runtime was doing, i.e. mprotect some memory it didn't allocate might not be a supported use case, so it could get broken at any time.

So building mono with the changes from https://github.com/mono/mono/pull/10899 should fix the problem? Or is there anything else required?

@lemonmojo You beat me to the punch by one minute - Yes, please test with that and report back. We believe it should fix things.

Also, I'm sure a TSI would get back a "what you are doing is unsupported, we don't care, stop doing that".

@chamons First tests look good. I wasn't able to reproduce the strangeness with https://github.com/lemonmojo/MonoOnMojaveSimple and your MProtectMadness_WithMono sample using Xcode 10.

Apple just replied to my duplicate of your radar, @chamons. Basically they're saying they can't reproduce it. I've shared some more details, repro instructions and a video. Let's see... How's your radar going?

So what's the timeline for a new mono release that includes the mprotect changes?

You can get it from the nightly build, it'll show up in the next preview of Mono 5.16

@marek-safar Are you sure this is already included in nightly? I just installed the latest build from https://download.mono-project.com/archive/nightly/macos-10-universal/ (MonoFramework-MDK-master@6f8ca69401f-dirty-5.21.0.1.macos10.xamarin.universal.pkg) and after updating the sample projects to reference system mono I can again reproduce the issue.

The fix was merged 2 days ago so it should show up there already

If the PR is really included in this release it doesn't fix the issue for me as I can again reproduce with all repro cases. @chamons Did you test this already?

@lemonmojo That package doesn't contain the fix. The next nightly will.

Apple has not replied to mine yet at all

I didn't test it as you gave it a 👍 and you've had more luck reproducing this than I.

@chamons when could we expect an updated Xamarin.Mac package to be downloaded through VS alpha update channel?

@chamons

Yes, I see that there is a fix, but I also see that it does not seem to solve the problems. So if I understood correctly, it is not a Xamarin.Mac problem but a Mono problem?

@Rogister As I noted here https://github.com/xamarin/xamarin-macios/issues/4509#issuecomment-424704372 and in other places, if you are having issues not described by an existing issue, please file another issue with a detailed description of your issue and steps to reproduce.

And please stop spamming mono/mono issues. Posting that you have an issue, with zero actionable details, multiple times on multiple threads is _not_ going to get you a resolution more quickly.

@DanWBR This fix just landed in master less than 2 days ago, and the cherry picks to the release branches are still hot. Once those land, I will spin up a build with that mono bump and post it on the original issue.

I cannot currently give a timetable for when that will hit the Alpha channel, but trust me, I realize how pressing this issue is.

@chamons

Can you tell when the fix will be available in the Alpha channel or right here?

Thank you.

Alain

I believe this issue is fixed with https://github.com/mono/mono/pull/10899 and the associated cherry-picks.

Please see https://github.com/xamarin/xamarin-macios/issues/4848 for Xamarin.Mac specific information on how to test this fix and report back. Please only comment here in mono/mono if you specifically have issues with mono outside of Xamarin.Mac.

@chamons @vargaz: I installed the mono nightly from 2018-10-07 today and it appears to indeed fix the issue discussed in this thread.

Unfortunately, I'm running into all kinds of other problems with this build. I've seen mono apps (including my own and Visual Studio!) just randomly hang and crash out of nowhere.
One crash I can reproduce consistently and that never happened before the update (I ran mono 5.14 before) is happening when calling the completion handler of a WKWebView delegate method (webView:didReceiveAuthenticationChallenge:completionHandler:).

Here are two different error messages I've seen when calling the completion handler:
mono_coop_mutex_lock Cannot transition thread 0x1106995c0 from STATE_BLOCKING with DO_BLOCKING
Native Crash Log

mono_threads_enter_gc_safe_region_unbalanced Cannot transition thread 0x11341c5c0 from STATE_BLOCKING with DO_BLOCKING
Native Crash Log

Could you by any chance backport the changes to 5.14 so that I can verify if the change itself causes these issues or if it's been some other changes since 5.14?

There are backports to older mono versions above, the packages are available by clicking on the green checkmarks.

@lemonmojo Also, if swapping to an older mono "fixes" it please file bug reports if you can. :)

@vargaz: Not sure where to find those, could you please point me in the right direction? Thx

So 5.14 is "2018-04" in VS release speak so you want this branch:

https://github.com/mono/mono/tree/2018-04

Click commits, find the green checkmark at the last commit, and use pkg-mono.

Or just https://xamjenkinsartifact.azureedge.net/build-package-osx-mono/2018-04/165/969357ac02b2c08a43ef89d98aca550d3648bf00/MonoFramework-MDK-5.14.0.205.macos10.xamarin.universal.pkg

@chamons Perfect, thx a lot, I'll give this a try later.

@chamons 5.14 and 5.16 look good. Have none of the mentioned issues and the issue from this thread appears to be fixed as well.

Something's seriously broken since 5.18 though. Unfortunately I spent so much time debugging the original issue from this thread that I currently can't file separate bug reports for this new issue. Guess I'll be continuing to use 5.14 with the fix for the time being.
