Brew: Change Hardware.oldest_cpu to something newer

Created on 19 Dec 2018 · 33 Comments · Source: Homebrew/brew

See this question https://discourse.brew.sh/t/forcing-a-specific-architecture-for-non-core-bottles/3676

Assuming the goal is to run on anything that supported macOS releases run on, that (Penryn) makes sense. But if the idea is to run on ‘supported’ hardware, then sandybridge seems like it might be a better default for brew, if I’m reading https://support.apple.com/en-us/HT201624 and the corresponding Intel generations correctly.

outdated question

All 33 comments

I'd be in favour of changing Hardware.oldest_cpu from -march=core2 to something newer. The author suggests -march=sandybridge.
-march=core2 aka -march=penryn with SSE 4.1 is circa 2007, -march=nehalem with SSE 4.2 is circa 2008, and -march=sandybridge with SSE 4.2 is circa 2011.
See https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures

@sjackman:

     I'd suggest gathering analytics on what hardware users are still running Homebrew on first.

We only bottle for the last three releases of macOS. What are the minimum CPU requirements of macOS 10.12 Sierra?

Ask and Wikipedia shall answer. https://en.wikipedia.org/wiki/MacOS_Sierra#System_requirements

iMac: Late 2009 or newer
MacBook and MacBook 12-inch: Late 2009 or newer
MacBook Pro: Mid 2010 or newer
MacBook Air: Late 2010 or newer
Mac Mini: Mid 2010 or newer
Mac Pro: Mid 2010 or newer
Xserve is no longer compatible.

Based on that, it seems like -march=nehalem -msse4.2 would be a safe bet.
https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures

AIUI Nehalem was supported at release, but isn’t any more (e.g. no security updates).

Do you know the oldest CPU architecture still supported by Apple?

So, it depends what you mean by “supported”.

Penryn: if you’re in a few specific regions (including Turkey and CA)

Nehalem: you can install a recent macOS, but may not get updates

Sandy bridge: fully supported

This is based on an hour or so of research yesterday (to figure out what 10x HHVM performance would cost in terms of supported hardware) - given Apple operate by model year this was a lot of manual cross-referencing though; so I might have missed something.

I’m 70% confident sandybridge is fine as a minimum, and > 95% confident in Nehalem, on the basis of “do Apple support it”, not “do Homebrew users use it”.

Is CA California, Canada, or other? (just curious)

Thanks for the research, @fredemmott ! That's helpful.

@MikeMcQuaid Do you have an opinion of the oldest CPU Homebrew ought to support for bottles?
I feel like it would be okay to say Nehalem with SSE 4.2 (or perhaps Sandybridge), and if you have older, you'll need to build from source.

     Oh, wait; if this is only for bottles, then never mind…

Thanks for the research @fredemmott (and hope you're well, it's been a while).

@sjackman I think it's going to be too simplistic to simply update this as a single value. The portable-ruby is built on 10.5, and we may have users we don't know about still building bottles for old versions of macOS; we want to avoid actively breaking their setup.

My recommendation is that we tweak the logic in: https://github.com/Homebrew/brew/blob/a2b4d4aac294ec0872cd84d44f65bd5bd507e166/Library/Homebrew/hardware.rb#L146-L163

Instead, it'd be nice to add to the existing logic and set the oldest supported CPU for each macOS release. As doing this for every release might be a bit of a pain, having 10.5 remain on core2 and having 10.12/13/14 each have their own value would get @fredemmott what he wants (I think) and make our bottles a bit more optimised in general without breaking old versions.

Thoughts?
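
A minimal sketch of the per-release idea described above, written as standalone Ruby rather than as a patch to hardware.rb. The constant and method names (OLDEST_CPU_BY_MACOS, DEFAULT_OLDEST_CPU, oldest_cpu_for, march_flag_for) are hypothetical, and the version-to-CPU values are placeholders for illustration, not settled decisions:

```ruby
# Hypothetical sketch: this is NOT Homebrew's actual hardware.rb logic.
# Each macOS release may name its own oldest supported CPU family;
# releases without an entry fall back to the conservative default.
OLDEST_CPU_BY_MACOS = {
  "10.14" => :nehalem, # placeholder value, assumed for illustration only
}.freeze

DEFAULT_OLDEST_CPU = :core2

# Accepts "10.14" or "10.14.2" and looks up the major.minor release.
def oldest_cpu_for(macos_version)
  release = macos_version.split(".").first(2).join(".")
  OLDEST_CPU_BY_MACOS.fetch(release, DEFAULT_OLDEST_CPU)
end

# Builds the compiler flag that a bottle build for that release would use.
def march_flag_for(macos_version)
  "-march=#{oldest_cpu_for(macos_version)}"
end

puts march_flag_for("10.5")    # -march=core2
puts march_flag_for("10.14.2") # -march=nehalem
```

Keeping the fallback explicit means old releases such as 10.5 keep their current behaviour unless someone deliberately adds an entry for them.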

Penryn: if you’re in a few specific regions (including Turkey and CA)

@fredemmott Allow me to fix some miscommunication that may be involved here.

Penryn is supported worldwide.

As @sjackman pointed out, Sierra fully supports the Mid 2010 Mac mini, a Penryn-based model. Even High Sierra still supports the Mid 2010 Mac mini.

(Full disclosure: I own one, and rely on it daily. It serves as a replacement for my now-decommissioned Time Capsule.)

Where I feel we should draw the line

When it comes to _decisions for which there is no real maintenance cost on the table,_ I’d suggest for Homebrew to be _as inclusive as possible._ I believe that maintenance cost is a healthy line to draw. For example, I’ve been in favour of removing formula options because they’d introduce additional failure modes, causing additional maintenance burden. Regarding bottles for vintage microarchitectures: I just fail to see the user story here. Feel free to correct me on that; maybe I’m overlooking some important point here.

The cost of building from source

About building from source: it may _sound_ tempting and completely reasonable to say, “hey, why not let those 0.1 % of users build from source and, at last, have the other 99.9 % of users enjoy the benefits of the latest SSE 2 Electric Boogaloo instruction set?” What makes me feel uneasy about that kind of argument is that _building_ a piece of software is inherently a lot more complex than pouring a pre-built binary. Most people reading this comment have more knowledge than I do about building formulas, but you may agree that building stuff can be a complex process under the covers. Building a formula on a vintage platform like Penryn is also largely untested. _Building a formula can fail in ways where pouring a binary wouldn’t._ I admit that I don’t have any hard data for that. But then again, I’ve been there, and I’ve had my share of builds that broke due to some faulty assumption, overlooked edge case, or missing test coverage.

I’d even be OK for _my own_ 2010 Mac mini to no longer have bottles from now on. I’m OK with brew upgrade taking two hours from now on due to building from source. I’m also OK with brew upgrade failing half-way; I do feel comfortable with uncovering and fixing errors, and I love to learn new things. I’m not OK to have Homebrew blow up in a random user’s face due to a formula build failing on their vintage microarchitecture and there’s no bottle.

My recommendation is that we tweak the logic in

@MikeMcQuaid I didn’t know we even had that logic 😱
I’m amazed how we haven’t had lots of unreproducible issues because of that.

make our bottles a bit more optimised in general without breaking old versions.

I strongly oppose. This is going to span a matrix of variance not covered by tests. Maybe I’m a little out of the loop but I don’t recall where we’ve ever had an issue with a bottle being not optimized (whatever that means). And even if there’s been the occasional issue: any user who _really really has a case_ for optimized builds of a specific formula is welcome to maintain a tap to tweak their formula and have it build from source.

A personal observation: Whenever I venture out to have a look at forums and communities who love to optimize (e. g. Hackintosh, overclockers, eGPU enthusiasts, you name it), I’m always fascinated as to what extra lengths they’re going, including building kernel modules to support their case. Listening to what they say sometimes makes me feel there are users who love to tweak, not as a means to an end but sometimes _for the sake of it._ Those users certainly won’t object to maintaining their own tap with extremely tweaked clang parameters so they can make use of SSE 2: Electric Boogaloo. _Let’s not deprive enthusiast users of their tweaking._ 😉

(I own an eGPU for my Mac btw 🙈 I’m so lucky that it’s worked out of the box! _Not ever gonna touch the damn thing_ as long as it works 😂 but I digress.)

https://support.apple.com/en-us/HT201624 is my source for considering Penryn unsupported: that page does list the 2010 Mac mini as “obsolete”.

The bar I’m using for “supported” is “does Apple support it?”, not “does it work” or “do people use it”. That might not be the right bar for Homebrew.

That said:

  • AFAICT Mojave never supported Penryn?
  • this can be /significant/. I started looking into this from “why is the bottle build unusably slow compared to building the same tag from brew sh?” - the operation I was doing is 3s or so with a sandy bridge build, and 58s with a Penryn build on the same hardware. This is a real-world test (specifically, “run a source linter on this project I’m working on”), not a synthetic benchmark. I’m aware that this is a rare extreme case, but there are real reasons to bump it.

The bar I’m using for “supported” is “does Apple support it?”, not “does it work”, or “do people use it”.

@fredemmott Allow me to clarify. In Apple’s support documents, there are generally two separate notions of “does Apple support it?”. One is for fixing broken devices; the other is for software support. The source cited in your comment refers to Apple being able and committed to have spare parts for the device, and (possibly for a fee) to fix your device when it breaks. In that context, Apple no longer supports Penryn models in most countries. I’d argue that this should be of no major concern for Homebrew.

The second notion is, “assuming a device without any hardware faults, does Apple commit to keeping a specific major macOS version running for this particular Mac model, including minor updates and security fixes, during that major macOS version’s lifetime?” The answer to this question depends on the OS major version, and can be found in the articles linked above. In that context, Apple supports Penryn models worldwide.

Mojave never supported Penryn?

You are correct; however, Apple is committed to keep Sierra and High Sierra compatible as stated in their articles. As far as Homebrew is concerned, both OSes are relevant, and will remain so as long as Apple supports them.

the operation I was doing is 3s or so with a sandy bridge build, and 58s with a Penryn build on the same hardware.

Whoa, good point! Would you be willing to condense your source into a minimal, self-contained example where the difference still shows, and post your example e. g. as a Gist (without disclosing any sensitive information)? I’d deem a 5 – 10 % improvement significant enough already to change my mind about those Penryn bottles; however, I’d still like to:

  1. know at least one reproducible, documented case, and

  2. rule out possible upstream errors in the build chain

before we make a decision to abandon bottles for legacy, but perfectly supported, hardware _over the entire formula space._

Thanks

The answer to this question depends on the OS major version, and can be found in the articles linked above. In that context, Apple supports Penryn models worldwide.

I can't find a statement from apple on their software lifecycle and I was assuming it matches the hardware - but I guess I can't really argue that's more likely than "os is supported and os did support it" :)

Mojave never supported Penryn?

Sorry for not mentioning, I was leading here towards Mike's suggestion of targeting different architectures for different builds.

Would you be willing to condense your source into a minimal, self contained example where the difference still shows, and post your example e. g. as a Gist (without disclosing any sensitive information)?

I don't think I'll be able to get this to a true minimal self-contained example in a reasonable amount of time - this involves a JIT with some inline __ASM__ helper functions, and Apple's profiling tools seem to get a bit confused by it (e.g. they say that the time is spent in the MySQL client, and this example does not involve talking to MySQL). That said, I do have a "run these commands" reliable repro:

So.... if you're happy to install random bottles from the internet, comparing
https://dl.hhvm.com/homebrew-bottles/hhvm-3.30.0.high_sierra.bottle.tar.gz and
https://dl.hhvm.com/homebrew-bottles/hhvm-3.30.1.high_sierra.bottle.tar.gz (brew tap hhvm/hhvm; brew install hhvm will get you the latter - I don't know how to install a specific bottle). 3.30.0 and 3.30.1 have two differences:

  • -march default vs sandybridge
  • an irrelevant-for-these-purposes security update in the memcached client code

If you want to build from source (this takes hours unless you're on a very modern machine or Mac Pro): brew tap hhvm/hhvm; HOMEBREW_OPTFLAGS="-march=core2" brew install --build-from-source hhvm vs -march=sandybridge

Once you have comparable binaries:

git clone git@github.com:hhvm/hack-router.git
wget https://getcomposer.org/download/1.8.0/composer.phar # this is the package manager for PHP code. We're also using it for Hack
cd hack-router
hhvm composer.phar install
hhvm vendor/bin/hhast-lint src/uri-patterns/ # this takes 3 vs 58s

While we do have some hot code with assembly implementations that require SSE4.2 and hardware CRC support, with alternative C++ implementations, I would be amazed if they're actually the cause - especially as I'm not seeing this with our Debian or Ubuntu packages which use debhelper's default flags. I'm /guessing/ that we're hitting an unoptimized path in clang (our linux builds use GCC 5 through GCC 7 depending on the distribution), or possibly just that the sum of 5 years of "we get 0.x% performance improvement if we do this change" is triggering a pathological case when the assumptions are slightly off.

HOMEBREW_OPTFLAGS="-march=core2" brew install --build-from-source hhvm

I don't believe setting HOMEBREW_OPTFLAGS in the environment will work as you desire. You can, however, run brew edit hhvm and set ENV["HOMEBREW_OPTFLAGS"] = "-march=…", which should work as desired.
You can get -march=native with brew install --build-from-source hhvm, and you can get -march=core2 with brew install --build-bottle hhvm.

I don't believe setting HOMEBREW_OPTFLAGS in the environment will work as you desire.

IIRC brew should pass through any environment variable as long as it’s prefixed with HOMEBREW_.

it's out of my command history, but I'm pretty sure that's how I did it; at the time, I confirmed it was working with ps aux | grep clang | grep -v shim and saw the flags.

Sierra is the first version of macOS since OS X Mountain Lion, released in 2012, that does not run on all computers that the previous version supported.[6] Developers have created workarounds to install macOS Sierra on some Mac computers that are no longer officially supported as long as they are packed with a CPU that supports SSE4.1.[7]

https://en.wikipedia.org/wiki/MacOS_Sierra#System_requirements

As @claui said, Sierra and High Sierra can run on a Penryn SSE 4.1 system.

I don't have a definitive reference for this yet, but it sounds to me as though Mojave requires at least Nehalem with SSE 4.2 to run, because the new Metal graphics engine requires it. So it sounds as though the default optimization flags for Mojave could be changed to -march=nehalem without any ill effects on any user, since the OS requires Nehalem.

Best ref I got so far is https://en.wikipedia.org/wiki/MacOS_Mojave#System_requirements and

"Mojave installs the same way as High Sierra. The only difference is that Mojave now requires CPU instructions that were introduced with Nehalem, so you will need to add CPU feature flags for ssse3, sse4.2, and popcnt to avoid Illegal Instruction crashes in the graphics subsystem after boot is complete (causing the top menu bar to flash on and off, and Finder to crash on open)."

https://forums.macrumors.com/threads/macos-10-14-mojave-on-unsupported-macs-thread.2121473/page-35

Does -march=nehalem also enable -msse4.2 by default?

I don't believe setting HOMEBREW_OPTFLAGS in the environment will work as you desire.

IIRC brew should pass through any environment variable as long as it’s prefixed with HOMEBREW_.

It will be passed through, but the value of HOMEBREW_OPTFLAGS will be overwritten/overridden by Superenv, discarding the value specified by the user. See https://github.com/Homebrew/brew/blob/a2b4d4aac294ec0872cd84d44f65bd5bd507e166/Library/Homebrew/extend/ENV/super.rb#L54
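
To make the pass-through versus overwrite distinction concrete, here is a toy model in plain Ruby. It is not the code in super.rb; the method name setup_build_environment and the hash-based environment are assumptions made purely for illustration:

```ruby
# Toy model, NOT Homebrew's Superenv implementation.
# It only illustrates "passed through, then overwritten".
def setup_build_environment(user_env, computed_optflags)
  env = user_env.dup                           # HOMEBREW_* variables are passed through...
  env["HOMEBREW_OPTFLAGS"] = computed_optflags # ...but this one is assigned unconditionally
  env
end

user_env = { "HOMEBREW_OPTFLAGS" => "-march=core2", "HOMEBREW_VERBOSE" => "1" }
build_env = setup_build_environment(user_env, "-march=native")

puts build_env["HOMEBREW_OPTFLAGS"] # -march=native (the user's value is discarded)
puts build_env["HOMEBREW_VERBOSE"]  # 1 (untouched variables survive)
```

In this model the user's value never reaches the compiler, which matches the behaviour described above even though the variable itself is not filtered out.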

it's out of my command history, but I'm pretty sure that's how I did it; at the time, I confirmed it was working with ps aux | grep clang | grep -v shim and saw the flags.

Check the contents of the .cc files found in $HOMEBREW_LOGS/hhvm.

My recommendation is that we tweak the logic in

@MikeMcQuaid I didn’t know we even had that logic 😱
I’m amazed how we haven’t had lots of unreproducible issues because of that.

Could you please elaborate? Bottles are built with -march=core2, and from-source builds are built with -march=native. Is it the latter that you feel would result in unreproducible issues?

Might have been via brew sh rather than brew install

gcc called with: -O2 -fno-strict-aliasing -fwrapv -D_FILE_OFFSET_BITS=64 -D_REENTRANT -c -g -DUSE_UNISTD=1 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0 -I/usr/local/include -I/usr/local/opt/sqlite/include -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/heap -pthread -DHH_BUILD_ID=hh-dc594f85facb75ddef968be77ada88e53352d02c -DHH_BUILD_TIMESTAMP=1545154812ul -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/caml -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/shadow_stdlib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/sexplib0 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/md5 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/fieldslib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_compare/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_sexp_conv/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/variantslib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/bin_prot/shape -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/bin_prot -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_hash/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_inline_test/config -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_inline_test/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/core_kernel/base_for_tests -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/jane-street-headers -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_assert/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_bench/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/common -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/config -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/collector -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/parsexp -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/sexplib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/typerep -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/splittable_random -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/stdio -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/core_kernel -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/result -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/visitors -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_deriving -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/third-party/ocaml/build/lib/ocaml utils/get_build_id.c
  superenv removed:  -O2 -g -I/usr/local/include
  superenv added:    -pipe -w -Os -march=sandybridge -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk --sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -isystem/usr/local/include -isystem/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/OpenGL.framework/Versions/Current/Headers -I/usr/local/opt/icu4c/include -I/usr/local/opt/gettext/include -I/usr/local/opt/imagemagick@6/include -I/usr/local/opt/openssl/include -I/usr/local/opt/readline/include -I/usr/local/opt/libxml2/include
  superenv executed: clang -pipe -w -Os -march=sandybridge -fno-strict-aliasing -fwrapv -D_FILE_OFFSET_BITS=64 -D_REENTRANT -c -DUSE_UNISTD=1 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0 -I/usr/local/opt/sqlite/include -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/heap -pthread -DHH_BUILD_ID=hh-dc594f85facb75ddef968be77ada88e53352d02c -DHH_BUILD_TIMESTAMP=1545154812ul -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/caml -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/shadow_stdlib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/sexplib0 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/base/md5 -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/fieldslib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_compare/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_sexp_conv/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/variantslib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/bin_prot/shape -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/bin_prot -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_hash/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_inline_test/config -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_inline_test/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/core_kernel/base_for_tests -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/jane-street-headers -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_assert/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_bench/runtime-lib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/common -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/config -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_expect/collector -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/parsexp -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/sexplib -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/typerep -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/splittable_random -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/stdio -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/core_kernel -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/result -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/visitors -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/hphp/hack/src/_build/.opam/system/lib/ppx_deriving -I/tmp/hhvm-20181218-41802-1bdh8j5/hhvm-3.30.0/third-party/ocaml/build/lib/ocaml utils/get_build_id.c -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk --sysroot=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -isystem/usr/local/include -isystem/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/OpenGL.framework/Versions/Current/Headers -I/usr/local/opt/icu4c/include -I/usr/local/opt/gettext/include -I/usr/local/opt/imagemagick@6/include -I/usr/local/opt/openssl/include -I/usr/local/opt/readline/include -I/usr/local/opt/sqlite/include -I/usr/local/opt/libxml2/include

@sjackman I didn’t put much thought into it and I believe I may have overestimated the number of users with a non-64-bit CPU.

I just felt a little scared about not being able to reproduce a non-64-bit user’s build issue if Homebrew pins oldest_cpu to core2 automatically for the rest of us.

I believe all recent versions of macOS require a 64-bit processor (ref anyone?), at least versions of macOS that we bottle for (Sierra and up). Homebrew only builds bottles for 64-bit processors. Apple is deprecating the ability to run 32-bit apps on a 64-bit processor. https://support.apple.com/en-ca/HT208436

Based on @MikeMcQuaid's comment https://github.com/Homebrew/brew/issues/5425#issuecomment-448915835 and the understanding that macOS Mojave requires Nehalem, it sounds as though we'd accept a PR to change the default bottling architecture for Mojave from the current -march=core2 to -march=nehalem -msse4.2. Are you interested in working on such a PR?

$ brew tap brewsci/bio/racon
$ brew install racon
==> Installing racon from brewsci/bio
==> Pouring racon-1.3.1.x86_64_linux.bottle.tar.gz
…
$ time racon …
2975.95user 3.07system 3:14.91elapsed 1528%CPU (0avgtext+0avgdata 626680maxresident)k
$ brew remove racon
$ brew install -s racon
==> Installing brewsci/bio/racon
==> make install
…
$ time racon …
345.05user 3.17system 0:25.61elapsed 1359%CPU (0avgtext+0avgdata 620092maxresident)k

System under test: 64-bit Skylake with AVX-512 Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz

| Optimization | Wall time (s) |
|-----|-----|
| -march=core2 | 194.91 |
| -march=native | 25.61 |

So for this app -march=native is 7.61 times faster than -march=core2. Now I picked this particular app Racon because I knew that optimization for SSE would make a significant difference. It's not the case that all apps see a significant improvement with optimization for SSE. On the other hand, some apps have a marked improvement when optimized for SSE.

Edit: I see the same run time (26.25 s) on a Haswell system with AVX-2 (without AVX-512) with -march=native.
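
The 7.61 figure follows directly from the two "elapsed" fields in the time output above. A short sketch of that arithmetic, where the helper name elapsed_to_seconds is hypothetical:

```ruby
# Convert the "elapsed" field printed by time ("m:ss.ss" or "h:mm:ss") to seconds.
def elapsed_to_seconds(elapsed)
  elapsed.split(":").map(&:to_f).reduce(0.0) { |total, part| total * 60 + part }
end

bottle = elapsed_to_seconds("3:14.91") # poured bottle, built with -march=core2
native = elapsed_to_seconds("0:25.61") # built from source with -march=native

# Ratio of wall-clock times: how many times faster the native build ran.
puts format("%.2f s vs %.2f s: %.2fx faster", bottle, native, bottle / native)
```

Wall-clock time is the relevant number here; the user CPU time (2975.95 vs 345.05) is summed across all threads and gives a somewhat different ratio.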

We’ve got all the data we need for a PR. Let’s hold off further discussion on the abstract and save it for code review.

Does -march=nehalem also enable -msse4.2 by default?

Answer: no. You need to specify -march=nehalem -msse4.2. See the output of…

gcc -march=nehalem --help=target -Q -v | grep msse4
gcc -march=nehalem -msse4.2 --help=target -Q -v | grep msse4

Whenever I venture out to have a look at forums and communities who love to optimize (e. g. Hackintosh, overclockers, eGPU enthusiasts, you name it), I’m always fascinated as to what extra lengths they’re going, including building kernel modules to support their case.

Note we specifically don't support Hackintoshes or hacking versions of macOS onto unsupported hardware. That's also going to be far less than 1% of our users. If the vast majority of users can get drastically increased performance on some applications, I don't think it's worth avoiding that because we speculate there may be issues for unsupported configurations.

It will be passed through, but the value of HOMEBREW_OPTFLAGS will be overwritten/overridden by Superenv, discarding the value specified by the user.

Yes, to be explicit: we intentionally provide no way for end-users to customise CFLAGS etc. without forking Homebrew.

I just felt a little scared about not being able to reproduce a non-64-bit user’s build issue if Homebrew pins oldest_cpu to core2 automatically for the rest of us.

Again I'm afraid I'm not sure I see this being worth the cost of avoiding performance increases for the majority.

I'm afraid I've unintentionally overstated the effects of SSE. It provides a 2-fold speedup (still great), not 7.6-fold. It seems that something was terribly wrong with the version 1.3.1 bottle of racon built 7 months ago. I rebuilt version 1.3.1_1 today, and it's 4 times faster than the 1.3.1 bottle. I'm not entirely sure what changed in that time that would account for such a big difference.

| Test | Time (s) |
|-----------------------------------------|---------|
| Poured bottle 1.3.1 -Os -march=core2 | 194.273 |
| Poured bottle 1.3.1_1 -O2 -march=core2 | 46.117 |
| Build from source -march=core2 | 45.947 |
| Build from source -march=core2 -msse | 45.898 |
| Build from source -march=core2 -msse2 | 45.579 |
| Build from source -march=core2 -msse3 | 47.180 |
| Build from source -march=core2 -msse4.1 | 24.362 |
| Build from source -march=core2 -msse4.2 | 24.514 |
| Build from source -march=core2 -mavx | 24.171 |
| Build from source -march=core2 -mavx2 | 23.458 |

The good news is that for this app, -march=core2 -msse4.1 gives all the same benefits, and I believe macOS Sierra requires SSE 4.1, so we could change the default bottling flags to -march=core2 -msse4.1 for all versions of macOS that we bottle for without impacting users.
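
To see where the step change sits, the timings in the table above can be turned into speedups relative to the plain -march=core2 source build. This is only a sketch of the arithmetic; the constant and method names are hypothetical, and the numbers are copied from a subset of the table rows:

```ruby
# Timings in seconds copied from the table above (builds from source).
BASELINE = 45.947 # -march=core2 with no extra flags
TIMINGS = {
  "-msse3"   => 47.180,
  "-msse4.1" => 24.362,
  "-msse4.2" => 24.514,
  "-mavx2"   => 23.458,
}.freeze

# Speedup relative to the baseline: values above 1.0 mean faster.
def speedup(seconds, baseline = BASELINE)
  baseline / seconds
end

TIMINGS.each do |flag, seconds|
  puts format("-march=core2 %-8s %.2fx", flag, speedup(seconds))
end
```

For this app the jump happens between -msse3 and -msse4.1; the later instruction sets add little on top of that.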

I'm not entirely sure what changed in that time that would account for such a big difference.

The Xcode and thus Clang version.

I believe macOS Sierra requires SSE 4.1, so we could change the default bottling flags to -march=core2 -msse4.1 for all versions of macOS that we bottle for without impacting users.

I'm still unconvinced that literally any users would be impacted by the Nehalem bottling change. That this particular app does not show benefits does not mean others would not.

Let's move discussion to the specific implementation in https://github.com/Homebrew/brew/pull/5429, thanks.
