Conan: Pivoting on target-cpu/arch

Created on 9 Jan 2017 · 21 comments · Source: conan-io/conan

We have a couple of common CPU generations above the baseline x86_64 instruction set, namely sandybridge and haswell, with AVX and AVX2/BMI/BMI2 respectively.

LLVM-backed languages and GCC 4.9+ all support "x86-64", "sandybridge", "haswell", and "native" for the -march/--target-cpu parameters. GCC 4.8 uses the alternate identifiers corei7-avx and core-avx2 for those platforms.

These map nicely to MSVC /arch:AVX and /arch:AVX2, which is as granular as MSVC goes.
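
Pulled together, that mapping looks roughly like this (a sketch; the dictionary layout is just illustrative, not a Conan API):

MARCH_FLAGS = {
    "gcc/clang (GCC 4.9+)": {
        "x86-64": "-march=x86-64",
        "sandybridge": "-march=sandybridge",
        "haswell": "-march=haswell",
        "native": "-march=native",
    },
    "gcc 4.8": {
        "sandybridge": "-march=corei7-avx",
        "haswell": "-march=core-avx2",
    },
    "msvc": {
        "sandybridge": "/arch:AVX",   # MSVC only distinguishes AVX...
        "haswell": "/arch:AVX2",      # ...and AVX2
    },
}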

For now I'm using an extra field in .conan/settings.yml: target_cpu: [x86, x86-64, nehalem, sandybridge, haswell, native], but I need to move this down into the packages I consume as well, if I want to pivot on sandybridge/haswell support.

Has this come up before? Any convention to adopt?

Feedback please!

All 21 comments

I have no previous experience managing those alternative architectures, but why aren't they just additional arch setting values?

I'm not sure either about the native setting; the gcc docs say: "This selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine." It sounds like it should not be a setting value, because it's variable and not deterministic.

Any community feedback would be great.

If there is no conditional logic based on arch, and we only support the extra instructions in 64-bit mode, then 'arch' would be fine. But do we want to exclude 32-bit pointer-sized code optimized for these platforms?

These instruction sets are actually available for both x86 and x86_64 architectures.

I think native would need special handling - it should be source-only.

I'm bumping up against this pretty hard with libjpeg-turbo. It's 2-3x faster when compiled for haswell vs baseline x86_64, but recompiling from source pushes Travis over the edge and hits the 45-minute timeout.

I would like to push this for 0.29; another user has requested the same thing, and it's time to establish a convention to follow. Initially only the base settings; later we can think about the build helpers to inject the needed flags, and some detection of the CPU microarchitecture (https://pypi.python.org/pypi/cpuid @fpelliccioni) to warn if a bad setting is detected. So @memsharded, let's work on it.
@nathanaeljones I think we could:

arch:
  x86:
    microarch: [None, "nehalem", "bonnell", "sandy_bridge", "ivy_bridge", "silvermont", "haswell", "broadwell", "skylake", "goldmont", "kaby_lake", "coffee_lake"]
  x86_64:
    microarch: [None, "nehalem", "bonnell", "sandy_bridge", "ivy_bridge", "silvermont", "haswell", "broadwell", "skylake", "goldmont", "kaby_lake", "coffee_lake"]
  ppc64le:
  ppc64:
  armv6:
  armv7:
  armv7hf:
  armv8:
  sparc:
  sparcv9:
  mips:
  mips64:
  avr:

I don't like to repeat the microarchitectures, but I don't see a better approach.
The "None" allows the user to not specify the subsetting.

I'd also suggest that we may want to support 'native', i.e., whichever features are supported on the build machine. This value would need to disable build caching, though: all packages would need to be built from source.

Also, we may want to consider matching the gcc/llvm names as closely as possible. For GCC < 5 we'll have to map a few anyway, though.

Generation names are also not very specific, and may not hold for mobile or low-end editions of a given generation.

I started with x86_64, nehalem, sandybridge, haswell, native, but skylake should probably be added for TSX support.

I'm not sure there's much value in including tick releases unless they add new instruction sets.

For LLVM:

llc -march=x86 -mattr=help
llc -march=x86-64 -mattr=help

I forget how to list the values for GCC.

For arm we are shipping a couple of different architectures right now:

  • armv7 hardfloat
  • armv7 softfloat
  • armv7 hardfloat + neon
  • armv7 hardfloat + thumb + neon

I think it makes sense for the armv7 platform to contain: float=["hard", "soft"], thumb=[True, False], neon=[True, False]

On some platforms we also need to set the specific FPU like this: -mfpu=vfpv3-d16. Not sure if that is something that should be abstracted in Conan though.
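
For illustration, a hedged sketch of how the proposed armv7 subsettings could map to GCC flags (the subsetting names follow the proposal above and are not an existing Conan feature):

def armv7_flags(float_abi, thumb, neon, fpu=None):
    flags = ["-mfloat-abi=%s" % float_abi]  # "hard" or "soft"
    if thumb:
        flags.append("-mthumb")
    if neon:
        flags.append("-mfpu=neon")
    elif fpu:
        flags.append("-mfpu=%s" % fpu)      # e.g. "vfpv3-d16"
    return flags

# armv7_flags("hard", thumb=True, neon=True)
# -> ['-mfloat-abi=hard', '-mthumb', '-mfpu=neon']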

I was thinking about it, here I leave some of my conclusions:

A. Relying on micro-architecture is a step forward, but... what if I do the following?

  1. g++ xxx.cpp -O3 -march=sandybridge ...
  2. g++ xxx.cpp -O3 -march=haswell ...
  3. g++ xxx.cpp -O3 -march=skylake ...

It is likely that the resulting binary files of 1, 2 and 3 are exactly the same; in such a case, it does not make sense to differentiate them, as they are compatible or equal binaries/packages.

B. Some Intel micro-architectures have the same extensions as others. For example, according to the Intel tick-tock model, in theory Sandybridge and Ivybridge are equivalent (with respect to instruction sets or extensions). Therefore, it is not worth differentiating them.

I think, in both cases, what really matters is which instruction sets were used. For this, I am working on a tool that examines an executable or library (.a, .so, .dll, etc.) and reports which instruction sets were used.

For example:

get_extensions("a.out",...) == ['MODE64', 'SSE', 'AVX']

In this way, the micro-architecture no longer matters; what really matters are the instruction sets used.

I think Conan packages can have a setting to determine which extensions the binaries use. For example:

class HelloConan(ConanFile):
    settings = "os", "compiler", "build_type", "arch", "extensions"

The extensions have to be assigned after the binary is compiled, I imagine by creating a new method (member function) in the ConanFile class, for example:

def set_extensions(self):
    # get_extensions() is the proposed analysis tool described above
    self.extensions = get_extensions(... list of binary files ...)

On the client side, when Conan looks for a package, it can identify which instructions are available on the processor (using the cpuid Python package, for example) and in this way find the package that best fits.
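
A minimal sketch of that matching step, assuming a hypothetical registry mapping package IDs to the extension sets their binaries use (nothing here is an existing Conan API):

def best_package(available_packages, cpu_extensions):
    """available_packages maps package ID -> set of extensions its
    binaries use; cpu_extensions is what the local CPU supports."""
    compatible = {pid: exts for pid, exts in available_packages.items()
                  if exts <= cpu_extensions}  # never pick unsupported code
    # Prefer the binary that exploits the most of what the CPU offers.
    return max(compatible, key=lambda pid: len(compatible[pid]), default=None)

# best_package({"id1": {"SSE"}, "id2": {"SSE", "AVX"}},
#              {"SSE", "AVX", "AVX2"})  -> "id2"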

I have a demo of the tool that analyzes the executables; if the idea is of interest/utility to the community, I could invest some time in it. The demo for now only works for x86 and the ELF format, but it could be extended to other architectures and formats (PE, Mach-O).

cat /proc/cpuinfo shows the following flags for me:

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
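
For what it's worth, checking those flags from Python needs only the standard library; a Linux-only sketch that could back the "warn if a bad setting is detected" idea mentioned earlier:

def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("avx2" in flags, "bmi2" in flags)  # both True on the CPU above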

My questions would be:

(a) how do we map these to, say, MSVC, which only supports AVX and AVX2 and infers other instruction support from those?

(b) is set math cost-prohibitive for Conan?

(c) does the implementation cost of full instruction support outweigh the benefit? I would see nehalem/sandybridge/haswell/skylake/native as quite a bit simpler to implement and test. Permutations make things harder.

(d) don't we still have to specify either an instruction group or processor generation when publishing pre-compiled binaries for others' use?

@fpelliccioni The tool you describe is exactly what I've been looking for to validate my binaries. With so many compilers involved in a build it can be very difficult to ensure that an unsupported instruction didn't sneak in somewhere.

Some comments:
@nathanaeljones I'm not sure about native. I understand your point, but a setting should be used to determine the binary you are getting. Any ideas @memsharded? I think native should be avoided in favor of a detection (or user declaration) of a default microarchitecture in the default profile.

@fpelliccioni About:

It is likely that the resulting binary files of 1, 2 and 3 are exactly the same; in such a case, it does not make sense to differentiate them, as they are compatible or equal binaries/packages.

Yes, but if you know that your library builds identically for those different microarchitectures, you can control it in the package_id() method to get only one binary. But conceptually, the code could be different if you build for different microarchitectures, right?

About the instruction sets: I understand they are what really matters, but because so many of them can be supported, it becomes unmaintainable and crazy for the user. The combinations are practically infinite. So it looks like it couldn't be a setting.

Detecting the setting after building the library is a kind of chicken-and-egg problem. Right now it works like this: you declare the settings => Conan builds the library and calculates a package ID. You are proposing the opposite: the settings are determined by the built library. It affects the core model of Conan, and it's not possible to do, in my opinion.

Detection is not possible when cross-compiling either.

Hi all,

Given the previous experience modeling the language standard (still WIP), I have some observations; the main question is: should we model this with options?

  • In #2042 I proposed a way to have options presets, so (pending review) we could do something similar to this:
class MyLib(ConanFile):
    ...

    def options(self, config):
        # We could add a preset list of march values like those described at
        # https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/x86-Options.html#x86-Options
        config.add_microarchitecture()

  • Most users are not concerned about this, so adding an option only when needed sounds reasonable.
  • Because Visual Studio is very different ({IA32, SSE, SSE2, AVX, AVX2}), we could add different options if compiler == "Visual Studio" (options depending on settings).
  • About native: I still think it is not a good idea, because we don't model the real microprocessor as a setting, so we can't know in a deterministic way the real configuration of the generated binary.
  • About automatic detection with @fpelliccioni's library: I see it as a very interesting next step, as a tool that could maybe be used before the build or package steps to check the built binaries, but not to autodetect a setting/option.
  • About the compatibility between them: for example, the haswell instruction set is compatible with a skylake processor, but I think it doesn't matter. If a user is generating specialized packages for different processors, it's OK to have both of them and let the consumer specify the better one. And with the package_id() method, a recipe creator who is not providing a different binary could model something like if self.options.march == "skylake": self.info.options.march = "haswell", or something similar (see the sketch below), so it stays open to binary optimizations.
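
A hedged sketch of that last idea, using an illustrative march option (not a standard Conan setting):

from conans import ConanFile

class MyLib(ConanFile):
    settings = "os", "compiler", "build_type", "arch"
    options = {"march": ["x86-64", "sandybridge", "haswell", "skylake"]}
    default_options = "march=x86-64"

    def package_id(self):
        # The recipe ships no skylake-specific binary, so a skylake
        # request is satisfied by the compatible haswell package.
        if self.info.options.march == "skylake":
            self.info.options.march = "haswell"

This way consumers asking for haswell and skylake share one package ID, while the recipe stays free to add a real skylake binary later.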

I vote to strike 'native' from this feature request. It's orthogonal.

I would suggest that whichever method is selected, it be inherited down the dependency tree, such that sub-dependencies are built with the same ISA by default.

Been thinking about this topic a lot over the last month as I get deeper into Conan, though this is the first time I've actually read through this issue. And for reference, I'm coming from the "I have to cross-compile for a bunch of different ARM CPUs" world.

I haven't gotten around to implementing this yet, but the best idea I've come up with is to put the burden on the user: create a profile that sets the CFLAGS environment variable with appropriate compiler flags. For instance, I might have a sitara profile that looks like this:

[build_requires]
[settings]
os=Linux
os_build=Linux
arch=armv7hf
arch_build=x86_64
compiler=gcc
compiler.version=4.9
compiler.libcxx=libstdc++
build_type=Release
[options]
[env]
CFLAGS=-mfloat-abi=hard

The tough part is that this assumes your Conan recipes and/or build system pay attention to the CFLAGS environment variable and use it appropriately. That isn't always the case, but it quite frequently is.

Thanks for your feedback. You are right about the profile and the flags, but the goal is to standardize a way to generate different binary packages for the same recipe based on different micro-architectures. The problem with only the flags is that you will be overwriting the same package in the cache if you run it twice. It has to be modeled as a setting or as an option.
We have a similar problem with the C++ standard version; we are still evaluating it.

The problem with only the flags is that you will be overwriting the same package in the cache if you run it twice.

Are you saying that when I set environment variables in a profile, those environment variables do not affect the hash of any packages built with that profile?

Are you saying that when I set environment variables in a profile, those environment variables do not affect the hash of any packages built with that profile?

Yes, exactly that. The only things that affect a package ID are the settings, the options and the requirements of the package.

I've been reading, re-reading, and triple-re-reading this thread. I tried looking at the referenced PR (#2042) but only understood a little of it. As always, so much more complex than I initially imagined it would be.

I think we all agree that the theoretical best solution is to make Conan aware of specific instruction sets. However, @lasote might be right when he said

... it becomes unmaintainable and crazy for the user. The combinations are practically infinite. So it looks like it couldn't be a setting.

But the other solutions are not ideal either. I would push for removing the 1.1 milestone target and leaving it blank. With no promise of a deadline, we can then all work together to come up with some kind of solution (even if it's a major breaking API change and doesn't come out until Conan v2 or v3) that minimizes the burden on the 90% of users who don't care but allows the 10% of us that _do_ care _very_ much to specify exactly which instruction sets should be enabled.

The C/C++ world still does not have a top-notch package manager solution, and I don't think any solution which fails to properly tackle this specific problem will ever gain the kind of popularity that Maven, NPM, and Pip have gathered. So, before Conan tries to take over the world, I think it should solve this problem in the absolute best possible way.


Solving this problem won't be easy. I think it will require that Conan is capable of mapping instruction sets to compiler flags. This will take a lot of research to provide by default a useful portion of this mapping for as many different compilers and instruction sets as possible. The mapping will also need to be user-extendable, just like the rest of settings.yml (though the mapping may reside in a different file).

I think the burden on the end-user could be eased by providing optional "families", such as i386 which would encompass a large group of instruction sets. The family i786 would then reference i686 and append SSE2 and SSE3.
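
A hedged sketch of how such families might compose, borrowing the set math mentioned earlier (family names and contents are illustrative only):

FAMILIES = {
    "i686": {"base": None, "adds": {"cmov"}},
    "i786": {"base": "i686", "adds": {"sse2", "sse3"}},
}

def resolve(family):
    """Expand a family into its full set of instruction-set extensions."""
    spec = FAMILIES[family]
    base = resolve(spec["base"]) if spec["base"] else set()
    return base | spec["adds"]

# resolve("i786") -> {'cmov', 'sse2', 'sse3'}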

It may be worth adding a boolean to the profile that says "I will accept binaries that are compatible with my CPU but do not fully utilize all of its features," or "I want to recompile any package that does not fully utilize my CPU." This would also require having a list of conflicting instruction sets or families (to prevent trying to cross-link 32- and 64-bit binaries).

Having something like fpu or microarch in settings.yml would be really cool.
The difference between code compiled with, for example, NEON vs VFP, or SSE vs AVX-512, can be quite large.
