Runtime: Implement AVX Intrinsics

Created on 28 Feb 2018 · 13 Comments · Source: dotnet/runtime

Pending (a check means it is in a pending PR):

  • [x] Blend
  • [x] BroadcastScalarToVector128
  • [x] BroadcastScalarToVector256
  • [ ] BroadcastVector128ToVector256
  • [x] Ceiling
  • [x] ConvertToVector128Int32
  • [x] ConvertToVector128Single
  • [x] ConvertToVector256Int32
  • [x] ConvertToVector256Single
  • [x] ConvertToVector256Double
  • [x] ConvertToVector128Int32WithTruncation
  • [x] ConvertToVector256Int32WithTruncation
  • [x] DotProduct
  • [ ] Extract
  • [ ] ExtractVector128
  • [x] ExtendToVector256
  • [x] Floor
  • [ ] GetLowerHalf
  • [ ] Insert
  • [ ] MaskLoad
  • [ ] MaskStore
  • [ ] MoveMask
  • [ ] Permute
  • [ ] Permute2x128
  • [ ] PermuteVar
  • [x] RoundCurrentDirection
  • [x] RoundToNearestInteger
  • [x] RoundToNegativeInfinity
  • [x] RoundToPositiveInfinity
  • [x] RoundToZero
  • [ ] SetAllVector256
  • [ ] SetHighLow
  • [ ] Shuffle
  • [x] StaticCast
  • [x] TestC
  • [x] TestNotZAndNotC
  • [x] TestZ
  • [ ] ZeroAll
  • [ ] ZeroUpper
  • [ ] ZeroExtendToVector256

Labels: area-CodeGen-coreclr, enhancement

All 13 comments

This is a parent issue to: https://github.com/dotnet/coreclr/issues/16583

@fiigii, could you confirm which ones you are currently planning on implementing?

> which ones you are currently planning on implementing?

BroadcastScalarToVector128
BroadcastScalarToVector256
BroadcastVector128ToVector256
Extract
ExtractVector128
Insert
SetAllVector256
SetVector256

@fiigii, thanks.

I am working on the "simple" intrinsics as per our earlier conversation.

This currently includes:

  • Blend
  • Ceiling
  • DotProduct
  • Floor
  • RoundCurrentDirection
  • RoundToNearestInteger
  • RoundToNegativeInfinity
  • RoundToPositiveInfinity
  • RoundToZero
  • TestC
  • TestNotZAndNotC
  • TestZ

(These are basically the Vector256 versions of the SSE4.1 intrinsics I implemented in https://github.com/dotnet/coreclr/pull/16558.)

https://github.com/dotnet/coreclr/pull/16655 contains the intrinsics I mentioned above

@fiigii, I'm going to work on the remaining intrinsics, in alphabetical order, until you finish the ones you are currently working on; then we can sync up again.

I will implement Permute as well.

@RussKeldorph I think we can finish this for 2.1.

Currently writing tests for ExtendTo, GetLowerHalf, and StaticCast. Should be up shortly.

Can we close this issue?

@fiigii, the only ones not implemented are ZeroAll and ZeroUpper, correct?

And MaskStore.

Can you log bugs tracking those three separately?

Happy to close this issue afterwards.
