Roslyn: Language feature Proposal: Elementwise array operators/Vectorization

Created on 5 Feb 2017 · 5 Comments · Source: dotnet/roslyn

I like to use C# for scientific calculations: I prototype in Mathematica and, if needed, reimplement in C# for performance. So I want to propose a new feature that would open C#/VB up much more to the scientific community:

Currently, if we do something like this:

int[] a = { 1, 2, 3 };
int[] b = { 2, 3, 4 };
int[] c = a + b;

We get a simple error message:

Error CS0019: Operator '+' cannot be applied to operands of type 'int[]' and 'int[]'

I don't see why the compiler shouldn't treat this as an elementwise operation. This could be done for every existing operator that is overloaded for the element type (except for two, see below). A few examples:

string[] s = (new string[] {"a", "b", "c"}) + (new string[] { "x", "y", "z" });

{"ax", "by", "cz"}

int[] i = (new int[] { 10, 8, 6 }) / (new int[] { 5, 2, 3 });

{2, 4, 2}

double[] d = 5.0 * (new double[] { 1.0, 2.0, 3.0 });

{5.0, 10.0, 15.0}

int[][] di = (new int[][] { new int[] { 1, 2 }, new int[] { 3, 4 } }) + (new int[][] { new int[] { 4, 3 }, new int[] { 2, 1 } });

{ {5, 5}, {5, 5} }

Pros:

  1. More readable code (the examples above aren't ideal because the
    initializations were all written on a single line)
  2. No more nested for-loops for such simple things.
  3. It could probably be extended to every IEnumerable type
  4. It could possibly be optimized by the compiler with SIMD instructions

Cons:

  1. Could lead to confusion about which form (scalar x scalar, scalar x array, array x array) is being used
  2. Conflicts with the existing == and != operators, so a new elementwise comparison operator would have to be introduced (I would propose something like ===, =!= or .==, .!=)
  3. The Vector<> types in System.Numerics would lose importance (although in my opinion they would still have their place and cases where we should use them)
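
For comparison with pro 4 and con 3, here is a minimal sketch of what elementwise addition looks like today when written by hand against System.Numerics.Vector<T>. The class and method names (SimdSketch, Add) are made up for illustration, and whether the loop actually maps to SIMD instructions depends on the runtime and hardware.

```csharp
using System;
using System.Numerics;

public static class SimdSketch
{
    // Adds a and b elementwise into a new array, using Vector<int> where possible.
    public static int[] Add(int[] a, int[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Arrays must have the same length.");

        var result = new int[a.Length];
        int width = Vector<int>.Count; // number of ints that fit in one vector
        int i = 0;

        // Main loop: process 'width' elements per iteration.
        for (; i <= a.Length - width; i += width)
        {
            var va = new Vector<int>(a, i);
            var vb = new Vector<int>(b, i);
            (va + vb).CopyTo(result, i);
        }

        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < a.Length; i++)
            result[i] = a[i] + b[i];

        return result;
    }
}
```

Called as SimdSketch.Add(a, b) with the arrays from the first example, this yields the { 3, 5, 7 } that the proposed a + b would produce.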

Here is an example application. We implement a Runge-Kutta kernel for solving ODEs:

k1 = f(t, y0);
double[] arr = new double[m];
for (int j = 0; j < m; ++j)
    arr[j] = y0[j] + dt * k1[j];
k2 = f(t, arr);
for (int j = 0; j < m; ++j)
    y[i, j] = y0[j] + dt * 0.5 * (k1[j] + k2[j]);

With elementwise operators this could be written much more compactly as:

k1 = f(t, y0);
k2 = f(t, y0 + dt * k1);
y[i] = y0 + dt * 0.5 * (k1 + k2);
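
Since the operators above do not exist in C#, the same step can be sketched today with small helper methods standing in for them. The names HeunSketch, Add, Scale and Step are hypothetical, and f is assumed to have the shape Func<double, double[], double[]>. The sketch evaluates k2 at the same t as the post does; a textbook Heun step would evaluate it at t + dt.

```csharp
using System;

public static class HeunSketch
{
    // Elementwise a + b; stands in for the proposed array + array operator.
    static double[] Add(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Arrays must have the same length.");
        var r = new double[a.Length];
        for (int j = 0; j < r.Length; j++)
            r[j] = a[j] + b[j];
        return r;
    }

    // s * a for every element; stands in for the proposed scalar * array operator.
    static double[] Scale(double s, double[] a)
    {
        var r = new double[a.Length];
        for (int j = 0; j < r.Length; j++)
            r[j] = s * a[j];
        return r;
    }

    // One step of the kernel from the post, returning the new state vector.
    public static double[] Step(Func<double, double[], double[]> f,
                                double t, double[] y0, double dt)
    {
        double[] k1 = f(t, y0);
        double[] k2 = f(t, Add(y0, Scale(dt, k1)));   // same t as in the post
        return Add(y0, Scale(dt * 0.5, Add(k1, k2)));
    }
}
```

Every Add and Scale call here allocates a temporary array, which the hand-written loops avoid.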

All in all, I would consider this a win and see no serious downsides. But that may be because I really wish vectorization (and more constructs like it) were present in C#, so I don't think I can be fully objective here.

I'd like to see a discussion of the pros and cons here. Is there a reason why this wasn't considered earlier, or was it? Does it make sense to implement?

Area-Language Design

All 5 comments

"I for myself like to use c# for scientific calculations: prototyping in Mathematica and if needed implementing in c# for performance"

Your proposal doesn't seem to help your stated goal. If adding together two arrays produces a new array, then you're going to tank performance with these allocations. You'd need a location to put the results you were building up.

You're best served here either by providing simple methods that operate over arrays and perform these operations, or by wrapping the array in a struct and exposing operators that do what you want. Note that because the array is wrapped in a struct, there should be no actual overhead: the compiler will elide all the accesses to the array through the struct.

Note: this is how ImmutableArray works: https://github.com/dotnet/corefx/blob/master/src/System.Collections.Immutable/src/System/Collections/Immutable/ImmutableArray_1.cs

Your type would be similar to ImmutableArray, except that it would be mutable :) And it would have the operators you want.
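
A minimal sketch of the struct-wrapper idea, assuming a made-up type name (ArrayVec) and double elements. It is not taken from ImmutableArray; it only shows the general shape of a thin wrapper with user-defined operators.

```csharp
using System;

// Hypothetical wrapper type; the name ArrayVec is not from the thread.
public struct ArrayVec
{
    private readonly double[] _data;

    public ArrayVec(double[] data)
    {
        _data = data ?? throw new ArgumentNullException(nameof(data));
    }

    public int Length => _data.Length;

    // Elements stay mutable; only the reference to the array is fixed.
    public double this[int i]
    {
        get => _data[i];
        set => _data[i] = value;
    }

    // Elementwise addition. This allocates a new backing array, which is
    // the cost the comment above warns about.
    public static ArrayVec operator +(ArrayVec a, ArrayVec b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Operands must have the same length.");
        var result = new double[a.Length];
        for (int i = 0; i < result.Length; i++)
            result[i] = a._data[i] + b._data[i];
        return new ArrayVec(result);
    }

    // Scalar * vector, also allocating a new backing array.
    public static ArrayVec operator *(double s, ArrayVec a)
    {
        var result = new double[a.Length];
        for (int i = 0; i < result.Length; i++)
            result[i] = s * a._data[i];
        return new ArrayVec(result);
    }
}
```

With this in place, an expression such as 2.0 * (a + b) compiles for ArrayVec operands, at the price of one allocation per operator. Avoiding that would need methods that write into a caller-supplied destination.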

🍝

Maybe instead of supporting the standard arithmetic operators (+, -, *, /, %) arrays of numeric types could support arithmetic compound-assignment operators (+=, -=, *=, /=, %=). The operations would occur in place on the first array rather than requiring an allocation:

// given
int[] a = new [] { 1, 2, 3 };
int[] b = new [] { 4, 5, 6 };

// the following
a += b;

// is equivalent to the following
for (int i = 0; i < a.Length && i < b.Length; i++) {
    a[i] += b[i];
}

Obviously that has a number of issues. First and foremost, it requires the compiler to treat arrays specially for these operators rather than relying on normal overloaded operators. Second, the compiler would have to treat compound-assignment operators differently from how the language normally treats them (although delegates are another example of this). Third, you'd have to define exactly what the compiler does for the possible combinations of array types, especially when the two arrays have different element types, and when one array is not the same length as the other. Addition, as in the sample above, is easy: it's probably safe to assume that where one array has an element and the other does not, the result is the same as adding that value to zero. But what about multiplication or division?
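
One possible answer to the length question, sketched as an ordinary helper instead of a language feature: reject mismatched lengths outright, so no padding rule has to be chosen for multiplication or division. The names InPlaceOps and ApplyInPlace are hypothetical.

```csharp
using System;

public static class InPlaceOps
{
    // Applies op(target[i], source[i]) and stores the result back into target.
    // Mismatched lengths are treated as an error instead of being padded.
    public static void ApplyInPlace(int[] target, int[] source, Func<int, int, int> op)
    {
        if (target.Length != source.Length)
            throw new ArgumentException("Arrays must have the same length.");

        for (int i = 0; i < target.Length; i++)
            target[i] = op(target[i], source[i]);
    }
}
```

With a = { 1, 2, 3 } and b = { 4, 5, 6 }, calling InPlaceOps.ApplyInPlace(a, b, (x, y) => x + y) leaves a as { 5, 7, 9 } without allocating a new array. The delegate call per element has its own cost, so a performance-sensitive version would probably use one dedicated loop per operator.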

As for me, arr1 + arr2 should mean concatenation, not elementwise addition.

What would happen if we have an array of things that don't support +?

In your case you could write your own extension method and call arr1.Plus(arr2)

or just arr1.Zip(arr2, (l, r) => l + r)

Or wait for the "extension everything" feature, so you could write an extension operator for int[]

Or wait for another feature that allows fixed-size arrays without unsafe

Or just create your own Vector struct that wraps an array, and use ref returns in C# 7 to proxy X, Y, Z and the indexer

struct Vector
{
    int[] arr;
    public ref int X
    {
        get
        {
            // Array.Resize takes the array by ref and returns void.
            if ((arr?.Length ?? 0) < 1) Array.Resize(ref arr, 1);
            return ref arr[0];
        }
    }
}
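
The extension-method and Zip suggestions above can be combined into a few lines. This is only a sketch: the class name ArrayExtensions is made up, and throwing on a length mismatch is an assumption (Zip on its own would silently stop at the shorter sequence).

```csharp
using System;
using System.Linq;

public static class ArrayExtensions
{
    // Elementwise addition as an extension method on int[].
    public static int[] Plus(this int[] left, int[] right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Arrays must have the same length.");

        // Zip pairs up the elements; ToArray materializes the result.
        return left.Zip(right, (l, r) => l + r).ToArray();
    }
}
```

Usage: new[] { 1, 2, 3 }.Plus(new[] { 2, 3, 4 }) returns { 3, 5, 7 }.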

@JulienKluge Use a math library, e.g. Math.NET, to get the same feel as working in Python.
C# is a general-purpose language, but it enables libraries to provide domain-specific features. There is no need to modify the language itself.

The operators mean different things in different domains. Some people use * as element-wise multiplication, while some see * as matrix multiplication, and there are also other views.

I thank everyone for the discussion and clarification and will close this thread. I will keep using libraries and System.Numerics.Vector.
