Mathjs: Support for typed arrays

Created on 2 Aug 2013 · 15 Comments · Source: josdejong/mathjs

Right now mathjs doesn't work with typed arrays. For example:

math.matrix([[1, 2, 3, 4]]).size() // [1, 4]
math.matrix([new Float32Array([1, 2, 3, 4])]).size() // [1]
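A plausible explanation for the second result, offered as a sketch rather than a statement about mathjs internals: typed arrays are array-like but are not instances of Array, so any size detection built on Array.isArray would stop at the outer array.

```javascript
// Typed arrays are array-like but are not Arrays, so checks based on
// Array.isArray treat them as opaque values rather than as a nested dimension.
const plain = [1, 2, 3, 4];
const typed = new Float32Array([1, 2, 3, 4]);

console.log(Array.isArray(plain));      // true
console.log(Array.isArray(typed));      // false
console.log(ArrayBuffer.isView(typed)); // true
console.log(typed.length);              // 4
```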
Labels: feature, help wanted

All 15 comments

Yes indeed. Not sure how complicated that will be to implement - typed arrays don't have the same methods available as regular arrays, so you really need to reckon with them.

Maybe we could implement a wrapper that provides a common interface for typed arrays and normal arrays.

Yes, just don't use the native functions like Array.forEach, and instead use a util function util.forEach, which takes the type of array into account.
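A minimal sketch of what such a utility could look like. The function and helper names are made up for illustration and are not existing mathjs code:

```javascript
// Hypothetical helper: true for Float32Array, Int8Array, etc., but not DataView.
function isTypedArray(x) {
  return ArrayBuffer.isView(x) && !(x instanceof DataView);
}

// Hypothetical util.forEach: iterate over an Array or a typed array with an
// index-based loop, so no method of the container itself is required.
function forEach(array, callback) {
  if (!Array.isArray(array) && !isTypedArray(array)) {
    throw new TypeError('Array or typed array expected');
  }
  for (let i = 0; i < array.length; i++) {
    callback(array[i], i, array);
  }
}

const seen = [];
forEach(new Float32Array([1, 2, 3]), (value) => seen.push(value * 2));
forEach([4, 5], (value) => seen.push(value));
console.log(seen); // [2, 4, 6, 4, 5]
```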

It might be a bit tricky though, because it's obviously more efficient to use native typed array methods. If, for example, there is a forEach, it could be tempting to use it where a more efficient method like set would do the job. Not sure if such a case could ever happen ... but this should be kept in mind.
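To make the concern concrete, here is a small sketch of the same copy done both ways. Whether set is measurably faster depends on the engine, so treat this as an illustration and not a benchmark:

```javascript
// Two ways to copy one typed array into another.
const source = new Float64Array([1.5, 2.5, 3.5]);

// Element by element, which is what a generic forEach-style utility would do.
const copyByLoop = new Float64Array(source.length);
source.forEach((value, index) => { copyByLoop[index] = value; });

// In one call: TypedArray.prototype.set copies all elements at once.
const copyBySet = new Float64Array(source.length);
copyBySet.set(source);

console.log(Array.from(copyByLoop)); // [1.5, 2.5, 3.5]
console.log(Array.from(copyBySet));  // [1.5, 2.5, 3.5]
```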

So for typed arrays, maybe it would make sense to do as numpy does: you have an option to specify which type your array should be when creating it.

And I think that, as @garrison suggested, when using typed arrays it actually makes more sense to store the whole matrix in one array. But this sounds like it will have big implications for the API and the implementation.

However, I see this as something critical anyway, because performance is a major goal (for me at least).
But once again, I don't have any experience with this, so I have no idea what it takes to actually implement it.

I would (at least for a first version) limit this to a 2d array, stored in a single (1d) array. You address specific cells via indexing as described on Wikipedia.
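A minimal sketch of that indexing scheme, assuming row-major order (the comment does not say which layout was meant). The helper name is hypothetical:

```javascript
// Assumption: row-major layout, i.e. rows are stored one after another.
const rows = 2;
const cols = 3;
const data = new Float64Array([
  1, 2, 3, // row 0
  4, 5, 6  // row 1
]);

// Hypothetical helper: map a (row, col) pair to an offset in the flat array.
function offset(row, col) {
  return row * cols + col;
}

console.log(data[offset(0, 2)]); // 3
console.log(data[offset(1, 0)]); // 4
```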

It will not affect the (external) API, though you will need to add support for this new Array type in each of the functions. Most functions perform element wise operations on a matrix, and that is done in util/collection.js, using a number of map functions. You can extend these map functions to handle this new type of array. Some functions like multiply are not element wise, in that case you will have to extend this function with an extra case to handle the new array type.
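As an illustration of extending a map function with an extra case, here is a sketch with a made-up deepMap helper. It is a stand-in for the idea and not the actual code from util/collection.js:

```javascript
// Hypothetical stand-in for an element-wise map helper.
function deepMap(value, callback) {
  if (Array.isArray(value)) {
    // Regular (possibly nested) array: recurse into every entry.
    return value.map((entry) => deepMap(entry, callback));
  }
  if (ArrayBuffer.isView(value) && !(value instanceof DataView)) {
    // Typed array: the native map returns a typed array of the same type.
    return value.map((entry) => callback(entry));
  }
  // Scalar: apply the callback directly.
  return callback(value);
}

const square = (x) => x * x;
console.log(deepMap([[1, 2], [3, 4]], square));            // nested Array: 1, 4 and 9, 16
console.log(deepMap(new Float32Array([1, 2, 3]), square)); // Float32Array: 1, 4, 9
console.log(deepMap([new Float32Array([2, 3])], square));  // Array holding a Float32Array: 4, 9
```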

I'll give it a shot this weekend if I find time.

From #1438 -- why use TypedArrays

We use HDF5 files for recording lots of data in a compact format.

We use https://hdf-ni.github.io/hdf5.node to pull the data into TypedArrays and we need to transform / save / send to client browsers. It would be nice to do the transforms using mathjs over the top of the TypedArrays without having to convert them because of size and performance reasons.

Also if you want to use mathjs for any webgl computations - lots of this data can be sent via typedArrays...

Thanks, Chad

Thanks for your inputs Chad.

I'm curious to see how compact it is, and whether (and how much) performance improves. I've done tests in the past with ndarray, which uses TypedArrays, and it isn't magically faster than regular arrays. It's important to get a real-world benchmark in this regard.

Is anyone interested in looking into how we could start supporting TypedArrays, and in working out a proof of concept to see what it can bring?

We have already used math.js for low sample rate data and it works great. We are now looking to use the math.js expression parser to handle TypedArrays with millions of elements.

I've been experimenting with overriding typed functions and as we're only using 1D arrays we can see some big boosts in performance.

See the example code here, which attempts to profile the built-in multiplier, an override for Array, number, and an override for Float32Array, number.

For multiplying an array with 1 million entries by a constant number we get:

"built-in: 33.714ms"
"override: 5.761ms"
"override-typed: 1.785ms"

We're running in Electron so we could even call into native code for huge arrays where SIMD instructions can give us perhaps another 2-4x boost depending on hardware:

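'Float32Array, number': function (x, y) {
  const outArray = new Float32Array(x.length);

  if (x.length > 1000000) {
    // Huge arrays: hand off to a native (SIMD) implementation
    native.multiply_32_constant(x, x.length, y, outArray);
  } else {
    // Smaller arrays: a plain loop avoids the cost of leaving JavaScript
    for (let i = 0; i < x.length; i++) {
      outArray[i] = x[i] * y;
    }
  }

  return outArray;
},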

Obviously, exiting JavaScript land to do this is only going to be worth it where it's actually faster!
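For readers who want to try something similar, here is a stand-alone sketch. It is not the profiling code linked above: it does not use math.js and only times the two plain-loop variants, so it cannot reproduce the "built-in" figure. All function names are made up, and timings will vary by machine and engine.

```javascript
// Stand-alone sketch: multiply 1 million entries by a constant, once with a
// regular Array and once with a Float32Array.
const N = 1000000;
const factor = 2.5;

function multiplyArray(x, y) {
  const out = new Array(x.length);
  for (let i = 0; i < x.length; i++) out[i] = x[i] * y;
  return out;
}

function multiplyFloat32(x, y) {
  const out = new Float32Array(x.length);
  for (let i = 0; i < x.length; i++) out[i] = x[i] * y;
  return out;
}

const plainInput = Array.from({ length: N }, (_, i) => i);
const typedInput = Float32Array.from(plainInput);

console.time('Array, number');
const plainResult = multiplyArray(plainInput, factor);
console.timeEnd('Array, number');

console.time('Float32Array, number');
const typedResult = multiplyFloat32(typedInput, factor);
console.timeEnd('Float32Array, number');
```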

Cool @timfish, thanks for sharing your experiments, appreciated!

I ran your code locally, where it gives me the following results:

built-in: 26.384ms      // Mixed types (type check on every matrix element)
override: 3.813ms       // Number (64 bit)
override-typed: 1.680ms // Float32Array
override-typed: 2.386ms // Float64Array

The large difference between built-in and override is, I think, because the built-in version does a type check for every individual matrix element; it's still on my wishlist to make this smarter. But the difference between override and override-typed shows that typed arrays can yield a serious difference :).

Happy to do a PR.

A great stop-gap solution would be to add a TypedArray type and built-in conversions to Array, as this would at least allow users to pass typed arrays into mathjs.

math.typed.addType({
  name: 'TypedArray',
  test: function (x) {
    // true for every typed array type; DataView is also a view, so exclude it
    return ArrayBuffer.isView(x) && !(x instanceof DataView);
  }
})

math.typed.conversions.push({
  from: 'TypedArray',
  to: 'Array',
  convert: function (x) {
    // new Array(x) would yield a one-element array holding x itself
    return Array.from(x);
  }
},
{
  from: 'Array',
  to: 'TypedArray',
  convert: function (x) {
    return new Float64Array(x);
  }
})

Because Typed arrays only contain numbers, they can be targeted so types aren't checked on every element:

'TypedArray, number': function (a, b) {
  return algorithm14(
    { _data: a, _size: [a.length] }, // mimic a one-dimensional DenseMatrix
    b,
    typed.find(multiplyScalar, 'number, number'),
    false).valueOf()
}

Yes indeed, we can prevent type checking on all elements quite easily, luckily.

I'm not sure about the stop-gap solution: it gives a bit of a false sense of support for typed arrays, and it's probably just as easy to create a util function toArray which converts your typed arrays to regular Arrays before passing them to math.js.
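A minimal sketch of such a toArray helper. It is user-side code and not part of math.js:

```javascript
// Hypothetical helper: recursively convert typed arrays (also when nested
// inside regular arrays) into plain Arrays; other values pass through.
function toArray(value) {
  if (ArrayBuffer.isView(value) && !(value instanceof DataView)) {
    return Array.from(value);
  }
  if (Array.isArray(value)) {
    return value.map((entry) => toArray(entry));
  }
  return value;
}

const converted = toArray([new Float32Array([1, 2, 3, 4])]);
console.log(converted);                   // nested plain Array: [[1, 2, 3, 4]]
console.log(Array.isArray(converted[0])); // true
```

The converted value is a regular nested Array, so it can be passed to math.matrix like the first example at the top of this issue.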

We do not have to implement support for typed arrays for _all_ functions right away, but I think we should at least implement a basic set of arithmetic and trigonometric functions before we publish it.

One other idea: typed arrays are one-dimensional. It would be nice to have a Matrix implementation which uses a typed array under the hood but allows storing two or more dimensions in it. There may be suitable libraries out there that we could integrate. This could replace the current DenseMatrix implementation, which uses nested Arrays.

I had assumed that DenseMatrix could contain non-number elements, so I had discounted that. If it's only numbers, then it would make sense to have it backed by a typed array.

I did notice that the README says:

Math.js works on any ES5 compatible JavaScript engine: node.js 4 or newer, Chrome, Firefox, Safari, Edge, and IE11.

Typed arrays will work in all the above-mentioned environments, although I think they are technically ES6. 🤷‍♂️

I had assumed that DenseMatrix could contain non-number elements

Yes, that is correct. I was just thinking about a more generic Matrix implementation which allows you to select different types of Arrays to be used under the hood. In the case of regular Arrays, the elements can contain any data type. Or maybe we should enforce all elements to have the same type. Just thinking aloud here. The most important and difficult change would be to switch from a nested Array to a flat Array internally.
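To make the idea concrete, here is a rough sketch of a flat, n-dimensional store with a selectable backing array. The class name and its API are hypothetical and not a proposal for the actual DenseMatrix; row-major strides are an assumption:

```javascript
// Hypothetical class, not part of math.js: an n-dimensional matrix stored in
// one flat array whose constructor (Array, Float32Array, ...) is selectable.
class FlatMatrix {
  constructor(size, ArrayType = Float64Array) {
    this.size = size.slice();
    // Row-major strides: the last dimension is contiguous in memory.
    this.strides = new Array(size.length);
    let stride = 1;
    for (let d = size.length - 1; d >= 0; d--) {
      this.strides[d] = stride;
      stride *= size[d];
    }
    this.data = new ArrayType(stride); // stride now equals the element count
  }

  // Translate an n-dimensional index such as [row, col] into a flat offset.
  offset(index) {
    let position = 0;
    for (let d = 0; d < index.length; d++) {
      if (index[d] < 0 || index[d] >= this.size[d]) {
        throw new RangeError('Index out of range');
      }
      position += index[d] * this.strides[d];
    }
    return position;
  }

  get(index) { return this.data[this.offset(index)]; }
  set(index, value) { this.data[this.offset(index)] = value; }
}

const m = new FlatMatrix([2, 3], Float32Array);
m.set([1, 2], 42);
console.log(m.get([1, 2])); // 42
console.log(m.strides);     // [3, 1]
console.log(m.data.length); // 6

// With a regular Array as backing store, elements can be of any type.
const generic = new FlatMatrix([2, 2], Array);
generic.set([0, 1], 'any value');
console.log(generic.get([0, 1])); // 'any value'
```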
