Three.js: Parsing of an STL ASCII file could be much faster

Created on 5 May 2017 · 17 Comments · Source: mrdoob/three.js

When I tried to load an STL ASCII file with STLLoader the first time I wondered why the parsing took such a veeeeery long time.

The reason is quite obvious:

Because STLLoader.js sets FileLoader's response type to 'arraybuffer' (line 48), FileLoader loads the file into an ArrayBuffer even when it is ASCII, so STLLoader's ensureString function has to convert it back into text.

On my machine, this conversion takes about 12 additional seconds in Chrome and about 6 seconds in Firefox for a 40MB file.

If I remove setResponseType('arraybuffer') as a test, such an ASCII file is parsed directly and relatively quickly (about 2-3 seconds).

Suggestion


jostschmithals

Most helpful comment

> Do you folks think we need loader.setResponseType() or something for
> browsers which don't support TextDecoder?

Getting specific, that's IE 11 and Microsoft Edge[1]. I'd encourage interested persons to upvote the issue here. It is marked as "under consideration" currently.

My preference would be to not add setResponseType() if it's purely an optimization for one vendor. Setting a "text" response type for a binary .glb will not work, so it seems likely to confuse people in addition to adding code paths we must maintain.

[1] Excluding Opera Mini, which doesn't support WebGL anyway.

donmccurdy on 16 May 2017

👍3

All 17 comments

@takahirox this seems relevant for the FBXLoader too...

mrdoob on 14 May 2017

Maybe this is relevant for GLTF2Loader too /cc @donmccurdy

I had already noticed this issue, but I haven't come up with any good ideas yet.
The problem is that we can't detect the file type (ASCII or binary) without actually reading the data.

In my mind, one easy solution is for *Loader to accept a file type option.
If it indicates the file is ASCII, we can load it directly as text.

Or would loading twice (first as arraybuffer, then as text) be faster than converting, thanks to the browser cache?
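To illustrate why the data has to be read before the type is known, here is a minimal sketch of a size-based check for binary STL. It relies on the binary STL layout (80-byte header, little-endian uint32 triangle count, 50 bytes per triangle). The function name looksLikeBinarySTL is hypothetical and not taken from STLLoader, and the check is only a heuristic.

```javascript
// Heuristic sketch (hypothetical helper, not STLLoader's code):
// a binary STL is an 80-byte header, a little-endian uint32 triangle
// count, then exactly 50 bytes per triangle.
function looksLikeBinarySTL( buffer ) {

    var HEADER_BYTES = 80;
    var COUNT_BYTES = 4;
    var BYTES_PER_TRIANGLE = 50;

    // Too short to even contain the header and the triangle count.
    if ( buffer.byteLength < HEADER_BYTES + COUNT_BYTES ) return false;

    var view = new DataView( buffer );
    var triangleCount = view.getUint32( HEADER_BYTES, true );
    var expectedSize = HEADER_BYTES + COUNT_BYTES + triangleCount * BYTES_PER_TRIANGLE;

    // An ASCII file read this way almost never matches the expected size.
    return expectedSize === buffer.byteLength;

}
```

Note that this still needs the file as an ArrayBuffer first, so it does not remove the conversion cost for ASCII files by itself.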

takahirox on 14 May 2017

I expected this to be relevant for some other loaders as well, but since I am not familiar with loader internals, I only pointed out the issue and made no suggestion.

But this matches my own thoughts:

> In my mind, one easy solution is *Loader accepts file type option.
> If it indicates the file is ascii, we can directly load as text.

jostschmithals on 14 May 2017

It may be worth looking into TextDecoder for these cases, instead of iterative String.fromCharCode(...). But when I tried briefly with GLTF2Loader, it wasn't clear what encoding to use.

Allowing loader.setResponseType('...') does seem fine to me, too.

donmccurdy on 14 May 2017

👍1

Oh! I didn't know that Chrome had already started to support TextDecoder.
(It wasn't supported yet when I last looked into it.)
But it still seems to be experimental; IE and Safari don't support it yet.

So, we'll have both TextDecoder(+polyfill) and loader.setResponseType()?

First of all, I'll evaluate the performance of TextDecoder.

About encoding type, wouldn't default 'utf-8' work?

Update:
TextDecoder.decode() seems to copy the bytes, not share the original bytes.
~~So maybe the situation wouldn't change.~~
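As a small sketch of the encoding question: ASCII is a subset of UTF-8, so the default decoder reproduces ASCII bytes exactly. The two approaches only diverge for bytes of 0x80 and above, which should not occur in a pure ASCII STL. The sample text and variable names here are illustrative assumptions.

```javascript
// Build the bytes of an ASCII string by hand (sample text is made up).
var sample = 'solid cube';
var sampleBytes = new Uint8Array( sample.length );

for ( var i = 0; i < sample.length; i ++ ) {

    sampleBytes[ i ] = sample.charCodeAt( i );

}

// TextDecoder defaults to 'utf-8'; for ASCII bytes the result is identical.
var decoded = new TextDecoder().decode( sampleBytes );

// decode() returned a new string, so later writes to the bytes do not affect it.
sampleBytes[ 0 ] = 0x58;

// A lone byte >= 0x80 is where the two approaches differ.
var highByte = new Uint8Array( [ 0xE9 ] );
var viaCharCode = String.fromCharCode( highByte[ 0 ] ); // U+00E9
var viaUtf8 = new TextDecoder().decode( highByte ); // U+FFFD (invalid UTF-8 sequence)

console.log( decoded === sample, viaCharCode === viaUtf8 );
```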

takahirox on 15 May 2017

I wrote a simple benchmark and found that the TextDecoder approach is about 70x faster
than the String.fromCharCode one on my Windows 10 + Chrome.
So using TextDecoder would be worthwhile.

Benchmark:

function initArray() {

    var a = [];

    // 32MB
    for ( var i = 0; i < 0x2000000; i++ ) {

        // ascii char set
        a[ i ] = ( ( Math.random() * 128 ) | 0 );

    }

    return new Uint8Array( a );

}

function funcStringFromCharCode( array ) {

    var str = String;
    var s = '';

    for ( var i = 0, il = array.length; i < il; i++ ) {

        s += str.fromCharCode( array[ i ] );

    }

    return s;

}

function funcTextDecoder( array ) {

    return new TextDecoder().decode( array );

}

function run( label, func, array ) {

    var startTime = performance.now();
    func( array );
    var endTime = performance.now();
    console.log( label + ': ' + ( endTime - startTime ) );

}

var array = initArray();
run( 'String.fromCharCode', funcStringFromCharCode, array );
run( 'String.fromCharCode', funcStringFromCharCode, array );
run( 'String.fromCharCode', funcStringFromCharCode, array );
run( 'TextDecoder.decode ', funcTextDecoder, array );
run( 'TextDecoder.decode ', funcTextDecoder, array );
run( 'TextDecoder.decode ', funcTextDecoder, array );
console.log( 'Two results are same: ' + ( funcStringFromCharCode( array ) === funcTextDecoder( array ) ) );

Result:

String.fromCharCode: 1879.994999999999
String.fromCharCode: 1740.5349999999999
String.fromCharCode: 1706.8400000000038
TextDecoder.decode : 59.61000000000058
TextDecoder.decode : 66.30999999999767
TextDecoder.decode : 67.6649999999936
Two results are same: true

takahirox on 15 May 2017

🎉1

Niice!!

mrdoob on 15 May 2017

So, shall we replace the existing convert function with something like this for now?

function convertUint8ArrayToString( array ) {

    if ( window.TextDecoder !== undefined ) {

        return new TextDecoder().decode( array );

    } 

    var s = '';

    for ( var i = 0, il = array.length; i < il; i ++ ) {

        s += String.fromCharCode( array[ i ] );

    }

    return s;

}
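A possible variation on the function above, offered only as a sketch: it detects TextDecoder with typeof, which does not depend on a global named window (for example in a worker), and its fallback converts the bytes in chunks with String.fromCharCode.apply instead of one character at a time. The function names and the chunk size are assumptions, and whether the chunked fallback is actually faster in a given browser would need to be measured.

```javascript
// Sketch only: the names and CHUNK_SIZE are illustrative assumptions.
function convertUint8ArrayToStringChunked( array ) {

    // typeof is safe even where `window` itself is not defined.
    if ( typeof TextDecoder !== 'undefined' ) {

        return new TextDecoder().decode( array );

    }

    return fromCharCodeChunked( array );

}

function fromCharCodeChunked( array ) {

    // Keep each apply() call small to stay below engine argument limits.
    var CHUNK_SIZE = 0x8000;
    var parts = [];

    for ( var i = 0; i < array.length; i += CHUNK_SIZE ) {

        // subarray() creates a view without copying; the end index is clamped.
        parts.push( String.fromCharCode.apply( null, array.subarray( i, i + CHUNK_SIZE ) ) );

    }

    return parts.join( '' );

}
```

As with the original loop, the fallback maps each byte to one character, so it matches TextDecoder only for ASCII input.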

takahirox on 15 May 2017

BTW, I realized that the convert function of STLLoader isn't well tuned.
Just tuning it would make the performance about 4x better.

I added the following function, which mirrors STLLoader's current approach,

function funcStringFromCharCodeSTL( array ) {

    var a = [];

    for ( var i = 0; i < array.length; i++ ) {

        a.push( String.fromCharCode( array[ i ] ) );

    }

    return a.join('');

}

and compared again. The result is

String.fromCharCode: 949.6350000000093
String.fromCharCode: 1431.4800000000396
String.fromCharCode: 1858.3850000000093
String.fromCharCodeSTL: 4436.550000000047
String.fromCharCodeSTL: 5141.315000000002
String.fromCharCodeSTL: 5079.369999999995
TextDecoder.decode : 61.679999999993015
TextDecoder.decode : 65.70499999995809
TextDecoder.decode : 73.2599999999511

takahirox on 15 May 2017

@jostschmithals

Would you please try adding

if ( window.TextDecoder !== undefined ) {

    return new TextDecoder().decode( array_buffer );

}

after

var array_buffer = new Uint8Array( buf );

in STLLoader

https://github.com/mrdoob/three.js/blob/6028312ec0fe149b5534dd5de7b875eaeb148fbc/examples/js/loaders/STLLoader.js#L247

and see how it performs with your 40MB file?

takahirox on 15 May 2017

❤3

> About encoding type, wouldn't default 'utf-8' work?

Yes, that should work fine. The encoding issues I had before were with something else that shouldn't matter here, now that I think about it.

donmccurdy on 15 May 2017

@takahirox

> ... and see how the performance with your 40MB file will be?

The result is excellent! Adding

if ( window.TextDecoder !== undefined ) {

    return new TextDecoder().decode( array_buffer );

}

reduces the time my system needs to execute the ensureString function

on Chrome from 12 seconds to 0.12 seconds
on Firefox from 5-6 seconds to 0.17 seconds

(with the previously tested 40MB STL ASCII file, on Windows 10)

- and in spite of this the model looks exactly like before 😉

jostschmithals on 15 May 2017

🎉3

Niiice! I'll make a PR!

Do you folks think we need loader.setResponseType() or something for
browsers which don't support TextDecoder?

takahirox on 16 May 2017

> Do you folks think we need loader.setResponseType() or something for
> browsers which don't support TextDecoder?

Getting specific, that's IE 11 and Microsoft Edge[1]. I'd encourage interested persons to upvote the issue here. It is marked as "under consideration" currently.