Tedious: performance issue with tedious

Created on 17 Oct 2016  路  11Comments  路  Source: tediousjs/tedious

I ran few queries that return large datasets using tedious as well as jdbc sqlserver in same environment. I got very low performance with tedious compared to jdbc. Here are the results.

| Queries | JDBC | Node Tedious |
| --- | --- | --- |
| 1000000 from tbl1 | 1.135 s | 72.359 s |
| 1000000 from tbl2 | 7.109 s | 450.915 s |
| 1000000 from tbl3 | 1.761 s | 122.01 s |
| 1000000 from tbl4 | 1.409 s | 68.083 s |
| 1000000 from tbl5 | 1.553 s | 69.083 s |

Note: I used single connection in both environment and ran 1 query at a time. For tedious, these timings are based on when DoneProc event is fired, and for JDBC, I iterated over ResultSet.

Original Question asked at SOF

known issue

Most helpful comment

Thanks! Looking at the profiler output, most of the time is spent in the token parsing code, most specifically in the row token and value parsing code. This should get a lot better with the new parsing code, but I don't have an ETA on when it will land in tedious. I'll let you know!

All 11 comments

Can you post the setup instructions for these benchmarks so I can try and
reproduce?

I started work on improving the performance of tedious all over the board,
but that is not really in any usable state yet - but should improve the
performance of tedious many times.

I'm not sure exactly what you need. But here is the code I used:

NodeJS

var Connection = require("tedious").Connection;
var Request = require("tedious").Request;
var query = process.argv[2] || "select top 1000000 * from tbl1" ;
var config = {
    userName: "user",
    password: "password",
    server: "server",
    options: {
        database: "db",
        requestTimeout: 3600000
    }
};
var connection = new Connection(config);
connection.on("connect", connected);
connection.on("infoMessage", infoError);
connection.on("errorMessage", infoError);
connection.on("end", end);
connection.on("debug", debug);
function connected(err) {
    console.time("TIME TAKEN");
    if (err) {
        console.log(err);
        process.exit(1);
    }
    console.log("connected");
    exec(query);
    process.stdin.on("end", function () {
        process.exit(0);
    });
}
function exec(sql) {
    sql = sql.toString();
    var request = new Request(sql, statementComplete);
    //request.on("row", row);
    request.on("doneProc", requestDone);
    connection.execSql(request);
}
function requestDone(rowCount, more, returnStatus, rows) {
    console.timeEnd("TIME TAKEN");
}
function statementComplete(err, rowCount) {
    if (err) {
        console.log("Statement failed: " + err);
    }
    else {
        console.log(rowCount + " rows");
        console.timeEnd("TIME TAKEN");
    }
    connection.close();
}
function end() {
    console.log("Connection closed");
    process.exit(0);
}
function infoError(info) {
    console.log(info.number + " : " + info.message);
}
function debug(message) {
    console.log(message);
}
function columnMetadata(columnsMetadata) {
    columnsMetadata.forEach(function (column) {
        console.log(column);
    });
}

Java

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class App {
    public static void main(String[] args) {

        Connection conn = null;
        Statement stmt = null;
        try {
            long start_time = System.currentTimeMillis();
            DBConnection dbConn = new DBConnection();
            conn = dbConn.getConnection();
            String sql = "Select top 1000000 * from tbl1";
            stmt = conn.createStatement();
            stmt.setFetchSize(100000);
            ResultSet rs = stmt.executeQuery(sql);
            while (rs.next());
            rs.close();
            long end_time = System.currentTimeMillis();
            System.out.println((end_time - start_time) / 1000.0);
        } catch (Exception ex) {
            ex.printStackTrace();
        }

    }
}

DBConnection.java

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DBConnection {

    private static final String url = "jdbc:sqlserver://[server];databaseName=[db];user=[user];password=[password]";
    private Connection conn = null;

    public DBConnection(){
        try {
            initConnetion();
        } catch (ClassNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (SQLException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public void initConnetion() throws ClassNotFoundException, SQLException{
        if(conn == null){
            Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
            conn = DriverManager.getConnection(url);
            conn.setAutoCommit(false);
        }
    }

    public Connection getConnection(){
        return conn;
    }

    public void close(){
        if(conn != null){
            try {
                conn.close();
            } catch (SQLException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
}

V8 Profiler log
processed.txt

Please let me know if you need any other details.

Thanks! Looking at the profiler output, most of the time is spent in the token parsing code, most specifically in the row token and value parsing code. This should get a lot better with the new parsing code, but I don't have an ETA on when it will land in tedious. I'll let you know!

Thank you for looking into it.

@arthurschreiber Do you think the new parsing code is far off?

I am also seeing serious performance issues with a real dataset that has 19 columns x ~6500 rows.

With tedious, the query takes anywhere from 10 - 19 seconds (Linux).
The same query takes 1 - 3 seconds in the MSSQL Management Studio (Windows).

My data consists of mostly VARCHARS and numbers. Please let me know if I can assist.

The main goal of the new parsing code is to avoid closures at all cost.

If you look at some of the current code (https://github.com/tediousjs/tedious/blob/8c5932371f66befca06c372c1db16c5f3a740e9c/src/token/loginack-token-parser.js#L10-L43) you can see that the code makes heavy use of closures - and parsing a LOGINACK token is still one of the simpler tokens!

This is problematic in a few ways, that all less than optimal performance in different regards.

For one, it causes relatively high memory usage. Another, and IMHO the biggest problem is that it either forces V8 to not optimize some of the functions or causes deoptimization and reoptimization all the time, which are both huge hits in performance.

@arthurschreiber Do you think the new parsing code is far off?

@andymac4182 I'll try to push the work I've done so far to the repo here as soon as possible. Unfortunately, it's an all-or-nothing change, meaning all the parsing code has to be modified in one go, and we can't modify the parsing code in small steps.

I am also seeing serious performance issues with a real dataset that has 19 columns x ~6500 rows.

With tedious, the query takes anywhere from 10 - 19 seconds (Linux).
The same query takes 1 - 3 seconds in the MSSQL Management Studio (Windows).

My data consists of mostly VARCHARS and numbers. Please let me know if I can assist.

@MarkHerhold If you could take a look at the existing benchmarks and maybe create a new one that roughly reflects your use case, that'd be awesome!

Hi @arthurschreiber
I profiled the code and it turns out that my own code was causing the slowdown. The tedious parsing code didn't even show up in the profiler as a trouble spot and does not take long to run at all, so my apologies for the confusion and misinformation!

Find another performance issue. The profiler data seems different from the one provided at the top. Tedious is called by TypeOrm when using createQueryBuilder().insert(). The function took lots of time are

  121455   31.6%  /lib/x86_64-linux-gnu/libc-2.23.so
  30929   25.5%    LazyCompile: *writeUInt8 /xxxxx/tedious/lib/tracking-buffer/writable-tracking-buffer.js:74:31

0724-profiler.txt

Was this page helpful?
0 / 5 - 0 ratings