I'm testing orientdb and queries are very slow. For example, you can see that a simple select count(*) in the console takes 7.013sec
however I'm using indexes !
Dataset (IN Vertex) <--> DatasetData (Edge) <--> Data (OUT Vertex)
select count(*) from datasetdata where in.context.domain = "PROFESSIONAL" AND in.context.dataSetDefId="DIRECTORY_PROFESSIONAL_RPPS" and in.archive=false and out.context.itemDefId="PRO_ID_RPPS" and out.stringValue = "10003064481"

There is few records

Here are the indexes
DATASET(VERTEX)

DATA (VERTEX)

DATASETDATA (EDGE)

Here are jvm options
JAVA_OPTS="$JAVA_OPTS -server -XX:+PerfDisableSharedMem "
MAXHEAP="-Xmx2g"
MAXDISKCACHE="-Dstorage.diskCache.bufferSize=8192"
I don't know where to solve the problem.
Hi @j-mi-jim
you query is not actually using indexes in this case. I suggest you to rewrite it in this way:
select count(*) from (
select expand(outE()) from Data where stringValue = "10003064481" and context.itemDefId="PRO_ID_RPPS"
) where in.context.domain = "PROFESSIONAL" AND in.context.dataSetDefId="DIRECTORY_PROFESSIONAL_RPPS" and in.archive=false
Please let me know how the performance increases
Yes, it's definitively faster. 0.01sec
Why it's not using indexes ?
Is it better to query from vertex first than edges ?
yes, in current query executor implementation indexes are not exploited when conditions are expressed with dot notation
Ok, you are saying "current" query executor. will there be changes in the future?
yes, in v 2.1 we rewrote the query parser, next step will be the rewriting of the query executor (scheduled for 3.0), to allow better query optimization.
did optimization for this use case make it into v3? we are having a similar issue to #4612 where we cannot say something like:
select Book where Author.name = xxx
because the (existing) index does not get evaluated via the dot notation. and because of the number of objects subqueries are prohibitive.
Hi @turchinc
The new SQL engine is evolving fast, we are actively working on this feature
Thanks
Luigi
Hi @turchinc
Little update here, I pushed an optimization a few hours ago, now the SQL engine can manage simple index chains in an optimized way.
Unfortunately it's too late to have it in v 3.0.0, so it will be released with the first hotfix (v 3.0.1) a few days after the official 3.0.0. Anyway, the snapshot is already available https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-community/3.0.1-SNAPSHOT/
Thanks
Luigi
Hi @luigidellaquila, thanks for the update. I will pass on the info to my team.
tested with 3.0.1 Snapshot Version from 20.04.2018 and index ist used for anything like:
select from Article where owner.name = '...'
or even
select from Article where owner.name = '...' and owner.role = '...'
"owner" is a linked Document-Class with index on "name" and "role". There has to be an index also on "Article.owner" even actual query does not use it. Is this intended?
unfortunately this still does not use any index - "persons" ist a list of linked Documents:
select from Article where persons.name contains '...'
any chance that this will make it into the new SQL engine? Any other way to achieve this?
Hi @ablbi
Right now the query optimizer can manage simple index chains where the indexes have only one LINK property. It is evolving fast though, so LINKLIST/LINKSET and chains of indexes with multiple properties will be managed soon. I'll keep you updated
Thanks
Luigi
Hi @luigidellaquila
Any update on this issue or an idea of when we could plan on working with multiple LINK properties? Or maybe a good alternative way to achieve this as @ablbi was asking? Thanks!
Cheers
-chris
Hi @turchinc
I'm working on further improvements, I new batch of fixes should arrive in next few days
Thanks
Luigi
Hi @luigidellaquila, thanks for the quick update. Looking forward to your commits.
any news on that issue? We are going towards production and still not able to use index on list of linked documents. Using Version 3.0.10
also we ran into the problem with select count(*)
https://github.com/orientechnologies/orientdb/issues/3462
select count() from Article where state = 'WORK'
result:
count() = 4648934
Query executed in 449.315 sec. Returned 1 record(s).
Index on 'state' is used according to explain, but it seems complete result is scanned anyway