It would be nice if Elasticsearch/Lucene could support GPUs. I just saw this article http://tech.marksblogg.com/billion-nyc-taxi-rides-nvidia-tesla-mapd.html and I would love to see something approaching that performance with Elasticsearch.
Seems nvidia has some support in this direction:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4506-indexing-documents-on-gpu-web-rt.pdf
http://www.jcuda.org/
While this might really be something for the Lucene project, I thought it might be of more interest here, since Elasticsearch could have different needs than the generic Lucene project (it could also be a good paid add-on).
By the sounds of http://stackoverflow.com/a/22868938/819598, GPUs wouldn't be a great fit for most of the Elasticsearch workload.
That's an interesting link - thanks. The PDF I linked shows indexing data as a good fit - perhaps because it's easily parallelized. That would be good for text/logs, less so for regular analytics.
But it is interesting to think about how MapD is doing it, because it seems to be largely the same problem. (I found Mark's blog because of his reference to using Elasticsearch to analyze the NYC taxi data - it performed really well considering it was on a single machine while most of the other tests ran on cloud clusters.)
But thanks for the investigation / consideration.
In my understanding Elasticsearch has four distinct workloads:
Did you @yehosef have a specific workload in mind? If all the data doesn't fit in the GPU's memory, there are inherent problems with moving data and results back and forth, unless you do "sufficiently much" computation per unit of data.
Also, algorithms with a high branching factor aren't an ideal workload, so it is interesting how MapD achieves its performance. Note that the taxi-data benchmark was done on a machine with roughly 32,000 USD worth of GPUs.
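To put very rough numbers on the data-movement point (every figure below is an assumption for illustration, not a measurement - PCIe bandwidth, kernel throughput and CPU throughput all vary a lot by hardware):

```java
// Back-of-envelope sketch: when does shipping data over PCIe to the GPU pay off?
// All constants are assumptions for illustration, not measurements.
public class GpuTransferBreakEven {
    public static void main(String[] args) {
        double bytes = 1e9;              // ~1 GB of data to process
        double pcieBandwidth = 16e9;     // ~16 GB/s, PCIe 3.0 x16 (assumed)
        double gpuThroughput = 200e9;    // bytes/s a simple scan/filter kernel might reach (assumed)
        double cpuThroughput = 20e9;     // bytes/s for the same scan on the CPU (assumed)

        double transfer   = bytes / pcieBandwidth;  // time just to copy the data to the GPU
        double gpuCompute = bytes / gpuThroughput;  // time the GPU spends on the actual work
        double cpuCompute = bytes / cpuThroughput;  // time the CPU would have taken locally

        System.out.printf("transfer %.0f ms, gpu compute %.0f ms, cpu compute %.0f ms%n",
                transfer * 1e3, gpuCompute * 1e3, cpuCompute * 1e3);
        // With these numbers the copy alone (~60 ms) dwarfs the GPU kernel (~5 ms) and is
        // already comparable to just doing the scan on the CPU (~50 ms). A cheap one-pass
        // operation only wins if the data is already resident in GPU memory - which is
        // essentially how MapD keeps its hot columns on the card.
    }
}
```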
I didn't have a specific workload in mind - all of them are important. It might be valuable even if it only helps one or two of the workloads. I don't know enough about Lucene internals to know whether or where GPUs would help, but I'll speculate:
1) Indexing - the paper mentioned indexing, so that seems reasonable. You have different documents that have to be tokenized, stemmed, etc. - the same, relatively simple work happening over and over again with relatively little interaction between the documents (see the first sketch after this list).
2) Merging indices - you have two indices and need to merge them: the postings for the word "foo" in index A and index B need to be combined, and likewise the postings for "bar", etc. (a rough sketch of this, together with 4, follows the list).
3) I don't know how the aggregations work internally, but it seems like parallelization might help - different cores dealing with different elements of the agg - though I'm not sure.
4) It could help searching as well - I have postings for "the", "quick", "brown" and "fox" which might reference millions of documents. I need to work through those to find which documents match most of these terms, and factor in the tf/idf scoring. Different cores could take a subset of the document references, process it, and pass along the results (see the postings sketch after this list).
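For (1), the "independent per-document work" shape is easy to see even on the CPU side. Here is a minimal sketch using Lucene's stock StandardAnalyzer (nothing Elasticsearch-specific, and a CPU parallel stream rather than actual GPU code) - it just shows that every document goes through the same tokenize/normalize pipeline with no shared state, which is what would make it a candidate for many lighter cores:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class ParallelAnalyzeSketch {
    public static void main(String[] args) {
        List<String> docs = List.of("The quick brown fox", "jumps over the lazy dog");
        Analyzer analyzer = new StandardAnalyzer(); // thread-safe; reuses token streams per thread

        // Each document is analyzed independently - identical work, no interaction between docs.
        List<List<String>> tokenized = docs.parallelStream()
                .map(doc -> analyze(analyzer, doc))
                .collect(Collectors.toList());
        System.out.println(tokenized);
    }

    static List<String> analyze(Analyzer analyzer, String text) {
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            List<String> tokens = new ArrayList<>();
            while (ts.incrementToken()) {
                tokens.add(term.toString());
            }
            ts.end();
            return tokens;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```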
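For (2) and (4), the core operation is walking sorted postings lists. A minimal CPU sketch of the two-pointer merge (segment merging) and intersection (conjunctive search) over toy doc-id arrays - not Lucene's real codecs or scoring, just the shape of the work a GPU version would have to partition:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of sorted postings lists (doc ids) - not Lucene's actual data structures.
public class PostingsSketch {

    // (2) Segment merge: combine the postings for the same term from two segments.
    static int[] union(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) out.add(a[i++]);
            else if (a[i] > b[j]) out.add(b[j++]);
            else { out.add(a[i]); i++; j++; }
        }
        while (i < a.length) out.add(a[i++]);
        while (j < b.length) out.add(b[j++]);
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    // (4) Conjunctive search: doc ids that appear in both terms' postings.
    static int[] intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) i++;
            else if (a[i] > b[j]) j++;
            else { out.add(a[i]); i++; j++; }
        }
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] quick = {1, 4, 7, 9};
        int[] brown = {2, 4, 9, 12};
        System.out.println("merged:      " + Arrays.toString(union(quick, brown)));
        System.out.println("intersected: " + Arrays.toString(intersect(quick, brown)));
        // A GPU version would presumably give each thread block a doc-id range (or a slice
        // of the lists), run these scans in parallel, and stitch the partial results together.
    }
}
```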
So - these are just speculations based on my limited understanding of how Elasticsearch and Lucene work. Just as Elasticsearch works by putting different Lucene shards on different machines and bringing the results back together, the same approach might work inside one machine across many cores instead of across machines.
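That scatter/gather idea, compressed into a CPU-only sketch (the partitions stand in for shards or per-core slices; the partial results combine the same way whether they come from another machine, another core, or potentially a GPU - the per-term counts here are made-up example data):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ScatterGatherSketch {
    public static void main(String[] args) {
        // Pretend these are the "status" field values of documents in four shards/partitions.
        List<List<String>> partitions = List.of(
                List.of("200", "200", "404"),
                List.of("200", "500"),
                List.of("404", "404", "200"),
                List.of("200"));

        // Scatter: each partition builds its own partial terms count independently.
        List<Map<String, Long>> partials = partitions.parallelStream()
                .map(docs -> docs.stream()
                        .collect(Collectors.groupingBy(s -> s, Collectors.counting())))
                .collect(Collectors.toList());

        // Gather: reduce the partial maps into the final aggregation.
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> part : partials) {
            part.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        System.out.println(merged); // {200=5, 404=3, 500=1}
    }
}
```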
thanks again for thinking about it.
I think this is too high-hanging a fruit even to label it that way. We discussed it internally and we don't see this happening in the next couple of years.
So I suppose I shouldn't open a ticket to port Elasticsearch to http://www.seastar-project.org/ to get a 10x performance boost like http://www.scylladb.com/ gets over Cassandra?
For anyone interested - this is being explored in Lucene https://issues.apache.org/jira/browse/LUCENE-7745
However, for machine learning, GPU support sounds quite interesting.
@Elastic, heads up! The search engines of the future are coming: https://github.com/facebookresearch/faiss
You have to invest in GPU acceleration for vector search operations to stay competitive :)
Percolate would benefit from an OpenCL or CUDA implementation.