Painless currently has no access to dense_vector except as an opaque value: users can pass it around but cannot access its contents.
We can expose the vector as an iterator, which will give access to the data without revealing the internal representation.
cc: @mayya-sharipova @jtibshirani
Pinging @elastic/es-core-infra (:Core/Infra/Scripting)
I would suggest NOT exposing a dense vector iterator.
There are two reasons for this:
@stu-elastic I am wondering if you have any specific need to expose a dense vector iterator?
@mayya-sharipova this is based on a request from @HonzaKral. After chatting we thought an iterator was pretty lightweight.
I'm trying to understand point 2: if we iterated over a vector and provided the values, what would the issue be? We can simply ignore the magnitude at the end.
@HonzaKral I am interested to learn about a use case for accessing vector values directly.
@stu-elastic
> We can simply ignore the magnitude at the end.
For now, depending on the index version, it is just the magnitude at the end, but later we may add more metadata.
Were you planning to iterate over the binary doc values directly? It would be tricky for a user to decode them.
Alternatively, I can see how we could expose an iterator over float[] (the original vector values) by first decoding them, which is what we do in our vector functions.
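To make the decoding idea concrete, here is a minimal sketch in plain Java. It is not the Elasticsearch implementation: the class `DenseVectorSketch`, its methods, and the byte layout (big-endian 4-byte floats, optionally followed by one 4-byte magnitude) are assumptions for illustration only. The `hasTrailingMagnitude` flag stands in for whatever the index version would tell us.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: NOT the real Elasticsearch encoder/decoder.
// Assumed layout: big-endian float32 values, optionally followed by
// one float32 holding the vector magnitude.
final class DenseVectorSketch {

    // Encodes values into the assumed layout (used here only to build test input).
    static byte[] encode(float[] values, boolean appendMagnitude) {
        int size = values.length * Float.BYTES + (appendMagnitude ? Float.BYTES : 0);
        ByteBuffer buffer = ByteBuffer.allocate(size);
        double sumOfSquares = 0.0;
        for (float value : values) {
            buffer.putFloat(value);
            sumOfSquares += (double) value * value;
        }
        if (appendMagnitude) {
            buffer.putFloat((float) Math.sqrt(sumOfSquares));
        }
        return buffer.array();
    }

    // Decodes only the vector values, skipping the trailing magnitude if present.
    static float[] decode(byte[] encoded, boolean hasTrailingMagnitude) {
        int payloadBytes = encoded.length - (hasTrailingMagnitude ? Float.BYTES : 0);
        if (payloadBytes < 0 || payloadBytes % Float.BYTES != 0) {
            throw new IllegalArgumentException("unexpected encoded length: " + encoded.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(encoded, 0, payloadBytes);
        float[] values = new float[payloadBytes / Float.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getFloat();
        }
        return values;
    }

    // Exposes the decoded values as an iterator, hiding the byte layout from the caller.
    static Iterator<Float> iterator(byte[] encoded, boolean hasTrailingMagnitude) {
        final float[] values = decode(encoded, hasTrailingMagnitude);
        return new Iterator<Float>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < values.length;
            }

            @Override
            public Float next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return values[position++];
            }
        };
    }
}
```

Because the caller only ever sees decoded floats, any metadata appended after the values later on would stay invisible to scripts, as long as the decoder knows how many trailing bytes to skip.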
> Alternatively I can see how we can expose an iterator over float[] -- original vectors' values by first decoding them, something we do in our vector functions
That's what we were thinking, allow users to get their data back.
The use case we encountered is the ability to access data in order, just as in #49695. The idea was to store historical records (the price at a point in time) where we are then interested in deltas between two arbitrary points (for example, yesterday's price compared to the price a year ago).
Another use case that came up was a user implementing a custom vector function in Painless.
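Both use cases can be illustrated with a rough sketch in plain Java, assuming the vector values are already available as a `float[]` or an `Iterator<Float>`. The class `VectorUseCases` and its methods are hypothetical names, not part of Painless or Elasticsearch, and cosine similarity is just one example of a custom vector function.

```java
import java.util.Iterator;

// Hypothetical helpers illustrating the two use cases; not an existing API.
final class VectorUseCases {

    // Difference between the values stored at two positions,
    // e.g. the price at toIndex minus the price at fromIndex.
    static float delta(float[] prices, int fromIndex, int toIndex) {
        return prices[toIndex] - prices[fromIndex];
    }

    // A custom vector function built only on iteration: cosine similarity.
    static double cosineSimilarity(Iterator<Float> a, Iterator<Float> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        while (a.hasNext() && b.hasNext()) {
            double x = a.next();
            double y = b.next();
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (a.hasNext() || b.hasNext()) {
            throw new IllegalArgumentException("vectors have different lengths");
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The delta case needs positional access while the similarity case only needs in-order iteration, so an iterator alone covers the second use case but makes the first one linear in the position.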
@HonzaKral thanks, these look like valid use cases to me.
> That's what we were thinking, allow users to get their data back.
@stu-elastic thanks, makes sense. We need to think about how to implement it, as DenseVectorScriptDocValues needs to know the index version to decode vectors correctly.
A relevant request from another user: exposing vector functions in Painless contexts other than ScoreScript.CONTEXT.
@mayya-sharipova Just wanted to check to see if you would still like this to be exposed.