We are designing a new Script Score Query (SSQ) to replace Function Score Query (FSQ). The goal of SSQ is to offer the same functionality as FSQ (and possibly more), available only through Painless scripts. To that end, we would like to add the functions below to Painless. They could be made available either in SearchScript or in a new ScoringScript designed specifically for scoring.
Similar to random_score in FSQ:
"script" : {
"source" : "random_score(params.seed, doc['field'])",
"params": {"seed": 10}
}
Painless currently allows generating random values as shown below, but this is bulky and does not exactly reproduce random_score in FSQ:
"script" : {
"source" : "Random rnd = new Random(); rnd.setSeed(doc['field'].value); rnd.nextFloat()"
}
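To make the intent of a seeded random_score concrete, here is a minimal Java sketch of a helper that returns a reproducible value for a given seed and field value. The class name, method name, and the way the seed and field value are mixed are all assumptions made for illustration; this is not FSQ's algorithm and not a proposed Painless signature.

```java
import java.util.Random;

// Hypothetical helper for illustration only; not the FSQ implementation.
final class RandomScoreSketch {
    private RandomScoreSketch() {}

    /**
     * Returns a pseudo-random score in [0, 1) that is stable for a given
     * (seed, fieldValue) pair, so the same document scores the same way
     * on every execution of the same query.
     */
    static float randomScore(long seed, long fieldValue) {
        // Assumed mixing step: combine the query seed with the per-document value
        // so that different documents start from different generator states.
        long mixed = seed * 31L + fieldValue;
        return new Random(mixed).nextFloat();
    }
}
```

The point of the sketch is only that the score depends on nothing but the seed and the per-document value, which is what makes results reproducible across requests.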
We would like to introduce shorter versions of the following functions, which are useful for score calculations:
Math.log10(doc['f'].value) -> log(doc['f'].value)
Math.log10(doc['f'].value + 1) -> log1p(doc['f'].value)
Math.log10(doc['f'].value + 2) -> log2p(doc['f'].value)
Math.log(doc['f'].value) -> ln(doc['f'].value)
Math.log1p(doc['f'].value) -> ln1p(doc['f'].value)
Math.log(doc['f'].value + 2) -> ln2p(doc['f'].value)
Math.pow(doc['f'].value, 2) -> square(doc['f'].value)
Math.sqrt(doc['f'].value) -> sqrt(doc['f'].value)
1.0 / doc['f'].value -> reciprocal(doc['f'].value)
doc['f'].value / (k + doc['f'].value) -> rational(doc['f'].value, k)
Math.pow(doc['f'].value, a) / (Math.pow(k, a) + Math.pow(doc['f'].value, a)) -> sigmoid(doc['f'].value, k, a)
Similar to decay functions in FSQ:
Proposed API:
"script" : {
"source" : "decay_gauss(doc['date'], params.origin, params.scale, params.offset, params.decay)",
"params": {
"origin": "2013-09-17",
"scale": "10d",
"offset": "5d",
"decay" : 0.5
}
}
"script" : {
"source" : "decay_linear(doc['geo'], params.origin, params.scale, params.offset, params.decay)",
"params": {
"origin": "11, 12",
"scale": "2km",
"offset": "0km",
"decay" : 0.33
}
}
Investigate how to parse the date and geo parameters only once per query rather than for every document (store them in the context?).
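One way to picture the parse-once idea is an object built once per query from already-parsed numeric parameters, then called for every document. The Java sketch below does this for numeric values, using the Gaussian and linear decay formulas as documented for FSQ. The class and method names are hypothetical, and converting dates ("2013-09-17", "10d") or geo points ("11, 12", "2km") to numbers is assumed to happen before construction and is not shown.

```java
// Hypothetical sketch: parameters are validated and derived constants are
// computed once in the constructor, so per-document work is only arithmetic.
final class NumericDecaySketch {
    private final double origin;
    private final double offset;
    private final double gaussDenominator; // equals 2 * sigma^2
    private final double linearScale;      // equals scale / (1 - decay)

    NumericDecaySketch(double origin, double scale, double offset, double decay) {
        if (scale <= 0 || offset < 0 || decay <= 0 || decay >= 1) {
            throw new IllegalArgumentException("need scale > 0, offset >= 0, 0 < decay < 1");
        }
        this.origin = origin;
        this.offset = offset;
        // sigma^2 = -scale^2 / (2 * ln(decay)), so 2 * sigma^2 = -scale^2 / ln(decay).
        this.gaussDenominator = -(scale * scale) / Math.log(decay);
        this.linearScale = scale / (1.0 - decay);
    }

    // Distance from the origin, with values inside the offset treated as distance 0.
    private double distance(double value) {
        return Math.max(0.0, Math.abs(value - origin) - offset);
    }

    /** Gaussian decay: 1 at the origin, `decay` at distance `scale` beyond the offset. */
    double gauss(double value) {
        double d = distance(value);
        return Math.exp(-(d * d) / gaussDenominator);
    }

    /** Linear decay: 1 at the origin, `decay` at distance `scale` beyond the offset, then down to 0. */
    double linear(double value) {
        return Math.max(0.0, (linearScale - distance(value)) / linearScale);
    }
}
```

With origin 0, scale 10, offset 5 and decay 0.5, both functions return 1 for values up to 5 and 0.5 at value 15. Where such an object would live (for example, in the script context) is exactly the open question raised above.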
_max_score in the rescore context?
_index['text']['word'].tf()
Would like to get feedback from @rjernst @jdconrad
cc @polyfractal
Pinging @elastic/es-core-infra
Random score
Painless is one of the few places where we can't use Randomness right now. It might be worth looking at when we do this.
I'm for adding a new context for this -- ScoringScript is good. For now, the best way to add these methods is to add them as static methods to the ScoringScript class and whitelist them. I will work towards adding a way for methods to be called without the static type qualifier, but that will take me a bit of time.
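As a rough illustration of the static-method approach, the shortcut functions from the proposal could be written as plain Java static methods like the ones below. The holder class name ScoreFunctionsSketch is hypothetical (the real methods would live on ScoringScript and would also need whitelist entries, which are not shown), and taking a double argument instead of a doc-values object is a simplifying assumption.

```java
// Hypothetical holder class; each method mirrors one mapping from the proposal.
final class ScoreFunctionsSketch {
    private ScoreFunctionsSketch() {}

    static double log(double v)   { return Math.log10(v); }      // base-10 log
    static double log1p(double v) { return Math.log10(v + 1); }  // base-10 log of (v + 1)
    static double log2p(double v) { return Math.log10(v + 2); }  // base-10 log of (v + 2)
    static double ln(double v)    { return Math.log(v); }        // natural log
    static double ln1p(double v)  { return Math.log1p(v); }      // natural log of (v + 1)
    static double ln2p(double v)  { return Math.log(v + 2); }    // natural log of (v + 2)
    static double square(double v)     { return v * v; }         // same result as Math.pow(v, 2)
    static double sqrt(double v)       { return Math.sqrt(v); }
    static double reciprocal(double v) { return 1.0 / v; }

    /** v / (k + v): rises from 0 towards 1, reaching 0.5 at v == k. */
    static double rational(double v, double k) {
        return v / (k + v);
    }

    /** v^a / (k^a + v^a): like rational, with exponent a controlling steepness. */
    static double sigmoid(double v, double k, double a) {
        double p = Math.pow(v, a);
        return p / (Math.pow(k, a) + p);
    }
}
```

Until methods can be called without the static type qualifier, a script would have to call these in qualified form.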
_index Lucene term stats (doc count, doc frequency, tf, total term frequency), e.g. _index['text']['word'].tf()
We removed these from scripting in 6.0, as they are for advanced users. I don't think we should add them back. Anyone wanting to do this should write a custom script engine (and thus have access to all of the Lucene API).
Painless is one of the few places where we can't use Randomness right now. It might be worth looking at when we do this.
@nik9000 Can you please elaborate more on why we can't use Randomness in painless?
@mayya-sharipova Randomness would be the way to implement the random_score method which does not take a seed, but for the ones taking a seed, Random should still be used directly. But I don't think we should expose Randomness in painless directly.
Closed with #34533