This has confused me for a long time, so I am posting the question here in the hope of a quick answer.
I am running an experiment on ad-click data with the loss function set to "binary:logistic". When I dump the model, I find that many leaf scores in the first tree are negative. This confuses me: I thought a leaf score was the mean of the target values of the samples that fall into that leaf, so it could not be negative in the first tree.
How is the leaf score calculated? Are there any papers that describe this? Please let me know.
Thanks!
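For readers with the same question, here is a minimal sketch of why a first-tree leaf can be negative. It assumes the regularized second-order leaf weight w = -eta * G / (H + lambda) described in the XGBoost documentation ("Introduction to Boosted Trees"), with logistic-loss gradients g = p - y and h = p(1 - p). The helper names `Sigmoid` and `LeafWeight` are hypothetical and are not taken from the library source. The sketch ignores other regularization options such as L1.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Logistic (sigmoid) function: maps a raw margin to a probability.
inline double Sigmoid(double margin) {
  return 1.0 / (1.0 + std::exp(-margin));
}

// Hypothetical helper (not library code): weight of one leaf under the
// assumed formula w = -eta * G / (H + lambda).
//   labels : 0/1 targets of the samples that fall into the leaf
//   margin : raw score those samples had before this tree was added
//   lambda : L2 regularization on leaf weights
//   eta    : shrinkage (learning rate)
inline double LeafWeight(const std::vector<int> &labels, double margin,
                         double lambda, double eta) {
  assert(!labels.empty());
  const double p = Sigmoid(margin);
  double grad_sum = 0.0;  // G: sum of first derivatives of the loss
  double hess_sum = 0.0;  // H: sum of second derivatives of the loss
  for (int y : labels) {
    grad_sum += p - y;
    hess_sum += p * (1.0 - p);
  }
  return -eta * grad_sum / (hess_sum + lambda);
}
```

Under these assumptions, a leaf holding three non-clicks and one click, with an initial margin of 0 (probability 0.5), lambda = 1 and eta = 1, gets G = 1.0 and H = 1.0, so its weight is -0.5. The leaf value is a step in margin space, not a mean of the targets, so it is negative whenever the leaf holds mostly negative samples.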
The sum of the leaf scores is the score before the logistic transformation.
@tqchen Sorry, I still don't understand. What does "the sum of" mean? Does it mean that the final prediction score equals the logistic transformation of the sum of the leaf scores? Also, I still don't know how the leaf score itself is calculated. Can you point me to some material that explains this?
Thanks!
For one instance, the values of the corresponding leaves (one per tree) are summed to get a score \hat{y}; the probability is then the logistic transformation of \hat{y}.
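The two steps in that answer can be sketched as follows. This is an illustration only: `RawScore` and `Probability` are hypothetical names, and the input is assumed to be the list of leaf values one instance reaches, one per tree.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum the values of the leaves that one instance falls into (one per tree).
// This is the raw score \hat{y}, and it may be negative.
inline double RawScore(const std::vector<double> &leaf_values) {
  assert(!leaf_values.empty());
  double sum = 0.0;
  for (double v : leaf_values) {
    sum += v;
  }
  return sum;
}

// Apply the logistic transformation to the raw score to get a probability.
inline double Probability(const std::vector<double> &leaf_values) {
  return 1.0 / (1.0 + std::exp(-RawScore(leaf_values)));
}
```

A negative raw score simply maps to a probability below 0.5, and a positive one to a probability above 0.5.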
One follow-up on this thread: How about when shrinkage is not 1? If it's still just a sum, then I assume shrinkage is already lumped into leaf value?
@profyao I have just read the related code, shown below, and your assumption is right: it is still just a sum. Because the shrinkage is already folded into the leaf values, prediction saves one multiplication per tree for each instance.
// file: gbtree.cc, from line 513 to line 530
// make a prediction for a single instance
inline bst_float PredValue(const RowBatch::Inst &inst,
                           int bst_group,
                           unsigned root_index,
                           RegTree::FVec *p_feats,
                           unsigned tree_begin,
                           unsigned tree_end) {
  bst_float psum = 0.0f;
  p_feats->Fill(inst);
  for (size_t i = tree_begin; i < tree_end; ++i) {
    if (tree_info[i] == bst_group) {
      int tid = trees[i]->GetLeafIndex(*p_feats, root_index);
      psum += (*trees[i])[tid].leaf_value();
    }
  }
  p_feats->Drop(inst);
  return psum;
}
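To make the shrinkage point concrete, here is a small sketch of the arithmetic only. It is not library code, and all function names are hypothetical. It compares multiplying by the shrinkage at prediction time with folding it into the stored leaf values once. It says nothing about training, where later trees depend on the shrunk predictions of earlier ones.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Option A: keep unshrunk leaf weights and multiply by eta at prediction time.
inline double PredictScalingEachLeaf(const std::vector<double> &raw_weights,
                                     double eta) {
  double sum = 0.0;
  for (double w : raw_weights) {
    sum += eta * w;  // one extra multiplication per tree
  }
  return sum;
}

// Option B, step 1: fold eta into the leaf values once, when they are stored.
inline std::vector<double> FoldShrinkage(std::vector<double> raw_weights,
                                         double eta) {
  assert(eta > 0.0);
  for (double &w : raw_weights) {
    w *= eta;
  }
  return raw_weights;
}

// Option B, step 2: prediction is then a plain sum, as in PredValue above.
inline double PredictPlainSum(const std::vector<double> &stored_values) {
  double sum = 0.0;
  for (double v : stored_values) {
    sum += v;
  }
  return sum;
}
```

Both options give the same raw score for a given set of leaves, so the dumped leaf values can be read as already including the shrinkage.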
I'm not sure whether this is the right place to ask, but I think my question is related:
Given that the final prediction for a sample is the logistic transformation of the sum of the values of the leaves the sample falls into, we should never get a negative prediction value, even if some leaf values in the trees are negative. Is this reasoning correct?
However, I actually do get some negative prediction values. What situation might cause this?
Thank you~