Xgboost: How can the prediction score of a leaf in the first tree be negative?

Created on 14 Oct 2015 · 7 Comments · Source: dmlc/xgboost

This has confused me for a long time, so I am posting the question here in the hope of getting an answer quickly.
I am experimenting with ad-click data, with the loss function set to "binary:logistic". When I dump the model, I find that many leaf scores in the first tree are negative. This confuses me: I thought a leaf's score was the mean of the target values of the samples that fall into that leaf, so it could not be negative in the first tree.
How is the leaf score calculated? Are there any papers that explain this? Please let me know.
Thanks!

wenmin-wu

All 7 comments

The sum of the leaf scores is the score before the logistic transformation.

tqchen on 14 Oct 2015

👍6

@tqchen Sorry, I still don't understand. What does "the sum of" mean? Does it mean that the final prediction is the logistic transformation of the sum of the leaf scores? Also, I still don't know how a leaf score is calculated. Can you point me to some material that explains this?
Thanks!

wenmin-wu on 14 Oct 2015

http://xgboost.readthedocs.org/en/latest/model.html

tqchen on 14 Oct 2015
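
As a rough illustration of the leaf-weight formula derived on the linked page, w = -G / (H + lambda), the sketch below computes the weight of a single leaf for the logistic loss. The function names, the `margin` argument (the raw score each row has before the tree is added, assumed to be 0.0 for the first tree, i.e. an initial probability of 0.5), and the multiplication by `eta` are assumptions made for illustration. This is not xgboost's actual code, and it ignores every other training option.

```cpp
#include <cmath>
#include <vector>

// Logistic (sigmoid) function: maps a raw margin to a probability.
inline double Sigmoid(double margin) {
  return 1.0 / (1.0 + std::exp(-margin));
}

// Hypothetical helper (not an xgboost API): weight of one leaf for the
// logistic loss, given the 0/1 labels of the rows that fall into the leaf.
// All rows are assumed to share the same incoming raw score `margin`.
inline double LeafWeight(const std::vector<int>& labels, double margin,
                         double lambda, double eta) {
  const double p = Sigmoid(margin);  // current predicted probability
  double G = 0.0;                    // sum of first-order gradients
  double H = 0.0;                    // sum of second-order gradients
  for (int y : labels) {
    G += p - y;          // gradient of the log loss w.r.t. the margin
    H += p * (1.0 - p);  // Hessian of the log loss w.r.t. the margin
  }
  // Regularized optimum -G / (H + lambda), scaled by the shrinkage eta.
  return eta * (-G / (H + lambda));
}
```

Under these assumptions, a leaf holding one positive and three negative rows gets `LeafWeight({1, 0, 0, 0}, 0.0, 1.0, 0.3)`, which is about -0.15. The weight is a step in margin space rather than a mean of the targets, so a leaf dominated by negatives comes out negative even in the first tree.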

For one instance, the values of the corresponding leaves (one per tree) are summed to get a score \hat{y}; the probability is then the logistic transformation of \hat{y}.

tqchen on 14 Oct 2015

👍2
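
The two steps described in the comment above can be sketched as follows. This is a minimal illustration, not xgboost code: `RawScore` and `Probability` are hypothetical names, the input is assumed to be the list of leaf values one instance lands in (one per tree), and any initial offset such as a base score is left out.

```cpp
#include <cmath>
#include <vector>

// Step 1: sum the values of the leaves the instance falls into, one per tree.
inline double RawScore(const std::vector<double>& leaf_values) {
  double sum = 0.0;
  for (double v : leaf_values) sum += v;
  return sum;
}

// Step 2: apply the logistic transformation to the summed score.
inline double Probability(const std::vector<double>& leaf_values) {
  return 1.0 / (1.0 + std::exp(-RawScore(leaf_values)));
}
```

With leaf values such as -0.15, -0.10 and 0.05, the raw score is negative and the resulting probability falls below 0.5 but stays above 0. Negative leaves therefore push the prediction toward the negative class without ever producing a negative probability.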

A follow-up on this thread: what happens when shrinkage is not 1? If it's still just a sum, then I assume the shrinkage is already lumped into the leaf value?

profyao on 2 Jun 2017

@profyao I have just read the relevant code, shown below. Your assumption is right: it is still just a sum. With the shrinkage already lumped into the leaf value, prediction avoids an extra multiplication for each instance.

  // file: gbtree.cc, from line 513 to line 530
  // make a prediction for a single instance
  inline bst_float PredValue(const RowBatch::Inst &inst,
                             int bst_group,
                             unsigned root_index,
                             RegTree::FVec *p_feats,
                             unsigned tree_begin,
                             unsigned tree_end) {
    bst_float psum = 0.0f;
    p_feats->Fill(inst);
    for (size_t i = tree_begin; i < tree_end; ++i) {
      if (tree_info[i] == bst_group) {
        int tid = trees[i]->GetLeafIndex(*p_feats, root_index);
        psum += (*trees[i])[tid].leaf_value();
      }
    }
    p_feats->Drop(inst);
    return psum;
  }

wenmin-wu on 4 Jun 2017
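
To make the shrinkage point concrete, here is a small sketch comparing the two options: scaling each leaf weight by the shrinkage once when it is stored, versus multiplying at prediction time. All names (`StoreLeaves`, `PredictSum`, `PredictWithEta`) are hypothetical and the trees are reduced to a list of leaf weights. It only illustrates why the plain sum in `PredValue` is sufficient.

```cpp
#include <vector>

// Training side (hypothetical): scale each unshrunk leaf weight by eta once,
// at the time it is stored in the model.
inline std::vector<double> StoreLeaves(const std::vector<double>& raw_weights,
                                       double eta) {
  std::vector<double> stored;
  stored.reserve(raw_weights.size());
  for (double w : raw_weights) stored.push_back(eta * w);
  return stored;
}

// Prediction with pre-scaled leaves: a plain sum, as in PredValue.
inline double PredictSum(const std::vector<double>& stored_leaves) {
  double psum = 0.0;
  for (double v : stored_leaves) psum += v;
  return psum;
}

// Alternative for comparison: keep unshrunk weights and multiply by eta
// every time a prediction is made.
inline double PredictWithEta(const std::vector<double>& raw_weights,
                             double eta) {
  double psum = 0.0;
  for (double w : raw_weights) psum += eta * w;
  return psum;
}
```

Both paths give the same score for the same weights, for example -0.06 for weights -0.5, 0.2 and 0.1 with a shrinkage of 0.3. The first path pays for the multiplication once at training time instead of on every prediction.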

I'm not sure whether this is the right place to ask, but I think it is related:
Given that the final prediction for a sample is a logistic transformation of the sum of the values of the leaves the sample falls into, we should never get a negative prediction, even if some leaf values in the trees are negative. Is this reasoning correct?
However, I do get some negative predictions. What might be causing this?
Thank you~
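
The thread does not answer this last question. One possible explanation, offered here only as an assumption, is that the predictions are raw margins rather than probabilities. That can happen, for example, when the objective is "binary:logitraw" or a regression objective, or when raw-margin output is requested at prediction time (the `output_margin` option in the Python API). The sketch below uses hypothetical helper names to check for values outside [0, 1] and to convert margins into probabilities.

```cpp
#include <cmath>
#include <vector>

// A logistic output always lies strictly between 0 and 1, so any value
// outside [0, 1] cannot be a probability and is likely a raw margin.
inline bool LooksLikeRawMargin(const std::vector<double>& preds) {
  for (double p : preds) {
    if (p < 0.0 || p > 1.0) return true;
  }
  return false;
}

// Convert raw margins into probabilities with the logistic transformation.
inline std::vector<double> MarginToProbability(
    const std::vector<double>& margins) {
  std::vector<double> probs;
  probs.reserve(margins.size());
  for (double m : margins) probs.push_back(1.0 / (1.0 + std::exp(-m)));
  return probs;
}
```

If `LooksLikeRawMargin` returns true for the model's output, the argument in the question still holds: after `MarginToProbability`, no value is negative.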