In my problem I have 3 classes to predict, and I'm trying to understand how the xgboost trees relate to the predicted probabilities:
library(xgboost)
# Train a multi-class model that outputs per-class probabilities
model2 <- xgboost(data = as.matrix(train_set), label = output_vector,
                  max_depth = 3, eta = 1, nrounds = 50,
                  objective = "multi:softprob", num_class = 3, verbose = FALSE)
# Feature names (dropping the columns indexed by c_set), then dump the trees
names <- dimnames(train_set)[[2]][-c_set]
trees <- xgb.model.dt.tree(feature_names = names, model = model2)
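For context, the probabilities I want to reproduce come from predict(). A minimal sketch of that reference output (test_set is a placeholder for held-out data shaped like train_set):

# multi:softprob returns one flat vector with num_class entries per row,
# so reshape it into one probability column per class
pred  <- predict(model2, as.matrix(test_set))
probs <- matrix(pred, ncol = 3, byrow = TRUE)   # row i = probabilities for test case i
rowSums(probs)                                  # each row sums to 1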
The first few lines of the tree dump are below:
    ID      Feature Split Yes  No Missing   Quality     Cover Tree Yes.Feature Yes.Cover Yes.Quality No.Feature No.Cover No.Quality
1: 0-0 location1100   0.5 0-1 0-2     0-1 26.142200 2666.6700    0 location995   2651.56   22.050700       Leaf  15.1111  -0.579310
2: 0-1  location995   0.5 0-3 0-4     0-3 22.050700 2651.5600    0 location962   2637.33   14.790700       Leaf  14.2222  -0.503650
3: 0-2         Leaf    NA  NA  NA      NA -0.579310   15.1111    0          NA        NA          NA         NA       NA         NA
4: 0-3  location962   0.5 0-5 0-6     0-5 14.790700 2637.3300    0        Leaf   2620.44    0.723138       Leaf  16.8889  -0.204969
5: 0-4         Leaf    NA  NA  NA      NA -0.503650   14.2222    0          NA        NA          NA         NA       NA         NA
6: 0-5         Leaf    NA  NA  NA      NA  0.723138 2620.4400    0          NA        NA          NA         NA       NA         NA
How does this relate to the classes, given that each node has only a "yes" or "no" split? Even the leaves mention neither the class they belong to nor an associated probability. I suppose what I'm really asking is: how would I manually calculate the probabilities for a test case from the tree structure (in my case, from the variable trees)?
A tree always separates only two classes. Hence, if you have 3 classes, xgboost grows 3 trees per boosting round (class 1 vs. rest, class 2 vs. rest, and class 3 vs. rest).
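Concretely, the Tree column in the dump cycles through the classes: with num_class = 3, tree t contributes to class t mod 3, and a leaf's Quality is the raw margin that tree adds for its class. Summing the leaf values per class over all trees and applying a softmax reproduces the multi:softprob output (xgboost's base_score is added to every class margin equally, so it cancels in the softmax). Below is a minimal sketch under those assumptions, using the trees table dumped above and a named feature vector x; predict_leaf and manual_softprob are illustrative helper names, not xgboost functions:

library(data.table)

# Walk one tree for one observation x and return the leaf value it lands in.
predict_leaf <- function(trees, tree_id, x) {
  node <- trees[ID == paste0(tree_id, "-0")]        # root node of this tree
  while (node$Feature != "Leaf") {
    val <- x[[node$Feature]]
    next_id <- if (is.na(val)) node$Missing         # missing values take their own branch
               else if (val < node$Split) node$Yes  # "Yes" = the condition (val < Split) holds
               else node$No
    node <- trees[ID == next_id]
  }
  node$Quality                                      # at a leaf, Quality is the margin it adds
}

# Sum leaf values per class across all trees, then softmax into probabilities.
manual_softprob <- function(trees, x, num_class = 3) {
  margins <- numeric(num_class)
  for (t in unique(trees$Tree)) {
    k <- t %% num_class + 1                         # tree t votes for class (t mod num_class)
    margins[k] <- margins[k] + predict_leaf(trees, t, x)
  }
  exp(margins) / sum(exp(margins))                  # softmax over the summed margins
}

The result of manual_softprob(trees, x) for a test row x should match the corresponding row of the reshaped predict() output in the question.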