Flair: confidence score for all classes

Created on 1 Aug 2019  ·  4 Comments  ·  Source: flairNLP/flair

Is it possible to access not only the confidence score of the most likely label but instead get all scores for all possible labels?

For example, if my model is trained to predict 4 classes, I would like to get a confidence score for each of the 4 entity classes.

I want to implement multiple active learning strategies using flair. Currently, I normalize the confidence score of each token by dividing it by the number of tokens in the sentence, and then sum up the normalized scores. That gives me one score per sentence (the mean token confidence), which indicates how confident the model is in its prediction.
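The normalization described above can be sketched roughly as follows. This is a minimal illustration only: `sentence_confidence` and `least_confident_first` are hypothetical helper names, and the per-token scores are assumed to be plain floats already read from the model's predictions.

```python
from typing import Dict, List


def sentence_confidence(token_scores: List[float]) -> float:
    """Divide each token score by the token count and sum the results.

    This is the same as the mean token confidence of the sentence.
    """
    if not token_scores:
        raise ValueError("sentence has no tokens")
    n = len(token_scores)
    return sum(score / n for score in token_scores)


def least_confident_first(scores_by_sentence: Dict[str, List[float]]) -> List[str]:
    """Order sentence ids so the least confident sentence comes first."""
    return sorted(scores_by_sentence,
                  key=lambda sid: sentence_confidence(scores_by_sentence[sid]))
```

Sentences at the front of the returned list would be the first candidates to hand to an annotator.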

If I could have more information about the confidence of all possible classes, I could try out some other strategies like entropy-based uncertainty sampling.
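For illustration, entropy-based uncertainty sampling could look something like the sketch below once a score for every class is available. It assumes each token's distribution is given as a dict from label to score; the function names are hypothetical, and averaging token entropies into a sentence score is only one possible aggregation.

```python
import math
from typing import Dict, List


def token_entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of one token's label distribution.

    Scores are renormalized first, in case they do not sum exactly to 1.
    """
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    entropy = 0.0
    for score in dist.values():
        p = score / total
        if p > 0:  # treat 0 * log(0) as 0
            entropy -= p * math.log(p)
    return entropy


def sentence_entropy(token_dists: List[Dict[str, float]]) -> float:
    """Mean token entropy; higher values mean a more uncertain sentence."""
    if not token_dists:
        raise ValueError("sentence has no tokens")
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)
```

With 4 classes, a token scores 0 when all mass is on one label and log(4) when the model is maximally unsure, so sentences with the highest mean entropy would be selected for labeling first.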

Can you help me out with that?

question

All 4 comments

Hello @m-michalek,

After a prediction, you can get the scores of all labels for each Token.

In the Token class you can find:

self.tags_proba_dist: Dict[str, List[Label]] = {}

I think it's what you're looking for.
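As a rough sketch of how such a per-token list of labels might be turned into a plain label-to-score mapping: the code below uses a stand-in `FakeLabel` class (hypothetical) that only mimics the `value` and `score` attributes of a Flair Label, so it runs without Flair installed. The Flair calls mentioned in the comments are assumptions about the releases of that time and should be checked against the installed version.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class FakeLabel:
    """Stand-in for flair.data.Label; only `value` and `score` are mimicked."""
    value: str
    score: float


def labels_to_distribution(labels: Iterable) -> Dict[str, float]:
    """Map each label value to its score for one token."""
    return {label.value: label.score for label in labels}


# Assumption: with a real model, the list for one token would come from
# something like
#     tagger.predict(sentence, all_tag_prob=True)
#     labels = token.get_tags_proba_dist("ner")
# Both names are recalled from older Flair releases and may differ in yours.
labels = [FakeLabel("PER", 0.7), FakeLabel("LOC", 0.2), FakeLabel("O", 0.1)]
distribution = labels_to_distribution(labels)
```

The resulting dict has one entry per class, which is the shape an entropy or margin computation typically needs.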

Hello @BaptisteBlouin,
That's exactly what I was looking for, thank you!

Do you know if it is possible to train a model with online learning? I'm now able to select informative instances from my unlabeled dataset but training a whole new model with all the previously labeled data takes a very long time.

I want to add only the new instances to the previously trained model. Tutorial 7 mentions that training can be resumed. But from the example, it looks like the corpus is defined once at the beginning and can't be updated after training has stopped, since, for example, the tag_dictionary depends on the corpus, which wouldn't have information about the newly added instances.

Hello @m-michalek,

You can change the corpus after reloading your model. For example, you can use:

corpus = Corpus(sentences_train, sentences_dev, sentences_test, name="corpus")

where each sentences_... argument is a List[Sentence], so you can put your new instances in it.

But when you do that, you have to use the same tags that you used for the previously trained model. You can't change the tag_dictionary because your predictive layer (Linear, or Linear + CRF) is based on this dictionary. If you only add new instances that use the same tags (those from your previously unlabeled data, now tagged), you can reload your model and build a new corpus, and the tag_dictionary will not have to change.
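One way to guard against accidentally introducing a new tag is to check the newly labeled instances against the tags the model was trained with before building the new corpus. The sketch below is library-independent and purely illustrative: `unknown_tags` is a hypothetical helper, and sentences are assumed to be lists of (token, tag) pairs rather than Flair Sentence objects.

```python
from typing import Iterable, List, Set, Tuple

TaggedSentence = List[Tuple[str, str]]  # (token text, tag) pairs


def unknown_tags(known_tags: Iterable[str],
                 new_sentences: Iterable[TaggedSentence]) -> Set[str]:
    """Return every tag in the new data that the trained model never saw."""
    known = set(known_tags)
    return {tag
            for sentence in new_sentences
            for _, tag in sentence
            if tag not in known}


# Tags the previously trained model knows (example values only).
trained_tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
new_data = [
    [("Anna", "B-PER"), ("lives", "O"), ("in", "O"), ("Oslo", "B-LOC")],
    [("Acme", "B-ORG"), ("hired", "O"), ("Anna", "B-PER")],
]
missing = unknown_tags(trained_tags, new_data)
```

If `missing` is not empty, those instances would need relabeling (or the model retraining from scratch) before they can be added to the corpus.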

Hello @BaptisteBlouin,

Do I have to append the new data to the old corpus or should the new corpus contain only new data if I want to extend a pre-trained model? In other words, does flair know where to continue training?

So after training an initial model I would just execute the following code:

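from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# load the previously trained model
tagger = SequenceTagger.load('resources/taggers/pretrained-model/best-model.pt')

# new_corpus is the Corpus that contains the newly labeled instances
trainer: ModelTrainer = ModelTrainer(tagger, new_corpus)

# start training
trainer.train('resources/taggers/updated-model',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=150)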

I appreciate your help!
