Gensim: Monitoring Training loss

Created on 27 Feb 2017 · 8 comments · Source: RaRe-Technologies/gensim

Hi,
How can I monitor the training loss of the model if I am using the Word2Vec model from gensim.models for training?
Also, how can I see the logs generated once training has started? Can I redirect these logs to stdout?


All 8 comments

Reporting of training loss isn't yet offered, but it's a wishlist item discussed in other issues like #999.

gensim uses standard Python logging and emits many details at the INFO and DEBUG levels, so you can configure logging in various ways (see https://docs.python.org/2/howto/logging.html) to see those messages wherever you'd like.
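For example, a minimal standard-library configuration (a sketch; the format string is arbitrary) routes INFO-level messages, including gensim's, to stdout:

```python
import logging
import sys

# Send all INFO-and-above log records from the root logger to stdout.
# gensim's loggers propagate to the root logger, so its progress
# messages will appear here too.
logging.basicConfig(
    format="%(asctime)s : %(levelname)s : %(message)s",
    level=logging.INFO,
    stream=sys.stdout,
)

logging.getLogger("gensim").info("logging is configured")
```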

Resolved in #1201

To monitor loss while training the model, set compute_loss=True and change the following lines (gensim's progress-logging call) to:

logger.info(
    "PROGRESS: at %.2f%% examples, %.0f words/s, in_qsize %i, out_qsize %i, current_loss %.3f",
    100.0 * example_count / total_examples, trained_word_count / elapsed,
    utils.qsize(job_queue), utils.qsize(progress_queue), self.get_latest_training_loss() / example_count
)

@menshikh-iv Maybe this could be automatically done when compute_loss=True?

@villmow maybe, can you create PR and make this change?

Can someone explain why the final training loss I get is so high, no matter whether I use log-loss or negative sampling, even after 100 epochs?

loss: 8837240.0

@kirk86 If I remember correctly, the values reported are a tally of the loss over a full epoch, so the absolute value needn't become "small-looking"; it just needs to be smaller at the end than at the beginning.

If you're not seeing it smaller at the end, you may be doing something wrong in your code, such as mismanaging the alpha/iterations so that the learning rate goes nonsensically negative. Most users should never need to change the defaults for alpha and min_alpha, nor do any alpha decrements themselves, nor call train() more than once; if you're doing any of these, you may be following broken examples.

Also, published work on adequately sized datasets tends to use only 10-20 epochs. If you're still having a problem, a good place to discuss it (with example code) would be the project discussion list: https://groups.google.com/forum/#!forum/gensim

@gojomo Thanks a lot for the reply. I haven't changed any of the defaults other than iter=100 and min_count=1. But the dataset I'm using is kind of peculiar, since it's a sequence of letters: ABCDDTTCE.... The loss I'm reporting is the one I get after setting the parameter compute_loss=True and then calling model.get_latest_training_loss().

I use this approach to print the loss:

from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence
from gensim.models.callbacks import CallbackAny2Vec

class LossLogger(CallbackAny2Vec):
    '''Print the loss for each epoch. get_latest_training_loss() returns
    a cumulative total, so subtract the previous total to get the
    per-epoch value.'''

    def __init__(self):
        self.epoch = 0
        self.previous_cumulative_loss = 0

    def on_epoch_end(self, model):
        cumulative_loss = model.get_latest_training_loss()
        epoch_loss = cumulative_loss - self.previous_cumulative_loss
        self.previous_cumulative_loss = cumulative_loss
        print('Loss after epoch {}: {}'.format(self.epoch, epoch_loss))
        self.epoch += 1

model = Word2Vec(LineSentence('./data/house_list'), size=100, workers=20,
                 min_count=1, iter=30, window=5, compute_loss=True,
                 callbacks=[LossLogger()])
model.save('./model/v2.model')
