DeepSpeech: Release model checkpoint files

Created on 6 Dec 2017 · 11 Comments · Source: mozilla/DeepSpeech

We have some audio recordings that we want to use for training on top of the pre-trained DeepSpeech model. Is it possible to get links to the release model's checkpoint files?

praveeny1986

👍1

Most helpful comment

This was done in release v0.1.1: https://github.com/mozilla/DeepSpeech/releases/tag/v0.1.1

lissyx on 2 Feb 2018

👍4

All 11 comments

I would like to second this issue :+1:

dcam0050 on 8 Dec 2017

It's going to happen, as soon as we can :)

lissyx on 14 Dec 2017

👍2

Is there a time estimate as to when those will be available?

kapursu on 16 Dec 2017

Is it possible to add more words to it without losing the ability to recognize the old ones?

I guess that with the new checkpoints we would also need to train on the old data plus the new data, rather than on the new data alone, to add more words. Is there any way to retain the old knowledge while training on new data alone, so that the result can recognize everything from both the old and the new data?
Same question as @kapursu: when can we expect those checkpoints to be made available?

saikishor on 21 Dec 2017

Is there no way of importing the parameters from the provided .pb file in deepspeech-0.1.0-models.tar.gz?

bernardpazio on 21 Dec 2017

Yeah, is there a way to fine-tune a given trained model (.pb file)? Any success with it?

arunpatala on 28 Dec 2017

@arunpatala @bernardpazio The .pb file is a frozen version of the TensorFlow graph (its variables have been converted to constants) that's optimized for inference and cannot be reused as-is for transfer learning (i.e. additional training). To continue training an already trained model on your own data, you need the checkpoints, which haven't been released yet.

More info on frozen models can be found in the TensorFlow documentation about freezing.

@saikishor In theory, you can use only new data to fine-tune the model's performance, but you need to be careful not to overfit it to your custom data. Quite a few blogs describe different techniques for transfer learning in TensorFlow (transfer learning blog, to name one).
However, which method is going to work well for DeepSpeech and for your use case, e.g. adding just a few specialized English terms (like medical vocabulary) or switching to another language (like reusing high-level features and retraining the rest for Spanish), will need to be tested and hopefully shared by the community, with all the parameters such as learning rate, number of frozen layers, etc.
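
To make the two knobs mentioned above a bit more concrete (the step size used during training, and which layers stay frozen), here is a minimal toy sketch in plain Python. It is not DeepSpeech or TensorFlow code: the two-weight model, the names `fine_tune`, `feat` and `out`, the starting values and the data are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: NOT DeepSpeech code and NOT a TensorFlow API.
# The "model" is y = w_out * (w_feat * x): one feature weight, one output weight.

def fine_tune(weights, data, learning_rate, frozen, epochs=200):
    """Gradient descent on squared error that skips every weight named in `frozen`."""
    w = dict(weights)  # copy, so the "pre-trained" weights are left untouched
    for _ in range(epochs):
        for x, y in data:
            hidden = w["feat"] * x
            error = w["out"] * hidden - y
            # Gradients of error**2 with respect to each weight.
            grads = {
                "feat": 2 * error * w["out"] * x,
                "out": 2 * error * hidden,
            }
            for name, grad in grads.items():
                if name not in frozen:  # frozen weights receive no update
                    w[name] -= learning_rate * grad
    return w


# Hypothetical "pre-trained" weights and hypothetical new data following y = 4x.
pretrained = {"feat": 2.0, "out": 1.5}
new_data = [(1.0, 4.0), (2.0, 8.0), (3.0, 12.0)]

# Keep the feature weight frozen and retrain only the output weight.
tuned = fine_tune(pretrained, new_data, learning_rate=0.01, frozen={"feat"})
print(tuned)
```

Here the frozen `feat` weight keeps its pre-trained value while `out` adapts to the new data; passing `frozen=set()` instead would let both weights move. A real speech model has many layers and far more data, so the right combination would still have to be found by experiment.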

Apologies for the lengthy text; it's probably more suitable for the forums.