Hello!
I have been playing with fastText and stumbled upon something that confused me a little bit. According to the default parameters in the documentation it seems to me that -maxn is set to 0 by default:
-minn min length of char ngram [0]
-maxn max length of char ngram [0]
-thread number of threads [12]
However, when I run the code I get significantly different running times depending on whether I leave -maxn at its default or explicitly set it to 0, which leads me to believe that the default value is not 0. For example:
With -maxn 0 : running time of ~3 minutes
Not calling -maxn : running time of ~18 minutes.
Hopefully you can clarify this for me. :)
Which model are you talking about?
I am using cbow in both cases, here are the parameters and running times:
-maxn 0:
time ./fasttext cbow -input
Read 100M words
Number of words: 279974
Number of labels: 0
Progress: 100.0% words/sec/thread: 395830 lr: 0.000000 loss: 1.265415 eta: 0h0m
real 3m14.749s
user 21m22.638s
sys 0m11.140s
Default -maxn:
time ./fasttext cbow -input
Read 100M words
Number of words: 279974
Number of labels: 0
Progress: 36.1% words/sec/thread: 64077 lr: 0.063866 loss: 1.862925 eta: 0h6m ^C
real 6m48.212s
user 47m8.746s
sys 0m17.913s
As you can see, I did not let the default -maxn run finish, but at ~36% progress it had already taken longer than the full -maxn 0 run, and its words/sec/thread count is much lower.
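To put a number on the gap, here is a quick calculation using only the words/sec/thread figures from the two logs above. The variable names are just for illustration, and the second figure comes from a run that was interrupted at 36.1%, so treat the ratio as a rough indication rather than a benchmark.

```python
# Throughput reported in the two progress lines above (words/sec/thread).
maxn0_wps = 395830    # run with -maxn 0
default_wps = 64077   # run with default -maxn, interrupted at 36.1%

# How many times faster the -maxn 0 run processed words per thread.
ratio = maxn0_wps / default_wps
print(f"-maxn 0 throughput is about {ratio:.1f}x the default run")
```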
The defaults are different depending on the model, and I don't think that the default maxn for CBOW is 0. Try running: fasttext cbow. I see
-minn min length of char ngram [3]
-maxn max length of char ngram [6]
(though I'm not sure about the version on this machine)
I think the documentation in https://github.com/facebookresearch/fastText/blob/master/README.md could be confusing. The program itself prints the correct default parameters for unsupervised training when executed:
$ fasttext cbow
[...]
-minn min length of char ngram [3]
-maxn max length of char ngram [6]
This corresponds to https://github.com/facebookresearch/fastText/blob/master/src/args.cc#L32
Only when the program is called as fasttext supervised is maxn=0 displayed. So the default for cbow or skipgram training is not 0 but 6, if I understand correctly, and that could explain the timing results.
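To illustrate why minn=3 / maxn=6 could cost so much more than maxn=0, here is a minimal sketch that enumerates the character n-grams of a single word. It is not fastText's actual code: char_ngrams is a hypothetical helper, and it assumes the word is wrapped in < and > boundary markers before the n-grams are extracted. With maxn=0 nothing is extracted, so only the word itself has to be looked up and updated for each token.

```python
def char_ngrams(word, minn, maxn):
    """Return the character n-grams of length minn..maxn for one word.

    Assumption: the word is wrapped in '<' and '>' boundary markers first.
    This is an illustrative sketch, not the fastText implementation.
    """
    if maxn <= 0:
        # Subword features are disabled; only the whole word is used.
        return []
    wrapped = f"<{word}>"
    grams = []
    for n in range(max(minn, 1), maxn + 1):
        # Slide a window of width n over the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


for minn, maxn in [(0, 0), (3, 6)]:
    grams = char_ngrams("where", minn, maxn)
    print(f"minn={minn} maxn={maxn}: {len(grams)} n-grams {grams}")
```

Every extra n-gram is one more vector to read and update for each occurrence of the word, which would be consistent with the lower words/sec/thread in the default run.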
I wonder if the README.md should be changed/extended with the default parameters for the unsupervised case.
Yes that is totally correct! I should have checked the default parameters in the program itself rather than just reading the documentation on GitHub.
Yes I agree it might be a good idea to extend the documentation a little bit.
Hi @arnor-sigurdsson, @bkj and @fnielsen,
This is correct, the default parameters are not the same for the different modes of fastText.
The best way to get the default parameters for a given mode is to run fastText without arguments, e.g.
./fasttext skipgram
or
./fasttext supervised
We will make this clearer in the documentation.
@EdouardGrave
Can I change the value of maxn when training a supervised model?