Fasttext: Distributed fastText - Parallel Training?

Created on 26 Jan 2017 · 5 Comments · Source: facebookresearch/fastText

Is there a distributed implementation of fastText (e.g., on Spark) for learning word vectors from very large text corpora? Word2Vec has a Spark implementation: http://spark.apache.org/docs/latest/ml-features.html#word2vec. Since sub-word information (summing the vectors of character n-grams) is the defining difference in fastText, would it be straightforward to build a Spark-based implementation using the Word2Vec code as a base?
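To make the sub-word step concrete, here is a minimal sketch of how a word vector could be assembled by summing character n-gram vectors. It is an illustration under stated assumptions, not fastText's actual code: the function names, the `table` layout, and the CRC32 bucketing are all hypothetical stand-ins, and the sketch leaves out the separate whole-word vector that fastText also uses.

```python
import zlib


def char_ngrams(word, minn=3, maxn=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def bucket_id(ngram, num_buckets):
    # Assumption: CRC32 is only a stand-in for hashing n-grams into a fixed
    # number of buckets; it is not the hash function fastText itself uses.
    return zlib.crc32(ngram.encode("utf-8")) % num_buckets


def word_vector(word, table, minn=3, maxn=6):
    """Sum the rows of `table` (a list of equal-length float lists) selected by the word's n-grams."""
    dim = len(table[0])
    vec = [0.0] * dim
    for gram in char_ngrams(word, minn, maxn):
        row = table[bucket_id(gram, len(table))]
        for k in range(dim):
            vec[k] += row[k]
    return vec


# Usage: with n fixed at 3, "where" is split into five trigrams.
print(char_ngrams("where", 3, 3))
```

In a sketch like this, the n-gram table is the shared state that every worker would read and update, so it is presumably the part a Spark port would need to handle beyond what the Word2Vec code already does.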

All 5 comments

I'll add this as a future feature we might consider implementing. For now it's not on our list of priorities, but it might very well soon.

I firmly believe this (with Spark) would be very helpful for us to train on a very large Chinese text corpus.

@cpuhrsch Any plans for this?

Hello,
Is implementing fastText with Spark still in your scope?

Any updates on implementing fastText with Spark?
