Fasttext: Distributed fastText - Parallel Training?

Created on 26 Jan 2017 · 5 Comments · Source: facebookresearch/fastText

Is there a distributed implementation of fastText (e.g., on Spark) for learning word vectors from very large text corpora? Word2Vec has a Spark implementation: http://spark.apache.org/docs/latest/ml-features.html#word2vec. Since the defining difference in fastText is its use of sub-word information (a word's vector is the sum of the vectors of its character n-grams), would it be straightforward to build a Spark-based implementation using the Word2Vec code as a base?
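For readers unfamiliar with the sub-word idea mentioned in the question, here is a minimal sketch in plain Python. It is not fastText's actual code: the function names `char_ngrams` and `word_vector` and the toy `ngram_vectors` dictionary are made up for illustration, and the real library also hashes n-grams into a fixed number of buckets and keeps a vector for the whole word, which this sketch leaves out. The n-gram length range of 3 to 6 matches the library's default `minn`/`maxn` settings for unsupervised training.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of length n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def word_vector(word, ngram_vectors, dim):
    """Sum the vectors of the word's n-grams (illustrative; no hashing, no whole-word vector).

    `ngram_vectors` is a hypothetical dict mapping n-gram strings to lists of floats.
    N-grams missing from the dict contribute nothing to the sum.
    """
    total = [0.0] * dim
    for gram in char_ngrams(word):
        vec = ngram_vectors.get(gram)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


if __name__ == "__main__":
    toy_vectors = {"<ca": [1.0, 0.0], "cat": [0.0, 1.0]}
    print(char_ngrams("cat", 3, 3))
    print(word_vector("cat", toy_vectors, dim=2))
```

Because one n-gram vector is shared by every word containing that n-gram, a distributed trainer would presumably have to synchronize updates to the n-gram table as well as to the per-word vectors, which is part of why reusing the Spark Word2Vec code may not be a drop-in change.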


netankit

👍15

All 5 comments

I'll add this as a future feature we might consider implementing. For now it's not on our list of priorities, but it might very well soon.

cpuhrsch on 23 May 2017

👍4

I firmly believe this (with Spark) would be very helpful for us, since we need to train on a very large Chinese text corpus.

XiaolinZHONG on 15 Dec 2017

👍3

@cpuhrsch Any plan on this?

qingyuanxingsi on 30 Jul 2019

👍2

Hello,
Is implementing fastText with Spark still in your scope?

medali994 on 28 Feb 2020

Any updates on implementing fastText with Spark?

ahmarz on 29 Apr 2020

👍2

