fastText: Memory error when loading the pre-trained model

Created on 9 May 2018 · 16 Comments · Source: facebookresearch/fastText

I get a memory error when I try to load the pre-trained model, e.g., model = fasttext.load_model('D:/download/wiki.en/wiki.en.bin').

The .bin file is almost 9 GB, but my machine has only 4 GB of memory.
I am trying to find a memory-friendly way to load the model. Can anyone give me a clue?
Thanks a lot!

zhouchichun

All 16 comments

Quantization
In order to create a .ftz file with a smaller memory footprint do:

$ ./fasttext quantize -output model
All other commands, such as test, also work with this model:

$ ./fasttext test model.ftz test.txt

dawin2015 on 10 May 2018

😕2 👍1

./fasttext quantize -output wiki.simple.bin
Empty input or output path.

The following arguments are mandatory:
  -input              training file path
  -output             output file path

hyonschu on 15 May 2018

👍2

@hyonschu
Thank you so much! I solved the memory error by building a dict that returns the embedding vectors for only the words I need in my task, instead of loading all the embedding vectors.

zhouchichun on 18 May 2018

@dawin2015
Thank you so much! I solved the memory error by building a dict that returns the embedding vectors for only the words I need in my task, instead of loading all the embedding vectors.

zhouchichun on 18 May 2018

@zhouchichun Do you mind sharing the solution? I'm having the same problem.

sodaliAyran on 23 May 2018

@sodaliAyran check https://github.com/WenchenLi/nlp_vocab

zhouchichun on 26 May 2018
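
The approach described above (keeping only the vectors for the words a task needs) can be sketched as follows. This is a minimal sketch, not the code from the linked repository. It assumes a text-format .vec file is available next to the .bin, and load_selected_vectors is a hypothetical helper name. The file is read line by line, so memory use scales with the number of requested words rather than with the full vocabulary. A .vec file holds whole-word vectors only, so words missing from it get no vector.

```python
import io
from typing import Dict, Iterable, List


def load_selected_vectors(lines: Iterable[str],
                          wanted: Iterable[str]) -> Dict[str, List[float]]:
    """Stream a .vec-style text file and keep only the wanted words.

    Hypothetical helper: the first line is "<vocab_size> <dim>", and every
    following line is "<word> <v1> ... <vdim>".
    """
    wanted_set = set(wanted)
    it = iter(lines)
    dim = int(next(it).split()[1])      # header: vocabulary size, dimension
    found: Dict[str, List[float]] = {}
    for line in it:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if word in wanted_set and len(values) == dim:
            found[word] = [float(v) for v in values]
            if len(found) == len(wanted_set):
                break                   # every requested word was found
    return found


# Tiny in-memory stand-in for a real .vec file.
sample = io.StringIO("3 2\nthe 0.1 0.2\ncat 0.3 0.4\nsat 0.5 0.6\n")
vectors = load_selected_vectors(sample, ["cat", "dog"])
print(vectors)
```

With a real file, the first argument would be an open file handle, for example open("wiki.en.vec", encoding="utf-8"), where the path is an assumption about where the vectors were unpacked.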

As @hyonschu mentioned, that command gives an error:

./fasttext quantize -output wiki.en.bin
Empty input or output path.

The following arguments are mandatory:
  -input              training file path
  -output             output file path

Specifying both the -input and -output paths gets past that check, but only produces another error:

./fasttext quantize -input wiki.en.bin -output -model
Model file cannot be opened for loading!

mraduldubey on 20 Jul 2018

Hi @mraduldubey and @hyonschu,

The command-line options for the quantize mode are a bit confusing:

-input specifies the data used to fine-tune the quantized model. This file is only used if the -retrain option is set, and it should be the same as the original training data. Without fine-tuning, this path is not used.
-output specifies the path to the model, without the .bin extension.

Here is an example of the commands to train and then quantize a model:

> ./fasttext supervised -input DATA.TXT -output MODEL -wordNgrams 2 -dim 50
> ./fasttext quantize -input DATA.TXT -output MODEL -dsub 2

Please note that only supervised models can be quantized for now.

Best,
Edouard.

EdouardGrave on 20 Jul 2018

👍1
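
The rule for -output described above can be made concrete with a small sketch. It assumes that the quantize command loads the path given by -output with ".bin" appended and writes the result with ".ftz" appended, which matches the explanation above and the .ftz file mentioned earlier in the thread. The helper names quantize_paths and quantize_command are hypothetical and are not part of fastText.

```python
from typing import List, Tuple


def quantize_paths(output: str) -> Tuple[str, str]:
    """Return (file read, file written) for a given -output value.

    Assumption: the CLI appends ".bin" to load the model and ".ftz" to save it.
    """
    return output + ".bin", output + ".ftz"


def quantize_command(output: str, input_path: str,
                     retrain: bool = False) -> List[str]:
    """Build an argument list that passes both mandatory paths."""
    cmd = ["./fasttext", "quantize", "-input", input_path, "-output", output]
    if retrain:
        cmd.append("-retrain")  # only then is the -input data actually read
    return cmd


# "-output -model" makes the tool look for a file literally named "-model.bin".
print(quantize_paths("-model"))
# Passing the name with its extension doubles the extension.
print(quantize_paths("wiki.en.bin"))
# The model name without the extension resolves to the existing file.
print(quantize_paths("MODEL"))
print(quantize_command("MODEL", "DATA.TXT"))
```

Under that assumption, the "Model file cannot be opened for loading!" message earlier in the thread would be explained by the tool looking for "-model.bin".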

@EdouardGrave What if I have already trained a model using fastText and it's too large to be loaded into memory at once?

harrypotter0 on 23 Jul 2018

@EdouardGrave Thanks. So, basically this command can be used to quantize the pretrained fasttext vectors without any fine-tuning:

./fasttext quantize -output "result/wiki.en" -input "data/dbpedia.train" -qnorm -epoch 1 -cutoff 100000

The -input path remains unused here.

mraduldubey on 23 Jul 2018

@mraduldubey

Hi! I was wondering how you ensure that the -input path is unused?
I wanted to quantize the wikipedia pretrained fasttext model and I have no training data to give it.

gohurali on 30 Dec 2018

@gohurali I quantized the pretrained fasttext english model itself. So, you can follow the same command mentioned above. The -input path is redundant.

mraduldubey on 8 Jan 2019

Does this still work? I'm having trouble with the pretrained .bin:

./fasttext quantize -output "wiki-news-300d-1M-subword" -input "wiki-news-300d-1M-subword" -qnorm -epoch 1 -cutoff 300000

terminate called after throwing an instance of 'std::invalid_argument'
  what():  For now we only support quantization of supervised models
Aborted (core dumped)

Jack000 on 8 May 2019

Hi,

I am trying to load a model with gensim, but it fails with the error below. I have been trying for two days and have checked my RAM: I have 16 GB, and only 29% is in use when this code runs. I can't work out how to fix it. Please help.

Code snippet:

import gensim.downloader as api
fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')

Error:

C:\Python3.7\python.exe C:/Users/amitabhseth/IdeaProjects/class1/Test1.py
Traceback (most recent call last):
  File "C:/Users/amitabhseth/IdeaProjects/class1/Test1.py", line 34, in <module>
    fasttext_model300 = api.load('fasttext-wiki-news-subwords-300')
  File "C:\Python3.7\lib\site-packages\gensim\downloader.py", line 502, in load
    return module.load_data()
  File "C:\Users\amitabhseth/gensim-data\fasttext-wiki-news-subwords-300\__init__.py", line 8, in load_data
    model = KeyedVectors.load_word2vec_format(path, binary=False)
  File "C:\Python3.7\lib\site-packages\gensim\models\keyedvectors.py", line 1498, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "C:\Python3.7\lib\site-packages\gensim\models\utils_any2vec.py", line 349, in _load_word2vec_format
    result.vectors = zeros((vocab_size, vector_size), dtype=datatype)
MemoryError

Amitabh8 on 13 Aug 2019
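
The traceback above fails at a single allocation, zeros((vocab_size, vector_size), dtype=datatype), so it can help to estimate how large that array is. The sketch below assumes roughly one million 300-dimensional float32 vectors, as the model name suggests. It also prints the interpreter's pointer width, because a 32-bit Python process can run out of address space long before the machine runs out of RAM. That is only a possible cause and is not confirmed anywhere in this thread. The function names are hypothetical.

```python
import struct


def dense_matrix_bytes(vocab_size: int, vector_size: int,
                       bytes_per_value: int = 4) -> int:
    """Bytes needed for a dense vocab_size x vector_size array (float32 by default)."""
    return vocab_size * vector_size * bytes_per_value


def pointer_bits() -> int:
    """Pointer width of the running interpreter: 32 or 64."""
    return struct.calcsize("P") * 8


# Assumed sizes: about 1,000,000 vectors of dimension 300.
needed = dense_matrix_bytes(1_000_000, 300)
print(f"{needed / 1e9:.1f} GB needed for the vector matrix")
print(f"{pointer_bits()}-bit Python")
```

The limit parameter visible in the traceback caps how many vectors load_word2vec_format reads, so passing a smaller limit shrinks this allocation proportionally.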

I have the same problem: I can't load the 8 GB wiki.en.bin. I tried

./fasttext quantize -output "rez/wiki.en" -input "data/dbpedia.train" -qnorm -epoch 1 -cutoff 50000

But the program throws the following error:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  For now we only support quantization of supervised models
Aborted (core dumped)

I tried this on both newer and older versions of fastText.
Any idea why this command works for only a few people? Or is there another way to reduce the size, or to run load_model with wiki.en.bin?

pkrsn4 on 25 Nov 2019

The quantize function only works for supervised models that you train yourself. It's not clear from the docs, but all of the downloadable models are unsupervised.

The size of the unsupervised model comes from two things: the pretrained vectors for whole words and the subword hash table.

It is possible to reduce the size of the unsupervised model's .bin by dropping some of the pretrained vectors, but not with the CLI tools provided by fastText (i.e., you'll have to modify the C++ code a bit, specifically using the dict_->threshold function). You can shave off a few GB this way, depending on how many dictionary entries you drop.

Here's some quick-and-dirty code that changes the quantize function to do this (the threshold value determines which dictionary entries to drop):

/* note: some dict_ members are public for easier access */
void FastText::quantize(const Args& qargs) {
  /*if (args_->model != model_name::sup) {
    throw std::invalid_argument(
        "For now we only support quantization of supervised models");
  }*/
  args_->input = qargs.input;
  args_->qout = qargs.qout;
  args_->output = qargs.output;
  std::shared_ptr<DenseMatrix> input =
      std::dynamic_pointer_cast<DenseMatrix>(input_);
  std::shared_ptr<DenseMatrix> output =
      std::dynamic_pointer_cast<DenseMatrix>(output_);
  bool normalizeGradient = (args_->model == model_name::sup);

  if (qargs.cutoff > 0 && qargs.cutoff < input->size(0)) {
    /*auto idx = selectEmbeddings(qargs.cutoff);
    dict_->prune(idx);*/
    // Row count before pruning: one row per word plus the subword buckets.
    int32_t rows = dict_->size_ + args_->bucket;
    // Drop every word (and label) that occurs fewer than 2000 times.
    dict_->threshold(2000, 2000);
    std::cerr << "words:  " << dict_->size_ << std::endl;
    std::cerr << "rows:  " << rows << std::endl;
    /*std::shared_ptr<DenseMatrix> ninput =
        std::make_shared<DenseMatrix>(idx.size(), args_->dim);*/
    int32_t new_rows = dict_->size_ + args_->bucket;
    std::shared_ptr<DenseMatrix> ninput =
        std::make_shared<DenseMatrix>(new_rows, args_->dim);
    // Copy the vectors of the words that survived the threshold.
    for (int32_t i = 0; i < dict_->size_; i++) {
      // Look the word up once per row rather than once per element.
      int32_t index = dict_->getId(dict_->words_[i].word);
      for (int32_t j = 0; j < args_->dim; j++) {
        ninput->at(i, j) = input->at(index, j);
      }
    }

    int32_t offset = rows-new_rows;
    for (auto i = dict_->size_; i < new_rows; i++) {
      for (auto j = 0; j < args_->dim; j++) {
        ninput->at(i, j) = input->at(i+offset, j);
      }
    }
    /*input = ninput;*/
    input_ = ninput;
    if (qargs.retrain) {
      args_->epoch = qargs.epoch;
      args_->lr = qargs.lr;
      args_->thread = qargs.thread;
      args_->verbose = qargs.verbose;
      auto loss = createLoss(output_);
      // Retrain on the pruned matrix; `input` still points at the original one.
      model_ = std::make_shared<Model>(ninput, output, loss, normalizeGradient);
      startThreads();
    }
  }

  /*input_ = std::make_shared<QuantMatrix>(
      std::move(*(input.get())), qargs.dsub, qargs.qnorm);*/

  /*if (args_->qout) {
    output_ = std::make_shared<QuantMatrix>(
        std::move(*(output.get())), 2, qargs.qnorm);
  }
*/
  /*quant_ = true;*/
  auto loss = createLoss(output_);
  model_ = std::make_shared<Model>(input_, output_, loss, normalizeGradient);
}

Jack000 on 25 Nov 2019
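
The size breakdown in the comment above can be checked with some rough arithmetic. The patch sizes the input matrix as dict_->size_ + args_->bucket rows by args_->dim columns, so the sketch below uses the same formula. The numbers are assumptions for illustration only: 2,500,000 words is a hypothetical round figure, 2,000,000 is assumed as the bucket count, 300 is assumed as the dimension, and each value is assumed to be a 4-byte float. The output matrix (one row per word) is also an assumption and is not mentioned in the thread.

```python
from typing import Tuple


def fasttext_matrix_bytes(n_words: int, bucket: int, dim: int,
                          bytes_per_value: int = 4) -> Tuple[int, int]:
    """Estimate (input matrix bytes, output matrix bytes) for an unsupervised model."""
    input_bytes = (n_words + bucket) * dim * bytes_per_value  # words + subword buckets
    output_bytes = n_words * dim * bytes_per_value            # assumed: one row per word
    return input_bytes, output_bytes


# Hypothetical full model.
full_in, full_out = fasttext_matrix_bytes(2_500_000, 2_000_000, 300)
# Hypothetical vocabulary after dropping rare words; the bucket rows are kept.
pruned_in, _ = fasttext_matrix_bytes(500_000, 2_000_000, 300)

print(f"input matrix:  {full_in / 1e9:.1f} GB")
print(f"output matrix: {full_out / 1e9:.1f} GB")
print(f"input matrix after pruning: {pruned_in / 1e9:.1f} GB")
print(f"saved: {(full_in - pruned_in) / 1e9:.1f} GB")
```

The posted patch rebuilds only the input matrix, so the estimate of the saving compares input matrices only. The subword bucket rows are untouched by thresholding, which puts a floor on how small the file can get this way.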
