Flair: The sqlite I/O problem on Portuguese Flair embeddings

Created on 8 Jan 2019 · 1 comment · Source: flairNLP/flair

Describe the bug
I am creating embeddings for a Portuguese dataset.
The following error occurs after about 85% of the dataset has been processed:

cursor.execute(req, arg)
sqlite3.OperationalError: disk I/O error

Environment:

  • OS: Linux
  • Version: flair 0.4.0
  • CUDA: 9

Most helpful comment

Hello @nooralahzadeh - I think this happens because your hard drive is full. If you are installing from pip, the FlairEmbeddings class automatically serializes all produced word embeddings into the flair folder at ~/.flair/embeddings/.

This is in fact undesired behavior which we have fixed in the current master branch. We only want serialization to occur if the user requests it. But currently it happens automatically. So for now you have two options:

A. Turn off automatic serialization by instantiating your flair embeddings by passing use_cache=False, like this:

from flair.embeddings import FlairEmbeddings
embeddings = FlairEmbeddings('multi-forward', use_cache=False)

B. Use a different hard drive where you have more space available. You can pass the serialization location like this:

from pathlib import Path
embeddings = FlairEmbeddings('multi-forward', use_cache=True, cache_directory=Path('path/to/folder/with/space'))
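
Before choosing between A and B, it may help to confirm that the disk really is full and to see how much room a candidate folder has. The following is only a rough sketch using the Python standard library; the helper names free_gib and dir_size_gib are made up for illustration and are not part of flair, and the paths are the ones mentioned in this comment.

```python
import shutil
from pathlib import Path


def free_gib(path: Path) -> float:
    """Return free space, in GiB, on the filesystem that holds `path`."""
    # Walk up to the nearest existing ancestor so that a folder which has
    # not been created yet can still be checked.
    probe = path.expanduser().resolve()
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free / 2**30


def dir_size_gib(path: Path) -> float:
    """Return the total size, in GiB, of all regular files under `path`."""
    root = path.expanduser()
    if not root.is_dir():
        return 0.0
    return sum(f.stat().st_size for f in root.rglob("*") if f.is_file()) / 2**30


if __name__ == "__main__":
    default_cache = Path("~/.flair/embeddings")    # default location named above
    candidate = Path("path/to/folder/with/space")  # placeholder from option B
    print(f"default cache size:      {dir_size_gib(default_cache):.2f} GiB")
    print(f"free near default cache: {free_gib(default_cache):.2f} GiB")
    print(f"free near candidate:     {free_gib(candidate):.2f} GiB")
```

If the free space near the default cache is close to zero, that would be consistent with the disk I/O error above.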

Hope this helps!
