Flair: Any way to parallelize embedding extraction on CPU cores?

Created on 2 Aug 2019 · 2 comments · Source: flairNLP/flair

Hey guys

I tried to extract embeddings from a corpus on the GPU, but GPU utilisation sits at around 7%, so I thought there might be a way to use all CPU cores to extract the embeddings instead and save time. I have a corpus of around 3 million tokens, and the embeddings are to be saved in a Python dictionary. Below is how I used flair to get embeddings in parallel via Python's Pool(), but it isn't working. I think each process would need its own StackedEmbeddings object, and that takes a lot of memory.

import os
from multiprocessing import Pool, Manager
from flair.embeddings import WordEmbeddings, StackedEmbeddings, PooledFlairEmbeddings
from flair.data import Sentence

def procces_sents(chunk):
    sentences = [Sentence(s) for s in chunk]
    stacked_embeddings.embed(sentences)

    for sent in sentences:
        for token in sent:
            # membership test on the token text, not the Token object
            if token.text in my_dict:
                pass
            else:
                pass

stacked_embeddings = StackedEmbeddings([
                                        WordEmbeddings('glove'),
                                        PooledFlairEmbeddings('news-forward-fast'),
                                        PooledFlairEmbeddings('news-backward-fast'),
                                       ])

my_dict = Manager().dict()
my_list = ["1.txt"]
n = 32
filepath = "/some/path/"
for filename in my_list:
    with open(os.path.join(filepath, filename), "r") as m:
        data = m.readlines()

    all = [x.strip() for x in data]

    all_chunks = [all[i * n:(i + 1) * n] for i in range((len(all) + n - 1) // n)]

    p = Pool()
    p.map(procces_sents, all_chunks)
    p.close()
    p.join()

I expect each core to take a 32-sentence chunk and process it.

"all" is the list of all sentences.

"all_chunks" is "all" split into chunks of 32 sentences each.
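The chunking expression in the snippet above can be checked in isolation; this is a minimal sketch using dummy sentence strings in place of real corpus lines:

```python
# Minimal check of the chunking logic: split a list into chunks of size n,
# with the last chunk holding the remainder.
n = 32
all_sentences = ["sentence %d" % i for i in range(100)]

all_chunks = [all_sentences[i * n:(i + 1) * n]
              for i in range((len(all_sentences) + n - 1) // n)]

print(len(all_chunks))      # 4 chunks: 32 + 32 + 32 + 4
print(len(all_chunks[-1]))  # 4 (the remainder)
```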

Labels: question, wontfix

Most helpful comment

Hello @ctrboundary, did you try to parallelize your task with joblib?

from joblib import Parallel, delayed
from tqdm import tqdm_notebook
from multiprocessing import cpu_count

def parallelize(iterable, func):
    return Parallel(n_jobs=cpu_count() - 1, prefer="threads")(delayed(func)(i) for i in tqdm_notebook(iterable))

parallelize(all_chunks, procces_sents)

The prefer="threads" parameter should avoid duplicating the stacked_embeddings object in memory.
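A quick way to see what prefer="threads" buys you is a toy run in which the workers all mutate one shared object; with threads there is a single instance rather than pickled per-process copies. The shared list and work function below are made up for illustration and are not part of flair:

```python
from joblib import Parallel, delayed

# Shared object: with prefer="threads" every worker sees this same
# instance, so nothing has to be pickled into child processes.
shared = []

def work(chunk):
    shared.append(sum(chunk))  # in-place mutation is visible to all workers
    return sum(chunk)

chunks = [[1, 2], [3, 4], [5, 6]]
results = Parallel(n_jobs=2, prefer="threads")(delayed(work)(c) for c in chunks)

print(results)      # [3, 7, 11] (results keep the input order)
print(len(shared))  # 3 -- all workers appended to the same list
```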

All 2 comments


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

