Flair: Change the tokenizer inside the model.

Created on 21 Sep 2019 · 2 Comments · Source: flairNLP/flair

I found a way to change the tokenizer at the test (prediction) stage:

# your text of many sentences
text = "This is a sentence. This is another sentence. I love Berlin."

# use a library to split into sentences
from segtok.segmenter import split_single
from flair.data import Sentence, segtok_tokenizer
from flair.models import SequenceTagger

sentences = [Sentence(sent, use_tokenizer=segtok_tokenizer) for sent in split_single(text)]

# predict tags for list of sentences
tagger: SequenceTagger = SequenceTagger.load('ner')
tagger.predict(sentences)
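
To make "customized tokenizer" concrete, here is a minimal sketch of the kind of function I have in mind. It is plain Python and does not use flair at all; `simple_tokenize` is a hypothetical name, and what flair actually expects from a tokenizer callable (for example, whether it must return `Token` objects instead of strings) is an assumption that would need checking against the installed version.

```python
import re
from typing import List, Tuple


def simple_tokenize(text: str) -> List[Tuple[str, int]]:
    """Hypothetical custom tokenizer (not part of flair).

    Returns (token, start_offset) pairs, where a token is either a run of
    word characters or a single punctuation character.
    """
    return [(m.group(), m.start()) for m in re.finditer(r"\w+|[^\w\s]", text)]


if __name__ == "__main__":
    print(simple_tokenize("I love Berlin."))
    # prints [('I', 0), ('love', 2), ('Berlin', 7), ('.', 13)]
```

The character offsets are kept because a tagger usually needs to map predictions back to positions in the original text.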

Are we able to use customized tokenizers in the training stage?
Thank you.
