I found a way to change the tokenizer at the test stage.
# imports: sentence splitter, flair data classes, and the tagger model
from segtok.segmenter import split_single
from flair.data import Sentence, segtok_tokenizer
from flair.models import SequenceTagger
# your text of many sentences
text = "This is a sentence. This is another sentence. I love Berlin."
# split the text into sentences, then tokenize each one with the chosen tokenizer
sentences = [Sentence(sent, use_tokenizer=segtok_tokenizer) for sent in split_single(text)]
# load the pretrained NER tagger and predict tags for the list of sentences
tagger: SequenceTagger = SequenceTagger.load('ner')
tagger.predict(sentences)
Are we able to use customized tokenizers in the training stage?
Thank you.
To train an NER model you are supposed to provide a TextCorpus, so you are free to use whichever tokenizer you like when preparing the data. Is there something special you need?
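A minimal sketch of what "use the tokenizer you like" could look like in practice, assuming your training data starts as raw text with character-offset entity spans and that you write it out as a two-column, CoNLL-style file (token, BIO tag) before building the corpus. The regex tokenizer and the helper names `custom_tokenize` and `to_bio_columns` are hypothetical and not part of flair; swap in your own tokenizer as long as it returns character offsets.

```python
import re
from typing import List, Tuple

# Hypothetical custom tokenizer: words (with inner hyphens/apostrophes)
# or single punctuation characters.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def custom_tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Return (token, start_offset, end_offset) triples."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_PATTERN.finditer(text)]


def to_bio_columns(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Convert text plus (start, end, label) spans into 'token tag' lines.

    Assumes span boundaries line up with token boundaries.
    """
    lines = []
    for token, start, end in custom_tokenize(text):
        tag = "O"
        for span_start, span_end, label in spans:
            if start >= span_start and end <= span_end:
                prefix = "B-" if start == span_start else "I-"
                tag = prefix + label
                break
        lines.append(f"{token} {tag}")
    return "\n".join(lines)


# One training sentence with a single LOC span covering "New York".
example = to_bio_columns("I love New York.", [(7, 15, "LOC")])
print(example)
```

Because the tokens are fixed when the file is written, the training data carries your tokenization with it, and the same tokenizer can then be reused at prediction time.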
I see. Thank you.