Facebook has recently released LASER - https://github.com/facebookresearch/LASER, a toolkit for language-agnostic sentence representations that internally uses a relatively new C++ BPE implementation - https://github.com/glample/fastBPE
It could make sense to support this package within FastText as a general approach to subwords, so that it can handle multiple languages.
In my case, languages like Hindi can be tricky to segment/tokenize before training a new FT model, let's say for language recognition. Having a generic approach could solve this.
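To illustrate what I mean by a generic approach, here is a minimal pure-Python sketch of BPE: it learns merge rules from raw character sequences, so it needs no language-specific tokenizer. This is not fastBPE's API or implementation; `learn_bpe`, `merge_pair` and `apply_bpe` are hypothetical names, and splitting words into Unicode code points (which separates Hindi combining marks from their base characters) is a simplifying assumption.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of words (hypothetical helper)."""
    # Each word starts as a tuple of code points plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Tiny toy corpus; a real model would be trained on far more text.
    corpus = ["नमस्ते", "नमस्कार", "नमक", "low", "lower", "lowest"]
    merges = learn_bpe(corpus, num_merges=10)
    for w in ["नमस्ते", "slow"]:
        print(w, "->", " ".join(apply_bpe(w, merges)))
```

The resulting subword units could then be joined with spaces and fed to FastText as ordinary tokens, which is roughly the preprocessing step that built-in support would make unnecessary.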
[UPDATE]
fastBPE now supports Python (through Cython) so it could be used in the FastText Python wrapper directly - https://github.com/glample/fastBPE/issues/12