Rasa: How to train using data supplied in markdown format?

Created on 31 Aug 2017 · 5 Comments · Source: RasaHQ/rasa

Hi! The documentation here:
https://rasa-nlu.readthedocs.io/en/latest/dataformat.html#markdown-format
says that "training data can be used in the following markdown format":

## intent:check_balance
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [my savings account](source_account:savings) <!-- synonyms, method 1-->

## intent:greet
- hey
- hello

## synonym:savings   <!-- synonyms, method 2 -->
- pink pig
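
To make the layout above concrete, here is a minimal sketch of how such a file could be read. It is not Rasa NLU's own parser: `parse_markdown_examples`, `ENTITY_RE`, `COMMENT_RE`, and the output dictionaries are hypothetical names chosen for illustration, and the sketch only assumes the conventions visible in the sample (`## intent:` and `## synonym:` headings, `[text](entity)` and `[text](entity:value)` annotations, HTML comments ignored).

```python
import re

# Hypothetical helpers for illustration only; not part of rasa_nlu.
ENTITY_RE = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<entity>[^):]+)(?::(?P<value>[^)]+))?\)"
)
COMMENT_RE = re.compile(r"<!--.*?-->")


def parse_markdown_examples(md):
    """Return (examples, synonyms) parsed from markdown training data."""
    examples = []   # one dict per training sentence
    synonyms = {}   # surface form -> canonical value
    section, name = None, None
    for raw in md.splitlines():
        line = COMMENT_RE.sub("", raw).strip()   # drop HTML comments
        if line.startswith("## "):
            # Heading such as "intent:greet" or "synonym:savings"
            section, _, name = line[3:].partition(":")
            section, name = section.strip(), name.strip()
        elif line.startswith("- ") and section == "intent":
            item = line[2:]
            entities = [
                {
                    "entity": m.group("entity"),
                    "text": m.group("text"),
                    # With no explicit ":value", the annotated text is the value
                    "value": m.group("value") or m.group("text"),
                }
                for m in ENTITY_RE.finditer(item)
            ]
            # Replace each annotation with its plain surface text
            text = ENTITY_RE.sub(lambda m: m.group("text"), item)
            examples.append({"intent": name, "text": text, "entities": entities})
        elif line.startswith("- ") and section == "synonym":
            synonyms[line[2:]] = name
    return examples, synonyms


SAMPLE = """## intent:check_balance
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [my savings account](source_account:savings) <!-- synonyms, method 1-->

## intent:greet
- hey
- hello

## synonym:savings   <!-- synonyms, method 2 -->
- pink pig
"""

examples, synonyms = parse_markdown_examples(SAMPLE)
```

Under these assumptions, both synonym methods end up mapping a surface form ("my savings account", "pink pig") to the canonical value "savings".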

However, it's not clear to me how to train using this data. I tried:
python -m rasa_nlu.train -c config_spacy_test.json
using the following config file:

{
  "pipeline": "spacy_sklearn",
  "path" : "./models",
  "data" : "./data/examples/rasa/test.md"
}

and got:

INFO:rasa_nlu.utils.spacy_utils:Trying to load spacy model with name 'en'
INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.
Traceback (most recent call last):
  File "/home/ax02211/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ax02211/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ax02211/anaconda3/lib/python3.6/site-packages/rasa_nlu/train.py", line 88, in <module>
    do_train(config)
  File "/home/ax02211/anaconda3/lib/python3.6/site-packages/rasa_nlu/train.py", line 77, in do_train
    training_data = load_data(config['data'])
  File "/home/ax02211/anaconda3/lib/python3.6/site-packages/rasa_nlu/converters.py", line 288, in load_data
    fformat = guess_format(files)
  File "/home/ax02211/anaconda3/lib/python3.6/site-packages/rasa_nlu/converters.py", line 258, in guess_format
    file_data = json.loads(f.read())
  File "/home/ax02211/anaconda3/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/home/ax02211/anaconda3/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/ax02211/anaconda3/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

How can I train using data in markdown format? Thanks!
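
The traceback above ends inside `json.loads`, called from `guess_format`, which suggests that the installed version tried to read the training file as JSON. The following sketch reproduces only that last step with the standard library; it does not call any Rasa NLU code, and `markdown_data` is a made-up stand-in for the contents of `test.md`.

```python
import json

# Stand-in for the markdown training file; any text starting with "##" behaves the same.
markdown_data = "## intent:greet\n- hey\n- hello\n"

try:
    json.loads(markdown_data)
    error_message = None
except json.JSONDecodeError as err:
    # Markdown is not valid JSON, so decoding fails at the very first character.
    error_message = str(err)

print(error_message)
```

The message printed is the same "Expecting value: line 1 column 1 (char 0)" seen at the bottom of the traceback, which is consistent with a loader that only understands JSON input.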

All 5 comments

Did you install from GitHub or pip? What's your Rasa NLU version?

pip, if I recall correctly. Why should it matter?

The markdown training format is part of the latest development version, which is only available when installing directly from GitHub. I don't know when it will be published to PyPI.
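
Since the answer depends on which build is installed, one way to check is sketched below. It assumes the distribution is registered under the name `rasa_nlu` and uses `importlib.metadata`, which requires Python 3.8 or newer (later than the Python 3.6 shown in the traceback); `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "rasa_nlu" is assumed to be the distribution name used by pip.
print(installed_version("rasa_nlu"))
```

A version string alone may not distinguish a PyPI release from a GitHub checkout, so treat the result as a hint rather than proof of markdown support.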

We need to fix a couple more things before we release the next version, but if you install from GitHub, the markdown format should work.

For the benefit of anyone else who encounters this behaviour: I can confirm that installing from GitHub did indeed resolve the issue. Thanks!
