rasa NLU version : 0.8.2
Used backend / pipeline : spacy_sklearn
Operating system (windows, osx, ...): Windows 10
Issue: Several
I am following the instructions at http://rasa-nlu.readthedocs.io/en/latest/python.html
The first problem is with the training. It has a permission error on the temp directory. Since that is unlikely, I suspect that it is masking the real problem.
>>> trainer.train(training_data)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\Tools\Anaconda\lib\site-packages\rasa_nlu\model.py", line 157, in train
updates = component.train(*args)
File "D:\Tools\Anaconda\lib\site-packages\rasa_nlu\extractors\crf_entity_extractor.py", line 80, in train
self._train_model(dataset)
File "D:\Tools\Anaconda\lib\site-packages\rasa_nlu\extractors\crf_entity_extractor.py", line 308, in _train_model
self.ent_tagger.open(self.crf_file.name)
File "pycrfsuite/_pycrfsuite.pyx", line 571, in pycrfsuite._pycrfsuite.Tagger.open (pycrfsuite/_pycrfsuite.cpp:7731)
File "pycrfsuite/_pycrfsuite.pyx", line 717, in pycrfsuite._pycrfsuite.Tagger._check_model (pycrfsuite/_pycrfsuite.cpp:10037)
PermissionError: [Errno 13] Permission denied: 'R:\\LocalTmp\\tmp5rrxv8ce'
**Content of configuration file** (if used & relevant):
```json
{
"pipeline": "spacy_sklearn",
"path" : "./Rasa/models",
"data" : "./Rasa/train/examples/demo-rasa.json",
"emulate" : "luis"
}
```
Hi Vicharian,
I faced the same problem and posted an issue at:
https://github.com/scrapinghub/python-crfsuite/issues/61
It seems that writing to temporary files on Windows is the problem, and one way around it might be to use sklearn-crfsuite instead.
Not sure if the developers will agree.
There has been no comment from them as of now.
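For anyone landing here later: one plausible reading of the traceback is that the path handed to `Tagger.open` belongs to a temporary file that is still held open. The Python docs for `tempfile.NamedTemporaryFile` note that such a file cannot be opened a second time by name on Windows while it is still open, although it can on Unix. The sketch below is not the rasa_nlu code; `write_then_reopen` is a hypothetical helper that only illustrates the close-before-reopen pattern with the standard library.

```python
import os
import tempfile


def write_then_reopen(payload: bytes) -> bytes:
    """Write payload to a temp file, close it, then reopen it by name.

    Hypothetical helper for illustration; not part of rasa_nlu or
    python-crfsuite.
    """
    # delete=False keeps the file on disk after close(), so a second
    # consumer (here a plain open(); in the issue, the CRF tagger) can
    # open the path on Windows as well as on Linux.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        tmp.write(payload)
        tmp.close()  # release the handle before reopening the path
        with open(tmp.name, "rb") as reopened:
            return reopened.read()
    finally:
        tmp.close()  # no-op if already closed
        os.remove(tmp.name)  # manual cleanup, since delete=False


print(write_then_reopen(b"model bytes"))
```

The trade-off is that cleanup becomes the caller's job, which is why avoiding the temporary file altogether may be the cleaner fix.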
For future reference: I had a chat with the creator of python-crfsuite, and the best solution is to switch to sklearn-crfsuite (which he also maintains). This will allow us to avoid temporarily persisting the model and to use it directly after training.
This is why I like Ubuntu. It works perfectly on Linux systems.
However, may I know how much time switching to sklearn-crfsuite might take? I need to implement rasa_nlu for my office work, and the system is Windows.
I'll take a look at it. As both libraries use the same underlying C library, I guess it shouldn't be too big of a deal.
I replaced the implementation on the sklearn-crf branch - @shuvayan do you mind trying that out with your windows installation?
Hi @tmbo ,
I just tested the new version on Windows. It seems to be working fine, but I am getting only the last entity from the text and not the others.
{'entities': [{'end': 62,
'entity': 'size',
'extractor': 'ner_crf',
'start': 61,
'value': '9'}],
'intent': {'confidence': 0.0, 'name': ''},
'text': 'I want to buy shoes of brand Adidas and color brown and size 9'}
Here there should be three entities: 1. brand: Adidas, 2. color: brown, and 3. size: 9.
I have changed nothing in my code except replacing the crf_entity_extractor code as modified by you.
Not sure if this issue is because of the change in underlying library.
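To make the report easier to check, here is a small sketch that compares the character spans returned by the parser with the spans one would expect. It assumes only the `start`/`end` keys visible in the output above; `expected_spans` and `missing_values` are hypothetical helper names, not part of rasa_nlu, and `str.find` only locates the first occurrence of each value.

```python
def expected_spans(text, values):
    """Return the character span of the first occurrence of each value."""
    spans = []
    for value in values:
        start = text.find(value)
        if start == -1:
            raise ValueError("value not found in text: %r" % value)
        spans.append({"value": value, "start": start, "end": start + len(value)})
    return spans


def missing_values(text, values, parsed_entities):
    """List the expected values whose span is absent from the parse result."""
    found = {(e["start"], e["end"]) for e in parsed_entities}
    return [
        s["value"]
        for s in expected_spans(text, values)
        if (s["start"], s["end"]) not in found
    ]


text = "I want to buy shoes of brand Adidas and color brown and size 9"
parsed = [{"start": 61, "end": 62, "entity": "size", "value": "9"}]
print(missing_values(text, ["Adidas", "brown", "9"], parsed))
# -> ['Adidas', 'brown']
```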
@shuvayan thanks a lot for trying it out. I didn't notice that behaviour, but I will try to reproduce the issue later today. Appreciate the feedback :+1:
@shuvayan I couldn't reproduce the issue; it looks fine on the demo dataset (e.g. it finds two entities in `i am looking for an italian restaurant in the north`).
If you don't mind it would speed up things if you could send me your data + config to [email protected] (will only be used for debugging this issue).
hi @tmbo ,
Sorry for the delay. I am attaching the data, the config file, and the Python code used.
from rasa_nlu.converters import load_data
from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer
from rasa_nlu.model import Metadata, Interpreter
training_data = load_data('data/testData.json')
trainer = Trainer(RasaNLUConfig("config_spacy.json"))
trainer.train(training_data)
model_directory = trainer.persist('./models/')
metadata = Metadata.load(model_directory)
interpreter = Interpreter.load(metadata, RasaNLUConfig("config_spacy.json"))
entities = interpreter.parse(u"I want to buy a brown pants of size 32")
Only the file corresponding to the crf_entity_extractor has to be changed, right? I did not change any other file in the source code.
Hi @tmbo ,
I would also like to request you to update the documentation with an example as to how to measure metrics like accuracy and f1-score now that sklearn is being used. It would be really helpful.
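Until the docs cover this, a rough sketch of entity-level precision, recall, and F1 in plain Python might look like the following. It is an assumption about how one could score the output, not rasa_nlu's evaluation API: `entity_prf` is a hypothetical helper, and it counts an entity as correct only when `start`, `end`, and `entity` all match exactly.

```python
def entity_prf(gold, predicted):
    """Exact-match precision, recall and F1 over entity annotations."""
    gold_set = {(e["start"], e["end"], e["entity"]) for e in gold}
    pred_set = {(e["start"], e["end"], e["entity"]) for e in predicted}
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold annotations for the example sentence vs. the single entity returned.
gold = [
    {"start": 29, "end": 35, "entity": "brand"},
    {"start": 46, "end": 51, "entity": "color"},
    {"start": 61, "end": 62, "entity": "size"},
]
predicted = [{"start": 61, "end": 62, "entity": "size"}]
precision, recall, f1 = entity_prf(gold, predicted)
print(precision, recall, f1)
```

For the example above this gives a precision of 1.0 and a recall of one third, since the one predicted entity is right but two are missed.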
Hi @tmbo ,
I would like to know whether you could reproduce the issue at your end, or whether I need to provide something more.
Sorry @shuvayan, I was at a conference last week and was involved in a client-side project, so I have not had much time to look at this yet.
I will have a look at it today though - promised 😉
Thanks a lot @tmbo . I really appreciate it. :+1:
I still cannot reproduce your issue. Running your training data and using the generated model with the server results in the following output for the input `I want to buy a brown pants of size 32`:
{
"entities": [
{
"end": 21,
"entity": "color",
"extractor": "ner_crf",
"start": 16,
"value": "brown"
},
{
"end": 27,
"entity": "product",
"extractor": "ner_crf",
"start": 22,
"value": "pants"
},
{
"end": 38,
"entity": "size",
"extractor": "ner_crf",
"start": 36,
"value": "32"
}
],
"intent": {
"confidence": 0.0,
"name": ""
},
"text": "I want to buy a brown pants of size 32"
}