Rasa NLU version: 0.10.1
Used backend / pipeline: spacy_sklearn
Operating system: osx
Issue:
When using "fixed_model_name" in the configuration file, it would be nice if the server treated fixed_model_name as the default model for requests. Currently one needs to specify the model in the query, which seems unnecessary when the config already provides a fixed model name.
The behaviour is also misleading when specifying a fixed_model_name that does not have the required date format, as requests then fail entirely.
E.g. config
{
"pipeline": "spacy_sklearn",
"language": "en",
"path": "./projects",
"fixed_model_name": "model_en_deploy",
"data": "data/test.json",
"port": 5000
}
with query
curl -X POST \
http://localhost:5000/parse \
-H 'content-type: application/json' \
-d '{
"q":"I am looking for blue jeans"
}'
results in
{
"error": "time data 'en_deploy' does not match format '%Y%m%d-%H%M%S'"
}
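The error text has the same shape as the message raised by Python's datetime.strptime, which suggests the server strips a "model_" prefix and tries to parse the remainder as a timestamp. A minimal sketch of that failure, assuming this is roughly what happens (the helper names and the prefix handling are hypothetical, not taken from the Rasa NLU source):

```python
from datetime import datetime

MODEL_PREFIX = "model_"              # assumed prefix, inferred from the error text
TIMESTAMP_FORMAT = "%Y%m%d-%H%M%S"   # format quoted in the error message


def parse_model_timestamp(model_name):
    """Hypothetical helper: strip the prefix and parse the rest as a timestamp."""
    if model_name.startswith(MODEL_PREFIX):
        suffix = model_name[len(MODEL_PREFIX):]
    else:
        suffix = model_name
    # Raises ValueError when the suffix is not a timestamp.
    return datetime.strptime(suffix, TIMESTAMP_FORMAT)


def describe(model_name):
    """Return the parsed timestamp, or the error text if parsing fails."""
    try:
        return parse_model_timestamp(model_name)
    except ValueError as err:
        return str(err)


if __name__ == "__main__":
    print(describe("model_20180115-093000"))
    print(describe("model_en_deploy"))
```

Under these assumptions, a name such as model_20180115-093000 parses cleanly, while model_en_deploy fails on the "en_deploy" remainder.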
@tmbo: it looks like an enhancement to me, not really a bug. To begin with:
"The behaviour is also misleading when specifying a fixed_model_name that has not the required date format, as requests will not work at all then."
It's not that misleading: you're not supposed to have custom model names, because a model only represents a snapshot of a project. For instance, if you have an English-speaking customer support chatbot project named "en_customer_support", then "model_XXXXXX" and "model_YYYYYY" are just two versions of that same project. Models are a way to version your project, so you're not supposed to ask for "en_deploy" as a model.
As for setting a "fixed_model_name", more generally it would be interesting to have some per-project metadata specifying the default model to use, the language, the pipeline, and whether the project is production ready or in development.
@PHLF if you are not supposed to have custom model names, a fixed_model_name configuration seems misleading to me.
In terms of ensuring that you deploy a certain model, which might not be the most current, it would be helpful to specify its name in the configuration, and throw an error if this model is not available, without a fallback to the latest model.
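The strict behaviour requested here could look roughly like the following. This is only a sketch of the idea, not existing Rasa NLU code: select_model, ModelNotFoundError, and the way available models are passed in are all hypothetical names chosen for illustration.

```python
class ModelNotFoundError(Exception):
    """Raised when the model to serve cannot be found (hypothetical error type)."""


def select_model(available_models, requested=None, default=None):
    """Pick the model used to answer a request.

    Preference order: the model named in the request, then the default
    configured for the project. There is deliberately no fallback to the
    latest model: a missing model is reported as an error.
    """
    name = requested or default
    if name is None:
        raise ModelNotFoundError("no model requested and no default configured")
    if name not in available_models:
        raise ModelNotFoundError("model '{}' is not available".format(name))
    return name


if __name__ == "__main__":
    models = ["model_20180101-120000", "model_en_deploy"]
    # No model in the query: the configured default is used.
    print(select_model(models, default="model_en_deploy"))
```

With this shape, a request without a model name is served by the configured default, and a misconfigured default fails loudly instead of silently serving a different model.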
Just to be sure, do we both agree that "fixed_model_name" is currently not a genuine parameter name?
Edit: my bad, it is. I missed that. I don't really know when it was added, but I think it would be better to set a default model per project, overriding the current behaviour where the latest trained model is the default: that would solve @clennan's issue.
It wouldn't be misleading even if you don't have a custom model name, as it would let you specify the model to use as the default for a given project: "fixed_model_name": "model_YYYYMMDD-HHMMSS".
As for throwing an error: a fallback is exactly that. It lets you keep delivering your service in a worst-case scenario, for example if at some point the models are stored in a separate DB that gets corrupted and the server can no longer find the right model. Moreover, there is currently a server-side warning when the server uses a fallback model.
So are we saying that fixed_model_name is completely broken at the moment? Does it ever work as expected?
Exactly.
The error message "error": "time data 'en_deploy' does not match format '%Y%m%d-%H%M%S'" says it all: the model name needs to match a specific format, which nullifies the fixed_model_name parameter whenever the name differs from the model_YYYYMMDD-HHMMSS format.
You can set a fixed_model_name but you cannot set a custom model name.
So we either need to remove this parameter or fix it. One proposal for fixing it is to make it model_prefix rather than fixed_model_name and just have it override the model_ prefix?
@clennan can you help me understand your use case? Why are you trying to use a fixed model name?
It seems the fixed_model_name parameter is used only during training, where it creates the model under the specified name. However, the status and parse endpoints expect the model name in a specific format. Hence it would be good to remove fixed_model_name (at least from the Configuration documentation) until the status and parse endpoints recognize it.
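To make the model_prefix proposal concrete, here is a rough sketch of how a configurable prefix could coexist with the timestamp format the endpoints expect. The function names and the model_prefix option are hypothetical; nothing here is taken from the current codebase.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d-%H%M%S"  # format quoted in the error message


def build_model_name(prefix="model_", now=None):
    """Hypothetical: name a freshly trained model as <prefix><timestamp>."""
    now = now or datetime.now()
    return prefix + now.strftime(TIMESTAMP_FORMAT)


def parse_model_name(name, prefix="model_"):
    """Hypothetical: recover the training timestamp from a model name."""
    if not name.startswith(prefix):
        raise ValueError("model name '{}' does not start with '{}'".format(name, prefix))
    return datetime.strptime(name[len(prefix):], TIMESTAMP_FORMAT)


if __name__ == "__main__":
    trained_at = datetime(2018, 1, 15, 9, 30, 0)
    name = build_model_name("en_deploy_", trained_at)
    print(name)
    print(parse_model_name(name, "en_deploy_"))
```

The prefix carries the human-readable part of the name, while the timestamp suffix stays parseable, so version ordering by date would keep working.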
this should be fixed.