Rasa: Update documentation on components and policies

Created on 24 Feb 2020 · 13 Comments · Source: RasaHQ/rasa

There are a couple of comments in https://github.com/RasaHQ/rasa/pull/5266 that are valid and still open. We should address those.

In general:

  • add better descriptions to our model parameters: explain what they are and when to modify them
  • give concrete suggestions on how to configure certain model parameters
  • explain when to use which component and policy

All 13 comments

Mention in the featurizer docs that you can still load fastText vectors, etc.

The table of features for the LexicalSyntaxFeaturizer does not look good.

Using language models like BERT or GPT-2: should we add a warning that they are slow?

@dakshvar22 Is training using language models like BERT, GPT-2 slower? Do you have any numbers?

Training is a bit slower, because it takes more time to featurize the training data. The more interesting question is whether prediction is slower.

@tabergma As @Ghostvv mentioned, the actual training of DIETClassifier should not be affected much. I tested the prediction time with rasa shell. Once the components are loaded, the actual processing of a new message takes around 100 ms. Of course, that would change a bit depending on the length of the input message.

And without LM models?

Parsing times with DIETClassifier + these featurizers -
CountVectorizer(word level) : 20 ms
ConveRTFeaturizer: 30 ms
LanguageModelFeaturizer(BERT): 140 ms
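
For anyone who wants to reproduce numbers like these on their own data, here is a minimal timing sketch. It is not the script used for the figures above (those came from rasa shell). `median_parse_ms` and `fake_parse` are hypothetical names: `fake_parse` is only a stand-in so the snippet runs on its own, and you would pass in the parse call of your own loaded model instead. The first call is run as an untimed warm-up, since the numbers above were taken once the components were loaded.

```python
import statistics
import time
from typing import Callable, Sequence


def median_parse_ms(
    parse: Callable[[str], object],
    messages: Sequence[str],
    repeats: int = 5,
) -> float:
    """Return the median wall-clock time (in ms) of one `parse` call."""
    if not messages or repeats < 1:
        raise ValueError("need at least one message and one repeat")

    # Untimed warm-up call, so one-off loading costs are not measured.
    parse(messages[0])

    timings_ms = []
    for message in messages:
        for _ in range(repeats):
            start = time.perf_counter()
            parse(message)
            timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms)


def fake_parse(text: str) -> dict:
    """Hypothetical stand-in for a loaded model's parse call."""
    time.sleep(0.002)  # pretend that parsing takes about 2 ms
    return {"text": text, "intent": None}


if __name__ == "__main__":
    sample = ["hello", "I would like to book a table for two"]
    print(f"median parse time: {median_parse_ms(fake_parse, sample):.1f} ms")
```

The median is used rather than the mean so that a single slow call does not dominate the result. Mixing short and long messages in the sample matters, because parse time depends on input length.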

Outcome of our meeting:

@erohmensing and @ArjaanBuijk are working on a proposal for updating the "choosing a pipeline" page. The current page contains quite a lot of content: (1) what is a pipeline? (2) how does it work? (3) when to use what components? (4) deprecated content about pipeline templates. We want to simplify the page and maybe split it into multiple pages.

@tabergma is updating the "Components" page. We decided not to list the parameters twice. We will explain the important parameters in more detail, including when to change them and how they could influence the model's performance. All the other parameters should be listed in a table with just a short description. We might consider making some parts of the component descriptions collapsible.

@koaning

@erohmensing @ArjaanBuijk where are you working on this proposal btw? Do we have an issue for it?

I have an idea.

We can have diagrams for every pipeline step. I'll take DIET now as an example.

Suppose that we have our mouse cursor ...

image

Wouldn't it be great if you'd then see something like this?

image

If this makes folks go "yep! me want dat!" ... I could make a demo in d3.

Closed via #5410 and #5456. Let's open new issues for further improvements.
