Rasa: Refactor Agent / Processor / TrackerStore

Created on 18 Feb 2020 · 11 comments · Source: RasaHQ/rasa

The Problem

  • the Agent class is big and hard to maintain
  • new types of models are hard to implement (e.g. the response selector)

Goals

  • Make it easy to implement different / new types of models without having to squeeze it into the existing code
  • De-couple our classes to make Rasa Open Source more modular and easier to adapt to change
  • Agent:

    • separate model loading logic from class

    • separate training from agent class

    • remove duplicate code between Agent / Processor

  • TrackerStore: Separate it from the model specific domain (the TrackerStore should not know of the model domain)
  • make this non-breaking for the existing Rasa Open Source models

Implementation Proposal

  • This is about adding a layer in front of the actual model implementations. We would e.g. have to adapt our Processor class to the Model interface, but could then use it as before (a minimal sketch follows below)
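
To make this more concrete, here is a minimal sketch of what such a Model interface could look like. The method names mirror the Agent/Processor dependencies listed in the refactoring steps further down; the Event stand-in and everything else is illustrative, not the actual Rasa API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Text


class Event:
    """Stand-in for the real event class, only here to keep the sketch self-contained."""


class Model(ABC):
    """Hypothetical interface that every model implementation (the current
    Processor, a future end-to-end model, ...) would have to fulfil."""

    @abstractmethod
    async def parse_message(self, message_data: Text) -> Dict[Text, Any]:
        """Run NLU on a raw user message and return the parsed result."""

    @abstractmethod
    async def predict(self, message_data: Text, events: List[Event]) -> List[Event]:
        """Handle a user message given the conversation so far and return new events."""

    @abstractmethod
    async def predict_next(self, events: List[Event]) -> Dict[Text, Any]:
        """Predict the next action given the conversation events so far."""
```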

UML Diagrams:

Class hierarchy

[UML diagram: rasa class hierarchy]

Running a training

This UML diagram shows the workflow when running rasa train.
It shows how

  • a model is trained using a TrainingDataImporter
  • the ModelFactory is selected based on a default value / config

[UML diagram: train workflow]

Inference Mode / Interacting with a running Rasa server

It shows how

  • the interaction works with the changed TrackerStore (which returns a list of events instead of a DialogueStateTracker)
  • a response is fetched from a model that sits behind the Model interface

[UML diagram: predict workflow]

Starting a Rasa Server and loading a model

This shows how:

  • the right repository / factory classes are selected to load the desired model type
  • a model is loaded from disk
  • models are re-loaded when using a ModelServer (that's the vaguest part, I'd say)

[UML diagram: load-model workflow]


All 11 comments

This looks very interesting and I want to understand it better - would you mind adding a very short description of what each UML diagram describes? (just to understand their context better)

@federicotdn Done. Do you have any specific questions which I should clarify?

How does this simplify e.g. the implementation of the response selector?

A couple of questions:

  • This means that you would like to merge Agent and Processor into a new class, Assistant, right? Asking because it's not explicitly mentioned.
  • I like the idea of adding a ModelFactory. It would separate the logic of creating a model (which is relatively complex, if training is involved) from actually using the model. However, I don't understand why ModelFactory is in charge of saving the models to files. That sounds like it could go either in Model itself (like it is now), or maybe ModelFactory should be renamed to reflect the fact that it has more responsibilities than just loading models.
  • Why do you think TrackerStore.get_or_create_tracker should return a list of events, rather than a DialogueStateTracker itself? Isn't the latter a better way of representing a series of conversation events?
  • I couldn't understand the objective of having the ModelRepository class. In the last UML diagram, it looks like it is delegating the model loading to ModelFactory.
  • Can we rename DialogueStateTracker to Tracker? (and then add DialogueStateTracker = Tracker below the class declaration)

How does this simplify e.g. the implementation of the response selector?

Good question. If we had a structure as proposed, you would approach implementing the feature less from a "how can I fit this in" perspective and more from a "where does this belong" perspective. To be honest, I know too little about the difficulties of the response selector implementation to propose how I would have done it given the proposed changes.

An easier example is maybe implementing an end-to-end model. You could simply create a new Model subclass and start coding (while re-using what you need from the existing modules). Both models could live side by side until we decide that we are ready to switch completely.

This means that you would like to merge Agent and Processor into a new class, Assistant, right? Asking because it's not explicitly mentioned.

I wouldn't say merging. Actually it would be more about separating them, since they are quite entangled at the moment. I guess that a lot of Processor logic would go into RasaModel, while the Assistant would be more focused on delegating between channels / tracker stores / lock stores / models.

However, I don't understand why ModelFactory is in charge of saving the models to files.

True, I think both are possible. Might be that the ModelFactory is too much abstraction. From my view the ModelFactory is responsible for everything around (de)serialization of the model. You could e.g. implement a different ModelFactory to store models on different sources (e.g. S3, disk, ...) or in different formats.

I couldn't understand the objective of having the ModelRepository class. In the last UML diagram, it looks like it is delegating the model loading to ModelFactory.

The ModelRepository is responsible for selecting the appropriate model, e.g. selecting the suited model from a directory on disk or from a bucket on AWS. The ModelFactory would then deal with converting the selected model to the right Python object.
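
In code, the split of responsibilities could look roughly like this (class and method names are made up just to illustrate the boundary between selecting a model and building it):

```python
from pathlib import Path
from typing import Optional


class ModelRepository:
    """Hypothetical: only knows *where* models live and which one to pick."""

    def __init__(self, model_directory: Path) -> None:
        self._model_directory = model_directory

    def latest_model(self) -> Optional[Path]:
        # e.g. pick the newest packaged model from a local directory;
        # an S3 implementation would list the bucket instead
        archives = sorted(self._model_directory.glob("*.tar.gz"))
        return archives[-1] if archives else None


class ModelFactory:
    """Hypothetical: only knows how to turn a selected archive into Python objects."""

    def load(self, model_archive: Path) -> object:
        raise NotImplementedError("unpack the archive and build the Model instance")
```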

Why do you think TrackerStore.get_or_create_tracker should return a list of events, rather than a DialogueStateTracker itself? Isn't the latter a better way of representing a series of conversation events?

The TrackerStore currently requires the Domain to return the DialogueStateTracker. The Domain is part of the model and shouldn't be exposed to the TrackerStore in my opinion (e.g. different models might handle the events differently). I completely agree that the Model should then apply a suitable data structure to this raw list to make handling easier.

Can we rename DialogueStateTracker to Tracker? (and then add DialogueStateTracker = Tracker below the class declaration)

I'd prefer ConversationHistory actually, since it's more intuitive 😄

In general this is just a first draft / proposal. A lot of details about the interactions (e.g. between ModelRepository / ModelFactory) still need revisiting.

Thanks for your detailed feedback @federicotdn !

The way I see our code is that it should be organized as a bunch of Lego blocks which interact with each other through clearly defined interfaces. This makes the individual blocks replaceable and the code concise, and hence more flexible.

Currently, the agent / processor logic leaks into some of our existing "lego blocks" (e.g. training data, tracker store, policies) (sorry that I forgot the lock store).

[diagram: now]

(I believe) Introducing a Model interface + the other changes would reinforce the re-usable blocks and remove the leaks.
[diagram: then]

That looks great, I like the idea of reworking the Agent and Processor classes into Assistant, while at the same time making the boundaries between all classes better defined. Currently Agent and Processor are very tightly related and sometimes it's difficult to see what each of them individually is meant to do. I think these changes also give us the opportunity to improve our public Python API.

An issue regarding process-safe handling of models. Not necessarily part of this issue, but worth thinking about before implementing this: https://github.com/RasaHQ/rasa/issues/5376

Refactoring steps

Simplifications

  • no need to dynamically load the factory for the model in the beginning. That can be a later step

Todos

  • Think about processor._cancel_reminders and _schedule_reminders

Steps

Part 0: Agree on a Model interface

  • The current suggestion in the issue is just a quick draft and needs to be discussed with the team.

Part 1: The divorce of Domain and TrackerStore

Rough plan

  • return a List[Event] from get_or_create_tracker (a minimal sketch follows this list)
  • move init_tracker somewhere else (probably to DialogueStateTracker)
  • optional: refactor TrackerStore.save to accept a List[Event]
  • adapt the subclass implementations
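
A minimal sketch of the changed TrackerStore, assuming an in-memory implementation and a stand-in Event class (none of this is the real Rasa code; it only illustrates that the Domain is no longer needed):

```python
from typing import Dict, List, Text


class Event:
    """Stand-in for the real event class, for illustration only."""


class InMemoryTrackerStore:
    """Hypothetical tracker store that no longer needs the Domain."""

    def __init__(self) -> None:
        self._events: Dict[Text, List[Event]] = {}

    def get_or_create_tracker(self, conversation_id: Text) -> List[Event]:
        # Instead of building a DialogueStateTracker (which requires the Domain),
        # the store just returns the raw event list for the conversation.
        return self._events.setdefault(conversation_id, [])

    def save(self, conversation_id: Text, events: List[Event]) -> None:
        self._events[conversation_id] = list(events)
```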

Current Dependencies (just for plan ideation)

  • processor.get_tracker
  • exporter._fetch_events_within_time_range
  • save returns a DialogueStateTracker

    • imo it can continue to accept a DialogueStateTracker

Part 2: Hello Model interface

Rough plan

  • Create the Model interface
  • Let Processor implement the Model interface
  • Let the Agent only speak to the Processor via the interface (sketched below)
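
Sketched with the interface from Part 0, the delegation could look roughly like this (illustrative only; Model refers to the hypothetical interface above):

```python
from typing import Any, Dict, Text


class Agent:
    """Hypothetical slimmed-down Agent: it holds *some* Model and nothing else."""

    def __init__(self, model: "Model") -> None:
        # `model` could be the current Processor or any other Model implementation
        self._model = model

    async def parse_message_using_nlu_interpreter(self, message_data: Text) -> Dict[Text, Any]:
        # The Agent no longer reaches into interpreter / domain / policies;
        # it simply delegates through the Model interface.
        return await self._model.parse_message(message_data)
```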

Current Dependencies (just for plan ideation)

  • Dependencies Agent <-> Processor

    • Model.parse_message

      • Agent.parse_message_using_nlu_interpreter

    • Model.predict

      • Agent.handle_message
      • Agent.trigger_intent

    • Model.predict_next

      • Agent.predict_next

    • Model.execute

      • Note: this is missing in the interface declaration
      • Agent.execute_action

        • do we want to remove this?

    • Open for discussion

      • Agent.log_message

        • Do we even need that?
        • seems to be a weird way to add the result of parse_message directly to the tracker + adding slots

Part 3: Remove TrackerStore from Processor

Rough plan

  • Remove direct access from the server module to the processor

    • speak to my Agent instead

  • Pass the current events (not a DialogueStateTracker!) to every function of the processor which needs them (see the sketch after this list)
  • Remove tracker_store from the Processor constructor
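
For illustration, a processor method after this change could take the events it needs instead of pulling a tracker from its own store (the method name and return value are made up):

```python
from typing import List, Optional, Text


class Event:
    """Illustrative stand-in for the real event class."""


class Processor:
    """Hypothetical processor after the change: no tracker_store attribute."""

    async def predict_next_action(self, events: List[Event]) -> Optional[Text]:
        # The caller (the Agent) fetched `events` from the TrackerStore;
        # the processor only works with what it is given.
        if not events:
            return None
        return "action_listen"  # placeholder decision; real logic would ask the policies
```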

Current Dependencies (just for plan ideation)

  • Server currently skips the agent and speaks to the model directly, e.g.

    • get_tracker

  • Processor always deals with TrackerStore itself

Situation after Part 3

  • Agent still knows about training
  • Agent still knows too much about model loading
  • The Agent still knows too much about the insides of Processor

    • interpreter

    • domain

    • policies

Part 4: Hello ModelFactory: Removing Rasa Core training from Agent

Rough plan

  • Create ModelFactory interface
  • Create a ModelFactory with a from_training_data_importer function

    • Do all of the Agent.train steps in here

    • this will return a PolicyEnsemble at this step

  • Add a save function to ModelFactory which accepts the model and can save it
  • point rasa.core.train to the ModelFactory implementation
  • drop all the training code from Agent
  • pass the trained ensemble to the Agent as usual (a rough sketch of the factory follows this list)
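
A rough sketch of the ModelFactory described above; the function names follow the plan, while the bodies only document the intended steps (assumptions, not existing Rasa code):

```python
from pathlib import Path


class ModelFactory:
    """Hypothetical factory that owns Core training and model (de)serialization."""

    @classmethod
    async def from_training_data_importer(cls, importer: "TrainingDataImporter") -> "PolicyEnsemble":
        # Roughly what Agent.train does today:
        # 1. create the policy ensemble from the config provided by the importer
        # 2. generate DialogueStateTrackers from the stories using the Domain
        # 3. train the ensemble and set a fingerprint
        raise NotImplementedError("training steps moved here from Agent.train")

    @staticmethod
    def save(model: "PolicyEnsemble", target: Path) -> Path:
        # Persisting becomes a factory concern; the Agent no longer knows about it.
        raise NotImplementedError
```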

Current Dependencies for Rasa Core training (just for plan ideation)

  • References

    • restaurant bot example

    • rasa.core.train

  • Agent has

    • policy ensemble

    • domain

    • load_data

      • this one is just the Agent talking to the Ensemble

    • This is equivalent to

      • Create policy ensemble
      • generate DialogueStateTrackers using Domain
      • Tell the policy ensemble to train
      • set a fingerprint

    • Agent knows how to persist models 🙈

Situation after Part 4

  • Agent doesn't know about Core training anymore
  • Agent still knows too much about model loading
  • The Agent still knows too much about the insides of Processor

    • interpreter

    • domain

    • policies

Part 5: ModelFactory, let me introduce you to NLU

Rough Steps

  • Move nlu.train.train to the factory (the same place where Core is trained)
  • Let the factory method from_training_data_importer return an implementation of the Model interface (aka Processor)
  • pimp the save function in the ModelFactory to save everything as a Rasa model
  • optional: move the "no need to retrain" logic to the ModelFactory
  • delete toggle_memoization or think about it

Open Question

  • How does our "no need to retrain" logic fit in here?

Part 6: Agent, forget about the model internals

Rough Steps

  • Add a function to ModelFactory to load Model from path
  • make rasa.core.run.load_agent_on_start just pass the path instead of the interpreter
  • Refactor load_agent to use the ModelFactory when loading a model from a path (see the sketch after this list)

    • There are 3 cases

      • load_from_server
      • load_from_remote_storage
      • load_local_model
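
The three cases could then be dispatched in one place, roughly like this (the load_* methods on the factory are hypothetical, as is the helper itself):

```python
from typing import Optional, Text


async def load_agent_model(
    factory: "ModelFactory",
    model_path: Optional[Text] = None,
    model_server: Optional[Text] = None,
    remote_storage: Optional[Text] = None,
) -> "Model":
    """Hypothetical dispatch: pick the right loading strategy and return a Model."""
    if model_server is not None:
        return await factory.load_from_server(model_server)      # hypothetical method
    if remote_storage is not None:
        return factory.load_from_remote_storage(remote_storage)  # hypothetical method
    return factory.load_local_model(model_path)                  # hypothetical method
```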



Situation after Part 6

  • Agent doesn't know about Core training anymore
  • Agent still knows too much about model loading
  • The Agent does not know what a model looks like inside anymore

Part 7: Specialized Model loader

Requirements

  • Needs a ModelFactory
  • needs to be able to replace the Model within the Agent

Rough Steps (needs further thought)

  • there are at least two ways to handle this: we need to think about the pros / cons

    • one centralised model downloader: sets a variable which points to the latest model on disk, and each worker just loads it from disk
    • each worker pulls separately and updates its own model

  • Make Agent model loading one simple function / callback and move the other related code to a new / different module

Thoughts

  • APScheduler or a Sanic background task for repeatedly pulling (a sketch of the background-task variant follows this list)
  • This should be process-safe -> same Rasa model for all Sanic workers
  • Monitoring + pulling for a newer model shouldn't be related to where it is stored imo, but that might change
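
One way the "each worker pulls separately" variant could look with a Sanic background task; the pull_newer_model helper and the agent update call are assumptions and therefore only appear as comments:

```python
import asyncio

from sanic import Sanic

app = Sanic("rasa_model_puller_sketch")

MODEL_PULL_INTERVAL = 100  # seconds, hypothetical default


async def _pull_model_periodically(app: Sanic) -> None:
    while True:
        await asyncio.sleep(MODEL_PULL_INTERVAL)
        # Hypothetical helpers: check the model server / remote storage for a
        # newer model and swap it into the running Agent if one is found.
        # new_model = await pull_newer_model(app.config["MODEL_SERVER"])
        # if new_model is not None:
        #     app.agent.update_model(new_model)


@app.listener("after_server_start")
async def start_model_pulling(app: Sanic, loop) -> None:
    # Each Sanic worker starts its own pulling task in this variant.
    app.add_task(_pull_model_periodically(app))
```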

Step 8: Remove Agent.visualize

Current Dependencies (just for plan ideation)

  • policies are required to determine max_history
  • domain is required
  • interpreter is None, which probably means RegexInterpreter

Rough steps

  • use the TrainingDataImporter to load the required data
  • pass the importer to rasa.core.visualize and just skip agent.visualize
  • determine the configured max_history based on config + default value + CLI variable (a small sketch follows this list)
  • remove Agent.visualize
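
A small sketch of how the configured max_history could be resolved; the default value, the config layout, and the helper name are assumptions for illustration:

```python
from typing import Any, Dict, Optional

DEFAULT_MAX_HISTORY = 5  # hypothetical default, just for illustration


def resolve_max_history(config: Dict[str, Any], cli_value: Optional[int] = None) -> int:
    # CLI value beats the config, the config beats the default.
    if cli_value is not None:
        return cli_value
    for policy in config.get("policies", []):
        if "max_history" in policy:
            return policy["max_history"]
    return DEFAULT_MAX_HISTORY
```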

Step dependencies

[diagram: step dependencies]

@tmbo I split the proposal into single steps. The comment looks a bit daunting due to its size, but you can focus on the step headlines and the rough plan sections. The image describes the dependencies between the steps (e.g. we can't do Step 4 without doing Step 0 first). In my opinion every step (besides steps 0, 2 & 5) makes sense to do, even if we don't continue with the subsequent steps. Some things are not figured out to the last detail, but imo that doesn't matter at this stage (e.g. how to move the details of model loading out of Agent).
