Rasa: Extracting Entity Associations and Modifiers

Created on 26 Jul 2017 · 14 Comments · Source: RasaHQ/rasa

At present we can extract multiple entities, but I cannot find a way to associate them with each other in the current Rasa.
For example:
Apple CEO Tim Cook met with Microsoft CEO Satya Nadella last month in California.
Here I can easily find "Apple", "Tim Cook", "Microsoft", and "Satya Nadella" as entities, but how can I relate the first two to each other and the last two to each other? And how can I feed this information to the model during training?

Earlier I was solving this with dependency parsing. Is there any way to solve it with Rasa?

Secondly, are we using delexicalisation anywhere in Rasa?

Regards,
Mayank


All 14 comments

Can anyone from the Rasa team tell me whether what I described in my first comment is possible?

To clarify your question a bit, can you show an example of the data structure you want to extract from the user input? Is it something like this:

{
  "text": "What can you tell me about Satya Nadella from Microsoft?",
  "intent": "info",
  "entities": [
    {
      "start": 27,
      "end": 40,
      "value": "Satya Nadella",
      "entity": "person",
      "relation": { "Microsoft": "CEO" }
    }
  ]
}

There is no way to feed this kind of information in your training data.
Rasa is not really meant to be a low-level NLP framework, but if you feel it necessary for your chatbot, you can try to build your own component (see the link below) to fill this gap.

As for your second question, do you have an example of what you call "delexicalisation"?

Also have a look at the doc related to Rasa NLP internals.

For the first part, the data structure should look as follows:

{
  "text": "Apple CEO Tim Cook met with Microsoft CEO Satya Nadella last month in California",
  "intent": "info",
  "entities": [
    {
      "start": ,
      "end": ,
      "value": "Apple",
      "entity": "org",
      "related_to": { "Apple": "Tim Cook" }
    },
    {
      "start": ,
      "end": ,
      "value": "Tim Cook",
      "entity": "person",
      "related_to": { "Tim Cook": "Apple" }
    },
    {
      "start": ,
      "end": ,
      "value": "Microsoft",
      "entity": "org",
      "related_to": { "Microsoft": "Satya Nadella" }
    },
    {
      "start": ,
      "end": ,
      "value": "Satya Nadella",
      "entity": "person",
      "related_to": { "Satya Nadella": "Microsoft" }
    },
    {
      "start": ,
      "end": ,
      "value": "California",
      "entity": "loc",
      "related_to": {}
    }
  ]
}

I haven't added start and end, but the structure should look like this. I don't need any outside knowledge such as the "CEO" in your structure; I just want to tag whatever is present in the sentence.
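
To make the requested structure concrete, here is a minimal sketch of a post-processing step that fills in related_to after entity extraction. It is not a Rasa API: make_entity, link_adjacent, and the max_gap threshold are hypothetical names, and the pairing rule (an org immediately followed by a person within a few characters) is a crude stand-in for real dependency parsing.

```python
from typing import Dict, List


def make_entity(text: str, value: str, label: str) -> Dict:
    """Build a Rasa-style entity dict by locating the value in the text."""
    start = text.index(value)
    return {"start": start, "end": start + len(value), "value": value, "entity": label}


def link_adjacent(entities: List[Dict], max_gap: int = 10) -> List[Dict]:
    """Pair each org with a person that follows it within max_gap characters.

    Assumption: proximity implies association. This is a heuristic only.
    """
    ordered = sorted(entities, key=lambda e: e["start"])
    for ent in ordered:
        ent.setdefault("related_to", {})
    for left, right in zip(ordered, ordered[1:]):
        gap = right["start"] - left["end"]
        if left["entity"] == "org" and right["entity"] == "person" and gap <= max_gap:
            left["related_to"][left["value"]] = right["value"]
            right["related_to"][right["value"]] = left["value"]
    return ordered


text = "Apple CEO Tim Cook met with Microsoft CEO Satya Nadella last month in California"
labels = [
    ("Apple", "org"),
    ("Tim Cook", "person"),
    ("Microsoft", "org"),
    ("Satya Nadella", "person"),
    ("California", "loc"),
]
entities = [make_entity(text, value, label) for value, label in labels]
linked = link_adjacent(entities)

for ent in linked:
    print(ent["value"], ent["related_to"])
```

A rule like this breaks as soon as the sentence is reordered ("Tim Cook, who runs Apple, ..."), which is why a parser-based approach is more robust.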

For the second question:
Delexicalisation: slots and values are replaced by generic tokens (e.g. keywords like "Chinese food" are replaced by [v.food] [s.food]) to allow weight sharing.
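
As an illustration of that definition, here is a minimal sketch that swaps annotated spans for generic tokens. The function name delexicalise and the (start, end, token) span format are assumptions made for this example; nothing here is part of Rasa.

```python
from typing import List, Tuple


def delexicalise(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end) character span with its generic token.

    Spans are assumed not to overlap.
    """
    pieces = []
    cursor = 0
    for start, end, token in sorted(spans):
        pieces.append(text[cursor:start])  # keep the text before the span
        pieces.append(token)               # substitute the generic token
        cursor = end
    pieces.append(text[cursor:])           # keep whatever follows the last span
    return "".join(pieces)


text = "I want Chinese food"
value_start = text.index("Chinese")
slot_start = text.index("food")
spans = [
    (value_start, value_start + len("Chinese"), "[v.food]"),  # value token
    (slot_start, slot_start + len("food"), "[s.food]"),       # slot token
]
delexicalised = delexicalise(text, spans)
print(delexicalised)
```

Because every cuisine collapses to the same [v.food] token, a downstream model sees one pattern instead of one per value, which is the weight sharing mentioned above.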

To quote myself:

There is no way to feed this kind of information in your training data.
Rasa is not really meant to be a low-level NLP framework, but if you feel it necessary for your chatbot, you can try to build your own component to fill this gap.

I don't think Rasa uses any "delexicalisation" mechanism: words are just processed into feature vectors, and no featurizer uses delexicalisation AFAIK.

Can I ask what the use case is or would be for the association in the realm of chatbots or natural language agents?

@wrathagom If multiple entities are present in a user query and you want to take action based on them, you need to know which entities are related to each other. For example, in "Book a train from A to B and flight from B to C" I can extract the entities A, B, and C, but how do I relate the train to A and B and the flight to B and C? Another use case: in "I need a red shirt and blue tie" I can find red, shirt, blue, and tie, but how do I relate red to shirt and blue to tie?
I think you understand my requirements. For now I am using dependency parsing and semantic parsing to resolve such issues, but how can I solve them with a data-driven approach like Rasa?
Can you help me work out the best way to develop a component that solves such problems?

To build a new component that achieves this, have a look at the source code of one of the extractors. You just have to extract dependencies using the same tools you are using now and add the result to the Rasa output data.

But talking about the design, I'd like to think that chatbots (at least currently) are just simple API interfaces in natural language. In your example:

Book a train from A to B and flight from B to C

You have two API calls:

  • One for booking train places
  • Another for booking flights

In the same way you wouldn't overcomplicate your API calls by composing them together, you should design your chatbot to handle these separately (count or check the types of your extracted entities, for instance).
Otherwise you'll end up wanting your chatbot to handle things like:

You: Hi. Do you know where I can find an Indian restaurant in Chicago open after 10 p.m. because I'll travel by plane from New York to Chicago from tomorrow to the end of September and I'm so fond of Indian food (even if I like eating Italian from time to time). So could you also book the cheapest airplane ticket available please and check the weather forecast: I'd like to see if there will be some turbulence as I feel a little bit uncomfortable in planes. And could you also send a mail to my GF to tell her I'm OK once I've arrived?

Your chatbot: ...

You get the point.

@PHLF I agree with you that it will be handled by two API calls, but my question is how to divide the query based on entities, since all of them are cities and I get only a single intent. And for my second example, "I need a blue shirt and red tie", what would your approach be to link blue to shirt and red to tie?

  1. A first rough idea would be to define pieces of text like from <city> as entities of type <departure_city> and to <city> as entities of type <arrival_city>. Of course such an approach can be improved a lot, but it already covers many cases.
  2. This case is a little more complex, as what you describe here are structured entities: you have entities and subentities. This is the kind of thing LUIS, for instance, can handle natively. Rasa doesn't at the moment (the CRF entity extractor component could theoretically achieve it, as it is extremely powerful and has room for improvement). But you can also work with entities like color, size... and cloth_type, define your API to require only cloth_type as a mandatory entity, then check whether there are other generic entities in the user input, like color... That way you can handle cases such as:

I'd like to buy a tie // Ok let's look for all available ties.

I'd like to buy a blue tie // Ok let's look for all available blue ties

I'd like to buy a big blue tie // etc.
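
A minimal sketch of that "one mandatory entity plus optional generic entities" idea could look like the following. build_search_query and the flat entity dicts are assumptions for illustration; the entity names cloth_type, color, and size come from the suggestion above.

```python
from typing import Dict, List, Optional


def build_search_query(
    entities: List[Dict],
    required: str = "cloth_type",
    optional: tuple = ("color", "size"),
) -> Optional[Dict]:
    """Turn extracted entities into API parameters.

    Returns None when the mandatory entity is missing, so the caller
    can ask the user a follow-up question instead of calling the API.
    """
    found = {ent["entity"]: ent["value"] for ent in entities}
    if required not in found:
        return None
    query = {required: found[required]}
    for name in optional:
        if name in found:
            query[name] = found[name]
    return query


tie_only = build_search_query([{"entity": "cloth_type", "value": "tie"}])
big_blue_tie = build_search_query([
    {"entity": "size", "value": "big"},
    {"entity": "color", "value": "blue"},
    {"entity": "cloth_type", "value": "tie"},
])
print(tie_only)
print(big_blue_tie)
```

Note that this only works for one item per message: with "a red shirt and blue tie" the dict would keep a single color and a single cloth_type, which is exactly the association problem raised in this thread.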

With structured entities you would be able to handle things like:

I'd like to rent a six wheeled car

As being wheeled is an inherent property of a car object (subentity).

Multiple intents per message is already in #374 and the Rasa guys have already said they want to handle it. I can see value for different associations within the entities.

So:

In "I want to buy a red shirt", shirt may be the actual entity, and red is detected as modifying it.
In "I want to buy a big blue tie", tie is the actual entity, with two modifiers: big and blue.

You may be able to accomplish this with syntactic analysis. I'm sure the Rasa guys could come up with a fancier solution too.
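
Short of full syntactic analysis, a position-based sketch already covers the shirt and tie examples: collect modifier entities until a head entity appears, then attach them to it. attach_modifiers and locate are hypothetical helpers, and the sketch assumes modifiers always precede the item they describe, as they usually do in English.

```python
from typing import Dict, List


def locate(text: str, value: str, label: str) -> Dict:
    """Build an entity dict with the character offset of its first occurrence."""
    return {"start": text.index(value), "value": value, "entity": label}


def attach_modifiers(entities: List[Dict], head_type: str = "cloth_type") -> List[Dict]:
    """Group each head entity with the modifier entities that precede it."""
    groups = []
    pending: Dict[str, str] = {}
    for ent in sorted(entities, key=lambda e: e["start"]):
        if ent["entity"] == head_type:
            groups.append({"item": ent["value"], "modifiers": pending})
            pending = {}  # start collecting modifiers for the next item
        else:
            pending[ent["entity"]] = ent["value"]
    return groups


text = "I need a red shirt and blue tie"
entities = [
    locate(text, "red", "color"),
    locate(text, "shirt", "cloth_type"),
    locate(text, "blue", "color"),
    locate(text, "tie", "cloth_type"),
]
groups = attach_modifiers(entities)
print(groups)
```

Post-nominal modifiers ("a tie in blue") would need an actual dependency parse, so treat this as a fallback rather than a replacement for syntactic analysis.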

Even "I want to book a flight from A to B and book a train from B to C" could perhaps be split into two using the same method.
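
For this particular sentence shape, a pattern-based sketch is enough to split the request and assign the departure and arrival roles (the "from <city>" / "to <city>" idea suggested earlier in the thread). The pattern, the function name split_bookings, and the closed list of transport modes are assumptions for illustration only.

```python
import re
from typing import Dict, List

# One booking = a transport mode followed by "from X to Y".
BOOKING_PATTERN = re.compile(
    r"(?P<mode>train|flight)\s+from\s+(?P<departure_city>\w+)\s+to\s+(?P<arrival_city>\w+)",
    flags=re.IGNORECASE,
)


def split_bookings(text: str) -> List[Dict[str, str]]:
    """Return one dict per booking found in the text, in order of appearance."""
    return [match.groupdict() for match in BOOKING_PATTERN.finditer(text)]


bookings = split_bookings("I want to book a flight from A to B and book a train from B to C")
for booking in bookings:
    print(booking)
```

Each resulting dict maps onto one API call, which matches the "two API calls" design discussed above. Multi-word city names or a different word order would need a richer pattern or a trained extractor.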

For today, I wonder if you can train a too_many_questions sort of intent until some of the above could be implemented.

So have a book_a_train intent and a book_a_plane intent, and then one that tries to do both, with a default response along the lines of "I think I can help you, but let's handle things one at a time. What would you like to do first?" I'm not saying that's the most sophisticated or elegant approach, but it may be a good solution with the functionality that exists today.
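
On the application side, that workaround amounts to a small intent-to-response table. This is only a sketch: the intent names come from the comment above, while the response strings for the two booking intents, the FALLBACK text, and the respond function are made up for the example.

```python
RESPONSES = {
    "book_a_train": "Sure, let's book your train.",
    "book_a_plane": "Sure, let's book your flight.",
    "too_many_questions": (
        "I think I can help you, but let's handle things one at a time. "
        "What would you like to do first?"
    ),
}
FALLBACK = "Sorry, I did not understand that."


def respond(intent_name: str) -> str:
    """Pick the reply for a classified intent, with a fallback for unknown ones."""
    return RESPONSES.get(intent_name, FALLBACK)


print(respond("too_many_questions"))
```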

partial duplicate, Github is half right.

@mayankg53 @wrathagom @PHLF
I also have difficulties with structured entities or entity associations. For example, here are some simple requests against a simple Friendship database:

  1. "show info of PersonA" --> intent: 'get_info'; with a 'person' entity
  2. "show info of PersonA's friend PersonB" --> intent: 'get_info'; with two 'person' entities, but we want to only know about PersonB
  3. "show info of PersonA and PersonB" --> intent: 'get_info'; with two 'person' entities, and we want to know both of them
  4. "show info of PersonA's friends" --> intent: 'get_friends'; with a 'person' entity

Things rapidly become much more complicated once more types of relationship (teacher/student, parent/child), more intents (get_teacher), and more complex text structures (e.g. "A, who is a friend of ...") are involved.

One solution I can easily imagine is sentence parsing (at a lower level, e.g. with spaCy). For the 2nd example, the sentence structure can tell us that we want to know about PersonB, not PersonA, while in the 3rd example we want to know about both. To my knowledge, this cannot be done at the level of Rasa NLU at present (please tell me if I'm wrong). Another solution may be relation extraction: for example 2, get_info(PersonB) with Relation(PersonB, be_friend_of, PersonA).
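
To show how such relation triples could drive the get_info call, here is a minimal sketch. It assumes the relations have already been extracted by some other means; Relation and resolve_targets are hypothetical names, and the rule (a person who only serves as the object of a relation is an anchor, not a target) is one possible reading of examples 2 and 3.

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


def resolve_targets(persons: List[str], relations: List[Relation]) -> List[str]:
    """Return the persons the user actually asks about.

    A person appearing as the object of a relation only anchors another
    person ("PersonA's friend PersonB") and is dropped from the targets.
    """
    anchors = {rel.obj for rel in relations}
    return [person for person in persons if person not in anchors]


# Example 2: "show info of PersonA's friend PersonB"
targets_ex2 = resolve_targets(
    ["PersonA", "PersonB"],
    [Relation("PersonB", "be_friend_of", "PersonA")],
)
# Example 3: "show info of PersonA and PersonB" (no relation between them)
targets_ex3 = resolve_targets(["PersonA", "PersonB"], [])

print(targets_ex2)
print(targets_ex3)
```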

PS: and eventually structured intents (not only the multi-intent problem, but also intent associations and hierarchical intents), such as get_info(get_friend_names) or yes_no(count(get_friend_names)==10), etc.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

We now have support and examples around knowledge bases, so I'll close this for now.
