Chatterbot: Word Aliases

Created on 7 Sep 2017 · 4 Comments · Source: gunthercox/ChatterBot

For example:
AI
Artificial intelligence
machine intelligence

How can I treat all of these words as one word in a custom logic adapter?

question

tosaravanan

All 4 comments

@tosaravanan I think WordNet synonym lookup in NLTK will help you here.

>>> from nltk.corpus import wordnet
>>> synonyms = []
>>> 
>>> for syn in wordnet.synsets("ai"):
...     for l in syn.lemmas():
...         synonyms.append(l.name())
... 
>>> print(set(synonyms))
{'AI', 'ai', 'Army_Intelligence', 'Bradypus_tridactylus', 'artificial_intelligence', 'three-toed_sloth', 'artificial_insemination'}

For more information, see this thread: https://stackoverflow.com/questions/19258652/how-to-get-synonyms-from-nltk-wordnet-python and the NLTK WordNet how-to: http://www.nltk.org/howto/wordnet.html

I also looked into how to add new words to wordnet.synsets. This thread covers it: https://stackoverflow.com/questions/20749730/add-words-to-a-local-copy-of-wordnet
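Note that the WordNet output above mixes in unrelated senses of "ai" (the three-toed sloth, artificial insemination), so a small hand-written alias table may be easier to control. The following is only a minimal sketch of that idea in plain Python: the `ALIASES` table and the `normalize` helper are hypothetical names, not part of ChatterBot or NLTK. A custom logic adapter could call something like this on the input text before comparing statements.

```python
import re

# Hypothetical alias table: each known phrase maps to one canonical token.
ALIASES = {
    "artificial intelligence": "ai",
    "machine intelligence": "ai",
    "ai": "ai",
}

# Try longer phrases first so multi-word aliases win over shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(ALIASES, key=len, reverse=True))
    + r")\b"
)


def normalize(text):
    """Lowercase the text and replace every known alias with its canonical token."""
    return _PATTERN.sub(lambda m: ALIASES[m.group(1)], text.lower())


print(normalize("What is Artificial Intelligence?"))
print(normalize("Tell me about machine intelligence and AI"))
```

After normalization all three spellings collapse to the same token, so any later comparison step sees them as one word.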

vkosuri on 8 Sep 2017

Thanks @vkosuri.
I am trying to extract intents and entities from the user's input and then fetch the price from a web service. Can you guide me to the best approach? Example inputs:
tell samsung j7 price
get price samsung mobile j7

tosaravanan on 11 Sep 2017

Try JaccardSimilarity for this.

It calculates the similarity of two statements based on the Jaccard index.

The Jaccard index is a ratio. The numerator is the number of items shared between the two sets (their intersection). The denominator is the number of distinct items across both sets (their union). Let's say we define sentences to be equivalent if 50% or more of their tokens are shared. Here are two sample sentences:
The young cat is hungry. The cat is very hungry.
When we parse these sentences and remove stopwords, we end up with the following two sets:
{young, cat, hungry} {cat, very, hungry}
In this example the intersection is {cat, hungry}, which has a count of two. The union of the sets is {young, cat, very, hungry}, which has a count of four. The Jaccard similarity index is therefore two divided by four, or 50%. Given the similarity threshold above, we would consider this a match.

Here is the Python snippet:

from chatterbot import ChatBot
# Import the comparison function that is actually passed to ChatBot below
from chatterbot.comparisons import jaccard_similarity

chatbot = ChatBot(
    # ...
    statement_comparison_function=jaccard_similarity
)
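To make the arithmetic above concrete, here is a minimal sketch of the Jaccard calculation in plain Python. It does not use ChatterBot's own implementation. The `STOPWORDS` set is a tiny illustrative assumption, and `tokens` and `jaccard` are hypothetical helper names.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "a", "an"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword word tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Return |A intersect B| / |A union B| for the token sets of two sentences."""
    set_a, set_b = tokens(a), tokens(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty sentences: define the similarity as zero
    return len(set_a & set_b) / len(union)


score = jaccard("The young cat is hungry.", "The cat is very hungry.")
print(score)         # 0.5
print(score >= 0.5)  # True, so this pair meets the 50% threshold
```

With this sketch the two price queries from the earlier comment, "tell samsung j7 price" and "get price samsung mobile j7", share three of six distinct tokens and also score 0.5, even though the word order differs.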

vkosuri on 11 Sep 2017

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.