@gunthercox, @vkosuri, @mymusise
I see you are concerned about the performance of getting a response for a statement, so what I did to improve it may be helpful. Response time improved from 1.9s to 96.8ms with the following changes to LevenshteinDistance:
Move
import sys
from difflib import SequenceMatcher
so that they come before the class.
Comment out the try ... except ... block for the library import.
# import sys
#
# # Use python-Levenshtein if available
# try:
# from Levenshtein.StringMatcher import StringMatcher as SequenceMatcher
# except ImportError:
# from difflib import SequenceMatcher
# PYTHON = sys.version_info[0]
# Return 0 if either statement has a falsy text value
# if not statement.text or not other_statement.text:
# return 0
#
# # Get the lowercase version of both strings
# if PYTHON < 3:
# statement_text = unicode(statement.text.lower()) # NOQA
# other_statement_text = unicode(other_statement.text.lower()) # NOQA
# else:
# statement_text = str(statement.text.lower())
# other_statement_text = str(other_statement.text.lower())
statement_text = str(statement.text.lower())
other_statement_text = str(other_statement.text.lower())
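For anyone who wants to keep python-Levenshtein as an optional dependency, here is a minimal sketch of the same idea with the try/except resolved once at module level instead of being commented out. The `compare_text` helper is hypothetical and only for illustration; it is not the actual ChatterBot method, and the rounding to two decimals is an assumption.

```python
# Resolve the matcher once, when the module is first imported,
# rather than inside the comparison function on every call.
try:
    # Optional dependency: only used if python-Levenshtein is installed.
    from Levenshtein.StringMatcher import StringMatcher as SequenceMatcher
except ImportError:
    from difflib import SequenceMatcher


def compare_text(text, other_text):
    """Hypothetical helper: return a similarity ratio between 0 and 1."""
    # Return 0 if either text is falsy (empty string or None).
    if not text or not other_text:
        return 0

    # Compare the lowercase versions of both strings.
    matcher = SequenceMatcher(None, text.lower(), other_text.lower())
    return round(matcher.ratio(), 2)
```

With this layout the ImportError, if any, is raised and handled a single time at import, and each later call only pays for the comparison itself.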
Good luck!
Thanks for the tips. Could you please make a PR for this?
@zxsimple It looks like avoiding the repeated import is what improves the performance.
But I ran a test and the performance did not seem to improve noticeably. Here's the code:
import time
t1 = time.time()
for i in range(200000):
    from difflib import SequenceMatcher
    similarity = SequenceMatcher(
        None,
        'statement text',
        'other statement_text'
    )
print("Use: ", time.time() - t1)
Output
Use: 1.5159368515014648
Then I made a change:
import time
from difflib import SequenceMatcher
t1 = time.time()
for i in range(200000):
    similarity = SequenceMatcher(
        None,
        'statement text',
        'other statement_text'
    )
print("Use: ", time.time() - t1)
Output
Use: 1.2639625072479248
Did you change anything else in your code?
@vkosuri I'll create a PR after testing it fully.
Yeah, Levenshtein.StringMatcher.StringMatcher and difflib.SequenceMatcher come from different libraries. It may be faster, but I think this try/except is there because ChatterBot supports both Python 2.7 and 3.
@pylobot Yes, I confirmed that I don't have python-Levenshtein installed, so every call goes into the except block, and that is pretty time consuming.
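This also explains why the test above with `from difflib import SequenceMatcher` inside the loop showed little difference: a successful import is served from `sys.modules` after the first time, while a failed import is not cached there and the import system searches again on every call. Here is a rough sketch for measuring it, assuming a made-up module name (`chatterbot_missing_levenshtein`) that stands in for a package that is not installed. The function names are only for illustration and the timings will vary by machine.

```python
import timeit
from difflib import SequenceMatcher as DifflibMatcher


def per_call_import():
    # Mirrors the original pattern: the try/except runs on every call.
    try:
        # Hypothetical name standing in for a package that is not installed.
        from chatterbot_missing_levenshtein import StringMatcher as Matcher
    except ImportError:
        from difflib import SequenceMatcher as Matcher
    return Matcher(None, "statement text", "other statement_text").ratio()


def module_level_import():
    # The matcher was resolved once at the top of the file.
    return DifflibMatcher(None, "statement text", "other statement_text").ratio()


# Time the same number of calls for each variant.
calls = 2000
slow = timeit.timeit(per_call_import, number=calls)
fast = timeit.timeit(module_level_import, number=calls)
print("failed import on every call:", slow)
print("import resolved once:       ", fast)
```

Both functions return the same ratio, so any gap between the two timings comes from the import attempt and the exception handling, not from the comparison.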
@vkosuri The pull request has been created: https://github.com/gunthercox/ChatterBot/pull/1158. Please help review it.
@zxsimple sure
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.