Hello,
I am using run_squad.py to build my own question answering system. The problem is that I want the system to be able to output multiple answers for a question.
The number of answers can be zero, one, or several. How can I change the code to achieve this? Thank you.
Solved.
Can you share how you solved that problem?
@armheb
Sure. Some questions in my dataset have multiple answers, some have one answer, and some have no answer.
First, I added a for loop in the "read_squad_examples" method so that the code reads all answers for each question and builds N SquadExamples per question, where N is the number of answers. (This is specific to my case and you don't have to do it. I needed to use all answers, whereas the original SQuAD code only reads the first answer of each question even if the question has multiple answers.)
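A minimal sketch of that kind of change, assuming SQuAD-style JSON input. The names `QAExample` and `expand_answers` are hypothetical stand-ins for illustration, not the actual SquadExample class or the code used in this thread:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAExample:
    """Hypothetical stand-in for SquadExample (not the class from run_squad.py)."""
    qas_id: str
    question_text: str
    context: str
    answer_text: Optional[str]   # None when the question has no answer
    answer_start: Optional[int]  # character offset of the answer in the context


def expand_answers(squad_data: dict) -> List[QAExample]:
    """Build one example per (question, answer) pair from SQuAD-style JSON.

    A question with N answers yields N examples; a question with no
    answers yields a single example whose answer fields are None.
    """
    examples = []
    for article in squad_data["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                if not answers:
                    examples.append(
                        QAExample(qa["id"], qa["question"], context, None, None)
                    )
                    continue
                # The extra loop: every answer becomes its own example.
                for answer in answers:
                    examples.append(
                        QAExample(
                            qa["id"],
                            qa["question"],
                            context,
                            answer["text"],
                            answer["answer_start"],
                        )
                    )
    return examples
```

Note that all examples built from the same question share one `qas_id` here, so any downstream code that keys on that id would need to account for duplicates.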
run_squad.py produces an "nbest_predictions.json" file, in which the model provides the top 20 candidate answers for each question along with their probabilities, so I simply pick some of those answers according to their probabilities.
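A rough sketch of this selection step, assuming the file maps each question id to a list of candidates that carry "text" and "probability" fields. The function names, the 0.2 threshold, and the cap of 3 answers are hypothetical choices for illustration, not values taken from this thread:

```python
import json
from typing import Dict, List


def load_nbest(path: str) -> Dict[str, List[dict]]:
    """Load an nbest predictions file (assumed: question id -> candidate list)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def select_answers(
    nbest: Dict[str, List[dict]],
    threshold: float = 0.2,  # hypothetical cut-off; tune on a dev set
    max_answers: int = 3,    # hypothetical cap on answers per question
) -> Dict[str, List[str]]:
    """Keep zero, one, or several candidates per question by probability."""
    selected = {}
    for qas_id, candidates in nbest.items():
        ranked = sorted(candidates, key=lambda c: c["probability"], reverse=True)
        picked: List[str] = []
        for candidate in ranked:
            if candidate["probability"] < threshold:
                break  # remaining candidates are even less likely
            text = candidate["text"].strip()
            # Skip empty ("no answer") candidates and duplicate strings.
            if text and text not in picked:
                picked.append(text)
            if len(picked) == max_answers:
                break
        selected[qas_id] = picked  # an empty list means "no answer"
    return selected
```

With this sketch, `select_answers(load_nbest("nbest_predictions.json"))` would return a dictionary from question id to a possibly empty list of answer strings.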
However, I have to admit that the final performance isn't that good. It works, but not very well, though I think it could be improved somehow.
@mushro00om
Hi,
Can you give sample code showing how you used your model for prediction, given a text corpus and a question?
@Swathygsb Hi, sorry for the late reply. Actually, the script I used is not this PyTorch version. I used the TensorFlow version provided by Google, because it is much easier to use and they provide very clear guidance. Here is the link:
https://github.com/google-research/bert
Most of the code remained unchanged. I basically modified the read_squad_examples method to process multiple answers (in my task, a question may have more than one answer, while the original code can only process one answer per question).
So if each of your questions has only one answer, you can simply follow their guidance. If your questions may have more than one answer, you can give me your email and I can send you my code.