I've seen the description of reorder_incremental_state:
This should be called when the order of the input has changed from the
previous time step. A typical use case is beam search, where the input
order changes between time steps based on the choice of beams.
But I'm still confused by this typical use case. What is the new order at each step? Is it ordered by the cumulative score at the current step?
I've made some changes in fconv.py: I added a GRU decoder. That is, I have an fconv decoder and a GRU decoder computing decoder states simultaneously; I concatenate the features the two produce, then fully connect to an output layer of size N_dictionary.
My problem is: training loss and validation loss look normal. However, generation gives BLEU = 0.43 at a training loss of 4.56. Can you give me some hints on the causes? Is it because the GRU and fconv share the same reorder_incremental_state function? PS: I've only modified fconv.py.
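The setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual modified fconv.py; the class and argument names are hypothetical, and the real fairseq decoders take many more arguments:

```python
import torch
import torch.nn as nn

class DualDecoderOutput(nn.Module):
    """Hypothetical sketch: concatenate fconv and GRU decoder features,
    then fully connect to an output layer of vocabulary size."""

    def __init__(self, fconv_dim, gru_dim, vocab_size):
        super().__init__()
        self.fc_out = nn.Linear(fconv_dim + gru_dim, vocab_size)

    def forward(self, fconv_feats, gru_feats):
        # fconv_feats: (batch, time, fconv_dim)
        # gru_feats:   (batch, time, gru_dim)
        combined = torch.cat([fconv_feats, gru_feats], dim=-1)
        return self.fc_out(combined)  # (batch, time, vocab_size)

model = DualDecoderOutput(fconv_dim=4, gru_dim=3, vocab_size=10)
logits = model(torch.randn(2, 5, 4), torch.randn(2, 5, 3))
print(logits.shape)  # torch.Size([2, 5, 10])
```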
During beam search we start with K beams, e.g., A, B, C, D, E for beam=5. Then at the next step we try to extend each beam by one word. For example, we could consider AA, AB, ..., AZ, BA, BB, ..., BZ, ..., ..., ..., EZ and choose the top-K among these, e.g., AF, AG, EG, CF, DF. Now we need to "reorder" the incremental state since we've removed the beam that starts with "B" and shifted the others around. So we would call something like reorder_incremental_state([A, A, E, C, D]) which will reorder any necessary state (e.g., the hidden and cell states for LSTM) to match the new order.
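The reordering above can be sketched with a toy incremental state. This is an illustrative example, not fairseq's actual implementation; the function name matches the API being discussed, but the state layout (a dict of tensors with the beam dimension first) is an assumption for the sake of the demo:

```python
import torch

def reorder_incremental_state(incremental_state, new_order):
    """Sketch: reorder every cached tensor along the beam dimension so
    cached states line up with the beams that survived top-K selection."""
    for key, state in incremental_state.items():
        incremental_state[key] = state.index_select(0, new_order)
    return incremental_state

# 5 beams A..E; row i of the cached hidden state belongs to beam i.
hidden = torch.arange(5).unsqueeze(1).repeat(1, 3).float()
state = {"hidden": hidden}

# After extending and selecting top-K, the surviving prefixes are
# [A, A, E, C, D], i.e., indices [0, 0, 4, 2, 3] into the old beam order.
new_order = torch.tensor([0, 0, 4, 2, 3])
reorder_incremental_state(state, new_order)
print(state["hidden"][:, 0].tolist())  # -> [0.0, 0.0, 4.0, 2.0, 3.0]
```

Beam B's row is dropped and beam A's row is duplicated, exactly mirroring the reorder_incremental_state([A, A, E, C, D]) call described above.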
Re: the GRU decoder, it's hard to help debug without seeing the code. Are you able to share what you have so far?
Thank you, I've found the problem; it has nothing to do with the reorder operation in beam search.