Fairseq: what does reorder_incremental_state do?

Created on 17 Nov 2017  ·  2 comments  ·  Source: pytorch/fairseq

  1. I've seen the descriptions of reorder_incremental_state:

    This should be called when the order of the input has changed from the
    previous time step. A typical use case is beam search, where the input
    order changes between time steps based on the choice of beams.

    But I'm still confused by this typical use case: what is the new order at each step? Is it ordered by the cumulative score at the current step?

  2. I've made some changes in fconv.py: I added a GRU decoder alongside the fconv decoder, so both compute decoder states simultaneously; I then concatenate the features from the two decoders and pass them through a fully connected output layer of size N_dictionary.
    My problem is: training loss and validation loss look normal, but generation gives BLEU = 0.43 at a training loss of 4.56. Can you give me a hint about the cause? Is it because the GRU and fconv decoders share the same reorder_incremental_state function? PS: I've only modified fconv.py.

Most helpful comment

During beam search we start with K beams, e.g., A, B, C, D, E for beam=5. Then at the next step we try to extend each beam by one word. For example, we could consider AA, AB, ..., AZ, BA, BB, ..., BZ, ..., ..., ..., EZ and choose the top-K among these, e.g., AF, AG, EG, CF, DF. Now we need to "reorder" the incremental state since we've removed the beam that starts with "B" and shifted the others around. So we would call something like reorder_incremental_state([A, A, E, C, D]) which will reorder any necessary state (e.g., the hidden and cell states for LSTM) to match the new order.
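The reordering idea can be sketched in a few lines of PyTorch. This is not fairseq's actual API, just an illustration: the hidden-state tensor shape and the `new_order` indices correspond to the A/B/C/D/E example above, where beam B is dropped, A survives twice, and the rest are shifted around.

```python
import torch

# Pretend incremental state: an RNN hidden state of shape
# (num_layers, num_beams, hidden_size). Each beam's rows are filled
# with its index (0=A, 1=B, 2=C, 3=D, 4=E) so reordering is visible.
hidden = torch.arange(5).float().view(1, 5, 1).expand(1, 5, 4).contiguous()

# The surviving hypotheses AF, AG, EG, CF, DF continue old beams
# [A, A, E, C, D], i.e., old beam indices [0, 0, 4, 2, 3].
new_order = torch.tensor([0, 0, 4, 2, 3])

# reorder_incremental_state boils down to an index_select along
# the beam dimension for every cached tensor (hidden/cell states,
# cached convolution inputs, etc.).
reordered = hidden.index_select(1, new_order)
```

Any decoder that caches per-beam state must apply the same `index_select`, otherwise the surviving hypotheses continue from the wrong beams' states.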

Re: the GRU decoder, it's hard to help debug without seeing the code. Are you able to share what you have so far?

All 2 comments

Thank you, I've found the problem; it has nothing to do with the reorder operation in beam search.
