Fairseq: add detailed documentation for the beam search implementation

Created on 27 Feb 2019 · 8 comments · Source: pytorch/fairseq

The SequenceGenerator class is very hard to understand; could someone provide detailed documentation? For example:

        # get the top beam_size active hypotheses, which are just the hypos
        # with the smallest values in active_mask
        active_hypos, _ignore = buffer('active_hypos'), buffer('_ignore')  # [b, k]
        torch.topk(
            active_mask, k=beam_size, dim=1, largest=False,
            out=(_ignore, active_hypos)
        )

        # active_bbsz_idx maps each surviving hypothesis to its row in the
        # flattened (bsz * beam_size, ...) tokens/scores tensors
        active_bbsz_idx = buffer('active_bbsz_idx')
        torch.gather(
            cand_bbsz_idx, dim=1, index=active_hypos,
            out=active_bbsz_idx,
        )
        active_scores = torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores[:, step].view(bsz, beam_size),
        )

        active_bbsz_idx = active_bbsz_idx.view(-1)
        active_scores = active_scores.view(-1)

        # copy tokens and scores for active hypotheses
        torch.index_select(
            tokens[:, :step + 1], dim=0, index=active_bbsz_idx,
            out=tokens_buf[:, :step + 1],
        )
        torch.gather(
            cand_indices, dim=1, index=active_hypos,
            out=tokens_buf.view(bsz, beam_size, -1)[:, :, step + 1],
        )
        if step > 0:
            torch.index_select(
                scores[:, :step], dim=0, index=active_bbsz_idx,
                out=scores_buf[:, :step],
            )
        torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores_buf.view(bsz, beam_size, -1)[:, :, step],
        )

The above vectorized code almost drives me crazy. I know it speeds up computation, but at the cost of readability.
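For what it's worth, the hardest part for me was the flattened batch-times-beam ("bbsz") indexing. A toy sketch of the idea (variable names mirror the snippet; the shapes and values are made up for illustration):

```python
import torch

bsz, beam_size = 2, 2  # assumed toy sizes

# cand_beams: for each sentence, which beam each selected candidate came from
cand_beams = torch.tensor([[1, 0], [0, 1]])

# bbsz offsets turn per-sentence beam indices into row indices of the
# flattened (bsz * beam_size, ...) tokens/scores tensors
bbsz_offsets = (torch.arange(bsz) * beam_size).unsqueeze(1)  # [[0], [2]]
cand_bbsz_idx = cand_beams + bbsz_offsets
# cand_bbsz_idx is [[1, 0], [2, 3]]:
# rows 0-1 belong to sentence 0, rows 2-3 to sentence 1
```

With that mapping in hand, `index_select(tokens, dim=0, index=active_bbsz_idx.view(-1))` is just "copy the token prefixes of the surviving hypotheses".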

Labels: documentation, help wanted

Most helpful comment

@villmow, take a look here: https://github.com/pytorch/fairseq/commit/20bbbdc068dabef1a3a4689dd5a9ccffcc8297c5

It should be a drop-in replacement for SequenceGenerator. It's still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

All 8 comments

We plan to release a simpler (albeit much slower) implementation as an alternative soon. If you want to modify the search code, the simplified version may be a bit easier to work with.

Or, are you trying to understand the fast version? If so, I highly recommend putting a break point (pdb) and stepping through it line by line with a batch of data to see what’s happening. The code you referenced is simply selecting the topk hypotheses that don’t end in eos, and adding the selected tokens/scores to the final tokens/scores tensors.
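To illustrate that selection step with toy shapes (this is a sketch of the trick, not the real buffers): candidates that end in eos are pushed past rank `2 * beam_size` in `active_mask`, so `topk(..., largest=False)` can never pick them.

```python
import torch

bsz, beam_size = 1, 2
# 2 * beam_size candidates per sentence, already sorted by score;
# assume the best-scoring candidate ended in eos
cand_scores = torch.tensor([[9.0, 7.0, 5.0, 3.0]])
eos_mask = torch.tensor([[True, False, False, False]])

# non-eos candidates keep their rank (0 .. 2*beam_size - 1); eos candidates
# are offset past 2*beam_size so topk(largest=False) skips them
active_mask = torch.arange(2 * beam_size).expand(bsz, -1) + eos_mask * (2 * beam_size)

# smallest mask values = best-scoring candidates that did not end in eos
_, active_hypos = torch.topk(active_mask, k=beam_size, dim=1, largest=False)
active_scores = torch.gather(cand_scores, dim=1, index=active_hypos)
# active_hypos is [[1, 2]], active_scores is [[7.0, 5.0]]:
# the eos candidate (score 9.0) goes to the finalized list instead
```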

Hi, could you give me some contact information to discuss the beam search in fairseq?

wx: AsahiNever


thx O(∩_∩)O

Hi, has there been any further update on a more understandable beam search than the one in SequenceGenerator?
Also, what exactly is SequenceScorer doing? From the code, if I'm not wrong, it is just a normal forward pass, equivalent to inference with beam_size=1?

Someone wrote a really nice tutorial about the beam search implementation in fairseq: http://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

We have a slightly simpler beam search implementation, but we'd like to simplify it even further (by removing all batching) before releasing it. In any case this will be much slower than the current vectorized implementation.
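For readers waiting on that release: the algorithm the vectorized code implements, sketched without any batching. Here `step_scores` is a hypothetical stand-in for the model's forward pass (it is not fairseq's API), returning next-token log-probabilities for a given prefix.

```python
import math

def beam_search(step_scores, eos, beam_size, max_len):
    """Unbatched beam search sketch. step_scores(prefix) returns a
    {token: log_prob} dict for the next token given a token prefix."""
    beams = [([], 0.0)]          # (prefix, cumulative log-prob)
    finished = []                # hypotheses that ended in eos
    for _ in range(max_len):
        # expand every live beam by every candidate next token
        candidates = []
        for prefix, score in beams:
            for token, logp in step_scores(prefix).items():
                candidates.append((prefix + [token], score + logp))
        # keep the top beam_size live candidates; finalize eos ones
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
            if len(beams) == beam_size:
                break
        if not beams:            # everything ended in eos
            break
    return max(finished + beams, key=lambda c: c[1])

# toy model: prefers token 1, emits eos (token 0) after two tokens
def step_scores(prefix):
    if len(prefix) >= 2:
        return {0: 0.0}
    return {1: math.log(0.9), 2: math.log(0.1)}

best, score = beam_search(step_scores, eos=0, beam_size=2, max_len=5)
# best is [1, 1, 0]
```

The vectorized SequenceGenerator does exactly this, except that "every live beam" is a single `(bsz * beam_size, ...)` tensor and the sort/filter steps become the `topk`/`gather`/`index_select` calls quoted above.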

Any updates on the release of the simpler beam search implementation?

@villmow, take a look here: https://github.com/pytorch/fairseq/commit/20bbbdc068dabef1a3a4689dd5a9ccffcc8297c5

It should be a drop-in replacement for SequenceGenerator. It's still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

I echo @shawnthu's sentiment: the current SequenceGenerator class, and especially its generate method, is very hard to dissect for an outsider (like me).
