Fairseq: add detailed documentation for the beam search implementation

Created on 27 Feb 2019 · 8 comments · Source: pytorch/fairseq

The SequenceGenerator class is very hard to understand; could someone provide detailed documentation? For example:

        # get the top beam_size active hypotheses, which are just the hypos
        # with the smallest values in active_mask
        active_hypos, _ignore = buffer('active_hypos'), buffer('_ignore')  # [b, k]
        torch.topk(
            active_mask, k=beam_size, dim=1, largest=False,
            out=(_ignore, active_hypos)
        )

        # active_bbsz_idx maps each surviving hypothesis to its row in the
        # flattened (bsz * beam_size, ...) tokens/scores tensors
        active_bbsz_idx = buffer('active_bbsz_idx')
        torch.gather(
            cand_bbsz_idx, dim=1, index=active_hypos,
            out=active_bbsz_idx,
        )
        active_scores = torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores[:, step].view(bsz, beam_size),
        )

        active_bbsz_idx = active_bbsz_idx.view(-1)
        active_scores = active_scores.view(-1)

        # copy tokens and scores for active hypotheses
        torch.index_select(
            tokens[:, :step + 1], dim=0, index=active_bbsz_idx,
            out=tokens_buf[:, :step + 1],
        )
        torch.gather(
            cand_indices, dim=1, index=active_hypos,
            out=tokens_buf.view(bsz, beam_size, -1)[:, :, step + 1],
        )
        if step > 0:
            torch.index_select(
                scores[:, :step], dim=0, index=active_bbsz_idx,
                out=scores_buf[:, :step],
            )
        torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores_buf.view(bsz, beam_size, -1)[:, :, step],
        )

The above vectorized code almost drives me crazy. I know it speeds up computation, but at the cost of readability.
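For what it's worth, the hardest part for me was the flattened batch-times-beam ("bbsz") indexing. A toy sketch of the idea (variable names mirror the snippet; the shapes and values are made up for illustration):

```python
import torch

bsz, beam_size = 2, 2  # assumed toy sizes

# cand_beams: for each sentence, which beam each selected candidate came from
cand_beams = torch.tensor([[1, 0], [0, 1]])

# bbsz offsets turn per-sentence beam indices into row indices of the
# flattened (bsz * beam_size, ...) tokens/scores tensors
bbsz_offsets = (torch.arange(bsz) * beam_size).unsqueeze(1)  # [[0], [2]]
cand_bbsz_idx = cand_beams + bbsz_offsets
# cand_bbsz_idx is [[1, 0], [2, 3]]:
# rows 0-1 belong to sentence 0, rows 2-3 to sentence 1
```

With that mapping in hand, `index_select(tokens, dim=0, index=active_bbsz_idx.view(-1))` is just "copy the token prefixes of the surviving hypotheses".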

Labels: documentation, help wanted

Most helpful comment

@villmow, take a look here: https://github.com/pytorch/fairseq/commit/20bbbdc068dabef1a3a4689dd5a9ccffcc8297c5

It should be a drop-in replacement for SequenceGenerator. It's still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

All 8 comments

We plan to release a simpler (albeit much slower) implementation as an alternative soon. If you want to modify the search code, the simplified version may be a bit easier to work with.

Or, are you trying to understand the fast version? If so, I highly recommend putting a break point (pdb) and stepping through it line by line with a batch of data to see what’s happening. The code you referenced is simply selecting the topk hypotheses that don’t end in eos, and adding the selected tokens/scores to the final tokens/scores tensors.
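To illustrate that selection step with toy shapes (this is a sketch of the trick, not the real buffers): candidates that end in eos are pushed past rank `2 * beam_size` in `active_mask`, so `topk(..., largest=False)` can never pick them.

```python
import torch

bsz, beam_size = 1, 2
# 2 * beam_size candidates per sentence, already sorted by score;
# assume the best-scoring candidate ended in eos
cand_scores = torch.tensor([[9.0, 7.0, 5.0, 3.0]])
eos_mask = torch.tensor([[True, False, False, False]])

# non-eos candidates keep their rank (0 .. 2*beam_size - 1); eos candidates
# are offset past 2*beam_size so topk(largest=False) skips them
active_mask = torch.arange(2 * beam_size).expand(bsz, -1) + eos_mask * (2 * beam_size)

# smallest mask values = best-scoring candidates that did not end in eos
_, active_hypos = torch.topk(active_mask, k=beam_size, dim=1, largest=False)
active_scores = torch.gather(cand_scores, dim=1, index=active_hypos)
# active_hypos is [[1, 2]], active_scores is [[7.0, 5.0]]:
# the eos candidate (score 9.0) goes to the finalized list instead
```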

Hi, could you give me some contact information to discuss the beam search in fairseq?

wx: AsahiNever


thx O(∩_∩)O

Hi, has there been any further update on a more understandable beam search than the one in SequenceGenerator?
Also, what exactly is SequenceScorer doing? From the code, if I'm not wrong, it is just a normal forward pass, equivalent to inference with beam_size=1?

Someone wrote a really nice tutorial about the beam search implementation in fairseq: http://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

We have a slightly simpler beam search implementation, but we'd like to simplify it even further (by removing all batching) before releasing it. In any case this will be much slower than the current vectorized implementation.
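For readers waiting on that release: the algorithm the vectorized code implements, sketched without any batching. Here `step_scores` is a hypothetical stand-in for the model's forward pass (it is not fairseq's API), returning next-token log-probabilities for a given prefix.

```python
import math

def beam_search(step_scores, eos, beam_size, max_len):
    """Unbatched beam search sketch. step_scores(prefix) returns a
    {token: log_prob} dict for the next token given a token prefix."""
    beams = [([], 0.0)]          # (prefix, cumulative log-prob)
    finished = []                # hypotheses that ended in eos
    for _ in range(max_len):
        # expand every live beam by every candidate next token
        candidates = []
        for prefix, score in beams:
            for token, logp in step_scores(prefix).items():
                candidates.append((prefix + [token], score + logp))
        # keep the top beam_size live candidates; finalize eos ones
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
            if len(beams) == beam_size:
                break
        if not beams:            # everything ended in eos
            break
    return max(finished + beams, key=lambda c: c[1])

# toy model: prefers token 1, emits eos (token 0) after two tokens
def step_scores(prefix):
    if len(prefix) >= 2:
        return {0: 0.0}
    return {1: math.log(0.9), 2: math.log(0.1)}

best, score = beam_search(step_scores, eos=0, beam_size=2, max_len=5)
# best is [1, 1, 0]
```

The vectorized SequenceGenerator does exactly this, except that "every live beam" is a single `(bsz * beam_size, ...)` tensor and the sort/filter steps become the `topk`/`gather`/`index_select` calls quoted above.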

Any updates on the release of the simpler beam search implementation?

@villmow, take a look here: https://github.com/pytorch/fairseq/commit/20bbbdc068dabef1a3a4689dd5a9ccffcc8297c5

It should be a drop-in replacement for SequenceGenerator. It's still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

I echo @shawnthu's sentiment: the current SequenceGenerator class, and especially its generate method, is very hard to dissect for an outsider (like me).
