I trained a model to translate EN-FR using your code. Now I would like to load it and run it on my own dataset. However, I haven't found the right way to load the model in order to extract the embedding layer. After loading the model, I expected to see the specification of the architecture (see the example below), indicating which layer I have to access to get the embeddings:
EncoderDecoder(
(encoder): Encoder(
(layers): ModuleList(
(0): EncoderLayer(
(self_attn): MultiHeadedAttention(
(linears): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=512, bias=True)
(3): Linear(in_features=512, out_features=512, bias=True)
)
(dropout): Dropout(p=0.1, inplace=False)
....
)
(src_embed): Sequential(
(0): Embeddings(
(lut): Embedding(40834, 512)
)
(1): PositionalEncoding(
(dropout): Dropout(p=0.1, inplace=False)
)
)
(tgt_embed): Sequential(
(0): Embeddings(
(lut): Embedding(30337, 512)
)
(1): PositionalEncoding(
(dropout): Dropout(p=0.1, inplace=False)
)
)
(generator): Generator(
(proj): Linear(in_features=512, out_features=30337, bias=True)
)
)
However, what I get from loading the model trained with your code looks like this:
model_fairseq = torch.load('./checkpoint_best.pt')
list(model_fairseq['model'])
['encoder.version',
'encoder.embed_tokens.weight',
'encoder.embed_positions._float_tensor',
'encoder.layers.0.self_attn.k_proj.weight',
'encoder.layers.0.self_attn.k_proj.bias',
'encoder.layers.0.self_attn.v_proj.weight',
'encoder.layers.0.self_attn.v_proj.bias',
'encoder.layers.0.self_attn.q_proj.weight',
'encoder.layers.0.self_attn.q_proj.bias',
'encoder.layers.0.self_attn.out_proj.weight',
'encoder.layers.0.self_attn.out_proj.bias',
'encoder.layers.0.self_attn_layer_norm.weight',
'encoder.layers.0.self_attn_layer_norm.bias',
'encoder.layers.0.fc1.weight',
'encoder.layers.0.fc1.bias',
'encoder.layers.0.fc2.weight',
'encoder.layers.0.fc2.bias',
'encoder.layers.0.final_layer_norm.weight',
...
How you installed fairseq (pip, source): compiled from source

Any idea about how I can unpack your model to get the same structure as the example I showed?
I would be happy with being able to do a forward pass through your embedding layer for the source and target languages.
Thanks
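For reference, the checkpoint loaded with torch.load is a plain dict of tensors, so the raw embedding weights can already be used for a forward pass without rebuilding the full model. A minimal sketch, with toy tensors standing in for the real checkpoint (the keys mirror the listing above, the shapes are illustrative):

```python
import torch
import torch.nn as nn

# Toy stand-in for torch.load('checkpoint_best.pt')['model'] -- only the
# two embedding keys from the listing above, with small made-up shapes.
state = {
    'encoder.embed_tokens.weight': torch.randn(100, 16),  # (src_vocab, embed_dim)
    'decoder.embed_tokens.weight': torch.randn(80, 16),   # (tgt_vocab, embed_dim)
}

# Wrap the raw weight in an nn.Embedding so it supports a forward pass.
src_embed = nn.Embedding.from_pretrained(state['encoder.embed_tokens.weight'])

tokens = torch.tensor([[3, 7, 42]])  # fake batch of source token ids
vectors = src_embed(tokens)          # shape (1, 3, 16)
```

Note that this skips the positional encoding and any embedding scaling the full model applies on top of the lookup, so it only reproduces the bare `embed_tokens` step.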
You'll need to instantiate the model object and then load the state_dict. The easiest way to do this is probably to use the torch hub interface. Following this example, you should be able to do:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'/path/to/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path=<your data path>,
bpe=...,
bpe_codes=...
)
Hi,
I am not sure what the bpe and bpe_codes arguments should be there. This is the script I used to train your model:
fairseq-preprocess \
--source-lang en --target-lang fr \
--trainpref bpe.32k/train \
--validpref bpe.32k/valid \
--testpref bpe.32k/test \
--align-suffix align \
--destdir binarized/ \
--joined-dictionary \
--workers 32
fairseq-train \
binarized \
--arch transformer_wmt_en_de_big_align --share-all-embeddings \
--optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --activation-fn relu \
--lr 0.0002 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
--dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 \
--max-tokens 3500 --label-smoothing 0.1 \
--save-dir ./checkpoints --log-interval 1000 --max-update 60000 \
--keep-interval-updates -1 --save-interval-updates 0 \
--load-alignments --criterion label_smoothed_cross_entropy_with_alignment \
--fp16
paste bpe.32k/train.en bpe.32k/train.fr | awk -F '\t' '{print $1 " ||| " $2}' > bpe.32k/train.en-fr
$ALIGN -i bpe.32k/train.en-fr -d -o -v > bpe.32k/train.align
So is bpe = 'bpe.32k/train.en'?
Thanks
Assuming that you followed this example (https://github.com/pytorch/fairseq/tree/master/examples/joint_alignment_translation), then you should do:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'/path/to/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path=<your data path>,
bpe='fastbpe',
bpe_codes=<bpe_file>
)
Where <bpe_file> is the file that gets created in this line: https://github.com/pytorch/fairseq/blob/master/examples/joint_alignment_translation/prepare-wmt18en2de_no_norm_no_escape_no_agressive.sh#L117
Ok, I think it is getting closer now. Here are the inputs I am passing:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
bpe= './joint_alignment_translation/fastBPE',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
I am getting an error in the bpe_codes.
Any suggestion??
Thanks!
The error message would be helpful, but I think it should be:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
Sorry. Here is the error message:
CompressionError Traceback (most recent call last)
25 data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
26 bpe= './joint_alignment_translation/fastBPE',
---> 27 bpe_codes='./joint_alignment_translation/bpe.32k/codes'
28 )
/fairseq/models/fairseq_model.py in from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
172 data_name_or_path,
173 archive_map=cls.hub_models(),
--> 174 **kwargs,
175 )
176 print(x['args'])
/fairseq/fairseq/hub_utils.py in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
52 kwargs['data'] = os.path.abspath(os.path.join(model_path, data_name_or_path))
53 else:
---> 54 kwargs['data'] = file_utils.load_archive_file(data_name_or_path)
55 for file, arg in {
56 'code': 'bpe_codes',
/fairseq/fairseq/file_utils.py in load_archive_file(archive_file)
78 resolved_archive_file, tempdir))
79 ext = os.path.splitext(archive_file)[1][1:]
---> 80 with tarfile.open(resolved_archive_file, 'r:' + ext) as archive:
81 top_dir = os.path.commonprefix(archive.getnames())
82 archive.extractall(tempdir)
~/.conda/envs/.../lib/python3.7/tarfile.py in open(cls, name, mode, fileobj, bufsize, **kwargs)
1588 func = getattr(cls, cls.OPEN_METH[comptype])
1589 else:
-> 1590 raise CompressionError("unknown compression type %r" % comptype)
1591 return func(name, filemode, fileobj, **kwargs)
1592
CompressionError: unknown compression type 'en'
Any further suggestions? :)
What directory contains your binarized data (i.e., the output of your fairseq-preprocess command)? It should be:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='<path to binarized data>',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
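For what it's worth, the CompressionError above is consistent with data_name_or_path pointing at a plain file: per the traceback, when the path is not resolved as a data directory, fairseq falls back to file_utils.load_archive_file, which builds a tarfile mode from the file extension. With a file ending in .en, tarfile is asked for compression type 'en'. A stdlib-only reproduction of that step (the filename is just the one from the failing call):

```python
import os
import tarfile

# Mirrors lines 79-80 of fairseq/file_utils.py as shown in the traceback.
archive_file = 'train.tags.en-fr.tok.en'
ext = os.path.splitext(archive_file)[1][1:]   # -> 'en'

err = None
try:
    # tarfile rejects the unknown "compression type" before even touching the file
    tarfile.open(archive_file, 'r:' + ext)
except tarfile.CompressionError as exc:
    err = exc                                  # unknown compression type 'en'
```

That is presumably why pointing data_name_or_path at the binarized directory avoids the error: a directory never reaches the archive-extraction branch.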
@ahof1704, did this get resolved?
Hi Matt, sorry for the delay in replying. The problem persists... I think I fixed the path to the binarized data now:
import torch
import os
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./fairseq/examples/joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./fairseq/examples/joint_alignment_translation/binarized',
bpe= './fairseq/examples/joint_alignment_translation/fastBPE',
bpe_codes='./fairseq/examples/joint_alignment_translation/bpe.32k/codes'
)
But here is the error message:
KeyError Traceback (most recent call last)
27 data_name_or_path='./fairseq/examples/joint_alignment_translation/binarized',
28 bpe= './fairseq/examples/joint_alignment_translation/fastBPE',
---> 29 bpe_codes='./fairseq/examples/joint_alignment_translation/bpe.32k/codes'
30 )
./fairseq/fairseq/models/fairseq_model.py in from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
172 data_name_or_path,
173 archive_map=cls.hub_models(),
--> 174 **kwargs,
175 )
176 print(x['args'])
./fairseq/fairseq/hub_utils.py in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
67 models, args, task = checkpoint_utils.load_model_ensemble_and_task(
68 [os.path.join(model_path, cpt) for cpt in checkpoint_file.split(':')],
---> 69 arg_overrides=kwargs,
70 )
71
./fairseq/fairseq/checkpoint_utils.py in load_model_ensemble_and_task(filenames, arg_overrides, task)
176 if not os.path.exists(filename):
177 raise IOError("Model file not found: {}".format(filename))
--> 178 state = load_checkpoint_to_cpu(filename, arg_overrides)
179
180 args = state["args"]
./fairseq/fairseq/checkpoint_utils.py in load_checkpoint_to_cpu(path, arg_overrides)
152 for arg_name, arg_val in arg_overrides.items():
153 setattr(args, arg_name, arg_val)
--> 154 state = _upgrade_state_dict(state)
155 return state
156
./fairseq/fairseq/checkpoint_utils.py in _upgrade_state_dict(state)
339 choice = getattr(state["args"], registry_name, None)
340 if choice is not None:
--> 341 cls = REGISTRY["registry"][choice]
342 registry.set_defaults(state["args"], cls)
343
KeyError: './fairseq/examples/joint_alignment_translation/fastBPE'
Do you see anything else wrong?
Any updates?
Yes, again I would suggest the following:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='<path to binarized data>',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
Note that the bpe argument should be 'fastbpe', not './fairseq/examples/joint_alignment_translation/fastBPE'.
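The earlier KeyError fits the same picture: per the traceback, _upgrade_state_dict uses the bpe value as a key into a registry of tokenizer names, so a filesystem path can never match. A toy illustration of that lookup (the dict here is a stand-in, not fairseq's actual registry; only the 'fastbpe' name is confirmed by this thread):

```python
# Illustrative stand-in for the REGISTRY["registry"][choice] lookup
# shown in the KeyError traceback.
BPE_REGISTRY = {'fastbpe': 'fastBPE wrapper class'}

bad_choice = './fairseq/examples/joint_alignment_translation/fastBPE'

missing = None
try:
    BPE_REGISTRY[bad_choice]      # a path is not a registered name -> KeyError
except KeyError as exc:
    missing = exc

ok = BPE_REGISTRY['fastbpe']      # the registered name resolves fine
```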