I trained a model to translate EN-FR using your code. Now I would like to load it and run it on my own dataset. However, I haven't found the right way to load the model in order to extract the embedding layer. After loading the model, I expected to see the specification of the architecture (see the example below), indicating which layer I have to access to get the embeddings:
EncoderDecoder(
(encoder): Encoder(
(layers): ModuleList(
(0): EncoderLayer(
(self_attn): MultiHeadedAttention(
(linears): ModuleList(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): Linear(in_features=512, out_features=512, bias=True)
(2): Linear(in_features=512, out_features=512, bias=True)
(3): Linear(in_features=512, out_features=512, bias=True)
)
(dropout): Dropout(p=0.1, inplace=False)
....
)
(src_embed): Sequential(
(0): Embeddings(
(lut): Embedding(40834, 512)
)
(1): PositionalEncoding(
(dropout): Dropout(p=0.1, inplace=False)
)
)
(tgt_embed): Sequential(
(0): Embeddings(
(lut): Embedding(30337, 512)
)
(1): PositionalEncoding(
(dropout): Dropout(p=0.1, inplace=False)
)
)
(generator): Generator(
(proj): Linear(in_features=512, out_features=30337, bias=True)
)
)
However, what I get from loading the model trained with your code looks like this:
model_fairseq = torch.load('./checkpoint_best.pt')
list(model_fairseq['model'])
['encoder.version',
'encoder.embed_tokens.weight',
'encoder.embed_positions._float_tensor',
'encoder.layers.0.self_attn.k_proj.weight',
'encoder.layers.0.self_attn.k_proj.bias',
'encoder.layers.0.self_attn.v_proj.weight',
'encoder.layers.0.self_attn.v_proj.bias',
'encoder.layers.0.self_attn.q_proj.weight',
'encoder.layers.0.self_attn.q_proj.bias',
'encoder.layers.0.self_attn.out_proj.weight',
'encoder.layers.0.self_attn.out_proj.bias',
'encoder.layers.0.self_attn_layer_norm.weight',
'encoder.layers.0.self_attn_layer_norm.bias',
'encoder.layers.0.fc1.weight',
'encoder.layers.0.fc1.bias',
'encoder.layers.0.fc2.weight',
'encoder.layers.0.fc2.bias',
'encoder.layers.0.final_layer_norm.weight',
...
How you installed fairseq (pip, source): compiled from source

Any idea about how I can unpack your model to get the same structure as the example I showed?
I would be happy with being able to do a forward pass through your embedding layer for the source and target languages.
Thanks
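For reference, the checkpoint loaded with torch.load is a plain dict of tensors, so the raw embedding weights can already be used for a forward pass without rebuilding the full model. A minimal sketch, with toy tensors standing in for the real checkpoint (the keys mirror the listing above, the shapes are illustrative):

```python
import torch
import torch.nn as nn

# Toy stand-in for torch.load('checkpoint_best.pt')['model'] -- only the
# two embedding keys from the listing above, with small made-up shapes.
state = {
    'encoder.embed_tokens.weight': torch.randn(100, 16),  # (src_vocab, embed_dim)
    'decoder.embed_tokens.weight': torch.randn(80, 16),   # (tgt_vocab, embed_dim)
}

# Wrap the raw weight in an nn.Embedding so it supports a forward pass.
src_embed = nn.Embedding.from_pretrained(state['encoder.embed_tokens.weight'])

tokens = torch.tensor([[3, 7, 42]])  # fake batch of source token ids
vectors = src_embed(tokens)          # shape (1, 3, 16)
```

Note that this skips the positional encoding and any embedding scaling the full model applies on top of the lookup, so it only reproduces the bare `embed_tokens` step.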
You'll need to instantiate the model object and then load the state_dict. The easiest way to do this is probably to use the torch hub interface. Following this example, you should be able to do:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'/path/to/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path=<your data path>,
bpe=...,
bpe_codes=...
)
Hi,
I am not sure what the bpe and bpe_codes arguments should be there. This is the script I used to train your model:
fairseq-preprocess \
--source-lang en --target-lang fr \
--trainpref bpe.32k/train \
--validpref bpe.32k/valid \
--testpref bpe.32k/test \
--align-suffix align \
--destdir binarized/ \
--joined-dictionary \
--workers 32
fairseq-train \
binarized \
--arch transformer_wmt_en_de_big_align --share-all-embeddings \
--optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --activation-fn relu \
--lr 0.0002 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
--dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 \
--max-tokens 3500 --label-smoothing 0.1 \
--save-dir ./checkpoints --log-interval 1000 --max-update 60000 \
--keep-interval-updates -1 --save-interval-updates 0 \
--load-alignments --criterion label_smoothed_cross_entropy_with_alignment \
--fp16
paste bpe.32k/train.en bpe.32k/train.fr | awk -F '\t' '{print $1 " ||| " $2}' > bpe.32k/train.en-fr
$ALIGN -i bpe.32k/train.en-fr -d -o -v > bpe.32k/train.align
So is bpe = 'bpe.32k/train.en'?
Thanks
Assuming that you followed this example (https://github.com/pytorch/fairseq/tree/master/examples/joint_alignment_translation), then you should do:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'/path/to/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path=<your data path>,
bpe='fastbpe',
bpe_codes=<bpe_file>
)
Where <bpe_file> is the file that gets created in this line: https://github.com/pytorch/fairseq/blob/master/examples/joint_alignment_translation/prepare-wmt18en2de_no_norm_no_escape_no_agressive.sh#L117
Ok, I think it is getting closer now. Here are the inputs I am passing:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
bpe= './joint_alignment_translation/fastBPE',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
I am getting an error in the bpe_codes.
Any suggestion??
Thanks!
The error message would be helpful, but I think it should be:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
Sorry. Here is the error message:
CompressionError Traceback (most recent call last)
25 data_name_or_path='./joint_alignment_translation/wmt18_en_fr/tmp/train.tags.en-fr.tok.en',
26 bpe= './joint_alignment_translation/fastBPE',
---> 27 bpe_codes='./joint_alignment_translation/bpe.32k/codes'
28 )
/fairseq/models/fairseq_model.py in from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
172 data_name_or_path,
173 archive_map=cls.hub_models(),
--> 174 **kwargs,
175 )
176 print(x['args'])
/fairseq/fairseq/hub_utils.py in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
52 kwargs['data'] = os.path.abspath(os.path.join(model_path, data_name_or_path))
53 else:
---> 54 kwargs['data'] = file_utils.load_archive_file(data_name_or_path)
55 for file, arg in {
56 'code': 'bpe_codes',
/fairseq/fairseq/file_utils.py in load_archive_file(archive_file)
78 resolved_archive_file, tempdir))
79 ext = os.path.splitext(archive_file)[1][1:]
---> 80 with tarfile.open(resolved_archive_file, 'r:' + ext) as archive:
81 top_dir = os.path.commonprefix(archive.getnames())
82 archive.extractall(tempdir)
~/.conda/envs/.../lib/python3.7/tarfile.py in open(cls, name, mode, fileobj, bufsize, **kwargs)
1588 func = getattr(cls, cls.OPEN_METH[comptype])
1589 else:
-> 1590 raise CompressionError("unknown compression type %r" % comptype)
1591 return func(name, filemode, fileobj, **kwargs)
1592
CompressionError: unknown compression type 'en'
Any further suggestions? :)
What directory contains your binarized data (i.e., the output of your fairseq-preprocess command)? It should be:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='<path to binarized data>',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
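For what it's worth, the CompressionError above is consistent with data_name_or_path pointing at a plain file: per the traceback, when the path is not resolved as a data directory, fairseq falls back to file_utils.load_archive_file, which builds a tarfile mode from the file extension. With a file ending in .en, tarfile is asked for compression type 'en'. A stdlib-only reproduction of that step (the filename is just the one from the failing call):

```python
import os
import tarfile

# Mirrors lines 79-80 of fairseq/file_utils.py as shown in the traceback.
archive_file = 'train.tags.en-fr.tok.en'
ext = os.path.splitext(archive_file)[1][1:]   # -> 'en'

err = None
try:
    # tarfile rejects the unknown "compression type" before even touching the file
    tarfile.open(archive_file, 'r:' + ext)
except tarfile.CompressionError as exc:
    err = exc                                  # unknown compression type 'en'
```

That is presumably why pointing data_name_or_path at the binarized directory avoids the error: a directory never reaches the archive-extraction branch.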
@ahof1704, did this get resolved?
Hi Matt, sorry for the delay in replying. The problem persists... I think I fixed the path to the binarized data now:
import torch
import os
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./fairseq/examples/joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='./fairseq/examples/joint_alignment_translation/binarized',
bpe= './fairseq/examples/joint_alignment_translation/fastBPE',
bpe_codes='./fairseq/examples/joint_alignment_translation/bpe.32k/codes'
)
But here is the error message:
KeyError Traceback (most recent call last)
27 data_name_or_path='./fairseq/examples/joint_alignment_translation/binarized',
28 bpe= './fairseq/examples/joint_alignment_translation/fastBPE',
---> 29 bpe_codes='./fairseq/examples/joint_alignment_translation/bpe.32k/codes'
30 )
./fairseq/fairseq/models/fairseq_model.py in from_pretrained(cls, model_name_or_path, checkpoint_file, data_name_or_path, **kwargs)
172 data_name_or_path,
173 archive_map=cls.hub_models(),
--> 174 **kwargs,
175 )
176 print(x['args'])
./fairseq/fairseq/hub_utils.py in from_pretrained(model_name_or_path, checkpoint_file, data_name_or_path, archive_map, **kwargs)
67 models, args, task = checkpoint_utils.load_model_ensemble_and_task(
68 [os.path.join(model_path, cpt) for cpt in checkpoint_file.split(':')],
---> 69 arg_overrides=kwargs,
70 )
71
./fairseq/fairseq/checkpoint_utils.py in load_model_ensemble_and_task(filenames, arg_overrides, task)
176 if not os.path.exists(filename):
177 raise IOError("Model file not found: {}".format(filename))
--> 178 state = load_checkpoint_to_cpu(filename, arg_overrides)
179
180 args = state["args"]
./fairseq/fairseq/checkpoint_utils.py in load_checkpoint_to_cpu(path, arg_overrides)
152 for arg_name, arg_val in arg_overrides.items():
153 setattr(args, arg_name, arg_val)
--> 154 state = _upgrade_state_dict(state)
155 return state
156
./fairseq/fairseq/checkpoint_utils.py in _upgrade_state_dict(state)
339 choice = getattr(state["args"], registry_name, None)
340 if choice is not None:
--> 341 cls = REGISTRY["registry"][choice]
342 registry.set_defaults(state["args"], cls)
343
KeyError: './fairseq/examples/joint_alignment_translation/fastBPE'
Do you see anything else wrong?
Any updates?
Yes, again I would suggest the following:
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained(
'./joint_alignment_translation/checkpoints',
checkpoint_file='checkpoint_best.pt',
data_name_or_path='<path to binarized data>',
bpe= 'fastbpe',
bpe_codes='./joint_alignment_translation/bpe.32k/codes'
)
Note that the bpe argument should be 'fastbpe', not './fairseq/examples/joint_alignment_translation/fastBPE'.
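The earlier KeyError fits the same picture: per the traceback, _upgrade_state_dict uses the bpe value as a key into a registry of tokenizer names, so a filesystem path can never match. A toy illustration of that lookup (the dict here is a stand-in, not fairseq's actual registry; only the 'fastbpe' name is confirmed by this thread):

```python
# Illustrative stand-in for the REGISTRY["registry"][choice] lookup
# shown in the KeyError traceback.
BPE_REGISTRY = {'fastbpe': 'fastBPE wrapper class'}

bad_choice = './fairseq/examples/joint_alignment_translation/fastBPE'

missing = None
try:
    BPE_REGISTRY[bad_choice]      # a path is not a registered name -> KeyError
except KeyError as exc:
    missing = exc

ok = BPE_REGISTRY['fastbpe']      # the registered name resolves fine
```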