Transformers: How to train from scratch

Created on 3 Nov 2019 · 4 Comments · Source: huggingface/transformers

I would like to train a model from scratch, using the same architecture as GPT-2.
How can I drop the pretrained weights?

anandhperumal

👍1

All 4 comments

If you want to randomly initialize a model, simply create it via its constructor rather than via the from_pretrained method:

    from transformers import GPT2Config, GPT2Model

    config = GPT2Config()  # define your configuration here
    model = GPT2Model(config)  # initialize your model from your config

LysandreJik on 5 Nov 2019

👍4
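
To make the "train from scratch" part concrete, here is a minimal sketch of a single training step on a randomly initialized model. It assumes `GPT2LMHeadModel` (the variant with a language-modeling head, not mentioned in the thread) and PyTorch; the tiny hyperparameters, the learning rate, and the random batch are placeholders chosen only so the example runs quickly on a CPU.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Deliberately tiny, hypothetical hyperparameters (not the GPT-2 defaults).
# bos/eos ids are set to 0 only so they stay inside the small vocabulary.
config = GPT2Config(
    vocab_size=1000, n_positions=64, n_embd=32, n_layer=2, n_head=2,
    bos_token_id=0, eos_token_id=0,
)
model = GPT2LMHeadModel(config)  # constructor: random weights, nothing is downloaded

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Stand-in batch of random token ids; real training would use tokenized text.
input_ids = torch.randint(0, config.vocab_size, (4, 16))

model.train()
outputs = model(input_ids=input_ids, labels=input_ids)
loss = outputs[0]  # when labels are passed, the first output is the LM loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

loss_value = loss.item()
print(f"loss after one step: {loss_value:.3f}")
```

A real run would wrap this step in a loop over a tokenized dataset and would probably use a learning-rate schedule.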

@LysandreJik Thanks for the input.
I did something like this

    config = GPT2Config(vocab_size)
    model = GPT2Model(config)

Apart from the vocab size, I'm keeping everything else at its default value. How do I make sure that the model doesn't contain any pretrained values?

anandhperumal on 6 Nov 2019

Pretrained weights are only loaded if you instantiate the model by calling `GPT2Model.from_pretrained`, so you're fine 🙂

rlouf on 6 Nov 2019

👍3
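
For anyone who wants to check this empirically, one rough sanity check is to build two models from the same config and compare their weights: two independent random initializations should differ, whereas two loads of the same checkpoint would be identical. This is only a sketch; the small config values are hypothetical, and the check uses the token embedding matrix as a representative parameter.

```python
import torch
from transformers import GPT2Config, GPT2Model

# Small, hypothetical config so the check is fast; only vocab_size mirrors the thread.
config = GPT2Config(
    vocab_size=1000, n_positions=64, n_embd=32, n_layer=2, n_head=2,
    bos_token_id=0, eos_token_id=0,
)

# Two models built through the constructor are initialized independently.
model_a = GPT2Model(config)
model_b = GPT2Model(config)

emb_a = model_a.get_input_embeddings().weight
emb_b = model_b.get_input_embeddings().weight

# Independent random draws should not match element for element.
weights_differ = not torch.equal(emb_a, emb_b)
print("token embeddings differ between the two models:", weights_differ)
```

Fixing the seed with `torch.manual_seed` before each constructor call would make the two initializations match, which can be useful for reproducible experiments.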

@rlouf Thanks

anandhperumal on 6 Nov 2019
