
I'm trying to use Hugging Face's pretrained transformers model bert-base-uncased, but I want to increase the dropout. There is no mention of this in the from_pretrained method, but Colab ran the object instantiation below without any problem. I saw these dropout parameters in the transformers.BertConfig documentation.

Am I using bert-base-uncased AND changing dropout in the correct way?

model = BertForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path='bert-base-uncased',
    num_labels=2,
    output_attentions=False,
    output_hidden_states=False,
    attention_probs_dropout_prob=0.5,
    hidden_dropout_prob=0.5,
)

3 Answers


As Elidor00 already said, your assumption is correct. Similarly, I would suggest using a separate Config because it is easier to export and less prone to cause errors. Additionally, someone in the comments asked how to use it via from_pretrained:

from transformers import BertModel, AutoConfig

configuration = AutoConfig.from_pretrained('bert-base-uncased')
configuration.hidden_dropout_prob = 0.5
configuration.attention_probs_dropout_prob = 0.5
        
bert_model = BertModel.from_pretrained(
    pretrained_model_name_or_path='bert-base-uncased',
    config=configuration,
)

Yes, this is correct, but note that there are two dropout parameters, and that you are using a specific BERT model, namely BertForSequenceClassification.

Also, as suggested by the documentation, you could first define the configuration and then build the model in the following way:

from transformers import BertModel, BertConfig

# Initializing a BERT bert-base-uncased style configuration
configuration = BertConfig()

# Initializing a model from the bert-base-uncased style configuration
model = BertModel(configuration)

# Accessing the model configuration
configuration = model.config
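A quick, offline sanity check (no weight download needed) that the config values actually propagate into the model's dropout layers; note that a BertModel built directly from a config like this is randomly initialized:

```python
from transformers import BertConfig, BertModel
import torch.nn as nn

# Build a config with the higher dropout; instantiating BertModel
# directly gives random weights, which is enough to inspect the layers
config = BertConfig(hidden_dropout_prob=0.5, attention_probs_dropout_prob=0.5)
model = BertModel(config)

# Every nn.Dropout module in the model should now use p=0.5
dropout_ps = {m.p for m in model.modules() if isinstance(m, nn.Dropout)}
print(dropout_ps)
```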
  • how does it work with the from_pretrained() function? Apr 18, 2021 at 8:56
  • I recommend you look at this example to understand how it works: a simple example of token classification made with BERT.
    – Elidor00
    Apr 18, 2021 at 15:07
  • maybe you can change your answer to include a working example that answers OP question with your suggested config solution (Personally I tried passing a config to the from_pretrained function and had some problems, but maybe I was doing it wrong) Apr 18, 2021 at 15:12
  • The answer on how to use from_pretrained() for config, model and tokenizer can be found in the link I sent, exactly here. If anything is unclear, I advise you to ask a new question: explain your problem well and include the code you are using; otherwise it is very difficult to help you.
    – Elidor00
    Apr 18, 2021 at 15:18

What you did is just fine, but for the classifier layer there is another parameter that needs to be set, called classifier_dropout. If you don't specify it, hidden_dropout_prob is used as the fallback value.
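A small sketch of the difference (classifier_dropout is a standard BertConfig field in recent transformers versions; the model here is randomly initialized, so no download is needed):

```python
from transformers import BertConfig, BertForSequenceClassification

# classifier_dropout applies only to the classification head;
# when it is None (the default), hidden_dropout_prob is used instead
config = BertConfig(
    num_labels=2,
    hidden_dropout_prob=0.5,
    attention_probs_dropout_prob=0.5,
    classifier_dropout=0.3,
)
model = BertForSequenceClassification(config)  # random weights, no download

# The dropout just before the classification head uses classifier_dropout
print(model.dropout.p)  # 0.3
```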
