BERT output is not deterministic. I expect the output values to be deterministic when I give the same input, but my BERT model keeps changing its values. Strangely, the outputs alternate between two values: one value comes out, then a different one, then the first value again, and it repeats. How can I make the output deterministic? Let me show snippets of my code. I use the model as below.

For the BERT implementation, I use Hugging Face's PyTorch implementation of BERT, which is quite a famous model implementation in the PyTorch area. [link] https://github.com/huggingface/pytorch-pretrained-BERT/

        # from the pytorch-pretrained-BERT package
        from pytorch_pretrained_bert import BertTokenizer, BertModel

        tokenizer = BertTokenizer.from_pretrained(self.bert_type, do_lower_case=self.do_lower_case, cache_dir=self.bert_cache_path)
        pretrain_bert = BertModel.from_pretrained(self.bert_type, cache_dir=self.bert_cache_path)
        bert_config = pretrain_bert.config

I get the output like this:

        all_encoder_layer, pooled_output = self.model_bert(all_input_ids, all_segment_ids, all_input_mask)

        # all_encoder_layer: BERT outputs from all layers.
        # pooled_output: output of [CLS] vec.

pooled_output

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

tensor([[-3.3997e-01,  2.6870e-01, -2.8109e-01, -2.0018e-01, -8.6849e-02,

tensor([[ 7.4340e-02, -3.4894e-03, -4.9583e-03,  6.0806e-02,  8.5685e-02,

For all_encoder_layer, the situation is the same: the outputs alternate between the same two values.

I also extract word-embedding features from BERT, and the situation is the same.

wemb_n
tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],

tensor([[[ 0.1623,  0.4293,  0.1031,  ..., -0.0434, -0.5156, -1.0220],

tensor([[[ 0.0389,  0.5050,  0.1327,  ...,  0.3232,  0.2232, -0.5383],

2 Answers

Please try to set the seed. I faced the same issue and set the seed to make sure we get the same values every time. One possible reason could be dropout taking place in BERT.
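
For example, a minimal sketch of what the seeding could look like (the seed value, and the choice to seed all three libraries, are illustrative assumptions rather than part of the original answer):

        import random
        import numpy as np
        import torch

        seed = 42  # any fixed value works; 42 is just an example
        random.seed(seed)        # Python's built-in RNG
        np.random.seed(seed)     # NumPy's RNG
        torch.manual_seed(seed)  # PyTorch's CPU RNG
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)  # PyTorch's GPU RNGs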

  • Oh, great. I set the seed for Python, NumPy, and torch as well, and I get the same values now. Thank you.
    – Keanu Paik
    Jun 19, 2019 at 14:05
  • Do we know what is randomized during inference? During training, I believe most versions of BERT use dropout, which uses randomization. But for inference, I am not sure where a random number generator is used.
    Aug 8, 2019 at 20:21

Quite late, but if anyone gets here: this is because the model is not in eval mode by default, so dropout (and perhaps other sources of training-time randomness) is still active.

To fix this, just set your model to eval mode before any inference:

model.eval()
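
For reference, a minimal sketch of how the call site from the question might look (variable names follow the question's snippet; wrapping the call in torch.no_grad() is an extra, optional suggestion for inference):

        self.model_bert.eval()  # disables dropout, so repeated calls give identical outputs
        with torch.no_grad():   # optional: also skip gradient tracking during inference
            all_encoder_layer, pooled_output = self.model_bert(all_input_ids, all_segment_ids, all_input_mask)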
