All Questions

43 votes
2 answers
27k views

Why Bert transformer uses [CLS] token for classification instead of average over all tokens?

I am doing experiments on the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation and later pass it to other models for further ...
Aaditya Ura • 12.3k
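
For reference, a minimal sketch (PyTorch + Hugging Face transformers; model name and sentence are illustrative) of the two pooling choices the question contrasts: taking the [CLS] vector versus averaging all token vectors.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("BERT pools the [CLS] token by default.", return_tensors="pt")
    with torch.no_grad():
        last_hidden = model(**inputs).last_hidden_state      # (1, seq_len, 768)

    cls_vector = last_hidden[:, 0, :]                        # the [CLS] representation
    mask = inputs["attention_mask"].unsqueeze(-1)            # exclude padding from the mean
    mean_vector = (last_hidden * mask).sum(1) / mask.sum(1)  # average over all real tokens
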
17 votes
2 answers
33k views

The size of tensor a (707) must match the size of tensor b (512) at non-singleton dimension 1

I am trying to do text classification using a pretrained BERT model. I trained the model on my dataset, and in the testing phase I know that BERT can only take up to 512 tokens, so I wrote an if condition ...
Mee • 1,561
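
A minimal sketch of the usual fix (Hugging Face tokenizer; model name and text are illustrative): let the tokenizer truncate every example to BERT's 512-token limit instead of checking lengths by hand.

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    encoded = tokenizer(
        "a very long document ...",
        truncation=True,            # cut anything longer than max_length
        max_length=512,             # BERT's positional-embedding limit
        padding="max_length",
        return_tensors="pt",
    )
    print(encoded["input_ids"].shape)  # always (1, 512), never (1, 707)
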
12 votes
3 answers
37k views

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index']

When I load the BERT pretrained model online I get this error OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index'] found in directory uncased_L-12_H-768_A-12 or '...
Asma • 189
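
A minimal sketch, assuming the directory holds an original TensorFlow 1.x checkpoint release such as uncased_L-12_H-768_A-12 (file names taken from that release): from_pretrained() looks for pytorch_model.bin / tf_model.h5, so a raw TF checkpoint has to be loaded with from_tf=True or converted to the Hugging Face format first.

    from transformers import BertConfig, BertModel

    config = BertConfig.from_json_file("uncased_L-12_H-768_A-12/bert_config.json")
    model = BertModel.from_pretrained(
        "uncased_L-12_H-768_A-12/bert_model.ckpt.index",  # TF 1.x checkpoint, not an HF folder
        config=config,
        from_tf=True,
    )
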
11 votes
2 answers
13k views

How to use Transformers for text classification?

I have two questions about how to use the TensorFlow implementation of Transformers for text classification. First, it seems people mostly use only the encoder layer to do text classification ...
khemedi • 806
10 votes
1 answer
14k views

How to get intermediate layers' output of pre-trained BERT model in HuggingFace Transformers library?

(I'm following this PyTorch tutorial about BERT word embeddings, and in the tutorial the author accesses the intermediate layers of the BERT model.) What I want is to access the last, let's say, 4 ...
Yagel • 1,262
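
A minimal sketch (Hugging Face transformers; model name illustrative): ask the model for all hidden states and keep, say, the last four layers.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

    inputs = tokenizer("intermediate layers example", return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states    # embeddings + one entry per layer

    last_four = torch.stack(hidden_states[-4:])          # (4, 1, seq_len, 768)
    summed_last_four = last_four.sum(dim=0)              # (1, seq_len, 768)
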
9 votes
1 answer
4k views

BERT embedding for semantic similarity

I earlier posted this question. I wanted to get embeddings similar to this YouTube video, time 33 minutes onward. 1) I don't think that the embeddings I am getting from the CLS token are similar to ...
user2543622 • 6,258
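
A minimal sketch of one common alternative to raw [CLS] vectors (plain Hugging Face BERT; sentences illustrative): mean-pool the token vectors and compare sentences with cosine similarity. Libraries such as sentence-transformers package the same idea with models trained specifically for similarity.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    def embed(text):
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state
        mask = enc["attention_mask"].unsqueeze(-1)
        return (hidden * mask).sum(1) / mask.sum(1)      # mean over real tokens

    a = embed("A man is playing a guitar.")
    b = embed("Someone plays an instrument.")
    similarity = torch.nn.functional.cosine_similarity(a, b)
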
8 votes
4 answers
16k views

Error importing BERT: module 'tensorflow._api.v2.train' has no attribute 'Optimizer'

I tried to use bert-tensorflow in Google Colab, but I got the following error: AttributeError ...
Belkacem Thiziri
8 votes
6 answers
6k views

Problem with inputs when building a model with TFBertModel and AutoTokenizer from HuggingFace's transformers

I'm trying to build the model illustrated in this picture: I obtained a pre-trained BERT model and the corresponding tokenizer from HuggingFace's transformers in the following way: from transformers import ...
Gerardo Zinno
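
A minimal sketch of the usual wiring (TensorFlow + transformers; sequence length and classifier head are illustrative): give the Keras inputs the same names the tokenizer produces and feed them to TFBertModel explicitly.

    import tensorflow as tf
    from transformers import TFBertModel

    MAX_LEN = 128
    bert = TFBertModel.from_pretrained("bert-base-uncased")

    input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
    attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

    sequence_output = bert(input_ids=input_ids, attention_mask=attention_mask)[0]
    cls_token = sequence_output[:, 0, :]                  # [CLS] position
    output = tf.keras.layers.Dense(1, activation="sigmoid")(cls_token)

    model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
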
8 votes
1 answer
3k views

HuggingFace BERT `inputs_embeds` giving unexpected result

The HuggingFace BERT TensorFlow implementation allows us to feed in a precomputed embedding in place of the embedding lookup that is native to BERT. This is done using the model's call method's ...
Vivek Subramanian
6 votes
1 answer
3k views

Bert Embedding Layer raises `TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'` with BiLSTM

I have problems integrating the BERT embedding layer into a BiLSTM model for a word sense disambiguation task. Windows 10, Python 3.6.4, TensorFlow 1.12, Keras 2.2.4, no virtual environments used, PyCharm ...
ElSheikh • 319
6 votes
3 answers
4k views

TypeError: Layer input_spec must be an instance of InputSpec. Got: InputSpec(shape=(None, 128, 768), ndim=3)

I am trying to use a BERT pretrained model to do a multiclass classification (of 3 classes). Here's the function I use to build the model, with some extra functionality added: def create_model(max_seq_len,...
Hrisav Bhowmick
5 votes
3 answers
6k views

AttributeError: 'str' object has no attribute 'dim' in pytorch

I got the following error output in PyTorch when running predictions with the model. Does anyone know what's going on? Below is the architecture of the model that I created; in the error output, ...
Bei Zhao
5 votes
1 answer
3k views

Getting embedding lookup result from BERT

Prior to passing my tokens through BERT, I would like to perform some processing on their embeddings, (the result of the embedding lookup layer). The HuggingFace BERT TensorFlow implementation allows ...
Vivek Subramanian
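
A minimal sketch of that workflow (shown with the PyTorch class for brevity; the TensorFlow model exposes the same get_input_embeddings() and inputs_embeds hooks): run the embedding lookup yourself, modify the vectors, then hand them back via inputs_embeds.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    enc = tokenizer("edit these embeddings first", return_tensors="pt")
    word_embeddings = model.get_input_embeddings()        # the word-embedding lookup table
    embeds = word_embeddings(enc["input_ids"])            # (1, seq_len, 768)

    embeds = embeds + 0.01 * torch.randn_like(embeds)     # any custom processing goes here
    outputs = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"])
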
5 votes
2 answers
3k views

Why are the matrices in BERT called Query, Key, and Value?

Within the transformer units of BERT, there are modules called Query, Key, and Value, or simply Q,K,V. Based on the BERT paper and code (particularly in modeling.py), my pseudocode understanding of ...
solvingPuzzles
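
A minimal NumPy sketch of scaled dot-product attention (shapes illustrative), which is where the names come from: each query is scored against every key, and the resulting weights mix the values.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    seq_len, d_model = 4, 8
    x = np.random.randn(seq_len, d_model)                 # one vector per token

    W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
    Q, K, V = x @ W_q, x @ W_k, x @ W_v                   # three learned views of the same input

    scores = Q @ K.T / np.sqrt(d_model)                   # how well each query matches each key
    attended = softmax(scores) @ V                        # value vectors, mixed by those weights
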
5 votes
2 answers
5k views

How to use trained BERT model checkpoints for prediction?

I trained BERT on SQuAD 2.0 and got model.ckpt.data, model.ckpt.meta, and model.ckpt.index (F1 score: 81) in the output directory, along with predictions.json, etc., using the BERT-master/...
Jeeva Bharathi
5 votes
1 answer
6k views

How do I interpret my BERT output from Huggingface Transformers for Sequence Classification and tensorflow?

I am using BERT for a sequence classification task with 3 labels. To do this, I am using huggingface transformers with tensorflow, more specifically the TFBertForSequenceClassification class with the ...
alxgal • 129
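
A minimal sketch (TensorFlow + a recent transformers version; 3 labels as in the question): the model returns raw logits, so apply a softmax to read them as class probabilities.

    import tensorflow as tf
    from transformers import BertTokenizer, TFBertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

    enc = tokenizer("what do these numbers mean?", return_tensors="tf")
    logits = model(enc).logits                       # shape (1, 3): unnormalized scores
    probs = tf.nn.softmax(logits, axis=-1)           # per-class probabilities
    predicted_label = tf.argmax(probs, axis=-1)      # index of the most likely class
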
5 votes
2 answers
2k views

Error with using BERT model from Tensorflow

I have tried to follow Tensorflow instructions to use BERT model: (https://www.tensorflow.org/tutorials/text/classify_text_with_bert) However, when I run these lines: text_test = ['this is such an ...
Jason • 69
5 votes
4 answers
5k views

Convert a BERT Model to TFLite

I have this code for a semantic search engine built using the pre-trained BERT model. I want to convert this model into TFLite for deploying it to Google ML Kit. I want to know how to convert it. I want ...
Ali Memon • 125
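
A minimal sketch of the conversion step (TensorFlow Lite converter; the SavedModel path is illustrative): export the model as a SavedModel first, then convert, allowing the TF ops that TFLite's builtin kernel set does not cover.

    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model("bert_saved_model/")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,   # regular TFLite kernels
        tf.lite.OpsSet.SELECT_TF_OPS,     # fall back to TF ops BERT still needs
    ]
    tflite_model = converter.convert()
    with open("bert_model.tflite", "wb") as f:
        f.write(tflite_model)
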
4 votes
1 answer
2k views

Finetuning BERT on custom data

I want to train a 21-class text classification model using BERT. But I have very little training data, so I downloaded a similar dataset with 5 classes and 2 million samples, and finetuned ...
danishansari
4 votes
2 answers
2k views

BERT get sentence level embedding after fine tuning

I came across this page. 1) I would like to get the sentence-level embedding (the embedding given by the [CLS] token) after fine-tuning is done. How could I do it? 2) I also noticed that the code on that ...
user2543622 • 6,258
4 votes
1 answer
4k views

Huggingface TFBertForSequenceClassification always predicts the same label

TL;DR: My model always predicts the same labels and I don't know why. Below is my entire code for fine-tuning in the hopes that someone can point out to me where I am going wrong. I am using ...
alxgal • 129
4 votes
1 answer
6k views

BERT outputs explained

The keys of the BERT encoder's output are default, encoder_outputs, pooled_output and sequence_output. As far as I know, encoder_outputs are the outputs of each encoder, pooled_output is the output ...
OK 400 • 1,159
4 votes
2 answers
3k views

Loading tf.keras model, ValueError: The two structures don't have the same nested structure

I created a tf.keras model that has BERT and I want to train and save it for further use. Loading this model is a big issue because I keep getting the error: ValueError: The two structures don't have the ...
Nadja • 43
4 votes
1 answer
1k views

How can I apply pruning on a BERT model?

I have trained a BERT model using ktrain (TensorFlow wrapper) to recognize emotion on text. It works, but it suffers from really slow inference. That makes my model not suitable for a production ...
Stamatis Tiniakos
4 votes
1 answer
3k views

Tensorflow BERT for token-classification - exclude pad-tokens from accuracy while training and testing

I'm doing token-based classification using the pre-trained BERT-model for tensorflow to automatically label cause and effects in sentences. To access BERT, I'm using the TFBertForTokenClassification-...
user3228384
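
A minimal sketch of one way to do this (Keras custom metric, assuming pad positions are labelled -100 as in the Hugging Face convention): count correct predictions only where the label marks a real token.

    import tensorflow as tf

    def masked_accuracy(y_true, y_pred):
        preds = tf.argmax(y_pred, axis=-1, output_type=tf.int32)
        labels = tf.cast(y_true, tf.int32)
        mask = tf.not_equal(labels, -100)                         # True only for real tokens
        matches = tf.logical_and(tf.equal(preds, labels), mask)
        return (tf.reduce_sum(tf.cast(matches, tf.float32))
                / tf.reduce_sum(tf.cast(mask, tf.float32)))

    # model.compile(optimizer=..., loss=..., metrics=[masked_accuracy])
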
4 votes
1 answer
2k views

How to access BERT intermediate layer outputs in TF Hub Module?

Does anybody know a way to access the outputs of the intermediate layers from BERT's hosted models on Tensorflow Hub? The model is hosted here. I have explored the meta graph and found the only ...
AlexDelPiero
3 votes
3 answers
5k views

What is the difference between pooled output and sequence output in the BERT layer?

I was reading about BERT and wanted to do text classification with its word embeddings. I came across this line of code: pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
mitra mirshafiee
3 votes
5 answers
3k views

bert-serving-start giving error TypeError: cannot unpack non-iterable NoneType object - tried multiple paths to the model

I am trying to use BERT with bert-serving-start in python3.8 but it does not initialise and throws error: TypeError: cannot unpack non-iterable NoneType object This may have something to do with the ...
geds133 • 1,375
3 votes
1 answer
3k views

ValueError: Unknown layer: TFBertModel. Please ensure this object is passed to the `custom_objects` argument

Here I am training the BERT model; below is the code I used to train. When I load the saved model for prediction, it shows this error. Can anyone please help me out? import tensorflow as tf import logging from ...
waji • 71
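
A minimal sketch of what the error message asks for (TensorFlow + transformers; the saved path is illustrative): pass the TFBertModel class through custom_objects when loading.

    import tensorflow as tf
    from transformers import TFBertModel

    model = tf.keras.models.load_model(
        "saved_bert_classifier",                          # wherever the model was saved
        custom_objects={"TFBertModel": TFBertModel},
    )
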
3 votes
2 answers
1k views

Is it possible for multiple GPUs to work as one with more memory?

I have a deep learning workstation with 4 GPUs of 6 GB of memory each. Would it be possible to make a Docker container see the 4 GPUs as one but with 24 GB? Thank you.
Celso França
3 votes
1 answer
644 views

InternalError when using TPU for training Keras model

I am attempting to fine-tune a BERT model on Google Colab from the Tensorflow Hub using this link. However, I run into the following error: InternalError: RET_CHECK failure (third_party/tensorflow/...
a_002311
3 votes
2 answers
4k views

Bert Model Compile Error - TypeError: Invalid keyword argument(s) in `compile`: {'steps_per_execution'}

I have been using BERT and am trying to compile the model using the lines of code below. model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased') optimizer = tf.keras.optimizers.Adam(...
sruthi • 31
3 votes
3 answers
2k views

List index out of range when saving finetuned Tensorflow model

I'm trying to fine-tune a pre-trained BERT model from Huggingface using Tensorflow. Everything runs smoothly and the model builds and trains without error. But when I try to save the model it stops ...
Haag • 47
3 votes
1 answer
807 views

How to set output_shape of BERT preprocessing layer from tensorflow hub?

I am building a simple BERT model for text classification, using the tensorflow hub. import tensorflow as tf import tensorflow_hub as tf_hub bert_preprocess = tf_hub.KerasLayer("https://tfhub....
lazarea • 1,219
3 votes
1 answer
588 views

TF BERT input packer on more than two inputs

Some of the TensorFlow examples using BERT models show a use of the BERT preprocessor to "pack" inputs. E.g. in this example, text_preprocessed = bert_preprocess.bert_pack_inputs([tok, tok], ...
D__ • 210
3 votes
1 answer
1k views

Retraining existing base BERT model with additional data

I have generated a new base BERT model (dataset1_model_cased_L-12_H-768_A-12) using cased_L-12_H-768_A-12, trained for multi-label classification with BioBERT's run_classifier. I need to add more additional ...
veilupearl
3 votes
1 answer
4k views

Running BERT on CPU instead of GPU

I am trying to execute BERT's run_classifier.py script from the terminal as below: python run_classifier.py --task_name=cola --do_predict=true --data_dir=<data-dir> --vocab_file=$BERT_BASE_DIR/...
Ashwin Geet D'Sa
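
A minimal sketch of the usual approach: hide the GPUs from TensorFlow before the script starts, so run_classifier.py falls back to the CPU.

    import os
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"   # no GPU is visible to TensorFlow

    # or, equivalently, from the shell:
    #   CUDA_VISIBLE_DEVICES=-1 python run_classifier.py --task_name=cola ...
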
3 votes
0 answers
855 views

Same input, same model, same weights but getting different results

I'm finetuning sentence-bert to do tasks like sentence cosine-similarity calculation in TensorFlow. I set up an encoder, let's say encoder1, using the code below: from sentence_transformers import ...
PlasticSaber
3 votes
2 answers
2k views

How to save and load a custom Siamese BERT model

I am following this tutorial on how to train a Siamese BERT network: https://keras.io/examples/nlp/semantic_similarity_with_bert/ All good, but I am not sure what the best way to save the model is ...
Carbo • 916
3 votes
0 answers
1k views

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/

n_epochs = 6 model = CNN_Text() loss_fn = nn.CrossEntropyLoss(reduction='sum') optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.001) model.cuda() # Load train ...
Tahir Ullah
3 votes
0 answers
708 views

I'm trying to load BERT "tfbert-large-uncased" but I got an error "Can't load config.json file"

I'm trying to load the pre-trained BERT model, but I'm getting an error while loading the tokenizer; it says config.json is not found. If anyone knows how to solve this issue, please help me. Model and path ...
iamhimanshu0
3 votes
0 answers
233 views

Name of training and test data files in NLP (BioBERT GitHub repo)

I'm reading the README.md file of the BioBERT GitHub repo: Let $NER_DIR indicate a folder for a single NER dataset which contains train_dev.tsv, train.tsv, devel.tsv and test.tsv. Also, set $...
Contestosis
3 votes
0 answers
2k views

Fail to run trainer.train() with huggingface transformer

I am trying to set up a TensorFlow fine-tuning framework for a question-answering project, using huggingface/transformers as the prototype, but I cannot run through the trainer. The experiment is ...
user3381299
3 votes
0 answers
2k views

Bert model giving CUDA out of memory error on google colab

I am using the following tutorial here to train and test a BERT sequence-classifier model on a dataset of documents of varying lengths (small (0-280), medium (280-10,000), large (10,000+)) on Google ...
user1365234
3 votes
1 answer
6k views

How to get the vocab file for Bert tokenizer from TF Hub

I'm trying to use BERT from TensorFlow Hub and build a tokenizer; this is what I'm doing: >>> import tensorflow_hub as hub >>> from bert.tokenization import FullTokenizer >>> ...
bachr • 5,898
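
A minimal sketch (TF2 + tensorflow_hub; the hub URL is illustrative): recent hub BERT layers expose the vocab file and casing flag on their resolved_object, which is enough to build a FullTokenizer.

    import tensorflow_hub as hub
    from bert.tokenization import FullTokenizer

    bert_layer = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/3",
        trainable=False,
    )
    vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
    do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()
    tokenizer = FullTokenizer(vocab_file, do_lower_case)
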
2 votes
1 answer
7k views

tflite converter error operation not supported

I was trying to convert a .pb model of ALBERT to TFLite. I made the .pb model using https://github.com/google-research/albert in TF 1.15, and I used converter = tf.compat.v1.lite.TFLiteConverter....
Mid_gang
2 votes
1 answer
1k views

How does BERT utilize TPU memories?

The README in Google's BERT repo says that even a single sentence of length 512 cannot fit in a 12 GB Titan X for the BERT-Large model. But the BERT paper says 64 TPU chips were used to train BERT-...
soloice • 1,000
2 votes
1 answer
2k views

TensorFlow 2.4 NotFoundError: No algorithm worked! with Keras Conv1D layer

I've been looking for a solution to this error for days and can't find one: NotFoundError: 3 root error(s) found. (0) Not found: No algorithm worked! [[node model/conv1d/conv1d (...
Eduardo Watanabe
2 votes
3 answers
2k views

SimpleTransformers Error: VersionConflict: tokenizers==0.9.4? How do I fix this?

I'm trying to execute the simpletransformers example from their site on google colab. Example: from simpletransformers.classification import ClassificationModel, ClassificationArgs import pandas as pd ...
Reema Q Khan
2 votes
1 answer
815 views

BERT pre-training from scratch with tensorflow version 2.x

I used the run_pretraining.py script (https://github.com/google-research/bert/blob/master/run_pretraining.py) with TensorFlow version 1.15.5 before, and I use a Google Cloud TPU as well. Is it ...
hazal • 33
