All Questions

43 votes
2 answers
27k views

Why Bert transformer uses [CLS] token for classification instead of average over all tokens?

I am doing experiments on the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation and later pass it to other models for further ...
Aaditya Ura • 12.3k
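
For reference, a minimal sketch (PyTorch + Hugging Face transformers; model name and sentence are illustrative) of the two pooling choices the question contrasts: taking the [CLS] vector versus averaging all token vectors.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("BERT pools the [CLS] token by default.", return_tensors="pt")
    with torch.no_grad():
        last_hidden = model(**inputs).last_hidden_state      # (1, seq_len, 768)

    cls_vector = last_hidden[:, 0, :]                        # the [CLS] representation
    mask = inputs["attention_mask"].unsqueeze(-1)            # exclude padding from the mean
    mean_vector = (last_hidden * mask).sum(1) / mask.sum(1)  # average over all real tokens
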
17 votes
2 answers
33k views

The size of tensor a (707) must match the size of tensor b (512) at non-singleton dimension 1

I am trying to do text classification using a pretrained BERT model. I trained the model on my dataset, and in the testing phase I know that BERT can only take up to 512 tokens, so I wrote an if condition ...
Mee • 1,561
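
A minimal sketch of the usual fix (Hugging Face tokenizer; model name and text are illustrative): let the tokenizer truncate every example to BERT's 512-token limit instead of checking lengths by hand.

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    encoded = tokenizer(
        "a very long document ...",
        truncation=True,            # cut anything longer than max_length
        max_length=512,             # BERT's positional-embedding limit
        padding="max_length",
        return_tensors="pt",
    )
    print(encoded["input_ids"].shape)  # always (1, 512), never (1, 707)
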
12 votes
3 answers
37k views

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index']

When I load the BERT pretrained model online I get this error OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index'] found in directory uncased_L-12_H-768_A-12 or '...
Asma • 189
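
A minimal sketch, assuming the directory holds an original TensorFlow 1.x checkpoint release such as uncased_L-12_H-768_A-12 (file names taken from that release): from_pretrained() looks for pytorch_model.bin / tf_model.h5, so a raw TF checkpoint has to be loaded with from_tf=True or converted to the Hugging Face format first.

    from transformers import BertConfig, BertModel

    config = BertConfig.from_json_file("uncased_L-12_H-768_A-12/bert_config.json")
    model = BertModel.from_pretrained(
        "uncased_L-12_H-768_A-12/bert_model.ckpt.index",  # TF 1.x checkpoint, not an HF folder
        config=config,
        from_tf=True,
    )
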
11 votes
2 answers
13k views

How to use Transformers for text classification?

I have two questions about how to use the TensorFlow implementation of Transformers for text classification. First, it seems people mostly use only the encoder layer to do text classification ...
khemedi • 806
10 votes
1 answer
14k views

How to get intermediate layers' output of pre-trained BERT model in HuggingFace Transformers library?

(I'm following this PyTorch tutorial about BERT word embeddings, and in the tutorial the author accesses the intermediate layers of the BERT model.) What I want is to access the last, let's say, 4 ...
Yagel • 1,262
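
A minimal sketch (Hugging Face transformers; model name illustrative): ask the model for all hidden states and keep, say, the last four layers.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

    inputs = tokenizer("intermediate layers example", return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states    # embeddings + one entry per layer

    last_four = torch.stack(hidden_states[-4:])          # (4, 1, seq_len, 768)
    summed_last_four = last_four.sum(dim=0)              # (1, seq_len, 768)
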
9 votes
1 answer
4k views

BERT embedding for semantic similarity

I earlier posted this question. I wanted to get embeddings similar to this YouTube video, time 33 minutes onward. 1) I don't think that the embeddings I am getting from the CLS token are similar to ...
user2543622 • 6,258
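
A minimal sketch of one common alternative to raw [CLS] vectors (plain Hugging Face BERT; sentences illustrative): mean-pool the token vectors and compare sentences with cosine similarity. Libraries such as sentence-transformers package the same idea with models trained specifically for similarity.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    def embed(text):
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state
        mask = enc["attention_mask"].unsqueeze(-1)
        return (hidden * mask).sum(1) / mask.sum(1)      # mean over real tokens

    a = embed("A man is playing a guitar.")
    b = embed("Someone plays an instrument.")
    similarity = torch.nn.functional.cosine_similarity(a, b)
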
8 votes
4 answers
16k views

Error importing BERT: module 'tensorflow._api.v2.train' has no attribute 'Optimizer'

I tried to use bert-tensorflow in Google Colab, but I got the following error: AttributeError ...
Belkacem Thiziri
8 votes
6 answers
6k views

Problem with inputs when building a model with TFBertModel and AutoTokenizer from HuggingFace's transformers

I'm trying to build the model illustrated in this picture: I obtained a pre-trained BERT model and the corresponding tokenizer from HuggingFace's transformers in the following way: from transformers import ...
Gerardo Zinno
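
A minimal sketch of the usual wiring (TensorFlow + transformers; sequence length and classifier head are illustrative): give the Keras inputs the same names the tokenizer produces and feed them to TFBertModel explicitly.

    import tensorflow as tf
    from transformers import TFBertModel

    MAX_LEN = 128
    bert = TFBertModel.from_pretrained("bert-base-uncased")

    input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
    attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

    sequence_output = bert(input_ids=input_ids, attention_mask=attention_mask)[0]
    cls_token = sequence_output[:, 0, :]                  # [CLS] position
    output = tf.keras.layers.Dense(1, activation="sigmoid")(cls_token)

    model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
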
8 votes
1 answer
3k views

HuggingFace BERT `inputs_embeds` giving unexpected result

The HuggingFace BERT TensorFlow implementation allows us to feed in a precomputed embedding in place of the embedding lookup that is native to BERT. This is done using the model's call method's ...
Vivek Subramanian
6 votes
1 answer
3k views

Bert Embedding Layer raises `TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'` with BiLSTM

I have problems integrating the BERT embedding layer into a BiLSTM model for a word sense disambiguation task. Windows 10, Python 3.6.4, TensorFlow 1.12, Keras 2.2.4, no virtual environments used, PyCharm ...
ElSheikh • 319
6 votes
3 answers
4k views

TypeError: Layer input_spec must be an instance of InputSpec. Got: InputSpec(shape=(None, 128, 768), ndim=3)

I am trying to use a BERT pretrained model to do a multiclass classification (of 3 classes). Here's the function I use to build the model, with some extra functionality added: def create_model(max_seq_len,...
Hrisav Bhowmick
5 votes
3 answers
6k views

AttributeError: 'str' object has no attribute 'dim' in pytorch

I got the following error output in PyTorch when running predictions with the model. Does anyone know what's going on? Below is the architecture of the model that I created; in the error output, ...
Bei Zhao
5 votes
1 answer
3k views

Getting embedding lookup result from BERT

Prior to passing my tokens through BERT, I would like to perform some processing on their embeddings, (the result of the embedding lookup layer). The HuggingFace BERT TensorFlow implementation allows ...
Vivek Subramanian
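
A minimal sketch of that workflow (shown with the PyTorch class for brevity; the TensorFlow model exposes the same get_input_embeddings() and inputs_embeds hooks): run the embedding lookup yourself, modify the vectors, then hand them back via inputs_embeds.

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    enc = tokenizer("edit these embeddings first", return_tensors="pt")
    word_embeddings = model.get_input_embeddings()        # the word-embedding lookup table
    embeds = word_embeddings(enc["input_ids"])            # (1, seq_len, 768)

    embeds = embeds + 0.01 * torch.randn_like(embeds)     # any custom processing goes here
    outputs = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"])
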
5 votes
2 answers
3k views

Why are the matrices in BERT called Query, Key, and Value?

Within the transformer units of BERT, there are modules called Query, Key, and Value, or simply Q,K,V. Based on the BERT paper and code (particularly in modeling.py), my pseudocode understanding of ...
solvingPuzzles
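
A minimal NumPy sketch of scaled dot-product attention (shapes illustrative), which is where the names come from: each query is scored against every key, and the resulting weights mix the values.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    seq_len, d_model = 4, 8
    x = np.random.randn(seq_len, d_model)                 # one vector per token

    W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
    Q, K, V = x @ W_q, x @ W_k, x @ W_v                   # three learned views of the same input

    scores = Q @ K.T / np.sqrt(d_model)                   # how well each query matches each key
    attended = softmax(scores) @ V                        # value vectors, mixed by those weights
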
5 votes
2 answers
5k views

How to use trained BERT model checkpoints for prediction?

I trained BERT on SQuAD 2.0 and got model.ckpt.data, model.ckpt.meta, and model.ckpt.index (F1 score: 81) in the output directory, along with predictions.json, etc., using the BERT-master/...
Jeeva Bharathi
5 votes
1 answer
6k views

How do I interpret my BERT output from Huggingface Transformers for Sequence Classification and tensorflow?

I am using BERT for a sequence classification task with 3 labels. To do this, I am using huggingface transformers with tensorflow, more specifically the TFBertForSequenceClassification class with the ...
alxgal • 129
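
A minimal sketch (TensorFlow + a recent transformers version; 3 labels as in the question): the model returns raw logits, so apply a softmax to read them as class probabilities.

    import tensorflow as tf
    from transformers import BertTokenizer, TFBertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

    enc = tokenizer("what do these numbers mean?", return_tensors="tf")
    logits = model(enc).logits                       # shape (1, 3): unnormalized scores
    probs = tf.nn.softmax(logits, axis=-1)           # per-class probabilities
    predicted_label = tf.argmax(probs, axis=-1)      # index of the most likely class
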
5 votes
2 answers
2k views

Error with using BERT model from Tensorflow

I have tried to follow Tensorflow instructions to use BERT model: (https://www.tensorflow.org/tutorials/text/classify_text_with_bert) However, when I run these lines: text_test = ['this is such an ...
Jason • 69
5 votes
4 answers
5k views

Convert a BERT Model to TFLite

I have this code for a semantic search engine built using the pre-trained BERT model. I want to convert this model into TFLite for deploying it to Google ML Kit. I want to know how to convert it. I want ...
Ali Memon • 125
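
A minimal sketch of the conversion step (TensorFlow Lite converter; the SavedModel path is illustrative): export the model as a SavedModel first, then convert, allowing the TF ops that TFLite's builtin kernel set does not cover.

    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model("bert_saved_model/")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,   # regular TFLite kernels
        tf.lite.OpsSet.SELECT_TF_OPS,     # fall back to TF ops BERT still needs
    ]
    tflite_model = converter.convert()
    with open("bert_model.tflite", "wb") as f:
        f.write(tflite_model)
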
4 votes
1 answer
2k views

Finetuning BERT on custom data

I want to train a 21-class text classification model using BERT. But I have very little training data, so I downloaded a similar dataset with 5 classes and 2 million samples, and finetuned ...
danishansari
4 votes
2 answers
2k views

BERT get sentence level embedding after fine tuning

I came across this page. 1) I would like to get the sentence-level embedding (the embedding given by the [CLS] token) after fine-tuning is done. How could I do it? 2) I also noticed that the code on that ...
user2543622 • 6,258
4 votes
1 answer
4k views

Huggingface TFBertForSequenceClassification always predicts the same label

TL;DR: My model always predicts the same labels and I don't know why. Below is my entire code for fine-tuning in the hopes that someone can point out to me where I am going wrong. I am using ...
alxgal • 129
4 votes
1 answer
6k views

BERT outputs explained

The keys of the BERT encoder's output are default, encoder_outputs, pooled_output and sequence_output. As far as I know, encoder_outputs are the outputs of each encoder, pooled_output is the output ...
OK 400 • 1,159
4 votes
2 answers
3k views

Loading tf.keras model, ValueError: The two structures don't have the same nested structure

I created a tf.keras model that has BERT and I want to train and save it for further use. Loading this model is a big issue because I keep getting the error: ValueError: The two structures don't have the ...
Nadja • 43
4 votes
1 answer
1k views

How can I apply pruning on a BERT model?

I have trained a BERT model using ktrain (TensorFlow wrapper) to recognize emotion on text. It works, but it suffers from really slow inference. That makes my model not suitable for a production ...
Stamatis Tiniakos
4 votes
1 answer
3k views

Tensorflow BERT for token-classification - exclude pad-tokens from accuracy while training and testing

I'm doing token-based classification using the pre-trained BERT-model for tensorflow to automatically label cause and effects in sentences. To access BERT, I'm using the TFBertForTokenClassification-...
user3228384
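
A minimal sketch of one way to do this (Keras custom metric, assuming pad positions are labelled -100 as in the Hugging Face convention): count correct predictions only where the label marks a real token.

    import tensorflow as tf

    def masked_accuracy(y_true, y_pred):
        preds = tf.argmax(y_pred, axis=-1, output_type=tf.int32)
        labels = tf.cast(y_true, tf.int32)
        mask = tf.not_equal(labels, -100)                         # True only for real tokens
        matches = tf.logical_and(tf.equal(preds, labels), mask)
        return (tf.reduce_sum(tf.cast(matches, tf.float32))
                / tf.reduce_sum(tf.cast(mask, tf.float32)))

    # model.compile(optimizer=..., loss=..., metrics=[masked_accuracy])
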
4 votes
1 answer
2k views

How to access BERT intermediate layer outputs in TF Hub Module?

Does anybody know a way to access the outputs of the intermediate layers from BERT's hosted models on Tensorflow Hub? The model is hosted here. I have explored the meta graph and found the only ...
AlexDelPiero
3 votes
3 answers
5k views

What is the difference between pooled output and sequence output in the BERT layer?

I was reading about BERT and wanted to do text classification with its word embeddings. I came across this line of code: pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
mitra mirshafiee
3 votes
5 answers
3k views

bert-serving-start giving error TypeError: cannot unpack non-iterable NoneType object - tried multiple paths to the model

I am trying to use BERT with bert-serving-start in python3.8 but it does not initialise and throws error: TypeError: cannot unpack non-iterable NoneType object This may have something to do with the ...
geds133 • 1,375
3 votes
1 answer
3k views

ValueError: Unknown layer: TFBertModel. Please ensure this object is passed to the `custom_objects` argument

Here I am training the BERT model; below is the code I used to train. When I load the saved model for prediction, it shows this error. Can anyone please help me out? import tensorflow as tf import logging from ...
waji • 71
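
A minimal sketch of what the error message asks for (TensorFlow + transformers; the saved path is illustrative): pass the TFBertModel class through custom_objects when loading.

    import tensorflow as tf
    from transformers import TFBertModel

    model = tf.keras.models.load_model(
        "saved_bert_classifier",                          # wherever the model was saved
        custom_objects={"TFBertModel": TFBertModel},
    )
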
3 votes
2 answers
1k views

Is it possible for multiple GPUs to work as one with more memory?

I have a deep learning workstation with 4 GPUs of 6 GB of memory each. Would it be possible to make a Docker container see the 4 GPUs as one but with 24 GB? Thank you.
Celso França
3 votes
1 answer
644 views

InternalError when using TPU for training Keras model

I am attempting to fine-tune a BERT model on Google Colab from the Tensorflow Hub using this link. However, I run into the following error: InternalError: RET_CHECK failure (third_party/tensorflow/...
a_002311
3 votes
2 answers
4k views

Bert Model Compile Error - TypeError: Invalid keyword argument(s) in `compile`: {'steps_per_execution'}

I have been using BERT and am trying to compile the model using the lines of code below. model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased') optimizer = tf.keras.optimizers.Adam(...
sruthi • 31
3 votes
3 answers
2k views

List index out of range when saving finetuned Tensorflow model

I'm trying to fine-tune a pre-trained BERT model from Huggingface using Tensorflow. Everything runs smoothly and the model builds and trains without error. But when I try to save the model it stops ...
Haag • 47
3 votes
1 answer
807 views

How to set output_shape of BERT preprocessing layer from tensorflow hub?

I am building a simple BERT model for text classification, using the tensorflow hub. import tensorflow as tf import tensorflow_hub as tf_hub bert_preprocess = tf_hub.KerasLayer("https://tfhub....
lazarea • 1,219
3 votes
1 answer
588 views

TF BERT input packer on more than two inputs

Some of the TensorFlow examples using BERT models show a use of the BERT preprocessor to "pack" inputs. E.g. in this example, text_preprocessed = bert_preprocess.bert_pack_inputs([tok, tok], ...
D__ • 210
3 votes
1 answer
1k views

Retraining existing base BERT model with additional data

I have generated a new base BERT model (dataset1_model_cased_L-12_H-768_A-12) using cased_L-12_H-768_A-12, trained for multi-label classification with BioBERT's run_classifier. I need to add more additional ...
veilupearl
3 votes
1 answer
4k views

Running BERT on CPU instead of GPU

I am trying to execute BERT's run_classifier.py script from the terminal as below: python run_classifier.py --task_name=cola --do_predict=true --data_dir=<data-dir> --vocab_file=$BERT_BASE_DIR/...
Ashwin Geet D'Sa
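
A minimal sketch of the usual approach: hide the GPUs from TensorFlow before the script starts, so run_classifier.py falls back to the CPU.

    import os
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"   # no GPU is visible to TensorFlow

    # or, equivalently, from the shell:
    #   CUDA_VISIBLE_DEVICES=-1 python run_classifier.py --task_name=cola ...
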
3 votes
0 answers
855 views

Same input, same model, same weights but getting different results

I'm finetuning sentence-bert to do tasks like sentence cosine-similarity calculation in TensorFlow. I set up an encoder, let's say encoder1, using the code below: from sentence_transformers import ...
PlasticSaber
3 votes
2 answers
2k views

How to save and load a custom Siamese BERT model

I am following this tutorial on how to train a Siamese BERT network: https://keras.io/examples/nlp/semantic_similarity_with_bert/ All good, but I am not sure what the best way to save the model is ...
Carbo • 916
3 votes
0 answers
1k views

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/

n_epochs = 6 model = CNN_Text() loss_fn = nn.CrossEntropyLoss(reduction='sum') optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.001) model.cuda() # Load train ...
Tahir Ullah
3 votes
0 answers
708 views

I'm trying to load BERT "tfbert-large-uncased" but I got an error "Can't load config.json file"

I'm trying to load the pre-trained BERT model, but I'm getting an error while loading the tokenizer; it says config.json is not found. If anyone knows how to solve this issue, please help me. Model and path ...
iamhimanshu0
3 votes
0 answers
233 views

Name of training and test data files in NLP (BioBERT GitHub repo)

I'm reading the README.md file of the BioBERT GitHub repo: Let $NER_DIR indicate a folder for a single NER dataset which contains train_dev.tsv, train.tsv, devel.tsv and test.tsv. Also, set $...
Contestosis
3 votes
0 answers
2k views

Fail to run trainer.train() with huggingface transformer

I am trying to set up a TensorFlow fine-tuning framework for a question-answering project, using huggingface/transformers as the prototype, but I cannot run through the trainer. The experiment is ...
user3381299
3 votes
0 answers
2k views

Bert model giving CUDA out of memory error on google colab

I am using the following tutorial here to train and test a BERT sequence-classifier model on a dataset of documents of varying lengths (small (0-280), medium (280-10,000), large (10,000+)) on Google ...
user1365234
3 votes
1 answer
6k views

How to get the vocab file for Bert tokenizer from TF Hub

I'm trying to use BERT from TensorFlow Hub and build a tokenizer; this is what I'm doing: >>> import tensorflow_hub as hub >>> from bert.tokenization import FullTokenizer >>> ...
bachr • 5,898
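
A minimal sketch (TF2 + tensorflow_hub; the hub URL is illustrative): recent hub BERT layers expose the vocab file and casing flag on their resolved_object, which is enough to build a FullTokenizer.

    import tensorflow_hub as hub
    from bert.tokenization import FullTokenizer

    bert_layer = hub.KerasLayer(
        "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/3",
        trainable=False,
    )
    vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
    do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()
    tokenizer = FullTokenizer(vocab_file, do_lower_case)
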
2 votes
1 answer
7k views

tflite converter error operation not supported

I was trying to convert a .pb model of ALBERT to TFLite. I made the .pb model using https://github.com/google-research/albert in TF 1.15, and I used converter = tf.compat.v1.lite.TFLiteConverter....
Mid_gang
2 votes
1 answer
1k views

How does BERT utilize TPU memories?

The README in Google's BERT repo says that even a single sentence of length 512 cannot fit in a 12 GB Titan X for the BERT-Large model. But the BERT paper says 64 TPU chips were used to train BERT-...
soloice • 1,000
2 votes
1 answer
2k views

TensorFlow 2.4 NotFoundError: No algorithm worked! with Keras Conv1D layer

I've been looking for a solution to this error for days and can't find one: NotFoundError: 3 root error(s) found. (0) Not found: No algorithm worked! [[node model/conv1d/conv1d (...
Eduardo Watanabe
2 votes
3 answers
2k views

SimpleTransformers Error: VersionConflict: tokenizers==0.9.4? How do I fix this?

I'm trying to execute the simpletransformers example from their site on google colab. Example: from simpletransformers.classification import ClassificationModel, ClassificationArgs import pandas as pd ...
Reema Q Khan
2 votes
1 answer
815 views

BERT pre-training from scratch with tensorflow version 2.x

I used the run_pretraining.py script (https://github.com/google-research/bert/blob/master/run_pretraining.py) with TensorFlow version 1.15.5 before, and I use a Google Cloud TPU as well. Is it ...
hazal • 33
