Questions tagged [bert-language-model]

BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. BERT uses the Transformer architecture (an attention mechanism that learns contextual relations between words or subwords in a text) to generate a language model.

4 votes
1 answer
5k views

BertModel or BertForPreTraining

I want to use Bert only for embedding and use the Bert output as an input for a classification net that I will build from scratch. I am not sure if I want to do finetuning for the model. I think the ...
Amit S
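One common pattern for the question above is to load a plain BertModel as a frozen feature extractor and feed its [CLS] output into a separate classifier. A minimal sketch, assuming `transformers` and `torch` are installed; the model name and the classifier head are purely illustrative:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()  # no fine-tuning: keep BERT frozen and only train the head

inputs = tokenizer(["an example sentence"], return_tensors="pt",
                   padding=True, truncation=True)
with torch.no_grad():
    outputs = bert(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0, :]  # [batch, 768] [CLS] vector

# hypothetical downstream head trained from scratch on top of the embeddings
classifier = torch.nn.Sequential(
    torch.nn.Linear(768, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 2),
)
logits = classifier(cls_embedding)
```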
4 votes
2 answers
2k views

BERT get sentence level embedding after fine tuning

I came across this page. 1) I would like to get the sentence-level embedding (the embedding given by the [CLS] token) after fine-tuning is done. How could I do it? 2) I also noticed that the code on that ...
user2543622
4 votes
1 answer
2k views

Using Arabert model with SpaCy

SpaCy doesn't support the Arabic language, but Can I use SpaCy with the pretrained Arabert model? Is it possible to modify this code so it can accept bert-large-arabertv02 instead of en_core_web_lg? !...
Reem
4 votes
3 answers
5k views

How to apply max_length to truncate the token sequence from the left in a HuggingFace tokenizer?

In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ...
Ondrej Sotolar
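For the truncation question above, recent versions of the Hugging Face tokenizers expose a `truncation_side` attribute; setting it to "left" makes `max_length` truncation drop tokens from the start instead of the end. A minimal sketch (the model name is just an example):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.truncation_side = "left"  # drop excess tokens from the start

ids = tokenizer("a fairly long input sentence that will be cut",
                truncation=True, max_length=6).input_ids
print(tokenizer.convert_ids_to_tokens(ids))  # [CLS]/[SEP] are still added
```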
4 votes
1 answer
788 views

pytorch model evaluation slow when deployed on kubernetes

I would like to make the result of a text classification model (finBERT pytorch model) available through an endpoint that is deployed on Kubernetes. The whole pipeline is working but it's super slow ...
move_ludwig
4 votes
1 answer
4k views

Huggingface TFBertForSequenceClassification always predicts the same label

TL;DR: My model always predicts the same labels and I don't know why. Below is my entire code for fine-tuning in the hopes that someone can point out to me where I am going wrong. I am using ...
alxgal
4 votes
1 answer
4k views

Unsupervised finetuning of BERT for embeddings only?

I would like to fine-tune BERT for a specific domain on unlabeled data and use the output layer to check the similarity between texts. How can I do it? Do I need to first fine-tune a classifier ...
Q_Dbk
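A common recipe for the unsupervised fine-tuning question above is to continue masked-language-model training on the in-domain text and then reuse the encoder for embeddings; no classifier is needed. A rough sketch with the Trainer API, assuming a tokenized `datasets` dataset named `train_dataset` already exists (it is not defined here):

```python
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# randomly mask 15% of tokens and train the model to reconstruct them
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True,
                                           mlm_probability=0.15)
args = TrainingArguments(output_dir="domain-bert", num_train_epochs=1,
                         per_device_train_batch_size=16)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset,
                  data_collator=collator)
trainer.train()
# afterwards, reload the saved checkpoint with AutoModel to extract embeddings
```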
4 votes
1 answer
6k views

BERT outputs explained

The keys of the BERT encoder's output are default, encoder_outputs, pooled_output and sequence_output. As far as I know, encoder_outputs are the output of each encoder, pooled_output is the output ...
OK 400
4 votes
1 answer
763 views

Restrict Vocab for BERT Encoder-Decoder Text Generation

Is there any way to restrict the vocabulary of the decoder in a Huggingface BERT encoder-decoder model? I'd like to force the decoder to choose from a small vocabulary when generating text rather than ...
Joseph Harvey
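For the vocabulary-restriction question above, one option that `generate()` supports is `prefix_allowed_tokens_fn`, which limits which token ids may be produced at each decoding step. A sketch under that approach; the small word list and prompt are purely illustrative, and an untrained bert2bert decoder will of course not produce meaningful text:

```python
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased")
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# illustrative small target vocabulary; every step may only pick from these ids
allowed_ids = tokenizer.convert_tokens_to_ids(["yes", "no", "maybe"]) + [tokenizer.sep_token_id]

inputs = tokenizer("should the decoder be restricted?", return_tensors="pt")
output_ids = model.generate(
    inputs.input_ids,
    max_length=8,
    prefix_allowed_tokens_fn=lambda batch_id, generated: allowed_ids,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```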
4 votes
1 answer
2k views

Correct Way to Fine-Tune/Train HuggingFace's Model from scratch (PyTorch)

For example, I want to train a BERT model from scratch but using the existing configuration. Is the following code the correct way to do so? model = BertModel.from_pretrained('bert-base-cased') model....
Allan-J
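Regarding the from-scratch question above: `from_pretrained` loads the trained weights, so it is not training from scratch. To reuse only the architecture/configuration and start from random weights, instantiate the model from a config instead. A minimal sketch:

```python
from transformers import BertConfig, BertModel

# reuse the published architecture hyperparameters, but not the trained weights
config = BertConfig.from_pretrained("bert-base-cased")
model = BertModel(config)  # randomly initialised, ready to be pre-trained from scratch
```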
4 votes
1 answer
5k views

Why do we need state_dict = state_dict.copy()

I want to load the weights of a pre-trained model on my local model. I don’t understand why state_dict = state_dict.copy() is necessary if the two networks have the same name state_dict. # copy ...
dan
4 votes
2 answers
4k views

How to convert model.safetensor to pytorch_model.bin?

I'm fine-tuning a pre-trained BERT model and I have a weird problem: when I fine-tune using the CPU, the code saves the model like this: with the "pytorch_model.bin". But when I use ...
Gabriel Henrique
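For the safetensors question above, one straightforward conversion is to load the tensors with the `safetensors` package and re-save them with `torch.save`; alternatively, recent `transformers` versions accept `safe_serialization=False` in `save_pretrained`. A sketch, with the file paths as placeholders:

```python
import torch
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")   # read the safetensors checkpoint
torch.save(state_dict, "pytorch_model.bin")   # write the classic PyTorch format

# or, when saving directly from a transformers model:
# model.save_pretrained("out_dir", safe_serialization=False)
```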
4 votes
1 answer
2k views

How to stop data shuffling while training the HuggingFace BERT model?

I want to train a BERT transformer model using the HuggingFace implementation/library. During training, HuggingFace shuffles the training data for each epoch, but I don't want to shuffle the data. For ...
Nusrat Jahan
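For the shuffling question above, the Trainer builds its own DataLoader with a random sampler, so one workaround is to subclass it and return a sequential loader. A rough sketch under that assumption; it skips some conveniences of the stock dataloader (unused-column removal, distributed samplers), and internals can differ between `transformers` versions:

```python
import torch
from transformers import Trainer

class NoShuffleTrainer(Trainer):
    def get_train_dataloader(self):
        # keep the original order of the training set instead of shuffling each epoch
        return torch.utils.data.DataLoader(
            self.train_dataset,
            batch_size=self.args.train_batch_size,
            shuffle=False,
            collate_fn=self.data_collator,
        )
```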
4 votes
2 answers
6k views

Adding new tokens to BERT/RoBERTa while retaining tokenization of adjacent tokens

I'm trying to add some new tokens to BERT and RoBERTa tokenizers so that I can fine-tune the models on a new word. The idea is to fine-tune the models on a limited set of sentences with the new word, ...
Jigsaw
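The usual recipe for the new-token question above is to register the tokens with the tokenizer and then resize the model's embedding matrix so the new rows exist (they start randomly initialised and are learned during fine-tuning). A minimal sketch; the added word is illustrative:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

new_tokens = ["mynewword"]                      # illustrative new vocabulary item
num_added = tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))   # grow the embedding matrix to match

print(num_added, tokenizer.tokenize("this is mynewword in context"))
```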
4 votes
2 answers
3k views

Loading tf.keras model, ValueError: The two structures don't have the same nested structure

I created a tf.keras model that has BERT and I want to train and save it for further use. Loading this model is a big issue because I keep getting the error: ValueError: The two structures don't have the ...
Nadja
4 votes
1 answer
3k views

How to train a BERT model from scratch with Hugging Face?

I found an answer about training a model from scratch in this question: How to train BERT from scratch on a new domain for both MLM and NSP? One answer uses Trainer and TrainingArguments like this: from ...
Jack.Sparrow
4 votes
1 answer
1k views

How can I apply pruning on a BERT model?

I have trained a BERT model using ktrain (TensorFlow wrapper) to recognize emotion on text. It works, but it suffers from really slow inference. That makes my model not suitable for a production ...
Stamatis Tiniakos
4 votes
1 answer
2k views

PyTorch tokenizers: how to truncate tokens from left?

As we can see in the below code snippet, specifying max_length and truncation for a tokenizer cuts excess tokens from the left: tokenizer("hello, my name", truncation=True, max_length=6).input_ids ...
aayc
4 votes
1 answer
602 views

Training SVM classifier (word embeddings vs. sentence embeddings)

I want to experiment with different embeddings such as Word2Vec, ELMo, and BERT, but I'm a little confused about whether to use word embeddings or sentence embeddings, and why. I'm using the ...
NST
4 votes
1 answer
5k views

PyTorch GPU memory leak during inference

I am trying to encode documents sentence-wise with a huggingface transformer module. I'm using the very small google/bert_uncased_L-2_H-128_A-2 pretrained model with the following code: def ...
Marco Moldovan
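A frequent cause of the memory growth described above is running inference with autograd enabled and keeping GPU tensors alive; wrapping the forward pass in `torch.no_grad()` and moving results to the CPU usually fixes it. A sketch, assuming a CUDA device is available:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
model = AutoModel.from_pretrained("google/bert_uncased_L-2_H-128_A-2").cuda().eval()

def encode(sentences):
    embeddings = []
    for sent in sentences:
        inputs = tokenizer(sent, return_tensors="pt", truncation=True).to("cuda")
        with torch.no_grad():                   # do not build the autograd graph
            out = model(**inputs)
        # detach from the GPU so memory is released between sentences
        embeddings.append(out.last_hidden_state[:, 0, :].cpu())
    return torch.cat(embeddings)
```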
4 votes
1 answer
3k views

How to process TransformerEncoderLayer output in pytorch

I am trying to use bio-bert sentence embeddings for text classification of longer pieces of text. As it currently stands I standardize the number of sentences in each piece of text (some sentences are ...
Wackaman
4 votes
1 answer
3k views

Tensorflow BERT for token-classification - exclude pad-tokens from accuracy while training and testing

I'm doing token-based classification using the pre-trained BERT-model for tensorflow to automatically label cause and effects in sentences. To access BERT, I'm using the TFBertForTokenClassification-...
user3228384
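For the padding question above, one way is a custom metric that only counts positions whose label is not the padding/ignore value (assumed here to be -100, as produced by the usual label-alignment code). A sketch in TensorFlow:

```python
import tensorflow as tf

def masked_accuracy(y_true, y_pred):
    """Token accuracy that ignores positions labelled -100 (padding/special tokens)."""
    preds = tf.argmax(y_pred, axis=-1, output_type=tf.int64)
    labels = tf.cast(y_true, tf.int64)
    mask = tf.not_equal(labels, -100)
    correct = tf.logical_and(tf.equal(preds, labels), mask)
    return tf.reduce_sum(tf.cast(correct, tf.float32)) / tf.reduce_sum(tf.cast(mask, tf.float32))

# model.compile(optimizer=..., loss=..., metrics=[masked_accuracy])
```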
4 votes
1 answer
1k views

Error: Inferring the task automatically requires to check the hub with a model_id defined as a `str`. AraBERT model

I'm training a transformer model by regular training as described in this notebook to classify the questions with their expected answer class. After training the model, I want to see the predictions ...
RJ94
4 votes
1 answer
7k views

There appear to be 1 leaked semaphore objects to clean up at shutdown

I am using macOS and used the DistilBert model with Sentence Transformers for a chatbot implementation and generated the API in VS Code. But after giving 3 inputs it pops up this error: UserWarning: ...
Tejas Sutar
4 votes
2 answers
820 views

Why are models such as BERT or GPT-3 considered unsupervised learning during pre-training when there is an output (label)

I am not very experienced with unsupervised learning, but my general understanding is that in unsupervised learning, the model learns without there being an output. However, during pre-training in ...
danielkim9
4 votes
1 answer
12k views

HuggingFace Bert Sentiment analysis

I am getting the following error : AssertionError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples)., ...
paris
4 votes
1 answer
2k views

How to access BERT intermediate layer outputs in TF Hub Module?

Does anybody know a way to access the outputs of the intermediate layers from BERT's hosted models on Tensorflow Hub? The model is hosted here. I have explored the meta graph and found the only ...
AlexDelPiero
4 votes
0 answers
2k views

ValueError: Exception encountered when calling layer "tf_bert_for_sequence_classification" (type TFBertForSequenceClassification)

train = df2[:25] test = df2[25:] def convert_data_to_examples(train, test, text, Airline_Cat): train_InputExamples = train.apply(lambda x: InputExample(guid=None, ...
Nandhini Palanikumar
4 votes
1 answer
1k views

Dutch sentiment analysis RobBERT

I have a question about Dutch sentiment analysis in Python. For a project at school I want to analyse the sentiment of a Dutch interview. I have worked with Vader but that doesn't work in Dutch. So I ...
Niels
4 votes
1 answer
4k views

cannot import name 'TrainingArguments' from 'transformers'

I am trying to fine-tune a pretrained huggingface BERT model. I am importing the following from transformers import (AutoTokenizer, AutoConfig, ...
jacqui_suis
4 votes
0 answers
841 views

max_steps and generative dataset huggingface

I am fine tuning a model on my domain using both MLM and NSP. I am using the TextDatasetForNextSentencePrediction for NSP and DataCollatorForLanguageModeling for MLM. The problem is with ...
Prasanna
4 votes
0 answers
748 views

HuggingFace BertForMaskedLM: Expected input batch_size (3200) to match target batch_size (16)

I'm working on multiclass classification (Bengali language sentiment analysis) with a pretrained Hugging Face (BertForMaskedLM) model. When the error occurred I knew I had to change the label (output) ...
epitope21
4 votes
1 answer
2k views

How to build a dataset for language modeling with the datasets library as with the old TextDataset from the transformers library

I am trying to load a custom dataset that I will then use for language modeling. The dataset consists of a text file that has a whole document in each line, meaning that each line exceeds the ...
Daniel Díez
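For the question above, the `datasets` library can load a line-per-document text file directly, and tokenization is then applied with `map`, which roughly replaces the old TextDataset. A sketch, with the corpus file name as a placeholder:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
raw = load_dataset("text", data_files={"train": "corpus.txt"})  # one document per line

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
# 'tokenized' can now be passed to Trainer together with DataCollatorForLanguageModeling
```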
4 votes
0 answers
469 views

How to train a Masked Language Model with a big text corpus(200GB) using PyTorch?

Recently I have been training a masked language model with a big text corpus (200GB) using transformers. The training data is too big to fit into a machine equipped with 512GB of memory and V100(32GB)*8. Is it ...
Chirs
4 votes
0 answers
1k views

Word embeddings with BERT and map tensors to words

I am trying to aggregate BERT embeddings at the token level. For each token in the corpus vocabulary, I would like to create a list of all its contextual embeddings and average them to get one ...
Andrej
4 votes
0 answers
4k views

PCA on BERT word embeddings

I am trying to take a set of sentences that use multiple meanings of the word "duck", and compute the word embeddings of each "duck" using BERT. Each word embedding is a vector of 768 elements, ...
Nisha Prabhakar
4 votes
0 answers
285 views

How to handle text classification model that gives few results with higher confidence to wrong category?

I had a dataset of 15k records. I trained the model using the ktrain package and a 'bert' model with 5k samples. The train-test split is 70-30% and the test results gave me accuracy and F1 scores of 93-94%. ...
Giri Sai Ram
4 votes
0 answers
199 views

How to create iob tags for a sentence?

I have a dataset for NER in which I have to do POS tagging and IOB tagging, but I don't understand the concept or method of how IOB tags are created. Even CoNLL is pre-tagged.
Umang Bhalani
3 votes
1 answer
10k views

Huggingface's BERT tokenizer not adding pad token

It's not entirely clear from the documentation, but I can see that BertTokenizer is initialised with pad_token='[PAD]', so I assume when you encode with add_special_tokens=True then it would ...
doctopus
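On the padding question above: the [PAD] token is registered, but padding is only applied when the call asks for it; add_special_tokens=True only inserts [CLS]/[SEP]. A minimal sketch:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# no padding happens by default, even though pad_token='[PAD]' is set
print(tokenizer("short text").input_ids)

# request it explicitly, either to a fixed length...
print(tokenizer("short text", padding="max_length", max_length=12).input_ids)

# ...or to the longest sequence in a batch
print(tokenizer(["short text", "a somewhat longer text"], padding=True).input_ids)
```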
3 votes
3 answers
7k views

Removal of Stop Words and Stemming/Lemmatization for BERTopic

For topic modelling, I'm trying out BERTopic: Link I'm a little confused here; I am trying out BERTopic on my custom dataset. Since BERT was trained in such a way that it holds the semantic ...
WarlockQ
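For the BERTopic question above, the usual advice is to leave the documents untouched for the embedding step and remove stop words only in the topic-representation step, via a custom CountVectorizer. A sketch, assuming `bertopic` and scikit-learn are installed and `docs` is an existing list of strings:

```python
from bertopic import BERTopic
from sklearn.feature_extraction.text import CountVectorizer

# stop-word removal affects only the topic keywords, not the BERT embeddings
vectorizer_model = CountVectorizer(stop_words="english")
topic_model = BERTopic(vectorizer_model=vectorizer_model)

topics, probs = topic_model.fit_transform(docs)  # docs: list[str], assumed to exist
```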
3 votes
3 answers
6k views

Tensorflow 2.X Error - Op type not registered 'CaseFoldUTF8' in binary running on Colab

I have been using BERT encoder from the Tensorflow hub for quite sometime now. Here are the syntaxes: tfhub_handle_encoder = "https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/4" ...
Yogesh_25
3 votes
3 answers
1k views

String comparison with BERT seems to ignore "not" in sentence

I implemented a string comparison method using SentenceTransformers and BERT like the following: from sentence_transformers import SentenceTransformer from sklearn.metrics.pairwise import cosine_similarity ...
Tiago Bachiega de Almeida
3 votes
1 answer
4k views

Applying LIME interpretation on my fine-tuned BERT for sequence classification model?

I fine-tuned BertForSequenceClassification on a specific task, and I want to apply LIME interpretation to see how each token contributes to the classification of a specific label, since LIME handles the classifier as ...
Eliza William
3 votes
4 answers
18k views

Cannot import BertModel from transformers

I am trying to import BertModel from transformers, but it fails. This is code I am using from transformers import BertModel, BertForMaskedLM This is the error I get ImportError: cannot import name '...
Moaz Mohammed Husain
3 votes
2 answers
7k views

Having 6 labels instead of 2 in Hugging Face BertForSequenceClassification

I was just wondering if it is possible to extend the Hugging Face BertForSequenceClassification model to more than 2 labels. The docs say we can pass positional arguments, but it seems like "labels" ...
Alex
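For the six-label question above, the number of output classes is a keyword argument of `from_pretrained`, not something fixed at two. A minimal sketch:

```python
from transformers import BertForSequenceClassification

# the classification head is created with 6 outputs instead of the default 2
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=6)
```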
3 votes
2 answers
2k views

Are the pre-trained layers of the Huggingface BERT models frozen?

I use the following classification model from Huggingface: model = AutoModelForSequenceClassification.from_pretrained("dbmdz/bert-base-german-cased", num_labels=2).to(device) As I ...
Theodor Peifer
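To answer the freezing question above in code: the pre-trained layers are not frozen by default, every parameter receives gradients during fine-tuning, and freezing the encoder has to be done explicitly. A sketch:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-german-cased", num_labels=2)

# optionally freeze the pre-trained encoder so only the classification head trains
for param in model.bert.parameters():
    param.requires_grad = False

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the classifier weights remain trainable
```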
3 votes
2 answers
3k views

PipelineException: No mask_token ([MASK]) found on the input

I am getting this error "PipelineException: No mask_token ([MASK]) found on the input" when I run this line. fill_mask("Auto Car .") I am running it on Colab. My Code: from ...
Naqi
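The pipeline error above means the input string contains no mask token; the fix is to include the tokenizer's mask token in the text. A minimal sketch:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# the input must contain the model's mask token, e.g. "[MASK]" for BERT
text = f"Auto Car {fill_mask.tokenizer.mask_token}."
for prediction in fill_mask(text):
    print(prediction["token_str"], prediction["score"])
```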
3 votes
1 answer
6k views

Tokens returned in transformers Bert model from encode()

I have a small dataset for sentiment analysis. The classifier will be a simple KNN but I wanted to get the word embedding with the Bert model from the transformers library. Note that I just found out ...
Edv Beq
3 votes
3 answers
5k views

what is the difference between pooled output and sequence output in bert layer?

everyone! I was reading about Bert and wanted to do text classification with its word embeddings. I came across this line of code: pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
mitra mirshafiee
3 votes
2 answers
2k views

Where can I get the pretrained word embeddinngs for BERT?

I know that BERT has total vocabulary size of 30522 which contains some words and subwords. I want to get the initial input embeddings of BERT. So, my requirement is to get the table of size [30522, ...
Ruchit
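For the embedding-table question above, the initial (non-contextual) input embeddings are simply the weight matrix of the model's embedding layer, which can be read directly. A minimal sketch:

```python
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
embedding_matrix = model.get_input_embeddings().weight  # shape: [30522, 768]
print(embedding_matrix.shape)
```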
