Questions tagged [bert-language-model]
BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. BERT uses Transformers (an attention mechanism that learns contextual relations between words or subwords in a text) to generate a language model.
1,803
questions
4
votes
1
answer
5k
views
BertModel or BertForPreTraining
I want to use BERT only for embeddings and use the BERT output as input to a classification net that I will build from scratch.
I am not sure whether I want to fine-tune the model.
I think the ...
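A minimal sketch of that setup, assuming bert-base-uncased and PyTorch, with BERT frozen so only the custom head trains (skip the freezing if you later decide to fine-tune):

# Use a frozen BertModel as a feature extractor and feed its [CLS] vector
# into a small classifier built from scratch.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

# Freeze BERT so only the custom head is trained.
for param in bert.parameters():
    param.requires_grad = False

classifier = nn.Sequential(
    nn.Linear(bert.config.hidden_size, 256),
    nn.ReLU(),
    nn.Linear(256, 2),           # 2 target classes assumed
)

inputs = tokenizer("an example sentence", return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)
cls_embedding = outputs.last_hidden_state[:, 0, :]   # [CLS] token vector
logits = classifier(cls_embedding)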
4
votes
2
answers
2k
views
BERT get sentence level embedding after fine tuning
I came across this page
1) I would like to get sentence level embedding (embedding given by [CLS] token) after the fine tuning is done. How could I do it?
2) I also noticed that the code on that ...
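A minimal sketch for point 1, assuming the fine-tuned checkpoint was saved with save_pretrained to a local path (the path below is a placeholder):

# Pull the [CLS] sentence embedding back out of the fine-tuned encoder.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("path/to/finetuned-model")
model = BertModel.from_pretrained("path/to/finetuned-model")
model.eval()

inputs = tokenizer("sentence to embed", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0, :]  # raw [CLS] hidden state
pooled = outputs.pooler_output                      # [CLS] after the tanh pooler layer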
4
votes
1
answer
2k
views
Using Arabert model with SpaCy
SpaCy doesn't support the Arabic language, but Can I use SpaCy with the pretrained Arabert model?
Is it possible to modify this code so it can accept bert-large-arabertv02 instead of en_core_web_lg?
!...
4
votes
3
answers
5k
views
How to apply max_length to truncate the token sequence from the left in a HuggingFace tokenizer?
In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ...
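A sketch of left-side truncation; recent versions of transformers expose truncation_side on the tokenizer (the model name and max_length are assumptions):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.truncation_side = "left"   # keep the *end* of long sequences

encoded = tokenizer(
    "a very long input that should lose tokens from its beginning",
    truncation=True,
    max_length=8,        # includes the [CLS] and [SEP] special tokens
)
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))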
4
votes
1
answer
788
views
pytorch model evaluation slow when deployed on kubernetes
I would like to make the result of a text classification model (finBERT pytorch model) available through an endpoint that is deployed on Kubernetes.
The whole pipeline is working but it's super slow ...
4
votes
1
answer
4k
views
Huggingface TFBertForSequenceClassification always predicts the same label
TL;DR:
My model always predicts the same labels and I don't know why. Below is my entire code for fine-tuning in the hopes that someone can point out to me where I am going wrong.
I am using ...
4
votes
1
answer
4k
views
Unsupervised finetuning of BERT for embeddings only?
I would like to fine-tune BERT for a specific domain on unlabeled data and use the output layer to check the similarity between texts. How can I do it? Do I need to first fine-tune a classifier ...
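One common recipe (a sketch, not the only answer): continue masked-language-model training on the unlabeled domain text, then use the resulting encoder's hidden states as embeddings. The file name and hyperparameters are assumptions:

from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir="bert-domain-mlm", num_train_epochs=1,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
# Afterwards, load the saved checkpoint with AutoModel and use its hidden
# states (e.g. mean-pooled) as embeddings for similarity.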
4
votes
1
answer
6k
views
BERT outputs explained
The keys of the BERT encoder's output are default, encoder_outputs, pooled_output and sequence_output
As far as I know, encoder_outputs are the output of each encoder, pooled_output is the output ...
4
votes
1
answer
763
views
Restrict Vocab for BERT Encoder-Decoder Text Generation
Is there any way to restrict the vocabulary of the decoder in a Huggingface BERT encoder-decoder model? I'd like to force the decoder to choose from a small vocabulary when generating text rather than ...
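A hedged sketch of one way to do this: generate() accepts a prefix_allowed_tokens_fn hook that limits the candidate token ids at every decoding step. The model pairing and the allowed-word list are assumptions:

from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased")

allowed_words = ["yes", "no", "maybe"]
allowed_ids = set(tokenizer.convert_tokens_to_ids(allowed_words))
allowed_ids.update(tokenizer.all_special_ids)   # let generation start/stop normally

def restrict_vocab(batch_id, input_ids):
    # Same restricted candidate set at every step.
    return list(allowed_ids)

inputs = tokenizer("should we deploy on friday?", return_tensors="pt")
out = model.generate(inputs.input_ids,
                     decoder_start_token_id=tokenizer.cls_token_id,
                     prefix_allowed_tokens_fn=restrict_vocab,
                     max_length=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))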
4
votes
1
answer
2k
views
Correct Way to Fine-Tune/Train HuggingFace's Model from scratch (PyTorch)
For example, I want to train a BERT model from scratch but using the existing configuration. Is the following code the correct way to do so?
model = BertModel.from_pretrained('bert-base-cased')
model....
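A short sketch of the distinction, assuming bert-base-cased: from_pretrained loads the trained weights, while instantiating from a config gives the same architecture with random weights.

from transformers import BertConfig, BertModel

config = BertConfig.from_pretrained("bert-base-cased")  # architecture only
model = BertModel(config)                               # random weights, no pre-training
# model = BertModel.from_pretrained("bert-base-cased")  # this reuses the trained weights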
4
votes
1
answer
5k
views
Why do we need state_dict = state_dict.copy()
I want to load the weights of a pre-trained model into my local model. I don’t understand why state_dict = state_dict.copy() is necessary if the two networks use the same name, state_dict.
# copy ...
4
votes
2
answers
4k
views
How to convert model.safetensor to pytorch_model.bin?
I'm fine-tuning a pre-trained BERT model and I have a weird problem:
When I'm fine-tuning using the CPU, the code saves the model like this:
With the "pytorch_model.bin". But when I use ...
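A sketch of one conversion route, assuming the file names below:

import torch
from safetensors.torch import load_file

# Load the safetensors state dict and re-save it in the legacy .bin format.
state_dict = load_file("model.safetensors")
torch.save(state_dict, "pytorch_model.bin")

# Alternatively, reload the checkpoint and save it again with
# model.save_pretrained("out_dir", safe_serialization=False)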
4
votes
1
answer
2k
views
How to stop data shuffling while training the HuggingFace BERT model?
I want to train a BERT transformer model using the HuggingFace implementation/library. During training, HuggingFace shuffles the training data for each epoch, but I don't want to shuffle the data. For ...
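A hedged workaround sketch: Trainer shuffles through its train sampler, so overriding the (private) sampler hook with a SequentialSampler keeps the original order. This relies on Trainer internals that may change between versions:

from torch.utils.data import SequentialSampler
from transformers import Trainer

class NoShuffleTrainer(Trainer):
    def _get_train_sampler(self):
        # Iterate the training set in its original order every epoch.
        return SequentialSampler(self.train_dataset)

# trainer = NoShuffleTrainer(model=model, args=training_args, train_dataset=train_ds)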
4
votes
2
answers
6k
views
Adding new tokens to BERT/RoBERTa while retaining tokenization of adjacent tokens
I'm trying to add some new tokens to BERT and RoBERTa tokenizers so that I can fine-tune the models on a new word. The idea is to fine-tune the models on a limited set of sentences with the new word, ...
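A sketch of the basic mechanics, assuming roberta-base and a made-up word; the adjacent-token spacing issue raised in the question may additionally need a tokenizers.AddedToken with lstrip/rstrip options:

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

num_added = tokenizer.add_tokens(["mynewword"])          # register the new word
model.resize_token_embeddings(len(tokenizer))            # grow the embedding matrix

print(tokenizer.tokenize("I like mynewword a lot"))      # the new word stays a single token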
4
votes
2
answers
3k
views
Loading tf.keras model, ValueError: The two structures don't have the same nested structure
I created a tf.keras model that has BERT and I want to train and save it for further use.
Loading this model is a big issue because I keep getting the error: ValueError: The two structures don't have the ...
4
votes
1
answer
3k
views
How to train a BERT model from scratch with Hugging Face?
I found an answer about training a model from scratch in this question:
How to train BERT from scratch on a new domain for both MLM and NSP?
One answer uses Trainer and TrainingArguments like this:
from ...
4
votes
1
answer
1k
views
How can I apply pruning on a BERT model?
I have trained a BERT model using ktrain (TensorFlow wrapper) to recognize emotion on text. It works, but it suffers from really slow inference. That makes my model not suitable for a production ...
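A sketch using PyTorch's built-in pruning utilities (the 30% ratio is an assumption); note that unstructured pruning alone rarely speeds up dense inference, so quantization or distillation are often tried as well:

import torch.nn as nn
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 30% of weights with the smallest magnitude in each linear layer.
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")   # make the pruning permanent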
4
votes
1
answer
2k
views
PyTorch tokenizers: how to truncate tokens from left?
As we can see in the below code snippet, specifying max_length and truncation for a tokenizer cuts excess tokens from the left:
tokenizer("hello, my name", truncation=True, max_length=6).input_ids
...
4
votes
1
answer
602
views
Training SVM classifier (word embeddings vs. sentence embeddings)
I want to experiment with different embeddings such as Word2Vec, ELMo, and BERT, but I'm a little confused about whether to use word embeddings or sentence embeddings, and why. I'm using the ...
4
votes
1
answer
5k
views
PyTorch GPU memory leak during inference
I am trying to encode documents sentence-wise with a huggingface transformer module. I'm using the very small google/bert_uncased_L-2_H-128_A-2 pretrained model with the following code:
def ...
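A sketch of the usual mitigation, assuming the same checkpoint: run inference under torch.no_grad() and move each result to the CPU so no graph-attached GPU tensors accumulate:

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
model = AutoModel.from_pretrained("google/bert_uncased_L-2_H-128_A-2").cuda().eval()

def encode(sentences):
    embeddings = []
    with torch.no_grad():                       # no autograd graph is kept
        for sent in sentences:
            inputs = tokenizer(sent, return_tensors="pt").to("cuda")
            out = model(**inputs)
            embeddings.append(out.last_hidden_state[:, 0, :].cpu())  # move off the GPU
    return torch.cat(embeddings)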
4
votes
1
answer
3k
views
How to process TransformerEncoderLayer output in pytorch
I am trying to use bio-bert sentence embeddings for text classification of longer pieces of text.
As it currently stands I standardize the number of sentences in each piece of text (some sentences are ...
4
votes
1
answer
3k
views
Tensorflow BERT for token-classification - exclude pad-tokens from accuracy while training and testing
I'm doing token-based classification using the pre-trained BERT model for TensorFlow to automatically label causes and effects in sentences.
To access BERT, I'm using the TFBertForTokenClassification-...
4
votes
1
answer
1k
views
Error: Inferring the task automatically requires to check the hub with a model_id defined as a `str`. AraBERT model
I'm training a transformer model by regular training as described in this notebook to classify the questions with their expected answer class.
After training the model, I want to see the predictions ...
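A sketch of the usual fix: pass the task name and the loaded model/tokenizer to pipeline explicitly instead of letting it infer the task (the checkpoint path is a placeholder):

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model = AutoModelForSequenceClassification.from_pretrained("path/to/finetuned-arabert")
tokenizer = AutoTokenizer.from_pretrained("path/to/finetuned-arabert")

clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(clf("some question text"))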
4
votes
1
answer
7k
views
There appear to be 1 leaked semaphore objects to clean up at shutdown
I am using macOS and a DistilBERT model with Sentence Transformers for a chatbot implementation, and generated the API in VS Code.
But after giving 3 inputs it pops up this error:
UserWarning: ...
4
votes
2
answers
820
views
Why are models such as BERT or GPT-3 considered unsupervised learning during pre-training when there is an output (label)
I am not very experienced with unsupervised learning, but my general understanding is that in unsupervised learning, the model learns without there being an output. However, during pre-training in ...
4
votes
1
answer
12k
views
HuggingFace Bert Sentiment analysis
I am getting the following error:
AssertionError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples)., ...
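A sketch of the typical cause and fix: the tokenizer was given a pandas Series (possibly containing NaN) instead of a str or list of str. The column and model names are assumptions:

import pandas as pd
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
df = pd.DataFrame({"review": ["great product", "terrible service"]})

texts = df["review"].astype(str).tolist()      # plain Python list of strings
encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")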
4
votes
1
answer
2k
views
How to access BERT intermediate layer outputs in TF Hub Module?
Does anybody know a way to access the outputs of the intermediate layers from BERT's hosted models on Tensorflow Hub?
The model is hosted here. I have explored the meta graph and found the only ...
4
votes
0
answers
2k
views
ValueError: Exception encountered when calling layer "tf_bert_for_sequence_classification" (type TFBertForSequenceClassification)
train = df2[:25]
test = df2[25:]
def convert_data_to_examples(train, test, text, Airline_Cat):
train_InputExamples = train.apply(lambda x: InputExample(guid=None,
...
4
votes
1
answer
1k
views
Dutch sentiment analysis RobBERT
I have a question about Dutch sentiment analysis in Python. For a project at school I want to analyse the sentiment of a Dutch interview. I have worked with Vader but that doesn't work in Dutch. So I ...
4
votes
1
answer
4k
views
cannot import name 'TrainingArguments' from 'transformers'
I am trying to fine-tune a pretrained huggingface BERT model. I am importing the following
from transformers import (AutoTokenizer, AutoConfig,
...
4
votes
0
answers
841
views
max_steps and generative dataset huggingface
I am fine-tuning a model on my domain using both MLM and NSP. I am using TextDatasetForNextSentencePrediction for NSP and DataCollatorForLanguageModeling for MLM.
The problem is with ...
4
votes
0
answers
748
views
HuggingFace BertForMaskedLM: Expected input batch_size (3200) to match target batch_size (16)
I'm working on multiclass classification (Bengali language sentiment analysis) with a pretrained Hugging Face (BertForMaskedLM) model.
When the error occurred I knew I had to change the label (output) ...
4
votes
1
answer
2k
views
How to build a dataset for language modeling with the datasets library as with the old TextDataset from the transformers library
I am trying to load a custom dataset that I will then use for language modeling. The dataset consists of a text file that has a whole document in each line, meaning that each line exceeds the ...
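A rough equivalent of the old TextDataset using the datasets library (a sketch; the file name, max_length, and column names are assumptions):

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
raw = load_dataset("text", data_files={"train": "corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512,
                     return_special_tokens_mask=True)

lm_dataset = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
# lm_dataset can now be passed to Trainer together with
# DataCollatorForLanguageModeling for masked-LM training.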
4
votes
0
answers
469
views
How to train a Masked Language Model with a big text corpus(200GB) using PyTorch?
Recently I have been training a masked language model with a big text corpus (200GB) using transformers. The training data is too big to fit into a machine equipped with 512GB of memory and 8 V100 (32GB) GPUs. Is it ...
4
votes
0
answers
1k
views
Word embeddings with BERT and map tensors to words
I am trying to aggregate BERT embeddings at the token level. For each token in the corpus vocabulary, I would like to create a list of all its contextual embeddings and average them to get one ...
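A sketch of one way to do the aggregation, assuming bert-base-uncased and a toy corpus; subword pieces are treated as separate vocabulary entries here:

from collections import defaultdict
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

sums, counts = defaultdict(lambda: 0.0), defaultdict(int)
corpus = ["the bank raised rates", "she sat on the river bank"]

with torch.no_grad():
    for sentence in corpus:
        enc = tokenizer(sentence, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, hidden)
        for tok_id, vec in zip(enc["input_ids"][0].tolist(), hidden):
            sums[tok_id] = sums[tok_id] + vec                # running sum per token id
            counts[tok_id] += 1

averaged = {tokenizer.convert_ids_to_tokens(i): sums[i] / counts[i] for i in sums}
print(averaged["bank"].shape)   # torch.Size([768])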
4
votes
0
answers
4k
views
PCA on BERT word embeddings
I am trying to take a set of sentences that use multiple meanings of the word "duck", and compute the word embedding of each "duck" using BERT. Each word embedding is a vector of 768 elements, ...
4
votes
0
answers
285
views
How to handle text classification model that gives few results with higher confidence to wrong category?
I had a dataset of 15k records. I trained the model using the ktrain package and a 'bert' model with 5k samples. The train-test split is 70-30% and the test results gave me accuracy and F1 scores of 93-94%. ...
4
votes
0
answers
199
views
How to create iob tags for a sentence?
I have a dataset for NER in which I have to do POS tagging and IOB tagging, but I don't understand the concept or method of how IOB tags are created. Even CoNLL is pre-tagged.
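A toy sketch of how IOB/BIO tags are usually derived from character-level entity-span annotations (whitespace tokenization assumed):

def iob_tags(sentence, spans):
    """spans: list of (start_char, end_char, label) tuples."""
    tags, position = [], 0
    for token in sentence.split():
        start = sentence.index(token, position)
        end = start + len(token)
        position = end
        tag = "O"                                   # default: outside any entity
        for span_start, span_end, label in spans:
            if start >= span_start and end <= span_end:
                tag = ("B-" if start == span_start else "I-") + label
                break
        tags.append(tag)
    return tags

print(iob_tags("Angela Merkel visited Paris",
               [(0, 13, "PER"), (22, 27, "LOC")]))
# ['B-PER', 'I-PER', 'O', 'B-LOC']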
3
votes
1
answer
10k
views
Huggingface's BERT tokenizer not adding pad token
It's not entirely clear from the documentation, but I can see that BertTokenizer is initialised with pad_token='[PAD]', so I assume when you encode with add_special_tokens=True then it would ...
3
votes
3
answers
7k
views
Removal of Stop Words and Stemming/Lemmatization for BERTopic
For topic modelling, I'm trying out BERTopic: Link
I'm a little confused here; I am trying out BERTopic on my custom dataset.
Since BERT was trained in such a way that it holds the semantic ...
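A hedged sketch: with BERTopic, stop words are usually handled at the topic-representation stage via the vectorizer, rather than by stripping them from the documents the embeddings are computed from. The parameters below are assumptions:

from bertopic import BERTopic
from sklearn.feature_extraction.text import CountVectorizer

# Stop words are removed only when building topic keywords, not from the embeddings.
vectorizer_model = CountVectorizer(stop_words="english", ngram_range=(1, 2))
topic_model = BERTopic(vectorizer_model=vectorizer_model)

# topics, probs = topic_model.fit_transform(docs)   # docs: list of raw strings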
3
votes
3
answers
6k
views
Tensorflow 2.X Error - Op type not registered 'CaseFoldUTF8' in binary running on Colab
I have been using the BERT encoder from TensorFlow Hub for quite some time now. Here are the syntaxes:
tfhub_handle_encoder = "https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/4" ...
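A sketch of the usual fix: CaseFoldUTF8 is registered by the tensorflow_text package, so install a version matching TensorFlow and import it before loading the hub models (the preprocessor handle is an assumption; the encoder handle is the one from the question):

# pip install tensorflow-text==<same minor version as tensorflow>
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401  (registers the custom ops, including CaseFoldUTF8)

preprocessor = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_multi_cased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/4")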
3
votes
3
answers
1k
views
String comparison with BERT seems to ignore "not" in sentence
I implemented a string comparison method using SentenceTransformers and BERT, as follows:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
...
3
votes
1
answer
4k
views
Applying LIME interpretation on my fine-tuned BERT for sequence classification model?
I fine-tuned BertForSequenceClassification on a specific task, and I want to apply LIME interpretation to see how each token contributes to the predicted label, since LIME handles the classifier as ...
3
votes
4
answers
18k
views
Cannot import BertModel from transformers
I am trying to import BertModel from transformers, but it fails. This is the code I am using:
from transformers import BertModel, BertForMaskedLM
This is the error I get
ImportError: cannot import name '...
3
votes
2
answers
7k
views
Having 6 labels instead of 2 in Hugging Face BertForSequenceClassification
I was just wondering if it is possible to extend the HuggingFace BertForSequenceClassification model to more than 2 labels. The docs say we can pass positional arguments, but it seems like "labels" ...
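A short sketch: the number of labels is just a config argument, not limited to 2 (bert-base-uncased assumed):

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=6)
print(model.classifier.out_features)   # 6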
3
votes
2
answers
2k
views
Are the pre-trained layers of the Huggingface BERT models frozen?
I use the following classification model from Huggingface:
model = AutoModelForSequenceClassification.from_pretrained("dbmdz/bert-base-german-cased", num_labels=2).to(device)
As I ...
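A short sketch of how to check and, if desired, freeze the encoder; by default nothing is frozen:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "dbmdz/bert-base-german-cased", num_labels=2)

print(all(p.requires_grad for p in model.bert.parameters()))   # True: nothing is frozen

for param in model.bert.parameters():       # optional: freeze the pre-trained encoder
    param.requires_grad = False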
3
votes
2
answers
3k
views
PipelineException: No mask_token ([MASK]) found on the input
I am getting this error "PipelineException: No mask_token ([MASK]) found on the input"
when I run this line.
fill_mask("Auto Car .")
I am running it on Colab.
My Code:
from ...
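A sketch of the usual fix: the input must literally contain the tokenizer's mask token, and using tokenizer.mask_token avoids hard-coding [MASK] vs <mask> (bert-base-uncased assumed):

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask(f"Auto Car {fill_mask.tokenizer.mask_token}."))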
3
votes
1
answer
6k
views
Tokens returned in transformers Bert model from encode()
I have a small dataset for sentiment analysis. The classifier will be a simple KNN but I wanted to get the word embedding with the Bert model from the transformers library. Note that I just found out ...
3
votes
3
answers
5k
views
what is the difference between pooled output and sequence output in bert layer?
Hi everyone! I was reading about BERT and wanted to do text classification with its word embeddings. I came across this line of code:
pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
3
votes
2
answers
2k
views
Where can I get the pretrained word embeddinngs for BERT?
I know that BERT has a total vocabulary size of 30522, which contains words and subwords. I want to get the initial input embeddings of BERT. So, my requirement is to get the table of size [30522, ...
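A short sketch, assuming bert-base-uncased: the initial (non-contextual) input embeddings are the model's word-embedding matrix:

from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
embedding_table = model.get_input_embeddings().weight    # shape: (30522, 768)
print(embedding_table.shape)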