All Questions

50 votes
10 answers
125k views

CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

I got the following error when I ran my PyTorch deep learning model in Google Colab: /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias) 1370 ret = ...
Mr. NLP • 971
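
A debugging sketch for this error (the root cause here is an assumption, not confirmed by the question): CUBLAS_STATUS_ALLOC_FAILED frequently masks an earlier asynchronous CUDA error, such as a label id that is out of range for the classifier head. Forcing synchronous kernel launches usually surfaces the real stack trace.

```python
# Assumed cause: a label id >= num_labels reaching the loss on GPU.
# Synchronous launches make the real exception visible.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # set before any CUDA work

import torch
from transformers import BertForSequenceClassification

# num_labels=5 is a placeholder; it must equal (largest label id + 1).
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=5)

labels = torch.tensor([0, 1, 4])  # example labels
assert labels.max().item() < model.config.num_labels, "label out of range"
```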
39 votes
3 answers
36k views

dropout(): argument 'input' (position 1) must be Tensor, not str when using Bert with Huggingface

My code was working fine, and when I tried to run it today without changing anything I got the following error: dropout(): argument 'input' (position 1) must be Tensor, not str. Would appreciate if ...
Tashinga Musanhu
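
For context, this error typically appears after upgrading to transformers v4, where models return a ModelOutput mapping by default instead of a tuple; tuple-unpacking a mapping yields its string keys, which then reach dropout() as str. A minimal sketch of both fixes:

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("hello world", return_tensors="pt")

# Fix 1: use the named field of the output object.
outputs = model(**inputs)
hidden = outputs.last_hidden_state

# Fix 2: restore the old tuple return type.
hidden, pooled = model(**inputs, return_dict=False)
```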
24 votes
1 answer
49k views

How do the max_length, padding and truncation arguments work in HuggingFace's BertTokenizerFast.from_pretrained('bert-base-uncased')?

I am working on a Text Classification problem where I want to use the BERT model as the base followed by Dense layers. I want to know how the 3 arguments work. For example, if I have 3 sentences ...
Deshwal • 3,872
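
A short sketch of how the three arguments interact; max_length=8 here is arbitrary:

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
sentences = ["short one", "a noticeably longer example sentence", "tiny"]

enc = tokenizer(
    sentences,
    max_length=8,          # hard cap on tokens, including [CLS] and [SEP]
    padding="max_length",  # pad every sequence up to exactly max_length
    truncation=True,       # cut any sequence that exceeds max_length
    return_tensors="pt",
)
print(enc["input_ids"].shape)  # torch.Size([3, 8])
```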
24 votes
1 answer
62k views

PyTorch BERT TypeError: forward() got an unexpected keyword argument 'labels'

Training a BERT model using PyTorch transformers (following the tutorial here). The following statement in the tutorial: loss = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=...
PinkBanter • 1,826
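
For reference, plain BertModel.forward() has no labels parameter; the task heads in transformers do. A hedged sketch, assuming sequence classification is the goal (num_labels=2 is an assumption):

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # num_labels is an assumption

inputs = tokenizer("a training example", return_tensors="pt")
outputs = model(**inputs, labels=torch.tensor([1]))
print(outputs.loss, outputs.logits.shape)  # the head computes the loss
```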
22 votes
6 answers
27k views

AttributeError: module 'torch' has no attribute '_six'. Bert model in Pytorch

I tried to load a pre-trained model by using the BertModel class in pytorch. I have _six.py under torch, but it still shows module 'torch' has no attribute '_six'. import torch from pytorch_pretrained_bert ...
Ruitong LIU
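
A sketch of the migration: pytorch_pretrained_bert is the long-deprecated predecessor of transformers and imports torch._six, which newer PyTorch releases removed. The modern equivalent:

```python
# old: from pytorch_pretrained_bert import BertModel
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
```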
21 votes
1 answer
30k views

PyTorch: RuntimeError: Input, output and indices must be on the current device

I am running a BERT model on torch. It's a multi-class sentiment classification task with about 30,000 rows. I have already put everything on cuda, but I'm not sure why I'm getting the following run time ...
Roy • 984
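
A minimal sketch of the usual fix: the model was moved to CUDA but the input tensors were not (or vice versa), so keep both on the same device:

```python
import torch
from transformers import BertModel, BertTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").to(device)

batch = tokenizer("some text", return_tensors="pt")
batch = {k: v.to(device) for k, v in batch.items()}  # inputs follow the model
outputs = model(**batch)
```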
19 votes
5 answers
68k views

Pytorch: IndexError: index out of range in self. How to solve?

This training code is based on the run_glue.py script found here: # Set the seed value all over the place to make this reproducible. seed_val = 42 random.seed(seed_val) np.random.seed(seed_val) torch....
sylvester • 243
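
This IndexError is an out-of-range nn.Embedding lookup. A sketch of the two usual culprits; which one applies to the question's code is an assumption:

```python
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Culprit 1: inputs longer than the 512 position embeddings -> truncate.
enc = tokenizer("a potentially very long document", truncation=True,
                max_length=512, return_tensors="pt")

# Culprit 2: tokens added to the tokenizer without resizing the model.
tokenizer.add_tokens(["[NEW_TOKEN]"])
model.resize_token_embeddings(len(tokenizer))
```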
18 votes
1 answer
12k views

BertForSequenceClassification vs. BertForMultipleChoice for sentence multi-class classification

I'm working on a text classification problem (e.g. sentiment analysis), where I need to classify a text string into one of five classes. I just started using the Huggingface Transformer package and ...
stackoverflowuser2010
17 votes
2 answers
33k views

The size of tensor a (707) must match the size of tensor b (512) at non-singleton dimension 1

I am trying to do text classification using a pretrained BERT model. I trained the model on my dataset, and in the testing phase I know that BERT can only take up to 512 tokens, so I wrote an if condition ...
Mee • 1,561
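
A minimal sketch: an if condition on the raw string cannot bound the token count; the limit has to be enforced at tokenization time:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
long_text = "word " * 1000  # stand-in for a long test document

enc = tokenizer(long_text, truncation=True, max_length=512,
                padding="max_length", return_tensors="pt")
print(enc["input_ids"].shape)  # torch.Size([1, 512])
```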
17 votes
2 answers
11k views

Difficulty in understanding the tokenizer used in Roberta model

from transformers import AutoModel, AutoTokenizer tokenizer1 = AutoTokenizer.from_pretrained("roberta-base") tokenizer2 = AutoTokenizer.from_pretrained("bert-base-cased") sequence = "A Titan RTX has ...
Mr. NLP • 971
16 votes
3 answers
23k views

How to understand the hidden_states returned by BertModel? (huggingface-transformers)

Returns last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)): Sequence of hidden-states at the output of the last layer of the model. pooler_output (torch....
island145287
15 votes
3 answers
35k views

Python: BERT Error - Some weights of the model checkpoint at were not used when initializing BertModel

I am creating an entity extraction model in PyTorch using bert-base-uncased but when I try to run the model I get this error: Error: Some weights of the model checkpoint at D:\Transformers\bert-entity-...
Ishan Dutta
14 votes
1 answer
14k views

PyTorch torch.no_grad() versus requires_grad=False

I'm following a PyTorch tutorial which uses the BERT NLP model (feature extractor) from the Huggingface Transformers library. There are two pieces of interrelated code for gradient updates that I don'...
stackoverflowuser2010
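
A generic illustration of the distinction (not the tutorial's code): no_grad() disables graph building for a code region, while requires_grad=False freezes specific parameters:

```python
import torch

w = torch.randn(3, requires_grad=True)

with torch.no_grad():        # a *region*: nothing inside builds a graph
    y = (w * 2).sum()
print(y.requires_grad)       # False

w.requires_grad_(False)      # a *parameter* property: freeze w itself
z = (w * 2).sum()
print(z.requires_grad)       # False: w no longer participates in autograd
```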
13 votes
4 answers
8k views

How to fine-tune BERT on unlabeled data?

I want to fine-tune BERT on a specific domain. I have texts of that domain in text files. How can I use these to fine-tune BERT? I am looking here currently. My main objective is to get sentence ...
Rish • 541
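
A hedged sketch of domain-adaptive pretraining with the masked-LM objective; "domain_corpus.txt" and "bert-domain" are placeholder names:

```python
from datasets import load_dataset
from transformers import (BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-domain", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```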
12 votes
4 answers
11k views

Training TFBertForSequenceClassification with custom X and Y data

I am working on a TextClassification problem, for which I am trying to train my model on TFBertForSequenceClassification given in the huggingface-transformers library. I followed the example given on ...
Rahul Goel
12 votes
3 answers
37k views

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index']

When I load the BERT pretrained model online I get this error OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index'] found in directory uncased_L-12_H-768_A-12 or '...
Asma • 189
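
For context, uncased_L-12_H-768_A-12 is Google's original TensorFlow checkpoint layout, which contains none of the files from_pretrained looks for. A sketch of two ways out:

```python
from transformers import BertModel

# Option 1: load the already-converted weights from the Hub.
model = BertModel.from_pretrained("bert-base-uncased")

# Option 2: convert the local TF checkpoint on the fly (needs TensorFlow).
# model = BertModel.from_pretrained("uncased_L-12_H-768_A-12", from_tf=True)
```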
12 votes
2 answers
5k views

Get probability of multi-token word in MASK position

It is relatively easy to get a token's probability according to a language model, as the snippet below shows. You can get the output of a model, restrict yourself to the output of the masked token, ...
Bram Vanroy • 27.7k
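
One approximation (a heuristic, not the only approach): insert one [MASK] per subword and multiply the per-position probabilities; the positions are not truly conditionally independent, so treat the result as a score rather than a true probability. A sketch, with "chihuahua" as an example word:

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

word_ids = tokenizer("chihuahua", add_special_tokens=False)["input_ids"]
masks = " ".join([tokenizer.mask_token] * len(word_ids))
enc = tokenizer(f"My dog is a {masks}.", return_tensors="pt")

with torch.no_grad():
    probs = model(**enc).logits.softmax(dim=-1)

positions = (enc["input_ids"][0] ==
             tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
score = 1.0
for pos, tok_id in zip(positions, word_ids):
    score *= probs[0, pos, tok_id].item()
print(score)  # approximate joint probability of the multi-token word
```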
11 votes
1 answer
2k views

What's the difference between a "self-attention mechanism" and a "full-connection" layer?

I am confused by these two structures. In theory, the outputs of both are connected to their inputs. What magic makes the 'self-attention mechanism' more powerful than the full-connection layer?
tom_cat • 325
10 votes
3 answers
9k views

Using trained BERT Model and Data Preprocessing

When using pre-trained BERT embeddings from pytorch (which are then fine-tuned), should the text data fed into the model be pre-processed like in any standard NLP task? For instance, should ...
SFD • 575
10 votes
3 answers
12k views

BertTokenizer - when encoding and decoding sequences extra spaces appear

When using Transformers from HuggingFace I am facing a problem with the encoding and decoding methods. I have the following string: test_string = 'text with percentage%' Then I am running the ...
Henryk Borzymowski
9 votes
1 answer
23k views

RuntimeError: The size of tensor a (4000) must match the size of tensor b (512) at non-singleton dimension 1

I'm trying to build a model for document classification. I'm using BERT with PyTorch. I got the bert model with below code. bert = AutoModel.from_pretrained('bert-base-uncased') This is the code for ...
Venkatesh Dharavath
9 votes
1 answer
24k views

BERT tokenizer & model download

I'm a beginner and I'm working with BERT. However, due to the security of the company network, the following code does not receive the BERT model directly. tokenizer = BertTokenizer.from_pretrained('bert-...
ybin • 575
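
A sketch of the standard offline workflow: download once on an unrestricted machine, save_pretrained, copy the folder across, then load by local path ("./bert-local" is a placeholder):

```python
from transformers import BertModel, BertTokenizer

# Step 1, on a machine with internet access:
BertTokenizer.from_pretrained("bert-base-uncased").save_pretrained("./bert-local")
BertModel.from_pretrained("bert-base-uncased").save_pretrained("./bert-local")

# Step 2, inside the restricted network, after copying the folder:
tokenizer = BertTokenizer.from_pretrained("./bert-local")
model = BertModel.from_pretrained("./bert-local")
```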
9 votes
1 answer
8k views

How do I use BertForMaskedLM or BertModel to calculate perplexity of a sentence?

I want to use BertForMaskedLM or BertModel to calculate perplexity of a sentence, so I write code like this: import numpy as np import torch import torch.nn as nn from transformers import ...
Kaim hong • 113
8 votes
1 answer
9k views

How to calculate perplexity of a sentence using huggingface masked language models?

I have several masked language models (mainly Bert, Roberta, Albert, Electra). I also have a dataset of sentences. How can I get the perplexity of each sentence? From the huggingface documentation ...
Penguin • 2,148
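
Masked LMs have no left-to-right factorization, so a common workaround is "pseudo-perplexity": mask each token in turn, collect its log-probability, and exponentiate the negative mean. A hedged sketch:

```python
import math
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_perplexity(sentence: str) -> float:
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    log_probs = []
    for i in range(1, len(input_ids) - 1):   # skip [CLS] and [SEP]
        masked = input_ids.clone()
        true_id = masked[i].item()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        log_probs.append(logits[0, i].log_softmax(-1)[true_id].item())
    return math.exp(-sum(log_probs) / len(log_probs))

print(pseudo_perplexity("The cat sat on the mat."))
```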
8 votes
3 answers
5k views

How to compute mean/max of HuggingFace Transformers BERT token embeddings with attention mask?

I'm using the HuggingFace Transformers BERT model, and I want to compute a summary vector (a.k.a. embedding) over the tokens in a sentence, using either the mean or max function. The complication is ...
stackoverflowuser2010
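
A short sketch of mask-aware mean and max pooling over the token embeddings:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

enc = tokenizer(["a sentence", "a much longer example sentence"],
                padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).last_hidden_state            # (B, T, H)

mask = enc["attention_mask"].unsqueeze(-1).float()     # (B, T, 1)

# Mean: zero out pads, then divide by the number of real tokens.
mean_pooled = (hidden * mask).sum(1) / mask.sum(1)

# Max: push pads to -inf so they can never win the max.
max_pooled = hidden.masked_fill(mask == 0, float("-inf")).max(1).values
```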
7 votes
1 answer
5k views

How exactly should the input file be formatted for the language model finetuning (BERT through Huggingface Transformers)?

I wanted to employ the examples/run_lm_finetuning.py from the Huggingface Transformers repository on a pretrained Bert model. However, from following the documentation it is not evident how a corpus ...
nminds • 79
7 votes
2 answers
14k views

The model did not return a loss from the inputs - LabSE error

I want to fine-tune LabSE for question answering using the SQuAD dataset, and I got this error: ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,...
Mateusz Pasierbek
7 votes
2 answers
6k views

The essence of learnable positional embedding? Does the embedding improve outcomes?

I was recently reading the bert source code from the hugging face project. I noticed that the so-called "learnable position encoding" seems to refer to a specific nn.Parameter layer when it ...
AdamHommer
7 votes
1 answer
8k views

Mismatched size on BertForSequenceClassification from Transformers and multiclass problem

I just trained a BERT model on a dataset composed of products and labels (departments) for an e-commerce website. It's a multiclass problem. I used BertForSequenceClassification to predict the ...
Guilherme Giuliano Nicolau
7 votes
1 answer
4k views

ModuleNotFoundError: No module named 'torch.utils._pytree'

I have installed PyTorch 1.7.1, and it works very well. However, when I try to run this code: import transformers from transformers import BertTokenizer from transformers.models.bert.modeling_bert ...
A_B_Y • 422
6 votes
1 answer
8k views

huggingface bert showing poor accuracy / f1 score [pytorch]

I am trying BertForSequenceClassification for a simple article classification task. No matter how I train it (freeze all layers but the classification layer, all layers trainable, last k layers ...
Zabir Al Nazi
6 votes
1 answer
34k views

Pytorch expects each tensor to be equal size

When running this code: embedding_matrix = torch.stack(embeddings) I got this error: RuntimeError: stack expects each tensor to be equal size, but got [7, 768] at entry 0 and [8, 768] at entry 1 I'm ...
sam • 79
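
torch.stack requires identical shapes, so ragged per-sentence embeddings need padding first. A minimal sketch using pad_sequence:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

embeddings = [torch.randn(7, 768), torch.randn(8, 768)]  # ragged lengths
matrix = pad_sequence(embeddings, batch_first=True)      # (2, 8, 768)
```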
5 votes
1 answer
13k views

TypeError: linear(): argument 'input' (position 1) must be Tensor, not str

So I've been trying to work on an example of BERT that I found on GitHub, as it's the first time I'm trying to use BERT and see how it works. The repository I'm working with is the following: https://...
5 votes
1 answer
5k views

How to get the probability of a particular token(word) in a sentence given the context

I'm trying to calculate the probability or any type of score for words in a sentence using NLP. I've tried this approach with the GPT2 model using the Huggingface Transformers library, but I couldn't get ...
Dilrukshi Perera
5 votes
1 answer
818 views

How to save parameters just related to classifier layer of pretrained bert model due to the memory concerns?

I fine-tuned the pretrained model here by freezing all layers except the classifier layers, and I saved the weight file in .bin format using pytorch. Now instead of loading the 400 MB pre-trained ...
Mehmet Calikus
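
A hedged sketch: persist only the head's tensors and merge them into a freshly loaded base model later (num_labels=2 and the file name are assumptions):

```python
import torch
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
head = {k: v for k, v in model.state_dict().items()
        if k.startswith("classifier")}
torch.save(head, "classifier_only.bin")   # a few KB instead of ~400 MB

# Later: rebuild the base model and load just the head.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
model.load_state_dict(torch.load("classifier_only.bin"), strict=False)
```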
5 votes
2 answers
2k views

BERT-based NER model giving inconsistent prediction when deserialized

I am trying to train an NER model using the HuggingFace transformers library on Colab cloud GPUs, pickle it and load the model on my own CPU to make predictions. The model is the following: from ...
flyingjapans
5 votes
2 answers
3k views

Can I use BERT as a feature extractor without any finetuning on my specific data set?

I'm trying to solve a multilabel classification task of 10 classes with a relatively balanced training set consisting of ~25K samples and an evaluation set consisting of ~5K samples. I'm using the ...
SBflying
5 votes
1 answer
2k views

Does BertForSequenceClassification classify on the CLS vector?

I'm using the Huggingface Transformer package and BERT with PyTorch. I'm trying to do 4-way sentiment classification and am using BertForSequenceClassification to build a model that eventually leads ...
stackoverflowuser2010
5 votes
1 answer
9k views

Get the value of '[UNK]' in BERT

I have designed a model based on BERT to solve an NER task. I am using the transformers library with the "dccuchile/bert-base-spanish-wwm-cased" pre-trained model. The problem comes when my model detects an ...
Javier Jiménez de la Jara
4 votes
3 answers
5k views

How to apply max_length to truncate the token sequence from the left in a HuggingFace tokenizer?

In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting the ...
Ondrej Sotolar
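
With a reasonably recent transformers release, truncation_side controls which end gets cut. A sketch:

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
tokenizer.truncation_side = "left"  # default is "right"

enc = tokenizer("one two three four five six seven",
                truncation=True, max_length=6)
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
# keeps the *last* words: ['[CLS]', 'four', 'five', 'six', 'seven', '[SEP]']
```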
4 votes
1 answer
788 views

pytorch model evaluation slow when deployed on kubernetes

I would like to make the result of a text classification model (finBERT pytorch model) available through an endpoint that is deployed on Kubernetes. The whole pipeline is working but it's super slow ...
move_ludwig
4 votes
1 answer
2k views

Correct Way to Fine-Tune/Train HuggingFace's Model from scratch (PyTorch)

For example, I want to train a BERT model from scratch but using the existing configuration. Is the following code the correct way to do so? model = BertModel.from_pretrained('bert-base-cased') model....
Allan-J • 335
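
A sketch of the distinction: from_pretrained() loads trained weights, while constructing the model from a config alone gives random initialization (i.e., training from scratch):

```python
from transformers import BertConfig, BertModel

config = BertConfig.from_pretrained("bert-base-cased")  # architecture only
model = BertModel(config)             # random weights: train from scratch
# model = BertModel.from_pretrained("bert-base-cased")  # trained weights
```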
4 votes
1 answer
5k views

Why do we need state_dict = state_dict.copy()

I want to load the weights of a pre-trained model into my local model. I don't understand why state_dict = state_dict.copy() is necessary, given that the two networks have the same state_dict names. # copy ...
dan • 41
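
A minimal illustration of why the copy matters: dict.copy() is shallow, so the tensors are shared, but key insertions and removals made by the loading code no longer mutate the caller's dict:

```python
import torch

sd = {"layer.weight": torch.zeros(2, 2)}
sd_copy = sd.copy()
sd_copy.pop("layer.weight")   # loaders consume/rename keys like this
assert "layer.weight" in sd   # the caller's dict is left intact
```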
4 votes
2 answers
4k views

How to convert model.safetensor to pytorch_model.bin?

I'm fine-tuning a pre-trained BERT model and I have a weird problem: when I'm fine-tuning using the CPU, the code saves the model with a "pytorch_model.bin". But when I use ...
Gabriel Henrique
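
A sketch assuming a recent transformers release, where safetensors is the default save format; save_pretrained can be told to write the legacy file instead ("./finetuned-bert" is a placeholder path):

```python
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("./finetuned-bert")
model.save_pretrained("./finetuned-bert-bin", safe_serialization=False)
# -> writes pytorch_model.bin instead of model.safetensors
```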
4 votes
1 answer
2k views

PyTorch tokenizers: how to truncate tokens from left?

As we can see in the below code snippet, specifying max_length and truncation for a tokenizer cuts excess tokens from the left: tokenizer("hello, my name", truncation=True, max_length=6).input_ids ...
aayc • 41
4 votes
1 answer
5k views

PyTorch GPU memory leak during inference

I am trying to encode documents sentence-wise with a huggingface transformer module. I'm using the very small google/bert_uncased_L-2_H-128_A-2 pretrained model with the following code: def ...
Marco Moldovan
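
A minimal sketch of the usual fixes for growing GPU memory at inference time: wrap the loop in torch.no_grad() so no autograd graph is retained, and move results off the GPU promptly (the model id matches the question):

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name).cuda().eval()

vectors = []
with torch.no_grad():  # no autograd graph -> nothing accumulates on GPU
    for sent in ["first sentence", "second sentence"]:
        enc = {k: v.cuda() for k, v in
               tokenizer(sent, return_tensors="pt").items()}
        vectors.append(model(**enc).last_hidden_state.mean(dim=1).cpu())
```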
4 votes
1 answer
3k views

How to process TransformerEncoderLayer output in pytorch

I am trying to use bio-bert sentence embeddings for text classification of longer pieces of text. As it currently stands I standardize the number of sentences in each piece of text (some sentences are ...
Wackaman • 161
4 votes
0 answers
469 views

How to train a Masked Language Model with a big text corpus(200GB) using PyTorch?

Recently I have been training a masked language model with a big text corpus (200GB) using transformers. The training data is too big to fit into a computer equipped with 512GB of memory and V100(32GB)*8. Is it ...
Chirs • 73
4 votes
0 answers
1k views

Word embeddings with BERT and map tensors to words

I am trying to aggregate BERT embeddings at the token level. For each token in the corpus vocabulary, I would like to create a list of all their contextual embeddings and average them to get one ...
Andrej • 3,799
3 votes
4 answers
18k views

Cannot import BertModel from transformers

I am trying to import BertModel from transformers, but it fails. This is the code I am using: from transformers import BertModel, BertForMaskedLM This is the error I get: ImportError: cannot import name '...
Moaz Mohammed Husain
