Questions tagged [bert-language-model]
BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. BERT uses Transformers (an attention mechanism that learns contextual relations between words or sub-words in a text) to generate a language model.
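For readers new to the tag, here is a minimal sketch of loading a pre-trained BERT model with the HuggingFace transformers library and encoding one sentence; the checkpoint name and variable names are just one common choice, not part of the tag definition:

import torch
from transformers import AutoTokenizer, AutoModel

# Load a pre-trained BERT checkpoint and its matching tokenizer.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Encode a sentence and run a forward pass without gradients.
inputs = tokenizer("BERT learns contextual representations.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)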
1,796
questions
9
votes
1
answer
4k
views
Why does the BERT model keep 10% of the masked tokens unchanged?
I am reading the BERT paper. In the Masked Language Model task during pre-training, the paper says the model randomly chooses 15% of the tokens. Of the chosen tokens (Ti), 80% will be replaced ...
9
votes
1
answer
8k
views
How do I use BertForMaskedLM or BertModel to calculate perplexity of a sentence?
I want to use BertForMaskedLM or BertModel to calculate the perplexity of a sentence, so I wrote code like this:
import numpy as np
import torch
import torch.nn as nn
from transformers import ...
9
votes
1
answer
9k
views
BERT document embedding
I am trying to do document embedding using BERT. The code I use is a combination of two sources. I use BERT Document Classification Tutorial with Code, and BERT Word Embeddings Tutorial. Below is the ...
9
votes
1
answer
4k
views
BERT embedding for semantic similarity
I posted this question earlier. I wanted to get embeddings similar to those in this YouTube video, from 33 minutes onward.
1) I don't think that the embeddings I am getting from the CLS token are similar to ...
8
votes
1
answer
18k
views
How is the number of parameters calculated in the BERT model?
The paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" by Devlin et al. reports 110M parameters for the base model (i.e. L=12, H=768, A=12) ...
8
votes
1
answer
9k
views
How to calculate perplexity of a sentence using huggingface masked language models?
I have several masked language models (mainly Bert, Roberta, Albert, Electra). I also have a dataset of sentences. How can I get the perplexity of each sentence?
From the huggingface documentation ...
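One commonly suggested approach for these perplexity questions (not an official transformers API) is a pseudo-perplexity: mask each token in turn and average the log-probability the masked LM assigns to the true token. A minimal sketch, assuming bert-base-uncased; the sentence is illustrative:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_perplexity(sentence):
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    log_probs = []
    # Mask one position at a time, skipping [CLS] (first) and [SEP] (last).
    for i in range(1, len(ids) - 1):
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs.append(torch.log_softmax(logits, dim=-1)[ids[i]])
    # Exponentiated negative mean log-likelihood over the masked positions.
    return torch.exp(-torch.stack(log_probs).mean()).item()

print(pseudo_perplexity("The cat sat on the mat."))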
8
votes
4
answers
16k
views
Error importing BERT: module 'tensorflow._api.v2.train' has no attribute 'Optimizer'
I tried to use bert-tensorflow in Google Colab, but I got the following error:
--------------------------------------------------------------------------- AttributeError ...
8
votes
1
answer
4k
views
Uni-directional Transformer VS Bi-directional BERT
I just finished reading the Transformer paper and the BERT paper, but I couldn't figure out why the Transformer is uni-directional and BERT is bi-directional, as stated in the BERT paper. As they don't use ...
8
votes
1
answer
14k
views
How to store Word vector Embeddings?
I am using BERT word embeddings for a sentence classification task with 3 labels. I am using Google Colab for coding. My problem is that, since I will have to execute the embedding part every time I restart ...
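A common workaround for this kind of question (a sketch, not the only option) is to compute the embeddings once and serialize them, e.g. with torch.save, then reload them in later Colab sessions; the variable and file names here are placeholders:

import torch

# After computing the embeddings once, e.g. a tensor of shape (num_sentences, hidden_size):
torch.save(embeddings, "bert_embeddings.pt")   # write to disk (or to a mounted Google Drive path)

# In a later session, skip the BERT forward pass and just reload them:
embeddings = torch.load("bert_embeddings.pt")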
8
votes
3
answers
5k
views
How to compute mean/max of HuggingFace Transformers BERT token embeddings with attention mask?
I'm using the HuggingFace Transformers BERT model, and I want to compute a summary vector (a.k.a. embedding) over the tokens in a sentence, using either the mean or max function. The complication is ...
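A minimal sketch of attention-mask-aware mean and max pooling; last_hidden_state and attention_mask are assumed to come from a standard transformers forward pass, and the function names are illustrative:

import torch

def masked_mean_pool(last_hidden_state, attention_mask):
    # last_hidden_state: (batch, seq_len, hidden); attention_mask: (batch, seq_len) with 1 for real tokens.
    mask = attention_mask.unsqueeze(-1).float()       # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)    # padding positions contribute zero
    counts = mask.sum(dim=1).clamp(min=1e-9)          # avoid division by zero
    return summed / counts

def masked_max_pool(last_hidden_state, attention_mask):
    # Set padding positions to -inf so they never win the max.
    mask = attention_mask.unsqueeze(-1).bool()
    return last_hidden_state.masked_fill(~mask, float("-inf")).max(dim=1).values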
8
votes
6
answers
6k
views
Problem with inputs when building a model with TFBertModel and AutoTokenizer from HuggingFace's transformers
I'm trying to build the model illustrated in this picture:
I obtained a pre-trained BERT and respective tokenizer from HuggingFace's transformers in the following way:
from transformers import ...
8
votes
1
answer
3k
views
HuggingFace BERT `inputs_embeds` giving unexpected result
The HuggingFace BERT TensorFlow implementation allows us to feed in a precomputed embedding in place of the embedding lookup that is native to BERT. This is done using the model's call method's ...
7
votes
1
answer
5k
views
How exactly should the input file be formatted for the language model finetuning (BERT through Huggingface Transformers)?
I wanted to employ the examples/run_lm_finetuning.py from the Huggingface Transformers repository on a pretrained Bert model. However, from following the documentation it is not evident how a corpus ...
7
votes
3
answers
19k
views
Why can't I import functions in bert after pip install bert
I am a beginner with BERT, and I am trying to use the BERT files provided on GitHub: https://github.com/google-research/bert
However, I cannot import files (such as run_classifier, optimization and so on)...
7
votes
2
answers
14k
views
The model did not return a loss from the inputs - LaBSE error
I want to fine-tune LaBSE for question answering using the SQuAD dataset, and I got this error:
ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,...
7
votes
2
answers
6k
views
The essence of the learnable positional embedding? Does the embedding improve outcomes?
I was recently reading the BERT source code from the Hugging Face project. I noticed that the so-called "learnable position encoding" seems to refer to a specific nn.Parameter layer when it ...
7
votes
1
answer
8k
views
max_seq_length for transformer (Sentence-BERT)
I'm using sentence-BERT from Huggingface in the following way:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
model.max_seq_length = 512
model....
7
votes
1
answer
6k
views
Passing multiple sentences to BERT?
I have a dataset with paragraphs that I need to classify into two classes. These paragraphs are usually 3-5 sentences long. The overwhelming majority of them are less than 500 words long. I would like ...
7
votes
1
answer
14k
views
How does padding in the huggingface tokenizer work?
I tried the following tokenization example:
tokenizer = BertTokenizer.from_pretrained(MODEL_TYPE, do_lower_case=True)
sent = "I hate this. Not that.",
_tokenized = tokenizer(sent, ...
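For context on this padding question, a small hedged sketch of how the padding argument behaves: padding=True pads to the longest sequence in the batch, while padding="max_length" pads every sequence to max_length. The sentences below are made up:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
batch = ["I hate this.", "Not that."]

# Pad to the longest sentence in this batch.
enc_dynamic = tokenizer(batch, padding=True, return_tensors="pt")

# Pad every sentence to a fixed length of 16 tokens.
enc_fixed = tokenizer(batch, padding="max_length", max_length=16, truncation=True, return_tensors="pt")

print(enc_dynamic["input_ids"].shape, enc_fixed["input_ids"].shape)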
7
votes
1
answer
2k
views
Fine-tune Bert for specific domain (unsupervised)
I want to fine-tune BERT on texts that are related to a specific domain (in my case related to engineering). The training should be unsupervised since I don't have any labels or anything. Is this ...
7
votes
1
answer
8k
views
Mismatched size on BertForSequenceClassification from Transformers and multiclass problem
I just trained a BERT model on a dataset composed of products and labels (departments) for an e-commerce website. It's a multiclass problem. I used BertForSequenceClassification to predict the ...
7
votes
1
answer
2k
views
Use BERT under spaCy to get sentence embeddings
I am trying to use BERT to get sentence embeddings. Here is how I am doing it:
import spacy
nlp = spacy.load("en_core_web_trf")
nlp("The quick brown fox jumps over the lazy dog")....
7
votes
2
answers
4k
views
Pretraining a language model on a small custom corpus
I was curious if it is possible to use transfer learning in text generation, and re-train/pre-train it on a specific kind of text.
For example, having a pre-trained BERT model and a small corpus ...
7
votes
1
answer
8k
views
How to specify a proxy in transformers pipeline
I am using the sentiment-analysis pipeline as described here.
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
It's failing with a connection error message
ValueError: ...
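One approach often suggested for this proxy question (a sketch; the proxy address is a placeholder) is to route the model download through a proxy, either via environment variables or via the proxies argument of from_pretrained:

import os

# Option 1: environment variables picked up by the underlying HTTP client.
os.environ["HTTP_PROXY"] = "http://proxy.example.com:3128"
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:3128"

from transformers import pipeline
classifier = pipeline("sentiment-analysis")

# Option 2: pass proxies explicitly when loading the model and tokenizer with from_pretrained,
# e.g. AutoModel.from_pretrained(name, proxies={"https": "http://proxy.example.com:3128"}),
# then hand the loaded objects to pipeline(task, model=..., tokenizer=...).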
7
votes
1
answer
4k
views
ModuleNotFoundError: No module named 'torch.utils._pytree'
I have installed PyTorch 1.7.1, and it works very well. However, when I try to run this code:
import transformers
from transformers import BertTokenizer
from transformers.models.bert.modeling_bert ...
7
votes
1
answer
2k
views
What is the significance of the magnitude/norm of BERT word embeddings?
We generally compare similarity between word embeddings with cosine similarity, but this only takes into account the angle between the vectors, not the norm. With word2vec, the norm of the vector ...
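For intuition on this norm question, a small illustrative sketch of the difference between cosine similarity (direction only) and the dot product (magnitude-sensitive); the vectors are made up:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 10 * a  # same direction, ten times the norm

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine)  # 1.0  -> the norm cancels out
print(a @ b)   # 140.0 -> scales with the vectors' magnitudes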
7
votes
1
answer
5k
views
Token indices sequence length error when using encode_plus method
I got a strange error when trying to encode question-answer pairs for BERT using the encode_plus method provided in the Transformers library.
I am using data from this Kaggle competition. Given a ...
6
votes
2
answers
9k
views
How to untokenize BERT tokens?
I have a sentence and I need to return the text corresponding to N BERT tokens to the left and right of a specific word.
from transformers import BertTokenizer
tz = BertTokenizer.from_pretrained("...
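A minimal sketch of going back from BERT token ids to text with the same tokenizer (the sentence is illustrative); decode and convert_tokens_to_string re-merge the '##' word-pieces:

from transformers import BertTokenizer

tz = BertTokenizer.from_pretrained("bert-base-uncased")

text = "Untokenization example with transformers"
ids = tz(text)["input_ids"]

print(tz.convert_ids_to_tokens(ids))             # word-piece tokens, incl. [CLS]/[SEP] and '##' pieces
print(tz.decode(ids, skip_special_tokens=True))  # merged back into plain text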
6
votes
2
answers
11k
views
BERT get sentence embedding
I am replicating code from this page. I have downloaded the BERT model to my local system and am getting sentence embeddings.
I have around 500,000 sentences for which I need sentence embeddings, and it is ...
6
votes
1
answer
11k
views
BertWordPieceTokenizer vs BertTokenizer from HuggingFace
I have the following pieces of code and am trying to understand the difference between BertWordPieceTokenizer and BertTokenizer.
BertWordPieceTokenizer (Rust based)
from tokenizers import ...
6
votes
2
answers
6k
views
Can you train a BERT model from scratch with task specific architecture?
BERT pre-training of the base model is done with a language-modeling approach, where we mask a certain percentage of tokens in a sentence and make the model learn to predict those masked tokens. Then, I think in ...
6
votes
1
answer
7k
views
"You have to specify either input_ids or inputs_embeds", but I did specify the input_ids
I trained a BERT based encoder decoder model (EncoderDecoderModel) named ed_model with HuggingFace's transformers module.
I used the BertTokenizer named input_tokenizer
I tokenized the input with:
...
6
votes
2
answers
3k
views
ImportError when from transformers import BertTokenizer
My code is:
import torch
from transformers import BertTokenizer
from IPython.display import clear_output
I got error in line from transformers import BertTokenizer:
ImportError: /lib/x86_64-linux-gnu/...
6
votes
3
answers
5k
views
How to stop BERT from breaking apart specific words into word-pieces
I am using a pre-trained BERT model to tokenize a text into meaningful tokens. However, the text has many specific words and I don't want the BERT model to break them into word-pieces. Is there any ...
6
votes
1
answer
8k
views
huggingface bert showing poor accuracy / f1 score [pytorch]
I am trying BertForSequenceClassification for a simple article classification task.
No matter how I train it (freeze all layers but the classification layer, all layers trainable, last k layers ...
6
votes
2
answers
3k
views
How to test masked language model after training it?
I have followed this tutorial for masked language modelling from Hugging Face using BERT, but I am unsure how to actually deploy the model.
Tutorial: https://github.com/huggingface/notebooks/blob/...
6
votes
2
answers
3k
views
How to fix random seed for BERTopic?
I'd like to fix the random seed of the BERTopic library to get reproducible results. Looking at the code of BERTopic, I see it uses numpy. Will using np.random.seed(123) be enough, or do I also need to ...
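np.random.seed alone is usually not enough here, because the stochastic step lives in UMAP. A commonly suggested workaround (a sketch; the parameter values are illustrative) is to pass a UMAP model with a fixed random_state into BERTopic:

from bertopic import BERTopic
from umap import UMAP

# Fix the seed of the dimensionality-reduction step, the main source of run-to-run variation.
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine", random_state=123)
topic_model = BERTopic(umap_model=umap_model)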
6
votes
1
answer
34k
views
Pytorch expects each tensor to be equal size
When running this code: embedding_matrix = torch.stack(embeddings)
I got this error:
RuntimeError: stack expects each tensor to be equal size, but got [7, 768] at entry 0 and [8, 768] at entry 1
I'm ...
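torch.stack requires identical shapes; one common fix (a sketch with made-up tensors) is to pad the per-sentence embeddings to a common length first, e.g. with pad_sequence:

import torch
from torch.nn.utils.rnn import pad_sequence

# Per-sentence token embeddings with different numbers of tokens, as in the error message.
embeddings = [torch.randn(7, 768), torch.randn(8, 768)]

# Pads every tensor to the longest sequence -> shape (batch, max_len, 768).
embedding_matrix = pad_sequence(embeddings, batch_first=True)
print(embedding_matrix.shape)  # torch.Size([2, 8, 768])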
6
votes
3
answers
9k
views
Huggingface BERT Tokenizer add new token
I am using Huggingface BERT for an NLP task. My texts contain names of companies which are split up into subwords.
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
tokenizer....
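A minimal sketch of the usual pattern for this question (the company names are placeholders): register the strings as new tokens and resize the model's embedding matrix so it has rows for them:

from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

num_added = tokenizer.add_tokens(["acmecorp", "globex"])  # placeholder company names
model.resize_token_embeddings(len(tokenizer))             # add embedding rows for the new tokens

print(tokenizer.tokenize("acmecorp acquired globex"))     # the added names stay as single tokens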
6
votes
1
answer
3k
views
BERT embedding layer raises `TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'` with BiLSTM
I have problems integrating a BERT embedding layer into a BiLSTM model for a word sense disambiguation task:
Windows 10
Python 3.6.4
TensorFlow 1.12
Keras 2.2.4
No virtual environments were used
PyCharm ...
6
votes
1
answer
5k
views
Using BERT to generate similar word or synonyms through word embeddings
As we all know, the capability of the BERT model for word embeddings is probably better than word2vec or any other model.
I want to create a model on top of BERT word embeddings to generate synonyms or ...
6
votes
1
answer
8k
views
Sliding window for long text in BERT for Question Answering
I've read a post which explains how the sliding window works, but I cannot find any information on how it is actually implemented.
From what I understand, if the input is too long, a sliding window can be ...
6
votes
1
answer
9k
views
Using BERT Embeddings in Keras Embedding layer
I want to use BERT word vector embeddings in the Embedding layer of an LSTM instead of the usual default embedding layer. Is there any way I can do it?
6
votes
2
answers
7k
views
Latest Pre-trained Multilingual Word Embedding
Are there any latest pre-trained multilingual word embeddings (multiple languages are jointly mapped to a same vector space)?
I have looked at the following but they don't fit my needs:
FastText / ...
6
votes
1
answer
1k
views
BERT performing worse than word2vec
I am trying to use BERT for a document ranking problem. My task is pretty straightforward. I have to do a similarity ranking for an input document. The only issue here is that I don’t have labels - so ...
6
votes
1
answer
939
views
How to predict the probability of an empty string using BERT
Suppose we have a template sentence like this:
"The ____ house is our meeting place."
and we have a list of adjectives to fill in the blank, e.g.:
"yellow"
"large"
...
6
votes
3
answers
5k
views
Why can't transformers be imported in Python?
I want to import transformers in a Jupyter notebook, but I get the following error. What is the reason for this error? My Python version is 3.8.
ImportError: cannot import name 'TypeAlias' from '...
6
votes
3
answers
4k
views
TypeError: Layer input_spec must be an instance of InputSpec. Got: InputSpec(shape=(None, 128, 768), ndim=3)
I am trying to use a BERT pretrained model to do a multiclass classification (of 3 classes). Here's my function to use the model and also added some extra functionalities:
def create_model(max_seq_len,...
6
votes
1
answer
9k
views
How to feed Bert embeddings to LSTM
I am working on a BERT + MLP model for a text classification problem. Essentially, I am trying to replace the MLP model with a basic LSTM model.
Is it possible to create a LSTM with embedding? Or, is ...
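A minimal sketch (PyTorch; the sizes and names are illustrative) of feeding BERT's token-level outputs into an LSTM head instead of an MLP:

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertLSTMClassifier(nn.Module):
    def __init__(self, num_labels=2, lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden, batch_first=True)
        self.fc = nn.Linear(lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from BERT: (batch, seq_len, hidden_size).
        hidden_states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        _, (h_n, _) = self.lstm(hidden_states)  # h_n: (num_layers, batch, lstm_hidden)
        return self.fc(h_n[-1])                 # logits: (batch, num_labels)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["an example sentence"], return_tensors="pt", padding=True)
logits = BertLSTMClassifier()(batch["input_ids"], batch["attention_mask"])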
6
votes
0
answers
6k
views
How to add index to python FAISS incrementally
I am using Faiss to index my huge dataset of embeddings, generated from a BERT model. I want to add the embeddings incrementally; it works fine if I only add them with faiss.IndexFlatL2, but ...
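A minimal sketch of adding embeddings to a flat FAISS index in batches (the dimensions and data are made up); index.add can be called repeatedly, so vectors can arrive incrementally:

import faiss
import numpy as np

d = 768                          # BERT hidden size
index = faiss.IndexFlatL2(d)

# Add embeddings batch by batch instead of all at once.
for _ in range(10):
    batch = np.random.rand(1000, d).astype("float32")  # stand-in for a batch of BERT embeddings
    index.add(batch)

print(index.ntotal)              # 10000 vectors indexed so far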