Questions tagged [bert-language-model]
BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. BERT uses the Transformer architecture (an attention mechanism that learns contextual relations between words or subwords in a text) to generate a language model.
1,803
questions
6
votes
1
answer
4k
views
Fine-tuning BERT sentence transformer model
I am using a pre-trained BERT sentence-transformer model, as described here: https://www.sbert.net/docs/training/overview.html, to get embeddings for sentences.
I want to fine-tune these pre-trained ...
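A minimal fine-tuning sketch following the sbert.net training overview linked above; the model name and sentence pairs are placeholder assumptions:

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('all-MiniLM-L6-v2')

# Each InputExample pairs two sentences with a similarity label in [0, 1].
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man eats something.'], label=0.9),
    InputExample(texts=['A man is eating food.', 'The sky is blue.'], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.CosineSimilarityLoss(model)

# One pass over the toy data; warmup_steps is optional.
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)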
6
votes
0
answers
2k
views
How to slice string depending on length of tokens
When I use (with a long test_text and short question):
from transformers import BertTokenizer
import torch
from transformers import BertForQuestionAnswering
tokenizer = BertTokenizer.from_pretrained('...
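A hedged sketch of what usually solves this: let the tokenizer do the slicing. With the pair-input API, truncation='only_second' trims only the long passage so that question plus context fit BERT's 512-token limit (reusing the question's test_text and question variables):

inputs = tokenizer(question, test_text, truncation='only_second',
                   max_length=512, return_tensors='pt')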
5
votes
1
answer
12k
views
How to get cosine similarity of word embedding from BERT model
I was interested in how to get the similarity of word embeddings in different sentences from a BERT model (actually, that means words have different meanings in different scenarios).
For example:
sent1 =...
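A hedged sketch of one way to do this with Hugging Face transformers: pull each occurrence's vector from last_hidden_state and compare with cosine similarity. The sentences and the single-subword assumption for "bank" are illustrative:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

def word_vector(sentence, word):
    # Return the last-hidden-state vector at the position of `word`.
    inputs = tokenizer(sentence, return_tensors='pt')
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0].tolist())
    return hidden[tokens.index(word)]

v1 = word_vector('He sat by the river bank.', 'bank')
v2 = word_vector('She deposited cash at the bank.', 'bank')
print(torch.cosine_similarity(v1, v2, dim=0).item())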
5
votes
3
answers
32k
views
Unable to pip install -U sentence-transformers
I am unable to do: pip install -U sentence-transformers. I get this message on Anaconda Prompt:
ERROR: Could not find a version that satisfies the requirement torch>=1.0.1 (from sentence-transformers) ...
5
votes
3
answers
6k
views
AttributeError: 'str' object has no attribute 'dim' in pytorch
I got the following error output in PyTorch when sending inputs into the model for prediction. Does anyone know what's going on?
Following is the model architecture that I created; in the error output, ...
5
votes
1
answer
13k
views
TypeError: linear(): argument 'input' (position 1) must be Tensor, not str
So I've been trying to work through a BERT example that I found on GitHub, as it's the first time I'm trying to use BERT and see how it works. The repository I'm working with is the following: https://...
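One common cause of this error (an assumption here, since the excerpt is truncated): transformers v4 models return ModelOutput objects, and code written for the older tuple interface can end up feeding a field name (a string) into a layer. Two hedged fixes, reusing whatever model and inputs the question sets up:

outputs = model(**inputs, return_dict=False)   # restore old tuple-style outputs
# or keep the new interface and index by attribute:
start_logits = model(**inputs).start_logits    # e.g. BertForQuestionAnswering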
5
votes
2
answers
38k
views
No module named 'transformers.models' while trying to import BertTokenizer
I am trying to import BertTokenizer from the transformers library as follows:
import transformers
from transformers import BertTokenizer
from transformers.modeling_bert import BertModel, ...
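For reference, transformers.modeling_bert was moved to transformers.models.bert.modeling_bert in transformers v4, which is what typically triggers this error; importing from the top-level package avoids depending on internals:

from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')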
5
votes
1
answer
11k
views
Overfitting when fine-tuning BERT sentiment analysis
I am a newbie to Machine Learning in general. I am currently trying to follow a tutorial on sentiment analysis using BERT and Transformers https://curiousily.com/posts/sentiment-analysis-with-bert-and-...
5
votes
1
answer
3k
views
How does BertForSequenceClassification classify on the CLS vector?
Background:
Following along with this question: when using BERT to classify sequences, the model uses the "[CLS]" token to represent the classification task. According to the paper:
The first ...
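As a schematic answer (mirroring the shape of the Hugging Face implementation, not quoting it): BertModel's pooler applies a tanh-activated linear layer to the final [CLS] hidden state, and BertForSequenceClassification puts dropout plus a linear classifier on top of that pooled vector:

import torch.nn as nn

class ClsHead(nn.Module):
    """Schematic of the head BertForSequenceClassification adds on BERT."""
    def __init__(self, hidden_size=768, num_labels=2):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pooler_output):
        # pooler_output = tanh(W · h_[CLS] + b), produced inside BertModel.
        return self.classifier(self.dropout(pooler_output))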
5
votes
1
answer
5k
views
How to get the probability of a particular token (word) in a sentence given the context
I'm trying to calculate the probability, or any type of score, for words in a sentence using NLP. I've tried this approach with the GPT-2 model using the Huggingface Transformers library, but I couldn't get ...
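A hedged sketch of the masked-LM route with BERT: mask the word of interest and read the softmax probability at that position (the sentence and target word are placeholders, and the word is assumed to be a single subword):

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()

sentence = 'The cat sat on the mat.'
target = 'mat'
inputs = tokenizer(sentence, return_tensors='pt')
tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0].tolist())
pos = tokens.index(target)
inputs['input_ids'][0, pos] = tokenizer.mask_token_id  # mask the target word

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits[0, pos], dim=-1)
print(probs[tokenizer.convert_tokens_to_ids(target)].item())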
5
votes
1
answer
818
views
How to save only the parameters of the classifier layer of a pretrained BERT model, due to memory concerns?
I fine-tuned the pretrained model here by freezing all layers except the classifier layers, and I saved the weight file in .bin format using PyTorch.
Now, instead of loading the 400 MB pre-trained ...
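A hedged sketch of the usual approach: filter the state dict down to the head's parameters before saving, then overlay them on a freshly loaded base model. The 'classifier' key prefix matches BertForSequenceClassification, and `model` stands for the fine-tuned model from the question:

import torch

# Keep only the classifier head's tensors (a few KB instead of ~400 MB).
head_state = {k: v for k, v in model.state_dict().items()
              if k.startswith('classifier')}
torch.save(head_state, 'classifier_head.bin')

# Later: rebuild from the base checkpoint and overlay the saved head.
model.load_state_dict(torch.load('classifier_head.bin'), strict=False)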
5
votes
1
answer
1k
views
What is the difference between Transformer encoder vs Transformer decoder vs Transformer encoder-decoder?
I know that GPT uses Transformer decoder, BERT uses Transformer encoder, and T5 uses Transformer encoder-decoder. But can someone help me understand why GPT only uses the decoder, BERT only uses ...
5
votes
1
answer
3k
views
Getting embedding lookup result from BERT
Prior to passing my tokens through BERT, I would like to perform some processing on their embeddings (the result of the embedding lookup layer). The HuggingFace BERT TensorFlow implementation allows ...
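A hedged sketch in PyTorch (the TensorFlow model accepts an analogous inputs_embeds argument): run the lookup yourself via get_input_embeddings(), modify the result, and feed it back through inputs_embeds:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

inputs = tokenizer('Hello world', return_tensors='pt')
# get_input_embeddings() returns the word-embedding table (an nn.Embedding).
embeds = model.get_input_embeddings()(inputs['input_ids'])

embeds = embeds * 1.0  # placeholder for your custom processing

outputs = model(inputs_embeds=embeds, attention_mask=inputs['attention_mask'])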
5
votes
3
answers
1k
views
BERT token vs. embedding
I understand that WordPiece is used to break text into tokens. And I understand that, somewhere in BERT, the model maps tokens into token embeddings that represent the meaning of the tokens. But ...
5
votes
1
answer
2k
views
Extend BERT or any transformer model using manual features
I have been working on a thesis on citation classification. I just implemented a BERT model for the classification of citations. I have 4 output classes; I give it an input sentence and my model returns ...
5
votes
2
answers
3k
views
Initialize HuggingFace Bert with random weights
How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual vs monolingual vs randomly initialized BERT in a masked language modeling task. While in ...
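For the randomly initialized baseline, constructing the model from a config alone (instead of from_pretrained) gives random weights; a minimal sketch:

from transformers import BertConfig, BertForMaskedLM

config = BertConfig()            # bert-base-sized architecture, default values
model = BertForMaskedLM(config)  # no from_pretrained, so weights are random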
5
votes
2
answers
2k
views
BERT-based NER model giving inconsistent prediction when deserialized
I am trying to train an NER model using the HuggingFace transformers library on Colab cloud GPUs, pickle it and load the model on my own CPU to make predictions.
Code
The model is the following:
from ...
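A hedged suggestion rather than a quote of the question's code: Hugging Face's own serialization sidesteps most pickle inconsistencies across machines, since it stores weights and config explicitly. `model` and `tokenizer` stand for the trained objects:

model.save_pretrained('ner_model')
tokenizer.save_pretrained('ner_model')

from transformers import AutoModelForTokenClassification, AutoTokenizer
model = AutoModelForTokenClassification.from_pretrained('ner_model')
tokenizer = AutoTokenizer.from_pretrained('ner_model')
model.eval()   # dropout off, so predictions are deterministic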
5
votes
2
answers
3k
views
Can I use BERT as a feature extractor without any finetuning on my specific data set?
I'm trying to solve a multilabel classification task of 10 classes with a relatively balanced training set consisting of ~25K samples and an evaluation set consisting of ~5K samples.
I'm using the ...
5
votes
1
answer
2k
views
Does BertForSequenceClassification classify on the CLS vector?
I'm using the Huggingface Transformer package and BERT with PyTorch. I'm trying to do 4-way sentiment classification and am using BertForSequenceClassification to build a model that eventually leads ...
5
votes
2
answers
3k
views
Why are the matrices in BERT called Query, Key, and Value?
Within the transformer units of BERT, there are modules called Query, Key, and Value, or simply Q,K,V.
Based on the BERT paper and code (particularly in modeling.py), my pseudocode understanding of ...
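A schematic of the scaled dot-product attention behind those names (illustrative, not a quote of modeling.py): a Query is "what this token is looking for", a Key is "what this token offers for matching", and a Value is "what this token hands back"; all three are linear projections of the same input:

import math
import torch

d = 64
X = torch.randn(10, d)                       # 10 token representations
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv             # query, key, value projections
scores = torch.softmax(Q @ K.T / math.sqrt(d), dim=-1)  # query-key matching
output = scores @ V                          # attention-weighted sum of values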
5
votes
1
answer
8k
views
run python parameters in Google Colab
I am running a Python file in Google Colab and getting an error. I am following a BERT text classification example from this link:
https://appliedmachinelearning.blog/2019/03/04/state-of-the-art-text-...
5
votes
2
answers
19k
views
ERROR: file:///content does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found
https://colab.research.google.com/drive/11u6leEKvqE0CCbvDHHKmCxmW5GxyjlBm?usp=sharing
The setup.py file is in the transformers folder (root directory), but this error occurs when I run
!git clone https://...
5
votes
1
answer
612
views
Cast topic modeling outcome to dataframe
I have used BERTopic with KeyBERT to extract some topics from some docs:
from bertopic import BERTopic
topic_model = BERTopic(nr_topics="auto", verbose=True, n_gram_range=(1, 4), ...
5
votes
2
answers
4k
views
BERT model: "enable_padding() got an unexpected keyword argument 'max_length'"
I am trying to implement the BERT model architecture using Hugging Face and Keras. I am learning this from Kaggle (link) and trying to understand it. When I tokenize my data, I face some problems ...
5
votes
1
answer
4k
views
Use BERT to predict multiple tokens
I'm looking for suggestions on using BERT and BERT's masked language model to predict multiple tokens.
My data looks like:
context: some very long context paragraph
question: rainy days lead to @...
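A hedged sketch of the simplest (greedy, non-joint) approach: place several [MASK] tokens and take the argmax at each masked position independently; the sentence is a placeholder:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()

text = f'Rainy days lead to {tokenizer.mask_token} {tokenizer.mask_token}.'
inputs = tokenizer(text, return_tensors='pt')
mask_positions = (inputs['input_ids'][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits
for pos in mask_positions:
    # Greedy: each mask filled independently of the other predictions.
    print(tokenizer.convert_ids_to_tokens(logits[0, pos].argmax().item()))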
5
votes
1
answer
9k
views
Get the value of '[UNK]' in BERT
I have designed a model based on BERT to solve an NER task. I am using the transformers library with the "dccuchile/bert-base-spanish-wwm-cased" pre-trained model. The problem comes when my model detects an ...
5
votes
2
answers
5k
views
How to use trained BERT model checkpoints for prediction?
I trained BERT on SQuAD 2.0 and got model.ckpt.data, model.ckpt.meta, and model.ckpt.index (F1 score: 81) in the output directory, along with predictions.json, etc., using the BERT-master/...
5
votes
1
answer
2k
views
Fine-tune a BERT model for context-specific embeddings
I'm trying to find information on how to train a BERT model, possibly from the Huggingface Transformers library, so that the embeddings it outputs are more closely related to the context of the text I'm ...
5
votes
1
answer
6k
views
How do I interpret my BERT output from Huggingface Transformers for Sequence Classification and tensorflow?
I am using BERT for a sequence classification task with 3 labels. To do this, I am using Huggingface Transformers with TensorFlow, more specifically the TFBertForSequenceClassification class with the ...
5
votes
2
answers
2k
views
Error using the BERT model from TensorFlow
I have tried to follow the TensorFlow instructions for using the BERT model: (https://www.tensorflow.org/tutorials/text/classify_text_with_bert)
However, when I run these lines:
text_test = ['this is such an ...
5
votes
1
answer
2k
views
Huggingface Trainer(): K-Fold Cross Validation
I am following this tutorial from TowardsDataScience for text classification using Huggingface Trainer.
To get a more robust model I want to do a K-Fold Cross Validation, but I am not sure how to do ...
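A hedged sketch: wrap Trainer in a KFold loop, re-instantiating the model each fold so no weights leak between folds. `encoded_dataset` stands for the question's tokenized Hugging Face dataset, and the model name and label count are assumptions:

import numpy as np
from sklearn.model_selection import KFold
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

args = TrainingArguments(output_dir='out', num_train_epochs=1)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

for fold, (train_idx, val_idx) in enumerate(kf.split(np.arange(len(encoded_dataset)))):
    model = AutoModelForSequenceClassification.from_pretrained(
        'bert-base-cased', num_labels=2)          # fresh model per fold
    trainer = Trainer(model=model, args=args,
                      train_dataset=encoded_dataset.select(train_idx),
                      eval_dataset=encoded_dataset.select(val_idx))
    trainer.train()
    print(f'fold {fold}:', trainer.evaluate())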
5
votes
1
answer
2k
views
TypeError: argmax(): argument 'input' (position 1) must be Tensor, not str
My code was working fine and when I tried to run it today without changing anything I got the following error:
TypeError: argmax(): argument 'input' (position 1) must be Tensor, not str
Would ...
5
votes
2
answers
1k
views
Loss function for comparing two vectors for categorization
I am performing an NLP task where I analyze a document and classify it into one of six categories. However, I do this operation at three different time periods. So the final output is an array of three ...
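One hedged option for this setup: treat each time period as its own six-way classification and sum the cross-entropy losses; the shapes below are illustrative:

import torch
import torch.nn as nn

batch, periods, classes = 8, 3, 6
logits = torch.randn(batch, periods, classes)          # model outputs
labels = torch.randint(0, classes, (batch, periods))   # gold category per period

loss_fn = nn.CrossEntropyLoss()
# One 6-way classification per time period, losses summed.
loss = sum(loss_fn(logits[:, t], labels[:, t]) for t in range(periods))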
5
votes
2
answers
258
views
Same sentences produce different vectors in XLNet
I have computed the vectors for two identical sentences using XLNet embedding-as-service. But the model produces different vector embeddings for the two identical sentences, hence the cosine similarity is ...
5
votes
0
answers
2k
views
Decode sentence representation derived from SentenceTransformer
Is it possible to decode a sentence representation derived from SentenceTransformer back to a sentence?
See example from the documentation
from sentence_transformers import SentenceTransformer
model = ...
5
votes
1
answer
6k
views
Unable to use custom dataset: AttributeError: 'list' object has no attribute 'keys'
I am trying to train a classification model with a custom dataset using Huggingface Transformers, but I keep getting errors. The last error seems solvable, but somehow I do not understand how.
What am I ...
5
votes
1
answer
2k
views
What does merge.txt file mean in BERT-based models in HuggingFace library?
I am trying to understand what the merge.txt file means in the tokenizers for the RoBERTa model in the HuggingFace library. However, nothing is said about it on their website. Any help is appreciated.
5
votes
4
answers
5k
views
Convert a BERT Model to TFLite
I have this code for a semantic search engine built using the pre-trained BERT model. I want to convert this model into TFLite for deployment with Google ML Kit, and I want to know how to convert it. I want ...
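A hedged sketch of the generic Keras-to-TFLite path; `keras_model` stands for the Keras-wrapped BERT model, and BERT's ops typically need the SELECT_TF_OPS fallback (plus quantization, given ML Kit's size constraints):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # post-training quantization
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,   # fall back to TF ops BERT may require
]
tflite_model = converter.convert()
open('bert.tflite', 'wb').write(tflite_model)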
5
votes
0
answers
1k
views
Huggingface Bert TPU fine-tuning works on Colab but not in GCP
I'm trying to fine-tune a Huggingface transformers BERT model on TPU. It works in Colab but fails when I switch to a paid TPU on GCP. Jupyter notebook code is as follows:
[1] model = transformers....
4
votes
2
answers
4k
views
Huggingface - save fine-tuned model locally - and tokenizer too?
I just wonder if the tokenizer is somehow affected or changed if I fine-tune a BERT model and save it. Do I need to save the tokenizer locally too, to reload it when using the saved BERT model later?
I ...
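Fine-tuning does not change the tokenizer, but saving it next to the model keeps the checkpoint self-contained; a minimal sketch, assuming model and tokenizer are the fine-tuned objects:

model.save_pretrained('my-finetuned-bert')
tokenizer.save_pretrained('my-finetuned-bert')

from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained('my-finetuned-bert')
tokenizer = AutoTokenizer.from_pretrained('my-finetuned-bert')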
4
votes
2
answers
9k
views
Why do we need the init_weight function in the BERT pretrained model in Huggingface Transformers?
In the Hugging Face Transformers code, many fine-tuning models have an init_weight function.
For example (here), there is an init_weight function at the end.
class ...
4
votes
1
answer
7k
views
BERT for time series classification
I’d like to train a transformer encoder (e.g. BERT) on time-series data for a task that can be modeled as classification. Let me briefly describe the data I’m using before talking about the issue I’m ...
4
votes
2
answers
9k
views
How to use word embeddings (e.g., Word2vec, GloVe, or BERT) to find the most similar word in a set of N words?
I am trying to calculate semantic similarity: given a list of words, output the word that is most similar to the other words in the list.
E.g.
If I pass in a list of words
words = ['portugal', '...
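One hedged approach, using sentence-transformers as a stand-in for any of the embedding models named above: embed every word, then return the word with the highest mean cosine similarity to the rest:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
words = ['portugal', 'spain', 'france', 'banana']
vecs = model.encode(words, normalize_embeddings=True)

sims = vecs @ vecs.T                 # cosine similarities (unit vectors)
np.fill_diagonal(sims, 0.0)          # ignore self-similarity
print(words[int(sims.mean(axis=1).argmax())])  # most central word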
4
votes
1
answer
11k
views
How can I use BERT for machine translation?
I have a big problem. For my bachelor thesis I have to make a machine translation model with BERT,
but I am not getting anywhere right now.
Do you know of any documentation or anything else that can help me ...
4
votes
3
answers
27k
views
OSError for huggingface model
I am trying to use a huggingface model (CamelBERT), but I am getting an error when loading the tokenizer:
Code:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer....
4
votes
2
answers
6k
views
How to increase the dimension (vector size) of a BERT sentence-transformers embedding
I am using sentence-transformers for semantic search, but sometimes it does not understand the contextual meaning and returns the wrong result,
e.g. BERT problem with context/semantic search in Italian ...
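If the goal is literally a larger output vector, sentence-transformers lets you append a trainable Dense projection module; a hedged sketch (note the projection adds capacity but no new semantic information until the Dense layer is trained):

from sentence_transformers import SentenceTransformer, models

word = models.Transformer('bert-base-uncased')
pooling = models.Pooling(word.get_word_embedding_dimension())
dense = models.Dense(in_features=pooling.get_sentence_embedding_dimension(),
                     out_features=1024)       # project 768 -> 1024
model = SentenceTransformer(modules=[word, pooling, dense])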
4
votes
1
answer
7k
views
How to prepare text for BERT - getting error
I am trying to learn BERT for text classification. I am running into some problems preparing the data for BERT.
From my Dataset, I am segregating the sentiments and reviews as:
X = df['sentiments']
y = ...
4
votes
1
answer
2k
views
Finetuning BERT on custom data
I want to train a 21-class text classification model using BERT. But I have very little training data, so I downloaded a similar dataset with 5 classes and 2 million samples,
and fine-tuned ...
4
votes
2
answers
7k
views
Saving BERT Sentence Embedding
I'm currently working on an information retrieval task. I'm using SBERT to perform a semantic search. I already followed the documentation here.
The model I use:
model = SentenceTransformer('sentence-...
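A hedged sketch of the usual persistence step: compute the corpus embeddings once and save them with NumPy so the semantic-search index need not be rebuilt on every run; the model name and corpus are placeholders:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
corpus = ['first document', 'second document']
embeddings = model.encode(corpus, convert_to_numpy=True)

np.save('corpus_embeddings.npy', embeddings)   # persist once
embeddings = np.load('corpus_embeddings.npy')  # reload in later runs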
4
votes
2
answers
4k
views
UnparsedFlagAccessError: Trying to access flag --preserve_unused_tokens before flags were parsed. BERT
I want to use the BERT language model to train a multi-class text classification task.
Previously I trained using an LSTM without any error, but BERT gives me this error.
I get the following error, and I ...