All Questions
Tagged with bert-language-model python-3.x
86
questions
12
votes
8
answers
37k
views
SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /dslim/bert-base-NER/resolve/main/tokenizer_config.json
I am facing the issue below while loading a pretrained BERT model from Hugging Face, due to an SSL certificate error.
Error:
SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries ...
9
votes
1
answer
9k
views
BERT document embedding
I am trying to do document embedding using BERT. The code I use is a combination of two sources. I use BERT Document Classification Tutorial with Code, and BERT Word Embeddings Tutorial. Below is the ...
8
votes
1
answer
14k
views
How to store Word vector Embeddings?
I am using BERT word embeddings for a sentence classification task with 3 labels. I am using Google Colab for coding. My problem is, since I will have to execute the embedding part every time I restart ...
6
votes
1
answer
9k
views
Using BERT Embeddings in Keras Embedding layer
I want to use BERT word vector embeddings in the embedding layer of an LSTM instead of the usual default embedding layer. Is there any way I can do it?
6
votes
0
answers
6k
views
How to add index to python FAISS incrementally
I am using Faiss to index the embeddings of my huge dataset, generated by a BERT model. I want to add the embeddings incrementally. It works fine if I only add them with faiss.IndexFlatL2, but ...
6
votes
0
answers
2k
views
How to slice string depending on length of tokens
When I use (with a long test_text and short question):
from transformers import BertTokenizer
import torch
from transformers import BertForQuestionAnswering
tokenizer = BertTokenizer.from_pretrained('...
5
votes
3
answers
6k
views
AttributeError: 'str' object has no attribute 'dim' in pytorch
I got the following error output in PyTorch when passing model predictions into the model. Does anyone know what's going on?
The following is the model architecture that I created; in the error output, ...
5
votes
1
answer
8k
views
run python parameters in Google Colab
I am running a Python file in Google Colab and getting an error. I am following a BERT text classification example from this link:
https://appliedmachinelearning.blog/2019/03/04/state-of-the-art-text-...
5
votes
1
answer
612
views
Cast topic modeling outcome to dataframe
I have used BERTopic with KeyBERT to extract some topics from some docs:
from bertopic import BERTopic
topic_model = BERTopic(nr_topics="auto", verbose=True, n_gram_range=(1, 4), ...
5
votes
1
answer
9k
views
Get the value of '[UNK]' in BERT
I have designed a model based on BERT to solve an NER task. I am using the transformers library with the "dccuchile/bert-base-spanish-wwm-cased" pre-trained model. The problem comes when my model detects an ...
5
votes
1
answer
6k
views
Unable to use custom dataset: AttributeError: 'list' object has no attribute 'keys'
I am trying to train a classification model with a custom dataset using Hugging Face Transformers, but I keep getting errors. The last error seems solvable, but somehow I do not understand how.
What am I ...
4
votes
1
answer
7k
views
How to prepare text for BERT - getting error
I am trying to learn BERT for text classification. I am having some problems preparing data for BERT.
From my Dataset, I am segregating the sentiments and reviews as:
X = df['sentiments']
y = ...
3
votes
3
answers
5k
views
What is the difference between pooled output and sequence output in a BERT layer?
everyone! I was reading about Bert and wanted to do text classification with its word embeddings. I came across this line of code:
pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
3
votes
1
answer
2k
views
Bert pre-trained model giving random output each time
I was trying to add an additional layer after huggingface bert transformer, so I used BertForSequenceClassification inside my nn.Module Network. But, I see the model is giving me random outputs when ...
3
votes
1
answer
3k
views
How to combine embedding vectors of BERT with other features?
I am working on a classification task with 3 labels (0,1,2 = neg, pos, neu). Data are sentences. So to produce vectors/embeddings of sentences, I use a Bert encoder to get embeddings for each sentence ...
3
votes
3
answers
2k
views
Transformers pipeline model directory
I'm using Hugging Face's Transformers pipeline function to download the model and the tokenizer. My Windows PC downloaded them, but I don't know where they are stored on my PC. Can you please help ...
3
votes
0
answers
164
views
How to get the most similar match using BERT from a pandas column to an input string?
I am trying to find the most similar match in a column of a pandas dataframe to an input string that is not in English (Swedish). This is what I have tried. I have encoded both my input string and the ...
2
votes
1
answer
2k
views
zsh: no matches found: bertopic[visualization] [duplicate]
I am trying to install bertopic[visualization] on my MacBook Pro using
pip3 install bertopic[visualization]
but I am getting an error whenever I am running the above command. The error is as given ...
2
votes
2
answers
1k
views
Map BERTopic topic IDs back to the training dataframe
I have trained a BERTopic model on a dataframe of length 400k. I want to map the topic of each document to a new column inside the dataframe. I could do that by running a for loop on all the ...
2
votes
2
answers
5k
views
"Input is not valid. Should be a string, a list/tuple of strings or a list/tuple of integers." ValueError: Input is not valid
I am using a BERT tokenizer for French and I am getting this error, but I cannot seem to resolve it. Any suggestions would be appreciated.
Traceback (most recent call last):
File "training_cross_data_2....
2
votes
1
answer
2k
views
RuntimeError: Given groups=3, weight of size 12 64 3 768, expected input[32, 12, 30, 768] to have 192 channels, but got 12 channels instead
I started working with PyTorch recently, so my understanding of it isn't very strong. I previously had a 1-layer CNN but wanted to extend it to 2 layers, but the input and output channels have been ...
2
votes
1
answer
3k
views
How to freeze some layers of BERT in fine tuning in tf2.keras
I am trying to fine-tune 'bert-base-uncased' on a dataset for a text classification task. Here is how I am downloading the model:
import tensorflow as tf
from transformers import ...
2
votes
1
answer
3k
views
Calculate precision, recall, f1 score for custom dataset for multiclass classification Huggingface library
I am trying to do multiclass classification for a sentence-pair task. I uploaded my custom train and test sets separately as Hugging Face datasets, trained my model, tested it, and ...
2
votes
2
answers
2k
views
Fine-tune BERT for a specific domain on a different language?
I want to fine-tune a pre-trained BERT model.
However, my task uses data within a specific domain (say biomedical data).
Additionally, my data is also in a language different from English (say ...
2
votes
1
answer
949
views
BERT: How to use bert-as-service with BioBERT?
BioBERT throws the error shown below.
However, I am able to run other BERT versions (uncased_L-12_H-768_A-12 and SciBERT) using the statement below:
bert-serving-start -model_dir C:\Users\...
2
votes
0
answers
496
views
How to resolve the mismatch of pre-trained model parameter and current parameter?
I'm using a pre-trained BERT model for an NER task (bert-base-NER) and I need more token categories than the model has (PER, LOC, ORG, MISC, O). Based on that, I created my own dataset, which includes 7 categories, ...
2
votes
1
answer
388
views
BERTopic: pop from empty list IndexError while Inferencing
I trained a BERTopic model on Colab, and now when I try to use it locally I get an IndexError.
IndexError: Failed in nopython mode pipeline (step: analyzing bytecode)
pop from empty list
The ...
2
votes
0
answers
500
views
Using RoBERTa model with transformers-interpret library
I've been trying to use the transformers-interpret library and have been successful in getting results for Facebook's BART model, but not for RoBERTa.
My code goes as follows for the BART model :
...
2
votes
0
answers
558
views
TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str Bert Model
Hi, I encountered this error while training my BERT model for sentiment analysis, where my classes have 3 outcomes and my input data is text.
I got the above error while training the model. I ...
2
votes
2
answers
2k
views
Python RuntimeError: input sequence
I am trying to run NER on Indonesian-language text.
According to some resources I've read, the BERT model has positional embeddings only for the first 512 subtokens, so the model can't work with longer sequences. ...
1
vote
2
answers
5k
views
Pytorch - Caught StopIteration in replica 1 on device 1 error while Training on GPU
I am trying to train a BertPunc model on the train2012 data used in the git link: https://github.com/nkrnrnk/BertPunc.
While running on the server, with 4 GPUs enabled, below is the error I get:
...
1
vote
1
answer
1k
views
BERTopic Embeddings ValueError when transform a new text
I have created embeddings using SentenceTransformer and trained a BERTopic model on those embeddings.
sentence_model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = sentence_model....
1
vote
2
answers
2k
views
How can I train an XGBoost with a generator?
I'm attempting to stack a BERT TensorFlow model with an XGBoost model in Python. To do this, I have trained the BERT model and have a generator that takes the predictions from BERT (which ...
1
vote
1
answer
1k
views
BERT binary Textclassification get different results every run
I am doing binary text classification with BERT from Simpletransformers.
I work in Colab with GPU runtime type.
I have generated train and test set with the sklearn StratifiedKFold Method. I have two ...
1
vote
1
answer
298
views
get contrastive_logits_per_image with flava model using huggingface library
I used the FLAVA model code from this link:
https://huggingface.co/docs/transformers/model_doc/flava#transformers.FlavaModel.forward.example
But I am getting the following error:
'...
1
vote
1
answer
525
views
How to add simple custom pytorch-crf layer on top of TokenClassification model using pytorch and Trainer
I followed this link, but it's implemented in Keras.
Cannot add CRF layer on top of BERT in keras for NER
Model description
Is it possible to add a simple custom pytorch-crf layer on top of ...
1
vote
1
answer
551
views
AI Based Deduplication using Textual Similarity Measure in Python
Given I have a dataframe that contains rows like this
ID | Title | Abstract | Keywords | Author | Year
5875 | Textual Similarity: A Review | Textual Similarity has been used for measuring ... | X, Y, Z | James Thomas | ...
1
vote
0
answers
80
views
Python BERTopic 'numpy.float64' object cannot be interpreted as an integer
I am trying to replicate the Topic Modeling exercise from this article titled NLP Tutorial: Topic Modeling in Python with BerTopic. The article comes from the website HackerNoon if you'd prefer to ...
1
vote
0
answers
193
views
What if I have too many documents labelled in -1 cluster in bertopic?
I'm generating topics using BERTopic on a multilingual dataset (mainly Russian and English). I'm reducing the number of topics to 140. After generating topics, I'm analyzing their quality using the ...
1
vote
0
answers
146
views
Classification report in multi label
I am trying to use BERT for multi-label tasks. My dataset has 1000 samples. I first use train_test_split to take 80% of my dataset as a training set and 20% as a validation set. It is reasonable to say that ...
1
vote
1
answer
499
views
ValueError: [E109] Component 'tagger' could not be run. Did you forget to call `initialize()`?
I use Jupyter Notebook for writing code, but our team wants me to write code using Visual Studio Code so we can do version control and merges in Git. I set up my environment with new versions of ...
1
vote
1
answer
653
views
How to read BertForMaskedLM with BertModel?
I have fine-tuned BertForMaskedLM and now I want to read it with BertModel. But my saved model looks like this:
BertForMaskedLM(
(bert): BertModel(
(embeddings): BertEmbeddings(
(...
1
vote
0
answers
3k
views
How to pop elements from a tensor in Pytorch?
I want to drop/pop elements from a tensor in PyTorch, similar to the pop operation in Python. In the following code, if the condition is met, it removes two elements from the array, current and ...
1
vote
0
answers
630
views
finBert Model - Config JSON File - Outputs Nothing
This is for running the ProsusAI finBert Model.
(https://github.com/ProsusAI/finBERT - GitHub)
(https://huggingface.co/ProsusAI/finbert - HuggingFace)
I downloaded the pytorch_model.bin file and used ...
1
vote
1
answer
931
views
Getting predict.proba from BERT classifier
I have a classifier on top of BERT, and I would like to see the predicted probabilities in order to create the ROC curve. How do I get them? The predicted probabilities will be used to calculate the TPR ...
1
vote
1
answer
2k
views
Cannot import name 'network' from 'tensorflow.python.keras.engine'
When trying to load BERT QA I get the following ImportError:
"Cannot import name 'network' from 'tensorflow.python.keras.engine'"
The full error log follows below
Following this post,
...
1
vote
0
answers
633
views
Microsoft LayoutLM model error with huggingface
I was trying to use https://github.com/microsoft/unilm/tree/master/layoutlm for document classification, but was constantly getting "OSError: Unable to load weights from pytorch ...
1
vote
0
answers
61
views
How to Local Bert to Bert_module_hub
I just want to use my local BERT here:
bert_module = hub.Module(
BERT_MODEL_HUB,
trainable=True)
How do I add my local BERT?
I have Tensorflow==1.15 and python==3.7
def create_model(is_predicting, ...
1
vote
1
answer
557
views
'list' object has no attribute 'shape'
I am passing an embedding matrix to the embedding layer in Keras
model = Sequential()
model.add(Embedding(max_words, 30, input_length=max_len, weights=[all]))
model.add(BatchNormalization())
model.add(...
1
vote
0
answers
441
views
BERT - modify run_squad.py predictions file
I'm new to BERT and I'm trying to edit the output of run_squad.py to build a question answering system and obtain an output file with the following structure:
{
"data": [
{
"...