All Questions Tagged with bert-language-model, neural-network
31 questions
8 votes · 1 answer · 18k views
How is the number of parameters calculated in the BERT model?
The paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" by Devlin et al. reports 110M parameters for the base model size (i.e. L=12, H=768, A=12) ...
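A quick way to verify that figure, assuming the Hugging Face transformers library is installed, is to load the base checkpoint and count parameters directly:

# Hedged sketch: count the parameters of bert-base-uncased and compare
# with the ~110M reported in the paper for L=12, H=768, A=12.
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,}")  # roughly 110M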
7 votes · 1 answer · 2k views
Fine-tune BERT for a specific domain (unsupervised)
I want to fine-tune BERT on texts that are related to a specific domain (in my case related to engineering). The training should be unsupervised since I don't have any labels or anything. Is this ...
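One common answer is domain-adaptive pretraining with the masked-language-modeling objective, which needs no labels. A minimal sketch with Hugging Face transformers and datasets; engineering_corpus.txt is a hypothetical plain-text file with one passage per line:

# Hedged sketch of unsupervised domain fine-tuning via masked language modeling.
from datasets import load_dataset
from transformers import (BertTokenizerFast, BertForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "engineering_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

# the collator randomly masks 15% of tokens; the original ids become the labels
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

Trainer(model=model,
        args=TrainingArguments(output_dir="bert-engineering", num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=collator).train()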
5 votes · 2 answers · 5k views
How to use trained BERT model checkpoints for prediction?
I trained BERT on SQuAD 2.0 and got model.ckpt.data, model.ckpt.meta, and model.ckpt.index (F1 score: 81) in the output directory, along with predictions.json, etc., using the BERT-master/...
4 votes · 0 answers · 1k views
Word embeddings with BERT and map tensors to words
I am trying to aggregate BERT embeddings at the token level. For each token in the corpus vocabulary, I would like to create a list of all its contextual embeddings and average them to get one ...
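Not the asker's code, but one straightforward way to do this with Hugging Face transformers: run BERT over each sentence, bucket every token's contextual vector by token id, then average each bucket.

# Hedged sketch; the two-sentence corpus is a toy stand-in.
import torch
from collections import defaultdict
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

corpus = ["the bank of the river", "the bank approved the loan"]
buckets = defaultdict(list)

with torch.no_grad():
    for sent in corpus:
        enc = tokenizer(sent, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden)
        for tok_id, vec in zip(enc["input_ids"][0].tolist(), hidden):
            buckets[tok_id].append(vec)

# one averaged vector per vocabulary token seen in the corpus
averaged = {tok: torch.stack(vs).mean(dim=0) for tok, vs in buckets.items()}
print(tokenizer.convert_ids_to_tokens(list(averaged))[:5])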
3 votes · 3 answers · 5k views
What is the difference between pooled output and sequence output in the BERT layer?
I was reading about BERT and wanted to do text classification with its word embeddings. I came across this line of code:
pooled_output, sequence_output = self.bert_layer([input_word_ids, ...
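For reference: the sequence output is one hidden vector per input token, while the pooled output is the [CLS] vector passed through an extra dense + tanh layer. A small sketch with the Hugging Face BertModel (not the exact TF Hub layer from the question) showing the two shapes:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("a short example", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # sequence output: (batch, seq_len, hidden)
print(outputs.pooler_output.shape)      # pooled output:   (batch, hidden)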
3 votes · 1 answer · 3k views
ValueError: Unknown layer: TFBertModel. Please ensure this object is passed to the `custom_objects` argument
Here I am training a BERT model with the code below. When I load the saved model for prediction, it shows this error. Can anyone please help me out?
import tensorflow as tf
import logging
from ...
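A common fix for this error, sketched under the assumption that the model was saved with tf.keras (saved_model.h5 is a hypothetical path): pass the custom layer class via custom_objects so Keras knows how to deserialize it.

import tensorflow as tf
from transformers import TFBertModel

model = tf.keras.models.load_model(
    "saved_model.h5",  # hypothetical path to the saved model
    custom_objects={"TFBertModel": TFBertModel})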
3 votes · 2 answers · 3k views
How can I get all outputs of the last transformer encoder in a pretrained BERT model, and not just the CLS token output?
I'm using PyTorch and this is the model from Hugging Face transformers (link):
from transformers import BertTokenizerFast, BertForSequenceClassification
bert = BertForSequenceClassification....
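One way to get the per-token outputs with the same classes the question uses: request the hidden states, whose last element is the final encoder layer's output for every token, not just [CLS].

# Hedged sketch: output_hidden_states=True exposes every layer's token outputs.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", output_hidden_states=True)

inputs = tokenizer("an example sentence", return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)

# hidden_states is a tuple: embeddings plus one tensor per layer,
# each shaped (batch, seq_len, hidden); [-1] is the last encoder layer.
print(outputs.hidden_states[-1].shape)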
3 votes · 0 answers · 1k views
How to update the vocabulary of a pre-trained BERT model for my own training task?
I am working on a task of predicting a masked word using a BERT model. Unlike in other setups, the answer needs to be chosen from specific options.
For instance:
sentence: "In my daily [MASKED], ..."
options:...
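The usual recipe, sketched here with hypothetical domain tokens: add the words to the tokenizer and resize the model's embedding matrix, so the new ids get (randomly initialized) vectors that are learned during your training task.

from transformers import BertTokenizerFast, BertForMaskedLM

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

num_added = tokenizer.add_tokens(["domainword1", "domainword2"])  # hypothetical tokens
model.resize_token_embeddings(len(tokenizer))  # grow the embedding matrix to match
print(num_added, len(tokenizer))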
2 votes · 2 answers · 7k views
Using nn.CrossEntropyLoss between outputs and target labels
I use this function to train the model:
def train():
    model.train()
    total_loss, total_accuracy = 0, 0
    # empty list to save model predictions
    total_preds = []
    # iterate over ...
2 votes · 1 answer · 537 views
Multi Head Attention: Correct implementation of Linear Transformations of Q, K, V
I am implementing Multi-Head Self-Attention in PyTorch. I looked at a couple of implementations and they seem a bit wrong, or at least I am not sure why they are done the way they are. They would ...
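For comparison, a minimal sketch of the standard layout: a single full-width nn.Linear per role, reshaped into heads afterwards. This is mathematically equivalent to giving each head its own smaller projection, which is why many implementations look "wrong" at first glance.

import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.head_dim = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)  # one projection per role,
        self.k_proj = nn.Linear(d_model, d_model)  # covering all heads at once
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        b, s, _ = x.shape
        def split(t):  # (batch, seq, d_model) -> (batch, heads, seq, head_dim)
            return t.view(b, s, self.n_heads, self.head_dim).transpose(1, 2)
        q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, s, -1)  # merge heads back
        return self.out_proj(out)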
2 votes · 2 answers · 1k views
Translation between different tokenizers
Sorry if this question is too basic to ask here; I tried but couldn't find a solution.
I'm working on an NLP project that requires using two different models (BART for summarization and BERT ...
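One approach, sketched with fast tokenizers: both expose character offsets per token, so tokens from the two vocabularies can be aligned wherever their character spans overlap.

from transformers import BertTokenizerFast, BartTokenizerFast

text = "Tokenizers split text differently."
bert_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
bart_tok = BartTokenizerFast.from_pretrained("facebook/bart-base")

bert_enc = bert_tok(text, return_offsets_mapping=True)
bart_enc = bart_tok(text, return_offsets_mapping=True)

# a BERT token and a BART token are aligned if their character spans overlap
for bs, be in bert_enc["offset_mapping"]:
    if be == bs:  # skip special tokens, which have empty spans
        continue
    partners = [j for j, (s, e) in enumerate(bart_enc["offset_mapping"])
                if max(bs, s) < min(be, e)]
    print(text[bs:be], "->", partners)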
2 votes · 1 answer · 2k views
BigBird, or sparse self-attention: How to implement a sparse matrix?
This question is about the new paper Big Bird: Transformers for Longer Sequences, mainly about the implementation of the sparse attention (specified in the supplemental material, part D)...
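Not the paper's blocked implementation, but a tiny dense emulation of the pattern (sliding window, one global token, a few random links) applied as a boolean mask before the softmax; this is how dense prototypes usually fake the sparse matrix.

import torch

seq, window, n_rand = 16, 3, 2
i = torch.arange(seq)
mask = (i[None, :] - i[:, None]).abs() <= window  # sliding-window attention
mask[:, 0] = True                                 # token 0 is global: attended by all
mask[0, :] = True                                 # ... and attends to all
rand = torch.randint(0, seq, (seq, n_rand))       # random attention links
mask[i[:, None], rand] = True

scores = torch.randn(seq, seq)
scores = scores.masked_fill(~mask, float("-inf"))
attn = torch.softmax(scores, dim=-1)  # each row attends only to allowed keys
print(mask.float().mean())            # fraction of pairs actually attended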
2 votes · 1 answer · 1k views
cannot import name 'DISTILBERT_PRETRAINED_MODEL_ARCHIVE_MAP' from 'transformers.modeling_distilbert'
I am trying to train a DistilBERT model for question answering.
I have installed Simple Transformers and everything, but when I try to run the following command:
model = ...
2 votes · 0 answers · 449 views
Updating model parameters of two models in one optimizer optimizes just one neural network
I'm trying to train two sequential neural networks in one optimizer. I read that this can be done by defining the optimizer as follows:
optimizer_domain = torch.optim.SGD(list(sentences_model....
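That optimizer pattern does work; what it cannot do is produce gradients for a network the loss never touches. A minimal sketch with hypothetical model_a / model_b showing both sets of parameters updating:

import torch
import torch.nn as nn

model_a = nn.Linear(10, 5)
model_b = nn.Linear(5, 2)
optimizer = torch.optim.SGD(
    list(model_a.parameters()) + list(model_b.parameters()), lr=0.01)

x = torch.randn(4, 10)
loss = model_b(model_a(x)).sum()  # the loss must depend on both networks

optimizer.zero_grad()
loss.backward()   # gradients now exist for model_a and model_b
optimizer.step()  # ... so both are updated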
2 votes · 1 answer · 2k views
How to compute the Hessian of a large neural network in PyTorch?
How to compute the Hessian matrix of a large neural network or transformer model like BERT in PyTorch? I know torch.autograd.functional.hessian, but it seems like it only calculates the Hessian of a ...
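For BERT-sized models the explicit Hessian is infeasible (it has #params² entries), so the usual workaround is Hessian-vector products via double backpropagation. A minimal sketch on a stand-in model:

import torch

model = torch.nn.Linear(10, 1)  # stand-in for a large network
x, y = torch.randn(8, 10), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

params = [p for p in model.parameters() if p.requires_grad]
grads = torch.autograd.grad(loss, params, create_graph=True)
v = [torch.randn_like(p) for p in params]  # an arbitrary direction vector

# H @ v = d/dp (grad . v), computed without materializing H
dot = sum((g * vi).sum() for g, vi in zip(grads, v))
hvp = torch.autograd.grad(dot, params)
print([h.shape for h in hvp])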
1 vote · 2 answers · 574 views
Fine-tuning BERT with my own entities/labels
I would like to fine-tune a BERT model with my own labels, like [COLOR, MATERIAL], and not the usual "NAME", "ORG".
I'm following this Colab: https://colab.research.google.com/drive/...
1 vote · 1 answer · 680 views
Overfitting training data but still improving on test data
My machine learning model massively overfits the training data but still performs quite well on test data. When using a neural network approach every iteration increases the accuracy on the test set ...
1 vote · 1 answer · 2k views
ValueError: Target size (torch.Size([32])) must be the same as input size (torch.Size([32, 3]))
I've looked at some explanations here, and I think I understand what is going wrong, but my error does not occur at the loss. For example, the snippet where the error occurs is the line outputs = ...
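A hedged guess at the usual cause of this exact message: nn.BCEWithLogitsLoss expects targets shaped like the logits, while 3-class logits with integer labels call for nn.CrossEntropyLoss.

import torch
import torch.nn as nn

logits = torch.randn(32, 3)          # (batch, num_classes)
labels = torch.randint(0, 3, (32,))  # (batch,) integer class ids

# nn.BCEWithLogitsLoss()(logits, labels.float())  # raises the size mismatch
loss = nn.CrossEntropyLoss()(logits, labels)      # matches these shapes
print(loss.item())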
1 vote · 0 answers · 184 views
Concatenating two pre-trained BERT models
max_length = 50
tokenizer = RobertaTokenizer.from_pretrained('roberta-large', do_lower_case=True)
encodings = tokenizer.batch_encode_plus(comments,max_length=max_length,pad_to_max_length=True, ...
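One way to combine two pretrained encoders, sketched with my own layout rather than the asker's: encode the input with each model, concatenate the pooled vectors, and classify on top.

import torch
import torch.nn as nn
from transformers import BertModel, RobertaModel

class TwinEncoder(nn.Module):
    def __init__(self, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.roberta = RobertaModel.from_pretrained("roberta-base")
        hidden = self.bert.config.hidden_size + self.roberta.config.hidden_size
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, bert_enc, roberta_enc):
        a = self.bert(**bert_enc).pooler_output       # (batch, hidden)
        b = self.roberta(**roberta_enc).pooler_output
        return self.classifier(torch.cat([a, b], dim=-1))

Each input must be tokenized with its own model's tokenizer before being passed in.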
1 vote · 0 answers · 61 views
How can I load my local BERT into hub.Module?
I just want to point my local BERT here:
bert_module = hub.Module(
    BERT_MODEL_HUB,
    trainable=True)
How can I add my local BERT? I have tensorflow==1.15 and python==3.7.
def create_model(is_predicting, ...
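If the local BERT was exported as a TF Hub module, hub.Module also accepts a local filesystem path (hypothetical below); a raw checkpoint would need to be exported as a module first.

import tensorflow_hub as hub

BERT_MODEL_HUB = "/path/to/local/bert_module"  # hypothetical exported-module directory
bert_module = hub.Module(BERT_MODEL_HUB, trainable=True)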
1 vote · 0 answers · 2k views
Predicting NER with BertForTokenClassification model
I have built my model using this tutorial on NER with BERT:
https://www.depends-on-the-definition.com/named-entity-recognition-with-bert/#resources
However, I could not figure out how to parse in a ...
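A generic inference sketch (the checkpoint and label handling are placeholders, not the tutorial's code): tokenize one sentence, take the argmax over the per-token logits, and map each token to its predicted tag id.

import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# in practice, load your fine-tuned checkpoint; this head is untrained
model = BertForTokenClassification.from_pretrained("bert-base-uncased", num_labels=3)

enc = tokenizer("John lives in Berlin", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits  # (1, seq_len, num_labels)

pred_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, tag_id in zip(tokens, pred_ids):
    print(tok, tag_id.item())  # map ids to tag names with your own label list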
0 votes · 1 answer · 2k views
Using KerasClassifier for training neural network
I created a simple neural network for binary spam/ham text classification using a pretrained BERT transformer. The current pure-Keras implementation works fine, but I wanted to plot certain metrics ...
0 votes · 1 answer · 1k views
BERT model not giving loss or logits when training in an epoch
I'm trying to train the model. This is the epoch loop
seed_val = 17
random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)
device = torch....
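A hedged guess at the common cause: the loss is only returned when labels are passed, and recent transformers versions return an output object rather than a tuple.

import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

enc = tokenizer("an example", return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**enc, labels=labels)      # loss exists only if labels are passed
print(outputs.loss, outputs.logits.shape)  # access as attributes, not tuple unpacking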
0 votes · 1 answer · 1k views
PyTorch Siamese NN with BERT for sentence matching
I'm trying to build a Siamese neural network using PyTorch, in which I feed BERT word embeddings and try to find whether two sentences are similar (imagine duplicate post matching, product ...
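A minimal sketch of one such architecture (my own layout, not the asker's): a single shared BERT encoder applied to both sentences, mean-pooled, scored with cosine similarity.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class SiameseBert(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")  # shared weights

    def embed(self, enc):
        out = self.encoder(**enc).last_hidden_state
        return out.mean(dim=1)  # mean-pool token vectors into one sentence vector

    def forward(self, enc_a, enc_b):
        return torch.cosine_similarity(self.embed(enc_a), self.embed(enc_b))

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = SiameseBert()
a = tokenizer("the cat sat", return_tensors="pt")
b = tokenizer("a cat was sitting", return_tensors="pt")
print(model(a, b))  # similarity score in [-1, 1]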
0 votes · 1 answer · 566 views
Sequence labeling with BERT for word positions
Suppose I have a set of sentences, and in these sentences there are some dependencies between words.
I want to train BERT to predict which words have dependencies with which others.
For example, if I have this ...
0 votes · 0 answers · 68 views
Neural network classifier always outputs the same class
I'm coding a neural network for a recommendation system using PyTorch. The items' metadata is a textual description, and the users' metadata is age and gender (binary values). I used a BERT encoder (with ...
0 votes · 1 answer · 3k views
How does the BERT loss function work?
I'm confused about how cross-entropy works in the BERT LM. To calculate the loss we need the true labels of the masked tokens. But we don't have a vector representation of the truth labels and the predictions ...
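A small sketch of what happens under the hood: at each masked position the model outputs a distribution over the whole vocabulary, and cross-entropy compares it with the true token id; every position labeled -100 is ignored.

import torch
from transformers import BertTokenizerFast, BertForMaskedLM

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tokenizer("The capital of France is Paris.", return_tensors="pt")
labels = torch.full_like(inputs["input_ids"], -100)  # -100 = ignored by the loss

mask_pos = 6  # index of "paris" in this tokenization; locate it programmatically
labels[0, mask_pos] = inputs["input_ids"][0, mask_pos]   # true id is the target
inputs["input_ids"][0, mask_pos] = tokenizer.mask_token_id

outputs = model(**inputs, labels=labels)
print(outputs.loss)  # cross-entropy over the vocabulary at the masked position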
0 votes · 1 answer · 3k views
RuntimeError: shape '[4, 512]' is invalid for input of size 1024 while evaluating test data
I am trying XLNet on the Jigsaw toxic comment dataset.
When I train my data with
input_ids = d["input_ids"].reshape(4,512).to(device) # batch size x seq length
it trains perfectly.
But when I try to ...
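A hedged guess at the cause: the final test batch holds only 2 examples, so 2 × 512 = 1024 elements cannot fill a (4, 512) view; letting PyTorch infer the batch dimension avoids the crash.

import torch

seq_len = 512
last_batch = torch.arange(2 * seq_len)  # a final batch of only 2 examples

# last_batch.reshape(4, seq_len)             # fails: 1024 elements != 4 * 512
input_ids = last_batch.reshape(-1, seq_len)  # -1 infers the true batch size (2)
print(input_ids.shape)                       # torch.Size([2, 512])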
0 votes · 0 answers · 162 views
How to use BigBirdModel to create a neural network in Python?
I am trying to create a network with TensorFlow and BigBird.
from transformers import BigBirdModel
import tensorflow as tf
classic_model = BigBirdModel.from_pretrained('google/bigbird-roberta-base')
...
0 votes · 0 answers · 186 views
Problem training an RNN using BERT embeddings
I have been working with BERT embeddings in a neural network model for a sentiment classification task. During model fitting it gives an indices error, and since I am still new to this I was not able to ...
-1 votes · 1 answer · 793 views
Should feature embeddings be taken before or after dropout layer in neural network?
I am training a binary text classification model using BERT as follows:
def create_model():
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text')
    preprocessed_text = ...