All Questions

9 votes
1 answer
9k views

BERT document embedding

I am trying to do document embedding using BERT. The code I use is a combination of two sources: the BERT Document Classification Tutorial with Code and the BERT Word Embeddings Tutorial. Below is the ...
asked by MRM
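A common way to build the document embedding asked about above is to mean-pool BERT's last hidden states over the real (non-padding) tokens. A minimal sketch with Hugging Face transformers; the checkpoint name and the pooling choice are assumptions, not details from the question:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_document(text: str) -> torch.Tensor:
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    hidden = outputs.last_hidden_state               # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)    # (1, seq_len, 1)
    # Mean-pool over real tokens only, ignoring padding.
    return (hidden * mask).sum(1) / mask.sum(1)      # (1, 768)
```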
6 votes
1 answer
9k views

Using BERT Embeddings in Keras Embedding layer

I want to use the BERT Word Vector Embeddings in the Embeddings layer of LSTM instead of the usual default embedding layer. Is there any way I can do it?
asked by PeakyBlinder
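One workaround for the question above is to export BERT's static WordPiece embedding table and load it as frozen weights in a Keras Embedding layer; note these are the input embeddings, not the contextual ones. A sketch under those assumptions:

```python
import tensorflow as tf
from transformers import BertModel

# Pull the static WordPiece embedding table out of a BERT checkpoint.
bert = BertModel.from_pretrained("bert-base-uncased")
weights = bert.get_input_embeddings().weight.detach().numpy()  # (30522, 768)

embedding_layer = tf.keras.layers.Embedding(
    input_dim=weights.shape[0],
    output_dim=weights.shape[1],
    weights=[weights],
    trainable=False,  # keep the pretrained vectors frozen
)

# Inputs to this model must be WordPiece ids from BERT's tokenizer.
model = tf.keras.Sequential([
    embedding_layer,
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```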
5 votes
3 answers
1k views

BERT token vs. embedding

I understand that WordPiece is used to break text into tokens. And I understand that, somewhere in BERT, the model maps tokens into token embeddings that represent the meaning of the tokens. But ...
asked by i82much
3 votes
2 answers
2k views

Where can I get the pretrained word embeddings for BERT?

I know that BERT has a total vocabulary size of 30,522, which contains some words and subwords. I want to get the initial input embeddings of BERT. So my requirement is to get the table of size [30522, ...
asked by Ruchit
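For reference, the table the question asks for is directly accessible in Hugging Face transformers (checkpoint name assumed):

```python
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
# nn.Embedding holding the initial, context-independent input embeddings.
table = model.get_input_embeddings().weight
print(table.shape)  # torch.Size([30522, 768])
```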
3 votes
1 answer
2k views

Using Sentence-BERT with other features in scikit-learn

I have a dataset with one text feature and four other features. The Sentence-BERT vectorizer transforms the text into dense vectors, which I can feed directly to a machine learning classifier. Can ...
asked by Narges Se
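A common pattern for this question is to concatenate the sentence embeddings with the remaining tabular features before fitting a classifier. A minimal sketch; the model name, feature values, and classifier are assumptions:

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

model = SentenceTransformer("all-MiniLM-L6-v2")

texts = ["first example", "second example"]
other_features = np.array([[0.1, 3, 1, 0], [0.7, 1, 0, 1]])  # 4 extra columns
labels = np.array([0, 1])

text_vectors = model.encode(texts)              # dense (n, 384) array
X = np.hstack([text_vectors, other_features])   # combine text + tabular
clf = LogisticRegression().fit(X, labels)
```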
3 votes
1 answer
5k views

How to save Sentence-BERT output vectors to a file?

I am using BERT to get the similarity between multi-term words. Here is the code I used for embedding: from sentence_transformers import SentenceTransformer model = SentenceTransformer('bert-large-...
asked by Sahar Rezazadeh
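Since SentenceTransformer.encode returns a NumPy array, the vectors can be written with standard NumPy I/O; a sketch (model and file names assumed):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["first phrase", "second phrase"])

np.save("embeddings.npy", embeddings)   # binary format, exact round-trip
loaded = np.load("embeddings.npy")
assert np.allclose(embeddings, loaded)
```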
3 votes
0 answers
3k views

NLP - Best document embedding library

Good day, fellow humans (?). I have a methodological question, muddled by a lot of research done in a tiny amount of time. The question arises from the following problem(s): I need to apply semi-...
asked by cyberZamp
3 votes
0 answers
372 views

Does the ktrain package combine input embeddings with BERT embeddings when used for text classification?

I am running the code given in the link below. What embeddings does the Python ktrain package use for BERT text classification? I believe the code is using a pre-trained BERT model. In that, is ...
asked by POOJA BHATIA
2 votes
2 answers
1k views

keras LSTM get hidden-state (converting sentence-sequence to document context vectors)

I'm trying to create document context vectors from sentence vectors via an LSTM using Keras (so each document consists of a sequence of sentence vectors). My goal is to replicate the following blog post ...
asked by Felix
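In Keras, the LSTM's final hidden state can be exposed with return_state=True and used as the document vector. A minimal sketch; the dimensions are assumptions:

```python
import tensorflow as tf

sentence_dim, max_sentences = 384, 20

inputs = tf.keras.layers.Input(shape=(max_sentences, sentence_dim))
# return_state=True also returns the final hidden state h and cell state c.
outputs, state_h, state_c = tf.keras.layers.LSTM(128, return_state=True)(inputs)

# state_h is the document context vector.
doc_encoder = tf.keras.Model(inputs, state_h)
```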
2 votes
1 answer
2k views

CUDA out of memory problem on GPU in Google Colab

I am trying to run code to get stacked embeddings from flair and BERT, and I am getting the following error. One of the suggestions was to reduce the batch size, but how do I pass the data in batches? Here ...
asked by shankar
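Embedding the corpus in small chunks instead of all at once is the usual fix for this kind of OOM; a generic sketch (the chunk size and the embedding callable are assumptions):

```python
def embed_in_batches(sentences, embed_fn, batch_size=32):
    """Run an embedding function over a list in fixed-size chunks."""
    results = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        results.extend(embed_fn(batch))  # embed_fn is your flair/BERT call
    return results
```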
2 votes
0 answers
195 views

Can we use BERT on an image with embeddings created with the help of a CNN architecture?

I am trying to use BERT for images. These are the steps I'm considering for this approach: create an embedding of an image using VggNet (extracting the avgpool layer from the network); using PCA ...
asked by lazytux
1 vote
1 answer
259 views

Understanding the results of Vespa BERT embeddings

I am copying parts of the Simple Semantic Search sample application at https://github.com/vespa-engine/sample-apps/tree/master/simple-semantic-search to get started with dense vector search. I have ...
asked by Roope K
1 vote
1 answer
3k views

Max position embedding in BERT

I'm studying BERT right now. I thought BERT limited position embeddings to 512 because of memory constraints. However, when I looked up the BERT code on Hugging Face I found this parameter in the config. ...
asked by Q_Jay
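The limit lives in the model config as max_position_embeddings; a quick way to inspect it (checkpoint name assumed):

```python
from transformers import BertConfig

config = BertConfig.from_pretrained("bert-base-uncased")
# The position-embedding table has one row per position, so inputs longer
# than this cannot be encoded without resizing the table.
print(config.max_position_embeddings)  # 512
```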
1 vote
1 answer
931 views

Why can BERT's three embeddings be added?

I already know the meaning of token embeddings, segment embeddings, and position embeddings. But why can these three vectors be added together? The size and direction of vectors change after the ...
asked by Ray Tom
1 vote
2 answers
93 views

Calculating embedding overload problems with BERT

I'm trying to calculate the embedding of a sentence using BERT. After I input the sentence into BERT, I calculate the mean pooling, which is used as the embedding of the sentence. Problem: my code can ...
asked by edamame
1 vote
1 answer
514 views

Is it normal that model outputs differ slightly on different platforms?

I am using Hugging Face to generate BERT embeddings for text, but they are slightly different for the same text on my Mac and Linux machines. For example, one pair of results: Mac [0.9832047820091248, ...
asked by marlon
1 vote
1 answer
292 views

How to store BERT embeddings in Cassandra

I want to use Cassandra as a feature store for precomputed BERT embeddings. Each row would consist of roughly 800 floats (e.g. -0.18294132). Should I store all 800 in one large string column or 800 ...
asked by casualprogrammer
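Cassandra can hold such a vector natively as a list<float> column rather than one string or 800 separate columns; a sketch with the Python driver, where the keyspace, table, and host are assumptions:

```python
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("feature_store")

session.execute("""
    CREATE TABLE IF NOT EXISTS embeddings (
        doc_id text PRIMARY KEY,
        vector list<float>
    )
""")

session.execute(
    "INSERT INTO embeddings (doc_id, vector) VALUES (%s, %s)",
    ("doc-1", [-0.18294132, 0.04417342]),  # truncated example vector
)
```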
1 vote
2 answers
2k views

Get feature vectors from BertForSequenceClassification

I have successfully built a sentiment analysis tool with BertForSequenceClassification from huggingface/transformers to classify $tsla tweets as positive or negative. However, I can't find out how I ...
asked by Jonas De vos
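The feature vector can be recovered by asking the classifier for its hidden states and taking the [CLS] position of the last layer. A minimal sketch; the checkpoint name is an assumption:

```python
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

inputs = tokenizer("$tsla to the moon", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states[-1] is the last encoder layer; index 0 is the [CLS] token.
features = outputs.hidden_states[-1][:, 0, :]  # shape (1, 768)
```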
1 vote
1 answer
623 views

Do BERT word embeddings change depending on context?

Before answering "yes, of course", let me clarify what I mean: after BERT has been trained, and I want to use the pretrained embeddings for some other NLP task, can I once-off extract all ...
asked by Daniel von Eschwege
1 vote
1 answer
127 views

How to know which contextual embedding to use at test time

Models like BERT generate contextual embeddings for words with different contextual meanings, like 'bank' or 'left'. I don't understand which contextual embedding the model chooses to use at test time. ...
asked by Veydan
1 vote
1 answer
545 views

UnparsedFlagAccessError: Trying to access flag

I'm a beginner with BERT and I'm trying to run the example code the developers provided. Unfortunately, when I reach the cell that uses the BERT tokenizer I get this error ----------------------------...
asked by Luca Guidotto
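This error typically means abseil flags were never parsed, which is common in notebooks that never call app.run(). A hedged workaround, a general absl fix rather than something from the question, is to parse argv manually before any FLAGS access:

```python
import sys
from absl import flags

# Parse whatever is on argv so later FLAGS lookups don't raise
# UnparsedFlagAccessError; known_only skips notebook-injected flags.
flags.FLAGS(sys.argv, known_only=True)
```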
1 vote
0 answers
121 views

BadZipFile File is not a zip file

In the following code, I get a BadZipFile error. ctx = mx.gpu(0) bert = BertEmbedding(ctx=ctx) The output of the code is Downloading C:\Users\USER\.mxnet\models\...
asked by rabia qayyum
1 vote
0 answers
759 views

How to resolve errors in bert-embedding installation

import mxnet as mx Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Users\USER\anaconda3\envs\gpu3\lib\site-packages\mxnet\__init__.py", ...
asked by rabia qayyum
1 vote
1 answer
624 views

Sentence embeddings BERT

I need some info. I used this: https://towardsdatascience.com/improving-sentence-embeddings-with-bert-and-representation-learning-dfba6b444f6b to extract features, but I got word embeddings. If I want ...
asked by Elia Fabbris
1 vote
0 answers
276 views

How to generate embeddings using Bert

I started with the following Kaggle kernel: https://www.kaggle.com/taindow/bert-a-fine-tuning-example Afterwards, I used the following code, which is getting close: bert_config = modeling....
asked by madsthaks
0 votes
1 answer
3k views

How to extract a sentence embedding from the [CLS] token using a BERT model

I am following this link: BERT document embedding. I want to extract a sentence embedding from a BERT model using the [CLS] token. Here is the code: import torch from keras.preprocessing.sequence import ...
asked by MAC
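The [CLS] embedding is simply the first position of the last hidden state; a minimal sketch (checkpoint name assumed):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Position 0 of every sequence is the [CLS] token.
cls_embedding = outputs.last_hidden_state[:, 0, :]  # (1, 768)
```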
0 votes
1 answer
4k views

PyTorch loss function for making embeddings similar

I am working on an embedding model where a BERT model takes text inputs and outputs a multidimensional vector. The goal of the model is to find similar embeddings (high cosine ...
asked by krishanudb
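PyTorch ships a loss built for exactly this: nn.CosineEmbeddingLoss pulls pairs labeled +1 together and pushes pairs labeled -1 apart. A sketch with random stand-in vectors in place of real BERT outputs:

```python
import torch
import torch.nn as nn

loss_fn = nn.CosineEmbeddingLoss(margin=0.0)

# Stand-ins for two batches of BERT output vectors.
a = torch.randn(4, 768, requires_grad=True)
b = torch.randn(4, 768)
target = torch.tensor([1, 1, -1, -1])  # +1 = similar pair, -1 = dissimilar

loss = loss_fn(a, b, target)
loss.backward()
```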
0 votes
1 answer
2k views

About BERT embeddings (input_ids, input_mask)

As far as I understand BERT's operating logic, it changes 50% of the sentences it takes as input and leaves the rest untouched. 1) Is the changed part the operation performed by the tokenizer....
asked by gezgine
0 votes
0 answers
27 views

How to evaluate the performance of sentence embedding models against benchmark dataset

I am relatively new to this field and would like guidance on how to effectively test an embedding model using a benchmark dataset. Specifically, I have acquired a few embedding models related to ...
asked by Muhammad Daniyal
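A standard evaluation for this question is to score model similarities against human judgments from a benchmark such as STS and report Spearman correlation. A minimal sketch; the model name and the two benchmark rows are placeholders, not real data:

```python
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Placeholder benchmark rows: (sentence1, sentence2, gold similarity score).
pairs = [("A man is eating.", "Someone eats food.", 4.2),
         ("A dog runs.", "The stock market fell.", 0.1)]

emb1 = model.encode([p[0] for p in pairs])
emb2 = model.encode([p[1] for p in pairs])

cosine = np.sum(emb1 * emb2, axis=1) / (
    np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1))
gold = [p[2] for p in pairs]
print(spearmanr(cosine, gold).correlation)
```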
0 votes
1 answer
58 views

Can KMeans clustering use cosine distance in sklearn?

I want to cluster my documents using BERT embeddings from Sentence Transformers, specifically bert-base-nli-mean-tokens, and I want to cluster those embeddings with k-means, but I have a problem: ...
asked by Rakha
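scikit-learn's KMeans only supports Euclidean distance, but L2-normalizing the embeddings first makes Euclidean k-means equivalent to clustering by cosine similarity (for unit vectors, squared Euclidean distance is 2 - 2·cosine). A sketch with placeholder documents:

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bert-base-nli-mean-tokens")
embeddings = model.encode(["doc one text", "doc two text", "doc three text"])

# Unit-length vectors: Euclidean k-means now mirrors cosine distance.
unit = normalize(embeddings)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(unit)
```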
0 votes
0 answers
84 views

Memory-efficient BERT Text Embedding for Large Dataset Preprocessing in TensorFlow

I'm working with a dataset containing approximately 920,614 rows and multiple columns, including "orig_item_title", "sub_item_title", "is_brand_same", and "...
asked by krishna kaushik
0 votes
0 answers
112 views

KeyError: 'bert.embeddings.LayerNorm.weight'

I'm running this code on a Jetson Nano developer kit and getting the following error. Error log: File "intentc.py", line 89, in model = BertClassifier(h_size_bert=768, h_size_classifier=50, ...
asked by Fakhruddin Babar
0 votes
0 answers
23 views

Getting the Embedding of Text in a Dataframe

So I have a dataframe which contains 5 articles. Each article has been tokenized into sentences using NLTK. Here's an example of an article. I want to get the embedding of each sentence in ...
asked by intodarkmoon
0 votes
0 answers
140 views

Using BERT for long sentences

I have a document which consists of 200 rows (sentences). Each sentence has more than 512 tokens. I want to turn the whole document into a vector using BERT, but I am getting errors. I have tried this, ...
asked by ok rot
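BERT cannot encode more than 512 tokens at once, so long inputs are usually split into overlapping chunks whose embeddings are then averaged. A sketch of the chunking idea; the chunk size, stride, and checkpoint are assumptions:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_long_text(text, chunk_size=510, stride=255):
    # Tokenize without special tokens, then split into overlapping chunks.
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = [ids[i:i + chunk_size] for i in range(0, len(ids), stride)]
    vectors = []
    for chunk in chunks:
        # Re-add [CLS] and [SEP] around each chunk (510 + 2 = 512 tokens max).
        chunk_ids = [tokenizer.cls_token_id] + chunk + [tokenizer.sep_token_id]
        with torch.no_grad():
            out = model(input_ids=torch.tensor([chunk_ids]))
        vectors.append(out.last_hidden_state.mean(dim=1))  # mean-pool chunk
    return torch.cat(vectors).mean(dim=0)  # average the chunk embeddings
```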
0 votes
0 answers
90 views

Where did the Transformer embedding numbers come from?

I'm a student studying Transformers. When I vectorize words with BERT and get a 768-dimensional vector for each word, I'm confused about where these numbers come from. Is ...
asked by intodarkmoon
0 votes
1 answer
626 views

Can I feed categorical data into a Keras embedding layer without encoding the data?

I am trying to feed multi-column categorical data into a Keras embedding layer. Can I feed categorical data into a Keras embedding layer without encoding it? If not, which encoding method is preferable to ...
asked by Abdullah Al Munem
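Keras Embedding expects integer indices, but a StringLookup layer can do the integer encoding inside the model, so raw categories go straight in. A sketch; the vocabulary and sizes are assumptions:

```python
import tensorflow as tf

vocab = ["red", "green", "blue"]  # assumed category values

inputs = tf.keras.layers.Input(shape=(1,), dtype=tf.string)
# StringLookup maps raw strings to integer ids inside the graph.
lookup = tf.keras.layers.StringLookup(vocabulary=vocab)
ids = lookup(inputs)
embedded = tf.keras.layers.Embedding(
    input_dim=lookup.vocabulary_size(), output_dim=8)(ids)

model = tf.keras.Model(inputs, embedded)
print(model(tf.constant([["green"]])).shape)  # (1, 1, 8)
```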
0 votes
1 answer
621 views

Combining BERT and other types of embeddings

The flair model can give a representation of any word (it can handle the OOV problem), while the BERT model splits the unknown word into several sub-words. For example, the word "hjik" will ...
asked by Ali Haider Ahmad
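flair's StackedEmbeddings is the usual way to combine BERT with other embedding types into one vector per token; a sketch where the particular embedding choices are assumptions:

```python
from flair.data import Sentence
from flair.embeddings import (StackedEmbeddings, TransformerWordEmbeddings,
                              WordEmbeddings)

# StackedEmbeddings concatenates each sub-embedding's vector per token.
stacked = StackedEmbeddings([
    WordEmbeddings("glove"),
    TransformerWordEmbeddings("bert-base-uncased"),
])

sentence = Sentence("hjik is an unknown word")
stacked.embed(sentence)
for token in sentence:
    print(token.text, token.embedding.shape)
```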
0 votes
1 answer
657 views

How to concatenate new vectors onto an existing BERT vector?

For a sentence, I may extract a few entities, each embedded as a 256-dimensional vector. Then I compute an average over these entity vectors to get a single vector representing these ...
asked by marlon
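Averaging and concatenation are both one-liners in PyTorch; a small sketch with assumed dimensions and random stand-in vectors:

```python
import torch

sentence_vec = torch.randn(768)     # stand-in BERT sentence vector
entity_vecs = torch.randn(3, 256)   # three 256-dim entity embeddings

entity_avg = entity_vecs.mean(dim=0)               # (256,)
combined = torch.cat([sentence_vec, entity_avg])   # (1024,)
```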
0 votes
1 answer
344 views

How can I train a BERT model for a domain-specific representation learning task?

I am trying to generate good sentence embeddings for a specific type of text using sentence-transformers models, but testing the similarity and clustering with k-means doesn't give good ...
asked by adit94
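sentence-transformers supports fine-tuning on in-domain pairs; a minimal sketch with its fit API, where the base model, the two training pairs, and the loss choice are all assumptions:

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

model = SentenceTransformer("all-MiniLM-L6-v2")

# Assumed in-domain pairs with similarity labels in [0, 1].
train_examples = [
    InputExample(texts=["domain sentence A", "paraphrase of A"], label=0.9),
    InputExample(texts=["domain sentence A", "unrelated sentence"], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=10)
```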
0 votes
0 answers
186 views

Problem training an RNN using BERT embeddings

I have been working with BERT embeddings in a neural network model for a sentiment classification task. During model fit it gives an indices error, and I am still new to this, so I have not been able to ...
asked by shankar
-1 votes
1 answer
793 views

Should feature embeddings be taken before or after dropout layer in neural network?

I am training a binary text classification model using BERT as follows: def create_model(): text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='text') preprocessed_text = ...
asked by Jane Sully
-2 votes
1 answer
348 views

extract_features sentence embedding BERT

I'm using this code to get the embeddings of sentences in my dataset (I'm using my pretrained model): `python extract_features.py \ --input_file=/tmp/input.txt \ --output_file=/tmp/...
asked by Elia Fabbris