Highest scored 'bert-language-model+sentence-transformers' questions

16 votes

2 answers

31k views

Download pre-trained sentence-transformers model locally

I am using the SentenceTransformers library (here: https://pypi.org/project/sentence-transformers/#pretrained-models) for creating embeddings of sentences using the pre-trained model bert-base-nli-...

neha tamore

311

asked Dec 23, 2020 at 5:34

7 votes

1 answer

8k views

max_seq_length for transformer (Sentence-BERT)

I'm using sentence-BERT from Huggingface in the following way: from sentence_transformers import SentenceTransformer model = SentenceTransformer('all-MiniLM-L6-v2') model.max_seq_length = 512 model....

BlackHawk

779

asked Mar 31, 2023 at 17:29

6 votes

1 answer

4k views

Fine-tuning BERT sentence transformer model

I am using a pre-trained BERT sentence transformer model, as described here https://www.sbert.net/docs/training/overview.html , to get embeddings for sentences. I want to fine-tune these pre-trained ...

Fiori

301

asked Oct 13, 2021 at 21:38

5 votes

2 answers

258 views

Same sentence produces a different vector in XLNet

I have computed the vectors for two identical sentences using XLNet embedding-as-service, but the model produces different vector embeddings for the two sentences, hence the cosine similarity is ...

Anoop kottappuram

224

asked Jan 8, 2021 at 15:53

5 votes

0 answers

2k views

Decode sentence representation derived from SentenceTransformer

Is it possible to decode a sentence representation derived from SentenceTransformer back to a sentence? See example from the documentation from sentence_transformers import SentenceTransformer model = ...

KoKo

379

asked Sep 9, 2021 at 23:19

4 votes

1 answer

763 views

Restrict Vocab for BERT Encoder-Decoder Text Generation

Is there any way to restrict the vocabulary of the decoder in a Huggingface BERT encoder-decoder model? I'd like to force the decoder to choose from a small vocabulary when generating text rather than ...

Joseph Harvey

83

asked Oct 6, 2021 at 14:07

4 votes

2 answers

4k views

How to convert model.safetensor to pytorch_model.bin?

I'm fine-tuning a pre-trained BERT model and I have a weird problem: when I fine-tune using the CPU, the code saves the model with the "pytorch_model.bin" file. But when I use ...

Gabriel Henrique

53

asked Dec 23, 2023 at 20:43

3 votes

3 answers

1k views

String comparison with BERT seems to ignore "not" in sentence

I implemented a string comparison method using SentenceTransformers and BERT as follows: from sentence_transformers import SentenceTransformer from sklearn.metrics.pairwise import cosine_similarity ...

Tiago Bachiega de Almeida

121

asked Sep 7, 2021 at 16:18

3 votes

2 answers

2k views

Sentence-Transformer Training and Validation Loss

I am fine-tuning a Sentence-Transformers model (using PyTorch) on a custom dataset which is the same as the Semantic Text Similarity (STS) Dataset. I am unable to get (or print) the training ...

Abhas kumar

37

asked Mar 6, 2023 at 21:15

3 votes

1 answer

5k views

How to save sentence-Bert output vectors to a file?

I am using BERT to get the similarity between multi-term words. Here is the code I used for embedding: from sentence_transformers import SentenceTransformer model = SentenceTransformer('bert-large-...

Sahar Rezazadeh

314

asked Jul 11, 2021 at 9:16

2 votes

1 answer

752 views

ReadError: file could not be opened successfully. But I am not sure where the tar file is stored to resolve this

I am using biobert-embeddings==0.1.2 and torch==1.2.0 versions to embed some documents. But, I get the following error when I try to load the model by from biobert_embedding.embedding import ...

satish cc

21

asked Jul 14, 2021 at 8:46

2 votes

3 answers

2k views

SimpleTransformers Error: VersionConflict: tokenizers==0.9.4? How do I fix this?

I'm trying to execute the simpletransformers example from their site on google colab. Example: from simpletransformers.classification import ClassificationModel, ClassificationArgs import pandas as pd ...

Reema Q Khan

878

asked Jan 27, 2021 at 17:16

2 votes

1 answer

966 views

How to know if a word belongs to a Transformer model?

I use the Python library sentence_transformers with the models RoBERTa and FlauBERT. I use cosine scores to compute similarity, but for some words it doesn't work well. Those words seem to be the ones ...

Nathan Redin

43

asked Mar 28, 2022 at 15:36

2 votes

0 answers

292 views

Sentence Transformers can not get a lot of images' embeddings

When I try to get embeddings from images, I get an error like this: 'too many open files'. I have 50000 images, and I do not want to split the images into different folders and then concatenate the embeddings (It is ...

Vadim

39

asked Nov 11, 2022 at 9:56

2 votes

0 answers

326 views

Performing MLM pretraining on BERT pretrained model to use model in Sentence Transformer for semantic similarity

I have an NLP use case to compute semantic similarity between sentences that are very specific to my use case. I want to use the Sentence Transformers library to do this, which provides state of the ...

Martin Becuwe

117

asked Sep 7, 2022 at 12:20

2 votes

0 answers

192 views

Error loading quantized BERT model from local repository

After quantizing the BERT model, it works without any issue. But if I save the quantized model and load, it does not work. It shows an error message: 'LinearPackedParams' object has no attribute '...

user3190883

17

asked Jul 2, 2021 at 13:31

1 vote

1 answer

1k views

BERTopic Embeddings ValueError when transform a new text

I have created embeddings using SentenceTransformer and trained a BERTopic model on those embeddings. sentence_model = SentenceTransformer("all-MiniLM-L6-v2") embeddings = sentence_model....

Vai

179

asked Nov 18, 2022 at 16:33

1 vote

1 answer

2k views

How to list all documents/words per topic in bert topic modelling?

I read the docs, but I can see that the topics only show 3 or 4 documents per topic whereas the count is 2000+. Is there a way I can see all the assigned documents, instead of three/four documents per ...

Noob Coder

213

asked Jun 13, 2022 at 10:38

1 vote

1 answer

281 views

Error while using bert-base-nli-mean-tokens bert model

I am using this code: model = SentenceTransformer('bert-base-nli-mean-tokens') body = list(data['preprocessedBody']) bodyEmbedding = model.encode(body, show_progress_bar = True) However, I am getting ...

python_pi

113

asked Oct 27, 2022 at 21:35

1 vote

1 answer

27 views

bert-base-uncased does not use newly added suffix token

I want to add custom tokens to the BertTokenizer. However, the model does not use the new token. from transformers import BertTokenizer tokenizer = BertTokenizer.from_pretrained("bert-base-...

Lulacca

13

asked Jul 14, 2023 at 13:53

1 vote

0 answers

137 views

Fine-tune SentenceTransformer/SBERT for Extractive Text Summarization

NLP newbie here. I want to build extractive text summarization and have been reading https://huggingface.co/blog/how-to-train-sentence-transformers. I think there is a way to fine-tune the model with ...

Python Beginner

53

asked May 14, 2023 at 7:10

1 vote

0 answers

164 views

Sequence to sequence classification (predicting sequence of labels) using Transformers

I'm looking for a way to feed a transformer (HuggingFace trained model) a sequence of sentences (introducing context) in order to predict a sequence of labels. The goal is to predict each sentence by ...

Keren L

11

asked Apr 18, 2023 at 13:51

1 vote

0 answers

105 views

How can I fine-tune a sentence transformer without any labels?

I only have product descriptions and nothing else. I need to match similar products using cosine similarity. I have achieved this by taking embeddings from the Sentence Transformer. However, I need to ...

Margam Rohith Kumar

11

asked Apr 7, 2023 at 13:24

1 vote

1 answer

605 views

FastBert TypeError: forward() got an unexpected keyword argument 'masked_lm_labels'

I am following this tutorial and I have an error in this step: lm_learner.fit(epochs=30, lr=1e-4, validate=True, schedule_type="warmup_cosine", ...

Catapultaa

170

asked Sep 27, 2022 at 15:22

1 vote

1 answer

654 views

save_pretrained function with fine tuned bert model with cnn

class MixModel(nn.Module): def __init__(self,pre_trained='bert-base-uncased'): super().__init__() config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=...

Shorouk Adel

147

asked Sep 27, 2022 at 10:47

1 vote

0 answers

1k views

What is the maximum text length in tokens that can be given as input for a summarisation task using sentence transformer models?

Most BERT models take a maximum input length of 512 tokens. I used the sentence transformer model multi-qa-distilbert-cos-v1 with bert-extractive-summarizer for a summarisation task. A text with 792 ...

pheonix4821

11

asked Jul 6, 2022 at 4:11

1 vote

0 answers

346 views

Can't load pretrained model to generate embeddings

I am using this code to generate sentence embeddings with the hugging face transformer library, and I am getting this error. I can't seem to resolve this problem. Any pointers will help. Thanks. from ...

Maak

33

asked Feb 22, 2022 at 9:04

1 vote

0 answers

1k views

Improve the model prediction time in huggingface transformer models without GPU

I am using Hugging Face transformers models for quite a few tasks. They work well, but the only problem is the response time: it takes around 6-7 seconds to generate a result, while sometimes it even takes ...

DevPy

467

asked Nov 23, 2021 at 7:40

1 vote

0 answers

669 views

Why are the three embedding results so different from transformer models?

I want to get short text embeddings from transformer models, so I tested 3 ways to compute them. All 3 cases use models from the Huggingface Hub. inputs = tokenizer(text, padding=True, ...

marlon

6,847

asked Oct 29, 2021 at 5:31

0 votes

1 answer

1k views

sentence transformer using huggingface/transformers pre-trained model vs SentenceTransformer

This page has two scripts. When should one use the 1st method shown below vs. the 2nd? As nli-distilroberta-base-v2 was trained specially for finding sentence embeddings, won't that always be better than the first ...

user2543622

6,258

asked Jan 10, 2022 at 22:54

0 votes

1 answer

803 views

bert sentence_transformers list index out of range

I'm trying to use sentence_transformers to get BERT embeddings, but it can't process, for example, 300 documents; I keep getting the error IndexError: list index out of range. How to fix that? from ...

Aska

141

asked Jun 27, 2022 at 18:38

0 votes

1 answer

4k views

Pytorch model object has no attribute 'predict' BERT

I trained a BertClassifier model using PyTorch. After creating my best.pt, I would like to put my model into production and use it to predict and classify starting from a sample, so I resume them ...

Chiara

380

asked May 6, 2022 at 20:51

0 votes

2 answers

6k views

Can not find the pytorch model when loading BERT model in Python

I am following this article to find the text similarity. The code I have is this: from sentence_transformers import SentenceTransformer from tqdm import tqdm from sklearn.metrics.pairwise import ...

Feyzi Bagirov

1,342

asked Aug 4, 2021 at 15:12

0 votes

0 answers

16 views

Improving Similarity Measurement of Event Dates in Sentence Transformer Models

I'm developing a system to compute the similarity between textual descriptions of events using the sentence-transformers library. Despite trying various models, I am particularly struggling to capture ...

ashfak

85

asked Apr 15 at 5:19

0 votes

0 answers

27 views

How to evaluate the performance of sentence embedding models against benchmark dataset

I am relatively new to this field and would like guidance on how to effectively test an embedding model using a benchmark dataset. Specifically, I have acquired a few embedding models related to ...

Muhammad Daniyal

13

asked Apr 9 at 9:09

0 votes

0 answers

28 views

Use of the golden dataset in Augmented SBERT Training

I use the training strategy of Augmented SBERT (Domain-Transfer). In the code example they use the golden dataset (STSb) for the training evaluator. Here are two code snippets from the example of sentence-...

Christian01

491

asked Dec 21, 2023 at 11:09

0 votes

1 answer

62 views

classification report for adapters with transformers

I used this code, but I want to calculate a classification report, especially the F1 score, and I do not know how to do that: import numpy as np from transformers import TrainingArguments, AdapterTrainer, ...

Shorouk Adel

147

asked Aug 5, 2023 at 8:01

0 votes

0 answers

126 views

Bert Supervised model topics per class with only one class

I am trying to use a BERT Supervised model for topic modeling. I don't have the liberty to use topic_model = BERTopic(verbose=True). I have to download the pre-trained model locally and use it. I have ...

Shekar Tippur

165

asked Jul 3, 2023 at 1:25

0 votes

0 answers

28 views

What model can we use for sentence classification using the CSAbstruct dataset?

I am trying to train a model for sentence classification on the CSAbstruct dataset: https://github.com/allenai/sequential_sentence_classification/tree/master/data/CSAbstruct I tried with the RoBERTa base model ...

snk_24

9

asked May 2, 2023 at 22:14

0 votes

1 answer

407 views

BERT sentence embeddings as input features for support vector regression

I used the bert-base-multilingual-cased Tokenizer and model to extract sentence embeddings from Instagram captions. from transformers import BertTokenizer, BertModel tokenizer = BertTokenizer....

Tara van Mierlo

1

asked Apr 21, 2023 at 14:38

0 votes

0 answers

90 views

Where did the Transformer embedding numbers come from?

I'm a student studying Transformers. When I vectorize words with BERT and get a 768-dimensional vector for each word, I'm confused about where these numbers come from, is ...

intodarkmoon

1

asked Apr 2, 2023 at 7:15

0 votes

1 answer

686 views

How to add weights in BERT loss function

I have an unbalanced dataset of size N with these classes: class 1 - size 0.554*N, class 2 - size 0.271*N, class 3 - size 0.185*N. I'm trying to solve an NER task by fine-tuning the BERT model "dslim/bert-large-NER", ...

dkagramanyan

91

asked Feb 5, 2023 at 21:13

0 votes

1 answer

95 views

How to solve natural language inference using SentenceBERT?

How can I solve natural language inference using fine-tuned SentenceBERT models (e.g. sentence-transformers/all-MiniLM-L6-v2) to obtain better sentence vectors? Many of these models have ...

tedmosby

1

asked Jan 30, 2023 at 14:03

0 votes

1 answer

232 views

Print out the text value of the points on a cluster when using UMAP and HDBScan and BERT sentence transformer

I have seen a number of questions similar to this but my cluster labels consist of sentence embeddings, thus a better question may be how do I get text values from the sentence embeddings? How can I ...

Tam

105

asked Jun 20, 2022 at 11:24

0 votes

0 answers

577 views

Sentence Transformers - IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

I am using DistilBERT from sentence_transformers library on kaggle, but when I import my model and try to encode a sentence with it : modelB = SentenceTransformer('../input/sentence-transformer-models/...

Khadija

72

asked Feb 5, 2022 at 8:57

0 votes

1 answer

3k views

How to load Bert pretrained model with SentenceTransformers from local path?

I am using the SentenceTransformer library to use a BERT pre-trained model. I downloaded the file in Google Colab and saved it with these commands: from sentence_transformers import SentenceTransformer ...

Sahar Rezazadeh

314

asked Oct 18, 2021 at 19:06

0 votes

0 answers

357 views

How to download bert models and load in python?

How to download bert models and load in python? from sentence_transformers import SentenceTransformer model = SentenceTransformer('bert-base-nli-mean-tokens') How to save the pretrained model and ...

Nithin Reddy

650

asked Jun 12, 2021 at 15:51

0 votes

1 answer

344 views

How can I train a bert model for representational learning task that is domain specific?

I am trying to generate good sentence embeddings for a specific type of text. Using sentence transformer models and testing the similarity and clustering with k-means doesn't give good ...

adit94

1

asked Dec 8, 2020 at 14:09

-1 votes

2 answers

497 views

FileNotFound error downloading roberta-model sentence transformers

I've already downloaded the "roberta-large-nli-stsb-mean-tokens" model, but it starts downloading again and again. Note: This is not related to space, the machine has space. And this error ...

Arjit Yadav

1

asked May 6, 2021 at 14:37

-1 votes

1 answer

97 views

How does an NLP model know the output length during translation tasks?

Translating English to French, we may have this: Input: "Please help me translate this sentence" 6 tokens Output: "Merci de m'aider à traduire cette phrase" 7 ...

Worldbuffer

35

asked Aug 3, 2022 at 3:36
