All Questions

6 votes
1 answer
8k views

Sliding window for long text in BERT for Question Answering

I've read a post that explains how the sliding window works, but I cannot find any information on how it is actually implemented. From what I understand, if the input is too long, a sliding window can be ...
Benj's user avatar
  • 63
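
For readers wondering what a sliding window over a long input looks like in practice, here is a minimal sketch. It assumes the text has already been tokenized into a list; the function name `sliding_windows` and the parameters `max_len` and `stride` are illustrative, not taken from any library.

```python
from typing import List


def sliding_windows(tokens: List[str], max_len: int, stride: int) -> List[List[str]]:
    """Split tokens into windows of at most max_len tokens.

    Consecutive windows start `stride` tokens apart, so they overlap by
    max_len - stride tokens. (Hypothetical helper, for illustration only.)
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("need max_len > 0 and 0 < stride <= max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        # Stop once the current window reaches the end of the input.
        if start + max_len >= len(tokens):
            break
        start += stride
    return windows


tokens = [f"t{i}" for i in range(10)]
windows = sliding_windows(tokens, max_len=4, stride=2)
for w in windows:
    print(w)
```

Each window would then be paired with the question and scored separately, and the best-scoring answer across windows kept. The overlap reduces the chance that an answer is cut in half at a window boundary.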
5 votes
1 answer
2k views

TypeError: argmax(): argument 'input' (position 1) must be Tensor, not str

My code was working fine and when I tried to run it today without changing anything I got the following error: TypeError: argmax(): argument 'input' (position 1) must be Tensor, not str Would ...
veerendra bellapukonda's user avatar
3 votes
1 answer
4k views

How to map token indices from the SQuAD data to tokens from BERT tokenizer?

I am using the SQuAD dataset for answer span selection. After using the BertTokenizer to tokenize the passages, for some samples, the start and end indices of the answer don't match the real answer ...
KoalaJ's user avatar
  • 145
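
One common way to approach this kind of mismatch is to keep character offsets for every token and map the SQuAD character span onto them. The sketch below assumes a simple regex tokenizer as a stand-in for a real subword tokenizer; `tokenize_with_offsets` and `char_span_to_token_span` are hypothetical helper names. With Hugging Face fast tokenizers, comparable offsets can be requested with `return_offsets_mapping=True`.

```python
import re
from typing import List, Optional, Tuple


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Toy tokenizer (an assumption, not BERT's): words and punctuation with char offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def char_span_to_token_span(
    offsets: List[Tuple[str, int, int]], answer_start: int, answer_end: int
) -> Optional[Tuple[int, int]]:
    """Return (first_token, last_token) indices overlapping [answer_start, answer_end)."""
    covered = [
        i for i, (_, start, end) in enumerate(offsets)
        if start < answer_end and end > answer_start
    ]
    if not covered:
        return None
    return covered[0], covered[-1]


context = "BERT was released by Google in 2018."
answer = "Google"
answer_start = context.index(answer)  # SQuAD stores this character offset
offsets = tokenize_with_offsets(context)
span = char_span_to_token_span(offsets, answer_start, answer_start + len(answer))
print(span, [offsets[i][0] for i in range(span[0], span[1] + 1)])
```

Because the mapping works on character offsets rather than on token counts, it does not depend on how many subword pieces each word is split into.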
3 votes
1 answer
829 views

NLP : Get 5 best candidates from QuestionAnsweringPipeline

I am working on a French question-answering model using the Hugging Face Transformers library. I'm using a pre-trained CamemBERT model, which is very similar to RoBERTa but is adapted to French. Currently, ...
Benno Uths's user avatar
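
As a rough illustration of what "best candidates" means for extractive QA, the sketch below ranks answer spans by the sum of a start score and an end score. The scores are made-up numbers, and `top_k_spans` and `max_answer_len` are hypothetical names; this is not the pipeline's own implementation.

```python
from typing import List, Tuple


def top_k_spans(
    start_scores: List[float],
    end_scores: List[float],
    k: int = 5,
    max_answer_len: int = 15,
) -> List[Tuple[int, int, float]]:
    """Return the k highest-scoring (start, end, score) spans with start <= end."""
    candidates = []
    for s, s_score in enumerate(start_scores):
        # Only consider ends within max_answer_len tokens of the start.
        last = min(s + max_answer_len, len(end_scores))
        for e in range(s, last):
            candidates.append((s_score + end_scores[e], s, e))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(s, e, score) for score, s, e in candidates[:k]]


# Illustrative per-token scores (assumed values, not model output).
start_scores = [0.1, 2.0, 0.3, 1.5]
end_scores = [0.2, 0.5, 3.0, 1.0]
best = top_k_spans(start_scores, end_scores, k=3, max_answer_len=3)
for s, e, score in best:
    print(s, e, round(score, 2))
```

This brute-force version is quadratic in the worst case, which is usually acceptable for inputs of a few hundred tokens.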
3 votes
1 answer
505 views

How can I find the cosine similarity between two song lyrics represented as strings?

My friends and I are doing an NLP project on song recommendation. Context: We originally planned on giving the model a recommended song playlist that has the most similar lyrics based on the random ...
yyy818's user avatar
  • 33
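
A minimal way to compare two lyric strings is cosine similarity over word counts. The sketch below assumes a plain bag-of-words representation with lowercase word tokens; the sample lyrics are invented, and TF-IDF weighting or sentence embeddings would be natural refinements.

```python
import math
import re
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two strings."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    # Dot product over the words the two strings share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no direction to compare
    return dot / (norm_a * norm_b)


lyrics_a = "love me do you know I love you"
lyrics_b = "all you need is love"
similarity = cosine_similarity(lyrics_a, lyrics_b)
print(round(similarity, 3))
```

The result lies between 0 (no shared words) and 1 (identical word distributions), so it can be used directly to rank candidate songs.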
2 votes
1 answer
644 views

Custom query with deepset-ai Haystack

I am exploring deepset Haystack and found it very interesting for multiple use cases like a chatbot, search engine, document search, etc. But I have not found any reference where I can create multiple ...
Varun's user avatar
  • 5,001
2 votes
1 answer
680 views

Input/output format for Fine Tuning Huggingface RobertaForQuestionAnswering

I'm trying to fine-tune "RobertaForQuestionAnswering" on my custom dataset and I'm confused about the input params it takes. Here's the sample code. >>> from transformers import ...
tarang ranpara's user avatar
2 votes
1 answer
1k views

cannot import name 'DISTILBERT_PRETRAINED_MODEL_ARCHIVE_MAP' from 'transformers.modeling_distilbert'

I am trying to train the DistilBERT model for question answering. I have installed Simple Transformers and everything, but when I try to run the following command: model = ...
swapnil agashe's user avatar
2 votes
1 answer
1k views

What does the appearance of BERT's special characters in SQuAD's QA answers mean?

I'm running fine-tuned BERT and ALBERT models for Question Answering, and I'm evaluating the performance of these models on a subset of questions from SQuAD v2.0. I use SQuAD's official ...
Pedram's user avatar
  • 2,531
2 votes
0 answers
635 views

Transformation of tabular data into natural language for indexing for a search engine

How can tabular data with various columns and rows, as shown below, be transformed into a more readable (natural language) form so that it can be indexed for the downstream tasks of a search engine? I am aware ...
Sai_Vyas's user avatar
2 votes
0 answers
3k views

RuntimeError: The size of tensor a (546) must match the size of tensor b (512) at non-singleton dimension 1

I am using BertForQuestionAnswering from Hugging Face Transformers. I am getting a tensor size problem. I have tried setting the configuration using BertConfig, but it didn't solve the problem. Here is my ...
divya reddy yeruva's user avatar
1 vote
1 answer
3k views

ImportError: cannot import name 'SAVE_STATE_WARNING' from 'torch.optim.lr_scheduler'

I am attempting to run this statement in a Jupyter Notebook: from transformers import BertForQuestionAnswering. I get the error: ImportError: cannot import name 'SAVE_STATE_WARNING' from 'torch....
Scott Bing's user avatar
1 vote
1 answer
701 views

Understanding the Hugging Face transformers

I am new to the Transformers concept and I am going through some tutorials and writing my own code to understand SQuAD 2.0 question answering using transformer models. In the hugging ...
Vishnukk's user avatar
  • 564
1 vote
1 answer
1k views

How to use the TAPAS table question answering model when the table is large, e.g. containing 50000 rows?

I am trying to build a model in which I load a dataframe (an Excel file from Kaggle), and I am using the TAPAS-large-finetuned-wtq model to query this dataset. I tried to query 259 rows (the memory ...
akshit bhatia's user avatar
1 vote
1 answer
517 views

How can I build a custom context-based question answering model (SQuAD) using DeepPavlov?

I have the following queries: dataset format (i.e. how to split train, test and valid data); where to place the dataset; how to change the path for the dataset reader; how to save the model in my own ...
Sayak Ghanta's user avatar
1 vote
1 answer
947 views

How to add pooling layer to BERT QA for large text

I'm trying to implement a question answering system that deals with large input text. The idea is to split the large input text into subsequences of 510 tokens; afterwards I will generate the ...
John Smith's user avatar
1 vote
1 answer
414 views

QnA model using BERT

I'm trying to build a BERT model that takes a document as input. As BERT is limited to 512 tokens, it's unable to give an accurate answer. Now I'm trying to find an NLP model/way/algorithm which should ...
Amrutha k's user avatar
1 vote
1 answer
49 views

Given a subject and an object, what methods can I use to infer a possible verb?

Given a subject A and an object B, for example, A is "Peter" and B is "iPhone", Peter can be 'playing' or 'using' the iPhone; the verb varies depending on the context. In this case, what ...
YXlukarov's user avatar
1 vote
0 answers
449 views

Unable to split input into chunks when the length is greater than 512 in BERT

I'm working with the Hugging Face question-answering pipeline. My sentence length is 3535, but BERT only takes a length of 512, so I'm trying to divide it into chunks and work on those. In the code, I'm working on ...
Mark's user avatar
  • 331
1 vote
0 answers
78 views

Natural Language Processing technique for rephrasing question-answer pairs as full sentence?

Is there an NLP technique for rephrasing question-answer pairs as a full and grammatically correct sentence? For example: Question: Where does Joe live? Answer: Joe lives in Los Angeles. I’ve looked ...
ABCD's user avatar
  • 43
1 vote
1 answer
1k views

Multiple answer spans in context, BERT question answering

I am writing a Question Answering system using pre-trained BERT with a linear layer and a softmax layer on top. When following the templates available on the net the labels of one example usually only ...
Billy_GT's user avatar
0 votes
1 answer
3k views

Token indices sequence length is longer than the specified maximum sequence length for this model (28627 > 512)

I am using the Hugging Face DistilBERT model as a backend for a question-and-answer application. The text I am using to train the model is one very large single text field. Even though ...
Scott Bing's user avatar
0 votes
1 answer
490 views

How to run 'run_squad.py' on Google Colab? It gives 'invalid syntax' error

I downloaded the file first using: !curl -L -O https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py Then I used the following code: !python run_squad.py \...
JN17's user avatar
  • 21
0 votes
1 answer
344 views

I'm using a pre-trained BERT model for question answering. It returns the correct result but with a lot of spaces in the text

I'm using a pre-trained BERT model for question answering. It returns the correct result but with a lot of spaces in the text. The code is below: def get_answer_using_bert(question, ...
Nithin Reddy's user avatar
0 votes
1 answer
101 views

My ALBERT checkpoint files do not change when training

I am training an ALBERT model for a question answering task. I have 200 thousand question-answer pairs and I use a saved checkpoint file of 2 GB. I trained it on my GeForce 2070 RTX GPU with 1000 steps each ...
Việt Nguyễn's user avatar
0 votes
1 answer
27 views

Which LLM can be used for ranking organization job titles, and how?

Suppose there's a context like this context = Andy is a vice manager of finance department. \n Rio is a general manager finance deparment. \n Jason is a general manager finance deparment. question = ...
Jeremy Kenn's user avatar
0 votes
0 answers
51 views

Question-answer generation using BERT embeddings and an LSTM layer?

How do I use BERT embeddings as the embedding layer with an LSTM for question-answer generation? I have tried BERT embeddings, but I am not able to stack the BERT embeddings with an LSTM in question answer ...
Abhishek Pandey's user avatar
0 votes
0 answers
33 views

How to build a model in Python that lets me have a discussion (question and answer) with my data, which is in CSV format?

I have the above dataset. I want to create a model which allows me to ask questions of the data and answers them correctly. My questions are not predefined and depend on the user at that particular ...
apurva bhujbal's user avatar
0 votes
0 answers
60 views

BERT returns NaN as loss

I am trying to make a model that is trained on custom data to eventually make a chatbot from it. The problem is that training the model results in a NaN loss. I am trying to train the model and see ...
Vicklo's user avatar
  • 1
0 votes
0 answers
65 views

Using BERT Q&A model (SQuAD) to answer questions from a dataset

I am developing a custom BERT Q&A model (in the same format as SQuAD) with a view to posing questions to a dataset for an answer (the dataset is a large collection of reports). Is it possible to use ...
Jon's user avatar
  • 91
0 votes
0 answers
254 views

How to use pre-trained BERT question-answering model for text extraction in Python?

So, let's say I have the following CSV dataset. I have to use a pre-trained BERT question-answering model to train, predict and finally evaluate. As I am new to this, it would be helpful to see similar ...
JN17's user avatar
  • 21
0 votes
1 answer
470 views

BERT using the transformers pipeline and encode_plus function

when I use: modelname = 'deepset/bert-base-cased-squad2' model = BertForQuestionAnswering.from_pretrained(modelname) tokenizer = AutoTokenizer.from_pretrained(modelname) nlp = pipeline('question-...
0 votes
1 answer
780 views

Low accuracy when fine-tuning BERT for question answering

I'm trying to fine-tune CamemBERT (the French version of RoBERTa) for question answering. At first I'm using the CamemBERT model to generate the input embedding of the question and text, and an output linear ...
John Smith's user avatar