Recently Active 'large-language-model' Questions

0 votes

0 answers

10 views

CUDA Out of Memory Error Despite Having Multiple GPUs

I'm encountering a CUDA out-of-memory error while trying to run a PyTorch model, even though my system has multiple NVIDIA GPUs. # Load the tokenizer and model tokenizer = AutoTokenizer....

Flying-Meta

1

asked 5 hours ago

0 votes

0 answers

16 views

After uploading LLM to Google Colab, how to use it in a code?

Recently, for a project, I uploaded the Meta Llama 3 8B model from Hugging Face to Google Colab, since my PC could not meet the model's high VRAM requirements. Therefore I needed Colab's ...

Fedor

19.3k

modified 15 hours ago

1 vote

1 answer

65 views

How do I persist FAISS indexes?

The LangChain documentation for FAISS, https://python.langchain.com/v0.2/docs/integrations/vectorstores/faiss/, only talks about saving indexes to files. db.save_local("faiss_index") new_db = ...

Benyamin Jafari

31.6k

modified 16 hours ago

1 vote

0 answers

300 views

mT5 Question/Answering fine tuning is generating empty sentences during inference

mT5-small question-answering training converges to high training accuracy, high validation accuracy, and near-zero loss; however, when I test the model on questions it was trained on, I always receive empty ...

hiba younis

1

modified 16 hours ago

0 votes

1 answer

36 views

SQLite query to chunk group_concat into groups that maximize a length constraint?

I have data in a SQLite table that I'd like to process in "chunks" that include concatenated fields of multiple rows up to an overall limit of 10,000 chars per chunk. I can run queries ...

Sarol

141

modified 21 hours ago
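
For the SQLite chunking question above, one possible approach is a minimal sketch that does the packing on the Python side rather than in pure SQL, which is a different technique from the query the asker is looking for. It assumes a hypothetical table `items` with columns `id` and `body` (neither name comes from the question), non-NULL text, and a greedy rule: keep appending rows in `id` order until the next row would push the joined string past the limit. Greedy packing keeps chunks under the limit but does not guarantee the fewest chunks for arbitrary orderings.

```python
import sqlite3


def chunk_rows(conn, limit=10_000, sep=","):
    """Greedily pack consecutive rows into joined strings of at most `limit` chars."""
    chunks, current, length = [], [], 0
    # `items`, `id`, and `body` are hypothetical names used only for illustration.
    for (text,) in conn.execute("SELECT body FROM items ORDER BY id"):
        # Cost of adding this row: its length, plus a separator if the chunk is non-empty.
        added = len(text) + (len(sep) if current else 0)
        if current and length + added > limit:
            # The row does not fit: close the current chunk and start a new one.
            chunks.append(sep.join(current))
            current, length = [], 0
            added = len(text)
        current.append(text)
        length += added
    if current:
        chunks.append(sep.join(current))
    return chunks


# Small in-memory demonstration with a limit of 9 characters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO items (body) VALUES (?)",
    [("a" * 4,), ("b" * 4,), ("c" * 4,), ("d" * 9,)],
)
chunks = chunk_rows(conn, limit=9)
print(chunks)
```

Note that a single row longer than the limit still ends up in its own oversized chunk in this sketch; whether to split or reject such rows depends on the use case.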

0 votes

0 answers

25 views

Need to Implement Function calling for Mistral 7b-instruct v.02 model in Sagemaker

I am trying to add function calling to my chatbot code so that it actually fetches the tools if the user query is related to the tool. I based my attempt on a format I found on the internet, but I don't know where the error is. ...

vinoth kumar

111

modified yesterday

0 votes

0 answers

8 views

GGUF model in LM Studio returns broken answer

I am trying to run the GGUF model QuantFactory/T-lite-instruct-0.1-GGUF, specifically its quantized version T-lite-instruct-0.1.Q2_K.gguf, in LM Studio. Sometimes it works fine, but sometimes it returns "...

pav

99

asked yesterday

0 votes

0 answers

6 views

RuntimeError with DeBERTaV3 Sequence Classification: Tensor Size Mismatch

I am trying to fine-tune the microsoft/deberta-v3-base model for sequence classification with three labels. I have set up my tokenizer and data preprocessing, but I encounter a RuntimeError during ...

suri

21

asked yesterday

0 votes

1 answer

36 views

AzureChatOpenAI only uses one tool at a time

LangChain with AzureChatOpenAI only ever calls one tool at a time. When prompting the model to multiply and add two sets of numbers, I expect two tool calls; however, only one tool is called, ...

jowid

1

answered yesterday

0 votes

0 answers

16 views

Load Phi 3 small on Nvidia Tesla V100 - Flash Attention

I would like to ask whether it is possible to load and fine-tune Phi 3 small 8k. When I load the model, I get an error about missing Flash Attention. If I want to install the given package,...

talonmies

71.8k

modified yesterday

0 votes

0 answers

27 views

Unable to solve dtype issue using UnslothAI fine tuning for Llama 3.1 8B model

I am new to fine-tuning LLMs and I have been trying to run the notebooks provided by UnslothAI. For this question, I am running the code for fine-tuning the Llama 3.1 8B model as posted here. This Colab ...

adhok

411

asked yesterday

0 votes

1 answer

91 views

Could not find a version that satisfies the requirement llama-index-finetuning-cross-encoders

I'm trying to run the LlamaIndex example "How to Finetune a cross-encoder using LLamaIndex", but I cannot install the llama-index-finetuning-cross-encoders package. I tried this code: %pip install llama-index-...

Dmitry Grebenyuk

491

answered 2 days ago

2 votes

3 answers

2k views

Why is my vector database retrieving irrelevant results?

I'm trying to create a vector database in Python using LangChain for retrieval augmentation with a large language model. Currently, I'm using NCBI StatPearls (a corpus of medical data) and for testing ...

Pablo Mendes

401

answered 2 days ago

1 vote

1 answer

391 views

Local LLM as argument to initialize_agent function in langchain.agents

How do I correctly load a local LLM and use it in the initialize_agent function of the langchain library? I have an LLM, google/flan-t5-large (downloaded from Hugging Face), stored on my computer ...

ListenSoftware Louise Ai Agent

4,066

answered 2 days ago

1 vote

1 answer

471 views

How to prompt engineer LLM using LangChain to give "unable to answer question" when asked a question

I am currently using LangChain and OpenAI to build a Natural Language to SQL model. The issue I am having is that I want the model to return "I don't know" or "Please provide more ...

ListenSoftware Louise Ai Agent

4,066

answered 2 days ago
