
All Questions

0 votes
0 answers
25 views

Need to implement function calling for Mistral 7B-Instruct v0.2 model in SageMaker

I am trying to add function calling to my chatbot code so that it actually fetches the tools when the user query is related to a tool. I was following a format I found online, but I don't know where the error is. ...
vinoth kumar
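A minimal prompt-based sketch of one way to do this, assuming v0.2 (which has no native tool-calling tokens): describe the tools inside the [INST] block, ask the model to reply with JSON only, and parse that JSON out of the generated text. The tool name and schema below are hypothetical.

    import json

    # Hypothetical tool registry; anything JSON-serializable works.
    TOOLS = [{
        "name": "get_order_status",
        "description": "Look up the status of an order by its id",
        "parameters": {"order_id": "string"},
    }]

    def build_prompt(query: str) -> str:
        # Mistral-Instruct prompt format: the whole instruction goes in [INST].
        return (
            "<s>[INST] You can call these tools:\n"
            + json.dumps(TOOLS, indent=2)
            + "\nIf a tool applies, reply ONLY with JSON of the form "
            '{"tool": "...", "arguments": {...}}; otherwise answer normally.\n'
            f"User query: {query} [/INST]"
        )

    def parse_tool_call(generated: str):
        # The model may ignore instructions and wrap JSON in prose; fail soft.
        try:
            call = json.loads(generated.strip())
            return call["tool"], call["arguments"]
        except (json.JSONDecodeError, KeyError, TypeError):
            return None, None

The generated string itself would come from invoking the SageMaker endpoint with build_prompt(query) as the input.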
0 votes
0 answers
38 views

Script for streaming Mistral-7B LLM output only streams on the server side; the client gets the full output

I designed a remote server-client pipeline that is supposed to load the model on the server and stream the model's output. At the moment, the output is streamed correctly, but only inside the ...
Phys
  • 518
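A sketch of one way to get true client-side streaming, assuming an HTTP transport: the server returns a chunked response from a generator, and the client must opt in with stream=True (without it, requests buffers the whole body, which looks exactly like "full output on the client"). The endpoint is hypothetical and the token generator is a stand-in for the real model loop.

    # server.py (FastAPI)
    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse

    app = FastAPI()

    def generate_tokens(prompt: str):
        # Stand-in for the real model stream, e.g. a TextIteratorStreamer.
        for word in ["streamed", "word", "by", "word"]:
            yield word + " "

    @app.get("/generate")
    def generate(prompt: str):
        # A generator body => chunked transfer encoding, flushed as produced.
        return StreamingResponse(generate_tokens(prompt), media_type="text/plain")

    # client.py
    import requests

    with requests.get("http://localhost:8000/generate",
                      params={"prompt": "hi"}, stream=True) as r:
        for chunk in r.iter_content(chunk_size=None, decode_unicode=True):
            print(chunk, end="", flush=True)

Note that an intermediate proxy or buffering layer can still coalesce chunks even when both ends are set up this way.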
1 vote
1 answer
49 views

Mistral 7B response starts with an extra leading space when streamed with Ollama

When I stream the response of the Mistral 7B LLM with Ollama, the very first streamed chunk has an extra leading space. Below is my code: import ollama; stream = ollama.chat(model='mistral', ...
noocoder777
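One workaround sketch, assuming the ollama Python client: strip the first streamed delta only, since the leading space belongs to the first token rather than to every chunk.

    import ollama

    stream = ollama.chat(
        model="mistral",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
        stream=True,
    )

    first = True
    for chunk in stream:
        text = chunk["message"]["content"]
        if first:
            # Only the very first delta carries the spurious leading space.
            text = text.lstrip()
            first = False
        print(text, end="", flush=True)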
0 votes
0 answers
70 views

My LLM application in Streamlit (using Python) takes a long time to generate a response

I am creating an LLM application using Ollama, LangChain, RAG, and Streamlit, with Mistral from Ollama as my LLM. However, after uploading the PDF file in Streamlit, it takes so much time ...
Urvesh
  • 358
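A frequent cause of this slowness is rebuilding the model and the vector index on every Streamlit rerun, since Streamlit re-executes the whole script on each interaction. A minimal sketch of caching the expensive objects with st.cache_resource, assuming the langchain-community Ollama wrapper:

    import streamlit as st
    from langchain_community.llms import Ollama

    @st.cache_resource  # built once per server process, reused across reruns
    def load_llm():
        return Ollama(model="mistral")

    llm = load_llm()
    question = st.text_input("Ask a question about your PDF")
    if question:
        st.write(llm.invoke(question))

The same decorator applies to the embedding model and the vector store built from the uploaded PDF, which are usually the slowest parts.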
0 votes
0 answers
100 views

Inference with the LLaVA v1.6 Mistral model on Amazon SageMaker

I've deployed the model llava-hf/llava-v1.6-mistral-7b-hf on Amazon SageMaker by simply pasting the deployment code from the model card (https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf). ...
Aleksandar Cvjetic
0 votes
1 answer
104 views

Mistral 7B Instruct input size limited

I recently fine-tuned a Mistral 7B Instruct v0.3 model and deployed it on an AWS SageMaker endpoint, but got errors like this: "Received client error (422) from primary with message "{"...
MaxS.
  • 25
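TGI returns a 422 when the prompt exceeds its configured input length, so the usual fix is to raise the limits through container environment variables at deploy time. A hedged sketch assuming the SageMaker Hugging Face LLM container; the S3 path, instance type, and token limits are placeholders to adapt:

    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    model = HuggingFaceModel(
        model_data="s3://my-bucket/mistral-7b-finetuned/model.tar.gz",  # placeholder
        role=sagemaker.get_execution_role(),
        image_uri=get_huggingface_llm_image_uri("huggingface"),
        env={
            "MAX_INPUT_LENGTH": "8192",   # max prompt tokens TGI will accept
            "MAX_TOTAL_TOKENS": "8704",   # prompt + generated tokens
        },
    )
    predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")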
0 votes
1 answer
42 views

TGI does not reference model weights

My server's proxy does not allow me to reach Hugging Face, so I downloaded the Mistral 7B weights from GitHub to another computer, sftp'd them over to the server, and untarred the contents: $ tar -tvf ...
juanchito
  • 482
1 vote
1 answer
216 views

Performing Function Calling with Mistral AI through Hugging Face Endpoint

I am trying to perform function calling using Mistral AI through the Hugging Face endpoint. Mistral AI requires input in a specific string format (assistant: ... \n user: ...). However, the input ...
Neo_clown
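Rather than hand-building the "assistant: ... \n user: ..." string, recent transformers versions can render the exact format the model expects, including a tool schema, via apply_chat_template. A sketch assuming the v0.3 instruct checkpoint (whose chat template understands tools), with a hypothetical endpoint URL and tool:

    from transformers import AutoTokenizer
    from huggingface_hub import InferenceClient

    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]
    messages = [{"role": "user", "content": "What's the weather in Paris?"}]

    # Renders conversation + tool schema into the string the model was trained on.
    prompt = tok.apply_chat_template(
        messages, tools=tools, tokenize=False, add_generation_prompt=True
    )

    client = InferenceClient("https://my-endpoint.example.com")  # placeholder URL
    print(client.text_generation(prompt, max_new_tokens=256))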
0 votes
0 answers
89 views

How to fix RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I am trying to use a custom CSV dataset to fine-tune the model TheBloke/Mistral-7B-Instruct-v0.1-GPTQ. I performed data preprocessing, split the dataset into train, validation, and test sets, and then ...
Mayor
  • 29
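This error typically means every tensor in the graph is frozen: GPTQ weights are not trainable, so fine-tuning needs adapters attached and gradients enabled on the inputs. A sketch assuming PEFT/LoRA on top of the quantized checkpoint (target module names vary by model):

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # Loading a GPTQ checkpoint requires the optimum/auto-gptq stack installed.
    model = AutoModelForCausalLM.from_pretrained(
        "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ", device_map="auto"
    )

    model = prepare_model_for_kbit_training(model)
    model.enable_input_require_grads()  # gives the graph something that requires grad

    lora = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()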
0 votes
1 answer
190 views

Using llama_index with a Mistral model

I'm new to the field of large language models (LLMs), so I apologize if my explanation isn't clear. I have a Mistral model running in a private cloud, and I have both the URL and the model name. URL = ...
khaoula
  • 67
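If the private deployment exposes an OpenAI-compatible API (as vLLM and several gateways do), llama_index can talk to it through the OpenAILike wrapper. A sketch with placeholder URL and model name:

    from llama_index.llms.openai_like import OpenAILike

    llm = OpenAILike(
        api_base="https://my-private-cloud.example.com/v1",  # placeholder URL
        api_key="not-needed",          # many private deployments ignore the key
        model="mistral-7b-instruct",   # placeholder model name
        is_chat_model=True,
    )
    print(llm.complete("Summarize retrieval-augmented generation in one sentence."))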
0 votes
0 answers
179 views

Open Source LLM Repeating Tokens Until Max Tokens Reached - How to Fix?

I'm working with an open-source language model (LLM) for generating text in Portuguese, and I'm encountering an issue where the model keeps repeating tokens until the maximum number of tokens is ...
Miguel Casagrande
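Endless repetition is usually addressed at decoding time: penalize already-generated tokens, block repeated n-grams, and make sure the end-of-sequence token can actually stop generation. A sketch with transformers generate (the parameter values are starting points, not recommendations):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2", device_map="auto"
    )

    inputs = tok("Escreva um haiku sobre o mar.", return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        repetition_penalty=1.15,        # discourage tokens already emitted
        no_repeat_ngram_size=3,         # hard-block repeated trigrams
        eos_token_id=tok.eos_token_id,  # allow a natural stop
        do_sample=True,
        temperature=0.7,
    )
    print(tok.decode(out[0], skip_special_tokens=True))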
0 votes
0 answers
300 views

Error while loading a Mistral LLM for fine-tuning: QLoRA doesn't work but full precision does

If I try to load the model in this way: bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True); model = ...
Antonio
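One detail that often breaks 4-bit (QLoRA-style) loading while full precision works is a BitsAndBytesConfig passed without a compute dtype or device map. A sketch of a complete 4-bit load, assuming a CUDA GPU with bitsandbytes and accelerate installed:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,  # frequently the missing piece
    )
    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1",
        quantization_config=bnb_config,
        device_map="auto",  # requires accelerate
    )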
0 votes
1 answer
706 views

Open LLM fine-tuning on a local machine

I would like to use an open-source or free LLM that is helpful for text processing and summarization, and that can also engage in chat to extract meaningful content. Are there ...
AJITH KUNDUKULANGARA JOSE
0 votes
1 answer
145 views

ModelError: An error occurred (ModelError) when calling InvokeEndpoint: received client error (400) from primary with message "{

I have trained a Mistral 7B model on AWS SageMaker, and the model weights are stored in an S3 location. I have deployed the endpoint, but when I try to invoke it, I get the error below ...
Tecena
  • 21
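A common cause of this 400 is a payload schema mismatch: the Hugging Face TGI serving container expects a JSON body of the form {"inputs": ..., "parameters": {...}} and rejects anything else. A sketch with a hypothetical endpoint name:

    import json
    import boto3

    runtime = boto3.client("sagemaker-runtime")

    payload = {
        "inputs": "<s>[INST] Summarize what Amazon SageMaker does. [/INST]",
        "parameters": {"max_new_tokens": 128},
    }
    resp = runtime.invoke_endpoint(
        EndpointName="mistral-7b-endpoint",  # placeholder name
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    print(json.loads(resp["Body"].read()))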
1 vote
1 answer
179 views

How to save the LLM2Vec model as a HuggingFace PreTrainedModel object?

Typically, we should be able to save a merged base + PEFT model like this: import torch; from transformers import AutoTokenizer, AutoModel, AutoConfig; from peft import PeftModel # Loading base MNTP ...
alvas
  • 120k
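The usual pattern here is: load the base model, attach the PEFT adapter, merge the adapter weights into the base with merge_and_unload(), and save the result, which is a plain PreTrainedModel. A sketch assuming the LLM2Vec MNTP checkpoints named in the question:

    from transformers import AutoModel, AutoTokenizer
    from peft import PeftModel

    base = AutoModel.from_pretrained(
        "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp"
    )
    model = PeftModel.from_pretrained(
        base, "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised"
    )
    merged = model.merge_and_unload()  # returns the underlying PreTrainedModel

    merged.save_pretrained("llm2vec-merged")
    AutoTokenizer.from_pretrained(
        "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp"
    ).save_pretrained("llm2vec-merged")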
