
Questions tagged [large-language-model]

A general tag for large language model (LLM)-related subjects. Please always use a more specific tag if one is available (GPT variants, PaLM, LLaMA, BLOOM, Claude, etc.).

large-language-model
-1 votes
0 answers
30 views

Embedding and Vector Search with Milvus

I am trying to make a RAG-based chatbot application that lets the user prompt in natural language and receive relevant information retrieved from a collection built from multiple tables, all of ...
Calvin Nguyen
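A minimal sketch of the retrieval step such a chatbot might use, assuming a pymilvus 2.x MilvusClient; the collection name, payload field, and encoder choice here are illustrative assumptions, not part of the question:

from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

# Embed the user's natural-language question, then vector-search Milvus.
encoder = SentenceTransformer("all-MiniLM-L6-v2")    # assumption: any sentence encoder works
client = MilvusClient(uri="http://localhost:19530")

question = "Which customers placed orders last month?"
query_vec = encoder.encode(question).tolist()

hits = client.search(
    collection_name="documents",   # hypothetical collection holding rows from all tables
    data=[query_vec],
    limit=5,
    output_fields=["text"],        # hypothetical payload field with the source row text
)
for hit in hits[0]:
    print(hit["distance"], hit["entity"]["text"])

The retrieved texts would then be stuffed into the LLM prompt as context.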
-2 votes
0 answers
32 views

Want to run a local LLM on an Nvidia Jetson AGX Orin GPU

I am looking to run a local LLM (Large Language Model) on an Nvidia Jetson AGX Orin using the GPU's CUDA cores. Could anyone provide guidance or share resources on how to achieve this? Thank you in ...
Mausam Jain
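One hedged way to confirm inference lands on the Orin's CUDA cores is to load a small model in fp16 with transformers on a CUDA build of PyTorch; the model choice below is only an example sized for the board:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

assert torch.cuda.is_available()  # verifies the Jetson's CUDA-enabled PyTorch is active

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumption: small enough for the board's memory
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("Hello from the Jetson:", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))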
0 votes
0 answers
27 views

How to increase maximum limit for completion_tokens in AWS Sagemaker invoke endpoint

I have deployed the meta-llama/Meta-Llama-3-8B-Instruct model using HuggingFaceModel. The model responds with the full output when I make a call using HuggingFaceModel's predictor method. Here is the ...
keerti4p
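A minimal sketch of one thing to check: with TGI-based deployments the generation cap is usually set per request, so passing a larger max_new_tokens in the payload when invoking the endpoint directly may help (the endpoint name below is a placeholder; the container's MAX_TOTAL_TOKENS environment variable can also cap output):

import boto3, json

client = boto3.client("sagemaker-runtime")
payload = {
    "inputs": "Explain attention in one paragraph.",
    "parameters": {"max_new_tokens": 1024},  # raise the per-request generation cap here
}
response = client.invoke_endpoint(
    EndpointName="my-llama3-endpoint",       # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))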
0 votes
0 answers
46 views

How to improve response time of Phi-3-medium-128k serverless API?

I have deployed the Phi-3-medium-128k model using Azure AI Studio (serverless deployment). I am using the v1/chat/completions API to get chat completions and I am streaming the response. The time to ...
Rithika Chowta
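A minimal sketch for measuring time-to-first-token on a streamed v1/chat/completions call; the URL and key are placeholders. Isolating that number helps separate queueing/network latency from generation latency, and a smaller max_tokens is one knob that reliably cuts total response time:

import time, requests

url = "https://<deployment>.inference.ai.azure.com/v1/chat/completions"  # placeholder
headers = {"Authorization": "Bearer <API_KEY>", "Content-Type": "application/json"}
body = {
    "messages": [{"role": "user", "content": "Summarize RAG in two sentences."}],
    "stream": True,
    "max_tokens": 128,  # smaller caps tend to cut total latency
}

start = time.time()
with requests.post(url, headers=headers, json=body, stream=True) as r:
    for line in r.iter_lines():
        if line:  # the first streamed chunk marks time-to-first-token
            print(f"first chunk after {time.time() - start:.2f}s")
            break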
0 votes
1 answer
39 views

How to get multimodal embeddings from CLIP model?

I'm hoping to use CLIP to get a single embedding for rows of multimodal (image and text) data. Say I have the following model: from PIL import Image import torch from transformers import CLIPProcessor,...
T_d
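One hedged approach, since the right combination depends on the downstream task: take CLIP's projected image and text features and concatenate (or average) them into a single vector. The image path is a placeholder:

from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")   # placeholder image path
text = "a photo of a cat"

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    img = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])

# L2-normalize each modality, then concatenate into one row per example.
img = img / img.norm(dim=-1, keepdim=True)
txt = txt / txt.norm(dim=-1, keepdim=True)
joint = torch.cat([img, txt], dim=-1)   # shape (1, 1024) for the base model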
-1 votes
0 answers
5 views

Are ChatGPT and LLMs killing Stack Overflow? [migrated]

For the last few months I have been using the ChatGPT LLM for coding, debugging, and troubleshooting. Earlier I used to google / post my questions on Stack Overflow, but now I have an instant solution to most of my coding ...
Naresh Chaurasia
2 votes
1 answer
139 views

Saving a fine-tuned Falcon HuggingFace LLM model

I'm trying to save my model so it won't need to re-download the base model every time I want to use it, but nothing seems to work for me; I would love your help with it. The following parameters are ...
Lidor Eliyahu Shelef
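If the fine-tune was done with PEFT/LoRA (common for Falcon), one sketch is to merge the adapter into the base weights and save everything to a local directory; the adapter and output paths here are placeholders:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")
model = PeftModel.from_pretrained(base, "./my-falcon-adapter")  # hypothetical adapter dir
model = model.merge_and_unload()            # fold the LoRA weights into the base model

model.save_pretrained("./falcon-merged")    # later: from_pretrained("./falcon-merged")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
tokenizer.save_pretrained("./falcon-merged")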
0 votes
1 answer
53 views

GPT-2 model from Hugging Face always generates the same result

Why are all the results I get from the GPT-2 model the same no matter what I feed into it? The following are the steps I took. First I downloaded the needed files from the official website. These ...
zhangtianpu
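A guess at the cause, offered as a sketch rather than a diagnosis: identical outputs across different prompts usually mean the prompt never reaches generate(), and identical outputs for the same prompt are expected because generate() is greedy by default. The sketch below passes the encoded prompt explicitly and enables sampling:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The weather today is"
inputs = tokenizer(prompt, return_tensors="pt")   # the prompt must be handed to generate()
out = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,    # without sampling, decoding is greedy and fully deterministic
    temperature=0.9,
    top_p=0.95,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))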
0 votes
0 answers
20 views

OOM error using PPO Trainer to LoRA-tune a 4-bit Llama-3-8B model (TRL Hugging Face library)

As per the standard for PPO training (which is to do supervised fine-tuning before running the PPO algorithm), I did a QLoRA fine-tuning of the Llama-3-8B Instruct model using my own custom data and ...
Aryaman Jaggi
0 votes
0 answers
90 views

Meta Llama-3 prompt sample

I am trying to ask the Llama-3 model to read a document and then answer my questions, but my code does not seem to generate any output. Can someone tell me what's wrong with the code? I appreciate it. Code: ...
Joey1205
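A common pitfall with Llama-3 Instruct is skipping its chat template. A minimal sketch using the tokenizer's built-in template, with the document text left as a placeholder:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

document = "..."  # the document text to ground answers in
messages = [
    {"role": "system", "content": f"Answer questions using only this document:\n{document}"},
    {"role": "user", "content": "What is the main conclusion?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))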
0 votes
0 answers
17 views

Chunking a Tokenized dataset

I am trying to experiment with the databricks-dolly-15k dataset to make it suitable for fine-tuning a Llama 2 model according to this article by Phil Schmid. The initial part of building the dataset is ...
Arindam
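A minimal sketch of the usual "concatenate then split" packing step, assuming a datasets.Dataset named tokenized_dataset that already has an input_ids column; 2048 is just an illustrative chunk size:

from itertools import chain

chunk_size = 2048  # assumption: the context length you pack to

def chunk(batch):
    # Flatten every tokenized example in the batch into one long token stream.
    concatenated = list(chain(*batch["input_ids"]))
    # Drop the ragged tail so every chunk has exactly chunk_size tokens.
    total = (len(concatenated) // chunk_size) * chunk_size
    chunks = [concatenated[i:i + chunk_size] for i in range(0, total, chunk_size)]
    return {"input_ids": chunks, "labels": [c[:] for c in chunks]}

packed = tokenized_dataset.map(
    chunk, batched=True, remove_columns=tokenized_dataset.column_names
)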
0 votes
0 answers
24 views

Export a teknium/OpenHermes-2.5-Mistral-7B model to ONNX

I am trying to export teknium/OpenHermes-2.5-Mistral-7B to ONNX. This is my code: import torch from transformers import AutoModelForCausalLM, AutoTokenizer import onnx model_name = "teknium/...
mohammed yazid Berrached
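Rather than hand-rolling torch.onnx.export, one sketch uses Hugging Face Optimum's export path, which handles the causal-LM graph details (past key/values, etc.); the output directory is a placeholder:

from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_id = "teknium/OpenHermes-2.5-Mistral-7B"
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)  # converts to ONNX on load
tokenizer = AutoTokenizer.from_pretrained(model_id)

model.save_pretrained("openhermes-onnx")      # writes the .onnx graph plus config
tokenizer.save_pretrained("openhermes-onnx")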
0 votes
0 answers
30 views

Google Canary, Docker and Gemini Nano

Can I run the latest Google Canary with the Gemini Nano model in a Docker container in headless mode and interact with the model via Selenium (execute_script)? If so, how do I do it?
aasdf1xa
-4 votes
0 answers
25 views

Want to retrain my LLM based on user questions and answers on OpenAI

We have created a solution where users can upload their PDFs and ask questions. We have used NodeJS, Langchain, and OpenAI. Currently, the app flow is: we save all the content of the PDF in our vector ...
Muhammad Mudassir
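The question's stack is NodeJS, but as a hedged Python sketch of the fine-tuning route on OpenAI: collect past question/answer pairs as chat-format JSONL, upload the file, and start a job. The file name and base model below are placeholders:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# training.jsonl: one {"messages": [...]} chat example per line,
# built from previously answered user questions.
upload = client.files.create(file=open("training.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=upload.id,
    model="gpt-3.5-turbo",  # placeholder base model
)
print(job.id, job.status)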
0 votes
0 answers
20 views

Apply a different learning rate to newly introduced tokens in the transformers library

Say I want to introduce a few new tokens into the vocabulary of an existing model, and I want these tokens to have a different learning rate compared to the rest of the model's parameters during ...
Bipolo
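Optimizer parameter groups cannot split individual rows of a single embedding matrix, so one hedged trick is a gradient hook that rescales only the new rows; under plain SGD this behaves like a per-token learning-rate multiplier (less exactly under Adam). The base model and token names below are illustrative:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative base model
model = AutoModelForCausalLM.from_pretrained("gpt2")

new_tokens = ["<special_a>", "<special_b>"]         # hypothetical new tokens
tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))

new_ids = tokenizer.convert_tokens_to_ids(new_tokens)
scale = 10.0  # effective learning-rate multiplier for the new rows

def scale_new_rows(grad):
    grad = grad.clone()
    grad[new_ids] *= scale  # only the new tokens' embedding rows get larger steps
    return grad

model.get_input_embeddings().weight.register_hook(scale_new_rows)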
