I am working on a project that extracts specific data from text messages. Regular expressions in Python don't work well here because the message formats change constantly and have no consistency, so I decided to use a language model to process the texts and return a result whenever a text contains what I am interested in.
While developing the program, I ran the model with the following command:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
After this, I sent a request in my code as follows:
import requests

url = 'http://localhost:11434/api/generate'
data = {
    "model": "llama3",
    "prompt": input_text,
}
response = requests.post(url, json=data)
This worked (although the response took a bit long to generate).
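For context, Ollama's /api/generate endpoint streams its answer as newline-delimited JSON chunks by default (unless "stream": false is set in the request body), so the response body is a sequence of small objects rather than one JSON document. This is a sketch of the helper I use to stitch the chunks back together; the chunk shape ("response" and "done" fields) follows Ollama's documented streaming format:

```python
import json

def join_stream(lines):
    """Concatenate the 'response' fields of Ollama's NDJSON stream chunks."""
    parts = []
    for raw in lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # final chunk also carries timing metadata
    return "".join(parts)

# Example with two chunks shaped like Ollama's streaming output:
sample = [
    '{"response": "Hello, ", "done": false}',
    '{"response": "world!", "done": true}',
]
print(join_stream(sample))  # → Hello, world!
```

In the real script the lines come from iterating over the HTTP response body (e.g. `response.iter_lines()` with requests).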
Now, when I try to configure my docker-compose.yml
file to start the language model container, it looks like this:
services:
  ollama:
    container_name: ollama
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"

volumes:
  ollama:
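One thing I checked while debugging: Ollama's /api/tags endpoint lists which models a running instance actually has, which is useful here because a compose-managed volume may not contain the model that was pulled into the earlier `docker run` container. This is a small sketch assuming Ollama's documented /api/tags response shape; `model_names` and `list_models` are my own helper names (and if the script itself ran as another compose service, the base URL would be `http://ollama:11434` instead of localhost):

```python
import json
import urllib.request

def model_names(tags_json):
    """Extract model names from the JSON body of Ollama's /api/tags response."""
    return [m["name"] for m in tags_json.get("models", [])]

def list_models(base_url="http://localhost:11434"):
    """Query a running Ollama instance for its installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return model_names(json.load(resp))
```

If `list_models()` comes back empty, the model simply isn't in that container's volume yet.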
But with this setup the same request just returns a 404 error, as if the endpoint does not exist. I don't understand what I am doing wrong. Can someone help with this?
Additionally, does anyone know whether Ollama can handle concurrent requests? My script produces texts very quickly, and I am not sure whether I should rate-limit my requests or whether the server can cope with multithreaded and asynchronous clients.
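In case it helps frame the question, this is roughly the client-side pattern I am considering: a thread pool that caps the number of in-flight requests, so I can tune concurrency instead of firing everything at once. `send_one` is a placeholder for the real POST to /api/generate, and the worker count is a guess to be tuned:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # assumption: tune to what the server handles well

def send_one(text):
    # Placeholder: in the real script this would POST `text` to /api/generate.
    return text.upper()

def process_all(texts):
    """Run send_one over texts with at most MAX_IN_FLIGHT concurrent calls."""
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(send_one, texts))

print(process_all(["hi", "there"]))  # → ['HI', 'THERE']
```

Would a pattern like this be safe against a single Ollama container, or does it queue requests internally anyway?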