LiteLLM
LiteLLM is a proxy server that unifies various LLM APIs (OpenAI, Anthropic, Ollama, etc.) under a single OpenAI-compatible API. It is useful when you want to manage multiple AI models centrally and expose them through one interface.
For Advanced Users
LiteLLM pays off when you combine several AI providers or local models behind one API. For most users who only run local models, llama-swap or Ollama is sufficient.
Installation
Add the following template to your docker-compose.yml and then run ei23 dc.
Configuration file required
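Create the file ei23-docker/volumes/litellm/config.yaml before starting. If the file is missing, Docker creates an empty directory at that path instead, and LiteLLM cannot read its configuration.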
Template
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: litellm
    restart: always
    ports:
      - "4000:4000"
    volumes:
      - ./volumes/litellm/config.yaml:/app/config.yaml
    environment:
      DATABASE_URL: "postgresql://llmproxy:dbpassword9090@litellm-db:5432/litellm"
      STORE_MODEL_IN_DB: "True"
    command: --config /app/config.yaml --detailed_debug
    depends_on:
      - litellm-db
  litellm-db:
    image: postgres:16
    container_name: litellm-db
    restart: always
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: dbpassword9090
    volumes:
      - ./volumes/litellm/postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d litellm -U llmproxy"]
      interval: 1s
      timeout: 5s
      retries: 10
Configuration
Create /home/[user]/ei23-docker/volumes/litellm/config.yaml:
model_list:
  # Local model via llama-swap/Ollama
  - model_name: local-llama
    litellm_params:
      model: openai/llama3
      api_base: http://llama-swap:8080/v1
      api_key: none
  # OpenAI (if API key available; set OPENAI_API_KEY under environment: of the litellm service)
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: os.environ/OPENAI_API_KEY
  # Anthropic Claude (if API key available; set ANTHROPIC_API_KEY the same way)
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
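The os.environ/NAME entries in this file are read from the environment of the LiteLLM container. As a minimal sketch, the following Python snippet lists which referenced variables are unset. The helper names find_env_refs and missing_env_vars are hypothetical and not part of LiteLLM, and the check runs against the environment of the shell where you start it, not the container itself.

```python
import os
import re

# Matches values such as "os.environ/OPENAI_API_KEY" in the config text.
ENV_REF = re.compile(r"os\.environ/([A-Za-z_][A-Za-z0-9_]*)")


def find_env_refs(config_text: str) -> list[str]:
    """Return variable names referenced via os.environ/NAME, in order, without duplicates."""
    seen: list[str] = []
    for name in ENV_REF.findall(config_text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_env_vars(config_text: str, env=None) -> list[str]:
    """Return referenced variables that are absent or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in find_env_refs(config_text) if not env.get(name)]


# Excerpt of the configuration shown above, used here as sample input.
EXAMPLE_CONFIG = """
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-sonnet-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
"""

if __name__ == "__main__":
    print(missing_env_vars(EXAMPLE_CONFIG))
```

To check the real file, read config.yaml into a string and pass that to missing_env_vars instead of EXAMPLE_CONFIG.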
Usage
The API is compatible with the OpenAI API:
# Test request
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-llama",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
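The same test request can be sent from Python. This is a minimal sketch using only the standard library, assuming the proxy is reachable at http://localhost:4000 and that no API key is required, as in the curl call. The function names build_chat_request and chat are hypothetical helpers.

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "local-llama",
                       base_url: str = "http://localhost:4000") -> urllib.request.Request:
    """Build a POST request for the proxy's OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, **kwargs) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs), timeout=60) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Hello!"))
```

Passing model="gpt-4" or model="claude-sonnet" to chat selects another entry from model_list without changing the endpoint.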
Use Cases
- Unified API - One endpoint for all models
- Fallback - Automatically switch to next available model
- Load Balancing - Distribute requests across multiple instances
- Cost Tracking - Monitor API usage and costs
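To illustrate what fallback means, here is a small client-side sketch. It is not LiteLLM's implementation: LiteLLM handles fallback inside the proxy. The sketch only shows the idea of trying models in order until one answers. The names first_successful and demo_send are hypothetical, and demo_send stands in for a real request.

```python
from typing import Callable, Sequence


def first_successful(models: Sequence[str], send: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order; return (model, reply) for the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, send(model)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def demo_send(model: str) -> str:
    """Stand-in for a real request; pretends the local model is offline."""
    if model == "local-llama":
        raise ConnectionError("backend unreachable")
    return f"reply from {model}"


if __name__ == "__main__":
    print(first_successful(["local-llama", "gpt-4", "claude-sonnet"], demo_send))
```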
Notes
- API accessible at http://[IP]:4000
- Configuration in ./volumes/litellm/config.yaml
- Database data in ./volumes/litellm/postgres_data/
- Compatible with OpenAI API clients