# Model Providers
Spice supports various model providers for traditional machine learning (ML) models and large language models (LLMs).
| Name | Description | Status | ML Format(s) | LLM Format(s)* |
|---|---|---|---|---|
| `openai` | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
| `file` | Local filesystem | Beta | ONNX | GGUF, GGML, SafeTensor |
| `huggingface` | Models hosted on HuggingFace | Beta | ONNX | GGUF, GGML, SafeTensor |
| `spice.ai` | Models hosted on the Spice.ai Cloud Platform | Alpha | ONNX | OpenAI-compatible HTTP endpoint |
| `azure` | Azure OpenAI | Alpha | - | OpenAI-compatible HTTP endpoint |
| `anthropic` | Models hosted on Anthropic | Alpha | - | OpenAI-compatible HTTP endpoint |
| `xai` | Models hosted on xAI | Alpha | - | OpenAI-compatible HTTP endpoint |
\* LLM Format(s) may require additional files (e.g. `tokenizer_config.json`).
The model type is inferred from the model source and files. For more detail, refer to the model reference specification.
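
For example, with the `file` provider the model type follows from the model files: an `.onnx` file is loaded as a traditional ML model, while GGUF weights are served as an LLM. A minimal sketch (the paths and names here are hypothetical):

```yaml
models:
  # Inferred as a traditional ML model from the ONNX file
  - from: file://absolute/path/to/forecast.onnx
    name: forecaster

  # Inferred as an LLM from the GGUF weights file
  # (LLM formats may require additional files, e.g. tokenizer_config.json)
  - from: file://absolute/path/to/chat-model.gguf
    name: local-chat
```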
## Features
Spice supports a variety of features for large language models (LLMs):
- Custom Tools: Provide models with tools to interact with the Spice runtime. See Tools.
- System Prompts: Customize system prompts and override defaults for `v1/chat/completions`. See Parameter Overrides.
- Memory: Provide LLMs with memory persistence tools to store and retrieve information across conversations. See Memory.
- Vector Search: Perform advanced vector-based searches using embeddings. See Vector Search.
- Evals: Evaluate, track, compare, and improve language model performance for specific tasks. See Evals.
- Local Models: Load and serve models locally from various sources, including local filesystems and Hugging Face. See Local Models.
For more details, refer to the Large Language Models documentation.
## Model Examples
The following examples demonstrate how to configure and use various models or model features with Spice. Each example provides a specific use case to help you understand the configuration options available.
### Example: Configuring an OpenAI Model
To use a language model hosted on OpenAI (or an OpenAI-compatible endpoint), specify the `openai` path and model ID in `from`. For more details, see OpenAI Model Provider.

Example `spicepod.yaml`:
```yaml
models:
  - from: openai:gpt-4o-mini
    name: openai
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }

  - from: openai:llama3-groq-70b-8192-tool-use-preview
    name: groq-llama
    params:
      endpoint: https://api.groq.com/openai/v1
      openai_api_key: ${ secrets:SPICE_GROQ_API_KEY }
```
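
Once the runtime starts, both models are served through the same OpenAI-compatible API and are selected by their `name`. For example, to send a chat request to `groq-llama` (assuming the default HTTP port of 8090, as in the full example at the end of this page):

```bash
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq-llama",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```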
### Example: Using an OpenAI Model with Tools
To specify tools for an OpenAI model, include them in the `params.tools` field. For more details, see the Tools documentation.
```yaml
models:
  - name: sql-model
    from: openai:gpt-4o
    params:
      tools: list_datasets, sql, table_schema
```
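
With these tools attached, the model can discover dataset schemas and run SQL against the runtime while answering a question. As a sketch (the `taxi_trips` dataset here is hypothetical):

```bash
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sql-model",
    "messages": [{"role": "user", "content": "How many rows are in the taxi_trips dataset?"}]
  }'
```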
### Example: Adding Memory to a Model
To enable memory tools for a model, define a `store` memory dataset and specify `memory` in the model's `tools` parameter. For more details, see the Memory documentation.
```yaml
datasets:
  - from: memory:store
    name: llm_memory
    mode: read_write

models:
  - name: memory-enabled-model
    from: openai:gpt-4o
    params:
      tools: memory, sql
```
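
With the `memory` tool enabled, the model can persist facts from one conversation and recall them in a later one. A sketch of that flow:

```bash
# First conversation: give the model a fact to remember
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "memory-enabled-model",
    "messages": [{"role": "user", "content": "Remember that our fiscal year starts in February."}]
  }'

# A later conversation: the model can retrieve the stored fact
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "memory-enabled-model",
    "messages": [{"role": "user", "content": "When does our fiscal year start?"}]
  }'
```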
### Example: Setting Default Parameter Overrides
To set default overrides for parameters, use the `openai_` prefix followed by the parameter name. For more details, see the Parameter Overrides documentation.
```yaml
models:
  - name: pirate-haikus
    from: openai:gpt-4o
    params:
      openai_temperature: 0.1
      openai_response_format: { 'type': 'json_object' }
```
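
These `openai_`-prefixed values act as defaults for requests to the model. A parameter passed explicitly in a request body is expected to take precedence (confirm the exact precedence rules in the Parameter Overrides documentation); for example, this request would run with `temperature: 0.8` rather than the configured `0.1`:

```bash
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pirate-haikus",
    "temperature": 0.8,
    "messages": [{"role": "user", "content": "Return a haiku about the sea as a JSON object."}]
  }'
```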
### Example: Configuring a System Prompt
To configure an additional system prompt, use the `system_prompt` parameter. For more details, see the Parameter Overrides documentation.
```yaml
models:
  - name: pirate-haikus
    from: openai:gpt-4o
    params:
      system_prompt: |
        Write everything in Haiku like a pirate
```
### Example: Serving a Local Model
To serve a model from the local filesystem, set the `from` path to `file` and provide the local path. For more details, see Filesystem Model Provider.
```yaml
models:
  - from: file://absolute/path/to/my/model.onnx
    name: local_fs_model
```
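
After starting the runtime (for example with `spice run`), one way to confirm the model loaded is to list the models the runtime is serving. This sketch assumes the default HTTP port of 8090 used elsewhere on this page and an OpenAI-compatible `v1/models` listing endpoint:

```bash
# List the models registered with the Spice runtime
curl http://localhost:8090/v1/models
```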
### Example: Analyzing GitHub Issues with a Chat Model
This example demonstrates how to pull GitHub issue data from the last 14 days, accelerate it, create a chat model with memory and tools that can access the accelerated data, and then ask the model about the general themes of new issues.
#### Step 1: Pull GitHub Issue Data
First, configure a dataset to pull GitHub issue data from the last 14 days.
```yaml
datasets:
  - from: github:github.com/<owner>/<repo>/issues
    name: github_issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: append
      refresh_check_interval: 24h
      refresh_data_window: 14d
```
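
Once the runtime is running and the first refresh has completed, you can sanity-check the accelerated data from the Spice SQL REPL (started with `spice sql`). The column names below are illustrative; see the GitHub Data Connector reference for the exact schema:

```sql
-- Inspect the most recent issues in the accelerated dataset
SELECT title, created_at
FROM github_issues
ORDER BY created_at DESC
LIMIT 5;
```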
#### Step 2: Create a Chat Model with Memory and Tools
Next, create a chat model that includes memory and tools to access the accelerated GitHub issue data.
```yaml
datasets:
  - from: memory:store
    name: llm_memory
    mode: read_write

models:
  - name: github-issues-analyzer
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: memory, sql
```
#### Step 3: Query the Chat Model
At this point, the complete `spicepod.yaml` should look like:
```yaml
datasets:
  - from: github:github.com/<owner>/<repo>/issues
    name: github_issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: append
      refresh_check_interval: 24h
      refresh_data_window: 14d

  - from: memory:store
    name: llm_memory
    mode: read_write

models:
  - name: github-issues-analyzer
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: memory, sql
```
Finally, use Spice to ask the chat model about the general themes of new issues in the last 14 days. The following `curl` command demonstrates how to make this request using the OpenAI-compatible API.
```bash
curl -X POST http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-issues-analyzer",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What are the general themes of new issues in the last 14 days?"}
    ]
  }'
```
Refer to the Create Chat Completion API documentation for more details on making chat completion requests.
- 📄️ OpenAI: Instructions for using language models hosted on OpenAI or compatible services with Spice.
- 📄️ Azure OpenAI: Instructions for using Azure OpenAI models with Spice.
- 📄️ Anthropic: Instructions for using language models hosted on Anthropic with Spice.
- 📄️ HuggingFace: Instructions for using machine learning models hosted on HuggingFace with Spice.
- 📄️ Filesystem: Instructions for using models hosted on a filesystem with Spice.
- 📄️ Spice Cloud Platform: Instructions for using models hosted on the Spice Cloud Platform with Spice.
- 📄️ xAI: Instructions for using xAI models with Spice.