# chat

Start an interactive or one-shot chat with a model registered in the Spice runtime.
## Requirements

- The Spice runtime must be running.
- At least one model is defined in `spicepod.yaml` and is ready.
## Usage

Interactive chat: invoke the command without arguments to open a REPL.

```bash
spice chat [flags]
```

One-shot chat: pass a single message as the argument to send a one-shot chat request and print the response.

```bash
spice chat [flags] [<message>]
```
## Flags

- `--cloud`: Send requests to a Spice Cloud instance instead of the local instance. Default: `false`.
- `--http-endpoint <string>`: Runtime HTTP endpoint. Default: `http://localhost:8090`.
- `--model <string>`: Target model for the chat request. When omitted, the CLI uses the single ready model, or prompts for a choice if several models are ready.
- `--temperature <float32>`: Model temperature used for the chat request. Default: `1`.
- `--user-agent <string>`: Custom `User-Agent` header sent with every request.
- `--responses`: Direct all chats to the `/v1/responses` endpoint, which exposes configured models that support OpenAI's Responses API and enables access to OpenAI-hosted tools. To learn more about Spice's support for OpenAI's Responses API, see the OpenAI model provider documentation or the Azure OpenAI model provider documentation.
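The flags above compose freely. The following sketch shows some typical combinations; the endpoint, model name, and prompt text are illustrative, not prescribed:

```shell
# Chat against a runtime listening on a non-default HTTP endpoint
spice chat --http-endpoint http://localhost:9090 --model openai

# One-shot request with a lower temperature for more deterministic output
spice chat --model openai --temperature 0.2 "hello"

# Route chats through the /v1/responses endpoint for a Responses-capable model
spice chat --responses --model openai
```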
## Examples

When exactly one model is ready, `spice chat` opens a REPL that uses that model automatically:

```bash
> spice chat
Using model: openai
chat> hello
Hello! How can I assist you today?

Time: 0.57s (first token 0.53s). Tokens: 18. Prompt: 8. Completion: 10 (325.04/s).
```
When multiple models are ready, the command prompts for a selection before starting the REPL:

```bash
> spice chat
Use the arrow keys to navigate: ↓ ↑ → ←
? Select model:
  ▸ openai
    llama
Using model: openai
chat> hello
Hello! How can I assist you today?

Time: 0.55s (first token 0.43s). Tokens: 18. Prompt: 8. Completion: 10 (80.09/s).
```
Passing `--model` skips the prompt and directs the request to the specified model. The flag works in both REPL mode and one-shot mode:

```bash
# REPL
spice chat --model openai
chat> hello
Hello! How can I assist you today?

Time: 0.61s (first token 0.58s). Tokens: 18. Prompt: 8. Completion: 10 (285.90/s).
```

Single prompt:

```bash
# One-shot
spice chat --model openai "hello"
Hello! How can I assist you today?

Time: 1.10s (first token 0.80s). Tokens: 18. Prompt: 8. Completion: 10 (33.74/s).
```