Version: Next

nsql

Text-to-SQL REPL — translate natural language queries into SQL using a model loaded by the Spice runtime. The REPL sends the input to the runtime's /v1/nsql endpoint and prints the generated SQL along with the executed results.
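Since the REPL is a thin wrapper over the runtime's /v1/nsql HTTP endpoint, the endpoint can also be called directly. A minimal Python sketch — the default runtime port (8090) and the JSON field names (`query`, `model`, mirroring the CLI flags) are assumptions, not confirmed API documentation:

```python
import json
from urllib import request

def nsql_payload(query: str, model: str) -> bytes:
    # Field names are assumed from the CLI flags (--query, --model).
    return json.dumps({"query": query, "model": model}).encode()

def ask_nsql(query: str, model: str, base_url: str = "http://localhost:8090") -> str:
    # POST the payload to the runtime's /v1/nsql endpoint and return the raw response.
    req = request.Request(
        f"{base_url}/v1/nsql",
        data=nsql_payload(query, model),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode()
```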

Requirements

  • Spice runtime must be running
  • At least one model defined in spicepod.yaml and ready
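A model entry in spicepod.yaml might look like the following sketch — the model name, provider path, and parameter key are illustrative; consult the Spice model documentation for the exact schema:

```yaml
models:
  - name: openai
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
```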

Usage

spice nsql [flags]
spice nsql [flags] [command]

Flags

  • --model, -m <model> Target model for Text-to-SQL generation. When omitted, the CLI uses the single ready model or prompts for a choice if several models are ready.
  • -h, --help Print usage information.

Subcommands

  • analyze Analyze Text-to-SQL performance by comparing the generated SQL against an expected SQL query.

Examples

Start an interactive Text-to-SQL session:

$ spice nsql
Welcome to the Spice.ai NSQL REPL!

Using model:
openai

Enter a query in natural language.
nsql> show the top 5 longest taxi trips

Pass --model to select a specific model when more than one is ready:

spice nsql --model openai

Type exit, quit, .exit, or .quit — or press Ctrl+D — to leave the REPL. Inputs are saved to nsql_history.txt for recall with the up-arrow key.

analyze

The spice nsql analyze subcommand evaluates Text-to-SQL quality by comparing a generated SQL query against an expected SQL query and reporting accuracy and performance metrics.

Usage

spice nsql analyze --query <natural-language-query> --expected <expected-sql> --model <model>

Flags

  • --query <query> Natural language query to analyze. Required.
  • --expected <sql> Expected SQL query to compare the generated SQL against. Required.
  • --model, -m <model> Model to use for Text-to-SQL. Required.

Metrics

Functional metrics (generated vs. expected SQL):

  • exact_match — 1.0 if the generated SQL exactly matches the expected SQL, 0.0 otherwise.
  • correct_tables — Intersection-over-Union (IoU) of tables referenced.
  • correct_projections — IoU of projected columns/expressions.
  • correct_schema — IoU of output schema fields.
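Intersection-over-Union divides the size of the intersection of the two sets by the size of their union. An illustrative sketch of the metric (not the analyzer's actual implementation; the empty-set convention is an assumption):

```python
def iou(generated: set[str], expected: set[str]) -> float:
    """Intersection-over-Union of two sets of identifiers (tables, columns, ...)."""
    if not generated and not expected:
        return 1.0  # both empty: treated here as a perfect match (assumption)
    return len(generated & expected) / len(generated | expected)

# Example: correct_tables when the generated SQL references one extra table.
# {taxi_trips, zones} vs. {taxi_trips} -> intersection 1, union 2 -> 0.5
score = iou({"taxi_trips", "zones"}, {"taxi_trips"})
```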

Performance metrics (read from runtime.task_history via the request's W3C trace ID):

  • input_tokens — Total prompt tokens used by LLM calls.
  • output_tokens — Total completion tokens generated by the LLM.
  • latency_ms — End-to-end latency of the nsql request.
  • sql_duration_ms — Total time spent executing SQL queries.
  • llm_duration_ms — Total time spent in LLM inference.
  • sql_query_count — Number of SQL queries executed.
  • llm_count — Number of LLM completion calls made.

Example

spice nsql analyze \
  --model openai \
  --query "how many taxi trips are there?" \
  --expected "SELECT COUNT(*) FROM taxi_trips"