# Sharded
Spice Runtime instances can be sharded by specific criteria, such as customer, state, or other logical partitions. Each shard operates independently, with a 1:N Application-to-Spice-instances ratio.
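One way the application side of this pattern can work is a small routing layer that deterministically maps a shard key (a customer ID, a state code) to one of the independent Spice instances. The sketch below is a minimal illustration under assumed endpoints and port numbers, not part of Spice itself; a stable hash is used so routing stays consistent across application replicas.

```python
import hashlib

# Hypothetical shard map: each entry is an independent Spice Runtime endpoint.
SPICE_SHARDS = [
    "http://spice-shard-0:8090",
    "http://spice-shard-1:8090",
    "http://spice-shard-2:8090",
]

def shard_for(key: str) -> str:
    """Deterministically map a shard key (e.g. a customer ID) to one instance.

    A stable hash (SHA-256) keeps routing consistent across processes and
    restarts, unlike Python's builtin hash(), which is randomized per process.
    """
    digest = hashlib.sha256(key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(SPICE_SHARDS)
    return SPICE_SHARDS[index]

# The application sends each query to the shard that owns that customer.
endpoint = shard_for("customer-42")
```

Because the mapping depends only on the key and the shard list, every application replica routes a given customer to the same Spice instance without shared state.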
## Benefits
- Helps distribute load across multiple instances, improving performance and scalability.
- Isolates failures to specific shards, enhancing resiliency.
- Allows tailored configurations and optimizations for different shards.
## Considerations
- More complex deployment and management due to multiple instances.
- Requires effective sharding strategy to balance load and avoid hotspots.
- Potentially higher cost due to multiple instances.
## Use This Approach When
- Load needs to be distributed across multiple instances for better performance.
- Failures should be isolated to specific shards to improve resiliency.
- The application can benefit from tailored configurations for different logical partitions.
- The team can handle the operational complexity of managing multiple instances.
## Example Use Case

A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.
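For dedicated per-customer instances, routing can be an explicit registry rather than a hash: heavy tenants get their own runtime, and everyone else falls back to a shared one. The endpoint names below are placeholders for illustration.

```python
# Hypothetical registry of dedicated Spice Runtime endpoints per customer.
TENANT_INSTANCES = {
    "acme": "http://spice-acme:8090",
    "globex": "http://spice-globex:8090",
}

# Smaller tenants can share a pooled instance instead of a dedicated one.
DEFAULT_INSTANCE = "http://spice-shared:8090"

def instance_for(tenant: str) -> str:
    """Return the Spice endpoint for a tenant, falling back to the shared pool."""
    return TENANT_INSTANCES.get(tenant, DEFAULT_INSTANCE)
```

An explicit map trades automatic balancing for control: moving a tenant to a bigger instance, or giving one tenant customer-specific dataset configuration, is a one-line registry change.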
## Sharding vs. partitioning
Sharding splits load across multiple Spice instances, each backing a logical slice of the system (a customer, a region, a workload). Each shard runs an independent runtime with its own datasets, accelerations, and resources.
Within a single Spice instance, acceleration partitioning splits a single dataset into multiple physical units (files, tables, or in-memory tables) so that filtered queries only read the relevant subset. The two are complementary: a shard can also use partitioning internally to keep individual datasets pruneable.
| Concern | Sharded deployment | Acceleration partitioning |
|---|---|---|
| Splits across… | Multiple runtimes/processes | One acceleration on one runtime |
| Routing | Application picks which Spice instance to query | Spice prunes partitions automatically based on filter pushdown |
| Failure isolation | Per-shard | None (single runtime) |
| Use when | Tenants/regions have very different load or data volumes | A single dataset is too large to scan whole on every query |
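The partition-pruning side of the comparison can be sketched in miniature: a single dataset split into physical units keyed by a partition column, where a pushed-down filter selects which units are scanned at all. The data model and column names here are illustrative, not Spice's internal representation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Partition:
    # One physical unit of an accelerated dataset (a file, table, or
    # in-memory table), keyed by the value of a hypothetical partition column.
    key: str
    rows: list

# A single dataset split into per-region partitions inside one runtime.
partitions = [
    Partition("us-east", [{"region": "us-east", "amount": 10}]),
    Partition("us-west", [{"region": "us-west", "amount": 20}]),
    Partition("eu",      [{"region": "eu", "amount": 30}]),
]

def scan(filter_region: Optional[str]) -> list:
    """Prune partitions using the pushed-down filter, then scan the rest."""
    selected = [
        p for p in partitions
        if filter_region is None or p.key == filter_region
    ]
    return [row for p in selected for row in p.rows]

# A filtered query reads one partition instead of the whole dataset.
rows = scan("eu")
```

The contrast with sharding is visible in where the decision happens: here the runtime prunes partitions from the query's filter, whereas in a sharded deployment the application chooses the instance before the query is ever sent.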
