# Sharded
Spice Runtime instances can be sharded by specific criteria, such as customer, state, or other logical partitions. Each shard operates independently, with a 1:N Application-to-Spice-instances ratio.
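One way the application side of this pattern can work is a small routing layer that deterministically maps a shard key (a customer ID, a state code) to one of the independent Spice instances. The sketch below is a minimal illustration under assumed endpoints and port numbers, not part of Spice itself; a stable hash is used so routing stays consistent across application replicas.

```python
import hashlib

# Hypothetical shard map: each entry is an independent Spice Runtime endpoint.
SPICE_SHARDS = [
    "http://spice-shard-0:8090",
    "http://spice-shard-1:8090",
    "http://spice-shard-2:8090",
]

def shard_for(key: str) -> str:
    """Deterministically map a shard key (e.g. a customer ID) to one instance.

    A stable hash (SHA-256) keeps routing consistent across processes and
    restarts, unlike Python's builtin hash(), which is randomized per process.
    """
    digest = hashlib.sha256(key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(SPICE_SHARDS)
    return SPICE_SHARDS[index]

# The application sends each query to the shard that owns that customer.
endpoint = shard_for("customer-42")
```

Because the mapping depends only on the key and the shard list, every application replica routes a given customer to the same Spice instance without shared state.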
## Benefits
- Helps distribute load across multiple instances, improving performance and scalability.
- Isolates failures to specific shards, enhancing resiliency.
- Allows tailored configurations and optimizations for different shards.
## Considerations
- More complex deployment and management due to multiple instances.
- Requires effective sharding strategy to balance load and avoid hotspots.
- Potentially higher cost due to multiple instances.
## Use This Approach When
- Load needs to be distributed across multiple instances for better performance.
- Failures should be isolated to specific shards to improve resiliency.
- The application can benefit from tailored configurations for different logical partitions.
- The team can handle the operational complexity of managing multiple instances.
## Example Use Case

A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.
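For dedicated per-customer instances, routing can be an explicit registry rather than a hash: heavy tenants get their own runtime, and everyone else falls back to a shared one. The endpoint names below are placeholders for illustration.

```python
# Hypothetical registry of dedicated Spice Runtime endpoints per customer.
TENANT_INSTANCES = {
    "acme": "http://spice-acme:8090",
    "globex": "http://spice-globex:8090",
}

# Smaller tenants can share a pooled instance instead of a dedicated one.
DEFAULT_INSTANCE = "http://spice-shared:8090"

def instance_for(tenant: str) -> str:
    """Return the Spice endpoint for a tenant, falling back to the shared pool."""
    return TENANT_INSTANCES.get(tenant, DEFAULT_INSTANCE)
```

An explicit map trades automatic balancing for control: moving a tenant to a bigger instance, or giving one tenant customer-specific dataset configuration, is a one-line registry change.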
## Sharding vs. partitioning
Sharding splits load across multiple Spice instances, each backing a logical slice of the system (a customer, a region, a workload). Each shard runs an independent runtime with its own datasets, accelerations, and resources.
Within a single Spice instance, acceleration partitioning splits a single dataset into multiple physical units (files, tables, or in-memory tables) so that filtered queries only read the relevant subset. The two are complementary: a shard can also use partitioning internally to keep individual datasets pruneable.
| Concern | Sharded deployment | Acceleration partitioning |
|---|---|---|
| Splits across… | Multiple runtimes/processes | One acceleration on one runtime |
| Routing | Application picks which Spice instance to query | Spice prunes partitions automatically based on filter pushdown |
| Failure isolation | Per-shard | None (single runtime) |
| Use when | Tenants/regions have very different load or data volumes | A single dataset is too large to scan whole on every query |
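The partition-pruning side of the comparison can be sketched in miniature: a single dataset split into physical units keyed by a partition column, where a pushed-down filter selects which units are scanned at all. The data model and column names here are illustrative, not Spice's internal representation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Partition:
    # One physical unit of an accelerated dataset (a file, table, or
    # in-memory table), keyed by the value of a hypothetical partition column.
    key: str
    rows: list

# A single dataset split into per-region partitions inside one runtime.
partitions = [
    Partition("us-east", [{"region": "us-east", "amount": 10}]),
    Partition("us-west", [{"region": "us-west", "amount": 20}]),
    Partition("eu",      [{"region": "eu", "amount": 30}]),
]

def scan(filter_region: Optional[str]) -> list:
    """Prune partitions using the pushed-down filter, then scan the rest."""
    selected = [
        p for p in partitions
        if filter_region is None or p.key == filter_region
    ]
    return [row for p in selected for row in p.rows]

# A filtered query reads one partition instead of the whole dataset.
rows = scan("eu")
```

The contrast with sharding is visible in where the decision happens: here the runtime prunes partitions from the query's filter, whereas in a sharded deployment the application chooses the instance before the query is ever sent.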
