# HuggingFace Text Embedding Models
To use an embedding model from HuggingFace with Spice, specify the `huggingface` path in the `from` field of your configuration. The model and its related files are automatically downloaded, loaded, and served locally by Spice.
The following parameters are specific to HuggingFace models:
| Parameter | Description | Default |
| --- | --- | --- |
| `hf_token` | The HuggingFace access token. | - |
| `pooling` | The pooling method for embedding models. Supported values: `cls`, `mean`, `splade`, `last_token`. | - |
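For example, a pooling strategy can be specified when a model requires one. A minimal sketch, assuming `pooling` is set under `params` like other model parameters; the repository name here is hypothetical:

```yaml
embeddings:
  - from: huggingface:huggingface.co/example-org/example-splade-model # hypothetical repository
    name: sparse_embeddings
    params:
      pooling: splade # one of: cls, mean, splade, last_token
```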
Here is an example configuration in `spicepod.yaml`:

```yaml
embeddings:
  - from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
    name: all_minilm_l6_v2
```
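Once defined, the model can be referenced by its `name` elsewhere in the Spicepod. A minimal sketch of using it for column embeddings on a dataset; the dataset source, dataset name, and column name are hypothetical:

```yaml
datasets:
  - from: file:documents.parquet # hypothetical dataset source
    name: documents
    columns:
      - name: body
        embeddings:
          - from: all_minilm_l6_v2 # references the embedding model defined above
```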
Supported models include:
- All models tagged as `text-embeddings-inference` on HuggingFace.
- Any HuggingFace repository that contains the files required to load it as a local embedding model.
With the same semantics as language models, Spice can run private HuggingFace embedding models:

```yaml
embeddings:
  - from: huggingface:huggingface.co/secret-company/awesome-embedding-model
    name: top_secret
    params:
      hf_token: ${ secrets:HF_TOKEN }
```
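The `HF_TOKEN` value is resolved at runtime through Spice's secrets mechanism; how the secret is supplied (for example, from an environment variable or another configured secret store) depends on the secret stores configured for the Spicepod.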