Embeddings

Embeddings providers convert text into vectors for semantic similarity search. When you extract a memory or run a query, the embeddings provider generates vectors that capture the meaning of the text.

| Provider | Package | Description |
| --- | --- | --- |
| OpenAI | @youcraft/recall-embeddings-openai | text-embedding-3-small/large |
| Cohere | @youcraft/recall-embeddings-cohere | embed-english/multilingual-v3.0 |
| Voyage | @youcraft/recall-embeddings-voyage | voyage-3, code, finance, law models |

Embeddings transform text into numerical vectors that capture semantic meaning:

"User loves TypeScript" → [0.023, -0.041, 0.018, ...] (1536 dimensions)

Similar meanings produce similar vectors, enabling semantic search:

// Query: "What programming languages?"
// Finds: "User loves TypeScript" (high similarity)
// Even though "programming languages" ≠ "TypeScript"

When you extract or query memories:

  1. Extraction: Each memory’s content is converted to a vector and stored
  2. Query: Your query text is converted to a vector
  3. Search: The database finds memories with the most similar vectors

This enables semantic search—finding memories by meaning, not just keywords.
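The search step above boils down to comparing vectors, typically with cosine similarity. A minimal sketch (the 3-dimensional vectors are made-up toy values standing in for real 1536-dimensional embeddings):

```typescript
// Cosine similarity: close to 1 = similar meaning, close to 0 = unrelated
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Toy vectors standing in for real embeddings
const query = [0.9, 0.1, 0.0]   // "What programming languages?"
const memory1 = [0.8, 0.2, 0.1] // "User loves TypeScript"
const memory2 = [0.0, 0.1, 0.9] // "User lives in Berlin"

cosineSimilarity(query, memory1) // high: semantically related
cosineSimilarity(query, memory2) // low: unrelated
```

The database ranks stored memory vectors by this score (or an equivalent distance metric) and returns the closest matches.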

| Factor | OpenAI | Cohere | Voyage |
| --- | --- | --- | --- |
| Models | text-embedding-3-small/large | embed-v3.0 (english/multilingual) | voyage-3, code, finance, law |
| Dimensions | 1536 / 3072 | 384 / 1024 | 512 / 1024 |
| Multilingual | Limited | Native support | Dedicated model |
| Batch size | 2048 | 96 | 128 |
| Specialization | General | Multilingual | Code, finance, legal |

Choose based on:

  • Existing API access: Use what you already have
  • Language support: Cohere for multilingual applications
  • Domain: Voyage for code, finance, or legal applications
  • Dimensions: Smaller = faster search, larger = better quality
  • Cost: Compare pricing for your volume
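The dimensions trade-off is easy to quantify on the storage side: at float32 precision each dimension costs 4 bytes per vector. A back-of-the-envelope sketch (raw vector storage only, ignoring database index and metadata overhead):

```typescript
// Approximate raw storage for N float32 vectors (4 bytes per dimension)
function vectorStorageBytes(memories: number, dimensions: number): number {
  return memories * dimensions * 4
}

const oneMillion = 1_000_000
vectorStorageBytes(oneMillion, 1536) / 1e9 // ~6.1 GB (1536-dim model)
vectorStorageBytes(oneMillion, 3072) / 1e9 // ~12.3 GB (3072-dim model)
vectorStorageBytes(oneMillion, 512) / 1e9  // ~2.0 GB (512-dim model)
```

Search latency scales with dimensions for the same reason: every similarity comparison touches every dimension.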

Implement the EmbeddingsProvider interface:

import type { EmbeddingsProvider } from '@youcraft/recall'

export function customEmbeddings(config: YourConfig): EmbeddingsProvider {
  const dimensions = 1536 // Your model's output dimensions

  return {
    dimensions,

    async embed(text: string): Promise<number[]> {
      // Call your embedding API/model and return a single vector
      const response = await yourAPI.embed(text)
      return response.embedding
    },

    async embedBatch(texts: string[]): Promise<number[][]> {
      // Embed multiple texts efficiently; return vectors in the same order as the input
      const response = await yourAPI.embedBatch(texts)
      return response.embeddings
    },
  }
}
  1. Dimensions: Must match your model’s output size. Used by database adapters for schema creation.

  2. Batch processing: Implement embedBatch for efficiency. If your API doesn’t support batching, fall back to embedding each text individually:

     async embedBatch(texts: string[]): Promise<number[][]> {
       return Promise.all(texts.map(text => this.embed(text)))
     }

  3. Rate limiting: Handle API rate limits in your implementation. Consider adding retry logic with exponential backoff.

  4. Consistency: Always use the same model for a given database. Vectors from different models live in different spaces and can’t be meaningfully compared, so mixing them will produce poor search results.
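Putting these notes together, here is a runnable sketch of a complete provider. The deterministic hash-based "model" and the withRetry helper are illustrative stand-ins (they are not part of @youcraft/recall); swap fakeModel for a real embedding API call:

```typescript
// Stand-in for the EmbeddingsProvider interface from '@youcraft/recall'
interface EmbeddingsProvider {
  dimensions: number
  embed(text: string): Promise<number[]>
  embedBatch(texts: string[]): Promise<number[][]>
}

// Retry an async call with exponential backoff (illustrative helper)
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= attempts - 1) throw err
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))
    }
  }
}

// Deterministic fake "model": hashes characters into a fixed-size vector.
// Replace this with a real embedding API call.
function fakeModel(text: string, dimensions: number): number[] {
  const vector = new Array<number>(dimensions).fill(0)
  for (let i = 0; i < text.length; i++) {
    vector[(i * 31 + text.charCodeAt(i)) % dimensions] += 1
  }
  return vector
}

export function toyEmbeddings(dimensions = 8): EmbeddingsProvider {
  const provider: EmbeddingsProvider = {
    dimensions,
    async embed(text: string): Promise<number[]> {
      // Retry with exponential backoff around the (fake) model call
      return withRetry(() => Promise.resolve(fakeModel(text, dimensions)))
    },
    async embedBatch(texts: string[]): Promise<number[][]> {
      // No native batch endpoint here, so fall back to per-text calls
      return Promise.all(texts.map(text => provider.embed(text)))
    },
  }
  return provider
}
```

Because the fake model is deterministic, the same text always yields the same vector, which is exactly the consistency property a real provider must preserve for a given database.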