Embeddings pipeline

Module: packages/knowledge/src/embeddings/
Queue: embeddings (BullMQ)
Default model: text-embedding-3-small (1536d, OpenAI)
Privacy guard: OPENAI_ZDR_CONFIRMED=true required in production (runbook)

Lifecycle

[ MCP store_knowledge ]──INSERT─▶[ knowledge_entries (embedding=NULL) ]
            │
            └─enqueueEmbedding({ knowledgeId, orgId })──▶[ BullMQ embeddings queue ]
                                                                     │
                                                                     ▼
                                       [ worker: processEmbeddingJob ]
                                          1. SELECT row by id
                                          2. buildEmbeddingInput(title, tags, body)
                                          3. provider.embed([input]) — OpenAI / mock / Ollama
                                          4. UPDATE embedding + embedding_model
                                          5. Redis counters (best-effort)
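The five worker steps can be sketched as a single function with its dependencies injected. This is a minimal illustration, not the module's actual code: the `KnowledgeRow` and `WorkerDeps` shapes, the skip-on-missing-row behaviour, and the dimension check are all assumptions, and `recordMetrics` is assumed to swallow its own errors.

```typescript
// Hypothetical shapes for illustration; the real types live in the module.
interface KnowledgeRow {
  id: string;
  title: string;
  tags: string[];
  body: string;
}

interface EmbeddingProvider {
  name: string;
  dims: number;
  embed(texts: string[]): Promise<number[][]>;
}

interface WorkerDeps {
  findRow(id: string): Promise<KnowledgeRow | undefined>; // step 1
  buildInput(row: KnowledgeRow): string; // step 2
  provider: EmbeddingProvider; // step 3
  saveEmbedding(id: string, vector: number[], model: string): Promise<void>; // step 4
  recordMetrics(inputChars: number): Promise<void>; // step 5, assumed never to throw
}

async function processEmbeddingJob(
  job: { knowledgeId: string; orgId: string },
  deps: WorkerDeps,
): Promise<boolean> {
  const row = await deps.findRow(job.knowledgeId);
  if (!row) return false; // assumption: a row deleted before processing is skipped

  const input = deps.buildInput(row);
  const vectors = await deps.provider.embed([input]);
  const vector = vectors[0];
  // Assumption: reject vectors that would not fit the embedding column.
  if (!vector || vector.length !== deps.provider.dims) {
    throw new Error(`expected a ${deps.provider.dims}-dim vector`);
  }

  // The provider name is what ends up in embedding_model.
  await deps.saveEmbedding(row.id, vector, deps.provider.name);
  await deps.recordMetrics(input.length);
  return true;
}
```

Injecting the store and provider keeps the step order testable without a database, Redis, or a live API key.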

Provider interface

provider.ts defines:

interface EmbeddingProvider {
  name: string; // persisted on the row as embedding_model
  dims: number; // must match knowledge_entries.embedding column type
  embed(texts: string[]): Promise<number[][]>;
}
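To illustrate the contract, here is a hypothetical deterministic provider of the kind a test suite might use. It is a sketch, not the module's shipped mock: the `createMockProvider` name, the `"mock"` label, and the character-bucket scheme are assumptions.

```typescript
interface EmbeddingProvider {
  name: string;
  dims: number;
  embed(texts: string[]): Promise<number[][]>;
}

// Hypothetical factory: maps each text to a unit-length vector of `dims` numbers.
// The same text always yields the same vector, which keeps tests stable.
function createMockProvider(dims: number): EmbeddingProvider {
  if (!Number.isInteger(dims) || dims <= 0) {
    throw new Error("dims must be a positive integer");
  }
  return {
    name: "mock",
    dims,
    async embed(texts: string[]): Promise<number[][]> {
      return texts.map((text) => {
        const vector: number[] = new Array<number>(dims).fill(0);
        for (let i = 0; i < text.length; i++) {
          const slot = i % dims;
          vector[slot] = (vector[slot] ?? 0) + text.charCodeAt(i);
        }
        // Normalise so cosine similarity behaves; empty text stays all zeros.
        const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0)) || 1;
        return vector.map((x) => x / norm);
      });
    },
  };
}
```

One output vector is returned per input text, in the same order, and every vector has exactly `dims` entries.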

Three implementations ship today: OpenAI, mock, and Ollama.

Each row stores the provider's name in embedding_model, so after a model migration we can find stale embeddings by filtering on that column (embedding_model='text-embedding-3-small' vs 'v2').

Queue defaults

Cost / volume metrics

After every successful job the worker increments:

Errors during the metric write are warning-level only — they never fail the job.
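The best-effort behaviour can be sketched as a wrapper that catches any failure from the counter store and downgrades it to a warning. The `CounterClient` interface, the function name, and the warning text are hypothetical; the real worker talks to Redis and uses its own counter keys.

```typescript
// Hypothetical counter client standing in for the Redis connection.
interface CounterClient {
  incrBy(key: string, amount: number): Promise<number>;
}

// Applies every increment; on any error, warns and reports false instead of throwing.
async function recordMetricsBestEffort(
  counters: CounterClient,
  increments: Record<string, number>,
  warn: (message: string) => void,
): Promise<boolean> {
  try {
    for (const [key, amount] of Object.entries(increments)) {
      await counters.incrBy(key, amount);
    }
    return true;
  } catch (err) {
    warn(`embedding metrics write failed: ${String(err)}`);
    return false;
  }
}
```

Because the wrapper never rejects, the caller can await it after the UPDATE without risking a retry of an embedding that was already stored.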

Input shape + truncation

buildEmbeddingInput({ title, body, tags }) returns:

${title}\n\n${tags.join(' ')}\n\n${body}

The result is truncated to MAX_INPUT_CHARS = 24 000 characters, which is well inside the 8 192-token OpenAI cap. A character-based cap is fine for English and Czech; if we ever store CJK-heavy content, we'll swap to a real tokeniser.
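A minimal sketch of the template and the cap, assuming truncation simply keeps the first MAX_INPUT_CHARS characters. The `EmbeddingInputParts` type name is hypothetical, and the real function may trim differently.

```typescript
const MAX_INPUT_CHARS = 24000;

// Hypothetical name for the argument shape described in the text.
interface EmbeddingInputParts {
  title: string;
  body: string;
  tags: string[];
}

function buildEmbeddingInput({ title, body, tags }: EmbeddingInputParts): string {
  const joined = `${title}\n\n${tags.join(" ")}\n\n${body}`;
  // slice counts UTF-16 code units, so this is a character cap, not a token cap.
  return joined.slice(0, MAX_INPUT_CHARS);
}
```

Since the body comes last in the template, an oversized entry loses the tail of its body while the title and tags survive.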

Boot guard

assertOpenAIZdrConfirmed() throws in NODE_ENV=production unless OPENAI_ZDR_CONFIRMED=true. The MCP entry calls it before constructing the OpenAI provider. logOpenAIBootStatus() prints a one-line confirmation/rejection so ops can grep the boot log.
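The guard logic might look like the sketch below. Passing the environment in as an argument is an assumption made to keep the example testable (the real functions take no arguments in the text above), and the error message and log line format are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Throws only when running in production without the explicit ZDR confirmation.
function assertOpenAIZdrConfirmed(env: Env): void {
  const isProduction = env["NODE_ENV"] === "production";
  const confirmed = env["OPENAI_ZDR_CONFIRMED"] === "true";
  if (isProduction && !confirmed) {
    throw new Error("OPENAI_ZDR_CONFIRMED=true is required in production");
  }
}

// Emits exactly one line so the outcome is easy to grep in the boot log.
function logOpenAIBootStatus(env: Env, log: (line: string) => void): void {
  const confirmed = env["OPENAI_ZDR_CONFIRMED"] === "true";
  const mode = env["NODE_ENV"] ?? "unset";
  log(`openai-zdr status=${confirmed ? "confirmed" : "rejected"} node_env=${mode}`);
}
```

Comparing against the literal string "true" means values such as "1" or "yes" do not count as confirmation, which matches the strict wording of the requirement.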

When OPENAI_EUDR_ENABLED=true, the OpenAI client points at https://eu.api.openai.com/v1. That endpoint is available only on OpenAI's Enterprise tier and is not enabled in Phase 0.
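The endpoint switch reduces to a single lookup. In this sketch the `resolveOpenAIBaseUrl` name is hypothetical, and returning `undefined` to mean "use the client's default endpoint" is an assumption about how the result would be consumed.

```typescript
const OPENAI_EU_BASE_URL = "https://eu.api.openai.com/v1";

// Returns the EU endpoint when the flag is set, otherwise undefined (client default).
function resolveOpenAIBaseUrl(env: Record<string, string | undefined>): string | undefined {
  return env["OPENAI_EUDR_ENABLED"] === "true" ? OPENAI_EU_BASE_URL : undefined;
}
```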

What this module does NOT do

  • Search: task 20 layers pgvector cosine + tsvector hybrid on top.
  • Re-embedding on model change: there's a runbook for it (Phase 9). Today, manual job: bump EMBEDDING_MODEL, run a sweep script that re-enqueues every row.
  • DLQ wiring: BullMQ already retains failures for 7 days; an explicit DLQ queue lands with the verification engine (task 32) where retry semantics are richer.
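For the manual re-embedding job mentioned in the list, a sweep could page through the table and re-enqueue each row. This is a sketch under assumptions: `SweepDeps`, `listRows`, the id-based cursor, and the batch size are hypothetical, and only the `enqueueEmbedding({ knowledgeId, orgId })` call shape comes from the lifecycle diagram.

```typescript
interface RowRef {
  id: string;
  orgId: string;
}

interface SweepDeps {
  // Returns up to `limit` rows whose id sorts after `afterId`, ordered by id.
  listRows(afterId: string | null, limit: number): Promise<RowRef[]>;
  enqueueEmbedding(job: { knowledgeId: string; orgId: string }): Promise<void>;
}

// Walks the whole table in batches and returns how many jobs were enqueued.
async function reenqueueAll(deps: SweepDeps, batchSize = 500): Promise<number> {
  let cursor: string | null = null;
  let total = 0;
  for (;;) {
    const batch: RowRef[] = await deps.listRows(cursor, batchSize);
    if (batch.length === 0) return total;
    for (const row of batch) {
      await deps.enqueueEmbedding({ knowledgeId: row.id, orgId: row.orgId });
      cursor = row.id; // advance the cursor past the last row handled
      total++;
    }
  }
}
```

Cursor-based paging keeps memory flat on large tables and lets an interrupted sweep resume from the last id it logged.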