Embedding

The embedding stage encodes chunks into vectors used for semantic search and hybrid ranking.

Embedder

  • Default: a lightweight sentence-transformer model, selected via the EMBED_MODEL config setting.
  • Batch size and concurrency are tuned to avoid memory spikes.
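
As a rough illustration of how bounded batches keep memory use flat, here is a minimal sketch. The `encode` callable, the `batch_size` default of 32, and the fallback model name are assumptions made for the example; the only name taken from this page is `EMBED_MODEL`.

```python
import os
from typing import Callable, Iterator, List, Sequence

# EMBED_MODEL comes from this page; the fallback value is a placeholder.
EMBED_MODEL = os.environ.get("EMBED_MODEL", "example-sentence-transformer")


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_chunks(
    texts: Sequence[str],
    encode: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 32,  # hypothetical default, not a documented value
) -> List[List[float]]:
    """Embed texts one batch at a time so only one batch is in flight."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(encode(batch))
    return vectors
```

Passing the model in as `encode` keeps the sketch independent of any particular embedding library; a real deployment would wrap whichever model `EMBED_MODEL` names.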

Failure & Retries

  • If the model fails on a chunk, the chunk is marked pending and the scheduler retries it later.
  • Embedding failures never block ingestion of the catalog.
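
The behaviour described above could look roughly like the following sketch. The `Chunk` class, the status strings, and the function names are hypothetical; the page only states that failed chunks are marked pending and retried by a scheduler.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Chunk:
    chunk_id: str
    text: str
    status: str = "new"  # hypothetical states: "new", "pending", "embedded"
    vector: Optional[List[float]] = None


def try_embed(chunk: Chunk, encode: Callable[[str], List[float]]) -> bool:
    """Embed one chunk; on any model error mark it pending instead of raising."""
    try:
        chunk.vector = encode(chunk.text)
        chunk.status = "embedded"
        return True
    except Exception:
        chunk.vector = None
        chunk.status = "pending"
        return False


def ingest(chunks: List[Chunk], encode: Callable[[str], List[float]]) -> List[Chunk]:
    """Add every chunk to the catalog, whether or not embedding succeeded."""
    catalog: List[Chunk] = []
    for chunk in chunks:
        try_embed(chunk, encode)
        catalog.append(chunk)
    return catalog


def retry_pending(catalog: List[Chunk], encode: Callable[[str], List[float]]) -> int:
    """One scheduler pass: retry pending chunks and return how many recovered."""
    recovered = 0
    for chunk in catalog:
        if chunk.status == "pending" and try_embed(chunk, encode):
            recovered += 1
    return recovered
```

The key design point is that `ingest` swallows embedding errors, so a model outage delays vectors but not catalog availability.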

Stored Fields

  • entity_uid
  • chunk_id
  • vector (a list of floats, or a DB-specific binary encoding)
  • dim, model_id, created_at
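
For orientation, the fields above could be modelled as a simple record. This is a sketch only: the class name `EmbeddingRecord`, the helper `make_record`, the Python types, and the choice of a UTC timestamp are assumptions, not the actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class EmbeddingRecord:
    entity_uid: str
    chunk_id: str
    vector: List[float]   # a DB layer might store this as binary instead
    dim: int
    model_id: str
    created_at: datetime


def make_record(
    entity_uid: str, chunk_id: str, vector: List[float], model_id: str
) -> EmbeddingRecord:
    """Build a record, deriving dim from the vector and stamping it in UTC."""
    return EmbeddingRecord(
        entity_uid=entity_uid,
        chunk_id=chunk_id,
        vector=list(vector),
        dim=len(vector),
        model_id=model_id,
        created_at=datetime.now(timezone.utc),
    )
```

Deriving `dim` from the vector rather than accepting it as an argument avoids the two ever disagreeing.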

Replacement Policy

  • Re-embedding occurs only when:
      • Chunk text or weight changed, or
      • Embedder model_id changed, or
      • Admin forced re-embed.
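
The three conditions above can be expressed as a single predicate. The sketch below assumes a content fingerprint is kept alongside each embedding to detect text or weight changes; that fingerprint, along with `ChunkState`, `fingerprint`, and `needs_reembed`, is hypothetical and does not appear in the stored fields listed on this page.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChunkState:
    text: str
    weight: float


def fingerprint(chunk: ChunkState) -> str:
    """Hash the text and weight together so a change to either is detected."""
    payload = f"{chunk.weight!r}\x00{chunk.text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_reembed(
    chunk: ChunkState,
    stored_fingerprint: str,
    stored_model_id: str,
    current_model_id: str,
    force: bool = False,
) -> bool:
    """Return True only for the three triggers: force, model change, content change."""
    if force:
        return True
    if stored_model_id != current_model_id:
        return True
    return fingerprint(chunk) != stored_fingerprint
```

When none of the triggers fire the function returns `False`, so unchanged chunks keep their existing vectors.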