Vector Indexing#
Matrix Hub supports a pluggable vector layer:
- Day-1:
pgvector(PostgreSQL extension) - Later: Milvus/FAISS/Weaviate (adapter interface is already isolated)
pgvector (recommended to start)#
- Table:
embedding_chunk entity_uid,chunk_id,vector,weight,model_id, timestamps- Index type: IVF/Flat/HNSW (choose based on your pgvector version and dataset size)
- Tunables:
- IVF lists / probes for recall vs. latency
- HNSW
m,ef_searchif available - Filter strategy: apply filters on
entityfirst (type, caps), then join candidates to ANN results.
Example (conceptual)#
-- Create extension (once)
CREATE EXTENSION IF NOT EXISTS vector;
-- Example index (IVF, dimensionality N)
-- CREATE INDEX ON embedding_chunk USING ivfflat (vector vector_l2_ops) WITH (lists=100);
Alternatives (future)#
Milvus: stand-alone ANN with collections per model_id.
API contracts for vector search remain the same: search(query_vec, filters, k) → hits.