Skip to content

Matrix Hub

Chunking

agent-matrix/matrix-hub

Chunking#

Prepares text for semantic search & RAG.

Inputs#

name, summary, description
README/excerpts referenced by manifest or repository
Examples / usage snippets (if provided)

Strategy#

Hierarchical splitting:
Headings (#, ##) → paragraphs → sentences
Target chunk size ~ 300–600 tokens (configurable)
Metadata per chunk:
entity_uid, section, position, weight, source_uri, checksum

Weights#

Title/name: higher prior (e.g., 1.3×)
Summary: 1.2×
README/body: 1.0×
Examples/code: 1.1×

Output#

A list of (chunk_id, entity_uid, text, weight, meta) ready for embedding.