M0 organizes agent knowledge into two distinct layers: a high-level "Experience" summary outlining strategy and cautions, and a detailed "Skill" layer with structured operational steps. This allows an agent to load the compact strategy first and only retrieve operational details when necessary, keeping the active prompt lean and efficient.
A cost-effective AI strategy involves using a powerful, expensive model once to solve a complex task, then using a system like M0 to distill that solution into reusable "experience" and "skill" records. Cheaper models can then leverage this pre-packaged knowledge to execute the same task with higher success rates and significantly lower token costs.
M0's retrieval system runs four parallel signals: vector and full-text search across both the title and description of knowledge records. This hybrid approach captures semantic similarity for paraphrased queries (vector search) and exact matches for specific terms like API names (full-text), resulting in highly relevant, compact results.
M0 employs a two-phase process for agent memory. It first extracts atomic facts solely from human-computer dialogue, ignoring verbose tool outputs. A separate LLM call then compares these new facts to existing memories to decide whether to add, update, or ignore them, preventing redundant or contradictory storage and minimizing token usage.
