In high-stakes environments, a wrong answer is worse than no answer. Squad is built around this principle: every response passes through multiple layers of classification, disambiguation, and confidence evaluation before it reaches the user. The system is designed to ask rather than guess, and to decline rather than hallucinate.
When a query arrives, AIM classifies it across several dimensions before deciding how to respond. This classification determines the entire downstream route.
| Classification | What It Detects | Route |
|---|---|---|
| Entity ambiguity | Multiple entities match the query (“Smith” could be several people) | Disambiguation |
| Relation ambiguity | An entity has multiple facets (“tell me about Valve X”: its spec, its history, its failures?) | Clarification |
| Quantity ambiguity | Vague quantifiers (“show me some results” vs “show me all results”) | Clarification |
| Underspecified | Query lacks sufficient constraints to produce a reliable answer | Clarification |
| Recommendation | Open-ended request that requires understanding preferences | Preference elicitation |
| Procedural | Task that maps to a stored workflow or requires multi-step execution | Planner |
| Clear | Unambiguous factual query with a confident match | Direct retrieval |
This classification is not binary: the system assesses confidence across all categories simultaneously and routes to the highest-priority concern first.
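As a rough illustration of priority-based routing, the sketch below scores every category and returns the route for the first one, in priority order, that clears a threshold. The priority order, the `0.5` threshold, and all names (`Route`, `PRIORITY`, `route`) are assumptions made for this example; the source does not specify how AIM ranks the categories.

```python
from enum import Enum


class Route(Enum):
    DISAMBIGUATION = "disambiguation"
    CLARIFICATION = "clarification"
    PREFERENCE_ELICITATION = "preference elicitation"
    PLANNER = "planner"
    DIRECT_RETRIEVAL = "direct retrieval"


# Categories from the table above, in an ASSUMED priority order (highest first).
PRIORITY = [
    ("entity_ambiguity", Route.DISAMBIGUATION),
    ("relation_ambiguity", Route.CLARIFICATION),
    ("quantity_ambiguity", Route.CLARIFICATION),
    ("underspecified", Route.CLARIFICATION),
    ("recommendation", Route.PREFERENCE_ELICITATION),
    ("procedural", Route.PLANNER),
]


def route(scores: dict[str, float], threshold: float = 0.5) -> Route:
    """Return the route of the highest-priority category whose score clears the threshold."""
    for category, target in PRIORITY:
        if scores.get(category, 0.0) >= threshold:
            return target
    # No concern was raised: treat the query as "Clear".
    return Route.DIRECT_RETRIEVAL
```

With these assumptions, a query scored as both entity-ambiguous and procedural goes to disambiguation first, because that concern is listed earlier in the priority order.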
When a query is ambiguous, Squad doesn’t pick the most likely interpretation and hope for the best. Instead, it asks the clarifying question whose answer is expected to reduce uncertainty the most, a selection criterion known as Expected Information Gain (EIG).
The process works in three steps:

1. Candidate identification: Squad identifies all plausible interpretations of the query from the knowledge graph. If the user asks about “transformer failures”, the system finds all entities matching that description: specific components, failure categories, document references.
2. Question selection: The system evaluates candidate clarifying questions and selects the one whose answer would partition the candidate set most evenly. A question that splits 10 candidates into two groups of 5 is more informative than one that splits them 9-to-1.
3. Iterative narrowing: The user’s answer eliminates candidates, and the process repeats if necessary. In practice, most ambiguities resolve within one or two rounds.
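The question-selection step can be sketched as follows. This is a minimal example, not Squad's implementation: it assumes every candidate is equally likely and that each candidate determines the user's answer, in which case the expected information gain of a question equals the entropy of the answer groups it produces. The names `expected_information_gain`, `best_question`, and the two sample questions are hypothetical.

```python
import math
from collections import Counter


def expected_information_gain(candidates, answer_of):
    """EIG in bits of one question, assuming a uniform prior over candidates.

    answer_of(candidate) is the answer the user would give if that
    candidate were the intended interpretation.
    """
    n = len(candidates)
    groups = Counter(answer_of(c) for c in candidates)
    # Uniform prior + deterministic answers: EIG is the entropy of the split.
    return -sum((k / n) * math.log2(k / n) for k in groups.values())


def best_question(candidates, questions):
    """Pick the question (by name) with the highest expected information gain."""
    return max(
        questions,
        key=lambda name: expected_information_gain(candidates, questions[name]),
    )


# Ten hypothetical candidates and two hypothetical yes/no questions.
candidates = list(range(10))
questions = {
    "even_split": lambda c: c < 5,    # splits 5 / 5
    "nine_to_one": lambda c: c == 0,  # splits 1 / 9
}
```

Under these assumptions the 5/5 question yields 1 bit, while the 9-to-1 question yields roughly 0.47 bits, so `best_question(candidates, questions)` selects `"even_split"`.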
Disambiguation can span multiple exchanges. At each turn, Squad:
Squad tracks a confidence metric throughout the reasoning process. A direct answer is delivered only when the system’s confidence that it has correctly understood and answered the query exceeds a threshold.
| Confidence State | System Action |
|---|---|
| High confidence | Answer directly, no clarification needed |
| Medium confidence | Ask a clarifying question to increase certainty |
| Low confidence | Acknowledge uncertainty, provide what is known, suggest next steps |
This means Squad never silently delivers a low-confidence answer. If it isn’t sure, the user knows.
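The confidence table can be read as a simple mapping from a score to an action. The sketch below assumes confidence is a number in [0, 1]; the cut-offs `0.85` and `0.5` and the function name `action_for` are placeholders for illustration, since the source does not state the actual thresholds.

```python
def action_for(confidence: float, high: float = 0.85, medium: float = 0.5) -> str:
    """Map a confidence score to one of the three system actions in the table."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= high:
        return "answer"  # high confidence: answer directly
    if confidence >= medium:
        return "clarify"  # medium confidence: ask a clarifying question
    # Low confidence: acknowledge uncertainty, share what is known, suggest next steps.
    return "acknowledge_uncertainty"
```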
When AIM generates a novel query to answer a question, that query enters a pending state. It can be used within the current session, but it should not be reused for future requests until a human approves it.
Once approved:
This is how Squad progressively shifts from expensive, deliberate reasoning (System 2) to fast, automatic execution (System 1). But the transition only happens through human approval: the system never promotes its own outputs without oversight.
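One way to picture the pending state is a template that is usable only in the session that produced it until someone approves it. The sketch below is illustrative only; `QueryTemplate`, `Status`, `origin_session`, and `usable_in` are hypothetical names, not part of Squad's API.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"


@dataclass
class QueryTemplate:
    text: str
    origin_session: str               # session in which the query was generated
    status: Status = Status.PENDING   # every novel query starts as pending

    def usable_in(self, session_id: str) -> bool:
        """Approved templates are reusable anywhere; pending ones only in their own session."""
        if self.status is Status.APPROVED:
            return True
        return session_id == self.origin_session

    def approve(self) -> None:
        """Called only as the result of an explicit human decision."""
        self.status = Status.APPROVED
```

Note that nothing in the class changes `status` on its own: promotion happens only when `approve()` is invoked on behalf of a human reviewer.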
Squad’s learning is powerful precisely because it’s governed. At every level, humans control what the system learns and how it applies that knowledge:
What gets approved
Every query template, workflow, and tool definition requires explicit human approval before it enters the system’s repertoire. Nothing is auto-promoted.
Who can approve
Role-based access control determines who can approve queries, manage workflows, and administer the system.
What can be revoked
Approved workflows and query templates can be revoked at any time by an administrator, immediately removing them from the system’s active repertoire.
What’s visible
The full audit trail, approval history, and reuse statistics are available for inspection.
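The four controls above can be sketched together in a few lines. This is an assumption-laden illustration: the role names (`curator`, `admin`), the `Repertoire` class, and the shape of the audit entries are invented for the example and are not taken from Squad.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role sets; the real role model is not described in the source.
APPROVER_ROLES = {"curator", "admin"}
REVOKER_ROLES = {"admin"}


@dataclass
class Repertoire:
    active: set[str] = field(default_factory=set)
    audit_log: list[tuple[str, str, str, str]] = field(default_factory=list)

    def _record(self, action: str, template_id: str, user: str) -> None:
        # Every approval and revocation leaves a timestamped audit entry.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, action, template_id, user))

    def approve(self, template_id: str, user: str, role: str) -> None:
        if role not in APPROVER_ROLES:
            raise PermissionError(f"role {role!r} may not approve")
        self.active.add(template_id)
        self._record("approve", template_id, user)

    def revoke(self, template_id: str, user: str, role: str) -> None:
        if role not in REVOKER_ROLES:
            raise PermissionError(f"role {role!r} may not revoke")
        # Revocation removes the template from the active set immediately.
        self.active.discard(template_id)
        self._record("revoke", template_id, user)
```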
For queries flagged as high-risk, Squad applies stricter controls. A high-risk query without a strong match to a previously approved template is declined automatically: it never reaches the execution stage.
| Risk Level | Similarity Match | Action |
|---|---|---|
| Low / Medium | Any | Proceed normally |
| High | Strong match to approved template | Execute using proven approach |
| High | No strong match | Decline with explanation |
This ensures that in safety-critical contexts, the system only acts when it has a proven, human-approved approach. Novel high-risk queries surface as review items rather than being attempted speculatively.
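The risk table reduces to a small decision function. In the sketch below, `similarity` stands for the best similarity score between the incoming query and any approved template, and `strong_match=0.9` is a placeholder cut-off; both are assumptions for illustration, as the source does not define how a "strong match" is measured.

```python
def gate(risk: str, similarity: float, strong_match: float = 0.9) -> str:
    """Decide what to do with a query given its risk level and best template similarity."""
    if risk not in {"low", "medium", "high"}:
        raise ValueError(f"unknown risk level: {risk!r}")
    if risk != "high":
        return "proceed"  # low / medium risk: proceed normally
    if similarity >= strong_match:
        return "execute_approved_template"  # high risk, proven approach exists
    # High risk with no strong match: never reaches execution.
    return "decline_with_explanation"
```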
When Squad cannot find relevant information in the knowledge graph, it does not attempt to fill the gap with general knowledge or hallucinated content. Instead, it:
Accuracy is not a single feature but a property that emerges from multiple systems working together:
For related governance and review workflows, see Human-in-the-Loop and Query Curation.