Agent OS

WorkMan

A lightweight operating system for AI agent work. Multiple agents collaborate through structured review cycles with honest state, governed prompts, and evidence-based fitness scoring. Local-first today, hosted tomorrow, self-improving by design.

Runtime roles
  • Director — multi-bench coordination and path selection
  • Manager — workspace state aggregation and reporting
  • Operator — runtime commands, agent supervision, stage decisions
  • Seats — pluggable agent slots for work, review, and handoff

The system

Three layers

Orchestration

Governed agent work

Agents don't just run; they work in structured cycles in which every result is reviewed before it advances. Semi-adversarial review catches what the worker missed. Staged handoffs keep every decision traceable. Explicit runtime truth, one canonical state per bench, means nothing drifts silently.
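
The review cycle described above can be sketched as a small state machine. This is an illustration only, not WorkMan's implementation: the `Stage` names, the `Bench` record, the `run_cycle` function, and the `max_remedy_rounds` limit are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Stage(Enum):
    WORK = "work"
    REVIEW = "review"
    REMEDY = "remedy"
    APPROVED = "approved"
    ESCALATED = "escalated"


@dataclass
class Bench:
    """Hypothetical bench record: one canonical stage plus a handoff log."""
    task: str
    stage: Stage = Stage.WORK
    log: List[str] = field(default_factory=list)

    def move(self, stage: Stage, note: str) -> None:
        # Record every transition so each decision stays traceable.
        self.log.append(f"{self.stage.value} -> {stage.value}: {note}")
        self.stage = stage


Worker = Callable[[str, Optional[str]], str]      # (task, feedback) -> draft
Reviewer = Callable[[str], Tuple[bool, str]]      # draft -> (approved, feedback)


def run_cycle(bench: Bench, worker: Worker, reviewer: Reviewer,
              max_remedy_rounds: int = 2) -> Optional[str]:
    """Run work -> review -> remedy until approval or the round limit."""
    draft = worker(bench.task, None)
    bench.move(Stage.REVIEW, "worker submitted draft")
    for round_no in range(max_remedy_rounds + 1):
        approved, feedback = reviewer(draft)
        if approved:
            bench.move(Stage.APPROVED, "reviewer approved")
            return draft
        if round_no == max_remedy_rounds:
            break  # no remedy rounds left
        bench.move(Stage.REMEDY, f"rejected: {feedback}")
        draft = worker(bench.task, feedback)
        bench.move(Stage.REVIEW, f"remedy round {round_no + 1} submitted")
    bench.move(Stage.ESCALATED, "remedy rounds exhausted")
    return None


# Toy seats standing in for real agents.
def demo_worker(task: str, feedback: Optional[str]) -> str:
    return f"{task} (with tests)" if feedback else task


def demo_reviewer(draft: str) -> Tuple[bool, str]:
    if "tests" in draft:
        return True, ""
    return False, "missing tests"
```

Calling `run_cycle(Bench("add parser"), demo_worker, demo_reviewer)` walks one rejection and one remedy round before approval, and the bench log keeps a line for each transition.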

Intelligence

Cortex

A context engineering layer that assembles prompts from retrieval, live runtime state, and policy constraints. Every agent gets guidance built from what's actually happening — not a static template pulled from a library.
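
As a rough sketch of what assembling a prompt from retrieval, live runtime state, and policy constraints could look like, the example below builds one guidance object from the three inputs and renders it as text. The `GuidancePacket` type, the `assemble_guidance` function, the section titles, and the `max_snippets` cap are hypothetical and are not taken from Cortex itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class GuidancePacket:
    """Hypothetical container for the three inputs named in the text."""
    grounding: List[str]             # retrieved context
    guardrails: List[str]            # policy constraints
    execution_state: Dict[str, str]  # live runtime state

    def render(self) -> str:
        # Sort state keys so the same inputs always yield the same prompt.
        state_lines = [f"{k}: {v}" for k, v in sorted(self.execution_state.items())]
        sections = [
            ("Grounding", self.grounding),
            ("Guardrails", self.guardrails),
            ("Execution state", state_lines),
        ]
        lines: List[str] = []
        for title, items in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


def assemble_guidance(retrieved: Sequence[str], policy: Sequence[str],
                      state: Dict[str, str], max_snippets: int = 3) -> GuidancePacket:
    """Build a packet from current inputs instead of filling a static template."""
    return GuidancePacket(
        grounding=list(retrieved[:max_snippets]),  # keep only the top snippets
        guardrails=list(policy),
        execution_state=dict(state),               # copy so later changes don't leak in
    )
```

Because the packet is built at call time from the current state dictionary, two calls made at different stages of a bench produce different prompts from the same code path.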

Fitness

Self-improving

A qualification model scores how well each agent handles each type of task. Validation benches assess the system itself. Evidence from real work flows back in, so the runtime keeps getting better at matching agents to work.
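
One simple way to picture evidence-based scoring is a smoothed success rate per agent and task type. The sketch below uses Laplace smoothing, which is a stand-in chosen for the example and not necessarily what WorkMan's qualification model does; the `QualificationModel` class and its method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


class QualificationModel:
    """Toy fitness tracker keyed by (agent, task_type)."""

    def __init__(self) -> None:
        # (agent, task_type) -> [successes, trials]
        self._counts: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, agent: str, task_type: str, success: bool) -> None:
        """Feed one piece of evidence from real work back into the model."""
        entry = self._counts[(agent, task_type)]
        entry[0] += int(success)
        entry[1] += 1

    def fitness(self, agent: str, task_type: str) -> float:
        """Laplace-smoothed success rate; unseen pairs score a neutral 0.5."""
        successes, trials = self._counts.get((agent, task_type), (0, 0))
        return (successes + 1) / (trials + 2)

    def best_agent(self, task_type: str, agents: Sequence[str]) -> Optional[str]:
        """Pick the agent with the highest fitness; ties go to the first listed."""
        return max(agents, key=lambda a: self.fitness(a, task_type), default=None)
```

Smoothing keeps an agent with one lucky success from outranking an agent with a long, mostly successful record, which matters when evidence per task type is still sparse.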

Platform

Capabilities

  • Isolated workbenches — each task gets its own branch, runtime state, and staged guidance
  • Semi-adversarial review — agents review each other's work with structured approve/reject cycles and remedy rounds
  • Governed guidance — every prompt is assembled from grounding, guardrails, and live execution state (Triple-G)
  • Pluggable agent seats — assign any AI engine to any role; different models bring different strengths
  • Context-engineered prompts — Cortex assembles guidance from retrieval, policy, and execution context
  • Explicit runtime truth — one canonical state per bench; no hidden state, no second authority plane
  • Parallel execution — multiple benches running simultaneously with fully isolated state
  • Operator views — browser-based live bench state, system topology, and runtime monitoring
  • Hosted projection — the same runtime projects to a protected remote surface behind identity gating

In progress

What's being built

  • N-seat topology — configurable multi-seat benches with delegation chains beyond paired review
  • Qualification model — evidence-based fitness scoring to match agents to tasks by measured capability
  • Policy engine — per-action authority and delegation gates evaluated at runtime
  • Resolved guidance packets — explicit instruction objects assembled from truth, grounding, and guardrails
  • Response matrix — deterministic next-action derivation from runtime state plus attested signals
  • Self-assessment — validation benches that measure WorkMan's own fitness and feed back improvements
  • Cortex retrieval — embedding-based similar-example augmentation and outcome-aware prompt selection
  • Template output mapping — theoretical software component designs driving structured code generation
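
To illustrate the response matrix idea from the list above, deterministic next-action derivation can be pictured as a pure lookup over runtime state and signal. The stage names, signal names, actions, and the `next_action` function below are invented for the example; WorkMan's actual matrix is not described here.

```python
from typing import Dict, Tuple

# Hypothetical matrix: (bench stage, signal) -> next action.
RESPONSE_MATRIX: Dict[Tuple[str, str], str] = {
    ("work", "draft_submitted"): "start_review",
    ("review", "approved"): "stage_handoff",
    ("review", "rejected"): "start_remedy",
    ("remedy", "draft_submitted"): "start_review",
}


def next_action(stage: str, signal: str, attested: bool) -> str:
    """Derive the next action from state plus signal, with no hidden inputs."""
    if not attested:
        # Unattested signals never drive a transition in this sketch.
        return "hold"
    # Unknown combinations go to a human rather than being guessed at.
    return RESPONSE_MATRIX.get((stage, signal), "escalate_to_operator")
```

Since the function reads nothing but its arguments and a constant table, the same state and signal always produce the same action, which is what makes the derivation auditable.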

Architecture

Local-first, hostable, self-improving

WorkMan runs on your machine and manages agent sessions through tmux. Cortex assembles context-aware prompts from retrieval and live state. The qualification system measures what works and feeds evidence back. When you're ready, the whole thing projects to a hosted operator surface behind identity gating.
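
For a sense of what managing agent sessions through tmux can involve, here is a minimal sketch that starts one detached tmux session per seat. The `workman-<bench>-<seat>` naming scheme and the helper functions are assumptions for illustration; only the tmux subcommands themselves (`new-session -d -s`, `has-session -t`) are standard tmux.

```python
import shutil
import subprocess
from typing import List


def session_name(bench: str, seat: str) -> str:
    """Hypothetical naming scheme: one tmux session per bench seat."""
    # "." and ":" act as separators in tmux targets, so replace them.
    return f"workman-{bench}-{seat}".replace(".", "_").replace(":", "_")


def new_session_cmd(name: str, agent_cmd: str) -> List[str]:
    # -d starts the session detached; -s sets the session name.
    return ["tmux", "new-session", "-d", "-s", name, agent_cmd]


def has_session_cmd(name: str) -> List[str]:
    # Exits with status 0 when a matching session exists.
    return ["tmux", "has-session", "-t", name]


def ensure_session(name: str, agent_cmd: str) -> bool:
    """Start the session if it is not already running; False if tmux is missing."""
    if shutil.which("tmux") is None:
        return False
    if subprocess.run(has_session_cmd(name), capture_output=True).returncode == 0:
        return True
    return subprocess.run(new_session_cmd(name, agent_cmd)).returncode == 0
```

Keeping the command builders separate from the function that runs them makes the session logic easy to test on a machine without tmux installed.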