Architecture

Optio is a monorepo with two applications (API server and web dashboard), four shared packages, and a Helm chart for Kubernetes deployment. All services run in Kubernetes, including the API and web app.

System Overview

The system has three layers: the web UI for user interaction, the API server for orchestration logic, and Kubernetes for agent execution.

  • Web UI (Next.js) — Dashboard with live log streaming, task management, repo configuration, cost analytics, and cluster monitoring. Communicates with the API via REST and WebSocket.
  • API Server (Fastify) — Orchestration brain. Manages task queue (BullMQ), PR watching, health monitoring, ticket sync, and pod lifecycle. Stores state in PostgreSQL, uses Redis for job queue and pub/sub.
  • Kubernetes — Execution environment. Each repository gets its own long-lived pod. Tasks run in isolated git worktrees within those pods.

Pod-per-Repo with Worktrees

This is the central design decision. Instead of one pod per task (slow and wasteful), Optio runs one long-lived pod per repository:

  • The pod clones the repo once on creation, then runs sleep infinity
  • When a task arrives, Optio execs into the pod: git worktree add → run agent → clean up the worktree
  • Multiple tasks can run concurrently in the same pod (one per worktree), controlled by per-repo maxConcurrentTasks
  • Pods use persistent volumes so installed tools survive restarts
  • Idle pods are cleaned up after 10 minutes (configurable via OPTIO_REPO_POD_IDLE_MS)
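
The worktree lifecycle above can be sketched as the two git invocations that bracket an agent run. This is a minimal illustration, not Optio's actual code: the function name planWorktree, the .worktrees directory, and the optio/<taskId> branch naming are all assumptions made for the example. Only the git subcommands themselves (git worktree add, git worktree remove) are real.

```typescript
// Hypothetical sketch: names and directory layout are assumptions,
// not Optio's implementation. Only the git subcommands are real.

interface WorktreePlan {
  path: string;      // directory the agent would run in
  setup: string[];   // argv that creates the worktree on a new branch
  cleanup: string[]; // argv that removes the worktree afterwards
}

function planWorktree(repoDir: string, taskId: string, baseBranch: string): WorktreePlan {
  // Reject ids that could escape the worktree directory (e.g. "../x").
  if (!/^[A-Za-z0-9_-]+$/.test(taskId)) {
    throw new Error(`invalid task id: ${taskId}`);
  }
  const path = `${repoDir}/.worktrees/${taskId}`; // assumed layout
  const branch = `optio/${taskId}`;               // assumed branch naming
  return {
    path,
    // git worktree add -b <new-branch> <path> <commit-ish>
    setup: ["git", "-C", repoDir, "worktree", "add", "-b", branch, path, baseBranch],
    // --force removes the worktree even if the agent left uncommitted changes
    cleanup: ["git", "-C", repoDir, "worktree", "remove", "--force", path],
  };
}

const plan = planWorktree("/workspace/repo", "task-42", "main");
console.log(plan.setup.join(" "));
console.log(plan.cleanup.join(" "));
```

Because each task gets its own directory and branch, concurrent tasks in the same pod share the one clone without touching each other's working files.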

Multi-Pod Scaling

Repos can scale beyond a single pod for higher throughput. Two per-repo settings control this:

Setting           Default   Description
maxPodInstances   1         Max pod replicas per repo (1–20)
maxAgentsPerPod   2         Max concurrent agents per pod (1–50)

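Total capacity = maxPodInstances × maxAgentsPerPod; with the defaults, that is 1 × 2 = 2 concurrent agents per repo. Pods scale up dynamically when all existing pods are at capacity, and scale down in LIFO order (the most recently created pod is removed first) when idle.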
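
The capacity formula and scaling rules can be sketched as three small functions. This is an illustration under stated assumptions: the PodState shape and the function names are hypothetical, and treating "LIFO" as "remove the newest idle pod first" is one reading of the rule rather than a confirmed detail of Optio's scheduler.

```typescript
// Hypothetical sketch: types and function names are assumptions.

interface RepoScaling {
  maxPodInstances: number; // 1 to 20, default 1
  maxAgentsPerPod: number; // 1 to 50, default 2
}

interface PodState {
  name: string;
  createdAt: number;    // creation time, epoch milliseconds
  activeAgents: number; // agents currently running in this pod
}

function totalCapacity(s: RepoScaling): number {
  return s.maxPodInstances * s.maxAgentsPerPod;
}

// Scale up only when every existing pod is full and the replica limit
// allows another pod. With no pods yet, every() is true, so the first
// pod is provisioned as well.
function shouldScaleUp(pods: PodState[], s: RepoScaling): boolean {
  const allFull = pods.every((p) => p.activeAgents >= s.maxAgentsPerPod);
  return allFull && pods.length < s.maxPodInstances;
}

// LIFO scale-down: among idle pods, pick the most recently created one.
function pickPodToRemove(pods: PodState[]): PodState | undefined {
  const idle = pods.filter((p) => p.activeAgents === 0);
  if (idle.length === 0) return undefined;
  return idle.reduce((newest, p) => (p.createdAt > newest.createdAt ? p : newest));
}

const defaults: RepoScaling = { maxPodInstances: 1, maxAgentsPerPod: 2 };
console.log(totalCapacity(defaults));
```

Removing the newest pod first keeps the oldest pod, which under this reading is the one most likely to have warm caches and installed tools on its volume.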

Workers

The API server runs several BullMQ workers:

  • Task Worker — Processes the job queue. Handles concurrency limits, pod provisioning, agent execution, and log streaming.
  • PR Watcher — Polls open PRs every 30 seconds for CI status, review state, and merge readiness. Triggers auto-resume and auto-merge.
  • Health Monitor — Runs every 60 seconds. Detects crashed pods, cleans up orphaned worktrees, removes idle pods.
  • Ticket Sync — Syncs tasks from GitHub Issues and Linear tickets.
  • Webhook Worker — Delivers outgoing webhook events.
  • Schedule Worker — Executes cron-based scheduled tasks.
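
As one example of what a worker pass might involve, the idle-pod cleanup done by the Health Monitor can be sketched as a pure function over pod activity. This is a sketch under assumptions: the PodActivity shape and both function names are hypothetical, and how Optio actually tracks last activity is not stated here. The 10-minute default and the OPTIO_REPO_POD_IDLE_MS variable come from the pod-per-repo description earlier in this page.

```typescript
// Hypothetical sketch: types and function names are assumptions.

const DEFAULT_IDLE_MS = 10 * 60 * 1000; // 10 minutes

// Read the timeout from an env-like map; fall back to the default when the
// variable is missing, empty, non-numeric, or not positive.
function idleTimeoutMs(env: Record<string, string | undefined>): number {
  const raw = env["OPTIO_REPO_POD_IDLE_MS"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_IDLE_MS;
}

interface PodActivity {
  name: string;
  activeTasks: number;  // tasks currently running in the pod
  lastActiveAt: number; // epoch milliseconds of the last task activity
}

// Return the names of pods with no running tasks that have been idle
// for at least idleMs.
function findIdlePods(pods: PodActivity[], now: number, idleMs: number): string[] {
  return pods
    .filter((p) => p.activeTasks === 0 && now - p.lastActiveAt >= idleMs)
    .map((p) => p.name);
}

const now = 1_000_000_000;
const idle = findIdlePods(
  [
    { name: "repo-a", activeTasks: 0, lastActiveAt: now - 11 * 60 * 1000 },
    { name: "repo-b", activeTasks: 1, lastActiveAt: now - 30 * 60 * 1000 },
    { name: "repo-c", activeTasks: 0, lastActiveAt: now - 60 * 1000 },
  ],
  now,
  idleTimeoutMs({}),
);
console.log(idle);
```

Passing the current time and environment in as arguments keeps the check deterministic, so a pass that runs every 60 seconds can be tested without a cluster.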

Tech Stack

Layer      Technology
Monorepo   Turborepo + pnpm 10
API        Fastify 5, Drizzle ORM, BullMQ, Zod
Web        Next.js 15, Tailwind CSS 4, Zustand, Recharts
Database   PostgreSQL 16
Queue      Redis 7 + BullMQ
Runtime    Kubernetes + Docker
Deploy     Helm 3
Auth       OAuth (GitHub, Google, GitLab)
CI         GitHub Actions

Packages

  • @optio/shared — Types, state machine, prompt template renderer, error classifier, constants
  • @optio/container-runtime — Abstract runtime interface with Kubernetes implementation
  • @optio/agent-adapters — Claude Code and Codex adapters (auth, environment, config)
  • @optio/ticket-providers — GitHub Issues and Linear ticket sync