SWAZEENET
VOL. I · NO. 01 · EST. MCMLXXXVII
BROADSIDE 12 active · 2026·05·14
№ 30 · tool · Python 3.12+ · 2024—

Master Agent Engine,
23 routers, one Python process.

A Windows Server's control plane on :8420.

windowsadmin fastapi automation

Master Agent Engine is the local infrastructure-management platform for this Windows Server — a single uvicorn worker fronting 23 FastAPI routers covering process control, service lifecycle, AD, IIS, Defender, certificates, three GPU worker proxies, and a disk-usage subsystem with bottom-up scanning. Every mutation is audit-logged. Every PowerShell call funnels through one sanitization chokepoint. Every daemon thread (CPU sampler, child reaper, disk scheduler) starts at lifespan and stops cleanly at shutdown.

I · Architecture

The platform is a single uvicorn worker fronting 23 FastAPI routers organized by Windows subsystem — host process control, service lifecycle, users + groups, registry, firewall, certificates, AD, IIS, Defender, scheduled tasks, DNS, plus three GPU worker proxies for FLUX image-gen and Wan 2.2 video, plus a disk-usage subsystem with bottom-up scanning and three-stage dupe detection. All endpoints are async; every PowerShell invocation runs on a dedicated ThreadPoolExecutor sized to min(64, cpu_count * 4) for I/O saturation without starving the loop.
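The executor sizing described above can be sketched as follows. This is a minimal illustration, not the project's code: the names `pool_size`, `PS_EXECUTOR`, and `run_blocking` are hypothetical, and only the `min(64, cpu_count * 4)` formula comes from the text.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional


def pool_size(cpu_count: Optional[int] = None) -> int:
    """Return min(64, cpu_count * 4), the sizing rule quoted in the text."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(64, cpus * 4)


# Hypothetical module-level pool reserved for blocking PowerShell calls.
PS_EXECUTOR = ThreadPoolExecutor(max_workers=pool_size(), thread_name_prefix="ps")


async def run_blocking(func: Callable[..., Any], *args: Any) -> Any:
    """Run a blocking callable on the dedicated pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(PS_EXECUTOR, func, *args)
```

An async handler would then `await run_blocking(some_blocking_call, arg)` rather than calling the blocking function directly.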

The workhorse is run_ps_json(cmd): it executes PowerShell, validates exit code, parses JSON output, and normalizes results to list[dict]. Almost every router uses it. The injection-prevention boundary is sanitize_ps_param(value), which doubles single quotes and rejects null bytes; sanitized values must always be embedded in PS single-quoted strings (-Name '{val}'). Path inputs go through validate_path() separately with an explicit allowlist.
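A rough sketch of how these two helpers might fit together, assuming only the behaviour described above. The function names `sanitize_ps_param` and `run_ps_json` come from the text; the helper `normalize_ps_json`, the `timeout` parameter, the exception types, and the exact `powershell.exe` invocation are assumptions made for illustration.

```python
import json
import subprocess
from typing import Dict, List


def sanitize_ps_param(value: str) -> str:
    """Reject null bytes and double single quotes for use inside '...' in PS."""
    if "\x00" in value:
        raise ValueError("null byte in PowerShell parameter")
    return value.replace("'", "''")


def normalize_ps_json(stdout: str) -> List[Dict]:
    """Normalize PowerShell JSON output to a list of dicts.

    ConvertTo-Json emits a bare object for one result and an array for
    several, so both shapes are folded into a list here.
    """
    text = stdout.strip()
    if not text:
        return []
    data = json.loads(text)
    if isinstance(data, dict):
        return [data]
    if isinstance(data, list) and all(isinstance(item, dict) for item in data):
        return data
    raise ValueError("unexpected JSON shape from PowerShell")


def run_ps_json(cmd: str, timeout: float = 30.0) -> List[Dict]:
    """Execute a PowerShell command, check the exit code, parse JSON output."""
    proc = subprocess.run(
        ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", cmd],
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed default, not taken from the source
    )
    if proc.returncode != 0:
        raise RuntimeError(f"PowerShell exited {proc.returncode}: {proc.stderr.strip()}")
    return normalize_ps_json(proc.stdout)
```

A caller would embed the sanitized value in a single-quoted string, for example `run_ps_json(f"Get-Service -Name '{sanitize_ps_param(name)}' | ConvertTo-Json")`. Doubling the quote keeps the value inside the literal, which is why the single-quoted form matters.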

Three daemon threads spawn at lifespan startup and stop cleanly at shutdown. A CPU sampler ticks every second so /metrics reads the latest value instantly instead of blocking 500 ms per call. A child-process reaper polls the spawned-PID registry and evicts completed entries after a 300-second TTL. A disk-usage scheduler fires its first scan at T+5 minutes, then every twelve hours, against a configurable scan root and an EXCLUDE_DIRS list for transient/cache directories.
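The start-at-lifespan, stop-at-shutdown pattern for the CPU sampler could look roughly like this. The class name `Sampler`, the injected `read` callable, and the join timeout are hypothetical; the source only states that the sampler ticks every second and that readers get the latest value without blocking.

```python
import threading
import time
from typing import Callable


class Sampler:
    """Background thread that refreshes a cached reading at a fixed interval."""

    def __init__(self, read: Callable[[], float], interval: float = 1.0) -> None:
        self._read = read            # assumed: any callable returning CPU percent
        self._interval = interval
        self._latest = 0.0
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._run, name="cpu-sampler", daemon=True
        )

    @property
    def latest(self) -> float:
        """Most recent sample; reading it never blocks on a measurement."""
        return self._latest

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 5.0) -> None:
        """Signal the loop to exit and wait for the thread to finish."""
        self._stop.set()
        self._thread.join(timeout)

    def _run(self) -> None:
        while not self._stop.is_set():
            self._latest = self._read()
            # Event.wait doubles as an interruptible sleep, so stop() is prompt.
            self._stop.wait(self._interval)
```

In a FastAPI lifespan handler, `start()` would run before the `yield` and `stop()` after it. The child reaper and disk scheduler could presumably follow the same shape with a different loop body.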

Twenty-three routers, one Python process, no locks.

§ I · Architecture

II · Operating posture

The day-2 surface is small on purpose. Deploy-MasterAgent.ps1 handles the full bring-up: virtualenv, ACLs, NSSM service registration, Windows firewall rules, and a scaffolded .env. After that, the only operator surfaces are the dark-themed dashboard at / (auto-connecting on localStorage credentials), the e2e smoke test (a 50-second live exercise of every router plus the WebSocket handshake), and direct REST/JSON for scripted use.

Authentication accepts either an X-API-Key header (SHA-256 + HMAC verified against the configured key set) or an Authorization: Bearer <jwt> token issued by /auth/token. Per-IP rate limiting uses a deque-based sliding 60-second window at 300 RPM (configurable). Every state-changing endpoint funnels through one audit(caller, action, detail) call, written to audit.log when AGENT_AUDIT_LOG_ENABLED=true; the same audit posture covers FLUX/video proxy submissions and disk-usage destructive actions.
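The deque-based sliding window can be illustrated with a short sketch. The class name `SlidingWindowLimiter` and the injectable `clock` are assumptions for testability; the 60-second window and 300-request default are the values stated above.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Per-IP limiter: at most `limit` requests within any `window` seconds."""

    def __init__(
        self,
        limit: int = 300,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self._clock()
        hits = self._hits[ip]
        cutoff = now - self._window
        # Drop timestamps that have slid out of the window (oldest are on the left).
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

Because timestamps are appended in order, expiry only ever pops from the left, so each check costs time proportional to the number of expired entries rather than to the whole bucket.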

The checks for optional Windows features (AD module, IIS module) cache module availability with a 10-minute TTL; when a module is missing, those routers return 503 Service Unavailable instead of failing with a 500 on every request. The FLUX and video routers proxy to local GPU workers running on different loopback ports; each holds a module-level httpx.AsyncClient reused across requests, and aclose_client() is called from lifespan shutdown to drain in-flight calls.
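One way to picture the module-availability cache is a small TTL wrapper around a probe function. The name `CachedProbe`, the `probe` callable, and the injectable `clock` are hypothetical; only the 10-minute TTL is taken from the text.

```python
import time
from typing import Callable, Optional


class CachedProbe:
    """Cache the boolean result of an expensive availability check for `ttl` seconds."""

    def __init__(
        self,
        probe: Callable[[], bool],
        ttl: float = 600.0,  # 10 minutes, as stated in the text
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._ttl = ttl
        self._clock = clock
        self._value = False
        self._checked_at: Optional[float] = None

    def available(self) -> bool:
        """Return the cached answer, re-running the probe once the TTL has expired."""
        now = self._clock()
        if self._checked_at is None or now - self._checked_at >= self._ttl:
            self._value = bool(self._probe())
            self._checked_at = now
        return self._value
```

A router dependency could call `available()` and raise `HTTPException(status_code=503)` when it returns `False`, so a missing module costs one probe per TTL period rather than one per request.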

Fig. I. 01 CORS · 02 rate-limit · 03 req-mw · 04 router · 05 require_auth · 06 handler · 07 audit
7 stages · every endpoint async · as of 2026-05-09
Fig. II. host 6 (26%) · network 4 (17%) · identity 3 (13%) · GPU workers 2 · metrics 2 · registry 1 · tasks 1 · defender 1 · diskusage 1 · execution 1 · codex 1
single uvicorn worker · ~11 model files · as of 2026-05-09

III · Surface

Callers see one HTTP service on localhost:8420. Twenty-three routers cover host process control, service lifecycle, user/group management, registry, firewall, certificates, Windows Update, DNS, system metrics, AD users + groups, IIS sites, Defender status, scheduled tasks, FLUX image generation, Wan 2.2 video generation, and a disk-usage subsystem. The dashboard at / bundles the most-used surfaces into a single dark-themed page; the OpenAPI doc at /docs ships only when AGENT_DEBUG=true. There is no client-side framework — the dashboard is hand-written HTML talking to the same JSON API everything else uses.

IV · Constraints

The constraints are deliberate. Single-worker uvicorn means in-memory state — child registry, rate-limit buckets, CPU sampler, disk-usage scheduler — needs no locking. The cost is no horizontal scaling on the same host, which is fine because the agent is the host's control plane; one box, one worker. sanitize_ps_param is the only safe path to embed user input in PowerShell, so any router that bypasses it would let a caller inject arbitrary PS — the reviewer expectation is explicit and the test fixture covers known injection vectors. NSSM running as LocalSystem trades sandboxing for the ability to manage anything on the box; the audit log is the post-hoc check on that tradeoff. Single-worker also means a slow PowerShell call can occupy the executor while other requests queue; the executor pool sizing (min(64, cpu_count * 4)) is tuned for that contention and the e2e smoke catches regressions.
