
Building AI Agent Infrastructure as a Solo Developer

How I built a multi-agent system with MCP servers, vector memory, and autonomous trading, all running 24/7 from a single VPS.

Agents · MCP · Infrastructure

When people hear "multi-agent system," they picture a team of engineers, months of planning, and enterprise infrastructure. I built one by myself, and it runs on a single $15/month VPS.

This post covers the architecture decisions, the tools that made it possible, and the parts that surprised me.

The Stack

The system has five layers:

Claude Code sits at the top as the primary agent runtime. Skills, hooks, and memory give it persistent context across sessions.

MCP Servers provide the tool layer. Instead of hardcoding capabilities, each tool is a standalone server that any agent can call. Search the knowledge vault? That's an MCP tool. Dispatch a task to another agent? MCP tool. Check VPS health? MCP tool.

ChromaDB handles vector memory. Every document, conversation summary, and learned pattern gets embedded and stored. When an agent needs context, it queries by semantic similarity rather than keyword matching.
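In miniature, semantic retrieval is just nearest-neighbour search over embedding vectors. A stdlib-only sketch of the idea — the document IDs and vectors here are made up, and in the real system ChromaDB handles the embedding and indexing:

```python
import math

def cosine(a, b):
    # Cosine similarity: how aligned two embedding vectors are
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy in-memory store with hypothetical 3-dimensional embeddings;
# real embeddings have hundreds of dimensions and live in ChromaDB
store = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.1, 0.9, 0.2],
}

def query(embedding, n_results=1):
    # Rank stored documents by similarity to the query embedding
    ranked = sorted(store, key=lambda k: cosine(store[k], embedding), reverse=True)
    return ranked[:n_results]
```

The point is the retrieval model: the agent asks "what is *similar* to this," not "what *contains* this string."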

Obsidian is the knowledge vault: 7,000+ markdown files organized by topic. It's the human-readable layer that agents can also query through MCP.

Hetzner VPS runs the always-on processes: the trading bot, the Telegram gateway, the cron jobs, everything that needs to persist beyond a terminal session.

Why MCP Changes Everything

Before MCP, giving an AI agent access to tools meant writing custom integrations for each model provider. MCP standardizes the protocol: define your tool once, and any MCP-compatible client can use it.

I have servers for ChromaDB search, Obsidian vault queries, backlog management, and inter-agent messaging. Adding a new capability means writing one server, not modifying every agent.
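Stripped to its core, the pattern is a registry that any agent can dispatch against by tool name. A hypothetical Python sketch — real MCP servers speak the protocol over stdio or HTTP, and these tool names and bodies are invented for illustration:

```python
# Registry mapping tool names to callables; each tool is defined once
TOOLS = {}

def tool(name):
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("vault_search")
def vault_search(query: str) -> list:
    # Stand-in for an Obsidian vault query
    return [f"note matching {query!r}"]

@tool("vps_health")
def vps_health() -> dict:
    # Stand-in for a VPS health check
    return {"status": "ok"}

def dispatch(name, **kwargs):
    # Any agent can call any registered tool through one entry point
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Adding a capability means registering one more function; no agent code changes.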

The Effect-TS implementation makes the servers composable and type-safe. Error handling is built into the type system rather than scattered across try-catch blocks.
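The same idea translates to Python as a tagged result type: errors become values the caller must handle rather than exceptions to catch. A rough analogue, not Effect-TS itself — `search_vault` and its error string are made-up examples:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")

@dataclass
class Ok(Generic[T]):
    value: T

@dataclass
class Err(Generic[E]):
    error: E

# A function's signature now declares both its success and failure types
Result = Union[Ok[T], Err[E]]

def search_vault(query: str) -> "Result[list, str]":
    if not query:
        return Err("empty query")      # failure is a value, not a thrown exception
    return Ok([f"note on {query}"])    # hypothetical result payload
```

Callers pattern-match on `Ok`/`Err` instead of wrapping every call in try-catch.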

Memory That Actually Works

The biggest challenge with AI agents isn't reasoning; it's memory. A conversation ends, and everything learned evaporates.

The solution is a three-layer memory system. (I open-sourced the basic version as the Claude Memory Kit; a Pro version adds stack-specific libraries and advanced patterns.) The layers:

  1. ChromaDB for semantic search across all stored knowledge
  2. File-based memory for structured facts (user preferences, project context, feedback)
  3. Obsidian vault for human-curated knowledge that agents can also access

Each layer serves a different retrieval pattern. ChromaDB handles "find me something similar to X." File memory handles "what did the user tell me about Y." The vault handles "what's the canonical documentation for Z."
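The routing between layers can be sketched as a small dispatch table. The handler bodies here are placeholders standing in for the real ChromaDB, file-memory, and vault lookups:

```python
# Stand-in handlers for the three memory layers
def semantic_search(q): return f"chromadb: nearest neighbours for {q!r}"
def fact_lookup(q):     return f"file memory: stored fact about {q!r}"
def vault_lookup(q):    return f"vault: canonical doc for {q!r}"

LAYERS = {
    "similar": semantic_search,   # "find me something similar to X"
    "fact": fact_lookup,          # "what did the user tell me about Y"
    "canonical": vault_lookup,    # "what's the canonical documentation for Z"
}

def retrieve(kind: str, query: str) -> str:
    # Route each retrieval pattern to the layer built for it
    return LAYERS[kind](query)
```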

The Trading Bot

Pamela, the autonomous trading agent, was the forcing function for getting the infrastructure right. A trading bot that loses money because it forgot its strategy is worse than no bot at all.

She runs 24/7 on the VPS, monitored by PM2. Her architecture:

  • Market scanning: Polymarket API for contract discovery
  • Analysis: ML-driven probability estimation
  • Position sizing: Kelly criterion with configurable risk limits
  • Execution: Automated order placement and management
  • Reporting: Daily P&L summaries via Telegram

The key insight: the bot doesn't need to be smart about everything. It needs to be smart about a few things and disciplined about the rest.
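Kelly sizing for a binary prediction-market contract reduces to a one-line formula: buying at price c with estimated win probability p, the full-Kelly fraction of bankroll is (p − c) / (1 − c). A sketch under simplified assumptions — the prices, probabilities, and the 5% cap are illustrative, not the bot's actual parameters:

```python
def kelly_fraction(p: float, price: float, cap: float = 0.05) -> float:
    """Fraction of bankroll to stake on a contract paying 1 if it resolves YES.

    p: estimated probability of YES; price: contract cost in (0, 1);
    cap: configurable risk limit (hypothetical default).
    """
    edge = p - price
    if edge <= 0:
        return 0.0                   # no positive edge: no position
    f = edge / (1.0 - price)         # full Kelly for a 0/1 payoff
    return min(f, cap)               # risk limit keeps sizing disciplined
```

Capping well below full Kelly is the "disciplined about the rest" part: probability estimates are noisy, and full Kelly assumes they aren't.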

Lessons Learned

Start with one agent, not three. Multi-agent orchestration sounds impressive but adds complexity. Get one agent working end-to-end before adding coordination.

MCP servers are the right abstraction. Tools as services, not libraries. This makes testing, deployment, and access control straightforward.

Memory is infrastructure, not a feature. Treat it like a database, with schemas, retention policies, and access patterns.

VPS beats serverless for always-on agents. When your agent needs to maintain state, respond to events, and run cron jobs, a $15 VPS is simpler than a constellation of Lambda functions.

The tools exist. Claude Code, MCP, ChromaDB, PM2: the building blocks for agent infrastructure are production-ready today. The bottleneck isn't technology; it's architecture.

What's Next

The system keeps growing. Current priorities: improving inter-agent communication (an "agent bus" for real-time messaging), better memory consolidation (merging redundant knowledge), and more sophisticated trading strategies.

The goal isn't to build the most complex system. It's to build the most useful one, with the least moving parts.