I Replaced $150/Month of SaaS With a $24 VPS and a Weekend — Building Your Private AI Infrastructure [1/5]

I am a Certified Public Tax Accountant (Zeirishi) and financial planner in Japan, specializing in international taxation and transfer pricing. I am also a member of IFA (International Fiscal Association). I have a parallel background in IT — from early microcomputer programming through enterprise ERP implementations.

Last year I sat down and added up what my small practice was paying for SaaS: cloud storage, document collaboration, AI assistants, calendar, email, remote desktop, monitoring. The number was $163 per user per month. I was paying for convenience — but I was also paying for dependence. I could not verify the security architecture. I could not audit the data flow. And every year, the invoices went up while the control went down.

I decided to see whether I could build a self-hosted, zero-trust replacement that I actually understood and controlled — and that any solo practitioner or small firm with 3 to 10 employees could deploy by following a guide.

This is what I ended up with. It runs in production on real client work every day.

The Stack

VPS: Vultr, $24/month, Ubuntu 24.04 LTS
Zero-trust access: Cloudflare Zero Trust (free tier), with only two open ports (80/443), no VPN server to run, no exposed SSH, and no tunnel services beyond Cloudflare's own
Private cloud + real-time editing: Nextcloud + Collabora Online
Four AI secretaries: A unified proxy routing to ChatGPT, Claude, Gemini, and Perplexity — each selected for a distinct strength. Claude for contracts and editorial precision. Perplexity for source-cited research. ChatGPT for general reasoning and coding. Gemini for structured data and integration. One authenticated portal, four specialized capabilities. Additional providers can be added by extending a single configuration file.
An AI butler: OpenClaw — an agentic automation layer that does not merely answer questions but executes multi-step tasks on instruction. Morning briefings, email-to-task conversion, weekly summaries, file organization. It operates under strict standing rules: all email actions produce drafts only, filesystem access is compartmentalized, and no action is taken without human confirmation.
Remote desktop: Apache Guacamole, browser-based RDP behind 5 layers (WARP encryption → Cloudflare Access OTP → TLS tunnel → Guacamole auth → Windows login), three of which are authentication steps
Monitoring + alerting: Prometheus + Grafana + Alertmanager — the system watches itself and notifies you before problems become incidents
Three-part backups: nightly database dump to Supabase (PostgreSQL-to-PostgreSQL, zero format conversion), a weekly AES-256 encrypted archive of the full configuration, and 30-day retention with a documented 2-hour restore procedure
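
The 30-day retention in the backup item above could be enforced with a small pruning step. What follows is a minimal Node.js sketch, not the guide's actual script: the function name planRetention, the archive file names, and the choice to decide by creation timestamp are all assumptions made for illustration.

```javascript
// Hypothetical retention planner: splits backups into "keep" and "prune"
// based on a retention window. Names and file patterns are illustrative.
const RETENTION_DAYS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * @param {{name: string, createdAt: Date}[]} backups
 * @param {Date} now
 * @param {number} retentionDays
 * @returns {{keep: string[], prune: string[]}}
 */
function planRetention(backups, now, retentionDays = RETENTION_DAYS) {
  const cutoff = now.getTime() - retentionDays * DAY_MS;
  const keep = [];
  const prune = [];
  for (const backup of backups) {
    // Anything created at or after the cutoff is still inside the window.
    (backup.createdAt.getTime() >= cutoff ? keep : prune).push(backup.name);
  }
  return { keep, prune };
}

// Example run with made-up archive names and fixed UTC timestamps.
const now = new Date("2026-03-31T03:00:00Z");
const plan = planRetention(
  [
    { name: "db-2026-03-30.sql.gz", createdAt: new Date("2026-03-30T03:00:00Z") },
    { name: "db-2026-03-10.sql.gz", createdAt: new Date("2026-03-10T03:00:00Z") },
    { name: "db-2026-02-15.sql.gz", createdAt: new Date("2026-02-15T03:00:00Z") },
  ],
  now
);
console.log(plan);
```

The planner only returns a decision. Actually deleting the files in the prune list is left to the caller, which makes it easy to do a dry run before trusting it with real archives.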

8 security layers: WARP encryption → Cloudflare Access (OTP) → TLS tunnel → UFW (80/443 only) → fail2ban → sysctl hardening → localhost-only service binding → application-level authentication

Who This Is For

This stack is designed for solo practitioners and small firms — accountants, lawyers, consultants, advisors — with 3 to 10 employees. It scales within that range without architectural changes. If you are comfortable following step-by-step instructions in a terminal, you can build this. No DevOps background is required.

The Migration: SaaS → Zero Trust Self-Hosted

What you gain:

Cost control. No per-user pricing that compounds as you grow. The VPS cost is fixed. AI costs are usage-based and capped at your discretion.
Data sovereignty. Client data resides on infrastructure you control. It does not pass through third-party SaaS pipelines you cannot audit.
Architectural transparency. Every configuration file, every security layer, every network rule — you can read it, verify it, and change it.
Independence. No vendor can alter your pricing, discontinue your plan, or change terms of service beneath you.

What you accept:

Operational responsibility. There is no vendor to call at 2 AM. You maintain the system. The monthly checklist (13 items, ~30 minutes) and the emergency runbook (7 scenarios) exist precisely for this reason.
Initial time investment. The full build takes approximately 16–24 hours spread across two weekends. This is a one-time cost. After that, monthly maintenance is under one hour.
A learning curve. You will work in a terminal. The guide explains every command and every expected result, but you must be willing to follow it carefully.

The Cost Comparison

Initial investment:

VPS setup: $0 (hourly billing, cancel anytime)
Cloudflare Zero Trust: $0 (free tier)
All software: $0 (open source)
Domain name: ~$12/year
Your time: 16–24 hours (one-time)

Monthly running cost (3–8 person team):

Component	Cost
VPS (Vultr)	$12 (starter) / $24 (recommended) / $48 (growth)
Cloudflare	$0
Supabase backup	$0 (free tier)
All software	$0
AI API usage (moderate, 3 users)	$15–35
Total	$35–50/month

Equivalent SaaS for 3 users:

Component	Cost
Cloud storage + collaboration (Google Workspace)	$36/month
AI subscriptions (4 providers)	$240+/month
Remote desktop (TeamViewer)	$45/month
VPN / zero-trust access	$30+/month
Monitoring (Datadog/UptimeRobot)	$45+/month
Total	$400+/month

5-year savings estimate: $36,900–$48,900.
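
To rerun the comparison with your own numbers, the arithmetic can be sketched in a few lines of Node.js. The inputs below are the 3-user figures from the two tables, assuming the recommended $24 VPS tier and treating the entries marked "+" as lower bounds. The object and function names are hypothetical, and the domain fee and your own time are left out.

```javascript
// Monthly figures in USD, copied from the tables above (3 users).
const selfHosted = {
  vps: 24, // "recommended" tier; the tables also list 12 and 48
  cloudflare: 0,
  supabaseBackup: 0,
  software: 0,
  aiApi: { low: 15, high: 35 },
};

const saas = {
  storageAndCollaboration: 36,
  aiSubscriptions: 240, // listed as "$240+"
  remoteDesktop: 45,
  zeroTrustAccess: 30, // listed as "$30+"
  monitoring: 45, // listed as "$45+"
};

function sum(values) {
  return values.reduce((total, value) => total + value, 0);
}

// Fixed costs plus the low and high ends of the AI usage range.
function monthlySelfHosted(cfg) {
  const fixed = cfg.vps + cfg.cloudflare + cfg.supabaseBackup + cfg.software;
  return { low: fixed + cfg.aiApi.low, high: fixed + cfg.aiApi.high };
}

// Smallest savings pair with the highest self-hosted cost, and vice versa.
function savingsOver(months, saasMonthly, selfHostedMonthly) {
  return {
    low: (saasMonthly - selfHostedMonthly.high) * months,
    high: (saasMonthly - selfHostedMonthly.low) * months,
  };
}

const saasMonthly = sum(Object.values(saas));
const selfMonthly = monthlySelfHosted(selfHosted);
const fiveYear = savingsOver(60, saasMonthly, selfMonthly);
console.log({ saasMonthly, selfMonthly, fiveYear });
```

The result is sensitive to the VPS tier, the AI usage range, and above all the number of users on per-seat SaaS plans, so any multi-year figure should be recomputed with your own inputs.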

OpenClaw: The Butler — Used Safely

OpenClaw deserves specific discussion because it is both the most powerful and the most carefully constrained component in this stack.

CVE-2026-25253 (CVSS 8.8, High) and the ClawJacked attack class are real. Over 42,000 public instances exist, and approximately 36% (15,200) remain vulnerable. This stack specifies OpenClaw ≥2026.1.29 (patched) and adds three architectural defenses:

Localhost-only binding. OpenClaw listens on 127.0.0.1 only. It is never reachable from the internet.
Cloudflare Tunnel authentication. The only route to the localhost-bound service runs through the tunnel, and that route requires passing Cloudflare Access OTP, so an attacker would first need to compromise your email account.
UFW port restriction. Only ports 80 and 443 are open. There is no path to OpenClaw from the outside.

The standing rules enforce behavioral constraints: all email actions produce drafts only (never autonomous sending), filesystem access is restricted to designated working directories, and every action requires human confirmation before execution.
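
The three standing rules can be pictured as a gate that every proposed action passes before it runs. This is a minimal sketch under stated assumptions, not OpenClaw's actual rule format or API: the action shape (type, path, confirmedByHuman), the action names, and the working directory /srv/butler/work are all hypothetical.

```javascript
// POSIX path handling, since the stack runs on Ubuntu.
const path = require("node:path").posix;

// Hypothetical standing rules; the directory is illustrative only.
const RULES = {
  allowedRoots: ["/srv/butler/work"],
};

// True when `target` resolves to a location at or below one of the roots.
function isInsideAllowedRoot(target, roots) {
  const resolved = path.resolve(target);
  return roots.some((root) => {
    const rel = path.relative(path.resolve(root), resolved);
    const escapes = rel === ".." || rel.startsWith("../");
    return !escapes && !path.isAbsolute(rel);
  });
}

// Returns a decision object instead of executing anything.
function evaluateAction(action, rules = RULES) {
  if (action.type === "email.send") {
    return { allowed: false, reason: "email actions are draft-only" };
  }
  if (!action.confirmedByHuman) {
    return { allowed: false, reason: "awaiting human confirmation" };
  }
  if (action.type.startsWith("file.")) {
    if (!isInsideAllowedRoot(action.path, rules.allowedRoots)) {
      return { allowed: false, reason: "path outside designated working directories" };
    }
  }
  return { allowed: true, reason: "ok" };
}

console.log(evaluateAction({ type: "email.draft", confirmedByHuman: true }));
console.log(evaluateAction({
  type: "file.write",
  path: "/srv/butler/work/../../etc/passwd",
  confirmedByHuman: true,
}));
```

The path check normalizes ".." segments but does not follow symlinks, so a real deployment would also need to resolve links or rely on filesystem permissions.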

The question is not whether the tool has risk. Every tool with real capability has risk. The question is whether the architecture contains that risk. This one does.

Four Secretaries, One Portal

The AI proxy is approximately 100 lines of Node.js. It routes requests to four providers through a single authenticated endpoint. API keys live in a .env file on the server and never reach the browser.

Each provider was selected for a distinct role:

Claude — contracts, editorial review, nuanced prose
Perplexity — source-cited real-time research
ChatGPT — general reasoning, coding assistance, analysis
Gemini — structured data, spreadsheet logic, integration tasks

This is not a limitation. It is a deliberate design. Four specialists outperform one generalist. And if a fifth provider emerges that serves your needs, adding it requires extending a single route in the proxy — fewer than 20 lines of code.
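
The routing idea can be sketched as a lookup table keyed by provider name. This is not the proxy code from Part 5. The environment variable names and the resolveUpstream function are hypothetical, the hosts are each vendor's public API host as I understand them, and request paths, headers, and body formats differ per provider and are omitted here.

```javascript
// Hypothetical provider table. Adding a fifth provider means adding one entry.
const PROVIDERS = {
  chatgpt: { baseUrl: "https://api.openai.com", keyEnv: "OPENAI_API_KEY" },
  claude: { baseUrl: "https://api.anthropic.com", keyEnv: "ANTHROPIC_API_KEY" },
  gemini: { baseUrl: "https://generativelanguage.googleapis.com", keyEnv: "GEMINI_API_KEY" },
  perplexity: { baseUrl: "https://api.perplexity.ai", keyEnv: "PERPLEXITY_API_KEY" },
};

// Looks up the upstream host and the server-side API key for a provider.
// The key is read from the environment, so it never has to reach the browser.
function resolveUpstream(provider, env = process.env) {
  const known = Object.prototype.hasOwnProperty.call(PROVIDERS, provider);
  if (!known) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  const entry = PROVIDERS[provider];
  const apiKey = env[entry.keyEnv];
  if (!apiKey) {
    throw new Error(`Missing ${entry.keyEnv} on the server`);
  }
  return { baseUrl: entry.baseUrl, apiKey };
}

// Example with a fake environment instead of a real .env file.
const upstream = resolveUpstream("claude", { ANTHROPIC_API_KEY: "test-key" });
console.log(upstream.baseUrl);
```

Keeping the table as plain data is what makes a new provider a small change: the lookup and error handling stay the same, and only the per-provider request formatting needs new code.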

The spending rule: set a hard cap per provider before your first API request. $20/month each. Total maximum exposure: $80/month. Realistic spend for a 3-person team: $15–35/month.
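
A cap like this is best set in each provider's billing settings where a hard limit is available, but the proxy can also keep its own tally as a second line of defense. The sketch below is an assumption about how such a guard might look, not code from the guide. The tracker API is hypothetical, amounts are held in integer cents to avoid floating-point drift, and per-request cost estimation is left to the caller.

```javascript
const CAP_CENTS = 2000; // $20 per provider per month, per the rule above

// Hypothetical in-memory tracker; a real one would persist and reset monthly.
function createSpendTracker(providers, capCents = CAP_CENTS) {
  const spent = new Map(providers.map((name) => [name, 0]));

  function requireKnown(provider) {
    if (!spent.has(provider)) {
      throw new Error(`Unknown provider: ${provider}`);
    }
  }

  return {
    // True if the estimated cost still fits under the provider's cap.
    canSpend(provider, estimateCents) {
      requireKnown(provider);
      return spent.get(provider) + estimateCents <= capCents;
    },
    // Adds an actual cost to the provider's running total.
    record(provider, costCents) {
      requireKnown(provider);
      spent.set(provider, spent.get(provider) + costCents);
    },
    // Worst case if every provider reaches its cap.
    maxExposureCents() {
      return capCents * spent.size;
    },
  };
}

const tracker = createSpendTracker(["chatgpt", "claude", "gemini", "perplexity"]);
tracker.record("claude", 1950);
console.log(tracker.canSpend("claude", 100)); // false: 1950 + 100 exceeds 2000
console.log(tracker.maxExposureCents() / 100); // 80
```

With four providers at $20 each, the worst case matches the $80/month ceiling stated above.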

The Guide: DIY from Start to Finish

I wrote a free five-part series that covers the entire build. Every command. Every configuration file. Every decision point. Every place where I made a mistake, so you do not have to.

If you follow Parts 1 through 5 and the operational appendices in sequence, you will finish with a complete, production-grade system — without needing to consult external documentation or fill in gaps from other sources.

Part	What You Build
Part 1	Architecture overview, cost analysis, security model, threat assessment
Part 2	VPS provisioning, Cloudflare Zero Trust, UFW, fail2ban, sysctl hardening
Part 3	Docker, Nextcloud, Collabora, AI proxy, OpenClaw, CalDAV, email, backups
Part 4	Guacamole, accounting API integration, Prometheus, Grafana, Alertmanager, AES-256 encrypted backups
Part 5	Full operations manual: LLM proxy code, OpenClaw workflow templates, monthly/annual checklists, emergency runbook (7 scenarios), AI spending audit

Build time: approximately 16–24 hours across two weekends.

All five parts are published and free. No paywall. No signup. No follow-up sequence.

A Few Things I Learned

Cloudflare Tunnel eliminated the need for a VPN entirely. Two ports open, everything else invisible. This was the single biggest simplification.
The hardest integration was not the AI proxy — it was getting Collabora’s aliasgroup configuration to work correctly with Cloudflare’s TLS termination.
OpenClaw’s CVE is a serious concern, but the architectural defense — localhost-only binding plus tunnel authentication — neutralizes it structurally. Do not deploy it without understanding the risk.
The most underrated component is Supabase as a backup target. PostgreSQL-to-PostgreSQL with zero format conversion.
The real transformation was not technical. It was organizational. Four AI secretaries with defined roles and one butler with strict standing rules changed how I work every day. The system stopped being infrastructure and became a team.

I would be grateful for any feedback from this community. If you see something I could improve, or a better approach to any part of this stack, I would genuinely like to hear it.

hendrik@palaver.p3x.de

Cost? Just do away with your bills and do it on a $24 Vulture VPS 🥹😂

greyscale@lemmy.grey.ooo
Ha. Eventually, the bottom will drop out of the market as low-cost NPUs take over running the models. A good enough open model will emerge and there won't be a market for a paid model.

We’re already kinda seeing it on the hardware side. Eventually it’ll all dissolve into the hardware, like how MPEG2 decode hardware for DVDs was once upon a time an expensive add-on accelerator card, but is now fractions of a square mm of gates laid out as part of a larger assembly within the silicon of your GPU.