
# Alexa Amundson
**Senior DevOps Engineer**
amundsonalexa@gmail.com | [github.com/blackboxprogramming](https://github.com/blackboxprogramming)
---
## Summary
Needed production infrastructure without a team or budget. Built a self-healing 7-node fleet (5 Raspberry Pis plus 2 cloud VMs), automated 52 operational tasks, and deployed 99 Cloudflare Pages services, all solo and from scratch.
---
## Experience
### BlackRoad OS | Founder & Senior DevOps Engineer | 2025-Present
**The Problem: Zero Infrastructure, Zero Team**
- No existing infrastructure, no ops team, no vendor contracts — needed production-grade systems running 48+ domains on day one
- Solved by designing a hybrid fleet: 5 Pi nodes + 2 cloud VMs + Cloudflare edge, all connected via a WireGuard mesh VPN, for a total hardware cost under $700
- Result: 256 systemd services running across the fleet, 48 Nginx reverse proxy sites, and 14 Docker containers, all managed by one person
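
To give a sense of what a mesh across 7 nodes involves, here is a minimal sketch that enumerates the peer links. It assumes a full mesh (every node peers directly with every other node), which the resume does not state, and the node names are hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical node names: the resume only says 5 Pi nodes + 2 cloud VMs.
NODES = [f"pi{i}" for i in range(1, 6)] + ["vm1", "vm2"]


def mesh_links(nodes):
    """Return every unordered pair of nodes: one tunnel per pair in a full mesh."""
    return list(combinations(nodes, 2))


if __name__ == "__main__":
    links = mesh_links(NODES)
    # A full mesh of n nodes has n * (n - 1) / 2 links.
    print(f"{len(NODES)} nodes -> {len(links)} peer links")  # 7 nodes -> 21 peer links
```

Under that assumption, each node carries 6 peer entries in its WireGuard configuration, which is one reason generating the configs from a node list tends to be less error-prone than writing them by hand.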
**The Bet: Self-Healing Over Manual Ops**
- Fleet nodes crash, services fail, temperatures spike — manual monitoring doesn't scale for a solo operator running 256 services
- Built autonomy scripts: heartbeat every 60 seconds, heal cycle every 5 minutes, automatic service restarts on failure
- Detected a node cooking at 73.8°C from a runaway Ollama loop — auto-isolated the process, dropped temp to 57.9°C without downtime
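
The heal cycle described in the bullets above could look roughly like the following. This is a sketch under stated assumptions, not the actual autonomy scripts: the 70 °C threshold and the function names are hypothetical, and the scheduling (60-second heartbeat, 5-minute heal) is left to a systemd timer or cron and is not shown. The `systemctl` subcommands and the sysfs thermal path are standard on Raspberry Pi OS.

```python
import subprocess


def is_active(unit, run=subprocess.run):
    """True if `systemctl is-active --quiet UNIT` exits with status 0."""
    return run(["systemctl", "is-active", "--quiet", unit]).returncode == 0


def heal(units, run=subprocess.run):
    """Restart every unit that is not active; return the names restarted."""
    restarted = []
    for unit in units:
        if not is_active(unit, run):
            run(["systemctl", "restart", unit])
            restarted.append(unit)
    return restarted


def read_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the SoC temperature in Celsius (the kernel reports millidegrees)."""
    with open(path) as fh:
        return int(fh.read().strip()) / 1000.0


def too_hot(temp_c, limit_c=70.0):
    """Compare against an assumed threshold; the real limit is not given."""
    return temp_c >= limit_c
```

The `run` parameter is injectable so the restart logic can be exercised with a fake runner on a machine that has no systemd, which keeps the heal path testable away from the fleet.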
**The Multiplier: 212 CLI Tools**
- Every repeated task became a tool. 212 CLI tools (121 MB) in ~/bin — deploy, probe, audit, sync, report
- GitHub-to-Gitea relay syncs 207 repos every 30 minutes. Daily KPI collection tracks 60+ metrics across 10 data sources
- 99 Cloudflare Pages, 23 D1 databases, 47 KV namespaces, 11 R2 buckets — all deployed and maintained through CLI automation
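
One pass of a GitHub-to-Gitea relay like the one mentioned above can be sketched with plain `git` mirroring. This is an illustrative assumption about how such a relay might work, not the actual tool: the function names are hypothetical, and the URLs and working directory are supplied by the caller.

```python
import os
import subprocess


def mirror_commands(source_url, target_url, workdir, already_cloned):
    """Return the git invocations for one mirror pass, in order."""
    if already_cloned:
        # Refresh all refs in the existing bare mirror and drop deleted ones.
        fetch = ["git", "-C", workdir, "remote", "update", "--prune"]
    else:
        # First pass: create a bare mirror clone of the source repository.
        fetch = ["git", "clone", "--mirror", source_url, workdir]
    push = ["git", "-C", workdir, "push", "--mirror", target_url]
    return [fetch, push]


def sync_repo(source_url, target_url, workdir):
    """Run one mirror pass; raises CalledProcessError if git fails."""
    cloned = os.path.isdir(workdir)
    for cmd in mirror_commands(source_url, target_url, workdir, cloned):
        subprocess.run(cmd, check=True)
```

Running `sync_repo` for each repository from a timer that fires every 30 minutes would approximate the cadence described. Note that `git push --mirror` also deletes refs on the target that no longer exist at the source.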
---
## Technical Skills
Linux/Debian, Docker Swarm, systemd, Nginx, WireGuard, Cloudflare, GitHub Actions, Bash, Python
---
## Metrics
| Metric | Value | Source |
|--------|-------|--------|
| Systemd Services | *live* | services.sh — systemctl list-units via SSH |
| Docker Containers | *live* | services.sh — docker ps via SSH |
| Fleet Nodes | *live* | fleet.sh — SSH probe to all nodes |
| CF Pages | *live* | cloudflare.sh — wrangler pages list |
| CLI Tools | *live* | local.sh — ls ~/bin \| wc -l |
| Total Repos | *live* | github-all-orgs.sh — gh api repos (17 owners) |
| Nginx Sites | *live* | services.sh — /etc/nginx/sites-enabled via SSH |
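
The *live* placeholders suggest the table is regenerated from collector output. A minimal sketch of that rendering step follows, assuming each collector yields a (metric, value, source) tuple. The function names are hypothetical and the real KPI scripts may work differently.

```python
def count_lines(text):
    """Count non-empty lines of command output (similar to `wc -l`, but skipping blanks)."""
    return sum(1 for line in text.splitlines() if line.strip())


def render_table(rows):
    """Render (metric, value, source) tuples as a Markdown table string."""
    lines = ["| Metric | Value | Source |", "|--------|-------|--------|"]
    for metric, value, source in rows:
        # A literal pipe inside a cell must be escaped or it splits the column.
        safe_source = source.replace("|", "\\|")
        lines.append(f"| {metric} | {value} | {safe_source} |")
    return "\n".join(lines)


if __name__ == "__main__":
    listing = "deploy\nprobe\naudit\n"  # stand-in for the output of `ls ~/bin`
    print(render_table([("CLI Tools", count_lines(listing), "ls ~/bin | wc -l")]))
```

Escaping the pipe matters for the CLI Tools row in particular, since its source command contains one.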