kpi: auto-update metrics 2026-03-13

RoadChain-SHA2048: c645c1292ab1555e
RoadChain-Identity: alexa@sovereign
RoadChain-Full: c645c1292ab1555ebe6982915536d1c94701ff6bb16c20ed6ef4144eb50c9f984b4bfe5b9902109e8defd958d6be43ced8ec11cf95d6241536cd4da0b75f8fb48cbeb1b9f450c8f665b73d39e837d23e73e2ba4201af4dc40c02a34283efb04b39c612083465536f194f16adfadb1b56f714a65b918f40750f54eebf7724236861de173ec31963ff3b1b988d712be7e5acc3fe391eb804d3fdcfb9ccf77afc732660d23fff801f894318327eabf775eb4f4e67f7f22d07f23b0e17f6594cfe95b83b275fb7baaa97115e86562604fc5b47cc8024574b61396924e0ee2b7e454b0a1480c3076c7ad72408ceb4a75360d2d49c7d805c37ac5315af00e4a8ca2262
2026-03-13 23:16:12 -05:00
parent 0c714c106c
commit ec7b1445b5
25 changed files with 815 additions and 1112 deletions


@@ -8,64 +8,43 @@ amundsonalexa@gmail.com | [github.com/blackboxprogramming](https://github.com/bl
## Summary
Security engineer who identified and remediated malware, credential leaks, and misconfigurations across a 7-node distributed fleet. Implements zero-trust networking via Cloudflare tunnels, WireGuard encryption, firewall policies, and credential management across 256 managed services.
Found a crypto miner, a cron dropper, and a leaked PAT in my own infrastructure. Cleaned all of it, rotated credentials fleet-wide, and rebuilt security from zero-trust architecture up — because the hardest incidents are the ones inside your own network.
---
## Experience
### BlackRoad OS | Founder & Security Lead | 2025–Present
### BlackRoad OS | Founder & Security Engineer | 2025–Present
**Incident Response**
- Discovered and removed obfuscated cron dropper executing from /tmp/op.py (Cecilia)
- Identified leaked GitHub PAT (gho_Gfu...) in Lucidia service file, initiated rotation
- Found and investigated xmrig crypto miner service configuration on Lucidia
- Migrated credentials from plaintext crontabs to secured env files (chmod 600) fleet-wide
**The Incidents: What I Found and How I Fixed It**
- Obfuscated cron dropper on Cecilia — exec'ing from /tmp/op.py every 5 minutes. Traced it, removed the cron entry, cleaned /tmp, audited all nodes
- xmrig crypto miner service configured on Lucidia — unit file referencing mining pool. Service removed, system audited for persistence mechanisms
- Leaked GitHub PAT (gho_Gfu...) embedded in a systemd service file on Lucidia — removed from config, token revoked on GitHub, all secrets migrated to chmod 600 env files
- 50+ SSH authorized keys on some nodes — audited every key, identified which ones are active, locked down access paths
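
The "chmod 600 env files" step above can be checked mechanically. The following is a minimal sketch, not the fleet's actual tooling: the function names `is_owner_only` and `find_loose_env_files` are hypothetical, and it assumes the env files sit at paths the caller already knows.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """Return True when group and other have no permission bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other bits; a 600 file has none of them set.
    return mode & 0o077 == 0


def find_loose_env_files(paths):
    """Return the paths that exist and are accessible to group or other."""
    return [p for p in paths if os.path.exists(p) and not is_owner_only(p)]
```

Running `find_loose_env_files` over the list of env files on each node would flag any file that has drifted away from owner-only permissions.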
**Network Security**
- Zero-trust architecture: all external access through 4 Cloudflare tunnels (no exposed ports)
- WireGuard encryption for all inter-node communication (10.8.0.x mesh)
- UFW firewall with INPUT DROP policy on edge nodes
- Tailscale ACLs for management access (9 peers)
**The Architecture: Trust Nothing by Default**
- Zero open ports — all external access through Cloudflare tunnels. No port forwarding, no exposed SSH, no public APIs
- WireGuard encryption for all inter-node traffic. UFW with INPUT DROP policy on edge nodes. Credential rotation enforced fleet-wide
- GitHub security scanning workflows check for AWS keys, tokens, passwords on every push — catches secrets before they ship
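
The push-time secret scanning described above is not shown in this commit, so the following is only an illustrative sketch of the idea. The rule names and the `scan_text` function are hypothetical, and the three patterns (AWS access key ID prefix, GitHub token prefixes, a naive password assignment) are deliberately small; a real scanner would use a much larger rule set.

```python
import re

# Illustrative patterns only (assumed, not taken from the actual workflows).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[=:]\s*\S+"),
}


def scan_text(text: str):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings
```

A CI step could call `scan_text` on each changed file and fail the push when the returned list is non-empty.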
**Access Management**
- SSH key audit: identified 50+ keys on Alice and Octavia requiring cleanup
- NOPASSWD sudo policies documented across all nodes
- Identified 3 Tailscale ghost nodes (offline 15+ days) for decommissioning
- Per-user cron job audit across all fleet nodes
**Infrastructure Hardening**
- Disabled 16 unused skeleton microservices (freed 800 MB RAM, reduced attack surface)
- Masked crash-looping services (rpi-connect-wayvnc) to prevent service abuse
- Removed overclock settings causing instability
- Secured GitHub relay credentials in ~/.github-relay.env (chmod 600)
**Monitoring & Detection**
- Self-healing autonomy scripts detecting and restarting failed services
- 12 failed systemd units tracked and investigated daily
- Fleet-wide power monitoring detecting anomalous CPU usage
- Daily KPI collection tracking security-relevant metrics
**The Lesson**
- Security isn't a feature you add — it's what you find when you actually look. Every fleet needs an adversarial audit, not just a firewall
---
## Technical Skills
**Security:** Incident response, credential management, malware removal, hardening
**Networking:** WireGuard, Cloudflare Tunnels (zero-trust), UFW, nftables, Tailscale
**Linux:** systemd, SSH, file permissions, audit, service isolation
**Monitoring:** Custom KPI system, anomaly detection, SSH probes
**Tools:** Bash (212 CLI tools), Python, GitHub CLI
incident response, malware analysis, credential rotation, WireGuard, Cloudflare tunnels, UFW, SSH, Linux hardening
---
## Metrics
| Metric | Value |
|--------|-------|
| Incidents remediated | 5+ |
| Services managed | 256 |
| Firewall policies | UFW + nftables |
| VPN tunnels | 4 CF + 7 WG |
| Services disabled | 16+ |
| Credentials rotated | 4+ |
| Fleet nodes secured | 7 |
| Metric | Value | Source |
|--------|-------|--------|
| Failed Units | *live* | services.sh — systemctl --failed via SSH |
| Fleet Nodes | *live* | fleet.sh — SSH probe to all nodes |
| Systemd Services | *live* | services.sh — systemctl list-units via SSH |
| Tailscale Peers | *live* | services.sh — tailscale status via SSH |
| Nginx Sites | *live* | services.sh — /etc/nginx/sites-enabled via SSH |
| Nodes Online | *live* | fleet.sh — SSH probe to all nodes |
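
The "Failed Units" row names `systemctl --failed` over SSH as its source, but `services.sh` itself is not part of this view. As a rough sketch of how such a probe might work, the hypothetical helpers below run the command remotely and parse the unit names. The `--no-legend` and `--plain` flags are added here as an assumption to make the output easier to parse.

```python
import subprocess


def parse_failed_units(output: str):
    """Extract unit names from `systemctl --failed --no-legend --plain` output."""
    units = []
    for line in output.splitlines():
        fields = line.split()
        if fields:
            # With --plain the unit name is the first whitespace-separated field.
            units.append(fields[0])
    return units


def failed_units_on(host: str, timeout: int = 10):
    """Run systemctl on a remote host over SSH and return failed unit names."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host,
         "systemctl", "--failed", "--no-legend", "--plain"],
        capture_output=True, text=True, timeout=timeout, check=False,
    )
    if result.returncode == 255:
        # ssh itself exits with 255 on connection errors.
        raise ConnectionError(f"ssh to {host} failed: {result.stderr.strip()}")
    return parse_failed_units(result.stdout)
```

Raising on an SSH failure keeps an unreachable node from being reported as a node with zero failed units.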