feat: Add domain architecture and extract core services from Prism Console

## Domain Architecture
- Complete domain-to-service mapping for 16 verified domains
- Subdomain architecture for blackroad.systems and blackroad.io
- GitHub organization mapping (BlackRoad-OS repos)
- Railway service-to-domain configuration
- DNS configuration templates for Cloudflare

## Extracted Services

### AIops Service (services/aiops/)
- Canary analysis for deployment validation
- Config drift detection
- Event correlation engine
- Auto-remediation with runbook mapping
- SLO budget management

### Analytics Service (services/analytics/)
- Rule-based anomaly detection with safe expression evaluation
- Cohort analysis with multi-metric aggregation
- Decision engine with credit budget constraints
- Narrative report generation

### Codex Governance (services/codex/)
- 82+ governance principles (entries)
- Codex Pantheon with 48+ agent archetypes
- Manifesto defining ethical framework

## Integration Points
- AIops → infra.blackroad.systems (blackroad-os-infra)
- Analytics → core.blackroad.systems (blackroad-os-core)
- Codex → operator.blackroad.systems (blackroad-os-operator)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Alexa Louise
2025-11-29 13:39:08 -06:00
parent ff692f9a37
commit 9644737ba7
109 changed files with 4891 additions and 0 deletions

@@ -0,0 +1,27 @@
# Codex 18 — Byzantine Agreement — Keep Consensus Under Attack
**Fingerprint:** `23064887b1469b19fa562e8afdee5e9046bedf99aa9cd7142c35e38f91e6fef2`
## Aim
Reach agreement among distributed replicas even when some behave maliciously.
## Core
- With \(3f + 1\) replicas, tolerate up to \(f\) Byzantine faults without violating safety.
- Guarantee safety (no two honest replicas ever commit different values) and maintain liveness under partial synchrony.
- Target commit latency of roughly two to three rounds, as in PBFT or HotStuff families.
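The fault bound above can be checked with a little arithmetic. The following is a minimal sketch, not part of the codex itself: the function names `max_faults` and `quorum_size` are hypothetical, and the quorum formula is the standard one that forces any two quorums to overlap in at least \(f + 1\) replicas, so at least one honest replica sits in every intersection.

```python
def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (hypothetical helper name)."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest q with 2q - n >= f + 1, i.e. q = ceil((n + f + 1) / 2).

    For n = 3f + 1 this reduces to the familiar 2f + 1.
    """
    f = max_faults(n)
    return (n + f + 2) // 2  # integer form of ceil((n + f + 1) / 2)


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(n, max_faults(n), quorum_size(n))
```

With four replicas the cluster tolerates one Byzantine fault and needs three matching votes; with seven it tolerates two and needs five.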
## Runbook
1. Deploy \(N = 3f + 1\) replicas with quorums of \(2f + 1\), randomize leaders, and rotate keys to limit targeted attacks.
2. Gossip using authenticated channels, checkpoint state, and maintain view-change proofs.
3. Rate-limit clients and detect equivocation via signed message logs.
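One way to read the equivocation step is: a replica equivocates when it authenticates two different values for the same view and sequence number. The sketch below illustrates that check under loud assumptions. The message fields, the `EquivocationDetector` class, and the use of HMAC as a stand-in for real digital signatures are all hypothetical; a production system would likely use asymmetric signatures so that the conflicting pair is transferable proof.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SignedMessage:
    sender: str
    view: int
    seq: int
    digest: str  # SHA-256 hex digest of the proposed value
    tag: bytes   # authentication tag over (sender, view, seq, digest)


def _payload(sender: str, view: int, seq: int, digest: str) -> bytes:
    return f"{sender}|{view}|{seq}|{digest}".encode()


def sign(key: bytes, sender: str, view: int, seq: int, value: bytes) -> SignedMessage:
    # HMAC is an assumption standing in for a real signature scheme.
    digest = hashlib.sha256(value).hexdigest()
    tag = hmac.new(key, _payload(sender, view, seq, digest), hashlib.sha256).digest()
    return SignedMessage(sender, view, seq, digest, tag)


def verify(key: bytes, msg: SignedMessage) -> bool:
    expected = hmac.new(
        key, _payload(msg.sender, msg.view, msg.seq, msg.digest), hashlib.sha256
    ).digest()
    return hmac.compare_digest(expected, msg.tag)


class EquivocationDetector:
    """Keeps the first authenticated message per (sender, view, seq) slot."""

    def __init__(self, keys: Dict[str, bytes]) -> None:
        self._keys = keys
        self._log: Dict[Tuple[str, int, int], SignedMessage] = {}

    def observe(self, msg: SignedMessage) -> Optional[Tuple[SignedMessage, SignedMessage]]:
        key = self._keys.get(msg.sender)
        if key is None or not verify(key, msg):
            return None  # unauthenticated messages prove nothing, so ignore them
        slot = (msg.sender, msg.view, msg.seq)
        first = self._log.setdefault(slot, msg)
        if first.digest != msg.digest:
            return (first, msg)  # conflicting pair is the evidence
        return None
```

Returning the conflicting pair rather than a boolean keeps the evidence available for the escalation path described under Failsafes.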
## Telemetry
- Commit rate and consensus latency.
- Frequency of view changes and estimated fault counts.
- Network health metrics affecting synchrony assumptions.
## Failsafes
- If liveness degrades, shrink to a smaller honest core or enter read-only mode.
- Escalate to manual intervention when equivocation exceeds thresholds or key compromise is suspected.
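The failsafes above could be wired to telemetry roughly as follows. This is a sketch under stated assumptions: the `Health` fields, the `Mode` names, and both threshold constants are hypothetical placeholders, and the codex does not specify actual values. Shrinking to a smaller honest core is left out because it requires a reconfiguration protocol beyond a simple threshold check.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    READ_ONLY = "read_only"
    MANUAL_ESCALATION = "manual_escalation"


@dataclass(frozen=True)
class Health:
    view_changes_per_minute: float
    equivocations_last_hour: int
    key_compromise_suspected: bool


# Hypothetical thresholds; tune against real telemetry.
MAX_VIEW_CHANGES_PER_MINUTE = 5.0
MAX_EQUIVOCATIONS_PER_HOUR = 3


def choose_mode(h: Health) -> Mode:
    # Security signals take priority over liveness signals.
    if h.key_compromise_suspected or h.equivocations_last_hour > MAX_EQUIVOCATIONS_PER_HOUR:
        return Mode.MANUAL_ESCALATION
    # Frequent view changes are used here as a proxy for degraded liveness.
    if h.view_changes_per_minute > MAX_VIEW_CHANGES_PER_MINUTE:
        return Mode.READ_ONLY
    return Mode.NORMAL
```

Checking the escalation condition first means a suspected key compromise is never masked by a read-only fallback.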
**Tagline:** Agreement even when the room lies.