blackroad-operating-system/DEPLOYMENT_ARCHITECTURE.md
Claude 9a728f655a Prevent BlackRoad-Operating-System monorepo from being added to Railway services
CRITICAL CHANGES:
- Add comprehensive deployment architecture documentation
- Prevent misconfiguration where monorepo is deployed instead of satellites
- Clarify monorepo-to-satellite sync model across all docs

CHANGES:
1. railway.toml
   - Add critical warning banner at top of file
   - Mark config as local development/testing only
   - Explain correct deployment model (satellites, not monorepo)

2. DEPLOYMENT_ARCHITECTURE.md (NEW)
   - Complete 500+ line deployment guide
   - Monorepo vs satellite model explained in detail
   - Critical rules: NEVER add monorepo to Railway
   - Service-to-repository mapping
   - Environment configuration guide
   - Cloudflare DNS configuration
   - Common mistakes and troubleshooting

3. README.md
   - Add prominent deployment warning box
   - Clarify monorepo is source of truth, not deployable
   - List satellite repos that should be deployed
   - Reference DEPLOYMENT_ARCHITECTURE.md

4. CLAUDE.md
   - Add critical deployment model section
   - Clarify Railway deployment is satellite-only
   - Update deployment workflow explanation
   - Add key rules for deployment

5. backend/.env.example
   - Fix ALLOWED_ORIGINS to reference satellites
   - Remove monorepo Railway URL reference
   - Add correct satellite service URLs

6. ops/domains.yaml
   - Fix os.blackroad.systems DNS target
   - Point to blackroad-os-core-production (satellite)
   - Remove incorrect monorepo Railway URL

7. scripts/validate_deployment_config.py (NEW)
   - Automated validation script
   - Checks for monorepo references in configs
   - Validates railway.toml, env files, DNS configs
   - Ensures DEPLOYMENT_ARCHITECTURE.md exists
   - Exit code 0 = pass, 1 = fail

WHY THIS MATTERS:
- Adding monorepo to Railway creates circular deploy loops
- Environment variables break (wrong service URLs)
- Cloudflare routing fails
- Service dependencies misconfigured
- Prevents production outages from misconfiguration

CORRECT MODEL:
- Monorepo = source of truth (orchestration only)
- Satellites = deployable services (Railway deployment)
- Code flows: monorepo → sync → satellite → Railway

See: DEPLOYMENT_ARCHITECTURE.md for complete details
2025-11-19 22:31:22 +00:00


BlackRoad OS Deployment Architecture

Last Updated: 2025-11-19
Status: Canonical deployment model for all BlackRoad OS services


Table of Contents

  1. Overview
  2. The Monorepo vs Satellite Model
  3. Critical Rules
  4. Deployment Topology
  5. Service-to-Repository Mapping
  6. Environment Configuration
  7. Cloudflare DNS Configuration
  8. Common Mistakes to Avoid
  9. Troubleshooting

Overview

BlackRoad OS uses a monorepo-to-satellite sync architecture where:

  • BlackRoad-Operating-System (this repo) = Source of truth, orchestration, sync logic
  • Satellite repos (blackroad-os-core, blackroad-os-api, etc.) = Deployable services

This document establishes the canonical deployment model to prevent misconfiguration.


The Monorepo vs Satellite Model

BlackRoad-Operating-System (Monorepo)

Purpose: Control plane and source of truth

Role:

  • Houses all service code in services/, apps/, docs/
  • Syncs code to satellite repos via GitHub Actions
  • Stores orchestration logic, prompts, and infrastructure configs
  • Serves as the "brain" - NOT the compute

Deployment Status: NEVER DEPLOYED TO PRODUCTION

Why NOT deployable:

  • No single entry point (contains multiple services)
  • Would create circular deployment dependencies
  • Not designed for runtime execution
  • Would break service discovery and routing

Satellite Repositories

Purpose: Deployable, runtime services

Satellites:

| Repository | Purpose | Railway Service | Cloudflare Domain |
|------------|---------|-----------------|-------------------|
| blackroad-os-core | Core API & business logic | blackroad-os-core-production | core.blackroad.systems |
| blackroad-os-api | Public API gateway | blackroad-os-api-production | api.blackroad.systems |
| blackroad-os-operator | Agent runtime & orchestrator | blackroad-os-operator-production | operator.blackroad.systems |
| blackroad-os-prism-console | Status console frontend | blackroad-os-prism-console-production | prism.blackroad.systems |
| blackroad-os-docs | Documentation site | blackroad-os-docs-production | docs.blackroad.systems |
| blackroad-os-web | Public website | blackroad-os-web-production | blackroad.systems |

Deployment Status: EACH DEPLOYED INDEPENDENTLY TO RAILWAY

Sync Process:

  1. Developer edits code in monorepo (e.g., services/core-api/)
  2. GitHub Action syncs changes to satellite (BlackRoad-OS/blackroad-os-core)
  3. Satellite triggers Railway deployment
  4. Railway deploys to production

See: docs/os/monorepo-sync.md for sync details


Critical Rules

NEVER DO THIS

  1. Never add BlackRoad-Operating-System to Railway as a service

    • Not in production environments
    • Not in staging environments
    • Not in development environments (unless explicitly testing locally)
  2. Never reference the monorepo in service configurations

    • Don't add it to env vars (e.g., MONOREPO_URL)
    • Don't add it as a dependency in other services
    • Don't point Cloudflare to the monorepo
  3. Never deploy the monorepo to production

    • It's not designed to run as a service
    • Doing so breaks service discovery and routing and can create circular deploy loops

ALWAYS DO THIS

  1. Deploy ONLY satellite repos to Railway

    • Each satellite = one Railway service
    • Each environment (dev, staging, production) gets its own deployment of that service
  2. Edit code ONLY in the monorepo

    • Satellites are read-only mirrors
    • All changes flow: monorepo → sync → satellite → deploy
  3. Point Cloudflare ONLY to satellite Railway URLs

    • core.blackroad.systems → blackroad-os-core-production.up.railway.app
    • api.blackroad.systems → blackroad-os-api-production.up.railway.app
    • etc.
  4. Create dev, staging, and production environments for ALL satellites

    • Each satellite needs: dev, staging, production
    • Example: blackroad-os-core-dev, blackroad-os-core-staging, blackroad-os-core-production

Deployment Topology

┌─────────────────────────────────────────────────────────────┐
│  BlackRoad-Operating-System (Monorepo)                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ services/    │  │ apps/        │  │ docs/        │      │
│  │ core-api/    │  │ prism-console│  │ site/        │      │
│  │ public-api/  │  │ web/         │  │              │      │
│  │ operator/    │  │              │  │              │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
│         │                  │                  │             │
│         └──────────────────┴──────────────────┘             │
│                     GitHub Actions Sync                     │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Satellite Repositories (Deployable)                         │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐│
│  │ blackroad-os-  │  │ blackroad-os-  │  │ blackroad-os-  ││
│  │ core           │  │ api            │  │ operator       ││
│  └────────────────┘  └────────────────┘  └────────────────┘│
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐│
│  │ blackroad-os-  │  │ blackroad-os-  │  │ blackroad-os-  ││
│  │ prism-console  │  │ web            │  │ docs           ││
│  └────────────────┘  └────────────────┘  └────────────────┘│
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Railway (Deployment Platform)                               │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐│
│  │ core-production│  │ api-production │  │ operator-prod  ││
│  │ .up.railway.app│  │ .up.railway.app│  │ .up.railway.app││
│  └────────────────┘  └────────────────┘  └────────────────┘│
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Cloudflare (DNS & Routing)                                  │
│  core.blackroad.systems  → blackroad-os-core-production      │
│  api.blackroad.systems   → blackroad-os-api-production       │
│  operator.blackroad.systems → blackroad-os-operator-production│
└─────────────────────────────────────────────────────────────┘

Service-to-Repository Mapping

Canonical Mapping

Defined in infra/github/sync-config.yml:

services:
  core-api:
    monorepo_path: "services/core-api"
    target_repo: "BlackRoad-OS/blackroad-os-core"
    target_branch: "main"
  public-api:
    monorepo_path: "services/public-api"
    target_repo: "BlackRoad-OS/blackroad-os-api"
    target_branch: "main"
  operator:
    monorepo_path: "services/operator"
    target_repo: "BlackRoad-OS/blackroad-os-operator"
    target_branch: "main"

apps:
  prism-console:
    monorepo_path: "apps/prism-console"
    target_repo: "BlackRoad-OS/blackroad-os-prism-console"
    target_branch: "main"
  web:
    monorepo_path: "apps/web"
    target_repo: "BlackRoad-OS/blackroad-os-web"
    target_branch: "main"

docs:
  site:
    monorepo_path: "docs/site"
    target_repo: "BlackRoad-OS/blackroad-os-docs"
    target_branch: "main"

Development Workflow

  1. Edit in monorepo: services/core-api/app/main.py
  2. Commit to monorepo: git commit -m "Add health endpoint"
  3. Push to monorepo: git push origin main
  4. GitHub Action syncs: Copies services/core-api/ → BlackRoad-OS/blackroad-os-core
  5. Railway deploys: Deploys blackroad-os-core to production
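
The mapping and workflow above can be sketched in a few lines of Python. This is only an illustration of how a changed monorepo path resolves to its satellite; the dictionary is copied by hand from the sync-config.yml excerpt, and the names SYNC_MAP and target_repo_for are hypothetical, not part of the actual sync workflow.

```python
from typing import Optional

# Hand-copied from the sync-config.yml excerpt above (monorepo_path -> target_repo).
# SYNC_MAP is a hypothetical name used only for this sketch.
SYNC_MAP = {
    "services/core-api": "BlackRoad-OS/blackroad-os-core",
    "services/public-api": "BlackRoad-OS/blackroad-os-api",
    "services/operator": "BlackRoad-OS/blackroad-os-operator",
    "apps/prism-console": "BlackRoad-OS/blackroad-os-prism-console",
    "apps/web": "BlackRoad-OS/blackroad-os-web",
    "docs/site": "BlackRoad-OS/blackroad-os-docs",
}


def target_repo_for(changed_path: str) -> Optional[str]:
    """Return the satellite repo a changed monorepo path syncs to, or None."""
    normalized = changed_path.strip("/")
    for monorepo_path, repo in SYNC_MAP.items():
        # Match the directory itself or anything beneath it, but not siblings
        # that merely share a prefix (e.g. services/core-api-extra).
        if normalized == monorepo_path or normalized.startswith(monorepo_path + "/"):
            return repo
    return None


if __name__ == "__main__":
    print(target_repo_for("services/core-api/app/main.py"))
```

A path outside every mapped directory (for example README.md) resolves to None, which matches the rule that only mapped service directories reach a satellite.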

Environment Configuration

Required Railway Environments

Each satellite repository needs THREE environments:

  1. Development (-dev)

    • Connected to: dev or develop branch
    • Example: blackroad-os-core-dev.up.railway.app
  2. Staging (-staging)

    • Connected to: staging branch
    • Example: blackroad-os-core-staging.up.railway.app
  3. Production (-production)

    • Connected to: main branch
    • Example: blackroad-os-core-production.up.railway.app
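
As a rough check of the naming convention above, the expected Railway hostnames can be generated from the satellite list. This sketch assumes every satellite follows the same <repo>-<environment>.up.railway.app pattern shown in the examples; the function names are hypothetical.

```python
from typing import Dict, List

# Satellite names from the table in "Satellite Repositories".
SATELLITES = [
    "blackroad-os-core",
    "blackroad-os-api",
    "blackroad-os-operator",
    "blackroad-os-prism-console",
    "blackroad-os-docs",
    "blackroad-os-web",
]
ENVIRONMENTS = ["dev", "staging", "production"]


def railway_hostname(satellite: str, environment: str) -> str:
    """Build the hostname for one satellite in one environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    return f"{satellite}-{environment}.up.railway.app"


def expected_hostnames() -> Dict[str, List[str]]:
    """Map each satellite to the three hostnames it should have."""
    return {s: [railway_hostname(s, e) for e in ENVIRONMENTS] for s in SATELLITES}


if __name__ == "__main__":
    for satellite, hosts in expected_hostnames().items():
        print(satellite, hosts)
```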

Environment Variables by Service

Core API (blackroad-os-core)

# Database
DATABASE_URL=postgresql+asyncpg://...
REDIS_URL=redis://...

# Auth
SECRET_KEY=...
ACCESS_TOKEN_EXPIRE_MINUTES=30

# Service URLs
API_URL=https://api.blackroad.systems
OPERATOR_URL=https://operator.blackroad.systems
PRISM_URL=https://prism.blackroad.systems

# Environment
ENVIRONMENT=production
DEBUG=False

Public API (blackroad-os-api)

# Core API reference
CORE_API_URL=https://core.blackroad.systems

# Auth
JWT_SECRET=...

# Environment
ENVIRONMENT=production

Operator (blackroad-os-operator)

# Core API reference
CORE_API_URL=https://core.blackroad.systems
API_URL=https://api.blackroad.systems

# GitHub
GITHUB_TOKEN=...
GITHUB_APP_ID=...

# Environment
ENVIRONMENT=production

Prism Console (blackroad-os-prism-console)

# Frontend env vars (Next.js)
NEXT_PUBLIC_CORE_API_URL=https://core.blackroad.systems
NEXT_PUBLIC_API_URL=https://api.blackroad.systems
NEXT_PUBLIC_OPERATOR_URL=https://operator.blackroad.systems

NEVER Add These Vars

# ❌ DO NOT ADD THESE - THEY WILL BREAK EVERYTHING
MONOREPO_URL=https://github.com/blackboxprogramming/BlackRoad-Operating-System
BLACKROAD_OS_REPO=BlackRoad-Operating-System
SOURCE_REPO=BlackRoad-Operating-System

Why? These vars reference the non-deployable monorepo, creating confusion and breaking service discovery.
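
A check for these forbidden variables might look like the sketch below. It is not the contents of scripts/validate_deployment_config.py; it only assumes that a monorepo reference can be recognised by "monorepo" in the variable name or "BlackRoad-Operating-System" in the value, and the helper names are hypothetical.

```python
from typing import Dict, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        env[name.strip()] = value.strip()
    return env


def find_monorepo_references(env: Dict[str, str]) -> List[str]:
    """Return the names of variables that appear to reference the monorepo."""
    offenders = []
    for name, value in env.items():
        # Assumed heuristics: the name mentions "monorepo" or the value names the repo.
        if "monorepo" in name.lower() or "blackroad-operating-system" in value.lower():
            offenders.append(name)
    return sorted(offenders)


if __name__ == "__main__":
    sample = "API_URL=https://api.blackroad.systems\nSOURCE_REPO=BlackRoad-Operating-System\n"
    print(find_monorepo_references(parse_env_text(sample)))
```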


Cloudflare DNS Configuration

Canonical DNS Records

| Subdomain | Type | Target | Proxied |
|-----------|------|--------|---------|
| blackroad.systems | CNAME | blackroad-os-web-production.up.railway.app | Yes |
| core.blackroad.systems | CNAME | blackroad-os-core-production.up.railway.app | Yes |
| api.blackroad.systems | CNAME | blackroad-os-api-production.up.railway.app | Yes |
| operator.blackroad.systems | CNAME | blackroad-os-operator-production.up.railway.app | Yes |
| prism.blackroad.systems | CNAME | blackroad-os-prism-console-production.up.railway.app | Yes |
| docs.blackroad.systems | CNAME | blackroad-os-docs-production.up.railway.app | Yes |
| os.blackroad.systems | CNAME | prism.blackroad.systems | Yes |

NEVER Point Cloudflare To

# ❌ WRONG - Monorepo is not deployed
core.blackroad.systems → blackroad-operating-system.up.railway.app

# ❌ WRONG - Monorepo doesn't exist on Railway
api.blackroad.systems → blackroad-os-monorepo-production.up.railway.app

See: CLOUDFLARE_DNS_BLUEPRINT.md for complete DNS configuration
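
The DNS rules in this section can be expressed as a small validation sketch. It assumes a record is acceptable when its target is a satellite Railway hostname or an alias to another record in the same set (as with os.blackroad.systems); the regular expression and function name are illustrative, not taken from the repository.

```python
import re
from typing import Dict, List

# Accepts only satellite hostnames of the form
# blackroad-os-<satellite>-<environment>.up.railway.app
SATELLITE_TARGET = re.compile(
    r"^blackroad-os-(core|api|operator|prism-console|docs|web)"
    r"-(dev|staging|production)\.up\.railway\.app$"
)


def invalid_dns_targets(records: Dict[str, str]) -> List[str]:
    """Return hostnames whose CNAME target is neither a satellite nor a known alias."""
    bad = []
    for host, target in records.items():
        if SATELLITE_TARGET.match(target):
            continue
        # Allow aliases to another record in the set (e.g. os -> prism).
        if target in records and target != host:
            continue
        bad.append(host)
    return bad


if __name__ == "__main__":
    print(invalid_dns_targets({
        "core.blackroad.systems": "blackroad-operating-system.up.railway.app",
    }))
```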


Common Mistakes to Avoid

Mistake #1: Adding Monorepo to Railway

Problem:

# ❌ Creating Railway service for monorepo
railway link blackboxprogramming/BlackRoad-Operating-System
railway up

Why it's wrong:

  • Monorepo contains multiple services (not a single deployable)
  • No single entry point or health check
  • Will create circular dependencies

Solution:

# ✅ Deploy satellite repos instead
railway link BlackRoad-OS/blackroad-os-core
railway up

Mistake #2: Adding Monorepo to Service Env Vars

Problem:

# ❌ In blackroad-os-core environment
MONOREPO_URL=https://github.com/blackboxprogramming/BlackRoad-Operating-System

Why it's wrong:

  • Services should reference other services, not the source repo
  • Creates confusion between source code and runtime services

Solution:

# ✅ Reference other services instead
API_URL=https://api.blackroad.systems
OPERATOR_URL=https://operator.blackroad.systems

Mistake #3: Pointing Cloudflare to Monorepo

Problem:

# ❌ Cloudflare DNS
core.blackroad.systems → blackroad-operating-system.up.railway.app

Why it's wrong:

  • Monorepo is not deployed to Railway
  • URL doesn't exist
  • Will result in DNS failures

Solution:

# ✅ Point to satellite Railway URL
core.blackroad.systems → blackroad-os-core-production.up.railway.app

Mistake #4: Editing Satellite Repos Directly

Problem:

# ❌ Editing blackroad-os-core directly
cd blackroad-os-core
git commit -m "Fix bug"
git push

Why it's wrong:

  • Satellites are read-only mirrors
  • Changes will be overwritten by next sync
  • Creates drift between monorepo and satellite

Solution:

# ✅ Edit in monorepo
cd BlackRoad-Operating-System/services/core-api
# Make changes
git commit -m "Fix bug"
git push
# Sync workflow automatically updates satellite

Troubleshooting

Issue: "Service not deploying after code change"

Diagnosis:

  1. Check if code was edited in monorepo (not satellite)
  2. Verify sync workflow ran successfully
  3. Check Railway deployment logs

Solution:

# 1. Verify sync workflow
gh workflow view sync-core-api
gh run list --workflow=sync-core-api

# 2. Manually trigger sync if needed
gh workflow run sync-core-api

# 3. Check Railway deployment
railway logs -s blackroad-os-core-production

Issue: "Environment variables not resolving correctly"

Diagnosis:

  1. Check if vars reference monorepo (wrong)
  2. Verify service-to-service URLs are correct
  3. Confirm Railway environment is set up properly

Solution:

# 1. List current env vars
railway variables

# 2. Remove monorepo references
railway variables delete MONOREPO_URL

# 3. Add correct service URLs
railway variables set CORE_API_URL=https://core.blackroad.systems

Issue: "Cloudflare returning 522 errors"

Diagnosis:

  1. Check if DNS points to correct Railway URL
  2. Verify Railway service is running
  3. Confirm health check is passing

Solution:

# 1. Check Railway service status
railway status

# 2. Verify health endpoint
curl https://blackroad-os-core-production.up.railway.app/health

# 3. Update Cloudflare DNS if needed
# (via Cloudflare dashboard: DNS → Edit CNAME)
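
The same diagnosis can be scripted. The sketch below checks the Railway origin first and the Cloudflare-fronted domain second; it assumes the /health path is also reachable through the public domain, and the result labels and function names are made up for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url; connection failures still raise URLError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (including Cloudflare's 522) arrive as HTTPError.
        return exc.code


def diagnose(domain: str, railway_host: str,
             fetch: Callable[[str], int] = http_status) -> str:
    """Classify a failure as an origin problem or a DNS/proxy problem."""
    if fetch(f"https://{railway_host}/health") != 200:
        return "origin-unhealthy"
    edge = fetch(f"https://{domain}/health")
    if edge == 522:
        return "dns-or-proxy-misconfigured"
    return "ok" if edge == 200 else f"edge-status-{edge}"


if __name__ == "__main__":
    print(diagnose("core.blackroad.systems",
                   "blackroad-os-core-production.up.railway.app"))
```

Passing fetch as a parameter keeps the logic testable without network access: a stub that returns fixed status codes is enough to exercise each branch.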

Issue: "Circular deployment loops"

Diagnosis:

  • Monorepo may be configured as a dependency
  • Service may be triggering its own deployment

Solution:

  1. Remove monorepo from Railway services
  2. Ensure satellites deploy independently
  3. Check GitHub Actions triggers

Summary

The Golden Rules

  1. Monorepo = Source of Truth (not deployed)
  2. Satellites = Deployable Services (deployed to Railway)
  3. Edit in monorepo → Sync to satellites → Deploy automatically
  4. Never add monorepo to Railway or service configurations
  5. Cloudflare points to satellites, not monorepo

Quick Reference

When you want to...

  • Deploy a service: Use satellite repo (blackroad-os-core), not monorepo
  • Edit code: Edit in monorepo (services/core-api), not satellite
  • Configure DNS: Point to satellite Railway URL, not monorepo
  • Set env vars: Reference other services, not monorepo
  • Create new service: Add to monorepo, create sync workflow, deploy satellite

For questions or issues, see:

  • docs/os/monorepo-sync.md - Sync process details
  • CLOUDFLARE_DNS_BLUEPRINT.md - DNS configuration
  • infra/railway/ENVIRONMENT_GUIDE.md - Railway setup
  • CLAUDE.md - Development guide

Last updated: 2025-11-19. This document is canonical for all BlackRoad OS deployments.