This commit introduces an infrastructure overhaul that turns BlackRoad OS
into a distributed operating system with a unified kernel, DNS-aware service
discovery, and standardized syscall APIs.
## New Infrastructure Components
### 1. Kernel Module (kernel/typescript/)
- Complete TypeScript kernel implementation for all services
- Service registry with production and dev DNS mappings
- RPC client for inter-service communication
- Event bus, job queue, state management
- Structured logging with log levels
- Full type safety with TypeScript
Modules:
- types.ts: Complete type definitions
- serviceRegistry.ts: DNS-aware service discovery
- identity.ts: Service identity and metadata
- config.ts: Environment-aware configuration
- logger.ts: Structured logging
- rpc.ts: Inter-service RPC client
- events.ts: Event bus (pub/sub)
- jobs.ts: Background job queue
- state.ts: Key-value state management
- index.ts: Main exports
### 2. DNS Infrastructure Documentation (infra/DNS.md)
- Complete Cloudflare DNS mapping
- Railway production and dev endpoints
- Email configuration (MX, SPF, DKIM, DMARC)
- SSL/TLS, security, and monitoring settings
- Service-to-domain mapping
- Health check configuration
Production Services:
- operator.blackroad.systems
- core.blackroad.systems
- api.blackroad.systems
- console.blackroad.systems
- docs.blackroad.systems
- web.blackroad.systems
- os.blackroad.systems
- app.blackroad.systems
### 3. Service Registry & Architecture (INFRASTRUCTURE.md)
- Canonical service registry with all endpoints
- Monorepo-to-satellite deployment model
- Service-as-process architecture
- DNS-as-filesystem model
- Inter-service communication patterns
- Service lifecycle management
- Complete environment variable documentation
### 4. Syscall API Specification (SYSCALL_API.md)
- Standard kernel API for all services
- Required syscalls: health, version, identity, RPC
- Optional syscalls: logging, metrics, events, jobs, state
- Complete API documentation with examples
- Express.js implementation guide
Core Endpoints:
- GET /health
- GET /version
- GET /v1/sys/identity
- GET /v1/sys/health
- POST /v1/sys/rpc
- POST /v1/sys/event
- POST /v1/sys/job
- GET/PUT /v1/sys/state
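As a rough illustration of the read-only core endpoints, the sketch below maps a method and path to a response without depending on any web framework. The `ServiceIdentity` and `SyscallResponse` shapes, the response bodies, and the `handleSyscall` name are assumptions made for this example; SYSCALL_API.md remains the authoritative specification.

```typescript
// Hypothetical shapes for illustration; the real definitions are expected
// to live in kernel/typescript/types.ts.
interface ServiceIdentity {
  name: string;
  version: string;
}

interface SyscallResponse {
  status: number;
  body: Record<string, unknown>;
}

// Route a request to one of the read-only core endpoints listed above.
// Framework-agnostic on purpose: an Express handler could call this and
// forward the result with res.status(...).json(...).
function handleSyscall(
  identity: ServiceIdentity,
  method: string,
  path: string,
): SyscallResponse {
  const route = `${method.toUpperCase()} ${path}`;
  switch (route) {
    case "GET /health":
    case "GET /v1/sys/health":
      return { status: 200, body: { status: "healthy" } };
    case "GET /version":
      return { status: 200, body: { version: identity.version } };
    case "GET /v1/sys/identity":
      return {
        status: 200,
        body: { name: identity.name, version: identity.version },
      };
    default:
      // Write endpoints (rpc, event, job, state) are out of scope here.
      return { status: 404, body: { error: `No syscall for ${route}` } };
  }
}
```

Keeping the routing logic in a pure function like this makes it easy to unit test separately from the HTTP layer.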
### 5. Railway Deployment Guide (docs/RAILWAY_DEPLOYMENT.md)
- Step-by-step deployment instructions
- Environment variable configuration
- Monitoring and health checks
- Troubleshooting guide
- Best practices for Railway deployment
### 6. Atlas Kernel Scaffold Prompt (prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md)
- Complete prompt for generating new services
- Auto-generates full kernel implementation
- Includes all DNS and Railway mappings
- Production-ready output with zero TODOs
### 7. GitHub Workflow Templates (templates/github-workflows/)
- deploy.yml: Railway auto-deployment
- test.yml: Test suite with coverage
- validate-kernel.yml: Kernel validation
- README.md: Template documentation
## Updated Files
### CLAUDE.md
- Added "Kernel Architecture & DNS Infrastructure" section
- Updated Table of Contents
- Added service architecture diagram
- Documented all new infrastructure files
- Updated repository structure with new directories
- Added kernel and infrastructure to critical path files
## Architecture Impact
This update establishes BlackRoad OS as a distributed operating system where:
- Each Railway service = OS process
- Each Cloudflare domain = mount point
- All services communicate via syscalls
- Unified kernel ensures interoperability
- Service discovery is DNS-aware
- Production and development environments are both supported
## Service Discovery
Services can now discover and call each other:
```typescript
import { rpc } from './kernel';
const user = await rpc.call('core', 'getUserById', { id: 123 });
```
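One way a call like the one above might translate into an HTTP request is sketched below. It assumes the RPC client posts to the target service's `/v1/sys/rpc` endpoint over Railway's internal network; the `{ method, params }` body shape, the `http://` scheme, and the `buildRpcRequest` helper are assumptions for illustration, not the actual contents of rpc.ts.

```typescript
// Internal Railway hosts as listed under "DNS Mappings" in this commit.
const INTERNAL_HOSTS: Record<string, string> = {
  operator: "blackroad-os-operator.railway.internal:8001",
  core: "blackroad-os-core.railway.internal:8000",
  api: "blackroad-os-api.railway.internal:8000",
};

interface RpcRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the HTTP request for an inter-service RPC call.
// The payload shape is a guess; check SYSCALL_API.md for the real contract.
function buildRpcRequest(
  service: string,
  method: string,
  params: unknown,
): RpcRequest {
  const host = INTERNAL_HOSTS[service];
  if (host === undefined) {
    throw new Error(`Unknown service: ${service}`);
  }
  return {
    url: `http://${host}/v1/sys/rpc`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ method, params }),
  };
}
```

Separating request construction from sending keeps the discovery logic testable without a network.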
## DNS Mappings
Production:
- operator.blackroad.systems → blackroad-os-operator-production-3983.up.railway.app
- core.blackroad.systems → 9gw4d0h2.up.railway.app
- api.blackroad.systems → ac7bx15h.up.railway.app
Internal (Railway):
- blackroad-os-operator.railway.internal:8001
- blackroad-os-core.railway.internal:8000
- blackroad-os-api.railway.internal:8000
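The mappings above suggest a registry that resolves a service name to either its public Cloudflare domain or its internal Railway host. The following is a minimal sketch under that assumption; the `ServiceEntry` shape, the `resolveServiceUrl` name, and the choice of `https` for public and `http` for internal URLs are illustrative and may differ from serviceRegistry.ts.

```typescript
type Scope = "public" | "internal";

interface ServiceEntry {
  publicDomain: string; // Cloudflare domain
  internalHost: string; // Railway private-network host:port
}

// Entries copied from the production and internal mappings listed above.
const REGISTRY: Record<string, ServiceEntry> = {
  operator: {
    publicDomain: "operator.blackroad.systems",
    internalHost: "blackroad-os-operator.railway.internal:8001",
  },
  core: {
    publicDomain: "core.blackroad.systems",
    internalHost: "blackroad-os-core.railway.internal:8000",
  },
  api: {
    publicDomain: "api.blackroad.systems",
    internalHost: "blackroad-os-api.railway.internal:8000",
  },
};

// Resolve a service name to a base URL for the requested scope.
function resolveServiceUrl(name: string, scope: Scope): string {
  const entry = REGISTRY[name];
  if (entry === undefined) {
    throw new Error(`Service not registered: ${name}`);
  }
  return scope === "public"
    ? `https://${entry.publicDomain}`
    : `http://${entry.internalHost}`;
}
```

For example, `resolveServiceUrl("operator", "internal")` yields the private-network address, which service-to-service calls would likely prefer over the public domain.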
## Next Steps
1. Sync kernel to satellite repos
2. Implement syscall endpoints in all services
3. Update services to use RPC for inter-service calls
4. Configure Cloudflare health checks
5. Deploy updated services to Railway
---
Files Added:
- INFRASTRUCTURE.md
- SYSCALL_API.md
- infra/DNS.md
- docs/RAILWAY_DEPLOYMENT.md
- kernel/typescript/* (9 modules + README)
- prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md
- templates/github-workflows/* (4 files)
Files Modified:
- CLAUDE.md
Total: 22 new files, 1 updated file
# GitHub Workflows Templates

This directory contains reusable GitHub Actions workflow templates for BlackRoad OS satellite repositories.

## Available Templates

### 1. deploy.yml - Railway Deployment
Purpose: Automatically deploy to Railway on push to main
Features:
- Runs tests before deployment
- Deploys to Railway
- Health check after deployment
- Notifications on success/failure
Setup:

    # 1. Copy to satellite repo
    cp templates/github-workflows/deploy.yml .github/workflows/

    # 2. Replace placeholders
    #    - {SERVICE_NAME}: e.g., "core", "api", "operator"
    #    - {DOMAIN}: e.g., "core", "api", "operator"

    # 3. Add Railway token to GitHub secrets
    gh secret set RAILWAY_TOKEN --body "your-railway-token"

    # 4. Commit and push
    git add .github/workflows/deploy.yml
    git commit -m "Add Railway deployment workflow"
    git push
### 2. test.yml - Test Suite
Purpose: Run tests on every push and pull request
Features:
- Runs linting
- Type checking
- Unit tests with coverage
- Build validation
- Environment template validation
Setup:

    # Copy to satellite repo
    cp templates/github-workflows/test.yml .github/workflows/

    # No customization needed
    git add .github/workflows/test.yml
    git commit -m "Add test workflow"
    git push
### 3. validate-kernel.yml - Kernel Validation

Purpose: Ensure the service correctly implements the BlackRoad OS kernel
Features:
- Validates kernel directory structure
- Checks for required kernel modules
- Verifies syscall endpoints
- Validates railway.json
- Checks documentation
Setup:

    # Copy to satellite repo
    cp templates/github-workflows/validate-kernel.yml .github/workflows/

    # No customization needed
    git add .github/workflows/validate-kernel.yml
    git commit -m "Add kernel validation workflow"
    git push
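The "required kernel modules" check could look roughly like the sketch below. The module names are taken from the kernel module list in the commit summary; treating all ten as required, and the `findMissingKernelModules` helper itself, are assumptions, since the actual rules live in validate-kernel.yml.

```typescript
// Module names from kernel/typescript/ as listed in the commit summary.
// Whether every one of them is mandatory is an assumption.
const REQUIRED_KERNEL_MODULES: readonly string[] = [
  "types.ts",
  "serviceRegistry.ts",
  "identity.ts",
  "config.ts",
  "logger.ts",
  "rpc.ts",
  "events.ts",
  "jobs.ts",
  "state.ts",
  "index.ts",
];

// Return the required modules that are absent from a directory listing.
function findMissingKernelModules(presentFiles: readonly string[]): string[] {
  const present = new Set(presentFiles);
  return REQUIRED_KERNEL_MODULES.filter((name) => !present.has(name));
}
```

A CI step could fail the build whenever the returned array is non-empty and print the missing names.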
## Quick Start (New Satellite Repo)

To set up all workflows for a new satellite repo:

    # 1. Clone satellite repo
    git clone https://github.com/BlackRoad-OS/blackroad-os-{service}
    cd blackroad-os-{service}

    # 2. Copy all workflows
    mkdir -p .github/workflows
    cp /path/to/monorepo/templates/github-workflows/*.yml .github/workflows/

    # 3. Customize deploy.yml
    sed -i 's/{SERVICE_NAME}/core/g' .github/workflows/deploy.yml
    sed -i 's/{DOMAIN}/core/g' .github/workflows/deploy.yml

    # 4. Add Railway token secret
    gh secret set RAILWAY_TOKEN --body "$(railway token)"

    # 5. Commit and push
    git add .github/workflows/
    git commit -m "Add GitHub Actions workflows"
    git push
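For teams that script repo setup instead of running `sed` by hand, the placeholder substitution in step 3 can be sketched as a small function. The `fillPlaceholders` name and the rule that placeholders are upper-case names in single braces are assumptions based on the `{SERVICE_NAME}` and `{DOMAIN}` examples above.

```typescript
// Replace {UPPER_CASE} placeholders with the supplied values.
// Unknown placeholders are left untouched so they remain easy to spot.
function fillPlaceholders(
  template: string,
  values: Record<string, string>,
): string {
  return template.replace(/\{([A-Z_]+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match,
  );
}
```

Because the pattern only matches upper-case names directly wrapped in single braces, GitHub Actions expressions such as `${{ secrets.RAILWAY_TOKEN }}` pass through unchanged.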
## Required GitHub Secrets

### For deploy.yml

| Secret | Description | How to Get |
|---|---|---|
| `RAILWAY_TOKEN` | Railway API token | Run `railway token` in CLI |

### Optional Secrets

| Secret | Description | When Needed |
|---|---|---|
| `CODECOV_TOKEN` | Codecov API token | For code coverage reporting |
| `SLACK_WEBHOOK_URL` | Slack webhook for notifications | For Slack alerts |
## Workflow Triggers

### deploy.yml

- Trigger: Push to `main` branch
- Manual: Via `workflow_dispatch`

### test.yml

- Trigger: Push to any branch
- Trigger: Pull request to `main` or `develop`

### validate-kernel.yml

- Trigger: Push to `main` or `develop`
- Trigger: Pull request to `main`
## Customization Guide

### Adding Custom Test Steps

Edit test.yml:

    - name: Run custom tests
      run: npm run test:custom

    - name: Integration tests
      run: npm run test:integration
      env:
        DATABASE_URL: postgresql://test:test@localhost:5432/test
### Adding Deployment Environments

Edit deploy.yml:

    jobs:
      deploy-staging:
        if: github.ref == 'refs/heads/develop'
        steps:
          - name: Deploy to staging
            run: railway up --environment staging

      deploy-production:
        if: github.ref == 'refs/heads/main'
        steps:
          - name: Deploy to production
            run: railway up --environment production
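The branch-to-environment rule encoded by the two `if:` conditions can also be expressed as a small lookup, which may help if deployment scripts need the same decision outside of GitHub Actions. The `environmentForRef` name and the `null` result for other branches are assumptions for this sketch.

```typescript
type DeployEnvironment = "staging" | "production";

// Mirror the workflow conditions: develop deploys to staging,
// main deploys to production, and any other ref does not deploy.
function environmentForRef(ref: string): DeployEnvironment | null {
  switch (ref) {
    case "refs/heads/develop":
      return "staging";
    case "refs/heads/main":
      return "production";
    default:
      return null;
  }
}
```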
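### Adding Slack Notifications

Add to deploy.yml (`8398a7/action-slack@v3` reads the webhook from the `SLACK_WEBHOOK_URL` environment variable rather than a `with` input):

    - name: Notify Slack
      if: always()
      uses: 8398a7/action-slack@v3
      with:
        status: ${{ job.status }}
        text: 'Deployment to production: ${{ job.status }}'
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}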
## Troubleshooting

### Workflow Not Running

Cause: Workflow file not in correct location

Solution:

    # Ensure files are in .github/workflows/
    ls -la .github/workflows/
### Deployment Fails with 401 Error

Cause: Invalid or missing `RAILWAY_TOKEN`

Solution:

    # Regenerate Railway token
    railway token

    # Update GitHub secret
    gh secret set RAILWAY_TOKEN --body "new-token"
### Health Check Always Fails

Cause: Service not exposing health endpoint

Solution:

    // Ensure /health endpoint exists
    app.get('/health', (req, res) => {
      res.json({ status: 'healthy' });
    });
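A health check can also fail simply because the service has not finished starting when the workflow probes it. The sketch below polls the endpoint a few times before giving up. The `waitForHealthy` name, the retry count, and the delay are assumptions; the probe is passed in as a function (for example a thin wrapper around `fetch`) so the logic can be tested without a network.

```typescript
// Minimal probe contract: resolve with { ok } or reject on network errors.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Poll a health endpoint until it reports ok or the attempts run out.
async function waitForHealthy(
  url: string,
  probe: Probe,
  attempts = 5,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await probe(url);
      if (res.ok) return true;
    } catch {
      // A rejected probe means the service is not reachable yet; retry.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

In a real script the probe might be `(u) => fetch(u)`, since a `Response` already exposes an `ok` flag.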
### Tests Pass Locally but Fail in CI

Cause: Environment differences

Solution:

    # Create a test environment file from .env.example
    cp .env.example .env.test

Then update test.yml to use the test environment:

    - name: Run tests
      run: npm test
      env:
        NODE_ENV: test
## Best Practices

- Always run tests before deploying
  - Use `continue-on-error: false` for critical tests
- Use health checks
  - Verify deployment succeeded before marking as complete
- Cache dependencies
  - Speeds up workflow runs significantly
- Fail fast
  - Stop workflow on first failure to save CI minutes
- Notify on failures
  - Set up Slack/email notifications for production deploys
- Use matrix builds
  - Test against multiple Node versions if needed
- Separate concerns
  - Keep test, deploy, and validation workflows separate
## References

- GitHub Actions Docs: https://docs.github.com/en/actions
- Railway CLI Docs: https://docs.railway.app/develop/cli
- BlackRoad OS Deployment: ../docs/RAILWAY_DEPLOYMENT.md

Last Updated: 2025-11-20

Author: Atlas (Infrastructure Architect)