mirror of
https://github.com/blackboxprogramming/BlackRoad-Operating-System.git
synced 2026-03-17 07:57:19 -05:00
This commit introduces a comprehensive infrastructure overhaul that transforms
BlackRoad OS into a true distributed operating system with unified kernel,
DNS-aware service discovery, and standardized syscall APIs.
## New Infrastructure Components
### 1. Kernel Module (kernel/typescript/)
- Complete TypeScript kernel implementation for all services
- Service registry with production and dev DNS mappings
- RPC client for inter-service communication
- Event bus, job queue, state management
- Structured logging with log levels
- Full type safety with TypeScript
Modules:
- types.ts: Complete type definitions
- serviceRegistry.ts: DNS-aware service discovery
- identity.ts: Service identity and metadata
- config.ts: Environment-aware configuration
- logger.ts: Structured logging
- rpc.ts: Inter-service RPC client
- events.ts: Event bus (pub/sub)
- jobs.ts: Background job queue
- state.ts: Key-value state management
- index.ts: Main exports
### 2. DNS Infrastructure Documentation (infra/DNS.md)
- Complete Cloudflare DNS mapping
- Railway production and dev endpoints
- Email configuration (MX, SPF, DKIM, DMARC)
- SSL/TLS, security, and monitoring settings
- Service-to-domain mapping
- Health check configuration
Production Services:
- operator.blackroad.systems
- core.blackroad.systems
- api.blackroad.systems
- console.blackroad.systems
- docs.blackroad.systems
- web.blackroad.systems
- os.blackroad.systems
- app.blackroad.systems
### 3. Service Registry & Architecture (INFRASTRUCTURE.md)
- Canonical service registry with all endpoints
- Monorepo-to-satellite deployment model
- Service-as-process architecture
- DNS-as-filesystem model
- Inter-service communication patterns
- Service lifecycle management
- Complete environment variable documentation
### 4. Syscall API Specification (SYSCALL_API.md)
- Standard kernel API for all services
- Required syscalls: health, version, identity, RPC
- Optional syscalls: logging, metrics, events, jobs, state
- Complete API documentation with examples
- Express.js implementation guide
Core Endpoints:
- GET /health
- GET /version
- GET /v1/sys/identity
- GET /v1/sys/health
- POST /v1/sys/rpc
- POST /v1/sys/event
- POST /v1/sys/job
- GET/PUT /v1/sys/state
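As a rough illustration of how the read-only endpoints above might be dispatched, here is a minimal framework-free sketch. The `handleSyscall` function, the response bodies, and the sample identity values are assumptions made for this example; SYSCALL_API.md remains the authority on the actual contract.

```typescript
// Hypothetical sketch: response shapes are assumed, not taken from SYSCALL_API.md.
type HttpMethod = "GET" | "POST" | "PUT";

interface SyscallResponse {
  status: number;
  body: Record<string, unknown>;
}

// Illustrative identity values for a single service.
const serviceInfo = {
  service: "blackroad-os-core",
  role: "core",
  version: "0.0.0-example",
};

// Route table keyed by "METHOD path" for the read-only syscalls.
const routes: Record<string, () => SyscallResponse> = {
  "GET /health": () => ({ status: 200, body: { status: "healthy" } }),
  "GET /version": () => ({ status: 200, body: { version: serviceInfo.version } }),
  "GET /v1/sys/identity": () => ({ status: 200, body: { ...serviceInfo } }),
  "GET /v1/sys/health": () => ({
    status: 200,
    body: { status: "healthy", service: serviceInfo.service },
  }),
};

// Dispatch a request; anything not in the table yields a 404.
function handleSyscall(method: HttpMethod, path: string): SyscallResponse {
  const route = routes[`${method} ${path}`];
  return route ? route() : { status: 404, body: { error: "not found" } };
}

const healthResponse = handleSyscall("GET", "/health");
const missingResponse = handleSyscall("POST", "/v1/sys/unknown");
```

Keeping the routing in a pure function makes it easy to unit test before wiring it into Express or any other HTTP server.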
### 5. Railway Deployment Guide (docs/RAILWAY_DEPLOYMENT.md)
- Step-by-step deployment instructions
- Environment variable configuration
- Monitoring and health checks
- Troubleshooting guide
- Best practices for Railway deployment
### 6. Atlas Kernel Scaffold Prompt (prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md)
- Complete prompt for generating new services
- Auto-generates full kernel implementation
- Includes all DNS and Railway mappings
- Production-ready output with zero TODOs
### 7. GitHub Workflow Templates (templates/github-workflows/)
- deploy.yml: Railway auto-deployment
- test.yml: Test suite with coverage
- validate-kernel.yml: Kernel validation
- README.md: Template documentation
## Updated Files
### CLAUDE.md
- Added "Kernel Architecture & DNS Infrastructure" section
- Updated Table of Contents
- Added service architecture diagram
- Documented all new infrastructure files
- Updated repository structure with new directories
- Added kernel and infrastructure to critical path files
## Architecture Impact
This update establishes BlackRoad OS as a distributed operating system where:
- Each Railway service = OS process
- Each Cloudflare domain = mount point
- All services communicate via syscalls
- Unified kernel ensures interoperability
- DNS-aware service discovery
- Production and development environments
## Service Discovery
Services can now discover and call each other:
```typescript
import { rpc } from './kernel';
const user = await rpc.call('core', 'getUserById', { id: 123 });
```
## DNS Mappings
Production:
- operator.blackroad.systems → blackroad-os-operator-production-3983.up.railway.app
- core.blackroad.systems → 9gw4d0h2.up.railway.app
- api.blackroad.systems → ac7bx15h.up.railway.app
Internal (Railway):
- blackroad-os-operator.railway.internal:8001
- blackroad-os-core.railway.internal:8000
- blackroad-os-api.railway.internal:8000
## Next Steps
1. Sync kernel to satellite repos
2. Implement syscall endpoints in all services
3. Update services to use RPC for inter-service calls
4. Configure Cloudflare health checks
5. Deploy updated services to Railway
---
Files Added:
- INFRASTRUCTURE.md
- SYSCALL_API.md
- infra/DNS.md
- docs/RAILWAY_DEPLOYMENT.md
- kernel/typescript/* (9 modules + README)
- prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md
- templates/github-workflows/* (4 files)
Files Modified:
- CLAUDE.md
Total: 22 new files, 1 updated file
BlackRoad OS - Kernel Module (TypeScript)
Version: 2.0
Author: Atlas (Infrastructure Architect)
Last Updated: 2025-11-20
Overview
The BlackRoad OS Kernel is a TypeScript module that provides a unified interface for all BlackRoad OS services. It enables:
- Service Discovery: Automatic DNS and Railway endpoint resolution
- Inter-Service Communication: RPC calls between services
- Event Broadcasting: Pub/sub event bus
- Background Jobs: Asynchronous task execution
- State Management: Shared key-value store
- Structured Logging: Centralized logging with levels
- Health Monitoring: Service health checks and status reporting
Installation
Option 1: Copy to Your Service
```bash
# Copy the entire kernel directory to your service
cp -r kernel/typescript/* your-service/src/kernel/
```
Option 2: Symlink (Monorepo)
```bash
# Create a symlink to the kernel
cd your-service/src
ln -s ../../kernel/typescript kernel
```
Option 3: NPM Package (Future)
```bash
# Not yet published
npm install @blackroad-os/kernel
```
Usage
Basic Setup
```typescript
// src/index.ts
import { kernelConfig, logger, getKernelIdentity } from './kernel';

async function main() {
  // Get service identity
  const identity = getKernelIdentity();
  logger.info('Service starting', { identity });

  // Log configuration
  logger.info('Configuration loaded', { config: kernelConfig });

  // Your service logic here...
}

main().catch((error) => {
  logger.fatal('Fatal error', { error: error.message });
  process.exit(1);
});
```
Service Discovery
```typescript
import { getServiceUrl, getInternalUrl, getAllServices } from './kernel';

// Get public URL (Cloudflare)
const coreUrl = getServiceUrl('core', 'production', 'cloudflare');
// => "https://core.blackroad.systems"

// Get internal URL (Railway)
const coreInternal = getInternalUrl('core', 'production');
// => "http://blackroad-os-core.railway.internal:8000"

// Get all services
const allServices = getAllServices();
// => ["operator", "core", "api", "console", ...]
```
Inter-Service RPC
```typescript
import { rpc } from './kernel';

// Call a method on another service
const user = await rpc.call('core', 'getUserById', { id: 123 });

// Check service health
const health = await rpc.getHealth('operator');

// Get service identity
const identity = await rpc.getIdentity('api');

// Ping a service
const isAlive = await rpc.ping('docs');
```
Event Bus
```typescript
import { events } from './kernel';

// Subscribe to an event
events.on('user:created', (event) => {
  console.log('User created:', event.data);
});

// Subscribe once
events.once('app:ready', (event) => {
  console.log('App ready!');
});

// Emit an event
await events.emit('user:created', { userId: 123, email: 'test@example.com' });

// Unsubscribe: on() returns a function that removes the handler
const handler = (event: { data: unknown }) => console.log('Data updated:', event.data);
const unsubscribe = events.on('data:updated', handler);
unsubscribe(); // Call to unsubscribe
```
Background Jobs
```typescript
import { jobQueue } from './kernel';

// Register a job handler
jobQueue.registerHandler('send-email', async (params) => {
  const { to, subject, body } = params;
  // Send email logic...
  return { sent: true, messageId: '12345' };
});

// Create and execute a job
const job = await jobQueue.createJob('send-email', {
  to: 'user@example.com',
  subject: 'Welcome',
  body: 'Hello!',
});

// Get job status
const jobStatus = jobQueue.getJob(job.id);
console.log('Job status:', jobStatus?.status);

// Cancel a job
await jobQueue.cancelJob(job.id);
```
State Management
```typescript
import { state } from './kernel';

// Set state
state.set('user:count', 0);

// Get state
const entry = state.get('user:count');
console.log('User count:', entry?.value);

// Update state
state.update('user:count', (current) => current + 1);

// Increment/decrement
state.increment('user:count');
state.decrement('user:count', 5);

// Optimistic locking
try {
  state.set('user:count', 10, 5); // expectedVersion = 5
} catch (error) {
  console.error('Version conflict!');
}
```
Logging
```typescript
import { logger, createLogger } from './kernel';

// Use default logger
logger.debug('Debug message');
logger.info('Info message');
logger.warn('Warning message');
logger.error('Error message', { error: 'details' });
logger.fatal('Fatal error');

// Create contextual logger
const dbLogger = createLogger('Database');
dbLogger.info('Connected to database');
```
API Reference
Types
All TypeScript types are defined in types.ts:
- `Environment`: `"production" | "development" | "staging" | "test"`
- `ServiceRole`: `"core" | "api" | "operator" | "web" | "console" | "docs" | "shell" | "root"`
- `HealthStatus`: `"healthy" | "degraded" | "unhealthy"`
- `LogLevel`: `"debug" | "info" | "warn" | "error" | "fatal"`
- `JobStatus`: `"pending" | "queued" | "running" | "completed" | "failed" | "cancelled"`
See types.ts for complete interface definitions.
Service Registry
Functions:
- `getServiceUrl(serviceName, environment, urlType)`: Get service URL
- `getAllServices()`: Get all service names
- `getServiceByRole(role)`: Get service by role
- `hasService(serviceName)`: Check if service exists
- `getInternalUrl(serviceName, environment)`: Get internal Railway URL
- `getPublicUrl(serviceName, environment)`: Get public Cloudflare URL
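To make the lookup concrete, the sketch below resolves a service name to a public or internal URL using three of the hostnames documented for this project. The `exampleRegistry` table, the `ServiceEntry` shape, and `resolveServiceUrl` are hypothetical names for illustration; the real `serviceRegistry.ts` may structure its data differently and also takes an environment argument, which is omitted here.

```typescript
// Hypothetical sketch: not the kernel's serviceRegistry.ts, production hosts only.
type UrlType = "public" | "internal";

interface ServiceEntry {
  publicHost: string;   // Cloudflare domain
  internalHost: string; // Railway private-network host
  port: number;         // port on the private network
}

const exampleRegistry: Record<string, ServiceEntry> = {
  operator: {
    publicHost: "operator.blackroad.systems",
    internalHost: "blackroad-os-operator.railway.internal",
    port: 8001,
  },
  core: {
    publicHost: "core.blackroad.systems",
    internalHost: "blackroad-os-core.railway.internal",
    port: 8000,
  },
  api: {
    publicHost: "api.blackroad.systems",
    internalHost: "blackroad-os-api.railway.internal",
    port: 8000,
  },
};

// Public URLs use HTTPS on the Cloudflare domain; internal URLs use
// plain HTTP with an explicit port on the Railway private network.
function resolveServiceUrl(service: string, urlType: UrlType): string {
  const entry = exampleRegistry[service];
  if (!entry) {
    throw new Error(`Unknown service: ${service}`);
  }
  return urlType === "public"
    ? `https://${entry.publicHost}`
    : `http://${entry.internalHost}:${entry.port}`;
}

const corePublicUrl = resolveServiceUrl("core", "public");
const operatorInternalUrl = resolveServiceUrl("operator", "internal");
```

Throwing on an unknown name, rather than returning an empty string, surfaces typos in service names at the call site.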
Identity
Functions:
- `getKernelIdentity()`: Get service identity
- `setHealthStatus(status)`: Update health status
- `getUptime()`: Get service uptime (seconds)
Config
Variables:
kernelConfig: Global configuration object
Functions:
- `loadKernelConfig()`: Load config from env vars
- `validateConfig(config)`: Validate configuration
Logger
Class: Logger
- `debug(message, meta?)`: Log debug message
- `info(message, meta?)`: Log info message
- `warn(message, meta?)`: Log warning
- `error(message, meta?)`: Log error
- `fatal(message, meta?)`: Log fatal error
Functions:
- `createLogger(context?)`: Create logger with context
- `getLogs(level?, limit?, offset?)`: Get buffered logs
- `clearLogs()`: Clear log buffer
RPC Client
Class: RPCClient
- `call<T>(service, method, params?, timeout?)`: Call remote procedure
- `getHealth(service)`: Get service health
- `getIdentity(service)`: Get service identity
- `ping(service)`: Ping service
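One plausible way a `call` could map onto the `POST /v1/sys/rpc` syscall is sketched below. The `buildRpcRequest` helper, the `{ method, params }` payload shape, and the 5000 ms default timeout are all assumptions for illustration, not the documented wire format.

```typescript
// Hypothetical sketch: the payload shape and default timeout are assumed.
interface RpcRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
  timeoutMs: number;
}

// Build the HTTP request a client could send for one remote call.
function buildRpcRequest(
  baseUrl: string,
  method: string,
  params: Record<string, unknown> = {},
  timeoutMs = 5000,
): RpcRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/sys/rpc`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ method, params }),
    },
    timeoutMs,
  };
}

const rpcRequest = buildRpcRequest(
  "http://blackroad-os-core.railway.internal:8000/",
  "getUserById",
  { id: 123 },
);
```

Separating request construction from the network call keeps the envelope testable without a running service; the resulting `url` and `init` can be handed to `fetch` or any HTTP client.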
Event Bus
Class: EventBus
- `on(eventName, handler)`: Subscribe to event
- `once(eventName, handler)`: Subscribe once
- `off(eventName, handler)`: Unsubscribe
- `emit(eventName, data?)`: Emit event
- `getEventNames()`: Get all event names
- `getSubscriberCount(eventName)`: Get subscriber count
- `clearEvent(eventName)`: Clear event handlers
- `clearAll()`: Clear all handlers
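For readers who want to see the pub/sub mechanics, here is a minimal in-process sketch of a subset of these methods. `SketchEventBus` and `KernelEvent` are hypothetical names, and `emit` is synchronous here for simplicity, whereas the kernel's `emit` is awaited in the usage examples.

```typescript
// Hypothetical sketch: a simplified, synchronous stand-in for events.ts.
interface KernelEvent {
  event: string;
  data: unknown;
  timestamp: number;
}

type EventHandler = (event: KernelEvent) => void;

class SketchEventBus {
  private handlers = new Map<string, Set<EventHandler>>();

  // Subscribe and return a function that removes this handler.
  on(eventName: string, handler: EventHandler): () => void {
    const existing = this.handlers.get(eventName) ?? new Set<EventHandler>();
    existing.add(handler);
    this.handlers.set(eventName, existing);
    return () => this.off(eventName, handler);
  }

  // Subscribe for a single delivery, then remove the wrapper.
  once(eventName: string, handler: EventHandler): () => void {
    const unsubscribe = this.on(eventName, (event) => {
      unsubscribe();
      handler(event);
    });
    return unsubscribe;
  }

  off(eventName: string, handler: EventHandler): void {
    this.handlers.get(eventName)?.delete(handler);
  }

  // Deliver to a snapshot of the subscribers so handlers may
  // unsubscribe during delivery; returns how many were notified.
  emit(eventName: string, data?: unknown): number {
    const current = this.handlers.get(eventName);
    const targets = current ? Array.from(current) : [];
    const event: KernelEvent = { event: eventName, data, timestamp: Date.now() };
    targets.forEach((handler) => handler(event));
    return targets.length;
  }

  getSubscriberCount(eventName: string): number {
    return this.handlers.get(eventName)?.size ?? 0;
  }
}

const bus = new SketchEventBus();
const received: unknown[] = [];
const stop = bus.on("user:created", (e) => received.push(e.data));
bus.emit("user:created", { userId: 123 });
stop();
bus.emit("user:created", { userId: 456 }); // no subscribers remain
```

Returning the unsubscribe function from `on` is what makes the "clean up event listeners" practice a one-liner at the call site.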
Job Queue
Class: JobQueue
- `registerHandler(name, handler)`: Register job handler
- `createJob(name, params?, schedule?)`: Create job
- `getJob(jobId)`: Get job status
- `getAllJobs()`: Get all jobs
- `getJobsByStatus(status)`: Get jobs by status
- `cancelJob(jobId)`: Cancel job
- `clearCompleted()`: Clear completed jobs
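The job lifecycle can be illustrated with a small in-memory queue. This is a sketch under simplifying assumptions: `SketchJobQueue` is a hypothetical name, handlers are synchronous, scheduling is omitted, and jobs run only when `run` is called explicitly, which differs from the kernel's `createJob`.

```typescript
// Hypothetical sketch: synchronous handlers, explicit run(), no scheduling.
type JobStatus = "pending" | "queued" | "running" | "completed" | "failed" | "cancelled";

interface Job {
  id: string;
  name: string;
  params: Record<string, unknown>;
  status: JobStatus;
  result?: unknown;
  error?: string;
}

type JobHandler = (params: Record<string, unknown>) => unknown;

class SketchJobQueue {
  private handlers = new Map<string, JobHandler>();
  private jobs = new Map<string, Job>();
  private nextId = 1;

  registerHandler(name: string, handler: JobHandler): void {
    this.handlers.set(name, handler);
  }

  // Reject jobs with no handler, so handlers must be registered first.
  createJob(name: string, params: Record<string, unknown> = {}): Job {
    if (!this.handlers.has(name)) {
      throw new Error(`No handler registered for job: ${name}`);
    }
    const job: Job = { id: `job-${this.nextId++}`, name, params, status: "pending" };
    this.jobs.set(job.id, job);
    return job;
  }

  getJob(jobId: string): Job | undefined {
    return this.jobs.get(jobId);
  }

  // Only jobs that have not started can be cancelled.
  cancelJob(jobId: string): boolean {
    const job = this.jobs.get(jobId);
    if (!job || job.status !== "pending") return false;
    job.status = "cancelled";
    return true;
  }

  // Execute one pending job and record its outcome on the job record.
  run(jobId: string): Job | undefined {
    const job = this.jobs.get(jobId);
    const handler = job ? this.handlers.get(job.name) : undefined;
    if (!job || !handler || job.status !== "pending") return job;
    job.status = "running";
    try {
      job.result = handler(job.params);
      job.status = "completed";
    } catch (err) {
      job.error = err instanceof Error ? err.message : String(err);
      job.status = "failed";
    }
    return job;
  }
}

const queue = new SketchJobQueue();
queue.registerHandler("send-email", (params) => ({ sent: true, to: params["to"] }));
const emailJob = queue.createJob("send-email", { to: "user@example.com" });
queue.run(emailJob.id);
```

Because `createJob` throws when no handler exists, a missing registration fails fast instead of leaving a job stuck in `pending`.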
State Manager
Class: StateManager
- `get(key)`: Get state value
- `getAll()`: Get all state entries
- `set(key, value, expectedVersion?)`: Set state value
- `delete(key)`: Delete state entry
- `has(key)`: Check if key exists
- `clear()`: Clear all state
- `size()`: Get state size
- `keys()`: Get all keys
- `update(key, updater)`: Update state value
- `increment(key, delta?)`: Increment numeric value
- `decrement(key, delta?)`: Decrement numeric value
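The `expectedVersion` parameter implies optimistic locking: a write succeeds only if the caller has seen the latest version. A minimal sketch of that check follows, assuming versions start at 1 and increase by one per write. `SketchStateManager` and `StateEntry` are hypothetical names, and the real `state.ts` may differ in details such as error types.

```typescript
// Hypothetical sketch: a small subset of the state API with version checks.
interface StateEntry {
  key: string;
  value: unknown;
  version: number;
}

class SketchStateManager {
  private entries = new Map<string, StateEntry>();

  get(key: string): StateEntry | undefined {
    return this.entries.get(key);
  }

  // Write a value. When expectedVersion is supplied it must equal the
  // stored version (0 for a missing key), otherwise the write is rejected.
  set(key: string, value: unknown, expectedVersion?: number): StateEntry {
    const currentVersion = this.entries.get(key)?.version ?? 0;
    if (expectedVersion !== undefined && expectedVersion !== currentVersion) {
      throw new Error(
        `Version conflict on "${key}": expected ${expectedVersion}, found ${currentVersion}`,
      );
    }
    const entry: StateEntry = { key, value, version: currentVersion + 1 };
    this.entries.set(key, entry);
    return entry;
  }

  // Treat a missing or non-numeric value as 0 before adding delta.
  increment(key: string, delta = 1): StateEntry {
    const current = this.entries.get(key)?.value;
    const base = typeof current === "number" ? current : 0;
    return this.set(key, base + delta);
  }
}

const store = new SketchStateManager();
store.set("user:count", 0);     // stored at version 1
store.increment("user:count");  // value 1, version 2

let conflictDetected = false;
try {
  store.set("user:count", 10, 5); // stale version, so the write is rejected
} catch {
  conflictDetected = true;
}
```

A caller that hits a conflict would typically re-read the entry, recompute its update, and retry with the fresh version.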
Environment Variables
Required for all services:
```bash
# Service Identity
SERVICE_NAME=blackroad-os-{service}
SERVICE_ROLE=core|api|operator|web|console|docs|shell|root
ENVIRONMENT=production|development|staging|test
PORT=8000

# Railway (auto-provided in production)
RAILWAY_STATIC_URL=
RAILWAY_ENVIRONMENT=

# Service URLs (public)
OPERATOR_URL=https://operator.blackroad.systems
CORE_API_URL=https://core.blackroad.systems
PUBLIC_API_URL=https://api.blackroad.systems
CONSOLE_URL=https://console.blackroad.systems
DOCS_URL=https://docs.blackroad.systems
WEB_URL=https://web.blackroad.systems
OS_URL=https://os.blackroad.systems

# Internal URLs (Railway private network)
OPERATOR_INTERNAL_URL=http://blackroad-os-operator.railway.internal:8001
CORE_API_INTERNAL_URL=http://blackroad-os-core.railway.internal:8000
PUBLIC_API_INTERNAL_URL=http://blackroad-os-api.railway.internal:8000
CONSOLE_INTERNAL_URL=http://blackroad-os-prism-console.railway.internal:8000
DOCS_INTERNAL_URL=http://blackroad-os-docs.railway.internal:8000
WEB_INTERNAL_URL=http://blackroad-os-web.railway.internal:8000
```
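A loader for the identity variables might validate them along the following lines. This is a sketch, not the kernel's `config.ts`: `loadSketchConfig`, `requireOneOf`, and the fallback of `8000` when `PORT` is unset are assumptions. The environment is passed in as a plain object so the function can be exercised without touching `process.env`.

```typescript
// Hypothetical sketch: names and the PORT default are assumptions.
const SERVICE_ROLES = ["core", "api", "operator", "web", "console", "docs", "shell", "root"] as const;
const ENVIRONMENTS = ["production", "development", "staging", "test"] as const;

interface SketchConfig {
  serviceName: string;
  role: (typeof SERVICE_ROLES)[number];
  environment: (typeof ENVIRONMENTS)[number];
  port: number;
}

// Return the value narrowed to one of the allowed literals, or throw.
function requireOneOf<T extends string>(
  field: string,
  allowed: readonly T[],
  value: string | undefined,
): T {
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`${field} must be one of: ${allowed.join(", ")}`);
  }
  return match;
}

function loadSketchConfig(env: Record<string, string | undefined>): SketchConfig {
  const serviceName = env["SERVICE_NAME"];
  if (!serviceName) {
    throw new Error("SERVICE_NAME is required");
  }
  const port = Number(env["PORT"] ?? "8000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return {
    serviceName,
    role: requireOneOf("SERVICE_ROLE", SERVICE_ROLES, env["SERVICE_ROLE"]),
    environment: requireOneOf("ENVIRONMENT", ENVIRONMENTS, env["ENVIRONMENT"]),
    port,
  };
}

const sketchConfig = loadSketchConfig({
  SERVICE_NAME: "blackroad-os-api",
  SERVICE_ROLE: "api",
  ENVIRONMENT: "test",
});
```

In a service, the same function could be called with `process.env` at startup so that a misconfigured deployment fails before it begins serving traffic.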
Testing
Example test using the kernel:
```typescript
import { getKernelIdentity, rpc, events, state } from './kernel';

describe('Kernel', () => {
  beforeEach(() => {
    // Set up test environment
    process.env.SERVICE_NAME = 'test-service';
    process.env.SERVICE_ROLE = 'api';
    process.env.ENVIRONMENT = 'test';
  });

  it('should return kernel identity', () => {
    const identity = getKernelIdentity();
    expect(identity.service).toBe('test-service');
    expect(identity.role).toBe('api');
  });

  it('should emit and receive events', async () => {
    const handler = jest.fn();
    events.on('test:event', handler);
    await events.emit('test:event', { foo: 'bar' });
    expect(handler).toHaveBeenCalledWith(
      expect.objectContaining({
        event: 'test:event',
        data: { foo: 'bar' },
      })
    );
  });

  it('should manage state', () => {
    state.set('test:key', 'test:value');
    const entry = state.get('test:key');
    expect(entry?.value).toBe('test:value');
    expect(entry?.version).toBe(1);
  });
});
```
Architecture
The kernel follows a modular architecture:
```
kernel/typescript/
├── types.ts            # Type definitions
├── serviceRegistry.ts  # Service discovery
├── identity.ts         # Service identity
├── config.ts           # Configuration
├── logger.ts           # Logging
├── rpc.ts              # Inter-service RPC
├── events.ts           # Event bus
├── jobs.ts             # Job queue
├── state.ts            # State management
├── index.ts            # Main exports
└── README.md           # This file
```
Each module is:
- Self-contained: No external dependencies (except Node.js built-ins)
- Typed: Full TypeScript support
- Testable: Easy to mock and test
- Documented: JSDoc comments for all public APIs
Best Practices
- Use internal URLs for RPC: Always prefer `getInternalUrl()` for inter-service communication
- Handle RPC errors: Wrap RPC calls in try/catch blocks
- Version state carefully: Use optimistic locking for concurrent updates
- Clean up event listeners: Always unsubscribe when done
- Use contextual loggers: Create loggers with context for better debugging
- Register job handlers early: Register all handlers before creating jobs
Roadmap
Future enhancements:
- Distributed event bus (cross-service events)
- Persistent job queue (Redis/Postgres)
- Distributed state (Redis/Consul)
- Circuit breaker for RPC calls
- Request tracing (OpenTelemetry)
- Metrics export (Prometheus)
- Service mesh integration
- NPM package publication
Contributing
This kernel is part of the BlackRoad OS monorepo. To contribute:
- Edit files in `kernel/typescript/`
- Run tests: `npm test` (when available)
- Update this README if adding features
- Sync changes to satellite repos
License
Part of BlackRoad Operating System © 2025 Alexa Louise (Cadillac)
Version: 2.0 Last Updated: 2025-11-20 Status: ✅ Production Ready