This commit overhauls the infrastructure, turning BlackRoad OS into a
distributed operating system with a unified kernel, DNS-aware service
discovery, and standardized syscall APIs.
## New Infrastructure Components
### 1. Kernel Module (kernel/typescript/)
- Complete TypeScript kernel implementation for all services
- Service registry with production and dev DNS mappings
- RPC client for inter-service communication
- Event bus, job queue, state management
- Structured logging with log levels
- Full type safety with TypeScript
Modules:
- types.ts: Complete type definitions
- serviceRegistry.ts: DNS-aware service discovery
- identity.ts: Service identity and metadata
- config.ts: Environment-aware configuration
- logger.ts: Structured logging
- rpc.ts: Inter-service RPC client
- events.ts: Event bus (pub/sub)
- jobs.ts: Background job queue
- state.ts: Key-value state management
- index.ts: Main exports
### 2. DNS Infrastructure Documentation (infra/DNS.md)
- Complete Cloudflare DNS mapping
- Railway production and dev endpoints
- Email configuration (MX, SPF, DKIM, DMARC)
- SSL/TLS, security, and monitoring settings
- Service-to-domain mapping
- Health check configuration
Production Services:
- operator.blackroad.systems
- core.blackroad.systems
- api.blackroad.systems
- console.blackroad.systems
- docs.blackroad.systems
- web.blackroad.systems
- os.blackroad.systems
- app.blackroad.systems
### 3. Service Registry & Architecture (INFRASTRUCTURE.md)
- Canonical service registry with all endpoints
- Monorepo-to-satellite deployment model
- Service-as-process architecture
- DNS-as-filesystem model
- Inter-service communication patterns
- Service lifecycle management
- Complete environment variable documentation
### 4. Syscall API Specification (SYSCALL_API.md)
- Standard kernel API for all services
- Required syscalls: health, version, identity, RPC
- Optional syscalls: logging, metrics, events, jobs, state
- Complete API documentation with examples
- Express.js implementation guide
Core Endpoints:
- GET /health
- GET /version
- GET /v1/sys/identity
- GET /v1/sys/health
- POST /v1/sys/rpc
- POST /v1/sys/event
- POST /v1/sys/job
- GET/PUT /v1/sys/state
### 5. Railway Deployment Guide (docs/RAILWAY_DEPLOYMENT.md)
- Step-by-step deployment instructions
- Environment variable configuration
- Monitoring and health checks
- Troubleshooting guide
- Best practices for Railway deployment
### 6. Atlas Kernel Scaffold Prompt (prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md)
- Complete prompt for generating new services
- Auto-generates full kernel implementation
- Includes all DNS and Railway mappings
- Production-ready output with zero TODOs
### 7. GitHub Workflow Templates (templates/github-workflows/)
- deploy.yml: Railway auto-deployment
- test.yml: Test suite with coverage
- validate-kernel.yml: Kernel validation
- README.md: Template documentation
## Updated Files
### CLAUDE.md
- Added "Kernel Architecture & DNS Infrastructure" section
- Updated Table of Contents
- Added service architecture diagram
- Documented all new infrastructure files
- Updated repository structure with new directories
- Added kernel and infrastructure to critical path files
## Architecture Impact
This update establishes BlackRoad OS as a distributed operating system where:
- Each Railway service = OS process
- Each Cloudflare domain = mount point
- All services communicate via syscalls
- A unified kernel ensures interoperability
- Service discovery is DNS-aware
- Production and development environments each have their own DNS mappings
## Service Discovery
Services can now discover and call each other:
```typescript
import { rpc } from './kernel';
const user = await rpc.call('core', 'getUserById', { id: 123 });
```
## DNS Mappings
Production:
- operator.blackroad.systems → blackroad-os-operator-production-3983.up.railway.app
- core.blackroad.systems → 9gw4d0h2.up.railway.app
- api.blackroad.systems → ac7bx15h.up.railway.app
Internal (Railway):
- blackroad-os-operator.railway.internal:8001
- blackroad-os-core.railway.internal:8000
- blackroad-os-api.railway.internal:8000
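
As a rough illustration of how DNS-aware resolution could work, the sketch below maps a service role to the public and internal hostnames listed above. `REGISTRY` and `resolveServiceUrl` are hypothetical names, not the actual `serviceRegistry.ts` API, and the rule "use the internal address when running inside Railway" is an assumption.

```typescript
// Hypothetical registry shape; the real serviceRegistry.ts may differ.
type ServiceRole = "operator" | "core" | "api";

interface ServiceEndpoints {
  cloudflare: string; // public custom domain
  railway: string;    // public Railway hostname
  internal: string;   // private Railway network address (host:port)
}

const REGISTRY: Record<ServiceRole, ServiceEndpoints> = {
  operator: {
    cloudflare: "operator.blackroad.systems",
    railway: "blackroad-os-operator-production-3983.up.railway.app",
    internal: "blackroad-os-operator.railway.internal:8001",
  },
  core: {
    cloudflare: "core.blackroad.systems",
    railway: "9gw4d0h2.up.railway.app",
    internal: "blackroad-os-core.railway.internal:8000",
  },
  api: {
    cloudflare: "api.blackroad.systems",
    railway: "ac7bx15h.up.railway.app",
    internal: "blackroad-os-api.railway.internal:8000",
  },
};

// Assumed policy: service-to-service calls use the internal address,
// callers outside the Railway network use the public Cloudflare domain.
function resolveServiceUrl(role: ServiceRole, insideRailway: boolean): string {
  const endpoints = REGISTRY[role];
  return insideRailway
    ? `http://${endpoints.internal}`
    : `https://${endpoints.cloudflare}`;
}
```

For example, `resolveServiceUrl("core", true)` yields the internal core address, which is the kind of URL an RPC client would post to.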
## Next Steps
1. Sync kernel to satellite repos
2. Implement syscall endpoints in all services
3. Update services to use RPC for inter-service calls
4. Configure Cloudflare health checks
5. Deploy updated services to Railway
---
Files Added:
- INFRASTRUCTURE.md
- SYSCALL_API.md
- infra/DNS.md
- docs/RAILWAY_DEPLOYMENT.md
- kernel/typescript/* (9 modules + README)
- prompts/atlas/ATLAS_KERNEL_SCAFFOLD.md
- templates/github-workflows/* (4 files)
Files Modified:
- CLAUDE.md
Total: 22 new files, 1 updated file
BlackRoad OS - Syscall API Specification
Version: 2.0
Last Updated: 2025-11-20
Author: Atlas (Infrastructure Architect)
Status: Production Standard
Overview
The BlackRoad OS Syscall API defines the standard kernel interface that ALL services MUST implement. This ensures uniform interoperability across the distributed operating system.
Syscalls here play the role that system calls play in a traditional OS, but they are exposed as HTTP REST endpoints. Every service is a process, and syscalls are how the OS and other processes interact with it.
Table of Contents
- Core Syscalls
- Logging Syscalls
- Metrics Syscalls
- RPC Syscalls
- Event Syscalls
- Job Syscalls
- State Syscalls
- Implementation Guide
- Examples
Core Syscalls
GET /health
Purpose: Basic health check
Request: None
Response:
{
"status": "healthy"
}
Status Codes:
- 200 OK: Service is healthy
- 503 Service Unavailable: Service is unhealthy
Example:
curl https://core.blackroad.systems/health
GET /version
Purpose: Get service version
Request: None
Response:
{
"version": "1.0.0",
"service": "blackroad-os-core"
}
Status Codes:
200 OK: Success
Example:
curl https://core.blackroad.systems/version
GET /v1/sys/identity
Purpose: Get complete service identity
Request: None
Response: KernelIdentity object
{
"service": "blackroad-os-core",
"role": "core",
"version": "1.0.0",
"environment": "production",
"dns": {
"cloudflare": "https://core.blackroad.systems",
"railway": "https://9gw4d0h2.up.railway.app",
"internal": "http://blackroad-os-core.railway.internal:8000"
},
"runtime": {
"railwayHost": "9gw4d0h2.up.railway.app",
"internalHost": "http://blackroad-os-core.railway.internal:8000",
"port": 8000,
"pid": 1234,
"uptime": 3600
},
"health": {
"status": "healthy",
"uptime": 3600,
"lastCheck": "2025-11-20T12:00:00Z"
},
"capabilities": ["rpc", "events", "jobs", "state"]
}
Status Codes:
200 OK: Success
Example:
curl https://core.blackroad.systems/v1/sys/identity
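
For callers written in TypeScript, it can help to have the response shape spelled out as a type. The interface below is inferred from the example response above only; the actual `KernelIdentity` definition in `kernel/typescript/types.ts` may carry more fields or stricter types, and `supportsCapability` is a hypothetical helper.

```typescript
// Shape inferred from the example response; not the authoritative definition.
interface KernelIdentity {
  service: string;
  role: string;
  version: string;
  environment: string;
  dns: { cloudflare: string; railway: string; internal: string };
  runtime: {
    railwayHost: string;
    internalHost: string;
    port: number;
    pid: number;
    uptime: number; // presumably seconds; the spec does not state the unit
  };
  health: { status: string; uptime: number; lastCheck: string };
  capabilities: string[];
}

// Hypothetical helper: does a peer advertise an optional syscall group?
function supportsCapability(identity: KernelIdentity, capability: string): boolean {
  return identity.capabilities.indexOf(capability) !== -1;
}
```

A caller might check `supportsCapability(identity, "jobs")` before posting to `/v1/sys/job` on a peer.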
GET /v1/sys/health
Purpose: Extended health check with detailed metrics
Request: None
Response: HealthCheck object
{
"status": "healthy",
"timestamp": "2025-11-20T12:00:00Z",
"uptime": 3600,
"memory": {
"rss": 52428800,
"heapTotal": 20971520,
"heapUsed": 15728640,
"external": 1048576
},
"checks": {
"database": {
"status": "ok",
"latency": 5
},
"redis": {
"status": "ok",
"latency": 2
},
"dependencies": {
"status": "ok",
"message": "All dependencies healthy"
}
}
}
Status Codes:
- 200 OK: Service is healthy
- 503 Service Unavailable: Service is unhealthy or degraded
Example:
curl https://core.blackroad.systems/v1/sys/health
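
One way a service could derive the overall status and HTTP code from its individual checks is sketched below. The spec only shows checks with `"status": "ok"`, so treating any other value as a failure is an assumption, and `summarizeHealth` is a hypothetical name.

```typescript
interface DependencyCheck {
  status: string;   // "ok" in the example above; anything else is assumed to be a failure
  latency?: number;
  message?: string;
}

interface HealthSummary {
  status: "healthy" | "unhealthy";
  httpStatus: 200 | 503;
  failing: string[]; // names of the checks that did not report "ok"
}

// Fold individual dependency checks into one status and HTTP code.
function summarizeHealth(checks: Record<string, DependencyCheck>): HealthSummary {
  const failing = Object.keys(checks).filter((key) => checks[key].status !== "ok");
  return failing.length === 0
    ? { status: "healthy", httpStatus: 200, failing }
    : { status: "unhealthy", httpStatus: 503, failing };
}
```

A handler for `/v1/sys/health` could call this and pass `httpStatus` to `res.status(...)` before sending the full body.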
GET /v1/sys/version
Purpose: Extended version information
Request: None
Response: VersionInfo object
{
"version": "1.0.0",
"service": "blackroad-os-core",
"commit": "a1b2c3d4e5f6",
"buildTime": "2025-11-20T10:00:00Z",
"nodeVersion": "20.10.0",
"environment": "production"
}
Status Codes:
200 OK: Success
Example:
curl https://core.blackroad.systems/v1/sys/version
GET /v1/sys/config
Purpose: Get non-sensitive service configuration
Request: None
Response: Partial KernelConfig (excluding secrets)
{
"service": {
"name": "blackroad-os-core",
"role": "core",
"version": "1.0.0",
"environment": "production",
"port": 8000
},
"features": {
"rpc": true,
"events": true,
"jobs": true,
"state": true
}
}
Status Codes:
200 OK: Success
Example:
curl https://core.blackroad.systems/v1/sys/config
Logging Syscalls
POST /v1/sys/log
Purpose: Log a message (remote logging)
Request: LogEntry (partial)
{
"level": "info",
"message": "User logged in",
"meta": {
"userId": 123,
"ip": "192.168.1.1"
}
}
Response:
{
"id": "1732104000000-abc123",
"timestamp": "2025-11-20T12:00:00Z"
}
Status Codes:
- 201 Created: Log entry created
- 400 Bad Request: Invalid log entry
Example:
curl -X POST https://core.blackroad.systems/v1/sys/log \
-H "Content-Type: application/json" \
-d '{"level":"info","message":"Test log"}'
GET /v1/sys/logs
Purpose: Get buffered logs
Query Parameters:
- level (optional): Filter by log level
- limit (optional): Number of logs to return (default: 100)
- offset (optional): Offset for pagination (default: 0)
Request: None
Response:
{
"logs": [
{
"id": "1732104000000-abc123",
"timestamp": "2025-11-20T12:00:00Z",
"level": "info",
"message": "User logged in",
"service": "blackroad-os-core",
"meta": { "userId": 123 }
}
],
"total": 500,
"limit": 100,
"offset": 0
}
Status Codes:
200 OK: Success
Example:
curl "https://core.blackroad.systems/v1/sys/logs?level=error&limit=50"
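
The filtering and pagination described above could be applied to an in-memory log buffer roughly as follows. This is a sketch: `queryLogs` is a hypothetical name, and interpreting `total` as the number of entries matching the filter (rather than the buffer size) is an assumption the spec does not settle.

```typescript
interface LogEntry {
  id: string;
  timestamp: string;
  level: string;
  message: string;
  service: string;
  meta?: Record<string, unknown>;
}

interface LogQuery {
  level?: string;
  limit?: number;
  offset?: number;
}

interface LogPage {
  logs: LogEntry[];
  total: number;
  limit: number;
  offset: number;
}

// Apply the optional level filter, then slice out one page.
function queryLogs(buffer: LogEntry[], query: LogQuery): LogPage {
  const limit = query.limit !== undefined ? query.limit : 100; // default from the spec
  const offset = query.offset !== undefined ? query.offset : 0; // default from the spec
  const matching =
    query.level === undefined
      ? buffer
      : buffer.filter((entry) => entry.level === query.level);
  return {
    logs: matching.slice(offset, offset + limit),
    total: matching.length, // assumption: count after filtering
    limit,
    offset,
  };
}
```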
Metrics Syscalls
POST /v1/sys/metric
Purpose: Record a metric
Request: MetricEntry (partial)
{
"name": "http.request.duration",
"value": 125,
"unit": "ms",
"tags": {
"method": "GET",
"path": "/api/users",
"status": "200"
}
}
Response:
{
"id": "1732104000000-xyz789",
"timestamp": "2025-11-20T12:00:00Z"
}
Status Codes:
- 201 Created: Metric recorded
- 400 Bad Request: Invalid metric
Example:
curl -X POST https://core.blackroad.systems/v1/sys/metric \
-H "Content-Type: application/json" \
-d '{"name":"cpu.usage","value":75,"unit":"percent"}'
GET /v1/sys/metrics
Purpose: Get recorded metrics
Query Parameters:
- name (optional): Filter by metric name
- from (optional): Start timestamp (ISO 8601)
- to (optional): End timestamp (ISO 8601)
- limit (optional): Number of metrics to return (default: 100)
Request: None
Response:
{
"metrics": [
{
"id": "1732104000000-xyz789",
"timestamp": "2025-11-20T12:00:00Z",
"name": "http.request.duration",
"value": 125,
"unit": "ms",
"tags": { "method": "GET" }
}
],
"total": 1000,
"limit": 100
}
Status Codes:
200 OK: Success
Example:
curl "https://core.blackroad.systems/v1/sys/metrics?name=cpu.usage&limit=50"
RPC Syscalls
POST /v1/sys/rpc
Purpose: Call a remote procedure (method) on this service
Request: RPCRequest
{
"method": "getUserById",
"params": {
"id": 123
},
"timeout": 5000
}
Response: RPCResponse
{
"result": {
"id": 123,
"email": "user@example.com",
"name": "John Doe"
}
}
Error Response:
{
"error": {
"code": "USER_NOT_FOUND",
"message": "User with ID 123 not found",
"details": { "id": 123 }
}
}
Status Codes:
- 200 OK: RPC call succeeded
- 400 Bad Request: Invalid RPC request
- 404 Not Found: Method not found
- 500 Internal Server Error: RPC call failed
Headers:
- X-Service-Name: Calling service name
- X-Service-Role: Calling service role
Example:
curl -X POST https://core.blackroad.systems/v1/sys/rpc \
-H "Content-Type: application/json" \
-H "X-Service-Name: blackroad-os-operator" \
-H "X-Service-Role: operator" \
-d '{"method":"getUserById","params":{"id":123}}'
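
The four status codes above suggest a dispatcher along the following lines. This is a simplified, synchronous sketch (real handlers would likely be async); `dispatchRpc`, `rpcFailure`, and the `INVALID_REQUEST` and `METHOD_NOT_FOUND` error codes are illustrative names that do not come from the spec. Only `RPC_ERROR` appears in the Implementation Guide.

```typescript
type RpcHandler = (params: Record<string, unknown>) => unknown;

interface RpcOutcome {
  httpStatus: 200 | 400 | 404 | 500;
  body: { result: unknown } | { error: { code: string; message: string } };
}

function rpcFailure(httpStatus: 400 | 404 | 500, code: string, message: string): RpcOutcome {
  return { httpStatus, body: { error: { code, message } } };
}

// Validate the payload, look up the method, run it, and map each outcome
// to the status codes listed in the spec.
function dispatchRpc(handlers: Record<string, RpcHandler>, payload: unknown): RpcOutcome {
  if (typeof payload !== "object" || payload === null) {
    return rpcFailure(400, "INVALID_REQUEST", "Request body must be a JSON object");
  }
  const { method, params } = payload as { method?: unknown; params?: unknown };
  if (typeof method !== "string" || method.length === 0) {
    return rpcFailure(400, "INVALID_REQUEST", "'method' must be a non-empty string");
  }
  if (!Object.prototype.hasOwnProperty.call(handlers, method)) {
    return rpcFailure(404, "METHOD_NOT_FOUND", `Unknown method '${method}'`);
  }
  try {
    const args =
      typeof params === "object" && params !== null
        ? (params as Record<string, unknown>)
        : {};
    return { httpStatus: 200, body: { result: handlers[method](args) } };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return rpcFailure(500, "RPC_ERROR", message);
  }
}
```

Keeping the dispatcher free of Express types makes it straightforward to unit test each status code without starting a server.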
Event Syscalls
POST /v1/sys/event
Purpose: Emit an event (publish to subscribers)
Request: Event (partial)
{
"event": "user:created",
"data": {
"userId": 123,
"email": "user@example.com"
}
}
Response:
{
"id": "1732104000000-evt123",
"timestamp": "2025-11-20T12:00:00Z"
}
Status Codes:
- 201 Created: Event emitted
- 400 Bad Request: Invalid event
Example:
curl -X POST https://core.blackroad.systems/v1/sys/event \
-H "Content-Type: application/json" \
-d '{"event":"user:created","data":{"userId":123}}'
GET /v1/sys/events
Purpose: Subscribe to events (Server-Sent Events stream)
Query Parameters:
- event (optional): Filter by event name
- since (optional): Only events after this timestamp (ISO 8601)
Request: None
Response: SSE stream
data: {"id":"1732104000000-evt123","event":"user:created","timestamp":"2025-11-20T12:00:00Z","source":"blackroad-os-core","data":{"userId":123}}
data: {"id":"1732104000001-evt124","event":"user:updated","timestamp":"2025-11-20T12:00:01Z","source":"blackroad-os-core","data":{"userId":123}}
Status Codes:
200 OK: SSE stream started
Example:
curl -N "https://core.blackroad.systems/v1/sys/events?event=user:created"
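
On the server side, each streamed event has to be filtered against the query parameters and serialized as a Server-Sent Events message (a `data:` line followed by a blank line). The sketch below shows both steps; `KernelEvent`, `matchesFilter`, and `toSseFrame` are hypothetical names, and the field list is taken from the example stream above.

```typescript
interface KernelEvent {
  id: string;
  event: string;
  timestamp: string; // ISO 8601
  source: string;
  data: unknown;
}

interface EventFilter {
  event?: string; // exact event name
  since?: string; // ISO 8601; only events strictly after this instant pass
}

function matchesFilter(evt: KernelEvent, filter: EventFilter): boolean {
  if (filter.event !== undefined && evt.event !== filter.event) {
    return false;
  }
  if (filter.since !== undefined && Date.parse(evt.timestamp) <= Date.parse(filter.since)) {
    return false;
  }
  return true;
}

// One SSE message: a single "data:" line terminated by a blank line.
function toSseFrame(evt: KernelEvent): string {
  return `data: ${JSON.stringify(evt)}\n\n`;
}
```

In an Express handler, the response would typically set `Content-Type: text/event-stream` and call `res.write(toSseFrame(evt))` for each matching event.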
Job Syscalls
POST /v1/sys/job
Purpose: Create a background job
Request: Job (partial)
{
"name": "send-email",
"params": {
"to": "user@example.com",
"subject": "Welcome",
"body": "Hello!"
},
"schedule": "0 0 * * *"
}
Response: Job
{
"id": "job-1732104000000-abc123",
"name": "send-email",
"params": { "to": "user@example.com" },
"schedule": "0 0 * * *",
"status": "queued",
"createdAt": "2025-11-20T12:00:00Z"
}
Status Codes:
- 201 Created: Job created
- 400 Bad Request: Invalid job
- 404 Not Found: Job handler not found
Example:
curl -X POST https://core.blackroad.systems/v1/sys/job \
-H "Content-Type: application/json" \
-d '{"name":"send-email","params":{"to":"user@example.com","subject":"Test"}}'
GET /v1/sys/job/:id
Purpose: Get job status
Request: None
Response: Job
{
"id": "job-1732104000000-abc123",
"name": "send-email",
"params": { "to": "user@example.com" },
"status": "completed",
"createdAt": "2025-11-20T12:00:00Z",
"startedAt": "2025-11-20T12:00:01Z",
"completedAt": "2025-11-20T12:00:05Z",
"result": { "sent": true, "messageId": "12345" }
}
Error Response (if job failed):
{
"id": "job-1732104000000-abc123",
"status": "failed",
"error": {
"message": "SMTP connection failed",
"stack": "..."
}
}
Status Codes:
- 200 OK: Success
- 404 Not Found: Job not found
Example:
curl https://core.blackroad.systems/v1/sys/job/job-1732104000000-abc123
POST /v1/sys/job/:id/cancel
Purpose: Cancel a running or queued job
Request: None
Response: Job
{
"id": "job-1732104000000-abc123",
"status": "cancelled",
"completedAt": "2025-11-20T12:00:10Z"
}
Status Codes:
- 200 OK: Job cancelled
- 400 Bad Request: Job cannot be cancelled (already completed/failed)
- 404 Not Found: Job not found
Example:
curl -X POST https://core.blackroad.systems/v1/sys/job/job-1732104000000-abc123/cancel
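
The cancellation rules can be read as a small state machine. The sketch below is one possible interpretation: `cancelJob` is a hypothetical name, the `"running"` status string is an assumption (the spec's examples only show `queued`, `completed`, `failed`, and `cancelled`), and rejecting an already cancelled job with 400 is also assumed.

```typescript
type JobStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

interface Job {
  id: string;
  status: JobStatus;
  completedAt?: string;
}

interface CancelOutcome {
  httpStatus: 200 | 400 | 404;
  job?: Job;
}

// Only queued or running jobs can be cancelled; everything else is a 400.
function cancelJob(jobs: Record<string, Job>, id: string, now: Date): CancelOutcome {
  const job = Object.prototype.hasOwnProperty.call(jobs, id) ? jobs[id] : undefined;
  if (job === undefined) {
    return { httpStatus: 404 };
  }
  if (job.status !== "queued" && job.status !== "running") {
    return { httpStatus: 400, job };
  }
  const cancelled: Job = {
    id: job.id,
    status: "cancelled",
    completedAt: now.toISOString(),
  };
  jobs[id] = cancelled; // replace the stored record
  return { httpStatus: 200, job: cancelled };
}
```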
State Syscalls
GET /v1/sys/state
Purpose: Get state value(s)
Query Parameters:
- key (optional): Get specific key (if omitted, returns all state)
Request: None
Response (single key):
{
"key": "user:count",
"value": 42,
"version": 5,
"updatedAt": "2025-11-20T12:00:00Z"
}
Response (all keys):
{
"state": [
{
"key": "user:count",
"value": 42,
"version": 5,
"updatedAt": "2025-11-20T12:00:00Z"
},
{
"key": "session:count",
"value": 10,
"version": 2,
"updatedAt": "2025-11-20T11:00:00Z"
}
]
}
Status Codes:
- 200 OK: Success
- 404 Not Found: Key not found (if specific key requested)
Example:
# Get specific key
curl "https://core.blackroad.systems/v1/sys/state?key=user:count"
# Get all state
curl "https://core.blackroad.systems/v1/sys/state"
PUT /v1/sys/state
Purpose: Set state value
Request: State update
{
"key": "user:count",
"value": 43,
"expectedVersion": 5
}
Response: StateEntry
{
"key": "user:count",
"value": 43,
"version": 6,
"updatedAt": "2025-11-20T12:00:10Z"
}
Error Response (version conflict):
{
"error": {
"code": "VERSION_CONFLICT",
"message": "Version conflict for key 'user:count': expected 5, got 6"
}
}
Status Codes:
- 200 OK: State updated
- 409 Conflict: Version conflict (optimistic locking)
- 400 Bad Request: Invalid request
Example:
curl -X PUT https://core.blackroad.systems/v1/sys/state \
-H "Content-Type: application/json" \
-d '{"key":"user:count","value":43,"expectedVersion":5}'
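
The optimistic locking behind `expectedVersion` can be sketched with a plain in-memory store. This is not the actual `state.ts` implementation: `setState` and `SetOutcome` are hypothetical names, and two details are assumptions, namely that a missing key counts as version 0 and that omitting `expectedVersion` skips the check.

```typescript
interface StateEntry {
  key: string;
  value: unknown;
  version: number;
  updatedAt: string;
}

type SetOutcome =
  | { ok: true; entry: StateEntry }
  | { ok: false; code: "VERSION_CONFLICT"; message: string };

// Write a value only if the caller's expected version matches the stored one.
function setState(
  store: Record<string, StateEntry>,
  key: string,
  value: unknown,
  expectedVersion?: number
): SetOutcome {
  const current = Object.prototype.hasOwnProperty.call(store, key) ? store[key] : undefined;
  const currentVersion = current === undefined ? 0 : current.version;
  if (expectedVersion !== undefined && expectedVersion !== currentVersion) {
    return {
      ok: false,
      code: "VERSION_CONFLICT",
      message: `Version conflict for key '${key}': expected ${expectedVersion}, got ${currentVersion}`,
    };
  }
  const entry: StateEntry = {
    key,
    value,
    version: currentVersion + 1, // every successful write bumps the version
    updatedAt: new Date().toISOString(),
  };
  store[key] = entry;
  return { ok: true, entry };
}
```

A handler would map `ok: false` to 409 Conflict. A client that receives it can re-read the key, reapply its change, and retry with the new version.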
Implementation Guide
Express.js Example
import express from 'express';
import { getKernelIdentity, logger, rpc, events, jobQueue, state } from './kernel';
const app = express();
app.use(express.json());
// Core syscalls
app.get('/health', (req, res) => {
res.json({ status: 'healthy' });
});
app.get('/version', (req, res) => {
const identity = getKernelIdentity();
res.json({ version: identity.version, service: identity.service });
});
app.get('/v1/sys/identity', (req, res) => {
res.json(getKernelIdentity());
});
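// RPC syscall
// handleRPC is your service's own method dispatcher; it is not provided by the kernel
app.post('/v1/sys/rpc', async (req, res) => {
  const { method, params } = req.body;
  // 400 for malformed requests, per the status codes above (error code name is illustrative)
  if (typeof method !== 'string') {
    return res.status(400).json({
      error: { code: 'INVALID_REQUEST', message: 'method must be a string' },
    });
  }
  try {
    // Call your RPC handler
    const result = await handleRPC(method, params);
    res.json({ result });
  } catch (error) {
    res.status(500).json({
      error: {
        code: 'RPC_ERROR',
        message: error instanceof Error ? error.message : String(error),
      },
    });
  }
});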
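// Event syscall
app.post('/v1/sys/event', async (req, res) => {
  const { event, data } = req.body;
  await events.emit(event, data);
  res.status(201).json({
    // Illustrative ID in the "<epoch ms>-<suffix>" style used in this spec
    id: `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
    timestamp: new Date().toISOString(),
  });
});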
// Job syscall
app.post('/v1/sys/job', async (req, res) => {
const { name, params, schedule } = req.body;
const job = await jobQueue.createJob(name, params, schedule);
res.status(201).json(job);
});
// State syscalls
app.get('/v1/sys/state', (req, res) => {
const { key } = req.query;
if (key) {
const entry = state.get(key as string);
if (!entry) {
return res.status(404).json({ error: 'Key not found' });
}
res.json(entry);
} else {
res.json({ state: state.getAll() });
}
});
app.put('/v1/sys/state', (req, res) => {
  const { key, value, expectedVersion } = req.body;
  try {
    const entry = state.set(key, value, expectedVersion);
    res.json(entry);
  } catch (error) {
    // Assumes state.set throws only on a version mismatch
    res.status(409).json({
      error: {
        code: 'VERSION_CONFLICT',
        message: error instanceof Error ? error.message : String(error),
      },
    });
  }
});
app.listen(8000, () => {
logger.info('Service started on port 8000');
});
Examples
Health Check Flow
# Check if service is alive
curl https://operator.blackroad.systems/health
# => {"status":"healthy"}
# Get detailed health
curl https://operator.blackroad.systems/v1/sys/health
# => {"status":"healthy","uptime":3600,"memory":{...},"checks":{...}}
RPC Call Flow
// Operator calling Core API to get user
import { rpc } from './kernel';
const user = await rpc.call('core', 'getUserById', { id: 123 });
console.log('User:', user);
Under the hood:
# 1. RPC client resolves internal URL
# http://blackroad-os-core.railway.internal:8000
# 2. POST to /v1/sys/rpc
curl -X POST http://blackroad-os-core.railway.internal:8000/v1/sys/rpc \
-H "Content-Type: application/json" \
-H "X-Service-Name: blackroad-os-operator" \
-d '{"method":"getUserById","params":{"id":123}}'
# 3. Core API responds
# {"result":{"id":123,"email":"user@example.com"}}
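
The request that the RPC client assembles in step 2 could be built by a helper like the one below. It is a sketch rather than the actual `rpc.ts` code: `buildRpcRequest` and `OutgoingRpc` are hypothetical names, and the caller is expected to have resolved `baseUrl` already.

```typescript
interface OutgoingRpc {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assemble the HTTP request for POST /v1/sys/rpc without sending it.
function buildRpcRequest(
  baseUrl: string,
  caller: { name: string; role: string },
  rpcMethod: string,
  params: Record<string, unknown>,
  timeoutMs?: number
): OutgoingRpc {
  const payload: Record<string, unknown> = { method: rpcMethod, params };
  if (timeoutMs !== undefined) {
    payload["timeout"] = timeoutMs; // optional field from the RPCRequest example
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/sys/rpc`, // tolerate a trailing slash
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Service-Name": caller.name,
      "X-Service-Role": caller.role,
    },
    body: JSON.stringify(payload),
  };
}
```

The returned object can be handed to an HTTP client, for instance `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, with an `AbortController` enforcing the timeout on the client side.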
Event Flow
// Service A emits event
await events.emit('user:created', { userId: 123 });
// Service B subscribes (local)
events.on('user:created', (event) => {
console.log('User created:', event.data.userId);
});
# Remote subscription (SSE), run from a shell
curl -N "https://core.blackroad.systems/v1/sys/events?event=user:created"
Compliance
All BlackRoad OS services MUST implement:
✅ Required Syscalls:
- /health
- /version
- /v1/sys/identity
- /v1/sys/health
- /v1/sys/rpc
⚠️ Optional Syscalls (recommended):
- /v1/sys/log
- /v1/sys/metric
- /v1/sys/event
- /v1/sys/job
- /v1/sys/state
📋 Testing:
- All syscalls must have tests
- Health checks must complete in < 100ms
- RPC calls must support timeouts
- All responses must be valid JSON
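
A compliance check over a service's route list might look like the sketch below. The two path lists come from the sections above; `checkCompliance` and `ComplianceReport` are hypothetical names, and how a service enumerates its implemented paths is left open.

```typescript
const REQUIRED_SYSCALLS = [
  "/health",
  "/version",
  "/v1/sys/identity",
  "/v1/sys/health",
  "/v1/sys/rpc",
];

const OPTIONAL_SYSCALLS = [
  "/v1/sys/log",
  "/v1/sys/metric",
  "/v1/sys/event",
  "/v1/sys/job",
  "/v1/sys/state",
];

interface ComplianceReport {
  compliant: boolean;        // true when no required syscall is missing
  missingRequired: string[];
  missingOptional: string[];
}

// Compare the paths a service implements against the required and optional lists.
function checkCompliance(implementedPaths: string[]): ComplianceReport {
  const isMissing = (path: string) => implementedPaths.indexOf(path) === -1;
  const missingRequired = REQUIRED_SYSCALLS.filter(isMissing);
  return {
    compliant: missingRequired.length === 0,
    missingRequired,
    missingOptional: OPTIONAL_SYSCALLS.filter(isMissing),
  };
}
```

A check like this would fit a CI step such as `validate-kernel.yml`, failing the build when `compliant` is false.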
References
- Kernel Implementation: kernel/typescript/
- Service Registry: INFRASTRUCTURE.md
- DNS Mapping: infra/DNS.md
- Deployment: docs/RAILWAY_DEPLOYMENT.md