blackroad-infra/hardware/accelerators/ai-compute.md
Alexa Amundson 7465fb8660 Update fleet maps: Lucidia back online with 1TB NVMe + 2nd Hailo-8
- Lucidia confirmed UP at 192.168.4.81 with 916G NVMe (868G free) and hailort running
- Fleet AI compute upgraded from 26 to 52 TOPS (2x Hailo-8: Cecilia + Lucidia)
- Documented /etc/hostname mismatch on Lucidia (says "octavia")
- Cleaned: xmrig-build on 3 nodes, /opt/xmrig on Codex-Infinity, 3.6G RISC-V toolchains
- Rotated 3.8G journal logs on Codex-Infinity (33% → 27% usage)
- Octavia rebooted clean (load 9.47 → 0.86)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 02:12:12 -06:00

# AI Compute Accelerators — Live Verified
**Verified via SSH probes on 2026-02-21.**

> **UPDATE 2026-02-21:** Lucidia came back online and has a **second Hailo-8** confirmed!
> `hailort.service` running on both Cecilia and Lucidia. Octavia and Aria have no Hailo.
> 1 of 3 purchased modules remains unaccounted for.

**Confirmed fleet AI compute: ~67.8 TOPS active** (2x Hailo-8 + M1 Neural Engine)

---
## Accelerator Inventory
| # | Accelerator | Node | TOPS | Interface | Status | Verification |
|---|-------------|------|------|-----------|--------|-------------|
| 1 | Hailo-8 M.2 | Cecilia | 26 | M.2 PCIe | **Active** | `hailort.service` running, `/dev/hailo0` present |
| 2 | Hailo-8 M.2 | Lucidia | 26 | M.2 PCIe | **Active** | `hailort.service` running (confirmed 2026-02-21) |
| 3 | Hailo-8 M.2 | Octavia | 26 | M.2 PCIe | **NOT DETECTED** | No `/dev/hailo*`, no `hailort.service` |
| — | Hailo-8 M.2 | Aria | — | M.2 PCIe | **NOT DETECTED** | No `/dev/hailo*`, no `hailort.service` |
| 4 | Jetson Orin Nano GPU | Jetson-Agent | 40 | Onboard | **Pending** | Dev kit not deployed |
| 5 | Apple M1 Neural Engine | Alexandria | 15.8 | Onboard | **Active** | Mac in use daily |
| 6 | Arm Ethos-U55 NPU (Himax HX6538) | SenseCAP W1-A | ~1 | Onboard | **Returned** | Returned Aug 2025 |
### Compute Budget — Corrected
| Category | TOPS | Status | Notes |
|----------|------|--------|-------|
| Hailo-8 (2x confirmed) | 52 | **Active** | Cecilia + Lucidia |
| Hailo-8 (1x unverified) | 26 | **Unknown** | 3rd module purchased, not detected on Octavia or Aria |
| NVIDIA Jetson Orin Nano | 40 | **Pending** | Dev kit not deployed |
| Apple M1 Neural Engine | 15.8 | **Active** | Alexandria Mac |
| Arm Ethos-U55 | ~1 | **Returned** | SenseCAP Watcher |
| **Confirmed Active** | **67.8** | | 2x Hailo-8 + M1 |
| **Potential (if all working)** | **~133.8** | | +1 Hailo-8 + Jetson |
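
The two totals can be re-derived from the per-category rows. A minimal sketch using `awk` for the decimal arithmetic, with the TOPS figures copied from the table above; `sum_tops` is a hypothetical helper name, not part of any fleet tooling:

```bash
# Sum a list of TOPS values. awk handles the fractional 15.8 that
# integer-only shell arithmetic cannot.
sum_tops() {
  printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.1f\n", total }'
}

# Confirmed active: 2x Hailo-8 (26 each) + M1 Neural Engine (15.8)
confirmed=$(sum_tops 26 26 15.8)

# Potential: confirmed + the unaccounted-for Hailo-8 (26) + Jetson Orin Nano (40)
potential=$(sum_tops "$confirmed" 26 40)

echo "Confirmed active: ${confirmed} TOPS"   # 67.8
echo "Potential:        ${potential} TOPS"   # 133.8
```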
---
## Missing Hailo-8 Investigation
Three Hailo-8 M.2 modules were purchased. Two serial numbers are documented (HLLWM2B233704667, HLLWM2B233704606); the third is unknown. Two modules are confirmed active, on Cecilia and Lucidia. One remains unaccounted for.
### Possible Explanations
1. **Not physically installed** — M.2 modules may still be in packaging or stored separately
2. **Installed but no drivers** — HailoRT runtime not installed on Octavia/Aria
3. **Hardware fault** — M.2 slot or module not functioning
4. **Wrong slot** — Pironman case M.2 slot may be configured for NVMe, not AI accelerator
### Verification Steps
```bash
# On Octavia (ssh octavia):
ls /dev/hailo* # Check for Hailo device nodes
systemctl status hailort # Check for Hailo runtime service
lspci | grep -i hailo # Check PCIe bus for Hailo device
dpkg -l | grep hailo # Check if HailoRT packages installed
# On Aria (ssh aria):
ls /dev/hailo*
systemctl status hailort
lspci | grep -i hailo
dpkg -l | grep hailo
# Physical inspection required:
# 1. Open Pironman cases on Octavia and Aria
# 2. Check M.2 Key M slot — is a Hailo-8 card present?
# 3. If present, install the Hailo stack (on Raspberry Pi OS: sudo apt install hailo-all, which includes HailoRT and the PCIe driver)
```
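
The four probes above can be combined into a single diagnosis that points at one of the listed explanations. This is a sketch under stated assumptions: the function names `diagnose_hailo` and `probe_local` are hypothetical, the service name `hailort` is taken from this document, and the order in which causes are ruled out is one reasonable choice, not an official Hailo procedure.

```bash
# Map four probe results (1 = found, 0 = missing) to the most likely cause.
# Argument order: pci dev_node package service.
diagnose_hailo() {
  local pci=$1 dev=$2 pkg=$3 svc=$4
  if [ "$pci" -eq 0 ]; then
    echo "not on PCIe bus: inspect the M.2 slot (not installed, wrong slot, or hardware fault)"
  elif [ "$pkg" -eq 0 ]; then
    echo "module detected but HailoRT packages missing: install the software stack"
  elif [ "$dev" -eq 0 ]; then
    echo "packages installed but no /dev/hailo* node: check the kernel driver"
  elif [ "$svc" -eq 0 ]; then
    echo "device node present but hailort.service is not running"
  else
    echo "active"
  fi
}

# Collect the probe results on the current host and print the diagnosis.
# Defined here but not called; run it on Octavia and Aria over SSH.
probe_local() {
  local pci=0 dev=0 pkg=0 svc=0
  lspci 2>/dev/null | grep -qi hailo && pci=1
  ls /dev/hailo* >/dev/null 2>&1 && dev=1
  dpkg -l 2>/dev/null | grep -q hailo && pkg=1
  systemctl is-active --quiet hailort 2>/dev/null && svc=1
  diagnose_hailo "$pci" "$dev" "$pkg" "$svc"
}
```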
---
## Hailo-8 M.2 Module
### Specifications
| Spec | Value |
|------|-------|
| Architecture | Hailo-8 |
| Compute | 26 TOPS (INT8) |
| Interface | M.2 Key M (PCIe Gen 3.0 x1) |
| Power | ~2.5W typical |
| Price | $214.99 each (3x = $644.97 total) |
| Compatible Hosts | Raspberry Pi 5 (via HAT), Pironman case |
### Software Stack
- **HailoRT:** Runtime library for model execution
- **Hailo Model Zoo:** Pre-compiled HEF files
- **Hailo TAPPAS:** Application examples and pipelines
- **Hailo Dataflow Compiler:** Convert ONNX/TF models to HEF format
### Detection & Management
```bash
# Detect Hailo devices
hailortcli scan
# Identify firmware version
hailortcli fw-control identify
# Run inference benchmark
hailortcli benchmark /usr/share/hailo-models/yolov5m_wo_spp_60p.hef
# List installed models
ls /usr/share/hailo-models/*.hef
# Check installed packages
dpkg -l | grep hailo
# Management script
~/hailo.sh
```
### Benchmark Results (Cecilia only)
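Hailo-8 results from BlackRoad testing on Cecilia. The Jetson is not yet deployed, so the efficiency comparison uses rated figures from the Power Efficiency Comparison table:
- **Power Efficiency:** 10.4 TOPS/W rated, about 3.9x the Jetson Orin Nano's 2.7 TOPS/W at its 15W TDP
- **YOLOv5m:** Real-time 30+ FPS at 2.5W power draw
- **Latency:** Sub-10ms inference for object detection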
### Model Compatibility
| Model | Format | Use Case | Status |
|-------|--------|----------|--------|
| YOLOv5m | HEF | Object detection | Compiled |
| YOLOv8n/s/m | HEF | Object detection | Compiled |
| ResNet-50 | HEF | Image classification | Compiled |
| MobileNet v2 | HEF | Classification (lightweight) | Compiled |
| SSD MobileNet | HEF | Detection (lightweight) | Compiled |
| Custom models | ONNX → HEF | Via Dataflow Compiler | Supported |
---
## Ollama Deployment (4 nodes)
Ollama runs on 4 of 6 reachable nodes, providing LLM inference across the fleet:
| Node | Binding | Security | Status |
|------|---------|----------|--------|
| Cecilia | 127.0.0.1:11434 | Localhost only | **Secure** |
| Octavia | 127.0.0.1:11434 | Localhost only | **Secure** |
| Shellfish | 100.64.0.1:11434 | Tailscale interface | **Secure** |
| Codex-Infinity | **0.0.0.0:11434** | **ALL INTERFACES** | **INSECURE** |
> **ACTION:** Fix the Codex-Infinity Ollama binding immediately. Port 11434 on public IP 159.65.43.12 is
> reachable by anyone on the internet, and the Ollama API has no built-in authentication.
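
One way to audit and close this is sketched below. It assumes Ollama on Codex-Infinity runs as the stock systemd service named `ollama` and takes its bind address from the `OLLAMA_HOST` environment variable. The `is_exposed` helper is hypothetical, written here for illustration, and the node list is copied from the table above.

```bash
# Return success (0) if a host:port binding listens on all interfaces.
is_exposed() {
  case "$1" in
    0.0.0.0:*|\[::\]:*|\*:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Flag any node whose Ollama binding is a wildcard address.
for entry in "Cecilia=127.0.0.1:11434" "Octavia=127.0.0.1:11434" \
             "Shellfish=100.64.0.1:11434" "Codex-Infinity=0.0.0.0:11434"; do
  node=${entry%%=*}
  addr=${entry#*=}
  if is_exposed "$addr"; then
    echo "$node: EXPOSED ($addr)"
  fi
done

# Fix on Codex-Infinity (assumes the stock systemd unit named "ollama"):
#   sudo systemctl edit ollama
#     [Service]
#     Environment="OLLAMA_HOST=127.0.0.1:11434"
#   sudo systemctl restart ollama
#   ss -ltn | grep 11434    # should now list 127.0.0.1:11434 only
```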
---
## NVIDIA Jetson Orin Nano
### Specifications
| Spec | Value |
|------|-------|
| GPU | NVIDIA Ampere (1024 CUDA cores) |
| AI Compute | 40 TOPS (INT8) |
| CPU | 6-core Arm Cortex-A78AE |
| RAM | 8GB LPDDR5 |
| Storage | microSD + NVMe M.2 |
| Power | 7-15W configurable TDP |
| Price | $114.29 (base dev kit) |
| Display | DisplayPort (the Orin Nano dev kit has no HDMI port) |
| Status | **Pending initial setup** |
### Software Stack
- **JetPack SDK:** Ubuntu-based OS with CUDA, cuDNN, TensorRT
- **TensorRT:** Optimized inference engine
- **DeepStream:** Video analytics SDK
- **Ollama:** LLM inference via CUDA
### Capabilities
| Task | Framework | Notes |
|------|-----------|-------|
| LLM inference | Ollama (CUDA) | Llama 2 7B, Mistral 7B |
| Object detection | TensorRT | YOLOv8 real-time |
| Speech-to-text | Whisper (CUDA) | Real-time transcription |
| Image generation | Stable Diffusion | Small models only (8GB RAM) |
| Video analytics | DeepStream | Multi-stream pipeline |
---
## Apple M1 Neural Engine
| Spec | Value |
|------|-------|
| Architecture | Apple Neural Engine (16-core) |
| AI Compute | 15.8 TOPS |
| Host | MacBook Pro M1 (Alexandria) |
| Framework | CoreML, MLX |
| Status | **Active** (daily use) |
---
## Arm Ethos-U55 NPU (SenseCAP Watcher — Returned)
| Spec | Value |
|------|-------|
| Architecture | Arm Ethos-U55 microNPU |
| Host Processor | Arm Cortex-M55 (Himax HX6538) |
| AI Compute | ~1 TOPS (INT8) |
| Device | SenseCAP Watcher W1-A |
| Status | **Returned** (August 2025) |
---
## Power Efficiency Comparison
| Accelerator | TOPS | Power (W) | TOPS/W | Status |
|-------------|------|-----------|--------|--------|
| Hailo-8 | 26 | 2.5 | **10.4** | 2 active, 1 unaccounted for |
| Jetson Orin Nano | 40 | 15 | 2.7 | Pending setup |
| M1 Neural Engine | 15.8 | ~5 | 3.2 | Active |
| Ethos-U55 | ~1 | 0.05 | 20.0 | Returned |
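
The TOPS/W column is rated TOPS divided by power draw. A minimal sketch that reproduces it from the table's own figures; `tops_per_watt` is a hypothetical helper name. These are spec-sheet values, not measurements, and the M1 power and Ethos-U55 TOPS inputs are the approximations already marked with `~` in the table.

```bash
# Efficiency = rated TOPS / power in watts, rounded to one decimal place.
tops_per_watt() {
  awk -v tops="$1" -v watts="$2" 'BEGIN { printf "%.1f\n", tops / watts }'
}

tops_per_watt 26 2.5     # Hailo-8                          -> 10.4
tops_per_watt 40 15      # Jetson Orin Nano at its 15W max  -> 2.7
tops_per_watt 15.8 5     # M1 Neural Engine (approx. power) -> 3.2
tops_per_watt 1 0.05     # Ethos-U55 (approx. TOPS)         -> 20.0
```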
---
## Model Compatibility Matrix
| Model | Hailo-8 (HEF) | Jetson (TRT) | M1 (CoreML) | Ethos-U55 (TFLite) |
|-------|---------------|--------------|-------------|---------------------|
| YOLOv5m | Yes | Yes | Yes | — |
| YOLOv8n | Yes | Yes | Yes | — |
| ResNet-50 | Yes | Yes | Yes | — |
| MobileNet v2 | Yes | Yes | Yes | Yes |
| Llama 2 7B | — | Yes (CUDA) | Yes (Metal) | — |
| Whisper | — | Yes (CUDA) | Yes (Metal) | — |
| Stable Diffusion | — | Yes (limited) | Yes (MLX) | — |
| Person Detection | Yes | Yes | Yes | Yes |