blackroad-infra/hardware/accelerators/ai-compute.md
Alexa Amundson 7465fb8660 Update fleet maps: Lucidia back online with 1TB NVMe + 2nd Hailo-8
- Lucidia confirmed UP at 192.168.4.81 with 916G NVMe (868G free) and hailort running
- Fleet AI compute upgraded from 26 to 52 TOPS (2x Hailo-8: Cecilia + Lucidia)
- Documented /etc/hostname mismatch on Lucidia (says "octavia")
- Cleaned: xmrig-build on 3 nodes, /opt/xmrig on Codex-Infinity, 3.6G RISC-V toolchains
- Rotated 3.8G journal logs on Codex-Infinity (33% → 27% usage)
- Octavia rebooted clean (load 9.47 → 0.86)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 02:12:12 -06:00

AI Compute Accelerators — Live Verified

Verified via SSH probes on 2026-02-21.

UPDATE 2026-02-21: Lucidia is back online with a second Hailo-8 confirmed. hailort.service is running on both Cecilia and Lucidia. No Hailo device was detected on Octavia or Aria. 1 of 3 purchased modules remains unaccounted for.

Confirmed fleet AI compute: ~67.8 TOPS active (2x Hailo-8 + M1 Neural Engine)


Accelerator Inventory

| # | Accelerator | Node | TOPS | Interface | Status | Verification |
|---|-------------|------|------|-----------|--------|--------------|
| 1 | Hailo-8 M.2 | Cecilia | 26 | M.2 PCIe | Active | hailort.service running, /dev/hailo0 present |
| 2 | Hailo-8 M.2 | Lucidia | 26 | M.2 PCIe | Active | hailort.service running (confirmed 2026-02-21) |
| 3 | Hailo-8 M.2 | Octavia | 26 | M.2 PCIe | NOT DETECTED | No /dev/hailo*, no hailort.service |
|   | Hailo-8 M.2 | Aria |  | M.2 PCIe | NOT DETECTED | No /dev/hailo*, no hailort.service |
| 4 | Jetson Orin Nano GPU | Jetson-Agent | 40 | Onboard | Pending | Dev kit not deployed |
| 5 | Apple M1 Neural Engine | Alexandria | 15.8 | Onboard | Active | Mac in use daily |
| 6 | Himax Ethos-U55 NPU | SenseCAP W1-A | ~1 | Onboard | Returned | Returned Aug 2025 |

Compute Budget — Corrected

| Category | TOPS | Status | Notes |
|----------|------|--------|-------|
| Hailo-8 (2x confirmed) | 52 | Active | Cecilia + Lucidia |
| Hailo-8 (1x unverified) | 26 | Unknown | 3rd module purchased, not detected on Octavia or Aria |
| NVIDIA Jetson Orin Nano | 40 | Pending | Dev kit not deployed |
| Apple M1 Neural Engine | 15.8 | Active | Alexandria Mac |
| Arm Ethos-U55 | ~1 | Returned | SenseCAP Watcher |
| Confirmed Active | 67.8 |  | 2x Hailo-8 (52) + M1 (15.8) |
| Potential (if all working) | 133.8 |  | Confirmed Active + 1 Hailo-8 (26) + Jetson (40) |
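
The two totals are plain sums of the rows above. A minimal sketch of that arithmetic, with the figures copied from the table (the variable names are illustrative only):

```bash
#!/usr/bin/env bash
# TOPS figures copied from the Compute Budget table.
hailo_active=52      # 2x Hailo-8 (Cecilia + Lucidia)
m1=15.8              # Apple M1 Neural Engine, as listed in this document
hailo_missing=26     # third Hailo-8, not yet located
jetson=40            # Jetson Orin Nano, not yet deployed

# Shell arithmetic is integer-only, so awk handles the decimal sums.
confirmed=$(awk -v a="$hailo_active" -v b="$m1" \
  'BEGIN { printf "%.1f", a + b }')
potential=$(awk -v c="$confirmed" -v h="$hailo_missing" -v j="$jetson" \
  'BEGIN { printf "%.1f", c + h + j }')

echo "Confirmed active: ${confirmed} TOPS"
echo "Potential:        ${potential} TOPS"
```

The returned Ethos-U55 is left out of both sums because it is no longer in the fleet.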

Missing Hailo-8 Investigation

Three Hailo-8 M.2 modules were purchased (documented serial numbers: HLLWM2B233704667 and HLLWM2B233704606; the third is unknown). Two are confirmed active, on Cecilia and Lucidia. One remains unaccounted for.

Possible Explanations

  1. Not physically installed: the module may still be in its packaging or stored separately
  2. Installed but no drivers: HailoRT not installed on Octavia/Aria
  3. Hardware fault: M.2 slot or module not functioning
  4. Wrong slot: the Pironman case M.2 slot may be configured for NVMe, not an AI accelerator

Verification Steps

# On Octavia (ssh octavia):
ls /dev/hailo*                    # Check for Hailo device nodes
systemctl status hailort          # Check for Hailo runtime service
lspci | grep -i hailo             # Check PCIe bus for Hailo device
dpkg -l | grep hailo              # Check if HailoRT packages installed

# On Aria (ssh aria):
ls /dev/hailo*
systemctl status hailort
lspci | grep -i hailo
dpkg -l | grep hailo

# Physical inspection required:
# 1. Open Pironman cases on Octavia and Aria
# 2. Check M.2 Key M slot — is a Hailo-8 card present?
# 3. If present, install HailoRT: sudo apt install hailort
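
The four remote checks can be run on both nodes in one pass. The sketch below assumes key-based SSH and that the `octavia` and `aria` host aliases resolve as in the commands above. The function names `probe_hailo` and `probe_fleet` are hypothetical helpers, not existing fleet scripts.

```bash
#!/usr/bin/env bash
# Run the four Hailo checks on one host and print a one-line summary.
# Prints "unreachable" if the SSH connection itself fails.
probe_hailo() {
  local host="$1"
  ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" '
    dev=no; ls /dev/hailo* >/dev/null 2>&1 && dev=yes
    svc=no; systemctl is-active --quiet hailort && svc=yes
    pci=no; lspci 2>/dev/null | grep -qi hailo && pci=yes
    pkg=no; dpkg -l 2>/dev/null | grep -qi hailo && pkg=yes
    echo "device=$dev service=$svc pcie=$pci packages=$pkg"
  ' 2>/dev/null || echo "unreachable"
}

# Probe every host given on the command line, one summary line per host.
probe_fleet() {
  local host
  for host in "$@"; do
    printf '%-10s %s\n' "$host" "$(probe_hailo "$host")"
  done
}

# Usage: probe_fleet octavia aria
```

Reading the result against the possible explanations: `pcie=yes` with `packages=no` points to missing drivers (explanation 2), while `pcie=no` leaves explanations 1, 3 and 4 open and calls for the physical inspection.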

Hailo-8 M.2 Module

Specifications

| Spec | Value |
|------|-------|
| Architecture | Hailo-8 |
| Compute | 26 TOPS (INT8) |
| Interface | M.2 Key M (PCIe Gen 3.0 x1) |
| Power | ~2.5W typical |
| Price | $214.99 each (3x = $644.97 total) |
| Compatible Hosts | Raspberry Pi 5 (via HAT), Pironman case |

Software Stack

  • HailoRT: Runtime library for model execution
  • Hailo Model Zoo: Pre-compiled HEF files
  • Hailo TAPPAS: Application examples and pipelines
  • Hailo Dataflow Compiler: Convert ONNX/TF models to HEF format

Detection & Management

# Detect Hailo devices
hailortcli scan

# Identify firmware version
hailortcli fw-control identify

# Run inference benchmark
hailortcli benchmark /usr/share/hailo-models/yolov5m_wo_spp_60p.hef

# List installed models
ls /usr/share/hailo-models/*.hef

# Check installed packages
dpkg -l | grep hailo

# Management script
~/hailo.sh
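
The detection commands above can be chained into a single per-node health check. This is a minimal sketch: `hailo_health` is a hypothetical helper (not the contents of `~/hailo.sh`), and the model directory default is taken from the paths used above.

```bash
#!/usr/bin/env bash
# Quick health check built from the detection commands in this section.
# Optional argument: directory holding compiled .hef models.
hailo_health() {
  local model_dir="${1:-/usr/share/hailo-models}"
  if ! command -v hailortcli >/dev/null 2>&1; then
    echo "hailortcli not found: HailoRT is not installed on this node"
    return 1
  fi
  hailortcli scan || return 1                  # list attached Hailo devices
  hailortcli fw-control identify || return 1   # report firmware and board info
  # Count compiled models; a missing or empty directory counts as zero.
  local n
  n=$(find "$model_dir" -maxdepth 1 -name '*.hef' 2>/dev/null | wc -l)
  echo "HEF models in $model_dir: $((n))"
}

# Usage: hailo_health            # default model directory
#        hailo_health ~/models   # custom directory
```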

Benchmark Results (Cecilia only)

Hailo-8 vs NVIDIA Jetson benchmarks (from BlackRoad testing):

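  • Power Efficiency: 15-30x more efficient than NVIDIA Jetson (TOPS/Watt) reported; the nominal spec figures under Power Efficiency Comparison (10.4 vs 2.7 TOPS/W) give roughly 4x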
  • YOLOv5m: Real-time 30+ FPS at 2.5W power draw
  • Latency: Sub-10ms inference for object detection

Model Compatibility

| Model | Format | Use Case | Status |
|-------|--------|----------|--------|
| YOLOv5m | HEF | Object detection | Compiled |
| YOLOv8n/s/m | HEF | Object detection | Compiled |
| ResNet-50 | HEF | Image classification | Compiled |
| MobileNet v2 | HEF | Classification (lightweight) | Compiled |
| SSD MobileNet | HEF | Detection (lightweight) | Compiled |
| Custom models | ONNX → HEF | Via Dataflow Compiler | Supported |

Ollama Deployment (4 nodes)

Ollama runs on 4 of 6 reachable nodes, providing LLM inference across the fleet:

| Node | Binding | Security | Status |
|------|---------|----------|--------|
| Cecilia | 127.0.0.1:11434 | Localhost only | Secure |
| Octavia | 127.0.0.1:11434 | Localhost only | Secure |
| Shellfish | 100.64.0.1:11434 | Tailscale interface | Secure |
| Codex-Infinity | 0.0.0.0:11434 | ALL INTERFACES | INSECURE |

ACTION: Fix Codex-Infinity Ollama binding immediately. Public IP 159.65.43.12:11434 is accessible to anyone on the internet.
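
One way to apply the fix is to pin the service to loopback with a systemd drop-in, sketched below. `OLLAMA_HOST` is Ollama's bind-address variable. The unit name `ollama` and the drop-in path are assumptions about how Ollama was installed on Codex-Infinity, and `write_ollama_override` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Write a systemd drop-in that binds Ollama to loopback only.
# Optional argument: drop-in directory (defaults to the ollama.service one).
write_ollama_override() {
  local dir="${1:-/etc/systemd/system/ollama.service.d}"
  mkdir -p "$dir"
  cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=127.0.0.1:11434"
EOF
}

# On Codex-Infinity, as root:
#   write_ollama_override
#   systemctl daemon-reload && systemctl restart ollama
#   ss -tln | grep 11434    # expect 127.0.0.1:11434, not 0.0.0.0:11434
```

If remote access is still needed, binding to the node's Tailscale address (as Shellfish does) keeps the port off the public interface. A firewall rule on port 11434 is a reasonable second layer either way.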


NVIDIA Jetson Orin Nano

Specifications

| Spec | Value |
|------|-------|
| GPU | NVIDIA Ampere (1024 CUDA cores) |
| AI Compute | 40 TOPS (INT8) |
| CPU | 6-core Arm Cortex-A78AE |
| RAM | 8GB LPDDR5 |
| Storage | microSD + NVMe M.2 |
| Power | 7-15W configurable TDP |
| Price | $114.29 (base dev kit) |
| Display | HDMI + DisplayPort |
| Status | Pending initial setup |

Software Stack

  • JetPack SDK: Ubuntu-based OS with CUDA, cuDNN, TensorRT
  • TensorRT: Optimized inference engine
  • DeepStream: Video analytics SDK
  • Ollama: LLM inference via CUDA

Capabilities

| Task | Framework | Notes |
|------|-----------|-------|
| LLM inference | Ollama (CUDA) | Llama 2 7B, Mistral 7B |
| Object detection | TensorRT | YOLOv8 real-time |
| Speech-to-text | Whisper (CUDA) | Real-time transcription |
| Image generation | Stable Diffusion | Small models only (8GB RAM) |
| Video analytics | DeepStream | Multi-stream pipeline |

Apple M1 Neural Engine

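| Spec | Value |
|------|-------|
| Architecture | Apple Neural Engine (16-core) |
| AI Compute | 15.8 TOPS (Apple's published figure for the M1 is 11 TOPS; 15.8 TOPS is the M2 figure, so the chip generation should be confirmed) |
| Host | MacBook Pro M1 (Alexandria) |
| Framework | CoreML, MLX |
| Status | Active (daily use) |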

Arm Ethos-U55 NPU (SenseCAP Watcher — Returned)

| Spec | Value |
|------|-------|
| Architecture | Arm Ethos-U55 microNPU |
| Host Processor | Arm Cortex-M55 (Himax HX6538) |
| AI Compute | ~1 TOPS (INT8) |
| Device | SenseCAP Watcher W1-A |
| Status | Returned (August 2025) |

Power Efficiency Comparison

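| Accelerator | TOPS | Power (W) | TOPS/W | Status |
|-------------|------|-----------|--------|--------|
| Hailo-8 | 26 | 2.5 | 10.4 | 2 active, 1 unaccounted for |
| Jetson Orin Nano | 40 | 15 | 2.7 | Pending setup |
| M1 Neural Engine | 15.8 | ~5 | 3.2 | Active |
| Ethos-U55 | ~1 | 0.05 | 20.0 | Returned |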
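
The TOPS/W column is TOPS divided by power draw. A minimal sketch that reproduces it from the table's own figures, treating the approximate values (~1 TOPS, ~5 W) as exact (`tops_per_watt` is an illustrative helper name):

```bash
#!/usr/bin/env bash
# Efficiency = TOPS / watts, rounded to one decimal place.
tops_per_watt() {
  awk -v t="$1" -v w="$2" 'BEGIN { printf "%.1f", t / w }'
}

# Inputs copied from the Power Efficiency Comparison table.
echo "Hailo-8:          $(tops_per_watt 26 2.5) TOPS/W"
echo "Jetson Orin Nano: $(tops_per_watt 40 15) TOPS/W"   # at the 15W TDP ceiling
echo "M1 Neural Engine: $(tops_per_watt 15.8 5) TOPS/W"
echo "Ethos-U55:        $(tops_per_watt 1 0.05) TOPS/W"
```

The Jetson row uses the top of its 7-15W configurable range, so its efficiency at a lower power mode would depend on how much throughput that mode retains.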

Model Compatibility Matrix

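| Model | Hailo-8 (HEF) | Jetson (TRT) | M1 (CoreML) | Ethos-U55 (TFLite) |
|-------|---------------|--------------|-------------|--------------------|
| YOLOv5m | Yes | Yes | Yes |  |
| YOLOv8n | Yes | Yes | Yes |  |
| ResNet-50 | Yes | Yes | Yes |  |
| MobileNet v2 | Yes | Yes | Yes | Yes |
| Llama 2 7B |  | Yes (CUDA) | Yes (Metal) |  |
| Whisper |  | Yes (CUDA) | Yes (Metal) |  |
| Stable Diffusion |  | Yes (limited) | Yes (MLX) |  |
| Person Detection | Yes | Yes | Yes | Yes |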