# AI Compute Accelerators — Live Verified

**Verified via SSH probes on 2026-02-21.**

> **UPDATE 2026-02-21:** Lucidia is back online, and a **second Hailo-8** is confirmed on it.
> `hailort.service` is running on both Cecilia and Lucidia. Octavia and Aria have no Hailo.
> 1 of 3 purchased modules remains unaccounted for.

**Confirmed fleet AI compute: ~67.8 TOPS active** (2x Hailo-8 + M1 Neural Engine)

---

## Accelerator Inventory

| # | Accelerator | Node | TOPS | Interface | Status | Verification |
|---|-------------|------|------|-----------|--------|--------------|
| 1 | Hailo-8 M.2 | Cecilia | 26 | M.2 PCIe | **Active** | `hailort.service` running, `/dev/hailo0` present |
| 2 | Hailo-8 M.2 | Lucidia | 26 | M.2 PCIe | **Active** | `hailort.service` running (confirmed 2026-02-21) |
| 3 | Hailo-8 M.2 | Octavia | 26 | M.2 PCIe | **NOT DETECTED** | No `/dev/hailo*`, no `hailort.service` |
| — | Hailo-8 M.2 | Aria | — | M.2 PCIe | **NOT DETECTED** | No `/dev/hailo*`, no `hailort.service` |
| 4 | Jetson Orin Nano GPU | Jetson-Agent | 40 | Onboard | **Pending** | Dev kit not deployed |
| 5 | Apple M1 Neural Engine | Alexandria | 15.8 | Onboard | **Active** | Mac in use daily |
| 6 | Arm Ethos-U55 NPU | SenseCAP W1-A | ~1 | Onboard | **Returned** | Returned Aug 2025 |

### Compute Budget — Corrected

| Category | TOPS | Status | Notes |
|----------|------|--------|-------|
| Hailo-8 (2x confirmed) | 52 | **Active** | Cecilia + Lucidia |
| Hailo-8 (1x unverified) | 26 | **Unknown** | 3rd module purchased, not detected on Octavia or Aria |
| NVIDIA Jetson Orin Nano | 40 | **Pending** | Dev kit not deployed |
| Apple M1 Neural Engine | 15.8 | **Active** | Alexandria Mac |
| Arm Ethos-U55 | ~1 | **Returned** | SenseCAP Watcher |
| **Confirmed Active** | **67.8** | | 2x Hailo-8 + M1 |
| **Potential (if all working)** | **~133.8** | | +1 Hailo-8 + Jetson |
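
The totals in the budget can be rechecked with a few lines of shell. This is a minimal sketch: the `node|tops|status` row format and the `inventory` and `sum_tops` names are assumptions made for the example, and the returned Ethos-U55 is left out because it is no longer in the fleet.

```bash
#!/usr/bin/env bash
# Sketch: recompute the compute-budget totals from the inventory rows.
# The node|tops|status layout is hypothetical, chosen only for this example.

inventory() {
  cat <<'EOF'
Cecilia|26|active
Lucidia|26|active
Unaccounted Hailo-8|26|unknown
Jetson-Agent|40|pending
Alexandria|15.8|active
EOF
}

# Sum the TOPS column for rows whose status matches the regex in $1.
sum_tops() {
  inventory | awk -F'|' -v want="$1" \
    '$3 ~ want { total += $2 } END { printf "%.1f\n", total }'
}

echo "Confirmed active: $(sum_tops '^active$') TOPS"
echo "Potential:        $(sum_tops '^(active|unknown|pending)$') TOPS"
```

With the figures above, the active rows sum to 67.8 TOPS, and adding the unaccounted Hailo-8 and the Jetson gives 133.8 TOPS.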

---

## Missing Hailo-8 Investigation

Three Hailo-8 M.2 modules were purchased (documented serial numbers: HLLWM2B233704667 and HLLWM2B233704606; the third is unknown). Two are confirmed active, on Cecilia and Lucidia. One remains unaccounted for.

### Possible Explanations

1. **Not physically installed** — M.2 modules may still be in packaging or stored separately
2. **Installed but no drivers** — HailoRT runtime not installed on Octavia/Aria
3. **Hardware fault** — M.2 slot or module not functioning
4. **Wrong slot** — Pironman case M.2 slot may be configured for NVMe, not AI accelerator

### Verification Steps

```bash
# On Octavia (ssh octavia):
ls /dev/hailo*              # Check for Hailo device nodes
systemctl status hailort    # Check for Hailo runtime service
lspci | grep -i hailo       # Check PCIe bus for Hailo device
dpkg -l | grep hailo        # Check if HailoRT packages installed

# On Aria (ssh aria):
ls /dev/hailo*
systemctl status hailort
lspci | grep -i hailo
dpkg -l | grep hailo

# Physical inspection required:
# 1. Open Pironman cases on Octavia and Aria
# 2. Check M.2 Key M slot — is a Hailo-8 card present?
# 3. If present, install HailoRT: sudo apt install hailort
```
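
The manual checks can be folded into one decision, so each node reports which explanation fits best. The sketch below is illustrative only: `diagnose` and `probe_flags` are hypothetical helper names, the result labels are invented for the example, and a host without `lspci` or `dpkg` is treated as "not seen" rather than as an error.

```bash
#!/usr/bin/env bash
# Sketch: map three probe results onto the possible explanations.
# Function names and result labels are assumptions, not part of HailoRT.

# diagnose PCI_SEEN PKG_INSTALLED DEV_NODE   (each argument is 1 or 0)
diagnose() {
  local pci="$1" pkg="$2" dev="$3"
  if [ "$pci" -eq 0 ]; then
    echo "no-pcie-device: module absent, wrong slot, or hardware fault (inspect the case)"
  elif [ "$pkg" -eq 0 ]; then
    echo "no-runtime: module visible on PCIe but HailoRT packages are missing"
  elif [ "$dev" -eq 0 ]; then
    echo "no-device-node: packages installed but /dev/hailo* is missing (check driver and service)"
  else
    echo "detected: module, packages, and device node all present"
  fi
}

# Collect the flags on the current host with the same checks as the manual steps.
probe_flags() {
  local pci=0 pkg=0 dev=0
  if command -v lspci >/dev/null 2>&1 && lspci | grep -qi hailo; then pci=1; fi
  if command -v dpkg >/dev/null 2>&1 && dpkg -l 2>/dev/null | grep -qi hailo; then pkg=1; fi
  if compgen -G '/dev/hailo*' >/dev/null; then dev=1; fi
  echo "$pci $pkg $dev"
}

# shellcheck disable=SC2046  # splitting the three flags into arguments is intended
diagnose $(probe_flags)
```

A `no-pcie-device` result cannot be narrowed further over SSH, so it still calls for opening the case.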

---

## Hailo-8 M.2 Module

### Specifications

| Spec | Value |
|------|-------|
| Architecture | Hailo-8 |
| Compute | 26 TOPS (INT8) |
| Interface | M.2 Key M (PCIe Gen 3.0 x1) |
| Power | ~2.5W typical |
| Price | $214.99 each (3x = $644.97 total) |
| Compatible Hosts | Raspberry Pi 5 (via HAT), Pironman case |

### Software Stack

- **HailoRT:** Runtime library for model execution
- **Hailo Model Zoo:** Pre-compiled HEF files
- **Hailo TAPPAS:** Application examples and pipelines
- **Hailo Dataflow Compiler:** Convert ONNX/TF models to HEF format

### Detection & Management

```bash
# Detect Hailo devices
hailortcli scan

# Identify firmware version
hailortcli fw-control identify

# Run inference benchmark
hailortcli benchmark --hef /usr/share/hailo-models/yolov5m_wo_spp_60p.hef

# List installed models
ls /usr/share/hailo-models/*.hef

# Check installed packages
dpkg -l | grep hailo

# Management script
~/hailo.sh
```

### Benchmark Results (Cecilia only)

Hailo-8 vs NVIDIA Jetson benchmarks (from BlackRoad testing):

- **Power Efficiency:** 15-30x more efficient than NVIDIA Jetson (TOPS/Watt)
- **YOLOv5m:** Real-time 30+ FPS at 2.5W power draw
- **Latency:** Sub-10ms inference for object detection

### Model Compatibility

| Model | Format | Use Case | Status |
|-------|--------|----------|--------|
| YOLOv5m | HEF | Object detection | Compiled |
| YOLOv8n/s/m | HEF | Object detection | Compiled |
| ResNet-50 | HEF | Image classification | Compiled |
| MobileNet v2 | HEF | Classification (lightweight) | Compiled |
| SSD MobileNet | HEF | Detection (lightweight) | Compiled |
| Custom models | ONNX → HEF | Via Dataflow Compiler | Supported |

---

## Ollama Deployment (4 nodes)

Ollama runs on 4 of 6 reachable nodes, providing LLM inference across the fleet:

| Node | Binding | Security | Status |
|------|---------|----------|--------|
| Cecilia | 127.0.0.1:11434 | Localhost only | **Secure** |
| Octavia | 127.0.0.1:11434 | Localhost only | **Secure** |
| Shellfish | 100.64.0.1:11434 | Tailscale interface | **Secure** |
| Codex-Infinity | **0.0.0.0:11434** | **ALL INTERFACES** | **INSECURE** |

> **ACTION:** Fix the Codex-Infinity Ollama binding immediately. The public endpoint 159.65.43.12:11434 is
> reachable by anyone on the internet.
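
The same classification can be scripted so that a wildcard binding is flagged on any node. This is a minimal sketch: `classify_bind` and its labels are hypothetical, and it assumes IPv4 `host:port` strings as shown in the table. The commented remediation assumes Ollama runs under the stock systemd unit `ollama.service`, which reads its listen address from the `OLLAMA_HOST` environment variable; adjust it if Codex-Infinity starts Ollama some other way.

```bash
#!/usr/bin/env bash
# Sketch: classify an Ollama listen address (host:port) by exposure.
# The function name and labels are assumptions made for this example.

classify_bind() {
  local host="${1%:*}"   # strip the trailing :port
  case "$host" in
    127.*|localhost)
      echo "secure: loopback only" ;;
    100.6[4-9].*|100.[7-9][0-9].*|100.1[01][0-9].*|100.12[0-7].*)
      echo "secure: Tailscale CGNAT range (100.64.0.0/10)" ;;
    0.0.0.0|"[::]")
      echo "INSECURE: listening on all interfaces" ;;
    *)
      echo "review: bound to a specific non-loopback address" ;;
  esac
}

for entry in Cecilia=127.0.0.1:11434 Octavia=127.0.0.1:11434 \
             Shellfish=100.64.0.1:11434 Codex-Infinity=0.0.0.0:11434; do
  printf '%-15s %s\n' "${entry%%=*}" "$(classify_bind "${entry#*=}")"
done

# Possible fix on Codex-Infinity (run as root; assumes the stock systemd unit):
#   mkdir -p /etc/systemd/system/ollama.service.d
#   printf '[Service]\nEnvironment="OLLAMA_HOST=127.0.0.1:11434"\n' \
#     > /etc/systemd/system/ollama.service.d/override.conf
#   systemctl daemon-reload && systemctl restart ollama
#   ss -ltn | grep 11434    # confirm the new listen address
```

If remote access is still needed, binding to the node's Tailscale address instead of loopback would match what Shellfish already does.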

---

## NVIDIA Jetson Orin Nano

### Specifications

| Spec | Value |
|------|-------|
| GPU | NVIDIA Ampere (1024 CUDA cores) |
| AI Compute | 40 TOPS (INT8) |
| CPU | 6-core Arm Cortex-A78AE |
| RAM | 8GB LPDDR5 |
| Storage | microSD + NVMe M.2 |
| Power | 7-15W configurable TDP |
| Price | $114.29 (base dev kit) |
| Display | DisplayPort (the developer kit carrier board has no HDMI port) |
| Status | **Pending initial setup** |

### Software Stack

- **JetPack SDK:** Ubuntu-based OS with CUDA, cuDNN, TensorRT
- **TensorRT:** Optimized inference engine
- **DeepStream:** Video analytics SDK
- **Ollama:** LLM inference via CUDA

### Capabilities

| Task | Framework | Notes |
|------|-----------|-------|
| LLM inference | Ollama (CUDA) | Llama 2 7B, Mistral 7B |
| Object detection | TensorRT | YOLOv8 real-time |
| Speech-to-text | Whisper (CUDA) | Real-time transcription |
| Image generation | Stable Diffusion | Small models only (8GB RAM) |
| Video analytics | DeepStream | Multi-stream pipeline |

---

## Apple M1 Neural Engine

| Spec | Value |
|------|-------|
| Architecture | Apple Neural Engine (16-core) |
| AI Compute | 15.8 TOPS |
| Host | MacBook Pro M1 (Alexandria) |
| Framework | CoreML, MLX |
| Status | **Active** (daily use) |

---

## Arm Ethos-U55 NPU (SenseCAP Watcher — Returned)

| Spec | Value |
|------|-------|
| Architecture | Arm Ethos-U55 microNPU |
| Host Processor | Arm Cortex-M55 (Himax HX6538) |
| AI Compute | ~1 TOPS (INT8) |
| Device | SenseCAP Watcher W1-A |
| Status | **Returned** (August 2025) |

---

## Power Efficiency Comparison

| Accelerator | TOPS | Power (W) | TOPS/W | Status |
|-------------|------|-----------|--------|--------|
| Hailo-8 | 26 | 2.5 | **10.4** | 2 active, 1 unaccounted for |
| Jetson Orin Nano | 40 | 15 | 2.7 | Pending setup |
| M1 Neural Engine | 15.8 | ~5 | 3.2 | Active |
| Ethos-U55 | ~1 | 0.05 | 20.0 | Returned |
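
The TOPS/W column is simply TOPS divided by power draw, rounded to one decimal place. The sketch below recomputes it from the nominal figures in the table; `tops_per_watt` is a hypothetical helper name, and the approximate values (`~5`, `~1`) are entered as exact numbers for the arithmetic.

```bash
#!/usr/bin/env bash
# Sketch: recompute the TOPS/W column from the TOPS and power figures.

# tops_per_watt TOPS WATTS -> ratio rounded to one decimal place
tops_per_watt() {
  awk -v tops="$1" -v watts="$2" 'BEGIN { printf "%.1f\n", tops / watts }' </dev/null
}

while read -r name tops watts; do
  printf '%-18s %5s TOPS / %5s W = %5s TOPS/W\n' \
    "$name" "$tops" "$watts" "$(tops_per_watt "$tops" "$watts")"
done <<'EOF'
Hailo-8 26 2.5
Jetson-Orin-Nano 40 15
M1-Neural-Engine 15.8 5
Ethos-U55 1 0.05
EOF
```

These are nameplate ratios (peak TOPS over typical or maximum power), so they describe the spec sheets rather than measured workloads.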

---

## Model Compatibility Matrix

| Model | Hailo-8 (HEF) | Jetson (TRT) | M1 (CoreML) | Ethos-U55 (TFLite) |
|-------|---------------|--------------|-------------|---------------------|
| YOLOv5m | Yes | Yes | Yes | — |
| YOLOv8n | Yes | Yes | Yes | — |
| ResNet-50 | Yes | Yes | Yes | — |
| MobileNet v2 | Yes | Yes | Yes | Yes |
| Llama 2 7B | — | Yes (CUDA) | Yes (Metal) | — |
| Whisper | — | Yes (CUDA) | Yes (Metal) | — |
| Stable Diffusion | — | Yes (limited) | Yes (MLX) | — |
| Person Detection | Yes | Yes | Yes | Yes |