doradus-research

Doradus Research is the operator perspective of an on-prem AI cluster — consumer + workstation-class GPUs, multi-vendor inference, multi-model serving. We publish what we learn the hard way: dead-ends, configuration that didn't work, and the recipes that ended up in production.

Compute: 3 GPU compute nodes carrying 10× NVIDIA RTX PRO 6000 Blackwell (95 GiB each, SM12.0a) + 4× RTX 5090 — partitioned across rotation pools and dedicated services. 2× NVIDIA DGX Spark (GB10, 128 GiB UMA each) for medium-tier always-on serving + content workflows. 2× Apple Mac Studio M3 Ultra (256 GiB UMA each) for Apple-Silicon-native inference. ~1.3 TB system RAM total. All on-prem. Built to accommodate ~40 daily users across internal tooling, research pipelines, and ad-hoc inference.

Storage: ~75 TB across four tiers — hot NVMe per node, an erasure-coded warm cluster, a shared NFS model cache, and SMB cold archive. Promotion / demotion between tiers is automatic.

Network: 100GbE fiberoptic backbone with managed switching across all nodes, fronted by a dedicated perimeter firewall appliance.

Orchestration: modern scheduler + service mesh with mTLS on every internal hop, including localhost. Inter-host trust over a private overlay.

The interesting bugs live at the seams of a multi-vendor inference stack. That's where we publish from.

Code at github.com/DoradusResearch.

Reach out via GitHub, or on X at @DoradusAI. We also read r/LocalLLaMA and a few community Discords.