About
Doradus Research writes from the operator's perspective on running an on-prem AI cluster: consumer and workstation-class GPUs, multi-vendor inference, multi-model serving. We publish what we learn the hard way: dead ends, configurations that didn't work, and the recipes that ended up in production.
Compute: 3 GPU compute nodes carrying 10× NVIDIA RTX PRO 6000 Blackwell (95 GiB each, SM12.0a) + 4× RTX 5090 — partitioned across rotation pools and dedicated services. 2× NVIDIA DGX Spark (GB10, 128 GiB UMA each) for medium-tier always-on serving + content workflows. 2× Apple Mac Studio M3 Ultra (256 GiB UMA each) for Apple-Silicon-native inference. ~1.3 TB system RAM total. All on-prem. Built to accommodate ~40 daily users across internal tooling, research pipelines, and ad-hoc inference.
Storage: ~75 TB across four tiers: hot NVMe per node, an erasure-coded warm cluster, a shared NFS model cache, and an SMB cold archive. Promotion and demotion between tiers are automatic.
Network: 100GbE fiber-optic backbone with managed switching across all nodes, fronted by a dedicated perimeter firewall appliance.
Orchestration: modern scheduler + service mesh with mTLS on every internal hop, including localhost. Inter-host trust over a private overlay.
The interesting bugs live at the seams of a multi-vendor inference stack. That's where we publish from.
Code at github.com/DoradusResearch.
Reach out via GitHub, or on X at @DoradusAI. We also read r/LocalLLaMA and a few community Discords.