The goal: build a scalable RTX 3090 machine with a Threadripper Pro 5955WX, 512GB of RAM, and the ability to expand from 2 to 6 GPUs. This build log covers component selection, compatibility verification, and important lessons learned along the way. I'm starting with 2 RTX 3090 GPUs and a 1300W EVGA Supernova PSU, with plans to expand to 4 or 6 GPUs.
Bill of Materials
Core Components
| Component | Part | Est. Cost | Notes |
|---|---|---|---|
| Motherboard | Supermicro M12SWA-TF-O (E-ATX) | $700 | sWRX8 socket, 6x PCIe 4.0 x16, 8-ch DDR4 |
| CPU | Threadripper Pro 5955WX | $660 | 16C/32T, 4.0/4.5GHz, 128 PCIe 4.0 lanes |
| RAM | OWC 512GB (8x64GB) DDR4-3200 ECC RDIMM | $3,423 | 2Rx4 288-pin 1.2V ECC Registered |
| GPUs | 2x NVIDIA RTX 3090 | $1,600 | 48GB VRAM total (24GB each) |
| PSU | Corsair HX1500i 80+ Platinum | $400 | 1500W single PSU (not dual) |
| PSU Adapter | ADD2PSU-S Dual PSU Adapter | $17 | For future dual PSU setup |
| Frame | Bomeiqee Mining Rig Frame | $65 | 12GPU frame with fans |
| Risers | 6x LINKUP PCIE 5.0 Riser Cable 40cm | $714 | Black v2, true x16 bandwidth |
| CPU Cooler | Noctua NH-U14S TR4-SP3 | $110 | Threadripper compatible cooler |
| Case Fans | 4x Noctua NF-A12x25 PWM | $152 | 120mm high-performance fans |
Total: ~$7,875 (estimated prices based on actual purchases)
Actual spent to date (with tax): ~$5,618 (the 2 additional RTX 3090s and the second PSU haven't been purchased yet)
Critical Component Selection Criteria
Motherboard: Supermicro M12SWA-TF
Chosen specifically for its 6x PCIe 4.0 x16 slots at full bandwidth. No bifurcation or sharing - each GPU gets the full 16 lanes. This board supports both 3000WX and 5000WX Threadripper Pro CPUs with 8-channel memory configuration.
CPU: Threadripper Pro 5955WX
The 5955WX provides 16 cores/32 threads with 4.0/4.5GHz boost speeds. For GPU-bound inference workloads, this is more than adequate - the key advantage is all Threadripper Pro CPUs provide 128 PCIe 4.0 lanes regardless of core count. The 5955WX delivers excellent value compared to higher core count models for LLM inference workloads.
Power Requirements
| Component | Power Draw |
|---|---|
| 2x RTX 3090 @ load | ~700W |
| Threadripper Pro 5955WX | ~280W |
| RAM, storage, fans | ~50W |
| Total Peak Load | ~1,030W |
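For PSU planning, here's a minimal sketch that extends the table above to the planned 4- and 6-GPU configurations; the ~350W-per-3090 figure and the ~80% sustained-load headroom rule are my own assumptions, not measurements:

```python
# Rough PSU headroom check for 2/4/6 GPU configurations.
# Per-component figures follow the table above; ~350 W per RTX 3090
# at stock power limits is an assumption.
GPU_W = 350
CPU_W = 280
MISC_W = 50          # RAM, storage, fans
HEADROOM = 0.80      # keep sustained load under ~80% of PSU rating

def required_psu_watts(num_gpus: int) -> float:
    peak = num_gpus * GPU_W + CPU_W + MISC_W
    return peak / HEADROOM

for gpus in (2, 4, 6):
    peak = gpus * GPU_W + CPU_W + MISC_W
    print(f"{gpus} GPUs: ~{peak} W peak, want >= {required_psu_watts(gpus):.0f} W of PSU capacity")
```

By this math the 1500W HX1500i comfortably covers the 2-GPU configuration, while 4 or more GPUs push the build into dual-PSU territory.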
Assembly Process & Lessons Learned
PCIe Riser Cable Reality Check
Cheap mining risers won't work. Those USB-style risers are x1-to-x16 adapters with only x1 bandwidth (~4 GB/s). For LLM inference, you need true x16-to-x16 extension cables. The LINKUP Ultra cables deliver full 32 GB/s bandwidth with no measurable performance loss.
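Once the cards are mounted on the risers, it's worth confirming each slot actually delivers x16 bandwidth. A minimal host-to-device copy benchmark, assuming PyTorch with CUDA is installed, looks something like this; a healthy PCIe 4.0 x16 link should report well above 20 GB/s, while an x1 mining riser sits around 3-4 GB/s:

```python
# Quick host-to-device copy benchmark to sanity-check riser bandwidth.
import torch

def h2d_bandwidth_gbps(device: int = 0, size_mb: int = 1024, iters: int = 20) -> float:
    src = torch.empty(size_mb * 1024 * 1024, dtype=torch.uint8).pin_memory()
    dst = torch.empty_like(src, device=f"cuda:{device}")
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    dst.copy_(src, non_blocking=True)          # warm-up copy
    torch.cuda.synchronize(device)
    start.record()
    for _ in range(iters):
        dst.copy_(src, non_blocking=True)
    end.record()
    torch.cuda.synchronize(device)
    seconds = start.elapsed_time(end) / 1000.0  # elapsed_time is in ms
    return (size_mb / 1024) * iters / seconds   # GB copied per second

for gpu in range(torch.cuda.device_count()):
    print(f"GPU {gpu}: ~{h2d_bandwidth_gbps(gpu):.1f} GB/s host-to-device")
```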
CPU Lock Warning
Some Threadripper Pro CPUs from Lenovo P620 workstations are firmware-locked. Always verify sellers confirm "unlocked" or "retail/OEM tray" before purchase. A locked CPU won't post in aftermarket boards.
RAM Configuration
ECC RDIMM is required for the 512GB configuration. 64GB DIMMs are only available as registered memory (RDIMM); standard UDIMM maxes out at 256GB (8x32GB). The 8-channel configuration provides ~200 GB/s memory bandwidth.
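For reference, the bandwidth figures quoted here and in the capability overview below come from straightforward peak-rate arithmetic:

```python
# Peak-bandwidth arithmetic behind the numbers quoted in this post.
# DDR4-3200 across 8 channels, 64 bits (8 bytes) per transfer:
ddr4_gbps = 3200e6 * 8 * 8 / 1e9            # ~205 GB/s system RAM
# RTX 3090: 384-bit GDDR6X bus at 19.5 Gbps per pin:
gddr6x_gbps = 384 / 8 * 19.5                 # ~936 GB/s per card
print(f"System RAM: ~{ddr4_gbps:.0f} GB/s")
print(f"Per 3090:   ~{gddr6x_gbps:.0f} GB/s")
print(f"2x 3090s:   ~{2 * gddr6x_gbps / 1000:.1f} TB/s aggregate")
```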
Thermal Management
Open-air frame with directional airflow across GPUs works well. RTX 3090s run hot (~80-85C under load). Consider undervolting if thermals become problematic. Noctua fans provide excellent airflow without excessive noise.
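To keep an eye on temperatures and power draw on the open-air frame, a small monitoring loop is enough. This sketch assumes the nvidia-ml-py (pynvml) package is installed; power limits can be lowered separately with `nvidia-smi -pl <watts>` if the 3090s run too hot:

```python
# Simple GPU temperature / power monitor using NVML.
import time
import pynvml

pynvml.nvmlInit()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
           for i in range(pynvml.nvmlDeviceGetCount())]
try:
    while True:
        for i, h in enumerate(handles):
            temp = pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU)
            power = pynvml.nvmlDeviceGetPowerUsage(h) / 1000.0  # milliwatts -> watts
            print(f"GPU {i}: {temp} C, {power:.0f} W")
        time.sleep(5)
finally:
    pynvml.nvmlShutdown()
```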
Capability Overview
Current Configuration (2x RTX3090)
- 48GB VRAM + 512GB system RAM
- ~1.9 TB/s aggregate VRAM bandwidth
- Runs decent smaller models: Qwen3-32B-Q5_K_M, deepseek-r1:32b, Nemotron-3-Nano-30B-A3B-Q6_K
- Context window: < 65k at usable speeds (rough VRAM accounting in the sketch below)
- Power draw: ~1,030W (2 GPUs + CPU + RAM)
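That context limit falls out of simple VRAM accounting: quantized weights plus KV cache have to fit in 48GB. Here's a rough sketch; the model shape, bits-per-weight, and overhead values are illustrative assumptions, not measurements of any specific checkpoint:

```python
# Back-of-the-envelope check of whether a quantized model plus KV cache
# fits in a given VRAM budget.
def fits_in_vram(params_b: float, bits_per_weight: float,
                 layers: int, kv_heads: int, head_dim: int,
                 context: int, vram_gb: float, kv_bytes: int = 2) -> bool:
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    kv_gb = 2 * layers * kv_heads * head_dim * context * kv_bytes / 1e9  # K and V
    overhead_gb = 2.0  # activations, CUDA context, fragmentation (rough guess)
    total = weights_gb + kv_gb + overhead_gb
    print(f"weights ~{weights_gb:.1f} GB, KV cache ~{kv_gb:.1f} GB, total ~{total:.1f} GB")
    return total <= vram_gb

# e.g. a ~32B dense model at ~5.5 bits/weight with 64k context on 2x 3090 (48 GB)
fits_in_vram(params_b=32, bits_per_weight=5.5, layers=64, kv_heads=8,
             head_dim=128, context=65536, vram_gb=48)
```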
Future Expansion Path
The M12SWA-TF supports 6 GPUs total. Adding 2 more RTX 3090s (~$1,600) with a second PSU will provide:
- 96GB VRAM + 512GB system RAM
- ~3.7 TB/s aggregate VRAM bandwidth
- Capable of running larger MoE models: GLM-4.5 Air, GLM-4.6/4.7 (MoE offload), MiniMax M2.1 Q2 (maybe)
- Total power draw: ~1,730W (requires dual PSU setup)
6-GPU configuration (~$2,400 for 4 additional GPUs + second PSU) will deliver:
- 144GB VRAM + 512GB system RAM
- ~5.6 TB/s aggregate VRAM bandwidth
- Capable of running premier MoE models: DeepSeek V3 Q4 (MoE offload), MiniMax M2.1 Q4, GLM-4.6/4.7
- Total power draw: ~2,494W
Key Takeaways
Design Decisions
- Supermicro over consumer boards: Guaranteed PCIe bandwidth, workstation reliability
- Dual PSU setup: Can add modularly w/ add2psu when we add additional GPUs
- Quality riser cables: true x16 extensions are needed; the cheap USB-style crypto-mining risers are only x1
- ECC RDIMM memory: Good for stability, plus RDIMM is required above 256GB as far as I know
Cost vs. Reality
So far, this build delivers ~48GB of VRAM for ~$5,600 of actual spend, which is not especially economical on its own. However, I think the design choices around PCIe lanes and the flexibility afforded by the large RAM pool will pay off when we look to cram as much model as we can into the system. I haven't seen used A100 40GB cards for less than $3,000, so I think we're doing better than buying server-grade gear. And while a Mac Studio offers decent performance and supports larger models with longer context, the 256GB version will cost you at least $7,500 and the 512GB model more than $12,000. If we can find a way to avoid running into system memory bandwidth bottlenecks (~200 GB/s here vs. the Mac's ~800 GB/s), the GPUs should offer faster inference speeds.
The motherboard, CPU, RAM, open-air case, and accessories have all been bought. As soon as I've transplanted my two RTX 3090s from my old rig, I'll be able to start benchmarking and experimenting with various models and frameworks. Once it's stable, I will start adding more GPUs. Stay tuned.