Building the Beast: 4-RTX3090 Threadripper Pro Machine

Complete build log from component selection to assembly
January 2026
← Back to LLM Garage

The goal: build a scalable RTX 3090 machine with a Threadripper Pro 5955WX, 512GB of RAM, and the capability to expand from 2 to 6 GPUs. This build log covers component selection, compatibility verification, and important lessons learned along the way. The build starts with 2 RTX 3090 GPUs and a 1300W EVGA Supernova PSU, with expansion to 4 or 6 GPUs planned.

Bill of Materials

Core Components

Component Part Est. Cost Notes
Motherboard Supermicro M12SWA-TF-O (E-ATX) $700 sWRX8 socket, 6x PCIe 4.0 x16, 8-ch DDR4
CPU Threadripper Pro 5955WX $660 16C/32T, 4.0/4.5GHz, 128 PCIe 4.0 lanes
RAM OWC 512GB (8x64GB) DDR4-3200 ECC RDIMM $3,423 2Rx4 288-pin 1.2V ECC Registered
GPUs 2x NVIDIA RTX 3090 $1,600 48GB VRAM total (24GB each)
PSU Corsair HX1500i 80+ Platinum $400 1500W single PSU (not dual)
PSU Adapter ADD2PSU-S Dual PSU Adapter $17 For future dual PSU setup
Frame Bomeiqee Mining Rig Frame $65 12GPU frame with fans
Risers 6x LINKUP PCIE 5.0 Riser Cable 40cm $714 Black v2, true x16 bandwidth
CPU Cooler Noctua NH-U14S TR4-SP3 $110 Threadripper compatible cooler
Case Fans 4x Noctua NF-A12x25 PWM $152 120mm high-performance fans

Total: ~$7,875 (estimated prices based on actual purchases)

Actual spend to date (with tax): ~$5,618 (the 2 additional RTX 3090 GPUs and the second PSU haven't been purchased yet)
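As a sanity check, the line items in the table can be totaled with a short script (prices as listed above; the quoted estimate additionally absorbs small rounding and shipping differences):

```python
# Sanity-check the bill of materials total (estimated prices as listed above).
bom = {
    "Motherboard (Supermicro M12SWA-TF-O)": 700,
    "CPU (Threadripper Pro 5955WX)": 660,
    "RAM (8x64GB DDR4-3200 ECC RDIMM)": 3423,
    "GPUs (2x RTX 3090)": 1600,
    "PSU (Corsair HX1500i)": 400,
    "Dual PSU adapter (ADD2PSU-S)": 17,
    "Mining rig frame": 65,
    "Risers (6x LINKUP 40cm)": 714,
    "CPU cooler (Noctua NH-U14S)": 110,
    "Case fans (4x Noctua NF-A12x25)": 152,
}
total = sum(bom.values())
print(f"BOM total: ${total:,}")  # → BOM total: $7,841
```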

Critical Component Selection Criteria

Motherboard: Supermicro M12SWA-TF

Chosen specifically for its 6x PCIe 4.0 x16 slots at full bandwidth. No bifurcation or sharing - each GPU gets the full 16 lanes. This board supports both 3000WX and 5000WX Threadripper Pro CPUs with 8-channel memory configuration.

CPU: Threadripper Pro 5955WX

The 5955WX provides 16 cores/32 threads with 4.0/4.5GHz boost speeds. For GPU-bound inference workloads, this is more than adequate - the key advantage is all Threadripper Pro CPUs provide 128 PCIe 4.0 lanes regardless of core count. The 5955WX delivers excellent value compared to higher core count models for LLM inference workloads.

Power Requirements

Component Power Draw
2x RTX 3090 @ load ~700W
Threadripper Pro 5955WX ~280W
RAM, storage, fans ~50W
Total Peak Load ~1,030W
Important: Current single 1500W PSU provides ample headroom for 2-GPU setup. Future expansion to 4-6 GPUs will require the ADD2PSU adapter and second PSU. At full 6-GPU expansion (~2,494W), you'll need a 30A/240V circuit or two separate 20A/120V circuits.
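The headroom math above can be sketched as a small power-budget calculator. It assumes ~350W per RTX 3090 at load (half the ~700W figure for two cards) and an 80% sustained-load rule of thumb for PSU sizing; treat the numbers as ballpark estimates:

```python
# Rough power budget for current and future GPU counts.
# Per-component figures are the estimates from the table above.
GPU_W = 350           # one RTX 3090 at load (~700W for two)
CPU_W = 280           # Threadripper Pro 5955WX
MISC_W = 50           # RAM, storage, fans
PSU_CAPACITY_W = 1500  # Corsair HX1500i

def peak_load(n_gpus: int) -> int:
    """Estimated peak system draw in watts for n GPUs."""
    return n_gpus * GPU_W + CPU_W + MISC_W

for n in (2, 4, 6):
    load = peak_load(n)
    ok = load <= 0.8 * PSU_CAPACITY_W  # keep sustained load under ~80%
    print(f"{n} GPUs: ~{load}W peak "
          f"({'within' if ok else 'exceeds'} 80% of a single 1500W PSU)")
```

This confirms the 2-GPU setup sits comfortably on one PSU, while 4- and 6-GPU configurations need the second supply.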

Assembly Process & Lessons Learned

PCIe Riser Cable Reality Check

Cheap mining risers won't work. Those USB-style risers are x1-to-x16 adapters with only x1 bandwidth (roughly 1 GB/s at PCIe 3.0, and less at the Gen 1/2 speeds they often negotiate). For LLM inference, you need true x16-to-x16 extension cables. The LINKUP Ultra cables deliver the full ~32 GB/s of PCIe 4.0 x16 bandwidth with no measurable performance loss.
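The gap between the two riser types falls out of the PCIe arithmetic: per-lane throughput is the transfer rate times the 128b/130b encoding efficiency (used by PCIe 3.0 and later), divided by 8 bits per byte:

```python
# Theoretical PCIe throughput: transfer rate x 128b/130b encoding efficiency.
def pcie_bandwidth_gbs(gen: int, lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s for a PCIe 3.0/4.0/5.0 link."""
    gt_per_s = {3: 8, 4: 16, 5: 32}[gen]   # transfer rate per lane
    return gt_per_s * (128 / 130) / 8 * lanes

print(f"PCIe 3.0 x1  (mining riser):  {pcie_bandwidth_gbs(3, 1):5.2f} GB/s")
print(f"PCIe 4.0 x16 (true extension): {pcie_bandwidth_gbs(4, 16):5.2f} GB/s")
```

A x1 riser leaves more than 30x of the link bandwidth on the table, which matters whenever model layers are split across GPUs.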

CPU Lock Warning

Some Threadripper Pro CPUs from Lenovo P620 workstations are firmware-locked. Always verify sellers confirm "unlocked" or "retail/OEM tray" before purchase. A locked CPU won't post in aftermarket boards.

RAM Configuration

ECC RDIMMs are required for a 512GB configuration: 64GB DIMMs are only available as registered memory (RDIMM), and standard UDIMMs max out at 256GB (8x32GB). The 8-channel configuration provides ~200 GB/s of memory bandwidth.
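The ~200 GB/s figure comes straight from the channel math: each 64-bit DDR4 channel moves 8 bytes per transfer at 3,200 million transfers per second, across 8 channels:

```python
# Theoretical peak bandwidth of 8-channel DDR4-3200:
# channels x transfers/sec x bytes per transfer (64-bit channel = 8 bytes).
channels = 8
mt_per_s = 3200e6        # DDR4-3200: 3,200 million transfers per second
bytes_per_transfer = 8   # 64-bit data bus per channel

bandwidth_gbs = channels * mt_per_s * bytes_per_transfer / 1e9
print(f"Peak memory bandwidth: {bandwidth_gbs:.1f} GB/s")  # → 204.8 GB/s
```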

Thermal Management

Open-air frame with directional airflow across GPUs works well. RTX 3090s run hot (~80-85C under load). Consider undervolting if thermals become problematic. Noctua fans provide excellent airflow without excessive noise.
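True undervolting on Linux requires clock-offset tooling, but power-limiting via nvidia-smi is a simpler stand-in that tames RTX 3090 thermals with only a small throughput cost. A sketch of the workflow (the 280W cap is an assumption to tune per card, not a measured sweet spot):

```shell
# Inspect the current and supported power limits for GPU 0
nvidia-smi -i 0 -q -d POWER

# Enable persistence mode so settings survive between processes
sudo nvidia-smi -pm 1

# Cap each RTX 3090 below its 350W default; tune the value per card
sudo nvidia-smi -i 0 -pl 280
sudo nvidia-smi -i 1 -pl 280
```

Note the limit resets on reboot unless reapplied (e.g. from a systemd unit or startup script).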

Capability Overview

Current Configuration (2x RTX3090)

With two RTX 3090s installed, the system has 48GB of pooled VRAM, each card on a full PCIe 4.0 x16 link, backed by 512GB of system RAM.

Future Expansion Path

The M12SWA-TF supports 6 GPUs total. Adding 2 more RTX 3090s (~$1,600) with a second PSU takes the system to 4 GPUs and 96GB of VRAM; a full 6-GPU configuration (~$2,400 for 4 additional GPUs plus the second PSU) brings it to 144GB of VRAM.
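What those VRAM totals buy can be estimated with a back-of-envelope fit check. The sketch below assumes model weights dominate memory and ignores KV cache and activation overhead, so real capacity is somewhat lower; the model sizes are illustrative, not specific targets:

```python
# Rough check of which model sizes fit in a given VRAM pool.
# Assumes weights dominate memory (ignores KV cache / activations).
def model_fits(params_b: float, bytes_per_param: float, vram_gb: float) -> bool:
    """Billions of params x bytes/param approximates weight size in GB."""
    return params_b * bytes_per_param <= vram_gb

for vram in (48, 96, 144):  # 2, 4, and 6 RTX 3090s
    for params_b, bpp, quant in [(70, 0.5, "4-bit"),
                                 (70, 2.0, "fp16"),
                                 (180, 0.5, "4-bit")]:
        verdict = "fits" if model_fits(params_b, bpp, vram) else "doesn't fit"
        print(f"{vram}GB pool: {params_b}B @ {quant} "
              f"(~{params_b * bpp:.0f}GB weights) {verdict}")
```

For example, a 70B model quantized to 4 bits (~35GB of weights) fits in the current 48GB pool, while running it at fp16 (~140GB) would need the full 6-GPU configuration.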

Key Takeaways

Design Decisions

Cost vs. Reality

So far, this build delivers ~48GB of VRAM for ~$5,600 of actual spend, which is not especially economical on its own. However, the design choices (full PCIe lanes for every GPU, plus the flexibility afforded by 512GB of RAM) should pay off as we cram larger models into the system. I've not seen used A100 40GB cards for less than $3,000, so we're still doing better than buying server-grade gear. And while a Mac Studio has decent performance and supports larger models with greater context length, the 256GB model costs at least $7,500 and the 512GB model over $12,000. If we can avoid memory bandwidth bottlenecks (~200 GB/s system RAM vs the Mac's ~800 GB/s unified memory), the GPUs should offer faster inference speeds.

The motherboard, CPU, RAM, open-air case, and accessories have all been bought. As soon as I've transplanted the 2x RTX 3090s from my old rig, I'll be able to start benchmarking and experimenting with various models and frameworks. Once it's stable, I'll start adding more GPUs. Stay tuned.