Patching NVIDIA's driver and vLLM to enable P2P on consumer GPUs

NVIDIA artificially restricts peer-to-peer (P2P) GPU communication to their enterprise cards. Turns out this is a software limitation, not a hardware one. I patched my drivers to remove it, hacked vLLM to take advantage of it, and got a 15-50% throughput improvement running Qwen 3.5 35b on dual RTX 3090s.

February 25, 2026 · 10 min · 2084 words · Sam McLeod

Fixing AMD CPU Scaling on Fedora

Recently, after replacing my home server, I noticed that the CPU (Ryzen 7600) was only scaling between 3000MHz and 3800MHz, which are the CPU's base clock and first-level boost clock. I was expecting it to scale down to as low as 400MHz when idle, and up to 5.17GHz on boost. ...

July 9, 2023 · 4 min · 710 words · Sam McLeod