Patching NVIDIA's driver and vLLM to enable P2P on consumer GPUs

NVIDIA artificially restricts peer-to-peer (P2P) GPU communication to their enterprise cards. Turns out this is a software limitation, not a hardware one. I patched my drivers to remove it, hacked vLLM to take advantage of it, and got a 15-50% throughput improvement running Qwen 3.5 35b on dual RTX 3090s.

February 25, 2026 · 10 min · 2084 words · Sam McLeod

NVApi - Nvidia GPU Monitoring API

NVApi is a small application I've written for monitoring and presenting utilisation metrics from Nvidia GPUs. It can monitor GPU memory, temperature, power usage, and utilisation for each GPU in a system, and can easily be integrated into tools such as Home Assistant or Prometheus. The package uses the Nvidia Management Library (NVML) and provides a simple API for monitoring Nvidia GPUs along with a basic GUI client. NVApi on GitHub ...

May 18, 2024 · 3 min · 458 words · Sam McLeod