[Studio] Fix GPU detection for AMD/Intel — add Vulkan VRAM fallback#4874
HellBoxyz wants to merge 1 commit into unslothai:main
Conversation
Code Review
This pull request introduces a fallback mechanism for detecting free GPU memory using vulkaninfo, enabling support for AMD, Intel, and other Vulkan-compatible hardware when nvidia-smi is unavailable. The review feedback identifies a logic error in the parsing of vulkaninfo output, where multiple memory heaps are incorrectly treated as distinct GPUs, and provides a more robust implementation that groups heaps by physical device.
```python
# Split output into per-heap blocks at each "\tmemoryHeaps[N]:"
# marker, then check each block for DEVICE_LOCAL flag and budget.
heap_sections = re.split(r"(?=\tmemoryHeaps\[\d+\]:)", output)
budget_re = re.compile(r"budget\s*=\s*(\d+)")

gpus: list[tuple[int, int]] = []
gpu_idx = 0
for section in heap_sections:
    if not section.strip().startswith("memoryHeaps["):
        continue
    if "MEMORY_HEAP_DEVICE_LOCAL_BIT" not in section:
        continue
    budget_m = budget_re.search(section)
    if not budget_m:
        continue
    budget_bytes = int(budget_m.group(1))
    free_mib = budget_bytes // (1024 * 1024)
    if free_mib > 0:
        gpus.append((gpu_idx, free_mib))
        gpu_idx += 1
```
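To make the flaw concrete, here is a hedged sketch: running the heap-splitting logic above on a hypothetical `vulkaninfo` excerpt (layout and budget values invented for illustration) in which one physical GPU exposes two DEVICE_LOCAL heaps. The heap-per-GPU logic counts each heap as its own device:

```python
import re

# Hypothetical vulkaninfo excerpt: ONE physical GPU (GPU0) exposing
# TWO device-local heaps, as some AMD APUs do. Values are invented.
sample = (
    "GPU0\n"
    "\tmemoryHeaps[0]:\n"
    "\t\tsize   = 8589934592\n"
    "\t\tbudget = 8000000000\n"
    "\t\tflags: MEMORY_HEAP_DEVICE_LOCAL_BIT\n"
    "\tmemoryHeaps[1]:\n"
    "\t\tsize   = 268435456\n"
    "\t\tbudget = 268435456\n"
    "\t\tflags: MEMORY_HEAP_DEVICE_LOCAL_BIT\n"
)

# Same parsing logic as the patch under review.
heap_sections = re.split(r"(?=\tmemoryHeaps\[\d+\]:)", sample)
budget_re = re.compile(r"budget\s*=\s*(\d+)")

gpus = []
gpu_idx = 0
for section in heap_sections:
    if not section.strip().startswith("memoryHeaps["):
        continue
    if "MEMORY_HEAP_DEVICE_LOCAL_BIT" not in section:
        continue
    budget_m = budget_re.search(section)
    if budget_m:
        free_mib = int(budget_m.group(1)) // (1024 * 1024)
        if free_mib > 0:
            gpus.append((gpu_idx, free_mib))
            gpu_idx += 1

# One physical GPU, but two entries are reported:
print(gpus)  # [(0, 7629), (1, 256)]
```

Both heaps pass the DEVICE_LOCAL check, so a single GPU is misreported as two devices with split VRAM, which is exactly the failure mode the review describes.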
The current parsing logic for vulkaninfo output is not robust for all systems. It treats every device-local memory heap as a separate GPU, which is incorrect for multi-GPU systems or single GPUs that expose multiple device-local heaps. This can lead to misreporting the number of GPUs and their available memory, causing issues with GPU selection and model offloading.
A more robust approach is to group memory heaps by physical device and report the largest available memory budget for each. This ensures that each physical GPU is represented as a single entry with its correct available VRAM.
```python
# Split output by physical device. vulkaninfo typically separates devices
# with headers like "GPU0", "GPU1", etc. on their own lines.
# The lookahead (?=...) keeps the delimiter.
device_sections = re.split(r"(?=^GPU\d+\n)", output, flags=re.MULTILINE)
if len(device_sections) > 1:
    # Filter out any non-GPU sections (like the header before GPU0)
    device_sections = [s for s in device_sections if s.strip().startswith("GPU")]
# If no GPUn headers, device_sections contains the whole output as one element.
budget_re = re.compile(r"budget\s*=\s*(\d+)")
gpus: list[tuple[int, int]] = []
for gpu_idx, device_section in enumerate(device_sections):
    # For each physical device, find the largest device-local memory heap budget.
    # A single GPU can have multiple device-local heaps.
    max_free_mib = 0
    heap_sections = re.split(r"(?=\tmemoryHeaps\[\d+\]:)", device_section)
    for section in heap_sections:
        if "MEMORY_HEAP_DEVICE_LOCAL_BIT" in section:
            budget_m = budget_re.search(section)
            if budget_m:
                budget_bytes = int(budget_m.group(1))
                free_mib = budget_bytes // (1024 * 1024)
                if free_mib > max_free_mib:
                    max_free_mib = free_mib
    if max_free_mib > 0:
        gpus.append((gpu_idx, max_free_mib))
```

`_get_gpu_free_memory()` relied exclusively on `nvidia-smi`, returning an empty list on non-NVIDIA systems. This caused the VRAM-aware context auto-reduction logic to be skipped entirely: models launched with full native context (e.g. 128K+), KV caches spilled into system RAM, and inference performance degraded significantly. Add a `vulkaninfo` fallback that parses VK_EXT_memory_budget heap data to detect the DEVICE_LOCAL VRAM budget on AMD, Intel, and any Vulkan-capable GPU. Handles multi-GPU systems (split by GPU device headers) and GPUs with multiple DEVICE_LOCAL heaps (takes the largest budget per device). `nvidia-smi` retains priority, so there is zero impact on NVIDIA setups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
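A self-contained sketch of the suggested grouping logic, wrapped as a function and exercised on a hypothetical two-GPU `vulkaninfo` excerpt (device headers and budget values invented for illustration; the function name is not from the Studio codebase):

```python
import re


def parse_vulkan_gpus(output: str) -> list[tuple[int, int]]:
    """Group DEVICE_LOCAL heaps by physical device and report the largest
    memory budget (in MiB) per GPU. Sketch of the suggested review fix."""
    device_sections = re.split(r"(?=^GPU\d+\n)", output, flags=re.MULTILINE)
    if len(device_sections) > 1:
        # Drop any non-GPU prefix (e.g. the header before GPU0).
        device_sections = [s for s in device_sections if s.strip().startswith("GPU")]
    budget_re = re.compile(r"budget\s*=\s*(\d+)")
    gpus: list[tuple[int, int]] = []
    for gpu_idx, device_section in enumerate(device_sections):
        max_free_mib = 0
        for section in re.split(r"(?=\tmemoryHeaps\[\d+\]:)", device_section):
            if "MEMORY_HEAP_DEVICE_LOCAL_BIT" in section:
                m = budget_re.search(section)
                if m:
                    max_free_mib = max(max_free_mib, int(m.group(1)) // (1024 * 1024))
        if max_free_mib > 0:
            gpus.append((gpu_idx, max_free_mib))
    return gpus


# Invented excerpt: GPU0 has two device-local heaps, GPU1 has one.
sample = (
    "GPU0\n"
    "\tmemoryHeaps[0]:\n"
    "\t\tbudget = 8000000000\n"
    "\t\tMEMORY_HEAP_DEVICE_LOCAL_BIT\n"
    "\tmemoryHeaps[1]:\n"
    "\t\tbudget = 268435456\n"
    "\t\tMEMORY_HEAP_DEVICE_LOCAL_BIT\n"
    "GPU1\n"
    "\tmemoryHeaps[0]:\n"
    "\t\tbudget = 16000000000\n"
    "\t\tMEMORY_HEAP_DEVICE_LOCAL_BIT\n"
)

# Two physical GPUs, each reported once with its largest heap budget:
print(parse_vulkan_gpus(sample))  # [(0, 7629), (1, 15258)]
```

GPU0's smaller 256 MiB heap no longer produces a phantom second device; each physical GPU appears exactly once with its largest DEVICE_LOCAL budget.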
Force-pushed from 23d2dc0 to a73223d
This is a duplicate of #4720
Problem
Unsloth Studio doesn't detect the GPU on AMD/Intel systems. The VRAM detection (`_get_gpu_free_memory()`) uses only `nvidia-smi`, so on non-NVIDIA hardware it returns an empty list. This means the VRAM-aware context auto-reduction is skipped entirely, models launch with their full native context, and KV caches can spill into system RAM.

Fix
Add a `vulkaninfo` fallback that kicks in when `nvidia-smi` is not available: it parses heap data exposed via VK_EXT_memory_budget to read the DEVICE_LOCAL VRAM budget, splits the output by GPU device headers on multi-GPU systems, and takes the largest budget per device when a GPU exposes multiple DEVICE_LOCAL heaps.

Before / After
Before (AMD GPU):
After (AMD GPU):
Tested on
-ngl -1