Device and Queue
Selecting the Physical Device
Now that the vk::Instance is created, we need to actually choose the physical device, as discussed in the high level overview in the introduction. We can just call instance.enumerate_physical_devices() to get a vector of the possible devices. Then, we can use instance.get_physical_device_properties(physical_device) to get a vk::PhysicalDeviceProperties for each one, which contains info about what type of device it is, what Vulkan version it supports, its name, its limits, etc. (Memory sizes are reported by a separate query, instance.get_physical_device_memory_properties(physical_device).)
For this guide, we’re going to use a very simple algorithm:
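A minimal sketch of that algorithm, using plain stand-in types so it runs without a GPU. `Candidate`, `DeviceType`, and `pick_physical_device` are hypothetical names for illustration, not part of ash; in the real code you would read the device type from each vk::PhysicalDeviceProperties instead.

```rust
/// Stand-in for `vk::PhysicalDeviceType`, reduced to what this sketch ranks.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DeviceType {
    Discrete,
    Integrated,
    Other,
}

/// Stand-in for one enumerated physical device plus its properties.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Candidate {
    name: String,
    device_type: DeviceType,
}

/// Lower rank is better: discrete, then integrated, then anything else.
fn rank(device_type: DeviceType) -> u8 {
    match device_type {
        DeviceType::Discrete => 0,
        DeviceType::Integrated => 1,
        DeviceType::Other => 2,
    }
}

/// Returns the best-ranked device, keeping the first one on a tie
/// (`min_by_key` returns the first of several equal minimums).
/// An empty list means no device supports Vulkan, so that is an error.
fn pick_physical_device(candidates: &[Candidate]) -> Result<&Candidate, String> {
    candidates
        .iter()
        .min_by_key(|candidate| rank(candidate.device_type))
        .ok_or_else(|| "no Vulkan-capable device found".to_string())
}
```

Ranking with a small integer keeps the door open for a scoring system later: extra criteria can simply adjust the number.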
This prioritizes discrete over integrated over any other type of device, and just chooses the first one if there’s a tie. If the vector is empty, that means no device supports Vulkan, so we can just return an error. This won’t cover all cases: for example, a user might have two discrete GPUs, and we could select the one with far fewer features, which then fails logical device creation where the other one wouldn’t have. But this is good enough for nearly all cases.
Feel free to add more criteria or a scoring system or whatever else you wish if you want more complex physical device selection.
Queue Family Selection
On top of the commands we put in the command buffer mentioned before, there’s also a way to explicitly program the GPU, much the same way you do the CPU. You use programs called shaders: you “bind” one, and the commands recorded afterward refer to it.
Shaders are split into two main types:
- Graphics shaders - These use the hardware-accelerated rasterizer; any cmd_draw* command uses the bound graphics shader. These have quite a few parameters, and we’ll get into what a “rasterizer” is later. For now we’ll focus on compute shaders.
- Compute shaders - The GPU is designed to do heavily parallelized work, so a compute shader is essentially a function that is run by however many “threads” you specify. It’s similar to spawning a bunch of threads to run the same function on the CPU (with the only argument difference being the thread index), except while your CPU only has maybe 4-32 threads, a GPU has thousands to millions. These correspond to the cmd_dispatch* commands, which specify how many threads to run for the bound compute shader.
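To make the “same function, different thread index” idea concrete, here is a small CPU-side emulation of a one-dimensional dispatch. It is only a sketch: `emulate_dispatch` and `group_count_for` are hypothetical helpers, and it assumes the usual arrangement where a dispatch takes a number of workgroups and the shader itself declares how many threads each group contains.

```rust
/// Runs `shader` once per invocation, passing only the global thread index,
/// which is the one argument that differs between invocations.
/// On a GPU these would run in parallel; here they run in a plain loop.
fn emulate_dispatch<F: FnMut(u32)>(group_count_x: u32, local_size_x: u32, mut shader: F) {
    for group in 0..group_count_x {
        for local in 0..local_size_x {
            shader(group * local_size_x + local);
        }
    }
}

/// Number of workgroups needed to cover `item_count` items, rounding up.
fn group_count_for(item_count: u32, local_size_x: u32) -> u32 {
    (item_count + local_size_x - 1) / local_size_x
}
```

Because the group count rounds up, some invocations can land past the end of the data, so a shader written this way normally checks its index against the item count before touching memory.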
One thing that we left out of the high level overview is that queues are organized together into “queue families” based on their capabilities. Those capabilities are indicated by VkQueueFlagBits. There are several capabilities, but the ones we are interested in are GRAPHICS, COMPUTE, and TRANSFER.
- GRAPHICS - enables usage of graphics shaders and the related commands.
- COMPUTE - enables usage of compute shaders and the related commands.
- TRANSFER - enables usage of commands to copy buffers or images to each other.
Queue families can have several of these capabilities together, and indeed both GRAPHICS and COMPUTE imply TRANSFER capability, since you can copy buffers and images in shaders. Notably, for a long time, even though they implied TRANSFER, it was optional for the implementation to advertise it on that queue family, until Vulkan 1.4 added the following requirement:
Combine that with the requirement that implementors had since Vulkan 1.0:
That means we are guaranteed a queue family that supports all three capabilities. Generally this is queue family 0, but there’s no guarantee of that, so we’ll find it manually.
Note that generally the fewer capabilities a queue family has, the more specialized the hardware probably is for it. For example, a queue family with TRANSFER capability and none of the other two usually indicates that queue family uses DMA (direct memory access) hardware. But, for the duration of this guide we’ll keep things simple and stick to using one queue from the queue family with all three capabilities.
Finding this queue family is pretty similar to physical device selection: we use instance.get_physical_device_queue_family_properties(physical_device) to get a list of queue family properties, check for the first queue family that supports all three capabilities, then return its index. There is no object for queue families; they’re just referred to by their index in that list.
Since it’s an index Rust will return the position as a usize, but Vulkan refers to it with a u32 so we’ll need to cast it.
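The search can be sketched with plain integers so it runs anywhere. The three constants are the bit values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT, and VK_QUEUE_TRANSFER_BIT; `QueueFamily` and `find_queue_family` are hypothetical stand-ins for vk::QueueFamilyProperties and whatever you name the helper.

```rust
// Bit values of the three VkQueueFlagBits this guide cares about.
const GRAPHICS: u32 = 0x1;
const COMPUTE: u32 = 0x2;
const TRANSFER: u32 = 0x4;

/// Stand-in for `vk::QueueFamilyProperties`, reduced to the fields used here.
struct QueueFamily {
    queue_flags: u32,
    queue_count: u32,
}

/// Returns the index of the first family supporting all three capabilities.
/// `position` yields a `usize`, so it is cast to the `u32` Vulkan expects.
fn find_queue_family(families: &[QueueFamily]) -> Option<u32> {
    let wanted = GRAPHICS | COMPUTE | TRANSFER;
    families
        .iter()
        .position(|family| (family.queue_flags & wanted) == wanted && family.queue_count > 0)
        .map(|index| index as u32)
}
```

Masking with `wanted` rather than comparing for equality matters: a family may advertise extra capabilities beyond these three and should still match.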
Logical Device Creation
Just like with the instance, we’re going to use instance.create_device() to make the logical device, and it takes a vk::DeviceCreateInfo for the info needed to create it. Let’s look at the C version of the struct to see the arguments:
We’re not using any special flags, so let’s focus on pQueueCreateInfos first. We need to specify what queues we’re going to use when creating the logical device. The VkDeviceQueueCreateInfo struct specifies a queue family, how many queues we’re going to use in that family, and a list of priority values (from 0.0 - 1.0). Those priority values hint to the implementation that some queues have higher priority than others and thus should be allotted more processing time. Since we’re only using one queue the priority doesn’t matter.
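The shape of that information can be sketched as a small validated type. `QueueRequest` is a hypothetical stand-in for vk::DeviceQueueCreateInfo, not an ash type; it only illustrates that the queue count is tied to the number of priorities and that each priority must lie between 0.0 and 1.0.

```rust
/// Stand-in for `vk::DeviceQueueCreateInfo`: one family, one priority per queue.
#[derive(Debug, PartialEq)]
struct QueueRequest {
    queue_family_index: u32,
    priorities: Vec<f32>,
}

impl QueueRequest {
    /// Rejects an empty priority list and any priority outside 0.0..=1.0
    /// (NaN is rejected too, since it is not contained in any range).
    fn new(queue_family_index: u32, priorities: Vec<f32>) -> Result<Self, String> {
        if priorities.is_empty() {
            return Err("at least one queue must be requested".to_string());
        }
        if priorities.iter().any(|p| !(0.0..=1.0).contains(p)) {
            return Err("queue priorities must be within 0.0..=1.0".to_string());
        }
        Ok(Self { queue_family_index, priorities })
    }

    /// The number of queues requested is the number of priorities supplied.
    fn queue_count(&self) -> u32 {
        self.priorities.len() as u32
    }
}
```

For this guide the request would be a single queue with any valid priority, for example `QueueRequest::new(queue_family_index, vec![1.0])`.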
Next is layers; as noted in the comments, these are deprecated. It used to be that instance level layers were able to intercept the “global” functions like queue family enumeration and device properties, while device layers were able to intercept device level functions like command buffer creation and submission. But every layer just ended up letting you enable it for both, because it’s more useful to control everything that way (as with validation layers). So, instance layers were expanded to be able to intercept all functions, and device layers were deprecated.
Extensions, as mentioned before, actually add new functionality (structs and functions) instead of just intercepting. Device level extensions have to be supported by the graphics driver. The Vulkan SDK has a nice tool called the Vulkan Hardware Capability Viewer that lets you view the queue families and the extensions/features your device supports. For now, we’re not going to use any extensions, so we’ll leave this blank.
Finally, pEnabledFeatures. Some features are optional for the implementation to support, some are exclusive with each other, and some introduce performance overhead just by being enabled. So, Vulkan requires you to explicitly enable the features you want to use. The issue, though, is that VkPhysicalDeviceFeatures only covers features in Vulkan 1.0. So, it’s since been deprecated, and instead you attach a VkPhysicalDeviceFeatures2 struct to pNext, and every extension and Vulkan version after 1.0 has its own feature struct that you add to the pNext chain. For example, Vulkan 1.1 has its own VkPhysicalDeviceVulkan11Features struct. Again, we’re not interested in enabling any features for now, so we don’t attach anything to pNext.
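If the pNext chain is new to you, it is just a singly linked list in which every struct starts with the same two fields, a type tag and a pointer to the next struct. The sketch below mimics that with simplified stand-in structs. The tag constants are placeholders rather than the real VkStructureType values, the feature structs are cut down to a single flag each, and `chain_tags` and `example_chain_tags` are hypothetical helpers.

```rust
use std::ffi::c_void;
use std::ptr;

// Placeholder tags for this sketch; NOT the real VkStructureType values.
const TAG_FEATURES_2: u32 = 1;
const TAG_VULKAN_1_1_FEATURES: u32 = 2;

/// The header every chainable struct starts with (mirrors VkBaseInStructure).
#[repr(C)]
struct BaseIn {
    s_type: u32,
    p_next: *const c_void,
}

/// Simplified stand-in for VkPhysicalDeviceFeatures2.
#[repr(C)]
struct Features2 {
    s_type: u32,
    p_next: *const c_void,
    geometry_shader: u32, // one Vulkan 1.0 feature flag, for illustration
}

/// Simplified stand-in for VkPhysicalDeviceVulkan11Features.
#[repr(C)]
struct Vulkan11Features {
    s_type: u32,
    p_next: *const c_void,
    multiview: u32,
}

/// Walks a chain and collects each struct's tag, which is how the receiving
/// side discovers what was attached.
///
/// # Safety
/// Every pointer in the chain must be null or point to a live `#[repr(C)]`
/// struct that begins with the `BaseIn` header.
unsafe fn chain_tags(mut p: *const c_void) -> Vec<u32> {
    let mut tags = Vec::new();
    while !p.is_null() {
        let base = unsafe { &*(p as *const BaseIn) };
        tags.push(base.s_type);
        p = base.p_next;
    }
    tags
}

/// Builds Features2 -> Vulkan11Features -> null and walks it.
fn example_chain_tags() -> Vec<u32> {
    let v11 = Vulkan11Features {
        s_type: TAG_VULKAN_1_1_FEATURES,
        p_next: ptr::null(),
        multiview: 1,
    };
    let features2 = Features2 {
        s_type: TAG_FEATURES_2,
        p_next: &v11 as *const Vulkan11Features as *const c_void,
        geometry_shader: 0,
    };
    // Both structs outlive the walk, so the pointers stay valid.
    unsafe { chain_tags(&features2 as *const Features2 as *const c_void) }
}
```

The type tag is what makes the untyped pointer usable: the reader checks the tag first and only then casts to the matching struct.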
Now we can actually create it. The only thing we’re filling out is the queue create info; everything else we’re leaving zeroed.
Now we can get the Queue object we’ve been working for, which anticlimactically is a one-liner:
Storing in the Struct and Cleanup
Let’s modify our Engine struct real quick to store all these objects we created:
Now if you run this with the Vulkan Configurator open in the background, you’ll get a validation error about not destroying the device. As mentioned before, you have to manually clean things up in Vulkan, and in this case we need to destroy the logical device before the instance.
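One way to get that ordering is a Drop implementation on the Engine struct. The sketch below uses hypothetical `FakeInstance` and `FakeDevice` stand-ins that record the destruction order in a log, so it can run without Vulkan. With ash, the two calls would instead be the unsafe `device.destroy_device(None)` followed by `instance.destroy_instance(None)`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log so the destruction order can be inspected afterwards.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-ins for the instance and logical device wrappers, which do not
/// clean up after themselves when dropped.
struct FakeInstance {
    log: Log,
}

struct FakeDevice {
    log: Log,
}

impl FakeInstance {
    fn destroy(&self) {
        self.log.borrow_mut().push("instance destroyed");
    }
}

impl FakeDevice {
    fn destroy(&self) {
        self.log.borrow_mut().push("device destroyed");
    }
}

struct Engine {
    instance: FakeInstance,
    device: FakeDevice,
}

impl Drop for Engine {
    fn drop(&mut self) {
        // Children first: the logical device was created from the instance,
        // so it has to be destroyed before the instance is.
        self.device.destroy();
        self.instance.destroy();
    }
}
```

Putting the calls in Drop means the order is fixed in one place, regardless of how or where the Engine goes out of scope.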