The usage of the LLVM libraries in the Mesa 3D graphics library - GASERI


The Mesa 3D graphics library represents the de facto standard open-source implementation of OpenGL, OpenGL ES, OpenCL, Vulkan, and other open standards. Mesa offers several hardware drivers, including the drivers for several generations of ...



Onion Details



Page Clicks: 0

First Seen: 03/15/2024

Last Indexed: 09/18/2024

Domain Index Total: 397



Onion Content



The usage of the LLVM libraries in the Mesa 3D graphics library - The Mesa 3D graphics library represents the de facto standard open-source implementation of OpenGL , OpenGL ES , OpenCL , Vulkan , and other open standards . Mesa offers several hardware drivers , including the drivers for several generations of AMD Radeon GPUs. In the following we will focus on RadeonSI , Mesa's OpenGL, OpenGL ES, and OpenCL driver for Graphics Core Next (GCN) and Radeon DNA (RDNA) GPUs. For what it's worth, Mesa's GCN and RDNA Vulkan driver is called RADV ; a more detailed overview of the driver structure can be found in the State of open source AMD GPU drivers presentation ( recording ). Overview of the Graphics Core Next (GCN) and Radeon DNA (RDNA) architecture generations - Generations of the Graphics Core Next (GCN) architecture are: Generations of the Radeon DNA (RDNA) architecture are: Running the OpenCL programs using Clover - Gallium is Mesa's driver API that enables drivers for different hardware to share device-agnostic parts of the code. RadeonSI is one of the drivers using Gallium API; Nouveau is another, offering support for NVIDIA GPUs . Both drivers can use Mesa's Gallium frontends (also known as state trackers), which implement various standards for 3D graphics, compute, and video decoding acceleration. We are specifically interested in Clover, which is the Gallium frontend for OpenCL. Note - While Clover on RadeonSI can run many OpenCL programs, it is not a complete implementation of the OpenCL standard; the detailed list of the supported extensions can be found on the Mesa drivers matrix . There is an ongoing community effort to improve the Clover frontend and the RadeonSI driver as well. For a 2016/2017 overview of the work required to make Clover and RadeonSI usable for the scientific computing applications, see the presentations LLVM AMDGPU for High Performance Computing: are we competitive yet? ( slides , recording ) and Towards fully open source GPU accelerated molecular dynamics simulation ( slides , recording ) by the author of these exercises. We will start by running clinfo , a simple OpenCL program that prints out all known properties of all OpenCL platforms and devices in the system. Using the --version parameter we'll make sure that the clinfo command is working properly and that a recent version is being used: When --list parameter is specified, clinfo will print the list of OpenCL platforms and devices on each of the platforms: We can see that we have only one platform (Clover) and only one device (Radeon RX 6800, 2nd generation RDNA GPU codenamed Sienna Cichild ). Running clinfo command without parameters will make it print the platform and device properties: We can see LLVM 13.0.1 mentioned in several places. Clang and LLVM are used by Clover for compiling the OpenCL C code to assembly code for the gfx1030 processor, which is a part of the Radeon RX 6800 GPU. The resulting assembly code is then linked with libclc that contains the implementations of the fundamental OpenCL C data types and functions. Finally, the resulting code after linking is executed by RadeonSI on the Radeon 6800 GPU. Assignment - Compare the output of clinfo on your machine to the output shown above. (If you do not posses a GPU with an OpenCL driver, use Portable Computing Language ( GitHub ) to run OpenCL on the CPU.) - Using the environment variables - Mesa supports many environment variables which can be used for debugging purposes as well as learning how the compilation. Clover frontend environment variables are: RadeonSI driver environment variable AMD_DEBUG has several interesting options, including preoptir , which prints the LLVM intermediate representation before initial optimizations, and gisel , which enables the LLVM global instruction selector . Compiling the OpenCL programs with Clang - For convenience, we will be compiling the OpenCL programs with standalone Clang. An example OpenCL kernel that performs vector addition is as follows: Save this kernel in a file named vecadd.cl . In order to compile it, we will use the following parameters we have not used before: The parameter -emit-llvm can be used in addition to -S to make Clang write the LLVM intermediate representation instead of the assembly, just like we have used it previously. The resulting file is named vecadd.s . Let's take a look at its contents: Common GCN and RDNA assembly instructions can be divided into two groups, scalar (names starting with s_ ) and vector (names starting with v_ ). Scalar instructions use scalar general-purpose registers (SGPRs, named s1 , s2 etc.), while vector instructions use (VGPRs, named v1 , v2 etc.). The assembly code above contains, among other, the assembly code produced by from the OpenCL C code lines int i = get_global_id(0); and c[i] = a[i] + b[i]; . Figure out which lines in the assembly code correspond to each of the two lines and classify them as scalar or vector instructions. Modify the OpenCL C kernel to compute the sum of the three vectors instead of two and compile it into the assembly code. Compare the two resulting assembly codes in terms of code size and register usage. Tip - For details on how to compile with Clang for different GPUs, a good starting point is the article titled Compile OpenCL Kernel into LLVM-IR or Nvidia PTX written by Adam Basfop Cavendish and posted on his blog named Modest Destiny . llc - llc compiles the LLVM intermediate representation into the assembly language of any of the processor architectures supported by LLVM. The list of supported architectures can be obtained using the --version parameter: Note the amdgcn entry in the list of the LLVM's registered target architectures, i.e. AMD's graphics processors based on the Graphics Core Next architecture (GCN) . As the instructions of the Radeon DNA (RDNA) architecture are very similar to the instructions of the GCN architecture, the same LLVM backend is also used for RDNA, despite the name perhaps suggesting otherwise. The situation is similar with older graphics processors: the r600 backend supports the R600 architecture (marketing names Radeon HD 2000 and Radeon HD 3000) as well as R700 (Radeon HD 4000 series), Evergreen (Radeon HD 5000 series), and Northern Islands (Radeon HD 6000 series) architectures. Each architecture generation has a number of processors. The official documentation of the LLVM AMDGPU backend contains the list of processors and features . In addition, the list of the supported processors and features of the target can be obtained using the llc command with -march and -mattr parameters: A good starting point for further study of AMDGPU backend is Tom Stellard 's A Detailed Look at the R600 Backend ( slides , recording ), presented at 2013 LLVM Developers' Meeting . While focused on R600 and not GCN, many of the points made in the talk still hold. Use Clang to compile the OpenCL C code from the previous example to LLVM intermediate representation and then use llc to compile it for gfx700 , fiji , and gfx1030 processors. Compare the resulting assembly codes in terms of the code size, types of instructions used as well as register pressure. Author: Vedran Miletić