Skip to main content

OpenCL with Mali GPU


Ref: https://developer.arm.com/solutions/graphics-and-gaming/resources/presentations/opencl-tutorials

caution

Due to the difference in hardware structure, Desktop GPU and Mali GPU are different. Therefore, when using the Mali GPU, there are restrictions on the use of some functions.

Read the ARM® Mali™ GPU OpenCL Developer Guide, which can have a huge impact on performance.

Mali memory model

Mali GPU uses global memory with cache, instead of using local or private memory. If local or private memory is allocated, it is difficult to expect performance improvement, because it is allocated to global memory. In addition, unnecessary data movement may occur, which can degrade performance.

CL_MEM_ALLOC_HOST_PTR

It is recommended to use clCreateBuffer(CL_MEM_ALLOC_HOST_PTR) whenever possible.

CL_MEM_USE_HOST_PTR

It is better not to create a buffer using clCreateBuffer(CL_MEM_USE_HOST_PTR) whenever possible.

Creating a buffer with HOST_PTR creates a buffer accessed by the host program and a buffer accessed by the GPU in global memory. The result is 'unnecessary copying'.

malloc

There is a global memory area that the host program can access, not the Mali GPU.