At the invitation of Prof. Zhang Dianhua, Deputy Director of RAL, and Prof. Xie Zhi of the School of Information Science and Engineering, Dr. Cheng Runwei of the BGP R&D Center in Houston, USA visited our laboratory on April 25-26. On the afternoon of April 25, Dr. Cheng delivered a lecture titled "How to write a good GPU kernel" in the RAL 411 lecture hall. Prof. Zhang Dianhua chaired the session.
Dr. Cheng Runwei believes that, amid the current wave of the fourth industrial revolution, GPU high-performance computing has driven the second rise of machine learning and points to a promising future for today's AI computing era. High-performance parallel computing on the CPU+GPU heterogeneous architecture has become one of the two core technologies of this era. Drawing on his engineering experience in parallel computing, Dr. Cheng explained its significance for industrial production.
Drawing on his long-term research experience, Dr. Cheng described the general principles and steps of GPU kernel optimization. Using a finite-difference solution of the partial differential equation governing acoustic wave propagation as an example, he explained how to tune kernel parameters from the perspective of the GPU execution model and memory model. Dr. Cheng also recounted his own journey from C programmer to GPU programmer and the difficulties he encountered along the way, especially the shift in thinking that parallel programming requires, and illustrated these points in detail with a specific CUDA program.
Finally, Dr. Cheng answered the students' questions and encouraged them to face problems and difficulties head-on, assuring them that once they overcome the initial hurdles, a bright future awaits.
High-performance parallel computing on the CPU+GPU heterogeneous architecture will provide technical support for big data and intelligent manufacturing in the steel industry. It will also bring new ideas to the process control system R&D and finite element analysis in which RAL has long been engaged.


