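NVIDIA GPUs scale their clock frequency automatically by default. When you profile a program, the results reported by ncu can therefore be inaccurate, especially for small programs. For testing, the GPU clock frequency should be fixed. The commands to run are: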

sudo nvidia-smi -pm 1
nvidia-smi -q -d CLOCK
sudo nvidia-smi -lgc 2100,2100
nvidia-smi -q -d CLOCK

The relevant options, as described in the nvidia-smi help text:

-pm,  --persistence-mode=   Set persistence mode: 0/DISABLED, 1/ENABLED
-q,   --query               Display GPU or Unit info.
        -d,   --display=            Display only selected information: MEMORY,
                                    UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK,
                                    COMPUTE, PIDS, PERFORMANCE, SUPPORTED_CLOCKS,
                                    PAGE_RETIREMENT, ACCOUNTING, ENCODER_STATS,
                                    SUPPORTED_GPU_TARGET_TEMP, VOLTAGE
                                    FBC_STATS, ROW_REMAPPER
                                Flags can be combined with comma e.g. ECC,POWER.
                                Sampling data with max/min/avg is also returned 
                                for POWER, UTILIZATION and CLOCK display types.
                                Doesn't work with -u or -x flags.
-lgc  --lock-gpu-clocks=    Specifies <minGpuClock,maxGpuClock> clocks as a
                                    pair (e.g. 1500,1500) that defines the range 
                                    of desired locked GPU clock speed in MHz.
                                    Setting this will supercede application clocks
                                    and take effect regardless if an app is running.
                                    Input can also be a singular desired clock value
                                    (e.g. <GpuClockValue>).

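First, enable persistence mode. Then use -q -d CLOCK to find the GPU's maximum SM clock frequency, and use -lgc to set the lower and upper bounds; both can be set to the maximum frequency. Finally, query the current SM clock frequency again. Note that the actual frequency returned by this last query may be lower than the target: for example, the A3090 I use reports a maximum clock of 2100 MHz, but after locking to that value the actual clock is 1980 MHz.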
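The steps above can be sketched as a small shell helper. This is only a sketch: the function names (max_sm_clock, current_sm_clock, lock_to_max) are made up for illustration, and it assumes the driver supports the clocks.max.sm and clocks.sm fields of nvidia-smi --query-gpu and that sudo is available.

```shell
#!/bin/sh
# Sketch: lock a GPU to its maximum SM clock. Function names are illustrative.

# Print the maximum SM clock (MHz) for the given GPU index (default 0).
max_sm_clock() {
    nvidia-smi -i "${1:-0}" --query-gpu=clocks.max.sm --format=csv,noheader,nounits \
        | head -n 1 | tr -d '[:space:]'
}

# Print the current SM clock (MHz) for the given GPU index (default 0).
current_sm_clock() {
    nvidia-smi -i "${1:-0}" --query-gpu=clocks.sm --format=csv,noheader,nounits \
        | head -n 1 | tr -d '[:space:]'
}

# Enable persistence mode, lock min and max to the maximum SM clock,
# then report the requested and the currently observed frequency.
lock_to_max() {
    gpu="${1:-0}"
    target="$(max_sm_clock "$gpu")"
    [ -n "$target" ] || { echo "could not read max SM clock" >&2; return 1; }
    sudo nvidia-smi -i "$gpu" -pm 1 || return 1
    sudo nvidia-smi -i "$gpu" -lgc "${target},${target}" || return 1
    echo "requested ${target} MHz, current $(current_sm_clock "$gpu") MHz"
}
```

After sourcing the file, run lock_to_max 0 to lock GPU 0. The final line prints both values because, as noted above, the observed clock can end up below the requested one.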

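When profiling a program with ncu, add --clock-control=none to prevent ncu from controlling the GPU frequency itself. The ncu help text describes the clock-control option as follows: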

Control the behavior of the GPU clocks during profiling. Allowed values:

  • base: GPC and memory clocks are locked to their respective base frequency during profiling. This has no impact on thermal throttling. Note that actual clocks might still vary, depending on the level of driver support for this feature. As an alternative, use nvidia-smi to lock the clocks externally and set this option to none.
  • none: No GPC or memory frequencies are changed during profiling.

The default is base.
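Putting this together, a profiling run with externally locked clocks might look like the sketch below. The helper names profile_fixed and unlock_clocks are hypothetical, as are the report name and program path in the usage line. It assumes the clocks were already locked with nvidia-smi -lgc, and it uses nvidia-smi -rgc to reset the GPU clocks afterwards.

```shell
#!/bin/sh
# Sketch: profile with ncu while leaving clock control to nvidia-smi.

# Usage: profile_fixed <report-name> <program> [args...]
# --clock-control none tells ncu not to change GPC or memory clocks.
profile_fixed() {
    report="$1"
    shift
    ncu --clock-control none -o "$report" "$@"
}

# Undo the earlier "nvidia-smi -lgc" lock once profiling is finished.
unlock_clocks() {
    sudo nvidia-smi -rgc
}
```

For example: profile_fixed my_report ./my_app, followed by unlock_clocks. Resetting afterwards keeps the GPU from staying pinned at its maximum frequency after the measurements are done.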

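Copyright of this article belongs to FindHao | This site is licensed under CC-BY-NC-SA 4.0 by default |
Reproduction must include this notice and credit the author FindHao and the original address of this article as a hyperlink:
https://www.findhao.net/easycoding/2587.html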
