Message ID | 20220518063032.2377351-1-tarumizu.kohei@fujitsu.com (mailing list archive) |
---|---|
Series | Add hardware prefetch control driver for A64FX and x86 |

On 18/05/2022 15.30, Kohei Tarumizu wrote:
> This patch series add sysfs interface to control CPU's hardware
> prefetch behavior for performance tuning from userspace for the
> processor A64FX and x86 (on supported CPU).
>
> [snip]
>
> In pattern A, a change of dist at L1 has a larger effect. On the other
> hand, in pattern B, the change of dist at L2 has a larger effect.
> As described above, the optimal dist combination depends on the
> characteristics of the application. Therefore, such a sysfs interface
> is useful for performance tuning.

If this is something to be tuned for specific applications, shouldn't it
be a prctl or similar and part of process context, so different
applications can use different settings (or even a single application
depending on what it's doing)? Especially if writing those sysregs/MSRs
is cheap.

In particular, configuring things separately for different cores feels
strange. You'd then have to pin applications to specific cores to get
the benefits, and wouldn't be able to optimize for multiple applications
running simultaneously that need different prefetch behavior if they
share cores.

Thanks for the comment.

> If this is something to be tuned for specific applications, shouldn't it be a prctl or
> similar and part of process context, so different applications can use different
> settings (or even a single application depending on what it's doing)? Especially if
> writing those sysregs/MSRs is cheap.

> In particular, configuring things separately for different cores feels strange. You'd
> then have to pin applications to specific cores to get the benefits, and wouldn't be
> able to optimize for multiple applications running simultaneously that need
> different prefetch behavior if they share cores.

As you say, this is used for tuning specific applications.
I assume that users of this feature bind an application to a specific
core and use that core exclusively. This is not only for pfctl, but also
to prevent performance from being affected by context switches, etc.

I agree that it would also be useful to be able to control this in the
process context. However, given the usage assumed above, I think it is
sufficient to provide a userspace interface that exposes the hardware
prefetch register directly.
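
The usage assumed above (bind the application to one core, then tune that core's prefetch setting through sysfs) could look roughly like the following from userspace. This is only a sketch: the sysfs path template and the attribute name `dist` are hypothetical placeholders, not the layout defined by the patch series, and the helper names are made up for illustration. Only `os.sched_setaffinity` and ordinary file writes are real interfaces here.

```python
import os
from pathlib import Path

# ASSUMPTION: hypothetical sysfs layout, used only for illustration.
# The actual attribute names and directories are defined by the patch series.
SYSFS_TEMPLATE = (
    "/sys/devices/system/cpu/cpu{cpu}/cache/index{level}/prefetch_control/{attr}"
)


def prefetch_attr_path(cpu, level, attr, template=SYSFS_TEMPLATE):
    """Build the (hypothetical) per-core, per-cache-level attribute path."""
    return Path(template.format(cpu=cpu, level=level, attr=attr))


def pin_to_cpu(cpu):
    """Restrict the calling process to a single core (Linux only)."""
    os.sched_setaffinity(0, {cpu})


def set_prefetch_attr(cpu, level, attr, value, template=SYSFS_TEMPLATE):
    """Write a value to the attribute file; on real sysfs this needs privileges."""
    path = prefetch_attr_path(cpu, level, attr, template)
    path.write_text(f"{value}\n")
    return path


def tune_on_core(cpu, level, attr, value, template=SYSFS_TEMPLATE):
    """Pin the current process to `cpu`, then tune that core's prefetcher."""
    pin_to_cpu(cpu)
    return set_prefetch_attr(cpu, level, attr, value, template)
```

For example, `tune_on_core(12, 0, "dist", 4)` would pin the process to core 12 and write `4` to that core's (hypothetical) `dist` attribute for cache index 0. Because the setting is per core rather than per process, any other task scheduled on core 12 would see the same prefetch behavior, which is the limitation raised in the question above.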