Message ID | 20190613161656.20765-1-julien.grall@arm.com (mailing list archive) |
---|---|
Series | arm64/sve: First steps towards optimizing syscalls |
On Thu, Jun 13, 2019 at 05:16:48PM +0100, Julien Grall wrote:
> Hi all,
>
> This is a first attempt to optimize the syscall path when the user
> application uses SVE. The patch series is based on v5.2-rc4.
>
> Per the syscall ABI, SVE registers will be unknown after a syscall. In
> practice, the kernel will disable SVE and the registers will be zeroed
> (except the first 128 bits of each vector) on the next SVE instruction.
> In a workload mixing SVE and syscalls, this will result in 2 entries/exits
> to/from the kernel per syscall.
>
> This series aims to avoid the second entry/exit by zeroing the SVE
> registers on syscall return, with a twist when the task gets
> rescheduled.
>
> This implementation will have an impact on applications that use SVE
> only once. SVE will now stay turned on until the application terminates
> (unless it is disabled via ptrace). Cleverer strategies for choosing
> between SVE and FPSIMD context switching are possible (see [1]), but
> it is difficult to assess the benefit right now. We could improve the
> behaviour in the future as a selection of mature hardware platforms
> emerges that we can benchmark.

I'm wondering whether we ought to do something about this, such as
turning SVE back off after the nth syscall. Given the complexity of
this code though, let's stabilise the series as-is first.

I probably ask dumb questions in some places, since I'm trying to
refresh my memory of the subtleties of this code as I go...

> It is also possible to optimize the case when the SVE vector length
> is 128 bits (i.e. the same size as the FPSIMD vectors). This could be
> explored in a future respin of the series.
>
> While developing the series, I have added a series of tracepoints in
> the SVE code. They may not be suitable for upstreaming and hence are not
> included in the series. I can provide them if anyone is interested.
>
> Note that the last patch of the series is not here to optimize syscalls
> but SVE trap access, by directly converting the FPSIMD state to SVE
> state in hardware. If there is interest in having this optimization
> earlier, I can reshuffle the patches in the series.

I think this could make sense. Maybe see what Will and Catalin think
about it.

[...]

Cheers
---Dave
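
For readers following the thread, the entry/exit counting in the quoted cover letter can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not kernel code: `struct task_model`, `use_sve()`, `syscall_disable_sve()`, `syscall_zero_on_return()` and `run()` are hypothetical names invented for this example, and rescheduling (the "twist" mentioned above) is not modelled at all.

```c
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */
struct task_model {
    bool sve_enabled;      /* false: the next SVE instruction traps */
    unsigned long entries; /* kernel entries: syscalls plus SVE access traps */
};

/* The task executes an SVE instruction; it traps if SVE is disabled. */
void use_sve(struct task_model *t)
{
    if (!t->sve_enabled) {
        t->entries++;          /* SVE access trap: one extra entry/exit */
        t->sve_enabled = true; /* kernel re-enables SVE for the task */
    }
}

/* Behaviour described as current: SVE is disabled across the syscall. */
void syscall_disable_sve(struct task_model *t)
{
    t->entries++;
    t->sve_enabled = false;
}

/* Behaviour the series aims for: registers are zeroed on syscall return
 * and SVE stays enabled, so the next SVE instruction does not trap. */
void syscall_zero_on_return(struct task_model *t)
{
    t->entries++;
    /* zeroing of the bits above the low 128 would happen here */
}

/* Run n iterations of "syscall, then SVE instruction"; return entries. */
unsigned long run(void (*do_syscall)(struct task_model *), int n)
{
    struct task_model t = { .sve_enabled = false, .entries = 0 };

    use_sve(&t); /* the first SVE use traps in both models */
    for (int i = 0; i < n; i++) {
        do_syscall(&t);
        use_sve(&t);
    }
    return t.entries;
}
```

In this model, `run(syscall_disable_sve, 1000)` counts two kernel entries per iteration plus the initial trap, while `run(syscall_zero_on_return, 1000)` counts one per iteration plus the initial trap. The model says nothing about the relative cost of zeroing versus trapping, which is the part that needs real hardware to assess.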