Message ID | 1589301419-24459-14-git-send-email-Dave.Martin@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | prctl.2 man page updates for Linux 5.6 | expand |
Hi Dave, On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: > diff --git a/man2/prctl.2 b/man2/prctl.2 > index 7f511d2..dd16227 100644 > --- a/man2/prctl.2 > +++ b/man2/prctl.2 > @@ -1291,6 +1291,104 @@ call failing with the error > .BR ENXIO . > For further details, see the kernel source file > .IR Documentation/admin-guide/kernel-parameters.txt . > +.\" prctl PR_SVE_SET_VL > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862 > +.\" linux-5.6/Documentation/arm64/sve.rst > +.TP > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)" > +Configure the thread's SVE vector length, > +as specified by > +.IR arg2 . > +Arguments > +.IR arg3 ", " arg4 " and " arg5 > +are ignored. Bugger, did we forget to force these to zero? I guess we should write the man-page first next time :( > +.IP > +The bits of > +.I arg2 > +corresponding to > +.B SVE_VL_LEN_MASK PR_SVE_LEN_MASK > +must be set to the desired vector length in bytes. > +In addition, > +.I arg2 > +may include zero or more of the following flags: > +.RS > +.TP > +.B PR_SVE_VL_INHERIT > +Inherit the configured vector length across > +.BR execve (2). > +.TP > +.B PR_SVE_SET_VL_ONEXEC > +Defer the change until the next > +.BR execve (2) > +in this thread. (aside, it's weird that we didn't allocate (1<<16) for one of these flags) > +If > +.B PR_SVE_VL_INHERIT > +is also included in > +.IR arg2 , > +it takes effect > +.I after > +this deferred change. I find this a bit hard to follow, since it's not clear to me whether the INHERIT flag is effectively set before or after the next execve(). In other words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, is the vector length preserved or reset on the next execve()? > +.RE > +.IP > +On success, the vector length and flags are set as requested, > +and any deferred change that was pending immediately before the > +.B PR_SVE_SET_VL > +call is canceled. Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks wrong to me ;) > +If > +.B PR_SVE_SET_VL_ONEXEC > +was included in > +.IR arg2 , > +the returned value describes the configuration > +scheduled to take effect at the next > +.BR execve (2). "describes the configuration" how? > +Otherwise, the effect is immediate and > +the returned value describes the new configuration. > +The returned value is encoded in the same way as the return value of > +.BR PR_SVE_GET_VL . Aha. Maybe move this bit up slightly? > +.IP > +If neither of the above flags is included in are included > +.IR arg2 , > +a subsequent > +.BR execve (2) > +resets the vector length to the default value configured in > +.IR /proc/sys/abi/sve_default_vector_length . > +.IP > +The actual vector length configured by this operation > +is the greatest vector length supported by the platform > +that does not exceed > +.I arg2 > +& > +.BR PR_SVE_VL_LEN_MASK . > +.IP > +The configuration (including any pending deferred change) > +is inherited across > +.BR fork (2) > +and > +.BR clone (2). > +.\" prctl PR_SVE_GET_VL > +.TP > +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)" > +Get the thread's current SVE vector length configuration, > +as configured by > +.BR PR_SVE_SET_VL . It doesn't *have* to be configured by PR_SVE_SET_VL though, right? > +.IP > +If successful, the return value describes the > +.I current > +configuration. (aside: prctl() returns int, so we can't ever allocate past bit 30 in arg2. Might be worth a note somewhere in the kernel). > +The bits corresponding to > +.B PR_SVE_VL_LEN_MASK > +contain the currently configured vector length in bytes. > +The bit corresponding to > +.B PR_SVE_VL_INHERIT > +indicates whether the vector length will be inherited > +across > +.BR execve (2). > +.IP > +Note that there is no way determine whether there is to determine > +a pending vector length change that has not yet taken effect. > +.IP > +Providing that the kernel and platform support SVE, > +this operation always succeeds. > .\" > .\" prctl PR_TASK_PERF_EVENTS_DISABLE > .TP > @@ -1534,6 +1632,8 @@ On success, > .BR PR_GET_NO_NEW_PRIVS , > .BR PR_GET_SECUREBITS , > .BR PR_GET_SPECULATION_CTRL , > +.BR PR_SVE_GET_VL , > +.BR PR_SVE_SET_VL , > .BR PR_GET_THP_DISABLE , > .BR PR_GET_TIMING , > .BR PR_GET_TIMERSLACK , > @@ -1817,6 +1917,18 @@ and unused arguments to > .BR prctl () > are not 0. > .TP > +.B EINVAL > +.I option > +was > +.B PR_SVE_SET_VL > +and > +.I arg2 > +contains invalid flags, or > +.I arg2 > +& > +.B SVE_VL_LEN_MASK > +is not a valid vector length. > +.TP PR_SVE_GET_VL can return -EINVAL if SVE is not supported. Will
On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: > Hi Dave, > > On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: > > diff --git a/man2/prctl.2 b/man2/prctl.2 > > index 7f511d2..dd16227 100644 > > --- a/man2/prctl.2 > > +++ b/man2/prctl.2 > > @@ -1291,6 +1291,104 @@ call failing with the error > > .BR ENXIO . > > For further details, see the kernel source file > > .IR Documentation/admin-guide/kernel-parameters.txt . > > +.\" prctl PR_SVE_SET_VL > > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862 > > +.\" linux-5.6/Documentation/arm64/sve.rst > > +.TP > > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)" > > +Configure the thread's SVE vector length, > > +as specified by > > +.IR arg2 . > > +Arguments > > +.IR arg3 ", " arg4 " and " arg5 > > +are ignored. > > Bugger, did we forget to force these to zero? I guess we should write the > man-page first next time :( Not an accident, but there does seem to be some inconsistency in policy among the various prctls here. glibc explicitly has extern int prctl (int __option, ...); (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.) Is there some agreed rationale for requiring redundant arguments to be supplied explicitly as zero? For now there are likely to be few users of this, so we _might_ get away with changing the behaviour here if it's considered important enough. There is no forwards compatibility problem with this prctl though, because there are spare bits in arg2 which can "turn on" additional args if needed. Also, it's implausible that PR_SVE_GET_VL will ever want an argument. There are still 2 billion unallocated prctl numbers, so new prctls can always be added if there's ever a need to work around these limitations, but it seems extremely unlikely. > > > +.IP > > +The bits of > > +.I arg2 > > +corresponding to > > +.B SVE_VL_LEN_MASK > > PR_SVE_LEN_MASK Hmm, not sure how that happened. Good spot! I'll recheck that all the names are real when reposting. > > +must be set to the desired vector length in bytes. > > +In addition, > > +.I arg2 > > +may include zero or more of the following flags: > > +.RS > > +.TP > > +.B PR_SVE_VL_INHERIT > > +Inherit the configured vector length across > > +.BR execve (2). > > +.TP > > +.B PR_SVE_SET_VL_ONEXEC > > +Defer the change until the next > > +.BR execve (2) > > +in this thread. > > (aside, it's weird that we didn't allocate (1<<16) for one of these flags) The flag definitions are shared with ptrace: ptrace is the SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl. Maybe it would have made more sense to keep the definitions completely separate, but it's there now. > > +If > > +.B PR_SVE_VL_INHERIT > > +is also included in > > +.IR arg2 , > > +it takes effect > > +.I after > > +this deferred change. > > I find this a bit hard to follow, since it's not clear to me whether the > INHERIT flag is effectively set before or after the next execve(). In other > words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, > is the vector length preserved or reset on the next execve()? It makes no difference, because the ONEXEC handling takes priority over the INHERIT handling. But either way INHERIT is never cleared by execve() and will apply at subsequent execs(). Explaining all this properly seems out of scope here. Maybe this should be trimmed down rather than elaborated? Or perhaps just explain it in terms of what the kernel does instead of futile attempts to make it intuitive? Ultimately I'll probably write a separate page or pages for SVE and other arm64 specifics. > > +.RE > > +.IP > > +On success, the vector length and flags are set as requested, > > +and any deferred change that was pending immediately before the > > +.B PR_SVE_SET_VL > > +call is canceled. > > Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks > wrong to me ;) Yeah, I know, but the man pages do have a documented policy on this... prctl.2 has a lot of authors, so having mixed spellings could get particularly messy. > > > +If > > +.B PR_SVE_SET_VL_ONEXEC > > +was included in > > +.IR arg2 , > > +the returned value describes the configuration > > +scheduled to take effect at the next > > +.BR execve (2). > > "describes the configuration" how? > > > +Otherwise, the effect is immediate and > > +the returned value describes the new configuration. > > +The returned value is encoded in the same way as the return value of > > +.BR PR_SVE_GET_VL . > > Aha. Maybe move this bit up slightly? Yes, I'll reorder that. > > > +.IP > > +If neither of the above flags is included in > > are included Debatable. The subject of the verb here is not "flags" (plural), but "neither of the above flags" (which is more nuanced, though it can be interpreted as singular). Usage varies, and I don't consider this wrong. > > +.IR arg2 , > > +a subsequent > > +.BR execve (2) > > +resets the vector length to the default value configured in > > +.IR /proc/sys/abi/sve_default_vector_length . > > +.IP > > +The actual vector length configured by this operation > > +is the greatest vector length supported by the platform > > +that does not exceed > > +.I arg2 > > +& > > +.BR PR_SVE_VL_LEN_MASK . > > +.IP > > +The configuration (including any pending deferred change) > > +is inherited across > > +.BR fork (2) > > +and > > +.BR clone (2). > > +.\" prctl PR_SVE_GET_VL > > +.TP > > +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)" > > +Get the thread's current SVE vector length configuration, > > +as configured by > > +.BR PR_SVE_SET_VL . > > It doesn't *have* to be configured by PR_SVE_SET_VL though, right? No, "as" here is trying to mean that PR_SVE_SET_VL illustrates how the vl can be set, not that it is the only way. Maybe just delete that clause? I'm not sure it adds much. > > +.IP > > +If successful, the return value describes the > > +.I current > > +configuration. > > (aside: prctl() returns int, so we can't ever allocate past bit 30 in arg2. > Might be worth a note somewhere in the kernel). Agreed. The chance of adding more bits seemed negligible, but dropping a comment in would probably be a good idea. Maybe we could redocument PR_SVE_SET_VL's arg2 as an int. Passing an int must survive the variadic argument marshaling of the glibc prctl() wrapper, otherwise passing the existing #defines without an explicit case to long would already break. I'll probably just add the comment for now though. > > +The bits corresponding to > > +.B PR_SVE_VL_LEN_MASK > > +contain the currently configured vector length in bytes. > > +The bit corresponding to > > +.B PR_SVE_VL_INHERIT > > +indicates whether the vector length will be inherited > > +across > > +.BR execve (2). > > +.IP > > +Note that there is no way determine whether there is > > to determine Ack > > +a pending vector length change that has not yet taken effect. > > +.IP > > +Providing that the kernel and platform support SVE, > > +this operation always succeeds. > > .\" > > .\" prctl PR_TASK_PERF_EVENTS_DISABLE > > .TP > > @@ -1534,6 +1632,8 @@ On success, > > .BR PR_GET_NO_NEW_PRIVS , > > .BR PR_GET_SECUREBITS , > > .BR PR_GET_SPECULATION_CTRL , > > +.BR PR_SVE_GET_VL , > > +.BR PR_SVE_SET_VL , > > .BR PR_GET_THP_DISABLE , > > .BR PR_GET_TIMING , > > .BR PR_GET_TIMERSLACK , > > @@ -1817,6 +1917,18 @@ and unused arguments to > > .BR prctl () > > are not 0. > > .TP > > +.B EINVAL > > +.I option > > +was > > +.B PR_SVE_SET_VL > > +and > > +.I arg2 > > +contains invalid flags, or > > +.I arg2 > > +& > > +.B SVE_VL_LEN_MASK > > +is not a valid vector length. > > +.TP > > PR_SVE_GET_VL can return -EINVAL if SVE is not supported. See the earlier patch about the "unsupported hardware case of EINVAL". This affects many prctls, is "obvious" and we'd just have to document the same thing over and over again... Thanks for the review. Perhaps I was slightly oo hasty about having got it half right! ---Dave
Hi, On 5/13/20 12:46 PM, Dave Martin wrote: > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: >> Hi Dave, >> >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: >>> diff --git a/man2/prctl.2 b/man2/prctl.2 >>> index 7f511d2..dd16227 100644 >>> --- a/man2/prctl.2 >>> +++ b/man2/prctl.2 >>> @@ -1291,6 +1291,104 @@ call failing with the error >>> .BR ENXIO . >>> For further details, see the kernel source file >>> .IR Documentation/admin-guide/kernel-parameters.txt . >>> +.\" prctl PR_SVE_SET_VL >>> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862 >>> +.\" linux-5.6/Documentation/arm64/sve.rst >>> +.TP >>> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)" >>> +Configure the thread's SVE vector length, >>> +as specified by >>> +.IR arg2 . >>> +Arguments >>> +.IR arg3 ", " arg4 " and " arg5 >>> +are ignored. >> >> Bugger, did we forget to force these to zero? I guess we should write the >> man-page first next time :( Quite... > Not an accident, but there does seem to be some inconsistency in policy > among the various prctls here. The whole 5-args-for-prctl thing was a bit of a misdesign. The general preference is that, for new prctls, unused arguments should be required to be zero. Historically, there was much inconsistency. > glibc explicitly has > > extern int prctl (int __option, ...); > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.) > > Is there some agreed rationale for requiring redundant arguments to be > supplied explicitly as zero? For now there are likely to be few users > of this, so we _might_ get away with changing the behaviour here if it's > considered important enough. See above. > There is no forwards compatibility problem with this prctl though, > because there are spare bits in arg2 which can "turn on" additional > args if needed. > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument. > > There are still 2 billion unallocated prctl numbers, so new prctls can > always be added if there's ever a need to work around these limitations, > but it seems extremely unlikely. > >> >>> +.IP >>> +The bits of >>> +.I arg2 >>> +corresponding to >>> +.B SVE_VL_LEN_MASK >> >> PR_SVE_LEN_MASK > > Hmm, not sure how that happened. Good spot! > > I'll recheck that all the names are real when reposting. > >>> +must be set to the desired vector length in bytes. >>> +In addition, >>> +.I arg2 >>> +may include zero or more of the following flags: >>> +.RS >>> +.TP >>> +.B PR_SVE_VL_INHERIT >>> +Inherit the configured vector length across >>> +.BR execve (2). >>> +.TP >>> +.B PR_SVE_SET_VL_ONEXEC >>> +Defer the change until the next >>> +.BR execve (2) >>> +in this thread. >> >> (aside, it's weird that we didn't allocate (1<<16) for one of these flags) > > The flag definitions are shared with ptrace: ptrace is the > SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl. > > Maybe it would have made more sense to keep the definitions completely > separate, but it's there now. > >>> +If >>> +.B PR_SVE_VL_INHERIT >>> +is also included in >>> +.IR arg2 , >>> +it takes effect >>> +.I after >>> +this deferred change. >> >> I find this a bit hard to follow, since it's not clear to me whether the >> INHERIT flag is effectively set before or after the next execve(). In other >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, >> is the vector length preserved or reset on the next execve()? > > It makes no difference, because the ONEXEC handling takes priority over > the INHERIT handling. But either way INHERIT is never cleared by execve() > and will apply at subsequent execs(). > > Explaining all this properly seems out of scope here. Maybe this should > be trimmed down rather than elaborated? Or perhaps just explain it in > terms of what the kernel does instead of futile attempts to make it > intuitive? > > Ultimately I'll probably write a separate page or pages for SVE and other > arm64 specifics. (okay.) >>> +.RE >>> +.IP >>> +On success, the vector length and flags are set as requested, >>> +and any deferred change that was pending immediately before the >>> +.B PR_SVE_SET_VL >>> +call is canceled. >> >> Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks >> wrong to me ;) > > Yeah, I know, but the man pages do have a documented policy on this... > > prctl.2 has a lot of authors, so having mixed spellings could get > particularly messy. Quite. Indeed, that was how things were when I took over as maintainer: a hodge-podge of British and American spellings, occasionally even in the same page. I decided we needed consistency, and though American is not my native spelling, it seemed the most appropriate convention. >> >>> +If >>> +.B PR_SVE_SET_VL_ONEXEC >>> +was included in >>> +.IR arg2 , >>> +the returned value describes the configuration >>> +scheduled to take effect at the next >>> +.BR execve (2). >> >> "describes the configuration" how? >> >>> +Otherwise, the effect is immediate and >>> +the returned value describes the new configuration. >>> +The returned value is encoded in the same way as the return value of >>> +.BR PR_SVE_GET_VL . >> >> Aha. Maybe move this bit up slightly? > > Yes, I'll reorder that. > >> >>> +.IP >>> +If neither of the above flags is included in >> >> are included > > Debatable. > > The subject of the verb here is not "flags" (plural), but "neither of > the above flags" (which is more nuanced, though it can be interpreted > as singular). Usage varies, and I don't consider this wrong. As far as I know, the grammarians are with you on this one, Dave, and if I was writing carefully, I'd do the same as you. But, the plural here is also frequent (and so common that I would hesitate to call it "wrong"). [...] Cheers, Michael
On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote: > Hi, > > On 5/13/20 12:46 PM, Dave Martin wrote: > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: > >> Hi Dave, > >> > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: > >>> diff --git a/man2/prctl.2 b/man2/prctl.2 > >>> index 7f511d2..dd16227 100644 > >>> --- a/man2/prctl.2 > >>> +++ b/man2/prctl.2 > >>> @@ -1291,6 +1291,104 @@ call failing with the error > >>> .BR ENXIO . > >>> For further details, see the kernel source file > >>> .IR Documentation/admin-guide/kernel-parameters.txt . > >>> +.\" prctl PR_SVE_SET_VL > >>> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862 > >>> +.\" linux-5.6/Documentation/arm64/sve.rst > >>> +.TP > >>> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)" > >>> +Configure the thread's SVE vector length, > >>> +as specified by > >>> +.IR arg2 . > >>> +Arguments > >>> +.IR arg3 ", " arg4 " and " arg5 > >>> +are ignored. > >> > >> Bugger, did we forget to force these to zero? I guess we should write the > >> man-page first next time :( > > Quite... > > > Not an accident, but there does seem to be some inconsistency in policy > > among the various prctls here. > > The whole 5-args-for-prctl thing was a bit of a misdesign. > > The general preference is that, for new prctls, unused arguments > should be required to be zero. Historically, there was much > inconsistency. > > > glibc explicitly has > > > > extern int prctl (int __option, ...); > > > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.) > > > > Is there some agreed rationale for requiring redundant arguments to be > > supplied explicitly as zero? For now there are likely to be few users > > of this, so we _might_ get away with changing the behaviour here if it's > > considered important enough. > > See above. So there is no bulletproof rationale for either approach, but the main concern is inconsistency? Have I understood that right? I'll propose to get that written down in the kernel source somewhere if so. (From my end, the pros and cons of the two approaches seem superficial but the inconsistency is indeed annoying. For PR_SVE_SET_VL, I think the first example I looked at didn't zero the trailing arguments, so I didn't either... but it's been upstream for several releases, so most likely we're stuck with it.) > > > There is no forwards compatibility problem with this prctl though, > > because there are spare bits in arg2 which can "turn on" additional > > args if needed. > > > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument. > > > > There are still 2 billion unallocated prctl numbers, so new prctls can > > always be added if there's ever a need to work around these limitations, > > but it seems extremely unlikely. > > > >> > >>> +.IP > >>> +The bits of > >>> +.I arg2 > >>> +corresponding to > >>> +.B SVE_VL_LEN_MASK > >> > >> PR_SVE_LEN_MASK > > > > Hmm, not sure how that happened. Good spot! > > > > I'll recheck that all the names are real when reposting. > > > >>> +must be set to the desired vector length in bytes. > >>> +In addition, > >>> +.I arg2 > >>> +may include zero or more of the following flags: > >>> +.RS > >>> +.TP > >>> +.B PR_SVE_VL_INHERIT > >>> +Inherit the configured vector length across > >>> +.BR execve (2). > >>> +.TP > >>> +.B PR_SVE_SET_VL_ONEXEC > >>> +Defer the change until the next > >>> +.BR execve (2) > >>> +in this thread. > >> > >> (aside, it's weird that we didn't allocate (1<<16) for one of these flags) > > > > The flag definitions are shared with ptrace: ptrace is the > > SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl. > > > > Maybe it would have made more sense to keep the definitions completely > > separate, but it's there now. > > > >>> +If > >>> +.B PR_SVE_VL_INHERIT > >>> +is also included in > >>> +.IR arg2 , > >>> +it takes effect > >>> +.I after > >>> +this deferred change. > >> > >> I find this a bit hard to follow, since it's not clear to me whether the > >> INHERIT flag is effectively set before or after the next execve(). In other > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, > >> is the vector length preserved or reset on the next execve()? > > > > It makes no difference, because the ONEXEC handling takes priority over > > the INHERIT handling. But either way INHERIT is never cleared by execve() > > and will apply at subsequent execs(). > > > > Explaining all this properly seems out of scope here. Maybe this should > > be trimmed down rather than elaborated? Or perhaps just explain it in > > terms of what the kernel does instead of futile attempts to make it > > intuitive? > > > > Ultimately I'll probably write a separate page or pages for SVE and other > > arm64 specifics. > > (okay.) > > >>> +.RE > >>> +.IP > >>> +On success, the vector length and flags are set as requested, > >>> +and any deferred change that was pending immediately before the > >>> +.B PR_SVE_SET_VL > >>> +call is canceled. > >> > >> Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks > >> wrong to me ;) > > > > Yeah, I know, but the man pages do have a documented policy on this... > > > > prctl.2 has a lot of authors, so having mixed spellings could get > > particularly messy. > > Quite. Indeed, that was how things were when I took over as > maintainer: a hodge-podge of British and American spellings, > occasionally even in the same page. I decided we needed > consistency, and though American is not my native spelling, > it seemed the most appropriate convention. > > >> > >>> +If > >>> +.B PR_SVE_SET_VL_ONEXEC > >>> +was included in > >>> +.IR arg2 , > >>> +the returned value describes the configuration > >>> +scheduled to take effect at the next > >>> +.BR execve (2). > >> > >> "describes the configuration" how? > >> > >>> +Otherwise, the effect is immediate and > >>> +the returned value describes the new configuration. > >>> +The returned value is encoded in the same way as the return value of > >>> +.BR PR_SVE_GET_VL . > >> > >> Aha. Maybe move this bit up slightly? > > > > Yes, I'll reorder that. > > > >> > >>> +.IP > >>> +If neither of the above flags is included in > >> > >> are included > > > > Debatable. > > > > The subject of the verb here is not "flags" (plural), but "neither of > > the above flags" (which is more nuanced, though it can be interpreted > > as singular). Usage varies, and I don't consider this wrong. > > As far as I know, the grammarians are with you on this one, > Dave, and if I was writing carefully, I'd do the same as you. Good, because I just made that up to justify myself! > But, the plural here is also frequent (and so common that I would > hesitate to call it "wrong"). Sure, I don't think either is wrong as such. My preference is only a preference, and partly depends on the context anyway. Cheers ---Dave
On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote: > On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote: > > On 5/13/20 12:46 PM, Dave Martin wrote: > > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: > > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: > > > glibc explicitly has > > > > > > extern int prctl (int __option, ...); > > > > > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.) > > > > > > Is there some agreed rationale for requiring redundant arguments to be > > > supplied explicitly as zero? For now there are likely to be few users > > > of this, so we _might_ get away with changing the behaviour here if it's > > > considered important enough. > > > > See above. > > So there is no bulletproof rationale for either approach, but the main > concern is inconsistency? Have I understood that right? I think it's all just an extension of "make sure unused parameters are 0" idiom which allows those bits to be safely repurposed for flags and things later on without the worry of existing applications getting away with passing junk. > I'll propose to get that written down in the kernel source somewhere > if so. That would be a really good idea, actually! > (From my end, the pros and cons of the two approaches seem superficial > but the inconsistency is indeed annoying. For PR_SVE_SET_VL, I think > the first example I looked at didn't zero the trailing arguments, so I > didn't either... but it's been upstream for several releases, so most > likely we're stuck with it.) FWIW, I wasn't blaming you for this. Just that these oversights aren't always apparent when reviewing patches, but become more clear when reviewing the documentation. > > > There is no forwards compatibility problem with this prctl though, > > > because there are spare bits in arg2 which can "turn on" additional > > > args if needed. > > > > > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument. > > > > > > There are still 2 billion unallocated prctl numbers, so new prctls can > > > always be added if there's ever a need to work around these limitations, > > > but it seems extremely unlikely. Oh, there are ways out, but had I noticed this during code review it would've been very easy just to enforce zero for the other args and be done with it. > > >>> +If > > >>> +.B PR_SVE_VL_INHERIT > > >>> +is also included in > > >>> +.IR arg2 , > > >>> +it takes effect > > >>> +.I after > > >>> +this deferred change. > > >> > > >> I find this a bit hard to follow, since it's not clear to me whether the > > >> INHERIT flag is effectively set before or after the next execve(). In other > > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, > > >> is the vector length preserved or reset on the next execve()? > > > > > > It makes no difference, because the ONEXEC handling takes priority over > > > the INHERIT handling. But either way INHERIT is never cleared by execve() > > > and will apply at subsequent execs(). > > > > > > Explaining all this properly seems out of scope here. Maybe this should > > > be trimmed down rather than elaborated? Or perhaps just explain it in > > > terms of what the kernel does instead of futile attempts to make it > > > intuitive? Hmm, if we don't explain it in the man page then we should at least point people to somewhere where they can get the gory details, because I think they're necessary in order to use the prctl() request correctly. I'm still not confident that I understand the semantics of setting both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which may change. (I concede on all the spelling/grammar discussions ;) Will
On Wed, May 13, 2020 at 10:11:54PM +0100, Will Deacon wrote: > On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote: > > On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote: > > > On 5/13/20 12:46 PM, Dave Martin wrote: > > > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: > > > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: > > > > glibc explicitly has > > > > > > > > extern int prctl (int __option, ...); > > > > > > > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.) > > > > > > > > Is there some agreed rationale for requiring redundant arguments to be > > > > supplied explicitly as zero? For now there are likely to be few users > > > > of this, so we _might_ get away with changing the behaviour here if it's > > > > considered important enough. > > > > > > See above. > > > > So there is no bulletproof rationale for either approach, but the main > > concern is inconsistency? Have I understood that right? > > I think it's all just an extension of "make sure unused parameters are 0" > idiom which allows those bits to be safely repurposed for flags and things > later on without the worry of existing applications getting away with > passing junk. I'd say that the explicit zeroing may give a false sense of safety, but I sympathise with the intent. At least, I think the explicit zeroing means that any resulting bugs are more likely to be fixable in userspace. > > I'll propose to get that written down in the kernel source somewhere > > if so. > > That would be a really good idea, actually! > > > (From my end, the pros and cons of the two approaches seem superficial > > but the inconsistency is indeed annoying. For PR_SVE_SET_VL, I think > > the first example I looked at didn't zero the trailing arguments, so I > > didn't either... but it's been upstream for several releases, so most > > likely we're stuck with it.) > > FWIW, I wasn't blaming you for this. Just that these oversights aren't > always apparent when reviewing patches, but become more clear when > reviewing the documentation. I'll have a think, so long as nobody implies that the SVE prctls are "wrong" ;) Adding comments in the code about how the implementation of those prctls can and can't safely be extended would be sensible though. I'll try to address that at some point. > > > > There is no forwards compatibility problem with this prctl though, > > > > because there are spare bits in arg2 which can "turn on" additional > > > > args if needed. > > > > > > > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument. > > > > > > > > There are still 2 billion unallocated prctl numbers, so new prctls can > > > > always be added if there's ever a need to work around these limitations, > > > > but it seems extremely unlikely. > > Oh, there are ways out, but had I noticed this during code review it > would've been very easy just to enforce zero for the other args and be done > with it. Ack > > > >>> +If > > > >>> +.B PR_SVE_VL_INHERIT > > > >>> +is also included in > > > >>> +.IR arg2 , > > > >>> +it takes effect > > > >>> +.I after > > > >>> +this deferred change. > > > >> > > > >> I find this a bit hard to follow, since it's not clear to me whether the > > > >> INHERIT flag is effectively set before or after the next execve(). In other > > > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, > > > >> is the vector length preserved or reset on the next execve()? > > > > > > > > It makes no difference, because the ONEXEC handling takes priority over > > > > the INHERIT handling. But either way INHERIT is never cleared by execve() > > > > and will apply at subsequent execs(). > > > > > > > > Explaining all this properly seems out of scope here. Maybe this should > > > > be trimmed down rather than elaborated? Or perhaps just explain it in > > > > terms of what the kernel does instead of futile attempts to make it > > > > intuitive? > > Hmm, if we don't explain it in the man page then we should at least point > people to somewhere where they can get the gory details, because I think > they're necessary in order to use the prctl() request correctly. I'm still > not confident that I understand the semantics of setting both > PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which > may change. > > (I concede on all the spelling/grammar discussions ;) Ultimately I aim to add another page, but for now would it be sufficient to refer to Documentation/? Cheers ---Dave
On Wed, May 13, 2020 at 10:11:54PM +0100, Will Deacon wrote: > On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote: > > On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote: > > > On 5/13/20 12:46 PM, Dave Martin wrote: > > > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote: > > > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote: [...] > > > >>> +If > > > >>> +.B PR_SVE_VL_INHERIT > > > >>> +is also included in > > > >>> +.IR arg2 , > > > >>> +it takes effect > > > >>> +.I after > > > >>> +this deferred change. > > > >> > > > >> I find this a bit hard to follow, since it's not clear to me whether the > > > >> INHERIT flag is effectively set before or after the next execve(). In other > > > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified, > > > >> is the vector length preserved or reset on the next execve()? > > > > > > > > It makes no difference, because the ONEXEC handling takes priority over > > > > the INHERIT handling. But either way INHERIT is never cleared by execve() > > > > and will apply at subsequent execs(). > > > > > > > > Explaining all this properly seems out of scope here. Maybe this should > > > > be trimmed down rather than elaborated? Or perhaps just explain it in > > > > terms of what the kernel does instead of futile attempts to make it > > > > intuitive? > > Hmm, if we don't explain it in the man page then we should at least point > people to somewhere where they can get the gory details, because I think > they're necessary in order to use the prctl() request correctly. I'm still > not confident that I understand the semantics of setting both > PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which > may change. On this point, can you review the following wording? I simply enumerate the possible flag combinations now, rather than tying myself in knots trying to describe the two flags independently. Cheers ---Dave --8<-- PR_SVE_SET_VL (since Linux 4.15, only on arm64) Configure the thread's SVE vector length, as specified by (int) arg2. Arguments arg3, arg4 and arg5 are ignored. The bits of arg2 corresponding to PR_SVE_VL_LEN_MASK must be set to the desired vector length in bytes. This is interpreted as an upper bound: the kernel will select the greatest available vector length that does not exceed the value specified. In par- ticular, specifying SVE_VL_MAX (defined in <asm/sigcontext.h>) for the PR_SVE_VL_LEN_MASK bits requests the maximum supported vector length. In addition, arg2 may include the following combinations of flags: 0 Perform the change immediately. At the next execve(2) in the thread, the vector length will be reset to the value configured in /proc/sys/abi/sve_default_vector_length. PR_SVE_VL_INHERIT Perform the change immediately. Subsequent execve(2) calls will preserve the new vector length. PR_SVE_SET_VL_ONEXEC Defer the change, so that it is performed at the next execve(2) in the thread. Further execve(2) calls will reset the vector length to the value configured in /proc/sys/abi/sve_default_vector_length. PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT Defer the change, so that it is performed at the next execve(2) in the thread. Further execve(2) calls will preserve the new vector length. In all cases, any previously pending deferred change is can- celed. The call fails with error EINVAL if SVE is not supported on the platform, if arg2 is unrecognized or invalid, or the value in the bits of arg2 corresponding to PR_SVE_VL_LEN_MASK is outside the range SVE_VL_MIN..SVE_VL_MAX, or is not a multiple of 16. On success, a nonnegative value is returned that describes the selected configuration, which may differ from the current con- figuration if PR_SVE_SET_VL_ONEXEC was specified. The value is encoded in the same way as the return value of PR_SVE_GET_VL. The configuration (including any pending deferred change) is inherited across fork(2) and clone(2). For more information, see the kernel source file Documenta- tion/arm64/sve.rst (or Documentation/arm64/sve.txt before Linux 5.3). Warning: Because the compiler or run-time environment may be using SVE, using this call without the PR_SVE_SET_VL_ONEXEC flag can lead to unpredicable behaviour in the calling process. The conditions for using it safely are complex and system-dependent. Don't use it unless you really know what you are doing. -->8--
diff --git a/man2/prctl.2 b/man2/prctl.2 index 7f511d2..dd16227 100644 --- a/man2/prctl.2 +++ b/man2/prctl.2 @@ -1291,6 +1291,104 @@ call failing with the error .BR ENXIO . For further details, see the kernel source file .IR Documentation/admin-guide/kernel-parameters.txt . +.\" prctl PR_SVE_SET_VL +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862 +.\" linux-5.6/Documentation/arm64/sve.rst +.TP +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)" +Configure the thread's SVE vector length, +as specified by +.IR arg2 . +Arguments +.IR arg3 ", " arg4 " and " arg5 +are ignored. +.IP +The bits of +.I arg2 +corresponding to +.B SVE_VL_LEN_MASK +must be set to the desired vector length in bytes. +In addition, +.I arg2 +may include zero or more of the following flags: +.RS +.TP +.B PR_SVE_VL_INHERIT +Inherit the configured vector length across +.BR execve (2). +.TP +.B PR_SVE_SET_VL_ONEXEC +Defer the change until the next +.BR execve (2) +in this thread. +If +.B PR_SVE_VL_INHERIT +is also included in +.IR arg2 , +it takes effect +.I after +this deferred change. +.RE +.IP +On success, the vector length and flags are set as requested, +and any deferred change that was pending immediately before the +.B PR_SVE_SET_VL +call is canceled. +If +.B PR_SVE_SET_VL_ONEXEC +was included in +.IR arg2 , +the returned value describes the configuration +scheduled to take effect at the next +.BR execve (2). +Otherwise, the effect is immediate and +the returned value describes the new configuration. +The returned value is encoded in the same way as the return value of +.BR PR_SVE_GET_VL . +.IP +If neither of the above flags is included in +.IR arg2 , +a subsequent +.BR execve (2) +resets the vector length to the default value configured in +.IR /proc/sys/abi/sve_default_vector_length . +.IP +The actual vector length configured by this operation +is the greatest vector length supported by the platform +that does not exceed +.I arg2 +& +.BR PR_SVE_VL_LEN_MASK . +.IP +The configuration (including any pending deferred change) +is inherited across +.BR fork (2) +and +.BR clone (2). +.\" prctl PR_SVE_GET_VL +.TP +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)" +Get the thread's current SVE vector length configuration, +as configured by +.BR PR_SVE_SET_VL . +.IP +If successful, the return value describes the +.I current +configuration. +The bits corresponding to +.B PR_SVE_VL_LEN_MASK +contain the currently configured vector length in bytes. +The bit corresponding to +.B PR_SVE_VL_INHERIT +indicates whether the vector length will be inherited +across +.BR execve (2). +.IP +Note that there is no way determine whether there is +a pending vector length change that has not yet taken effect. +.IP +Providing that the kernel and platform support SVE, +this operation always succeeds. .\" .\" prctl PR_TASK_PERF_EVENTS_DISABLE .TP @@ -1534,6 +1632,8 @@ On success, .BR PR_GET_NO_NEW_PRIVS , .BR PR_GET_SECUREBITS , .BR PR_GET_SPECULATION_CTRL , +.BR PR_SVE_GET_VL , +.BR PR_SVE_SET_VL , .BR PR_GET_THP_DISABLE , .BR PR_GET_TIMING , .BR PR_GET_TIMERSLACK , @@ -1817,6 +1917,18 @@ and unused arguments to .BR prctl () are not 0. .TP +.B EINVAL +.I option +was +.B PR_SVE_SET_VL +and +.I arg2 +contains invalid flags, or +.I arg2 +& +.B SVE_VL_LEN_MASK +is not a valid vector length. +.TP .B ENODEV .I option was
Add documentation for the the PR_SVE_SET_VL and PR_SVE_GET_VL prctls added in Linux 4.15 for arm64. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> --- I wrote the SVE support originally, so I probably have this one halfway right. The explantion added is not exhaustive, but I didn't want it to be too verbose. I may trim it a bit and move the detail to a dedicated page later on, but this is better than nothing in the meantime. --- man2/prctl.2 | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 112 insertions(+)