diff mbox series

[13/14] prctl.2: Add SVE prctls (arm64)

Message ID 1589301419-24459-14-git-send-email-Dave.Martin@arm.com (mailing list archive)
State New, archived
Headers show
Series prctl.2 man page updates for Linux 5.6 | expand

Commit Message

Dave Martin May 12, 2020, 4:36 p.m. UTC
Add documentation for the the PR_SVE_SET_VL and PR_SVE_GET_VL
prctls added in Linux 4.15 for arm64.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>

---

I wrote the SVE support originally, so I probably have this one
halfway right.

The explantion added is not exhaustive, but I didn't want it to be too
verbose.  I may trim it a bit and move the detail to a dedicated page
later on, but this is better than nothing in the meantime.
---
 man2/prctl.2 | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

Comments

Will Deacon May 13, 2020, 8:43 a.m. UTC | #1
Hi Dave,

On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index 7f511d2..dd16227 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1291,6 +1291,104 @@ call failing with the error
>  .BR ENXIO .
>  For further details, see the kernel source file
>  .IR Documentation/admin-guide/kernel-parameters.txt .
> +.\" prctl PR_SVE_SET_VL
> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> +.\" linux-5.6/Documentation/arm64/sve.rst
> +.TP
> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> +Configure the thread's SVE vector length,
> +as specified by
> +.IR arg2 .
> +Arguments
> +.IR arg3 ", " arg4 " and " arg5
> +are ignored.

Bugger, did we forget to force these to zero? I guess we should write the
man-page first next time :(

> +.IP
> +The bits of
> +.I arg2
> +corresponding to
> +.B SVE_VL_LEN_MASK

PR_SVE_LEN_MASK

> +must be set to the desired vector length in bytes.
> +In addition,
> +.I arg2
> +may include zero or more of the following flags:
> +.RS
> +.TP
> +.B PR_SVE_VL_INHERIT
> +Inherit the configured vector length across
> +.BR execve (2).
> +.TP
> +.B PR_SVE_SET_VL_ONEXEC
> +Defer the change until the next
> +.BR execve (2)
> +in this thread.

(aside, it's weird that we didn't allocate (1<<16) for one of these flags)

> +If
> +.B PR_SVE_VL_INHERIT
> +is also included in
> +.IR arg2 ,
> +it takes effect
> +.I after
> +this deferred change.

I find this a bit hard to follow, since it's not clear to me whether the
INHERIT flag is effectively set before or after the next execve(). In other
words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
is the vector length preserved or reset on the next execve()?

> +.RE
> +.IP
> +On success, the vector length and flags are set as requested,
> +and any deferred change that was pending immediately before the
> +.B PR_SVE_SET_VL
> +call is canceled.

Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks
wrong to me ;)

> +If
> +.B PR_SVE_SET_VL_ONEXEC
> +was included in
> +.IR arg2 ,
> +the returned value describes the configuration
> +scheduled to take effect at the next
> +.BR execve (2).

"describes the configuration" how?

> +Otherwise, the effect is immediate and
> +the returned value describes the new configuration.
> +The returned value is encoded in the same way as the return value of
> +.BR PR_SVE_GET_VL .

Aha. Maybe move this bit up slightly?

> +.IP
> +If neither of the above flags is included in

are included

> +.IR arg2 ,
> +a subsequent
> +.BR execve (2)
> +resets the vector length to the default value configured in
> +.IR /proc/sys/abi/sve_default_vector_length .
> +.IP
> +The actual vector length configured by this operation
> +is the greatest vector length supported by the platform
> +that does not exceed
> +.I arg2
> +&
> +.BR PR_SVE_VL_LEN_MASK .
> +.IP
> +The configuration (including any pending deferred change)
> +is inherited across
> +.BR fork (2)
> +and
> +.BR clone (2).
> +.\" prctl PR_SVE_GET_VL
> +.TP
> +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
> +Get the thread's current SVE vector length configuration,
> +as configured by
> +.BR PR_SVE_SET_VL .

It doesn't *have* to be configured by PR_SVE_SET_VL though, right?

> +.IP
> +If successful, the return value describes the
> +.I current
> +configuration.

(aside: prctl() returns int, so we can't ever allocate past bit 30 in arg2.
Might be worth a note somewhere in the kernel).

> +The bits corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +contain the currently configured vector length in bytes.
> +The bit corresponding to
> +.B PR_SVE_VL_INHERIT
> +indicates whether the vector length will be inherited
> +across
> +.BR execve (2).
> +.IP
> +Note that there is no way determine whether there is

to determine

> +a pending vector length change that has not yet taken effect.
> +.IP
> +Providing that the kernel and platform support SVE,
> +this operation always succeeds.
>  .\"
>  .\" prctl PR_TASK_PERF_EVENTS_DISABLE
>  .TP
> @@ -1534,6 +1632,8 @@ On success,
>  .BR PR_GET_NO_NEW_PRIVS ,
>  .BR PR_GET_SECUREBITS ,
>  .BR PR_GET_SPECULATION_CTRL ,
> +.BR PR_SVE_GET_VL ,
> +.BR PR_SVE_SET_VL ,
>  .BR PR_GET_THP_DISABLE ,
>  .BR PR_GET_TIMING ,
>  .BR PR_GET_TIMERSLACK ,
> @@ -1817,6 +1917,18 @@ and unused arguments to
>  .BR prctl ()
>  are not 0.
>  .TP
> +.B EINVAL
> +.I option
> +was
> +.B PR_SVE_SET_VL
> +and
> +.I arg2
> +contains invalid flags, or
> +.I arg2
> +&
> +.B SVE_VL_LEN_MASK
> +is not a valid vector length.
> +.TP

PR_SVE_GET_VL can return -EINVAL if SVE is not supported.

Will
Dave Martin May 13, 2020, 10:46 a.m. UTC | #2
On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
> Hi Dave,
> 
> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
> > diff --git a/man2/prctl.2 b/man2/prctl.2
> > index 7f511d2..dd16227 100644
> > --- a/man2/prctl.2
> > +++ b/man2/prctl.2
> > @@ -1291,6 +1291,104 @@ call failing with the error
> >  .BR ENXIO .
> >  For further details, see the kernel source file
> >  .IR Documentation/admin-guide/kernel-parameters.txt .
> > +.\" prctl PR_SVE_SET_VL
> > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> > +.\" linux-5.6/Documentation/arm64/sve.rst
> > +.TP
> > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> > +Configure the thread's SVE vector length,
> > +as specified by
> > +.IR arg2 .
> > +Arguments
> > +.IR arg3 ", " arg4 " and " arg5
> > +are ignored.
> 
> Bugger, did we forget to force these to zero? I guess we should write the
> man-page first next time :(

Not an accident, but there does seem to be some inconsistency in policy
among the various prctls here.

glibc explicitly has

	extern int prctl (int __option, ...);

(and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.)

Is there some agreed rationale for requiring redundant arguments to be
supplied explicitly as zero?  For now there are likely to be few users
of this, so we _might_ get away with changing the behaviour here if it's
considered important enough.

There is no forwards compatibility problem with this prctl though,
because there are spare bits in arg2 which can "turn on" additional
args if needed.

Also, it's implausible that PR_SVE_GET_VL will ever want an argument.

There are still 2 billion unallocated prctl numbers, so new prctls can
always be added if there's ever a need to work around these limitations,
but it seems extremely unlikely.

> 
> > +.IP
> > +The bits of
> > +.I arg2
> > +corresponding to
> > +.B SVE_VL_LEN_MASK
> 
> PR_SVE_LEN_MASK

Hmm, not sure how that happened.  Good spot!

I'll recheck that all the names are real when reposting.

> > +must be set to the desired vector length in bytes.
> > +In addition,
> > +.I arg2
> > +may include zero or more of the following flags:
> > +.RS
> > +.TP
> > +.B PR_SVE_VL_INHERIT
> > +Inherit the configured vector length across
> > +.BR execve (2).
> > +.TP
> > +.B PR_SVE_SET_VL_ONEXEC
> > +Defer the change until the next
> > +.BR execve (2)
> > +in this thread.
> 
> (aside, it's weird that we didn't allocate (1<<16) for one of these flags)

The flag definitions are shared with ptrace: ptrace is the
SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl.

Maybe it would have made more sense to keep the definitions completely
separate, but it's there now.

> > +If
> > +.B PR_SVE_VL_INHERIT
> > +is also included in
> > +.IR arg2 ,
> > +it takes effect
> > +.I after
> > +this deferred change.
> 
> I find this a bit hard to follow, since it's not clear to me whether the
> INHERIT flag is effectively set before or after the next execve(). In other
> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
> is the vector length preserved or reset on the next execve()?

It makes no difference, because the ONEXEC handling takes priority over
the INHERIT handling. But either way INHERIT is never cleared by execve()
and will apply at subsequent execs().

Explaining all this properly seems out of scope here.  Maybe this should
be trimmed down rather than elaborated?  Or perhaps just explain it in
terms of what the kernel does instead of futile attempts to make it
intuitive?

Ultimately I'll probably write a separate page or pages for SVE and other
arm64 specifics.

> > +.RE
> > +.IP
> > +On success, the vector length and flags are set as requested,
> > +and any deferred change that was pending immediately before the
> > +.B PR_SVE_SET_VL
> > +call is canceled.
> 
> Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks
> wrong to me ;)

Yeah, I know, but the man pages do have a documented policy on this...

prctl.2 has a lot of authors, so having mixed spellings could get
particularly messy.

> 
> > +If
> > +.B PR_SVE_SET_VL_ONEXEC
> > +was included in
> > +.IR arg2 ,
> > +the returned value describes the configuration
> > +scheduled to take effect at the next
> > +.BR execve (2).
> 
> "describes the configuration" how?
> 
> > +Otherwise, the effect is immediate and
> > +the returned value describes the new configuration.
> > +The returned value is encoded in the same way as the return value of
> > +.BR PR_SVE_GET_VL .
> 
> Aha. Maybe move this bit up slightly?

Yes, I'll reorder that.

> 
> > +.IP
> > +If neither of the above flags is included in
> 
> are included

Debatable.

The subject of the verb here is not "flags" (plural), but "neither of
the above flags" (which is more nuanced, though it can be interpreted
as singular).  Usage varies, and I don't consider this wrong.

> > +.IR arg2 ,
> > +a subsequent
> > +.BR execve (2)
> > +resets the vector length to the default value configured in
> > +.IR /proc/sys/abi/sve_default_vector_length .
> > +.IP
> > +The actual vector length configured by this operation
> > +is the greatest vector length supported by the platform
> > +that does not exceed
> > +.I arg2
> > +&
> > +.BR PR_SVE_VL_LEN_MASK .
> > +.IP
> > +The configuration (including any pending deferred change)
> > +is inherited across
> > +.BR fork (2)
> > +and
> > +.BR clone (2).
> > +.\" prctl PR_SVE_GET_VL
> > +.TP
> > +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
> > +Get the thread's current SVE vector length configuration,
> > +as configured by
> > +.BR PR_SVE_SET_VL .
> 
> It doesn't *have* to be configured by PR_SVE_SET_VL though, right?

No, "as" here is trying to mean that PR_SVE_SET_VL illustrates how the
vl can be set, not that it is the only way.

Maybe just delete that clause?  I'm not sure it adds much.

> > +.IP
> > +If successful, the return value describes the
> > +.I current
> > +configuration.
> 
> (aside: prctl() returns int, so we can't ever allocate past bit 30 in arg2.
> Might be worth a note somewhere in the kernel).

Agreed.  The chance of adding more bits seemed negligible, but dropping
a comment in would probably be a good idea.

Maybe we could redocument PR_SVE_SET_VL's arg2 as an int.  Passing an
int must survive the variadic argument marshaling of the glibc prctl()
wrapper, otherwise passing the existing #defines without an explicit
case to long would already break.

I'll probably just add the comment for now though.

> > +The bits corresponding to
> > +.B PR_SVE_VL_LEN_MASK
> > +contain the currently configured vector length in bytes.
> > +The bit corresponding to
> > +.B PR_SVE_VL_INHERIT
> > +indicates whether the vector length will be inherited
> > +across
> > +.BR execve (2).
> > +.IP
> > +Note that there is no way determine whether there is
> 
> to determine

Ack

> > +a pending vector length change that has not yet taken effect.
> > +.IP
> > +Providing that the kernel and platform support SVE,
> > +this operation always succeeds.
> >  .\"
> >  .\" prctl PR_TASK_PERF_EVENTS_DISABLE
> >  .TP
> > @@ -1534,6 +1632,8 @@ On success,
> >  .BR PR_GET_NO_NEW_PRIVS ,
> >  .BR PR_GET_SECUREBITS ,
> >  .BR PR_GET_SPECULATION_CTRL ,
> > +.BR PR_SVE_GET_VL ,
> > +.BR PR_SVE_SET_VL ,
> >  .BR PR_GET_THP_DISABLE ,
> >  .BR PR_GET_TIMING ,
> >  .BR PR_GET_TIMERSLACK ,
> > @@ -1817,6 +1917,18 @@ and unused arguments to
> >  .BR prctl ()
> >  are not 0.
> >  .TP
> > +.B EINVAL
> > +.I option
> > +was
> > +.B PR_SVE_SET_VL
> > +and
> > +.I arg2
> > +contains invalid flags, or
> > +.I arg2
> > +&
> > +.B SVE_VL_LEN_MASK
> > +is not a valid vector length.
> > +.TP
> 
> PR_SVE_GET_VL can return -EINVAL if SVE is not supported.

See the earlier patch about the "unsupported hardware case of EINVAL".

This affects many prctls, is "obvious" and we'd just have to document
the same thing over and over again...


Thanks for the review.  Perhaps I was slightly oo hasty about having got
it half right!

---Dave
Michael Kerrisk (man-pages) May 13, 2020, 11:01 a.m. UTC | #3
Hi,

On 5/13/20 12:46 PM, Dave Martin wrote:
> On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
>> Hi Dave,
>>
>> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
>>> diff --git a/man2/prctl.2 b/man2/prctl.2
>>> index 7f511d2..dd16227 100644
>>> --- a/man2/prctl.2
>>> +++ b/man2/prctl.2
>>> @@ -1291,6 +1291,104 @@ call failing with the error
>>>  .BR ENXIO .
>>>  For further details, see the kernel source file
>>>  .IR Documentation/admin-guide/kernel-parameters.txt .
>>> +.\" prctl PR_SVE_SET_VL
>>> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
>>> +.\" linux-5.6/Documentation/arm64/sve.rst
>>> +.TP
>>> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
>>> +Configure the thread's SVE vector length,
>>> +as specified by
>>> +.IR arg2 .
>>> +Arguments
>>> +.IR arg3 ", " arg4 " and " arg5
>>> +are ignored.
>>
>> Bugger, did we forget to force these to zero? I guess we should write the
>> man-page first next time :(

Quite...

> Not an accident, but there does seem to be some inconsistency in policy
> among the various prctls here.

The whole 5-args-for-prctl thing was a bit of a misdesign.

The general preference is that, for new prctls, unused arguments 
should be required to be zero. Historically, there was much
inconsistency.

> glibc explicitly has
> 
> 	extern int prctl (int __option, ...);
> 
> (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.)
> 
> Is there some agreed rationale for requiring redundant arguments to be
> supplied explicitly as zero?  For now there are likely to be few users
> of this, so we _might_ get away with changing the behaviour here if it's
> considered important enough.

See above.

> There is no forwards compatibility problem with this prctl though,
> because there are spare bits in arg2 which can "turn on" additional
> args if needed.
> 
> Also, it's implausible that PR_SVE_GET_VL will ever want an argument.
> 
> There are still 2 billion unallocated prctl numbers, so new prctls can
> always be added if there's ever a need to work around these limitations,
> but it seems extremely unlikely.
> 
>>
>>> +.IP
>>> +The bits of
>>> +.I arg2
>>> +corresponding to
>>> +.B SVE_VL_LEN_MASK
>>
>> PR_SVE_LEN_MASK
> 
> Hmm, not sure how that happened.  Good spot!
> 
> I'll recheck that all the names are real when reposting.
> 
>>> +must be set to the desired vector length in bytes.
>>> +In addition,
>>> +.I arg2
>>> +may include zero or more of the following flags:
>>> +.RS
>>> +.TP
>>> +.B PR_SVE_VL_INHERIT
>>> +Inherit the configured vector length across
>>> +.BR execve (2).
>>> +.TP
>>> +.B PR_SVE_SET_VL_ONEXEC
>>> +Defer the change until the next
>>> +.BR execve (2)
>>> +in this thread.
>>
>> (aside, it's weird that we didn't allocate (1<<16) for one of these flags)
> 
> The flag definitions are shared with ptrace: ptrace is the
> SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl.
> 
> Maybe it would have made more sense to keep the definitions completely
> separate, but it's there now.
> 
>>> +If
>>> +.B PR_SVE_VL_INHERIT
>>> +is also included in
>>> +.IR arg2 ,
>>> +it takes effect
>>> +.I after
>>> +this deferred change.
>>
>> I find this a bit hard to follow, since it's not clear to me whether the
>> INHERIT flag is effectively set before or after the next execve(). In other
>> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
>> is the vector length preserved or reset on the next execve()?
> 
> It makes no difference, because the ONEXEC handling takes priority over
> the INHERIT handling. But either way INHERIT is never cleared by execve()
> and will apply at subsequent execs().
> 
> Explaining all this properly seems out of scope here.  Maybe this should
> be trimmed down rather than elaborated?  Or perhaps just explain it in
> terms of what the kernel does instead of futile attempts to make it
> intuitive?
> 
> Ultimately I'll probably write a separate page or pages for SVE and other
> arm64 specifics.

(okay.)

>>> +.RE
>>> +.IP
>>> +On success, the vector length and flags are set as requested,
>>> +and any deferred change that was pending immediately before the
>>> +.B PR_SVE_SET_VL
>>> +call is canceled.
>>
>> Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks
>> wrong to me ;)
> 
> Yeah, I know, but the man pages do have a documented policy on this...
> 
> prctl.2 has a lot of authors, so having mixed spellings could get
> particularly messy.

Quite. Indeed, that was how things were when I took over as
maintainer: a hodge-podge of British and American spellings,
occasionally even in the same page. I decided we needed
consistency, and though American is not my native spelling,
it seemed the most appropriate convention.

>>
>>> +If
>>> +.B PR_SVE_SET_VL_ONEXEC
>>> +was included in
>>> +.IR arg2 ,
>>> +the returned value describes the configuration
>>> +scheduled to take effect at the next
>>> +.BR execve (2).
>>
>> "describes the configuration" how?
>>
>>> +Otherwise, the effect is immediate and
>>> +the returned value describes the new configuration.
>>> +The returned value is encoded in the same way as the return value of
>>> +.BR PR_SVE_GET_VL .
>>
>> Aha. Maybe move this bit up slightly?
> 
> Yes, I'll reorder that.
> 
>>
>>> +.IP
>>> +If neither of the above flags is included in
>>
>> are included
> 
> Debatable.
> 
> The subject of the verb here is not "flags" (plural), but "neither of
> the above flags" (which is more nuanced, though it can be interpreted
> as singular).  Usage varies, and I don't consider this wrong.

As far as I know, the grammarians are with you on this one,
Dave, and if I was writing carefully, I'd do the same as you.
But, the plural here is also frequent (and so common that I would
hesitate to call it "wrong").

[...]


Cheers,

Michael
Dave Martin May 13, 2020, 2:02 p.m. UTC | #4
On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote:
> Hi,
> 
> On 5/13/20 12:46 PM, Dave Martin wrote:
> > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
> >> Hi Dave,
> >>
> >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
> >>> diff --git a/man2/prctl.2 b/man2/prctl.2
> >>> index 7f511d2..dd16227 100644
> >>> --- a/man2/prctl.2
> >>> +++ b/man2/prctl.2
> >>> @@ -1291,6 +1291,104 @@ call failing with the error
> >>>  .BR ENXIO .
> >>>  For further details, see the kernel source file
> >>>  .IR Documentation/admin-guide/kernel-parameters.txt .
> >>> +.\" prctl PR_SVE_SET_VL
> >>> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> >>> +.\" linux-5.6/Documentation/arm64/sve.rst
> >>> +.TP
> >>> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> >>> +Configure the thread's SVE vector length,
> >>> +as specified by
> >>> +.IR arg2 .
> >>> +Arguments
> >>> +.IR arg3 ", " arg4 " and " arg5
> >>> +are ignored.
> >>
> >> Bugger, did we forget to force these to zero? I guess we should write the
> >> man-page first next time :(
> 
> Quite...
> 
> > Not an accident, but there does seem to be some inconsistency in policy
> > among the various prctls here.
> 
> The whole 5-args-for-prctl thing was a bit of a misdesign.
> 
> The general preference is that, for new prctls, unused arguments 
> should be required to be zero. Historically, there was much
> inconsistency.
> 
> > glibc explicitly has
> > 
> > 	extern int prctl (int __option, ...);
> > 
> > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.)
> > 
> > Is there some agreed rationale for requiring redundant arguments to be
> > supplied explicitly as zero?  For now there are likely to be few users
> > of this, so we _might_ get away with changing the behaviour here if it's
> > considered important enough.
> 
> See above.

So there is no bulletproof rationale for either approach, but the main
concern is inconsistency?  Have I understood that right?

I'll propose to get that written down in the kernel source somewhere
if so.

(From my end, the pros and cons of the two approaches seem superficial
but the inconsistency is indeed annoying.  For PR_SVE_SET_VL, I think
the first example I looked at didn't zero the trailing arguments, so I
didn't either... but it's been upstream for several releases, so most
likely we're stuck with it.)

> 
> > There is no forwards compatibility problem with this prctl though,
> > because there are spare bits in arg2 which can "turn on" additional
> > args if needed.
> > 
> > Also, it's implausible that PR_SVE_GET_VL will ever want an argument.
> > 
> > There are still 2 billion unallocated prctl numbers, so new prctls can
> > always be added if there's ever a need to work around these limitations,
> > but it seems extremely unlikely.
> > 
> >>
> >>> +.IP
> >>> +The bits of
> >>> +.I arg2
> >>> +corresponding to
> >>> +.B SVE_VL_LEN_MASK
> >>
> >> PR_SVE_LEN_MASK
> > 
> > Hmm, not sure how that happened.  Good spot!
> > 
> > I'll recheck that all the names are real when reposting.
> > 
> >>> +must be set to the desired vector length in bytes.
> >>> +In addition,
> >>> +.I arg2
> >>> +may include zero or more of the following flags:
> >>> +.RS
> >>> +.TP
> >>> +.B PR_SVE_VL_INHERIT
> >>> +Inherit the configured vector length across
> >>> +.BR execve (2).
> >>> +.TP
> >>> +.B PR_SVE_SET_VL_ONEXEC
> >>> +Defer the change until the next
> >>> +.BR execve (2)
> >>> +in this thread.
> >>
> >> (aside, it's weird that we didn't allocate (1<<16) for one of these flags)
> > 
> > The flag definitions are shared with ptrace: ptrace is the
> > SVE_PT_REGS_SVE format selection bit, which isn't relevant to the prctl.
> > 
> > Maybe it would have made more sense to keep the definitions completely
> > separate, but it's there now.
> > 
> >>> +If
> >>> +.B PR_SVE_VL_INHERIT
> >>> +is also included in
> >>> +.IR arg2 ,
> >>> +it takes effect
> >>> +.I after
> >>> +this deferred change.
> >>
> >> I find this a bit hard to follow, since it's not clear to me whether the
> >> INHERIT flag is effectively set before or after the next execve(). In other
> >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
> >> is the vector length preserved or reset on the next execve()?
> > 
> > It makes no difference, because the ONEXEC handling takes priority over
> > the INHERIT handling. But either way INHERIT is never cleared by execve()
> > and will apply at subsequent execs().
> > 
> > Explaining all this properly seems out of scope here.  Maybe this should
> > be trimmed down rather than elaborated?  Or perhaps just explain it in
> > terms of what the kernel does instead of futile attempts to make it
> > intuitive?
> > 
> > Ultimately I'll probably write a separate page or pages for SVE and other
> > arm64 specifics.
> 
> (okay.)
> 
> >>> +.RE
> >>> +.IP
> >>> +On success, the vector length and flags are set as requested,
> >>> +and any deferred change that was pending immediately before the
> >>> +.B PR_SVE_SET_VL
> >>> +call is canceled.
> >>
> >> Huh, turns out 'canceled' is a valid US spelling. Fair enough, but it looks
> >> wrong to me ;)
> > 
> > Yeah, I know, but the man pages do have a documented policy on this...
> > 
> > prctl.2 has a lot of authors, so having mixed spellings could get
> > particularly messy.
> 
> Quite. Indeed, that was how things were when I took over as
> maintainer: a hodge-podge of British and American spellings,
> occasionally even in the same page. I decided we needed
> consistency, and though American is not my native spelling,
> it seemed the most appropriate convention.
> 
> >>
> >>> +If
> >>> +.B PR_SVE_SET_VL_ONEXEC
> >>> +was included in
> >>> +.IR arg2 ,
> >>> +the returned value describes the configuration
> >>> +scheduled to take effect at the next
> >>> +.BR execve (2).
> >>
> >> "describes the configuration" how?
> >>
> >>> +Otherwise, the effect is immediate and
> >>> +the returned value describes the new configuration.
> >>> +The returned value is encoded in the same way as the return value of
> >>> +.BR PR_SVE_GET_VL .
> >>
> >> Aha. Maybe move this bit up slightly?
> > 
> > Yes, I'll reorder that.
> > 
> >>
> >>> +.IP
> >>> +If neither of the above flags is included in
> >>
> >> are included
> > 
> > Debatable.
> > 
> > The subject of the verb here is not "flags" (plural), but "neither of
> > the above flags" (which is more nuanced, though it can be interpreted
> > as singular).  Usage varies, and I don't consider this wrong.
> 
> As far as I know, the grammarians are with you on this one,
> Dave, and if I was writing carefully, I'd do the same as you.

Good, because I just made that up to justify myself!

> But, the plural here is also frequent (and so common that I would
> hesitate to call it "wrong").

Sure, I don't think either is wrong as such.  My preference is only a
preference, and partly depends on the context anyway.

Cheers
---Dave
Will Deacon May 13, 2020, 9:11 p.m. UTC | #5
On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote:
> On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote:
> > On 5/13/20 12:46 PM, Dave Martin wrote:
> > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
> > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
> > > glibc explicitly has
> > > 
> > > 	extern int prctl (int __option, ...);
> > > 
> > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.)
> > > 
> > > Is there some agreed rationale for requiring redundant arguments to be
> > > supplied explicitly as zero?  For now there are likely to be few users
> > > of this, so we _might_ get away with changing the behaviour here if it's
> > > considered important enough.
> > 
> > See above.
> 
> So there is no bulletproof rationale for either approach, but the main
> concern is inconsistency?  Have I understood that right?

I think it's all just an extension of "make sure unused parameters are 0"
idiom which allows those bits to be safely repurposed for flags and things
later on without the worry of existing applications getting away with
passing junk.

> I'll propose to get that written down in the kernel source somewhere
> if so.

That would be a really good idea, actually!

> (From my end, the pros and cons of the two approaches seem superficial
> but the inconsistency is indeed annoying.  For PR_SVE_SET_VL, I think
> the first example I looked at didn't zero the trailing arguments, so I
> didn't either... but it's been upstream for several releases, so most
> likely we're stuck with it.)

FWIW, I wasn't blaming you for this. Just that these oversights aren't
always apparent when reviewing patches, but become more clear when
reviewing the documentation.

> > > There is no forwards compatibility problem with this prctl though,
> > > because there are spare bits in arg2 which can "turn on" additional
> > > args if needed.
> > > 
> > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument.
> > > 
> > > There are still 2 billion unallocated prctl numbers, so new prctls can
> > > always be added if there's ever a need to work around these limitations,
> > > but it seems extremely unlikely.

Oh, there are ways out, but had I noticed this during code review it
would've been very easy just to enforce zero for the other args and be done
with it.

> > >>> +If
> > >>> +.B PR_SVE_VL_INHERIT
> > >>> +is also included in
> > >>> +.IR arg2 ,
> > >>> +it takes effect
> > >>> +.I after
> > >>> +this deferred change.
> > >>
> > >> I find this a bit hard to follow, since it's not clear to me whether the
> > >> INHERIT flag is effectively set before or after the next execve(). In other
> > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
> > >> is the vector length preserved or reset on the next execve()?
> > > 
> > > It makes no difference, because the ONEXEC handling takes priority over
> > > the INHERIT handling. But either way INHERIT is never cleared by execve()
> > > and will apply at subsequent execs().
> > > 
> > > Explaining all this properly seems out of scope here.  Maybe this should
> > > be trimmed down rather than elaborated?  Or perhaps just explain it in
> > > terms of what the kernel does instead of futile attempts to make it
> > > intuitive?

Hmm, if we don't explain it in the man page then we should at least point
people to somewhere where they can get the gory details, because I think
they're necessary in order to use the prctl() request correctly. I'm still
not confident that I understand the semantics of setting both
PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which
may change.

(I concede on all the spelling/grammar discussions ;)

Will
Dave Martin May 18, 2020, 4:37 p.m. UTC | #6
On Wed, May 13, 2020 at 10:11:54PM +0100, Will Deacon wrote:
> On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote:
> > On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote:
> > > On 5/13/20 12:46 PM, Dave Martin wrote:
> > > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
> > > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:
> > > > glibc explicitly has
> > > > 
> > > > 	extern int prctl (int __option, ...);
> > > > 
> > > > (and nobody has to write _exit(0, 0, 0, 0, 0, 0) after all.)
> > > > 
> > > > Is there some agreed rationale for requiring redundant arguments to be
> > > > supplied explicitly as zero?  For now there are likely to be few users
> > > > of this, so we _might_ get away with changing the behaviour here if it's
> > > > considered important enough.
> > > 
> > > See above.
> > 
> > So there is no bulletproof rationale for either approach, but the main
> > concern is inconsistency?  Have I understood that right?
> 
> I think it's all just an extension of "make sure unused parameters are 0"
> idiom which allows those bits to be safely repurposed for flags and things
> later on without the worry of existing applications getting away with
> passing junk.

I'd say that the explicit zeroing may give a false sense of safety, but
I sympathise with the intent.

At least, I think the explicit zeroing means that any resulting bugs are
more likely to be fixable in userspace.

> > I'll propose to get that written down in the kernel source somewhere
> > if so.
> 
> That would be a really good idea, actually!
> 
> > (From my end, the pros and cons of the two approaches seem superficial
> > but the inconsistency is indeed annoying.  For PR_SVE_SET_VL, I think
> > the first example I looked at didn't zero the trailing arguments, so I
> > didn't either... but it's been upstream for several releases, so most
> > likely we're stuck with it.)
> 
> FWIW, I wasn't blaming you for this. Just that these oversights aren't
> always apparent when reviewing patches, but become more clear when
> reviewing the documentation.

I'll have a think, so long as nobody implies that the SVE prctls are
"wrong" ;)

Adding comments in the code about how the implementation of those prctls
can and can't safely be extended would be sensible though.  I'll try to
address that at some point.

> > > > There is no forwards compatibility problem with this prctl though,
> > > > because there are spare bits in arg2 which can "turn on" additional
> > > > args if needed.
> > > > 
> > > > Also, it's implausible that PR_SVE_GET_VL will ever want an argument.
> > > > 
> > > > There are still 2 billion unallocated prctl numbers, so new prctls can
> > > > always be added if there's ever a need to work around these limitations,
> > > > but it seems extremely unlikely.
> 
> Oh, there are ways out, but had I noticed this during code review it
> would've been very easy just to enforce zero for the other args and be done
> with it.

Ack

> > > >>> +If
> > > >>> +.B PR_SVE_VL_INHERIT
> > > >>> +is also included in
> > > >>> +.IR arg2 ,
> > > >>> +it takes effect
> > > >>> +.I after
> > > >>> +this deferred change.
> > > >>
> > > >> I find this a bit hard to follow, since it's not clear to me whether the
> > > >> INHERIT flag is effectively set before or after the next execve(). In other
> > > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
> > > >> is the vector length preserved or reset on the next execve()?
> > > > 
> > > > It makes no difference, because the ONEXEC handling takes priority over
> > > > the INHERIT handling. But either way INHERIT is never cleared by execve()
> > > > and will apply at subsequent execs().
> > > > 
> > > > Explaining all this properly seems out of scope here.  Maybe this should
> > > > be trimmed down rather than elaborated?  Or perhaps just explain it in
> > > > terms of what the kernel does instead of futile attempts to make it
> > > > intuitive?
> 
> Hmm, if we don't explain it in the man page then we should at least point
> people to somewhere where they can get the gory details, because I think
> they're necessary in order to use the prctl() request correctly. I'm still
> not confident that I understand the semantics of setting both
> PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which
> may change.
> 
> (I concede on all the spelling/grammar discussions ;)

Ultimately I aim to add another page, but for now would it be sufficient
to refer to Documentation/?

Cheers
---Dave
Dave Martin May 26, 2020, 2:45 p.m. UTC | #7
On Wed, May 13, 2020 at 10:11:54PM +0100, Will Deacon wrote:
> On Wed, May 13, 2020 at 03:02:00PM +0100, Dave Martin wrote:
> > On Wed, May 13, 2020 at 01:01:12PM +0200, Michael Kerrisk (man-pages) wrote:
> > > On 5/13/20 12:46 PM, Dave Martin wrote:
> > > > On Wed, May 13, 2020 at 09:43:52AM +0100, Will Deacon wrote:
> > > >> On Tue, May 12, 2020 at 05:36:58PM +0100, Dave Martin wrote:

[...]

> > > >>> +If
> > > >>> +.B PR_SVE_VL_INHERIT
> > > >>> +is also included in
> > > >>> +.IR arg2 ,
> > > >>> +it takes effect
> > > >>> +.I after
> > > >>> +this deferred change.
> > > >>
> > > >> I find this a bit hard to follow, since it's not clear to me whether the
> > > >> INHERIT flag is effectively set before or after the next execve(). In other
> > > >> words, if both PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT are specified,
> > > >> is the vector length preserved or reset on the next execve()?
> > > > 
> > > > It makes no difference, because the ONEXEC handling takes priority over
> > > > the INHERIT handling. But either way INHERIT is never cleared by execve()
> > > > and will apply at subsequent execs().
> > > > 
> > > > Explaining all this properly seems out of scope here.  Maybe this should
> > > > be trimmed down rather than elaborated?  Or perhaps just explain it in
> > > > terms of what the kernel does instead of futile attempts to make it
> > > > intuitive?
> 
> Hmm, if we don't explain it in the man page then we should at least point
> people to somewhere where they can get the gory details, because I think
> they're necessary in order to use the prctl() request correctly. I'm still
> not confident that I understand the semantics of setting both
> PR_SVE_SET_VL_ONEXEC and PR_SVE_VL_INHERIT without reading the code, which
> may change.

On this point, can you review the following wording?

I simply enumerate the possible flag combinations now, rather than tying
myself in knots trying to describe the two flags independently.

Cheers
---Dave

--8<--

       PR_SVE_SET_VL (since Linux 4.15, only on arm64)
              Configure the thread's SVE vector length, as specified by  (int)
              arg2.  Arguments arg3, arg4 and arg5 are ignored.

              The bits of arg2 corresponding to PR_SVE_VL_LEN_MASK must be set
              to the desired vector length in bytes.  This is  interpreted  as
              an  upper  bound:  the kernel will select the greatest available
              vector length that does not exceed the value specified.  In par-
              ticular,  specifying  SVE_VL_MAX (defined in <asm/sigcontext.h>)
              for the PR_SVE_VL_LEN_MASK bits requests the  maximum  supported
              vector length.

              In  addition,  arg2  may  include  the following combinations of
              flags:

              0      Perform the change immediately.  At the next execve(2) in
                     the  thread, the vector length will be reset to the value
                     configured in /proc/sys/abi/sve_default_vector_length.

              PR_SVE_VL_INHERIT
                     Perform the  change  immediately.   Subsequent  execve(2)
                     calls will preserve the new vector length.

              PR_SVE_SET_VL_ONEXEC
                     Defer  the  change,  so  that it is performed at the next
                     execve(2) in the thread.  Further  execve(2)  calls  will
                     reset  the  vector  length  to  the  value  configured in
                     /proc/sys/abi/sve_default_vector_length.

              PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT
                     Defer the change, so that it is  performed  at  the  next
                     execve(2)  in  the  thread.  Further execve(2) calls will
                     preserve the new vector length.

              In all cases, any previously pending  deferred  change  is  can-
              celed.

              The  call fails with error EINVAL if SVE is not supported on the
              platform, if arg2 is unrecognized or invalid, or  the  value  in
              the  bits of arg2 corresponding to PR_SVE_VL_LEN_MASK is outside
              the range SVE_VL_MIN..SVE_VL_MAX, or is not a multiple of 16.

              On success, a nonnegative value is returned that  describes  the
              selected  configuration,  which may differ from the current con-
              figuration if PR_SVE_SET_VL_ONEXEC was specified.  The value  is
              encoded in the same way as the return value of PR_SVE_GET_VL.

              The  configuration  (including  any  pending deferred change) is
              inherited across fork(2) and clone(2).

              For more information, see  the  kernel  source  file  Documenta-
              tion/arm64/sve.rst  (or Documentation/arm64/sve.txt before Linux
              5.3).

              Warning: Because the compiler or  run-time  environment  may  be
              using SVE, using this call without the PR_SVE_SET_VL_ONEXEC flag
              can lead to unpredicable behaviour in the calling process.   The
              conditions for using it safely are complex and system-dependent.
              Don't use it unless you really know what you are doing.

-->8--
diff mbox series

Patch

diff --git a/man2/prctl.2 b/man2/prctl.2
index 7f511d2..dd16227 100644
--- a/man2/prctl.2
+++ b/man2/prctl.2
@@ -1291,6 +1291,104 @@  call failing with the error
 .BR ENXIO .
 For further details, see the kernel source file
 .IR Documentation/admin-guide/kernel-parameters.txt .
+.\" prctl PR_SVE_SET_VL
+.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
+.\" linux-5.6/Documentation/arm64/sve.rst
+.TP
+.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
+Configure the thread's SVE vector length,
+as specified by
+.IR arg2 .
+Arguments
+.IR arg3 ", " arg4 " and " arg5
+are ignored.
+.IP
+The bits of
+.I arg2
+corresponding to
+.B SVE_VL_LEN_MASK
+must be set to the desired vector length in bytes.
+In addition,
+.I arg2
+may include zero or more of the following flags:
+.RS
+.TP
+.B PR_SVE_VL_INHERIT
+Inherit the configured vector length across
+.BR execve (2).
+.TP
+.B PR_SVE_SET_VL_ONEXEC
+Defer the change until the next
+.BR execve (2)
+in this thread.
+If
+.B PR_SVE_VL_INHERIT
+is also included in
+.IR arg2 ,
+it takes effect
+.I after
+this deferred change.
+.RE
+.IP
+On success, the vector length and flags are set as requested,
+and any deferred change that was pending immediately before the
+.B PR_SVE_SET_VL
+call is canceled.
+If
+.B PR_SVE_SET_VL_ONEXEC
+was included in
+.IR arg2 ,
+the returned value describes the configuration
+scheduled to take effect at the next
+.BR execve (2).
+Otherwise, the effect is immediate and
+the returned value describes the new configuration.
+The returned value is encoded in the same way as the return value of
+.BR PR_SVE_GET_VL .
+.IP
+If neither of the above flags is included in
+.IR arg2 ,
+a subsequent
+.BR execve (2)
+resets the vector length to the default value configured in
+.IR /proc/sys/abi/sve_default_vector_length .
+.IP
+The actual vector length configured by this operation
+is the greatest vector length supported by the platform
+that does not exceed
+.I arg2
+&
+.BR PR_SVE_VL_LEN_MASK .
+.IP
+The configuration (including any pending deferred change)
+is inherited across
+.BR fork (2)
+and
+.BR clone (2).
+.\" prctl PR_SVE_GET_VL
+.TP
+.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
+Get the thread's current SVE vector length configuration,
+as configured by
+.BR PR_SVE_SET_VL .
+.IP
+If successful, the return value describes the
+.I current
+configuration.
+The bits corresponding to
+.B PR_SVE_VL_LEN_MASK
+contain the currently configured vector length in bytes.
+The bit corresponding to
+.B PR_SVE_VL_INHERIT
+indicates whether the vector length will be inherited
+across
+.BR execve (2).
+.IP
+Note that there is no way determine whether there is
+a pending vector length change that has not yet taken effect.
+.IP
+Providing that the kernel and platform support SVE,
+this operation always succeeds.
 .\"
 .\" prctl PR_TASK_PERF_EVENTS_DISABLE
 .TP
@@ -1534,6 +1632,8 @@  On success,
 .BR PR_GET_NO_NEW_PRIVS ,
 .BR PR_GET_SECUREBITS ,
 .BR PR_GET_SPECULATION_CTRL ,
+.BR PR_SVE_GET_VL ,
+.BR PR_SVE_SET_VL ,
 .BR PR_GET_THP_DISABLE ,
 .BR PR_GET_TIMING ,
 .BR PR_GET_TIMERSLACK ,
@@ -1817,6 +1917,18 @@  and unused arguments to
 .BR prctl ()
 are not 0.
 .TP
+.B EINVAL
+.I option
+was
+.B PR_SVE_SET_VL
+and
+.I arg2
+contains invalid flags, or
+.I arg2
+&
+.B SVE_VL_LEN_MASK
+is not a valid vector length.
+.TP
 .B ENODEV
 .I option
 was