
[v2,4/6] prctl.2: Add SVE prctls (arm64)

Message ID 1590614258-24728-5-git-send-email-Dave.Martin@arm.com (mailing list archive)
State New, archived
Series prctl.2 man page updates for Linux 5.6

Commit Message

Dave Martin May 27, 2020, 9:17 p.m. UTC
Add documentation for the PR_SVE_SET_VL and PR_SVE_GET_VL
prctls added in Linux 4.15 for arm64.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>

---

Since v1:

 * Minor rewordings and typo fixes.

 * Fix typo'd #define names.

 * Add type annotation for PR_SVE_SET_VL arg2.

 * Clarify return value semantics of PR_SVE_SET_VL

 * Add note to say that the args for PR_SVE_GET_VL are ignored.

 * Note for PR_SVE_SET_VL that the PR_SVE_VL_LEN_MASK field specifies
   an upper bound on what vector length to set, not an exact value.

 * Rework PR_SVE_SET_VL arg2 description to enumerate all possible flag
   combinations rather than describing the flags independently.

   Coming up with a clear description of each flag that is independent
   of the description of the other flag turns out to be hard.

 * In lieu of having a separate man page to cross reference for detailed
   guidance, cross-reference the kernel documentation.

 * Avoid confusing cross-reference to PR_SVE_SET_VL when describing the
   return value of PR_SVE_GET_VL.

 * Clarify error conditions for PR_SVE_SET_VL and PR_SVE_GET_VL, and
   move detail to the individual prctl descriptions to keep related
   content close together while minimising duplication.

 * Add safety warning.  This is deliberately vague, pending ongoing
   discussions with libc folks.
---
 man2/prctl.2 | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 160 insertions(+)

Comments

Will Deacon June 9, 2020, 9:57 a.m. UTC | #1
Hi Dave,

On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> Add documentation for the PR_SVE_SET_VL and PR_SVE_GET_VL
> prctls added in Linux 4.15 for arm64.

Looks really good to me, thanks. Just a few comments inline.

> diff --git a/man2/prctl.2 b/man2/prctl.2
> index cab9915..91df7c8 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1291,6 +1291,148 @@ call failing with the error
>  .BR ENXIO .
>  For further details, see the kernel source file
>  .IR Documentation/admin\-guide/kernel\-parameters.txt .
> +.\" prctl PR_SVE_SET_VL
> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> +.\" linux-5.6/Documentation/arm64/sve.rst
> +.TP
> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> +Configure the thread's SVE vector length,
> +as specified by
> +.IR "(int) arg2" .
> +Arguments
> +.IR arg3 ", " arg4 " and " arg5
> +are ignored.
> +.IP
> +The bits of
> +.I arg2
> +corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +must be set to the desired vector length in bytes.
> +This is interpreted as an upper bound:
> +the kernel will select the greatest available vector length
> +that does not exceed the value specified.
> +In particular, specifying
> +.B SVE_VL_MAX
> +(defined in
> +.I <asm/sigcontext.h>)
> +for the
> +.B PR_SVE_VL_LEN_MASK
> +bits requests the maximum supported vector length.
> +.IP
> +In addition,
> +.I arg2
> +must be set to one of the following combinations of flags:

How about saying:

  In addition, the other bits of arg2 must be set according to the following
  combinations of flags:

Otherwise I find it a bit fiddly to read, because it's valid to have
flags of 0 and a non-zero length.

> +.RS
> +.TP
> +.B 0
> +Perform the change immediately.
> +At the next
> +.BR execve (2)
> +in the thread,
> +the vector length will be reset to the value configured in
> +.IR /proc/sys/abi/sve_default_vector_length .

(implementation note: does this mean that 'sve_default_vl' should be
 an atomic_t, as it can be accessed concurrently? We probably need
 {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
 that help us here...)

> +.TP
> +.B PR_SVE_VL_INHERIT
> +Perform the change immediately.
> +Subsequent
> +.BR execve (2)
> +calls will preserve the new vector length.
> +.TP
> +.B PR_SVE_SET_VL_ONEXEC
> +Defer the change, so that it is performed at the next
> +.BR execve (2)
> +in the thread.
> +Further
> +.BR execve (2)
> +calls will reset the vector length to the value configured in
> +.IR /proc/sys/abi/sve_default_vector_length .
> +.TP
> +.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
> +Defer the change, so that it is performed at the next
> +.BR execve (2)
> +in the thread.
> +Further
> +.BR execve (2)
> +calls will preserve the new vector length.
> +.RE
> +.IP
> +In all cases,
> +any previously pending deferred change is canceled.
> +.IP
> +The call fails with error
> +.B EINVAL
> +if SVE is not supported on the platform, if
> +.I arg2
> +is unrecognized or invalid, or the value in the bits of
> +.I arg2
> +corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +is outside the range
> +.BR SVE_VL_MIN .. SVE_VL_MAX
> +or is not a multiple of 16.
> +.IP
> +On success,
> +a nonnegative value is returned that describes the
> +.I selected
> +configuration,

If I'm reading the kernel code correctly, this is slightly weird, as
the returned value may contain the PR_SVE_VL_INHERIT flag but it will
never contain the PR_SVE_SET_VL_ONEXEC flag. Is that right?

If so, maybe just say something like:

  On success, a nonnegative value is returned that describes the selected
  configuration in the same way as PR_SVE_GET_VL.

> +which may differ from the current configuration if
> +.B PR_SVE_SET_VL_ONEXEC
> +was specified.
> +The value is encoded in the same way as the return value of
> +.BR PR_SVE_GET_VL .
> +.IP
> +The configuration (including any pending deferred change)
> +is inherited across
> +.BR fork (2)
> +and
> +.BR clone (2).
> +.IP
> +.B Warning:
> +Because the compiler or run-time environment
> +may be using SVE, using this call without the
> +.B PR_SVE_SET_VL_ONEXEC
> +flag may crash the calling process.
> +The conditions for using it safely are complex and system-dependent.
> +Don't use it unless you really know what you are doing.
> +.IP
> +For more information, see the kernel source file
> +.I Documentation/arm64/sve.rst
> +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> +(or
> +.I Documentation/arm64/sve.txt
> +before Linux 5.3).

I think I'd drop the kernel reference here, as it feels like we're saying
"only do this if you know what you're doing" on one hand, but then "if you
don't know what you're doing, see this other documentation" on the other.

Will
Michael Kerrisk (man-pages) June 9, 2020, 11:39 a.m. UTC | #2
Hi Dave,

I've not applied this patch yet, in case you want to make
some changes in response to Will's comments.

I think all of the rest of the patches in the series are now
applied (and pushed to master).

Thanks,

Michael

On 5/27/20 11:17 PM, Dave Martin wrote:
> Add documentation for the PR_SVE_SET_VL and PR_SVE_GET_VL
> prctls added in Linux 4.15 for arm64.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> 
> ---
> 
> Since v1:
> 
>  * Minor rewordings and typo fixes.
> 
>  * Fix typo'd #define names.
> 
>  * Add type annotation for PR_SVE_SET_VL arg2.
> 
>  * Clarify return value semantics of PR_SVE_SET_VL
> 
>  * Add note to say that the args for PR_SVE_GET_VL are ignored.
> 
>  * Note for PR_SVE_SET_VL that the PR_SVE_VL_LEN_MASK field specifies
>    an upper bound on what vector length to set, not an exact value.
> 
>  * Rework PR_SVE_SET_VL arg2 description to enumerate all possible flag
>    combinations rather than describing the flags independently.
> 
>    Coming up with a clear description of each flag that is independent
>    of the description of the other flag turns out to be hard.
> 
>  * In lieu of having a separate man page to cross reference for detailed
>    guidance, cross-reference the kernel documentation.
> 
>  * Avoid confusing cross-reference to PR_SVE_SET_VL when describing the
>    return value of PR_SVE_GET_VL.
> 
>  * Clarify error conditions for PR_SVE_SET_VL and PR_SVE_GET_VL, and
>    move detail to the individual prctl descriptions to keep related
>    content close together while minimising duplication.
> 
>  * Add safety warning.  This is deliberately vague, pending ongoing
>    discussions with libc folks.
> ---
>  man2/prctl.2 | 160 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 160 insertions(+)
> 
> diff --git a/man2/prctl.2 b/man2/prctl.2
> index cab9915..91df7c8 100644
> --- a/man2/prctl.2
> +++ b/man2/prctl.2
> @@ -1291,6 +1291,148 @@ call failing with the error
>  .BR ENXIO .
>  For further details, see the kernel source file
>  .IR Documentation/admin\-guide/kernel\-parameters.txt .
> +.\" prctl PR_SVE_SET_VL
> +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> +.\" linux-5.6/Documentation/arm64/sve.rst
> +.TP
> +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> +Configure the thread's SVE vector length,
> +as specified by
> +.IR "(int) arg2" .
> +Arguments
> +.IR arg3 ", " arg4 " and " arg5
> +are ignored.
> +.IP
> +The bits of
> +.I arg2
> +corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +must be set to the desired vector length in bytes.
> +This is interpreted as an upper bound:
> +the kernel will select the greatest available vector length
> +that does not exceed the value specified.
> +In particular, specifying
> +.B SVE_VL_MAX
> +(defined in
> +.I <asm/sigcontext.h>)
> +for the
> +.B PR_SVE_VL_LEN_MASK
> +bits requests the maximum supported vector length.
> +.IP
> +In addition,
> +.I arg2
> +must be set to one of the following combinations of flags:
> +.RS
> +.TP
> +.B 0
> +Perform the change immediately.
> +At the next
> +.BR execve (2)
> +in the thread,
> +the vector length will be reset to the value configured in
> +.IR /proc/sys/abi/sve_default_vector_length .
> +.TP
> +.B PR_SVE_VL_INHERIT
> +Perform the change immediately.
> +Subsequent
> +.BR execve (2)
> +calls will preserve the new vector length.
> +.TP
> +.B PR_SVE_SET_VL_ONEXEC
> +Defer the change, so that it is performed at the next
> +.BR execve (2)
> +in the thread.
> +Further
> +.BR execve (2)
> +calls will reset the vector length to the value configured in
> +.IR /proc/sys/abi/sve_default_vector_length .
> +.TP
> +.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
> +Defer the change, so that it is performed at the next
> +.BR execve (2)
> +in the thread.
> +Further
> +.BR execve (2)
> +calls will preserve the new vector length.
> +.RE
> +.IP
> +In all cases,
> +any previously pending deferred change is canceled.
> +.IP
> +The call fails with error
> +.B EINVAL
> +if SVE is not supported on the platform, if
> +.I arg2
> +is unrecognized or invalid, or the value in the bits of
> +.I arg2
> +corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +is outside the range
> +.BR SVE_VL_MIN .. SVE_VL_MAX
> +or is not a multiple of 16.
> +.IP
> +On success,
> +a nonnegative value is returned that describes the
> +.I selected
> +configuration,
> +which may differ from the current configuration if
> +.B PR_SVE_SET_VL_ONEXEC
> +was specified.
> +The value is encoded in the same way as the return value of
> +.BR PR_SVE_GET_VL .
> +.IP
> +The configuration (including any pending deferred change)
> +is inherited across
> +.BR fork (2)
> +and
> +.BR clone (2).
> +.IP
> +.B Warning:
> +Because the compiler or run-time environment
> +may be using SVE, using this call without the
> +.B PR_SVE_SET_VL_ONEXEC
> +flag may crash the calling process.
> +The conditions for using it safely are complex and system-dependent.
> +Don't use it unless you really know what you are doing.
> +.IP
> +For more information, see the kernel source file
> +.I Documentation/arm64/sve.rst
> +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> +(or
> +.I Documentation/arm64/sve.txt
> +before Linux 5.3).
> +.\" prctl PR_SVE_GET_VL
> +.TP
> +.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
> +Get the thread's current SVE vector length configuration.
> +.IP
> +Arguments
> +.IR arg2 ", " arg3 ", " arg4 " and " arg5
> +are ignored.
> +.IP
> +Providing that the kernel and platform support SVE
> +this operation always succeeds,
> +returning a nonnegative value that describes the
> +.I current
> +configuration.
> +The bits corresponding to
> +.B PR_SVE_VL_LEN_MASK
> +contain the currently configured vector length in bytes.
> +The bit corresponding to
> +.B PR_SVE_VL_INHERIT
> +indicates whether the vector length will be inherited
> +across
> +.BR execve (2).
> +.IP
> +Note that there is no way to determine whether there is
> +a pending vector length change that has not yet taken effect.
> +.IP
> +For more information, see the kernel source file
> +.I Documentation/arm64/sve.rst
> +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> +(or
> +.I Documentation/arm64/sve.txt
> +before Linux 5.3).
>  .\"
>  .\" prctl PR_TASK_PERF_EVENTS_DISABLE
>  .TP
> @@ -1534,6 +1676,8 @@ On success,
>  .BR PR_GET_NO_NEW_PRIVS ,
>  .BR PR_GET_SECUREBITS ,
>  .BR PR_GET_SPECULATION_CTRL ,
> +.BR PR_SVE_GET_VL ,
> +.BR PR_SVE_SET_VL ,
>  .BR PR_GET_THP_DISABLE ,
>  .BR PR_GET_TIMING ,
>  .BR PR_GET_TIMERSLACK ,
> @@ -1817,6 +1961,22 @@ and unused arguments to
>  .BR prctl ()
>  are not 0.
>  .TP
> +.B EINVAL
> +.I option
> +is
> +.B PR_SVE_SET_VL
> +and the arguments are invalid or unsupported,
> +or SVE is not available on this platform.
> +See the description of
> +.B PR_SVE_SET_VL
> +above for details.
> +.TP
> +.B EINVAL
> +.I option
> +is
> +.B PR_SVE_GET_VL
> +and SVE is not available on this platform.
> +.TP
>  .B ENODEV
>  .I option
>  was
>
Dave Martin June 9, 2020, 2:11 p.m. UTC | #3
On Tue, Jun 09, 2020 at 10:57:35AM +0100, Will Deacon wrote:
> Hi Dave,
> 
> On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> > > Add documentation for the PR_SVE_SET_VL and PR_SVE_GET_VL
> > prctls added in Linux 4.15 for arm64.
> 
> Looks really good to me, thanks. Just a few comments inline.
> 
> > diff --git a/man2/prctl.2 b/man2/prctl.2
> > index cab9915..91df7c8 100644
> > --- a/man2/prctl.2
> > +++ b/man2/prctl.2
> > @@ -1291,6 +1291,148 @@ call failing with the error
> >  .BR ENXIO .
> >  For further details, see the kernel source file
> >  .IR Documentation/admin\-guide/kernel\-parameters.txt .
> > +.\" prctl PR_SVE_SET_VL
> > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> > +.\" linux-5.6/Documentation/arm64/sve.rst
> > +.TP
> > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> > +Configure the thread's SVE vector length,
> > +as specified by
> > +.IR "(int) arg2" .
> > +Arguments
> > +.IR arg3 ", " arg4 " and " arg5
> > +are ignored.
> > +.IP
> > +The bits of
> > +.I arg2
> > +corresponding to
> > +.B PR_SVE_VL_LEN_MASK
> > +must be set to the desired vector length in bytes.
> > +This is interpreted as an upper bound:
> > +the kernel will select the greatest available vector length
> > +that does not exceed the value specified.
> > +In particular, specifying
> > +.B SVE_VL_MAX
> > +(defined in
> > +.I <asm/sigcontext.h>)
> > +for the
> > +.B PR_SVE_VL_LEN_MASK
> > +bits requests the maximum supported vector length.
> > +.IP
> > +In addition,
> > +.I arg2
> > +must be set to one of the following combinations of flags:
> 
> How about saying:
> 
>   In addition, the other bits of arg2 must be set according to the following
>   combinations of flags:
> 
> Otherwise I find it a bit fiddly to read, because it's valid to have
> flags of 0 and a non-zero length.

0 is listed, so I hoped that was clear enough.

Maybe just write "must be one of the following values:"?

0 is a value, but I can see why you might be uneasy about 0 being
described as a "combination of flags".

> > +.RS
> > +.TP
> > +.B 0
> > +Perform the change immediately.
> > +At the next
> > +.BR execve (2)
> > +in the thread,
> > +the vector length will be reset to the value configured in
> > +.IR /proc/sys/abi/sve_default_vector_length .
> 
> (implementation note: does this mean that 'sve_default_vl' should be
>  an atomic_t, as it can be accessed concurrently? We probably need
>  {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
>  that help us here...)

Is this purely theoretical?  Can you point to what could go wrong?

While I doubt I thought about this very hard and I agree that you're
right in principle, I think there are probably non-atomic sysctls and
debugfs files etc. all over the place.

I didn't want to clutter the code unnecessarily.

> > +.B PR_SVE_VL_INHERIT
> > +Perform the change immediately.
> > +Subsequent
> > +.BR execve (2)
> > +calls will preserve the new vector length.
> > +.TP
> > +.B PR_SVE_SET_VL_ONEXEC
> > +Defer the change, so that it is performed at the next
> > +.BR execve (2)
> > +in the thread.
> > +Further
> > +.BR execve (2)
> > +calls will reset the vector length to the value configured in
> > +.IR /proc/sys/abi/sve_default_vector_length .
> > +.TP
> > +.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
> > +Defer the change, so that it is performed at the next
> > +.BR execve (2)
> > +in the thread.
> > +Further
> > +.BR execve (2)
> > +calls will preserve the new vector length.
> > +.RE
> > +.IP
> > +In all cases,
> > +any previously pending deferred change is canceled.
> > +.IP
> > +The call fails with error
> > +.B EINVAL
> > +if SVE is not supported on the platform, if
> > +.I arg2
> > +is unrecognized or invalid, or the value in the bits of
> > +.I arg2
> > +corresponding to
> > +.B PR_SVE_VL_LEN_MASK
> > +is outside the range
> > +.BR SVE_VL_MIN .. SVE_VL_MAX
> > +or is not a multiple of 16.
> > +.IP
> > +On success,
> > +a nonnegative value is returned that describes the
> > +.I selected
> > +configuration,
> 
> If I'm reading the kernel code correctly, this is slightly weird, as
> the returned value may contain the PR_SVE_VL_INHERIT flag but it will
> never contain the PR_SVE_SET_VL_ONEXEC flag. Is that right?

Yes, which is an oddity.

I suppose we could fake that up actually by returning that flag if
sve_vl and sve_vl_onexec are different, but we don't currently do this.

> If so, maybe just say something like:
> 
>   On success, a nonnegative value is returned that describes the selected
>   configuration in the same way as PR_SVE_GET_VL.

How does that help?  PR_SVE_GET_VL doesn't fully clarify the oddity you
call out anyway.

Really, I preferred not to have people relying on this one way or the
other.  The only sensible reason for an _ONEXEC is because you've
committed to calling execve().  On such a path, querying the vector
length isn't likely to be useful.
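
A minimal sketch of that pattern, assuming a launcher that has already
committed to calling execve(): it requests a deferred vector length and
only checks for failure.  The helper name sve_request_vl_onexec() is
hypothetical, and the fallback #defines are there only for userspace
headers that predate these prctls; the values mirror <linux/prctl.h>.
On a kernel or platform without SVE the call is expected to fail with
EINVAL.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallbacks for userspace headers that predate Linux 4.15. */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL 50
#endif
#ifndef PR_SVE_SET_VL_ONEXEC
#define PR_SVE_SET_VL_ONEXEC (1 << 18)
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT (1 << 17)
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK 0xffff
#endif

/*
 * Hypothetical helper: ask for a vector length of at most max_vl bytes,
 * to take effect at the next execve() in this thread.
 * Returns the kernel's nonnegative reply, or -1 with errno set.
 */
static int sve_request_vl_onexec(unsigned int max_vl, int inherit)
{
	unsigned long arg2;

	/* A length that does not fit in the length field cannot be encoded. */
	if (max_vl > PR_SVE_VL_LEN_MASK) {
		errno = EINVAL;
		return -1;
	}

	arg2 = max_vl | PR_SVE_SET_VL_ONEXEC;
	if (inherit)
		arg2 |= PR_SVE_VL_INHERIT;

	/* arg3..arg5 are ignored for this option; pass zeros. */
	return prctl(PR_SVE_SET_VL, arg2, 0UL, 0UL, 0UL);
}
```

A caller would typically invoke this immediately before execve() and
treat -1 as "SVE unavailable or request invalid", without inspecting
the returned configuration any further.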

Maybe I was optimistic.

> > +which may differ from the current configuration if
> > +.B PR_SVE_SET_VL_ONEXEC
> > +was specified.
> > +The value is encoded in the same way as the return value of
> > +.BR PR_SVE_GET_VL .
> > +.IP
> > +The configuration (including any pending deferred change)
> > +is inherited across
> > +.BR fork (2)
> > +and
> > +.BR clone (2).
> > +.IP
> > +.B Warning:
> > +Because the compiler or run-time environment
> > +may be using SVE, using this call without the
> > +.B PR_SVE_SET_VL_ONEXEC
> > +flag may crash the calling process.
> > +The conditions for using it safely are complex and system-dependent.
> > +Don't use it unless you really know what you are doing.
> > +.IP
> > +For more information, see the kernel source file
> > +.I Documentation/arm64/sve.rst
> > +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> > +(or
> > +.I Documentation/arm64/sve.txt
> > +before Linux 5.3).
> 
> I think I'd drop the kernel reference here, as it feels like we're saying
> "only do this if you know what you're doing" on one hand, but then "if you
> don't know what you're doing, see this other documentation" on the other.

Well, the documentation doesn't answer those questions either.

I could just swap the warning and the cross-reference, so that the
cross-reference doesn't seem to follow on from "knowing what you're
doing"?

Cheers
---Dave
Will Deacon June 9, 2020, 2:49 p.m. UTC | #4
On Tue, Jun 09, 2020 at 03:11:42PM +0100, Dave Martin wrote:
> On Tue, Jun 09, 2020 at 10:57:35AM +0100, Will Deacon wrote:
> > On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> > > diff --git a/man2/prctl.2 b/man2/prctl.2
> > > index cab9915..91df7c8 100644
> > > --- a/man2/prctl.2
> > > +++ b/man2/prctl.2
> > > @@ -1291,6 +1291,148 @@ call failing with the error
> > >  .BR ENXIO .
> > >  For further details, see the kernel source file
> > >  .IR Documentation/admin\-guide/kernel\-parameters.txt .
> > > +.\" prctl PR_SVE_SET_VL
> > > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> > > +.\" linux-5.6/Documentation/arm64/sve.rst
> > > +.TP
> > > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> > > +Configure the thread's SVE vector length,
> > > +as specified by
> > > +.IR "(int) arg2" .
> > > +Arguments
> > > +.IR arg3 ", " arg4 " and " arg5
> > > +are ignored.
> > > +.IP
> > > +The bits of
> > > +.I arg2
> > > +corresponding to
> > > +.B PR_SVE_VL_LEN_MASK
> > > +must be set to the desired vector length in bytes.
> > > +This is interpreted as an upper bound:
> > > +the kernel will select the greatest available vector length
> > > +that does not exceed the value specified.
> > > +In particular, specifying
> > > +.B SVE_VL_MAX
> > > +(defined in
> > > +.I <asm/sigcontext.h>)
> > > +for the
> > > +.B PR_SVE_VL_LEN_MASK
> > > +bits requests the maximum supported vector length.
> > > +.IP
> > > +In addition,
> > > +.I arg2
> > > +must be set to one of the following combinations of flags:
> > 
> > How about saying:
> > 
> >   In addition, the other bits of arg2 must be set according to the following
> >   combinations of flags:
> > 
> > Otherwise I find it a bit fiddly to read, because it's valid to have
> > flags of 0 and a non-zero length.
> 
> 0 is listed, so I hoped that was clear enough.
> 
> Maybe just write "must be one of the following values:"?
> 
> 0 is a value, but I can see why you might be uneasy about 0 being
> described as a "combination of flags".

It's more that arg2 *also* holds the length, so saying that arg2 must
be set to a combination of flags isn't quite right, because it's actually
set to a combination of flags and the length.
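
To make the point concrete, here is a rough sketch of how arg2 is
composed from both parts.  The helper sve_make_arg2() and the
EX_SVE_VL_MIN / EX_SVE_VL_MAX names are made up for illustration; the
values 16 and 8192 are assumed to match SVE_VL_MIN and SVE_VL_MAX from
the arm64 <asm/sigcontext.h>.  The checks follow the EINVAL conditions
in the quoted patch text, not necessarily every check the kernel makes.

```c
#include <sys/prctl.h>

/* Fallbacks for userspace headers that predate Linux 4.15. */
#ifndef PR_SVE_SET_VL_ONEXEC
#define PR_SVE_SET_VL_ONEXEC (1 << 18)
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT (1 << 17)
#endif

/* Illustrative stand-ins for SVE_VL_MIN and SVE_VL_MAX (assumed values). */
#define EX_SVE_VL_MIN 16u
#define EX_SVE_VL_MAX 8192u

/*
 * Hypothetical helper: combine a length (in bytes) and a flag
 * combination into a single arg2 value.
 * Returns -1 if the inputs match one of the documented EINVAL cases.
 */
static long sve_make_arg2(unsigned int max_vl, unsigned int flags)
{
	const unsigned int known = PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT;

	if (flags & ~known)
		return -1;	/* unrecognized flag bits */
	if (max_vl < EX_SVE_VL_MIN || max_vl > EX_SVE_VL_MAX)
		return -1;	/* outside SVE_VL_MIN..SVE_VL_MAX */
	if (max_vl % 16)
		return -1;	/* not a multiple of 16 */

	/* Length in the low bits, flags above them. */
	return (long)(max_vl | flags);
}
```

So "flags of 0 and a non-zero length" is simply sve_make_arg2(32, 0),
which yields 32: a valid arg2 that is not one of the listed flag values.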

> > > +.RS
> > > +.TP
> > > +.B 0
> > > +Perform the change immediately.
> > > +At the next
> > > +.BR execve (2)
> > > +in the thread,
> > > +the vector length will be reset to the value configured in
> > > +.IR /proc/sys/abi/sve_default_vector_length .
> > 
> > (implementation note: does this mean that 'sve_default_vl' should be
> >  an atomic_t, as it can be accessed concurrently? We probably need
> >  {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
> >  that help us here...)
> 
> Is this purely theoretical?  Can you point to what could go wrong?

If the write is torn by the compiler, then a concurrent reader could end
up seeing a bogus value. There could also be ToCToU issues if it's re-read.
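
As a rough userspace illustration of the pattern (not the kernel's
code): the macros below are simplified stand-ins for READ_ONCE() and
WRITE_ONCE() that assume the GCC/Clang __typeof__ extension, and
sve_default_vl_example, set_default_vl() and pick_vl_at_exec() are
invented names.  The reader takes one snapshot and uses only that, so
the value cannot change between the comparison and the use.

```c
/* Simplified stand-ins for the kernel macros (assumption: GCC or Clang). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for sve_default_vl; the initial value is arbitrary. */
static int sve_default_vl_example = 64;

/* Writer side, e.g. a sysctl handler: one marked store. */
static void set_default_vl(int vl)
{
	WRITE_ONCE(sve_default_vl_example, vl);
}

/* Reader side, e.g. exec-time setup: one marked load, then local use only. */
static int pick_vl_at_exec(int max_supported)
{
	int vl = READ_ONCE(sve_default_vl_example);

	/* Both the comparison and the result use the same snapshot. */
	return vl > max_supported ? max_supported : vl;
}
```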

> While I doubt I thought about this very hard and I agree that you're
> right in principle, I think there are probably non-atomic sysctls and
> debugfs files etc. all over the place.
> 
> I didn't want to clutter the code unnecessarily.

Right, but KCSAN is coming along and so somebody less familiar with the code
will hit this eventually.

> > > +.B PR_SVE_VL_INHERIT
> > > +Perform the change immediately.
> > > +Subsequent
> > > +.BR execve (2)
> > > +calls will preserve the new vector length.
> > > +.TP
> > > +.B PR_SVE_SET_VL_ONEXEC
> > > +Defer the change, so that it is performed at the next
> > > +.BR execve (2)
> > > +in the thread.
> > > +Further
> > > +.BR execve (2)
> > > +calls will reset the vector length to the value configured in
> > > +.IR /proc/sys/abi/sve_default_vector_length .
> > > +.TP
> > > +.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
> > > +Defer the change, so that it is performed at the next
> > > +.BR execve (2)
> > > +in the thread.
> > > +Further
> > > +.BR execve (2)
> > > +calls will preserve the new vector length.
> > > +.RE
> > > +.IP
> > > +In all cases,
> > > +any previously pending deferred change is canceled.
> > > +.IP
> > > +The call fails with error
> > > +.B EINVAL
> > > +if SVE is not supported on the platform, if
> > > +.I arg2
> > > +is unrecognized or invalid, or the value in the bits of
> > > +.I arg2
> > > +corresponding to
> > > +.B PR_SVE_VL_LEN_MASK
> > > +is outside the range
> > > +.BR SVE_VL_MIN .. SVE_VL_MAX
> > > +or is not a multiple of 16.
> > > +.IP
> > > +On success,
> > > +a nonnegative value is returned that describes the
> > > +.I selected
> > > +configuration,
> > 
> > If I'm reading the kernel code correctly, this is slightly weird, as
> > the returned value may contain the PR_SVE_VL_INHERIT flag but it will
> > never contain the PR_SVE_SET_VL_ONEXEC flag. Is that right?
> 
> Yes, which is an oddity.
> 
> I suppose we could fake that up actually by returning that flag if
> sve_vl and sve_vl_onexec are different, but we don't currently do this.

I don't think there's any need to change the code, but I think this stuff
is worth documenting.

> > If so, maybe just say something like:
> > 
> >   On success, a nonnegative value is returned that describes the selected
> >   configuration in the same way as PR_SVE_GET_VL.
> 
> How does that help?  PR_SVE_GET_VL doesn't fully clarify the oddity you
> call out anyway.

It clarifies it enough for my liking (by explicitly talking about "the bit
corresponding to PR_SVE_VL_INHERIT" and not about PR_SVE_SET_VL_ONEXEC),
but either way, I think saying that the return value is the same is a
useful clarification. If you want to make PR_SVE_GET_VL more explicit,
we could do that too.
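
For reference, a small sketch of how a caller might decode that shared
return encoding.  The struct and function names are hypothetical; only
the two masks come from <linux/prctl.h>.  Per the discussion above, the
PR_SVE_SET_VL_ONEXEC bit is never reported, so the sketch does not look
for it.

```c
#include <sys/prctl.h>

/* Fallbacks for userspace headers that predate Linux 4.15. */
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK 0xffff
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT (1 << 17)
#endif

/* Hypothetical container for a decoded configuration. */
struct sve_vl_config {
	unsigned int vl;	/* vector length in bytes */
	int inherit;		/* nonzero if preserved across execve() */
};

/*
 * Decode a nonnegative return value of PR_SVE_GET_VL or PR_SVE_SET_VL.
 * The caller must have checked for -1 already.
 */
static struct sve_vl_config sve_decode(int ret)
{
	struct sve_vl_config cfg;

	cfg.vl = (unsigned int)ret & PR_SVE_VL_LEN_MASK;
	cfg.inherit = (ret & PR_SVE_VL_INHERIT) != 0;
	return cfg;
}
```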

> > > +.B PR_SVE_SET_VL_ONEXEC
> > > +flag may crash the calling process.
> > > +The conditions for using it safely are complex and system-dependent.
> > > +Don't use it unless you really know what you are doing.
> > > +.IP
> > > +For more information, see the kernel source file
> > > +.I Documentation/arm64/sve.rst
> > > +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> > > +(or
> > > +.I Documentation/arm64/sve.txt
> > > +before Linux 5.3).
> > 
> > I think I'd drop the kernel reference here, as it feels like we're saying
> > "only do this if you know what you're doing" on one hand, but then "if you
> > don't know what you're doing, see this other documentation" on the other.
> 
> Well, the documentation doesn't answer those questions either.
> 
> I could just swap the warning and the cross-reference, so that the
> cross-reference doesn't seem to follow on from "knowing what you're
> doing"?

Ok.

Will
Dave Martin June 10, 2020, 9:44 a.m. UTC | #5
On Tue, Jun 09, 2020 at 03:49:05PM +0100, Will Deacon wrote:
> On Tue, Jun 09, 2020 at 03:11:42PM +0100, Dave Martin wrote:
> > On Tue, Jun 09, 2020 at 10:57:35AM +0100, Will Deacon wrote:
> > > On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> > > > diff --git a/man2/prctl.2 b/man2/prctl.2
> > > > index cab9915..91df7c8 100644
> > > > --- a/man2/prctl.2
> > > > +++ b/man2/prctl.2
> > > > @@ -1291,6 +1291,148 @@ call failing with the error
> > > >  .BR ENXIO .
> > > >  For further details, see the kernel source file
> > > >  .IR Documentation/admin\-guide/kernel\-parameters.txt .
> > > > +.\" prctl PR_SVE_SET_VL
> > > > +.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
> > > > +.\" linux-5.6/Documentation/arm64/sve.rst
> > > > +.TP
> > > > +.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
> > > > +Configure the thread's SVE vector length,
> > > > +as specified by
> > > > +.IR "(int) arg2" .
> > > > +Arguments
> > > > +.IR arg3 ", " arg4 " and " arg5
> > > > +are ignored.
> > > > +.IP
> > > > +The bits of
> > > > +.I arg2
> > > > +corresponding to
> > > > +.B PR_SVE_VL_LEN_MASK
> > > > +must be set to the desired vector length in bytes.
> > > > +This is interpreted as an upper bound:
> > > > +the kernel will select the greatest available vector length
> > > > +that does not exceed the value specified.
> > > > +In particular, specifying
> > > > +.B SVE_VL_MAX
> > > > +(defined in
> > > > +.I <asm/sigcontext.h>)
> > > > +for the
> > > > +.B PR_SVE_VL_LEN_MASK
> > > > +bits requests the maximum supported vector length.
> > > > +.IP
> > > > +In addition,
> > > > +.I arg2
> > > > +must be set to one of the following combinations of flags:
> > > 
> > > How about saying:
> > > 
> > >   In addition, the other bits of arg2 must be set according to the following
> > >   combinations of flags:
> > > 
> > > Otherwise I find it a bit fiddly to read, because it's valid to have
> > > flags of 0 and a non-zero length.
> > 
> > 0 is listed, so I hoped that was clear enough.
> > 
> > Maybe just write "must be one of the following values:"?
> > 
> > 0 is a value, but I can see why you might be uneasy about 0 being
> > described as a "combination of flags".
> 
> It's more that arg2 *also* holds the length, so saying that arg2 must
> be set to a combination of flags isn't quite right, because it's actually
> set to a combination of flags and the length.
> 
> > > > +.RS
> > > > +.TP
> > > > +.B 0
> > > > +Perform the change immediately.
> > > > +At the next
> > > > +.BR execve (2)
> > > > +in the thread,
> > > > +the vector length will be reset to the value configured in
> > > > +.IR /proc/sys/abi/sve_default_vector_length .
> > > 
> > > (implementation note: does this mean that 'sve_default_vl' should be
> > >  an atomic_t, as it can be accessed concurrently? We probably need
> > >  {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
> > >  that help us here...)
> > 
> > Is this purely theoretical?  Can you point to what could go wrong?
> 
> If the write is torn by the compiler, then a concurrent reader could end
> up seeing a bogus value. There could also be ToCToU issues if it's re-read.

It won't be torn in practice, no decision logic depends on the value
read, and you can't even get from the write to the read or vice-versa
without crossing a TU boundary (even under LTO), so there's basically
zero scope for sabotXXXXXoptimisation by the compiler.

Only root is allowed to write this thing anyway.

> > While I doubt I thought about this very hard and I agree that you're
> > right in principle, I think there are probably non-atomic sysctls and
> > debugfs files etc. all over the place.
> > 
> > I didn't want to clutter the code unnecessarily.
> 
> Right, but KCSAN is coming along and so somebody less familiar with the code
> will hit this eventually.

So the issue is theoretical, probably one of very many similar issues,
and anyway we have a tool for tracking them down if we need to?

I'm playing devil's advocate here, but I'd debate whether it's worth
it -- or even wise -- to fix these piecemeal unless we're confident this
is an egregious case.  Doing so may encourage a false sense of safety.
When we're in a position to do a treewide cleanup, that would be better,
no?

> > > > +.B PR_SVE_VL_INHERIT
> > > > +Perform the change immediately.
> > > > +Subsequent
> > > > +.BR execve (2)
> > > > +calls will preserve the new vector length.
> > > > +.TP
> > > > +.B PR_SVE_SET_VL_ONEXEC
> > > > +Defer the change, so that it is performed at the next
> > > > +.BR execve (2)
> > > > +in the thread.
> > > > +Further
> > > > +.BR execve (2)
> > > > +calls will reset the vector length to the value configured in
> > > > +.IR /proc/sys/abi/sve_default_vector_length .
> > > > +.TP
> > > > +.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
> > > > +Defer the change, so that it is performed at the next
> > > > +.BR execve (2)
> > > > +in the thread.
> > > > +Further
> > > > +.BR execve (2)
> > > > +calls will preserve the new vector length.
> > > > +.RE
> > > > +.IP
> > > > +In all cases,
> > > > +any previously pending deferred change is canceled.
> > > > +.IP
> > > > +The call fails with error
> > > > +.B EINVAL
> > > > +if SVE is not supported on the platform, if
> > > > +.I arg2
> > > > +is unrecognized or invalid, or the value in the bits of
> > > > +.I arg2
> > > > +corresponding to
> > > > +.B PR_SVE_VL_LEN_MASK
> > > > +is outside the range
> > > > +.BR SVE_VL_MIN .. SVE_VL_MAX
> > > > +or is not a multiple of 16.
> > > > +.IP
> > > > +On success,
> > > > +a nonnegative value is returned that describes the
> > > > +.I selected
> > > > +configuration,
> > > 
> > > If I'm reading the kernel code correctly, this is slightly weird, as
> > > the returned value may contain the PR_SVE_VL_INHERIT flag but it will
> > > never contain the PR_SVE_SET_VL_ONEXEC flag. Is that right?
> > 
> > Yes, which is an oddity.
> > 
> > I suppose we could fake that up actually by returning that flag if
> > sve_vl and sve_vl_onexec are different, but we don't currently do this.
> 
> I don't think there's any need to change the code, but I think this stuff
> is worth documenting.
> 
> > > If so, maybe just say something like:
> > > 
> > >   On success, a nonnegative value is returned that describes the selected
> > >   configuration in the same way as PR_SVE_GET_VL.
> > 
> > How does that help?  PR_SVE_GET_VL doesn't fully clarify the oddity you
> > call out anyway.
> 
> It clarifies it enough for my liking (by explicitly talking about "the bit
> corresponding to PR_SVE_VL_INHERIT" and not about PR_SVE_SET_VL_ONEXEC),
> but either way, I think saying that the return value is the same is a
> useful clarification. If you want to make PR_SVE_GET_VL more explicit,
> we could do that too.

Fair enough.  I'll just refer to PR_SVE_GET_VL, as you suggest.

I'm not keen to add any new wording at this stage.

> > > > +.B PR_SVE_SET_VL_ONEXEC
> > > > +flag may crash the calling process.
> > > > +The conditions for using it safely are complex and system-dependent.
> > > > +Don't use it unless you really know what you are doing.
> > > > +.IP
> > > > +For more information, see the kernel source file
> > > > +.I Documentation/arm64/sve.rst
> > > > +.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
> > > > +(or
> > > > +.I Documentation/arm64/sve.txt
> > > > +before Linux 5.3).
> > > 
> > > I think I'd drop the kernel reference here, as it feels like we're saying
> > > "only do this if you know what you're doing" on one hand, but then "if you
> > > don't know what you're doing, see this other documentation" on the other.
> > 
> > Well, the documentation doesn't answer those questions either.
> > 
> > I could just swap the warning and the cross-reference, so that the
> > cross-reference doesn't seem to follow on from "knowing what you're
> > doing"?
> 
> Ok.

OK, I'll aim to do that then.

Cheers
---Dave
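
[Editorial sketch, not part of the original message.] The return-value
encoding discussed above (vector length in the PR_SVE_VL_LEN_MASK bits,
plus the PR_SVE_VL_INHERIT bit, and never PR_SVE_SET_VL_ONEXEC) could be
decoded roughly as follows. The struct and helper names are hypothetical;
the fallback #defines repeat the values from <linux/prctl.h> for older
userspace headers.

```c
#include <sys/prctl.h>

/* Fallbacks for old headers; values as defined in <linux/prctl.h>. */
#ifndef PR_SVE_GET_VL
#define PR_SVE_SET_VL        50
#define PR_SVE_GET_VL        51
#define PR_SVE_SET_VL_ONEXEC (1 << 18)
#define PR_SVE_VL_LEN_MASK   0xffff
#define PR_SVE_VL_INHERIT    (1 << 17)
#endif

/* Hypothetical helper type: decoded PR_SVE_GET_VL / PR_SVE_SET_VL result. */
struct sve_vl_config {
	int vl_bytes;	/* vector length in bytes */
	int inherit;	/* nonzero if the length is preserved across execve(2) */
};

/* Split a nonnegative prctl() return value into its two fields. */
static struct sve_vl_config sve_decode(int ret)
{
	struct sve_vl_config c;

	c.vl_bytes = ret & PR_SVE_VL_LEN_MASK;
	c.inherit = (ret & PR_SVE_VL_INHERIT) != 0;
	return c;
}

/*
 * Query the current configuration.  Returns 0 and fills *out on success,
 * or -1 if the call fails (EINVAL when SVE is not available).
 */
static int sve_get_config(struct sve_vl_config *out)
{
	int ret = prctl(PR_SVE_GET_VL, 0, 0, 0, 0);

	if (ret < 0)
		return -1;
	*out = sve_decode(ret);
	return 0;
}
```

Note that nothing in the decoded value reveals a pending deferred
(PR_SVE_SET_VL_ONEXEC) change, which matches the oddity called out above.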
Dave Martin June 10, 2020, 9:45 a.m. UTC | #6
On Tue, Jun 09, 2020 at 01:39:03PM +0200, Michael Kerrisk (man-pages) wrote:
> Hi Dave,
> 
> I've not applied this patch yet, in case you want to make
> some changes in response to Will's comments.
> 
> I think all of the rest of the patches in the series are now
> applied (and pushed to master).
> 
> Thanks,

Ack, thanks
---Dave
Will Deacon June 10, 2020, 10:16 a.m. UTC | #7
[Dropped linux-man and Michael]

On Wed, Jun 10, 2020 at 10:44:42AM +0100, Dave Martin wrote:
> On Tue, Jun 09, 2020 at 03:49:05PM +0100, Will Deacon wrote:
> > On Tue, Jun 09, 2020 at 03:11:42PM +0100, Dave Martin wrote:
> > > On Tue, Jun 09, 2020 at 10:57:35AM +0100, Will Deacon wrote:
> > > > On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> > > > > +.RS
> > > > > +.TP
> > > > > +.B 0
> > > > > +Perform the change immediately.
> > > > > +At the next
> > > > > +.BR execve (2)
> > > > > +in the thread,
> > > > > +the vector length will be reset to the value configured in
> > > > > +.IR /proc/sys/abi/sve_default_vector_length .
> > > > 
> > > > (implementation note: does this mean that 'sve_default_vl' should be
> > > >  an atomic_t, as it can be accessed concurrently? We probably need
> > > >  {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
> > > >  that help us here...)
> > > 
> > > Is this purely theoretical?  Can you point to what could go wrong?
> > 
> > If the write is torn by the compiler, then a concurrent reader could end
> > up seeing a bogus value. There could also be ToCToU issues if it's re-read.
> 
> It won't be torn in practice, no decision logic depends on the value
> read, and you can't even get from the write to the read or vice-versa
> without crossing a TU boundary (even under LTO), so there's basically
> zero scope for sabotXXXXXoptimisation by the compiler.

Perhaps, but I'm not brave enough to state that :) Look at this crazy
thing, for example:

https://lore.kernel.org/lkml/CAG48ez2nFks+yN1Kp4TZisso+rjvv_4UW0FTo8iFUd4Qyq1qDw@mail.gmail.com/

Could the same sort of technique be applied to:


	vl = current->thread.sve_vl_onexec ?
	     current->thread.sve_vl_onexec : sve_default_vl;

	if (WARN_ON(!sve_vl_valid(vl)))
		vl = SVE_VL_MIN;

	supported_vl = find_supported_vector_length(vl);


so that the compiled code does something like:


	if (within_valid_bounds(sve_default_vl)) {
		supported_vl = jump_table(sve_default_vl); // Reload the variable
	} else {
		WARN_ON(1);
		supported_vl = SVE_VL_MIN;
	}


?

I'd certainly prefer not to have to think about that!

> Only root is allowed to write this thing anyway.
> 
> > > While I doubt I thought about this very hard and I agree that you're
> > > right in principle, I think there are probably non-atomic sysctls and
> > > debugfs files etc. all over the place.
> > > 
> > > I didn't want to clutter the code unnecessarily.
> > 
> > Right, but KCSAN is coming along and so somebody less familiar with the code
> > will hit this eventually.
> 
> So the issue is theoretical, probably one of very many similar issues,
> and anyway we have a tool for tracking them down if we need to?
> 
> I'm playing devil's advocate here, but I'd debate whether it's worth
> it -- or even wise -- to fix these piecemeal unless we're confident this
> is an egregious case.  Doing so may encourage a false sense of safety.
> When we're in a position to do a treewide cleanup, that would be better,
> no?

That's a good point, but it is inevitable that people will try to attempt
treewide introduction of {READ,WRITE}_ONCE() based solely on KCSAN reports
rather than an understanding of the code, and so I'd much rather somebody
who understands the code (that's you ;) deals with it first.

If the race is benign, then you can annotate the accesses with data_race()
and add a comment along the lines of your "It won't be torn in practice..."
paragraph above.

Anyway, this is entirely independent of the manpage effort, just that the
concurrency wasn't clear to me before I read what you'd written and thought
I'd mention this before I forget. It's also looking less likely that KCSAN
is going to land for 5.8, so there's no urgency to this at all.

Will
Dave Martin June 10, 2020, 12:48 p.m. UTC | #8
On Wed, Jun 10, 2020 at 11:16:49AM +0100, Will Deacon wrote:
> [Dropped linux-man and Michael]
> 
> On Wed, Jun 10, 2020 at 10:44:42AM +0100, Dave Martin wrote:
> > On Tue, Jun 09, 2020 at 03:49:05PM +0100, Will Deacon wrote:
> > > On Tue, Jun 09, 2020 at 03:11:42PM +0100, Dave Martin wrote:
> > > > On Tue, Jun 09, 2020 at 10:57:35AM +0100, Will Deacon wrote:
> > > > > On Wed, May 27, 2020 at 10:17:36PM +0100, Dave Martin wrote:
> > > > > > +.RS
> > > > > > +.TP
> > > > > > +.B 0
> > > > > > +Perform the change immediately.
> > > > > > +At the next
> > > > > > +.BR execve (2)
> > > > > > +in the thread,
> > > > > > +the vector length will be reset to the value configured in
> > > > > > +.IR /proc/sys/abi/sve_default_vector_length .
> > > > > 
> > > > > (implementation note: does this mean that 'sve_default_vl' should be
> > > > >  an atomic_t, as it can be accessed concurrently? We probably need
> > > > >  {READ,WRITE}_ONCE() at the very least, as I'm not seeing any locks
> > > > >  that help us here...)
> > > > 
> > > > Is this purely theoretical?  Can you point to what could go wrong?
> > > 
> > > If the write is torn by the compiler, then a concurrent reader could end
> > > up seeing a bogus value. There could also be ToCToU issues if it's re-read.
> > 
> > It won't be torn in practice, no decision logic depends on the value
> > read, and you can't even get from the write to the read or vice-versa
> > without crossing a TU boundary (even under LTO), so there's basically
> > zero scope for sabotXXXXXoptimisation by the compiler.
> 
> Perhaps, but I'm not brave enough to state that :) Look at this crazy
> thing, for example:
> 
> https://lore.kernel.org/lkml/CAG48ez2nFks+yN1Kp4TZisso+rjvv_4UW0FTo8iFUd4Qyq1qDw@mail.gmail.com/
> 
> Could the same sort of technique be applied to:
> 
> 
> 	vl = current->thread.sve_vl_onexec ?
> 	     current->thread.sve_vl_onexec : sve_default_vl;
> 
> 	if (WARN_ON(!sve_vl_valid(vl)))
> 		vl = SVE_VL_MIN;
> 
> 	supported_vl = find_supported_vector_length(vl);
> 
> 
> so that the compiled code does something like:
> 
> 
> 	if (within_valid_bounds(sve_default_vl)) {
> 		supported_vl = jump_table(sve_default_vl); // Reload the variable
> 	} else {
> 		WARN_ON(1);
> 		supported_vl = SVE_VL_MIN;
> 	}
> 
> 
> ?
> 
> I'd certainly prefer not to have to think about that!

Well sure, but the compiler has much to lose and nothing to gain from
such a transformation here.  This is a bit different from a load of
conditional code that can be heavily const-folded during specialisation.

Anyway, I'm not saying that you're not correct about the risk, just that
this feels like a common pattern.


> > Only root is allowed to write this thing anyway.
> > 
> > > > While I doubt I thought about this very hard and I agree that you're
> > > > right in principle, I think there are probably non-atomic sysctls and
> > > > debugfs files etc. all over the place.
> > > > 
> > > > I didn't want to clutter the code unnecessarily.
> > > 
> > > Right, but KCSAN is coming along and so somebody less familiar with the code
> > > will hit this eventually.
> > 
> > So the issue is theoretical, probably one of very many similar issues,
> > and anyway we have a tool for tracking them down if we need to?
> > 
> > I'm playing devil's advocate here, but I'd debate whether it's worth
> > it -- or even wise -- to fix these piecemeal unless we're confident this
> > is an egregious case.  Doing so may encourage a false sense of safety.
> > When we're in a position to do a treewide cleanup, that would be better,
> > no?
> 
> That's a good point, but it is inevitable that people will try to attempt
> treewide introduction of {READ,WRITE}_ONCE() based solely on KCSAN reports
> rather than an understanding of the code, and so I'd much rather somebody
> who understands the code (that's you ;) deals with it first.
> 
> If the race is benign, then you can annotate the accesses with data_race()
> and add a comment along the lines of your "It won't be torn in practice..."
> paragraph above.

Oh, it's complex enough to reason about that we should definitely use
proper atomics here so that we don't have to think about it.  Also, I'd
concede that the fact that this code has a custom sysctl accessor may
justify a special-case fix.

For most users, it would be better to clip sysctl's wings so that only
atomic accesses are allowed if the default implementation is used.  sysctl
is not a fast path: for single values of fundamental types, there's no
reason I can think of not to use atomics across the board.


> Anyway, this is entirely independent of the manpage effort, just that the
> concurrency wasn't clear to me before I read what you'd written and thought
> I'd mention this before I forget. It's also looking less likely that KCSAN
> is going to land for 5.8, so there's no urgency to this at all.

Sure, and I don't think I thought much beyond "I wonder what happens if
... nah, probably fine, if it mattered then everyone would be doing it."

I'm pretty sure I didn't get that wording out of the C spec.

It's a good spot though, and I may look at a fix if I get around to it.
Can't promise when, though.

Cheers
---Dave
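
[Editorial sketch, not part of the original thread.] The "read it exactly
once" pattern discussed above might look roughly like this in a userspace
model. The READ_ONCE()/WRITE_ONCE() macros here are simplified stand-ins
for the kernel's, and pick_exec_vl() and set_default_vl() are hypothetical
names; this is not the actual arm64 fix.

```c
/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Same values as SVE_VL_MIN / SVE_VL_MAX in <asm/sigcontext.h>. */
#define SVE_VL_MIN 16
#define SVE_VL_MAX 8192

/* Stand-in for the sysctl-backed default vector length. */
static int sve_default_vl = 64;

static int sve_vl_valid(int vl)
{
	return vl % 16 == 0 && vl >= SVE_VL_MIN && vl <= SVE_VL_MAX;
}

/* Writer side, e.g. the sysctl handler: a single store. */
static void set_default_vl(int vl)
{
	WRITE_ONCE(sve_default_vl, vl);
}

/*
 * Reader side: take one snapshot of the shared variable, then validate
 * and use only the local copy, so the value that was checked is the
 * value that is used even if a writer races with us.
 */
static int pick_exec_vl(int vl_onexec)
{
	int vl = vl_onexec ? vl_onexec : READ_ONCE(sve_default_vl);

	if (!sve_vl_valid(vl))
		vl = SVE_VL_MIN;
	return vl;
}
```

Because the bounds check and the later use both operate on the local vl,
the compiler cannot legally turn this into a check followed by a reload
of the shared variable.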

Patch

diff --git a/man2/prctl.2 b/man2/prctl.2
index cab9915..91df7c8 100644
--- a/man2/prctl.2
+++ b/man2/prctl.2
@@ -1291,6 +1291,148 @@  call failing with the error
 .BR ENXIO .
 For further details, see the kernel source file
 .IR Documentation/admin\-guide/kernel\-parameters.txt .
+.\" prctl PR_SVE_SET_VL
+.\" commit 2d2123bc7c7f843aa9db87720de159a049839862
+.\" linux-5.6/Documentation/arm64/sve.rst
+.TP
+.BR PR_SVE_SET_VL " (since Linux 4.15, only on arm64)"
+Configure the thread's SVE vector length,
+as specified by
+.IR "(int) arg2" .
+Arguments
+.IR arg3 ", " arg4 " and " arg5
+are ignored.
+.IP
+The bits of
+.I arg2
+corresponding to
+.B PR_SVE_VL_LEN_MASK
+must be set to the desired vector length in bytes.
+This is interpreted as an upper bound:
+the kernel will select the greatest available vector length
+that does not exceed the value specified.
+In particular, specifying
+.B SVE_VL_MAX
+(defined in
+.IR <asm/sigcontext.h> )
+for the
+.B PR_SVE_VL_LEN_MASK
+bits requests the maximum supported vector length.
+.IP
+In addition,
+.I arg2
+must be set to one of the following combinations of flags:
+.RS
+.TP
+.B 0
+Perform the change immediately.
+At the next
+.BR execve (2)
+in the thread,
+the vector length will be reset to the value configured in
+.IR /proc/sys/abi/sve_default_vector_length .
+.TP
+.B PR_SVE_VL_INHERIT
+Perform the change immediately.
+Subsequent
+.BR execve (2)
+calls will preserve the new vector length.
+.TP
+.B PR_SVE_SET_VL_ONEXEC
+Defer the change, so that it is performed at the next
+.BR execve (2)
+in the thread.
+Further
+.BR execve (2)
+calls will reset the vector length to the value configured in
+.IR /proc/sys/abi/sve_default_vector_length .
+.TP
+.B "PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT"
+Defer the change, so that it is performed at the next
+.BR execve (2)
+in the thread.
+Further
+.BR execve (2)
+calls will preserve the new vector length.
+.RE
+.IP
+In all cases,
+any previously pending deferred change is canceled.
+.IP
+The call fails with error
+.B EINVAL
+if SVE is not supported on the platform, if
+.I arg2
+is unrecognized or invalid, or the value in the bits of
+.I arg2
+corresponding to
+.B PR_SVE_VL_LEN_MASK
+is outside the range
+.BR SVE_VL_MIN .. SVE_VL_MAX
+or is not a multiple of 16.
+.IP
+On success,
+a nonnegative value is returned that describes the
+.I selected
+configuration,
+which may differ from the current configuration if
+.B PR_SVE_SET_VL_ONEXEC
+was specified.
+The value is encoded in the same way as the return value of
+.BR PR_SVE_GET_VL .
+.IP
+The configuration (including any pending deferred change)
+is inherited across
+.BR fork (2)
+and
+.BR clone (2).
+.IP
+.B Warning:
+Because the compiler or run-time environment
+may be using SVE, using this call without the
+.B PR_SVE_SET_VL_ONEXEC
+flag may crash the calling process.
+The conditions for using it safely are complex and system-dependent.
+Don't use it unless you really know what you are doing.
+.IP
+For more information, see the kernel source file
+.I Documentation/arm64/sve.rst
+.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
+(or
+.I Documentation/arm64/sve.txt
+before Linux 5.3).
+.\" prctl PR_SVE_GET_VL
+.TP
+.BR PR_SVE_GET_VL " (since Linux 4.15, only on arm64)"
+Get the thread's current SVE vector length configuration.
+.IP
+Arguments
+.IR arg2 ", " arg3 ", " arg4 " and " arg5
+are ignored.
+.IP
+Provided that the kernel and platform support SVE,
+this operation always succeeds,
+returning a nonnegative value that describes the
+.I current
+configuration.
+The bits corresponding to
+.B PR_SVE_VL_LEN_MASK
+contain the currently configured vector length in bytes.
+The bit corresponding to
+.B PR_SVE_VL_INHERIT
+indicates whether the vector length will be inherited
+across
+.BR execve (2).
+.IP
+Note that there is no way to determine whether there is
+a pending vector length change that has not yet taken effect.
+.IP
+For more information, see the kernel source file
+.I Documentation/arm64/sve.rst
+.\"commit b693d0b372afb39432e1c49ad7b3454855bc6bed
+(or
+.I Documentation/arm64/sve.txt
+before Linux 5.3).
 .\"
 .\" prctl PR_TASK_PERF_EVENTS_DISABLE
 .TP
@@ -1534,6 +1676,8 @@  On success,
 .BR PR_GET_NO_NEW_PRIVS ,
 .BR PR_GET_SECUREBITS ,
 .BR PR_GET_SPECULATION_CTRL ,
+.BR PR_SVE_GET_VL ,
+.BR PR_SVE_SET_VL ,
 .BR PR_GET_THP_DISABLE ,
 .BR PR_GET_TIMING ,
 .BR PR_GET_TIMERSLACK ,
@@ -1817,6 +1961,22 @@  and unused arguments to
 .BR prctl ()
 are not 0.
 .TP
+.B EINVAL
+.I option
+is
+.B PR_SVE_SET_VL
+and the arguments are invalid or unsupported,
+or SVE is not available on this platform.
+See the description of
+.B PR_SVE_SET_VL
+above for details.
+.TP
+.B EINVAL
+.I option
+is
+.B PR_SVE_GET_VL
+and SVE is not available on this platform.
+.TP
 .B ENODEV
 .I option
 was