[v4,1/2] arm64: Define Documentation/arm64/tagged-address-abi.txt

Message ID 20190612142111.28161-2-vincenzo.frascino@arm.com
State New

Commit Message

Vincenzo Frascino June 12, 2019, 2:21 p.m. UTC
On arm64 the TCR_EL1.TBI0 bit has been always enabled hence
the userspace (EL0) is allowed to set a non-zero value in the
top byte but the resulting pointers are not allowed at the
user-kernel syscall ABI boundary.

With the relaxed ABI proposed through this document, it is now possible
to pass tagged pointers to the syscalls, when these pointers are in
memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().

This change in the ABI requires a mechanism by which the userspace
opts in to the relaxed ABI.

Specify and document the way in which sysctl and prctl() can be used
in combination to allow the userspace to opt in to this feature.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 Documentation/arm64/tagged-address-abi.txt | 111 +++++++++++++++++++++
 1 file changed, 111 insertions(+)
 create mode 100644 Documentation/arm64/tagged-address-abi.txt

Comments

Catalin Marinas June 12, 2019, 3:35 p.m. UTC | #1
Hi Vincenzo,

Some minor comments below but it looks fine to me overall. Cc'ing
Szabolcs as well since I'd like a view from the libc people.

On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
> diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
> new file mode 100644
> index 000000000000..96e149e2c55c
> --- /dev/null
> +++ b/Documentation/arm64/tagged-address-abi.txt
> @@ -0,0 +1,111 @@
> +ARM64 TAGGED ADDRESS ABI
> +========================
> +
> +This document describes the usage and semantics of the Tagged Address
> +ABI on arm64.
> +
> +1. Introduction
> +---------------
> +
> +On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel,
> +hence the userspace (EL0) is allowed to set a non-zero value in the top

I'd be clearer here: "userspace (EL0) is allowed to perform a user
memory access through a 64-bit pointer with a non-zero top byte" (or
something along the lines). Otherwise setting a non-zero top byte is
allowed on any architecture, dereferencing it is a problem.

> +byte but the resulting pointers are not allowed at the user-kernel syscall
> +ABI boundary.
> +
> +This document describes a relaxation of the ABI with which it is possible

"relaxation of the ABI that makes it possible to..."

> +to pass tagged tagged pointers to the syscalls, when these pointers are in
> +memory ranges obtained as described in paragraph 2.

"section 2" is better. There are a lot more paragraphs.

> +
> +Since it is not desirable to relax the ABI to allow tagged user addresses
> +into the kernel indiscriminately, arm64 provides a new sysctl interface
> +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
> +enabling the relaxed ABI and a new prctl() interface that can be used to
> +enable or disable the relaxed ABI.
> +
> +The sysctl is meant also for testing purposes in order to provide a simple
> +way for the userspace to verify the return error checking of the prctl()
> +command without having to reconfigure the kernel.
> +
> +The ABI properties are inherited by threads of the same application and
> +fork()'ed children but cleared when a new process is spawn (execve()).

"spawned".

I guess you could drop these three paragraphs here and mention the
inheritance properties when introducing the prctl() below. You can also
mention the global sysctl switch after the prctl() was introduced.

> +
> +2. ARM64 Tagged Address ABI
> +---------------------------
> +
> +From the kernel syscall interface prospective, we define, for the purposes
> +of this document, a "valid tagged pointer" as a pointer that either it has

"either has" (no 'it') sounds slightly better but I'm not a native
English speaker either.

> +a zero value set in the top byte or it has a non-zero value, it is in memory
> +ranges privately owned by a userspace process and it is obtained in one of
> +the following ways:
> +  - mmap() done by the process itself, where either:
> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
> +      file or "/dev/zero"
> +  - a mapping below sbrk(0) done by the process itself
> +  - any memory mapped by the kernel in the process's address space during
> +    creation and following the restrictions presented above (i.e. data, bss,
> +    stack).
> +
> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
> +control it using the following prctl()s:
> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.

enable or disable (not sure we need the latter but it doesn't hurt).

I'd add the arg2 description here as well.

> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> +                             Address ABI.
> +
> +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
> +the ABI guarantees the following behaviours:
> +
> +  - Every current or newly introduced syscall can accept any valid tagged
> +    pointers.
> +
> +  - If a non valid tagged pointer is passed to a syscall then the behaviour
> +    is undefined.
> +
> +  - Every valid tagged pointer is expected to work as an untagged one.
> +
> +  - The kernel preserves any valid tagged pointers and returns them to the
> +    userspace unchanged in all the cases except the ones documented in the
> +    "Preserving tags" paragraph of tagged-pointers.txt.

I'd think we need to qualify the context here in which the kernel
preserves the tagged pointers. Did you mean on the syscall return?

> +
> +A definition of the meaning of tagged pointers on arm64 can be found in:
> +Documentation/arm64/tagged-pointers.txt.
> +
> +3. ARM64 Tagged Address ABI Exceptions
> +--------------------------------------
> +
> +The behaviours described in paragraph 2, with particular reference to the

"section 2"

> +acceptance by the syscalls of any valid tagged pointer are not applicable
> +to the following cases:
> +  - mmap() addr parameter.
> +  - mremap() new_address parameter.
> +  - prctl_set_mm() struct prctl_map fields.
> +  - prctl_set_mm_map() struct prctl_map fields.
> +
> +4. Example of correct usage
> +---------------------------
> +
> +void main(void)
> +{
> +	static int tbi_enabled = 0;
> +	unsigned long tag = 0;
> +
> +	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
> +			 MAP_ANONYMOUS, -1, 0);
> +
> +	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
> +		  0, 0, 0) == 0)
> +		tbi_enabled = 1;
> +
> +	if (!ptr)
> +		return -1;
> +
> +	if (tbi_enabled)
> +		tag = rand() & 0xff;
> +
> +	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
> +
> +	*ptr = 'a';
> +
> +	...
> +}
> +
> -- 
> 2.21.0
Szabolcs Nagy June 12, 2019, 4:30 p.m. UTC | #2
On 12/06/2019 15:21, Vincenzo Frascino wrote:
> On arm64 the TCR_EL1.TBI0 bit has been always enabled hence
> the userspace (EL0) is allowed to set a non-zero value in the
> top byte but the resulting pointers are not allowed at the
> user-kernel syscall ABI boundary.
> 
> With the relaxed ABI proposed through this document, it is now possible
> to pass tagged pointers to the syscalls, when these pointers are in
> memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
> 
> This change in the ABI requires a mechanism to requires the userspace
> to opt-in to such an option.
> 
> Specify and document the way in which sysctl and prctl() can be used
> in combination to allow the userspace to opt-in this feature.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> CC: Andrey Konovalov <andreyknvl@google.com>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> ---
>  Documentation/arm64/tagged-address-abi.txt | 111 +++++++++++++++++++++
>  1 file changed, 111 insertions(+)
>  create mode 100644 Documentation/arm64/tagged-address-abi.txt
> 
> diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
> new file mode 100644
> index 000000000000..96e149e2c55c
> --- /dev/null
> +++ b/Documentation/arm64/tagged-address-abi.txt
> @@ -0,0 +1,111 @@
> +ARM64 TAGGED ADDRESS ABI
> +========================
> +
> +This document describes the usage and semantics of the Tagged Address
> +ABI on arm64.
> +
> +1. Introduction
> +---------------
> +
> +On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel,
> +hence the userspace (EL0) is allowed to set a non-zero value in the top
> +byte but the resulting pointers are not allowed at the user-kernel syscall
> +ABI boundary.
> +
> +This document describes a relaxation of the ABI with which it is possible
> +to pass tagged tagged pointers to the syscalls, when these pointers are in
           ^^^^^^^^^^^^^
typo.

> +memory ranges obtained as described in paragraph 2.
> +
> +Since it is not desirable to relax the ABI to allow tagged user addresses
> +into the kernel indiscriminately, arm64 provides a new sysctl interface
> +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
> +enabling the relaxed ABI and a new prctl() interface that can be used to
> +enable or disable the relaxed ABI.
> +
> +The sysctl is meant also for testing purposes in order to provide a simple
> +way for the userspace to verify the return error checking of the prctl()
> +command without having to reconfigure the kernel.
> +
> +The ABI properties are inherited by threads of the same application and
> +fork()'ed children but cleared when a new process is spawn (execve()).

OK.

> +
> +2. ARM64 Tagged Address ABI
> +---------------------------
> +
> +From the kernel syscall interface prospective, we define, for the purposes
                                     ^^^^^^^^^^^
perspective

> +of this document, a "valid tagged pointer" as a pointer that either it has
> +a zero value set in the top byte or it has a non-zero value, it is in memory
> +ranges privately owned by a userspace process and it is obtained in one of
> +the following ways:
> +  - mmap() done by the process itself, where either:
> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
> +      file or "/dev/zero"

this does not make it clear if MAP_FIXED or other
flags are valid (there are many map flags i don't
know, but at least fixed should work and stack/growsdown.
i'd expect anything that's not incompatible with
private|anon to work).

> +  - a mapping below sbrk(0) done by the process itself

doesn't the mmap rule cover this?

> +  - any memory mapped by the kernel in the process's address space during
> +    creation and following the restrictions presented above (i.e. data, bss,
> +    stack).

OK.

Can a null pointer have a tag?
(in case NULL is valid to pass to a syscall)

> +
> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
> +control it using the following prctl()s:
> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> +                             Address ABI.
> +
> +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
> +the ABI guarantees the following behaviours:
> +
> +  - Every current or newly introduced syscall can accept any valid tagged
> +    pointers.
> +
> +  - If a non valid tagged pointer is passed to a syscall then the behaviour
> +    is undefined.
> +
> +  - Every valid tagged pointer is expected to work as an untagged one.
> +
> +  - The kernel preserves any valid tagged pointers and returns them to the
> +    userspace unchanged in all the cases except the ones documented in the
> +    "Preserving tags" paragraph of tagged-pointers.txt.

OK.

i guess pointers of another process are not "valid tagged
pointers" for the current one, so e.g. in ptrace the
ptracer has to clear the tags before PEEK etc.

> +
> +A definition of the meaning of tagged pointers on arm64 can be found in:
> +Documentation/arm64/tagged-pointers.txt.
> +
> +3. ARM64 Tagged Address ABI Exceptions
> +--------------------------------------
> +
> +The behaviours described in paragraph 2, with particular reference to the
> +acceptance by the syscalls of any valid tagged pointer are not applicable
> +to the following cases:
> +  - mmap() addr parameter.
> +  - mremap() new_address parameter.
> +  - prctl_set_mm() struct prctl_map fields.
> +  - prctl_set_mm_map() struct prctl_map fields.

i don't understand the exception: does it mean
that passing a tagged address to these syscalls
is undefined?

> +
> +4. Example of correct usage
> +---------------------------
> +
> +void main(void)
> +{
> +	static int tbi_enabled = 0;
> +	unsigned long tag = 0;
> +
> +	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
> +			 MAP_ANONYMOUS, -1, 0);
> +
> +	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
> +		  0, 0, 0) == 0)
> +		tbi_enabled = 1;
> +
> +	if (!ptr)
> +		return -1;

mmap returns MAP_FAILED on failure.

> +
> +	if (tbi_enabled)
> +		tag = rand() & 0xff;
> +
> +	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
> +
> +	*ptr = 'a';
> +
> +	...
> +}
> +
>
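
For reference, the example from section 4 could look roughly as follows once the comments in this thread are applied (MAP_FAILED check, MAP_PRIVATE | MAP_ANONYMOUS flags). This is only a sketch under assumptions: the function name tagged_store_demo() is made up, TAG_SHIFT is assumed to be 56 (the top byte of a 64-bit pointer), the page size is taken from sysconf(), and the fallback prctl() constants are the values later kernels define, which this v4 patch does not itself spell out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Assumed fallbacks for libc headers that predate the new prctl(). */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

#define TAG_SHIFT 56	/* assumption: tag lives in the top byte */

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int tagged_store_demo(void)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	uint64_t tag = 0;
	char *ptr, *tagged;
	int ok;

	ptr = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (ptr == MAP_FAILED)	/* mmap() does not return NULL on failure */
		return -1;

	/* Opt in; pick a non-zero tag only if the kernel accepted it. */
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
		  0, 0, 0) == 0)
		tag = (uint64_t)(rand() & 0xff);

	tagged = (char *)((uintptr_t)ptr | (uintptr_t)(tag << TAG_SHIFT));

	/* Store through the (possibly tagged) alias, read back untagged. */
	*tagged = 'a';
	ok = (*ptr == 'a');

	munmap(ptr, page_size);
	return ok ? 0 : -1;
}
```

The non-zero tag is applied only when prctl() succeeds, so the same code should keep working on kernels that do not offer the relaxed ABI.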
Catalin Marinas June 13, 2019, 9:20 a.m. UTC | #3
Hi Szabolcs,

On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
> On 12/06/2019 15:21, Vincenzo Frascino wrote:
> > +2. ARM64 Tagged Address ABI
> > +---------------------------
> > +
> > +From the kernel syscall interface prospective, we define, for the purposes
>                                      ^^^^^^^^^^^
> perspective
> 
> > +of this document, a "valid tagged pointer" as a pointer that either it has
> > +a zero value set in the top byte or it has a non-zero value, it is in memory
> > +ranges privately owned by a userspace process and it is obtained in one of
> > +the following ways:
> > +  - mmap() done by the process itself, where either:
> > +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
> > +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
> > +      file or "/dev/zero"
> 
> this does not make it clear if MAP_FIXED or other flags are valid
> (there are many map flags i don't know, but at least fixed should work
> and stack/growsdown. i'd expect anything that's not incompatible with
> private|anon to work).

Just to clarify, this document tries to define the memory ranges from
where tagged addresses can be passed into the kernel in the context
of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
should not affect this.

> > +  - a mapping below sbrk(0) done by the process itself
> 
> doesn't the mmap rule cover this?

IIUC it doesn't cover it as that's memory mapped by the kernel
automatically on access vs a pointer returned by mmap(). The statement
above talks about how the address is obtained by the user.

> > +  - any memory mapped by the kernel in the process's address space during
> > +    creation and following the restrictions presented above (i.e. data, bss,
> > +    stack).
> 
> OK.
> 
> Can a null pointer have a tag?
> (in case NULL is valid to pass to a syscall)

Good point. I don't think it can. We may change this for MTE where we
give a hint tag but no hint address, however, this document only covers
TBI for now.

> > +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
> > +control it using the following prctl()s:
> > +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
> > +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> > +                             Address ABI.
> > +
> > +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
> > +the ABI guarantees the following behaviours:
> > +
> > +  - Every current or newly introduced syscall can accept any valid tagged
> > +    pointers.
> > +
> > +  - If a non valid tagged pointer is passed to a syscall then the behaviour
> > +    is undefined.
> > +
> > +  - Every valid tagged pointer is expected to work as an untagged one.
> > +
> > +  - The kernel preserves any valid tagged pointers and returns them to the
> > +    userspace unchanged in all the cases except the ones documented in the
> > +    "Preserving tags" paragraph of tagged-pointers.txt.
> 
> OK.
> 
> i guess pointers of another process are not "valid tagged pointers"
> for the current one, so e.g. in ptrace the ptracer has to clear the
> tags before PEEK etc.

Another good point. Are there any pros/cons here or use-cases? When we
add MTE support, should we handle this differently?

> > +A definition of the meaning of tagged pointers on arm64 can be found in:
> > +Documentation/arm64/tagged-pointers.txt.
> > +
> > +3. ARM64 Tagged Address ABI Exceptions
> > +--------------------------------------
> > +
> > +The behaviours described in paragraph 2, with particular reference to the
> > +acceptance by the syscalls of any valid tagged pointer are not applicable
> > +to the following cases:
> > +  - mmap() addr parameter.
> > +  - mremap() new_address parameter.
> > +  - prctl_set_mm() struct prctl_map fields.
> > +  - prctl_set_mm_map() struct prctl_map fields.
> 
> i don't understand the exception: does it mean that passing a tagged
> address to these syscalls is undefined?

I'd say it's as undefined as it is right now without these patches. We
may be able to explain this better in the document.
Szabolcs Nagy June 13, 2019, 10:14 a.m. UTC | #4
On 13/06/2019 10:20, Catalin Marinas wrote:
> Hi Szabolcs,
> 
> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>> +2. ARM64 Tagged Address ABI
>>> +---------------------------
>>> +
>>> +From the kernel syscall interface prospective, we define, for the purposes
>>                                      ^^^^^^^^^^^
>> perspective
>>
>>> +of this document, a "valid tagged pointer" as a pointer that either it has
>>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>>> +ranges privately owned by a userspace process and it is obtained in one of
>>> +the following ways:
>>> +  - mmap() done by the process itself, where either:
>>> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
>>> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
>>> +      file or "/dev/zero"
>>
>> this does not make it clear if MAP_FIXED or other flags are valid
>> (there are many map flags i don't know, but at least fixed should work
>> and stack/growsdown. i'd expect anything that's not incompatible with
>> private|anon to work).
> 
> Just to clarify, this document tries to define the memory ranges from
> where tagged addresses can be passed into the kernel in the context
> of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
> should not affect this.

yes, so either the text should list MAP_* flags that don't affect
the pointer tagging semantics or specify private|anon mapping
with different wording.

>>> +  - a mapping below sbrk(0) done by the process itself
>>
>> doesn't the mmap rule cover this?
> 
> IIUC it doesn't cover it as that's memory mapped by the kernel
> automatically on access vs a pointer returned by mmap(). The statement
> above talks about how the address is obtained by the user.

ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
that happens to be below the heap area.

i think "below sbrk(0)" is not the best term to use: there
may be an address range below the heap area that can be mmapped
and is thus below sbrk(0), and sbrk is a posix api, not a linux
syscall; the libc can implement it with mmap or whatever.

i'm not sure what the right term for 'heap area' is
(the address range between syscall(__NR_brk,0) at
program startup and its current value?)
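
As a rough illustration of that definition, the raw brk syscall can be used on Linux to read the current program break without going through the libc sbrk(). The helper names below are hypothetical, and in_heap_area() simply encodes the range described above (from the break recorded at startup up to the current break):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * brk(0) can never be satisfied, so the raw syscall leaves the break
 * untouched and returns its current value.
 */
static uintptr_t current_program_break(void)
{
	return (uintptr_t)syscall(SYS_brk, 0);
}

/*
 * Hypothetical check: is addr inside the area between the break
 * recorded at program startup and the current break?
 */
static int in_heap_area(uintptr_t startup_break, uintptr_t addr)
{
	return addr >= startup_break && addr < current_program_break();
}
```

A program would record current_program_break() early in main() and pass that value as startup_break later on.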

>>> +  - any memory mapped by the kernel in the process's address space during
>>> +    creation and following the restrictions presented above (i.e. data, bss,
>>> +    stack).
>>
>> OK.
>>
>> Can a null pointer have a tag?
>> (in case NULL is valid to pass to a syscall)
> 
> Good point. I don't think it can. We may change this for MTE where we
> give a hint tag but no hint address, however, this document only covers
> TBI for now.

OK.

>>> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
>>> +control it using the following prctl()s:
>>> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
>>> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
>>> +                             Address ABI.
>>> +
>>> +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
>>> +the ABI guarantees the following behaviours:
>>> +
>>> +  - Every current or newly introduced syscall can accept any valid tagged
>>> +    pointers.
>>> +
>>> +  - If a non valid tagged pointer is passed to a syscall then the behaviour
>>> +    is undefined.
>>> +
>>> +  - Every valid tagged pointer is expected to work as an untagged one.
>>> +
>>> +  - The kernel preserves any valid tagged pointers and returns them to the
>>> +    userspace unchanged in all the cases except the ones documented in the
>>> +    "Preserving tags" paragraph of tagged-pointers.txt.
>>
>> OK.
>>
>> i guess pointers of another process are not "valid tagged pointers"
>> for the current one, so e.g. in ptrace the ptracer has to clear the
>> tags before PEEK etc.
> 
> Another good point. Are there any pros/cons here or use-cases? When we
> add MTE support, should we handle this differently?

i'm not sure what gdb does currently, but it has
an 'address_significant' hook used at a few places
that drops the tag on aarch64, so it probably
avoids passing tagged pointer to ptrace.

i was worried about strace which tries to print
structs passed to syscalls and follow pointers in
them which currently would work, but if we allow
tags in syscalls then it needs some update.
(i haven't checked the strace code though)
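
A tool in that position would presumably strip the tag before handing a tracee address to ptrace(). A minimal sketch of such helpers, assuming the tag occupies bits 63:56 of a user address; the names addr_tag(), addr_untag() and addr_set_tag() are illustrative and not taken from gdb, strace or the kernel:

```c
#include <stdint.h>

#define TAG_SHIFT	56
#define TAG_MASK	(UINT64_C(0xff) << TAG_SHIFT)

/* Extract the top-byte tag from an address. */
static inline uint8_t addr_tag(uint64_t addr)
{
	return (uint8_t)((addr & TAG_MASK) >> TAG_SHIFT);
}

/* Clear the top byte; sufficient for user addresses (bit 55 is 0). */
static inline uint64_t addr_untag(uint64_t addr)
{
	return addr & ~TAG_MASK;
}

/* Replace whatever tag is present with the given one. */
static inline uint64_t addr_set_tag(uint64_t addr, uint8_t tag)
{
	return addr_untag(addr) | ((uint64_t)tag << TAG_SHIFT);
}
```

A ptracer would call addr_untag() on any pointer read from the tracee before passing it to a request such as PTRACE_PEEKDATA.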

>>> +A definition of the meaning of tagged pointers on arm64 can be found in:
>>> +Documentation/arm64/tagged-pointers.txt.
>>> +
>>> +3. ARM64 Tagged Address ABI Exceptions
>>> +--------------------------------------
>>> +
>>> +The behaviours described in paragraph 2, with particular reference to the
>>> +acceptance by the syscalls of any valid tagged pointer are not applicable
>>> +to the following cases:
>>> +  - mmap() addr parameter.
>>> +  - mremap() new_address parameter.
>>> +  - prctl_set_mm() struct prctl_map fields.
>>> +  - prctl_set_mm_map() struct prctl_map fields.
>>
>> i don't understand the exception: does it mean that passing a tagged
>> address to these syscalls is undefined?
> 
> I'd say it's as undefined as it is right now without these patches. We
> may be able to explain this better in the document.
>
Vincenzo Frascino June 13, 2019, 10:15 a.m. UTC | #5
Hi Catalin,

On 12/06/2019 16:35, Catalin Marinas wrote:
> Hi Vincenzo,
> 
> Some minor comments below but it looks fine to me overall. Cc'ing
> Szabolcs as well since I'd like a view from the libc people.
> 

Thanks for this, I saw Szabolcs's comments.

> On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
>> diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
>> new file mode 100644
>> index 000000000000..96e149e2c55c
>> --- /dev/null
>> +++ b/Documentation/arm64/tagged-address-abi.txt
>> @@ -0,0 +1,111 @@
>> +ARM64 TAGGED ADDRESS ABI
>> +========================
>> +
>> +This document describes the usage and semantics of the Tagged Address
>> +ABI on arm64.
>> +
>> +1. Introduction
>> +---------------
>> +
>> +On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel,
>> +hence the userspace (EL0) is allowed to set a non-zero value in the top
> 
> I'd be clearer here: "userspace (EL0) is allowed to perform a user
> memory access through a 64-bit pointer with a non-zero top byte" (or
> something along the lines). Otherwise setting a non-zero top byte is
> allowed on any architecture, dereferencing it is a problem.
> 

Ok.

>> +byte but the resulting pointers are not allowed at the user-kernel syscall
>> +ABI boundary.
>> +
>> +This document describes a relaxation of the ABI with which it is possible
> 
> "relaxation of the ABI that makes it possible to..."
> 
>> +to pass tagged tagged pointers to the syscalls, when these pointers are in
>> +memory ranges obtained as described in paragraph 2.
> 
> "section 2" is better. There are a lot more paragraphs.
> 

Agree.

>> +
>> +Since it is not desirable to relax the ABI to allow tagged user addresses
>> +into the kernel indiscriminately, arm64 provides a new sysctl interface
>> +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
>> +enabling the relaxed ABI and a new prctl() interface that can be used to
>> +enable or disable the relaxed ABI.
>> +
>> +The sysctl is meant also for testing purposes in order to provide a simple
>> +way for the userspace to verify the return error checking of the prctl()
>> +command without having to reconfigure the kernel.
>> +
>> +The ABI properties are inherited by threads of the same application and
>> +fork()'ed children but cleared when a new process is spawn (execve()).
> 
> "spawned".
> 
> I guess you could drop these three paragraphs here and mention the
> inheritance properties when introducing the prctl() below. You can also
> mention the global sysctl switch after the prctl() was introduced.
> 

I will move the last two (rewording them) to the _section_ 2, but I would still
prefer the Introduction to give an overview of the solution as well.

>> +
>> +2. ARM64 Tagged Address ABI
>> +---------------------------
>> +
>> +From the kernel syscall interface prospective, we define, for the purposes
>> +of this document, a "valid tagged pointer" as a pointer that either it has
> 
> "either has" (no 'it') sounds slightly better but I'm not a native
> English speaker either.
> 
>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>> +ranges privately owned by a userspace process and it is obtained in one of
>> +the following ways:
>> +  - mmap() done by the process itself, where either:
>> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
>> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
>> +      file or "/dev/zero"
>> +  - a mapping below sbrk(0) done by the process itself
>> +  - any memory mapped by the kernel in the process's address space during
>> +    creation and following the restrictions presented above (i.e. data, bss,
>> +    stack).
>> +
>> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
>> +control it using the following prctl()s:
>> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
> 
> enable or disable (not sure we need the latter but it doesn't hurt).
> 
> I'd add the arg2 description here as well.
> 

Good point I missed this.

>> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
>> +                             Address ABI.
>> +
>> +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
>> +the ABI guarantees the following behaviours:
>> +
>> +  - Every current or newly introduced syscall can accept any valid tagged
>> +    pointers.
>> +
>> +  - If a non valid tagged pointer is passed to a syscall then the behaviour
>> +    is undefined.
>> +
>> +  - Every valid tagged pointer is expected to work as an untagged one.
>> +
>> +  - The kernel preserves any valid tagged pointers and returns them to the
>> +    userspace unchanged in all the cases except the ones documented in the
>> +    "Preserving tags" paragraph of tagged-pointers.txt.
> 
> I'd think we need to qualify the context here in which the kernel
> preserves the tagged pointers. Did you mean on the syscall return?
> 

What this means is that on syscall return the tags are preserved, but if, for
example, you have tagged pointers inside siginfo_t, they will not be, because
according to tagged-pointers.txt non-zero tags are not preserved when delivering
signals.

>> +
>> +A definition of the meaning of tagged pointers on arm64 can be found in:
>> +Documentation/arm64/tagged-pointers.txt.
>> +
>> +3. ARM64 Tagged Address ABI Exceptions
>> +--------------------------------------
>> +
>> +The behaviours described in paragraph 2, with particular reference to the
> 
> "section 2"
> 
>> +acceptance by the syscalls of any valid tagged pointer are not applicable
>> +to the following cases:
>> +  - mmap() addr parameter.
>> +  - mremap() new_address parameter.
>> +  - prctl_set_mm() struct prctl_map fields.
>> +  - prctl_set_mm_map() struct prctl_map fields.
>> +
>> +4. Example of correct usage
>> +---------------------------
>> +
>> +void main(void)
>> +{
>> +	static int tbi_enabled = 0;
>> +	unsigned long tag = 0;
>> +
>> +	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
>> +			 MAP_ANONYMOUS, -1, 0);
>> +
>> +	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
>> +		  0, 0, 0) == 0)
>> +		tbi_enabled = 1;
>> +
>> +	if (!ptr)
>> +		return -1;
>> +
>> +	if (tbi_enabled)
>> +		tag = rand() & 0xff;
>> +
>> +	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
>> +
>> +	*ptr = 'a';
>> +
>> +	...
>> +}
>> +
>> -- 
>> 2.21.0
>
Vincenzo Frascino June 13, 2019, 11:16 a.m. UTC | #6
Hi Szabolcs,

thank you for your review.

On 13/06/2019 11:14, Szabolcs Nagy wrote:
> On 13/06/2019 10:20, Catalin Marinas wrote:
>> Hi Szabolcs,
>>
>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>> +2. ARM64 Tagged Address ABI
>>>> +---------------------------
>>>> +
>>>> +From the kernel syscall interface prospective, we define, for the purposes
>>>                                      ^^^^^^^^^^^
>>> perspective
>>>
>>>> +of this document, a "valid tagged pointer" as a pointer that either it has
>>>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>>>> +ranges privately owned by a userspace process and it is obtained in one of
>>>> +the following ways:
>>>> +  - mmap() done by the process itself, where either:
>>>> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
>>>> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
>>>> +      file or "/dev/zero"
>>>
>>> this does not make it clear if MAP_FIXED or other flags are valid
>>> (there are many map flags i don't know, but at least fixed should work
>>> and stack/growsdown. i'd expect anything that's not incompatible with
>>> private|anon to work).
>>
>> Just to clarify, this document tries to define the memory ranges from
>> where tagged addresses can be passed into the kernel in the context
>> of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
>> should not affect this.
> 
> yes, so either the text should list MAP_* flags that don't affect
> the pointer tagging semantics or specify private|anon mapping
> with different wording.
> 

Good point. Could you please propose a wording that would be suitable for this case?

>>>> +  - a mapping below sbrk(0) done by the process itself
>>>
>>> doesn't the mmap rule cover this?
>>
>> IIUC it doesn't cover it as that's memory mapped by the kernel
>> automatically on access vs a pointer returned by mmap(). The statement
>> above talks about how the address is obtained by the user.
> 
> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
> that happens to be below the heap area.
> 
> i think "below sbrk(0)" is not the best term to use: there
> may be address range below the heap area that can be mmapped
> and thus below sbrk(0) and sbrk is a posix api not a linux
> syscall, the libc can implement it with mmap or whatever.
> 
> i'm not sure what the right term for 'heap area' is
> (the address range between syscall(__NR_brk,0) at
> program startup and its current value?)
> 

I used sbrk(0) with the meaning of "end of the process's data segment" not
implying that this is a syscall, but just as a useful way to identify the mapping.
I agree that it is a posix function implemented by libc but when it is used with
0 finds the current location of the program break, which can be changed by brk()
and depending on the new address passed to this syscall can have the effect of
allocating or deallocating memory.

Will changing sbrk(0) with "end of the process's data segment" make it more clear?

I will add what you are suggesting about the heap area.

>>>> +  - any memory mapped by the kernel in the process's address space during
>>>> +    creation and following the restrictions presented above (i.e. data, bss,
>>>> +    stack).
>>>
>>> OK.
>>>
>>> Can a null pointer have a tag?
>>> (in case NULL is valid to pass to a syscall)
>>
>> Good point. I don't think it can. We may change this for MTE where we
>> give a hint tag but no hint address, however, this document only covers
>> TBI for now.
> 
> OK.
> 
>>>> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
>>>> +control it using the following prctl()s:
>>>> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
>>>> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
>>>> +                             Address ABI.
>>>> +
>>>> +As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
>>>> +the ABI guarantees the following behaviours:
>>>> +
>>>> +  - Every current or newly introduced syscall can accept any valid tagged
>>>> +    pointers.
>>>> +
>>>> +  - If a non valid tagged pointer is passed to a syscall then the behaviour
>>>> +    is undefined.
>>>> +
>>>> +  - Every valid tagged pointer is expected to work as an untagged one.
>>>> +
>>>> +  - The kernel preserves any valid tagged pointers and returns them to the
>>>> +    userspace unchanged in all the cases except the ones documented in the
>>>> +    "Preserving tags" paragraph of tagged-pointers.txt.
>>>
>>> OK.
>>>
>>> i guess pointers of another process are not "valid tagged pointers"
>>> for the current one, so e.g. in ptrace the ptracer has to clear the
>>> tags before PEEK etc.
>>
>> Another good point. Are there any pros/cons here or use-cases? When we
>> add MTE support, should we handle this differently?
> 
> i'm not sure what gdb does currently, but it has
> an 'address_significant' hook used at a few places
> that drops the tag on aarch64, so it probably
> avoids passing tagged pointer to ptrace.
> 
> i was worried about strace which tries to print
> structs passed to syscalls and follow pointers in
> them which currently would work, but if we allow
> tags in syscalls then it needs some update.
> (i haven't checked the strace code though)
> 
>>>> +A definition of the meaning of tagged pointers on arm64 can be found in:
>>>> +Documentation/arm64/tagged-pointers.txt.
>>>> +
>>>> +3. ARM64 Tagged Address ABI Exceptions
>>>> +--------------------------------------
>>>> +
>>>> +The behaviours described in paragraph 2, with particular reference to the
>>>> +acceptance by the syscalls of any valid tagged pointer are not applicable
>>>> +to the following cases:
>>>> +  - mmap() addr parameter.
>>>> +  - mremap() new_address parameter.
>>>> +  - prctl_set_mm() struct prctl_map fields.
>>>> +  - prctl_set_mm_map() struct prctl_map fields.
>>>
>>> i don't understand the exception: does it mean that passing a tagged
>>> address to these syscalls is undefined?
>>
>> I'd say it's as undefined as it is right now without these patches. We
>> may be able to explain this better in the document.
>>
>
Dave Martin June 13, 2019, 11:37 a.m. UTC | #7
On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
> Hi Catalin,
> 
> On 12/06/2019 16:35, Catalin Marinas wrote:
> > Hi Vincenzo,
> > 
> > Some minor comments below but it looks fine to me overall. Cc'ing
> > Szabolcs as well since I'd like a view from the libc people.
> > 
> 
> Thanks for this, I saw Szabolcs comments.
> 
> > On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
> >> diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
> >> new file mode 100644
> >> index 000000000000..96e149e2c55c
> >> --- /dev/null
> >> +++ b/Documentation/arm64/tagged-address-abi.txt

[...]

> >> +Since it is not desirable to relax the ABI to allow tagged user addresses
> >> +into the kernel indiscriminately, arm64 provides a new sysctl interface
> >> +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
> >> +enabling the relaxed ABI and a new prctl() interface that can be used to
> >> +enable or disable the relaxed ABI.
> >> +
> >> +The sysctl is meant also for testing purposes in order to provide a simple
> >> +way for the userspace to verify the return error checking of the prctl()
> >> +command without having to reconfigure the kernel.
> >> +
> >> +The ABI properties are inherited by threads of the same application and
> >> +fork()'ed children but cleared when a new process is spawn (execve()).
> > 
> > "spawned".

I'd just say "cleared by execve()."

"Spawn" suggests (v)fork+exec to me (at least, that's what "spawn" means on
certain other OSes).

> > 
> > I guess you could drop these three paragraphs here and mention the
> > inheritance properties when introducing the prctl() below. You can also
> > mention the global sysctl switch after the prctl() was introduced.
> > 
> 
> I will move the last two (rewording them) to the _section_ 2, but I would still
> prefer the Introduction to give an overview of the solution as well.
> 
> >> +
> >> +2. ARM64 Tagged Address ABI
> >> +---------------------------
> >> +
> >> +From the kernel syscall interface prospective, we define, for the purposes
> >> +of this document, a "valid tagged pointer" as a pointer that either it has
> > 
> > "either has" (no 'it') sounds slightly better but I'm not a native
> > English speaker either.
> > 
> >> +a zero value set in the top byte or it has a non-zero value, it is in memory
> >> +ranges privately owned by a userspace process and it is obtained in one of
> >> +the following ways:
> >> +  - mmap() done by the process itself, where either:
> >> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
> >> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
> >> +      file or "/dev/zero"
> >> +  - a mapping below sbrk(0) done by the process itself
> >> +  - any memory mapped by the kernel in the process's address space during
> >> +    creation and following the restrictions presented above (i.e. data, bss,
> >> +    stack).
> >> +
> >> +The ARM64 Tagged Address ABI is an opt-in feature, and an application can
> >> +control it using the following prctl()s:
> >> +  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
> > 
> > enable or disable (not sure we need the latter but it doesn't heart).
> > 
> > I'd add the arg2 description here as well.
> > 
> 
> Good point I missed this.
> 
> >> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> >> +                             Address ABI.

For both prctls, you should also document the zeroed arguments up to
arg5 (unless we get rid of the enforcement and just ignore them).


Is there a canonical way to detect whether this whole API/ABI is
available?  (i.e., try to call this prctl / check for an HWCAP bit,
etc.)

[...]

Cheers
---Dave
Szabolcs Nagy June 13, 2019, 12:28 p.m. UTC | #8
On 13/06/2019 12:16, Vincenzo Frascino wrote:
> Hi Szabolcs,
> 
> thank you for your review.
> 
> On 13/06/2019 11:14, Szabolcs Nagy wrote:
>> On 13/06/2019 10:20, Catalin Marinas wrote:
>>> Hi Szabolcs,
>>>
>>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>>> +2. ARM64 Tagged Address ABI
>>>>> +---------------------------
>>>>> +
>>>>> +From the kernel syscall interface prospective, we define, for the purposes
>>>>                                      ^^^^^^^^^^^
>>>> perspective
>>>>
>>>>> +of this document, a "valid tagged pointer" as a pointer that either it has
>>>>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>>>>> +ranges privately owned by a userspace process and it is obtained in one of
>>>>> +the following ways:
>>>>> +  - mmap() done by the process itself, where either:
>>>>> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
>>>>> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
>>>>> +      file or "/dev/zero"
>>>>
>>>> this does not make it clear if MAP_FIXED or other flags are valid
>>>> (there are many map flags i don't know, but at least fixed should work
>>>> and stack/growsdown. i'd expect anything that's not incompatible with
>>>> private|anon to work).
>>>
>>> Just to clarify, this document tries to define the memory ranges from
>>> where tagged addresses can be passed into the kernel in the context
>>> of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
>>> should not affect this.
>>
>> yes, so either the text should list MAP_* flags that don't affect
>> the pointer tagging semantics or specify private|anon mapping
>> with different wording.
>>
> 
> Good point. Could you please propose a wording that would be suitable for this case?

i don't know all the MAP_ magic, but i think it's enough to change
the "flags =" to

* flags have MAP_PRIVATE and MAP_ANONYMOUS set or
* flags have MAP_PRIVATE set and the file descriptor refers to...
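
To make the difference between "flags =" and "flags have ... set" concrete, here is a minimal sketch of a bitmask test along the lines of the wording above. The helper name flags_allow_tagging() and its fd_is_regular_or_dev_zero parameter are hypothetical and only serve the illustration; nothing like this exists in the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdbool.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of the patch): returns true when the mmap()
 * flags match the proposed wording. The mapping type must be MAP_PRIVATE,
 * and the mapping is either anonymous or, as reported by the caller, backed
 * by a regular file or /dev/zero. Other bits such as MAP_FIXED or
 * MAP_GROWSDOWN are deliberately ignored.
 */
static bool flags_allow_tagging(int flags, bool fd_is_regular_or_dev_zero)
{
	/* Compare the mapping type instead of testing flags for equality. */
	if ((flags & MAP_TYPE) != MAP_PRIVATE)
		return false;

	if (flags & MAP_ANONYMOUS)
		return true;

	return fd_is_regular_or_dev_zero;
}
```

With this reading, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED qualifies, while MAP_SHARED | MAP_ANONYMOUS does not.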


>>>>> +  - a mapping below sbrk(0) done by the process itself
>>>>
>>>> doesn't the mmap rule cover this?
>>>
>>> IIUC it doesn't cover it as that's memory mapped by the kernel
>>> automatically on access vs a pointer returned by mmap(). The statement
>>> above talks about how the address is obtained by the user.
>>
>> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
>> that happens to be below the heap area.
>>
>> i think "below sbrk(0)" is not the best term to use: there
>> may be address range below the heap area that can be mmapped
>> and thus below sbrk(0) and sbrk is a posix api not a linux
>> syscall, the libc can implement it with mmap or whatever.
>>
>> i'm not sure what the right term for 'heap area' is
>> (the address range between syscall(__NR_brk,0) at
>> program startup and its current value?)
>>
> 
> I used sbrk(0) with the meaning of "end of the process's data segment" not
> implying that this is a syscall, but just as a useful way to identify the mapping.
> I agree that it is a posix function implemented by libc but when it is used with
> 0 finds the current location of the program break, which can be changed by brk()
> and depending on the new address passed to this syscall can have the effect of
> allocating or deallocating memory.
> 
> Will changing sbrk(0) with "end of the process's data segment" make it more clear?

i don't understand what's the relevance of the *end*
of the data segment.

i'd expect the text to say something about the address
range of the data segment.

i can do

mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);

and it will be below the end of the data segment.

> 
> I will add what you are suggesting about the heap area.
>
Catalin Marinas June 13, 2019, 12:28 p.m. UTC | #9
On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
> On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
> > On 12/06/2019 16:35, Catalin Marinas wrote:
> > > On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
> > >> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> > >> +                             Address ABI.
[...]
> Is there a canonical way to detect whether this whole API/ABI is
> available?  (i.e., try to call this prctl / check for an HWCAP bit,
> etc.)

The canonical way is a prctl() call. HWCAP doesn't make sense since it's
not a hardware feature. If you really want a different way of detecting
this (which I don't think it's worth), we can reinstate the AT_FLAGS
bit.
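
For illustration, detection through prctl() could look roughly like the sketch below. The numeric fallbacks (55, 56, bit 0) are assumptions taken from what later appeared in mainline <linux/prctl.h>; this v4 series may use different values. The helper names are hypothetical. The unused arguments are passed as zero, in line with the comment about documenting the zeroed arguments up to arg5.

```c
#include <sys/prctl.h>

/* Assumed values, used only if the installed headers lack them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical helper: returns 1 if the relaxed ABI is enabled for the
 * calling thread, 0 if the kernel knows the prctl but the ABI is off,
 * and -1 if the prctl fails (for example on a kernel without it).
 */
static int tagged_addr_status(void)
{
	int ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0UL, 0UL, 0UL, 0UL);

	if (ret < 0)
		return -1;
	return (ret & PR_TAGGED_ADDR_ENABLE) ? 1 : 0;
}

/* Hypothetical helper: returns 1 if the opt-in succeeded, 0 otherwise. */
static int tagged_addr_try_enable(void)
{
	return prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
		     0UL, 0UL, 0UL) == 0;
}
```

A caller would only start tagging pointers passed to syscalls after tagged_addr_try_enable() returned 1.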
Dave Martin June 13, 2019, 1:23 p.m. UTC | #10
On Thu, Jun 13, 2019 at 01:28:21PM +0100, Catalin Marinas wrote:
> On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
> > On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
> > > On 12/06/2019 16:35, Catalin Marinas wrote:
> > > > On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
> > > >> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> > > >> +                             Address ABI.
> [...]
> > Is there a canonical way to detect whether this whole API/ABI is
> > available?  (i.e., try to call this prctl / check for an HWCAP bit,
> > etc.)
> 
> The canonical way is a prctl() call. HWCAP doesn't make sense since it's
> not a hardware feature. If you really want a different way of detecting
> this (which I don't think it's worth), we can reinstate the AT_FLAGS
> bit.

Sure, I think this probably makes sense -- I'm still getting my head around
which parts of the design are directly related to MTE and which aren't.

I was a bit concerned about the interaction between
PR_SET_TAGGED_ADDR_CTRL and the sysctl: the caller might conclude that
this API is unavailable when actually tagged addresses are stuck on.

I'm not sure whether this matters, but it's a bit weird.

One option would be to change the semantics, so that the sysctl just
forbids turning tagging from off to on.  Alternatively, we could return
a different error code to distinguish this case.

Or we just leave it as proposed.

Cheers
---Dave
Vincenzo Frascino June 13, 2019, 2:03 p.m. UTC | #11
On 13/06/2019 13:28, Szabolcs Nagy wrote:
> On 13/06/2019 12:16, Vincenzo Frascino wrote:
>> Hi Szabolcs,
>>
>> thank you for your review.
>>
>> On 13/06/2019 11:14, Szabolcs Nagy wrote:
>>> On 13/06/2019 10:20, Catalin Marinas wrote:
>>>> Hi Szabolcs,
>>>>
>>>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>>>> +2. ARM64 Tagged Address ABI
>>>>>> +---------------------------
>>>>>> +
>>>>>> +From the kernel syscall interface prospective, we define, for the purposes
>>>>>                                      ^^^^^^^^^^^
>>>>> perspective
>>>>>
>>>>>> +of this document, a "valid tagged pointer" as a pointer that either it has
>>>>>> +a zero value set in the top byte or it has a non-zero value, it is in memory
>>>>>> +ranges privately owned by a userspace process and it is obtained in one of
>>>>>> +the following ways:
>>>>>> +  - mmap() done by the process itself, where either:
>>>>>> +    * flags = MAP_PRIVATE | MAP_ANONYMOUS
>>>>>> +    * flags = MAP_PRIVATE and the file descriptor refers to a regular
>>>>>> +      file or "/dev/zero"
>>>>>
>>>>> this does not make it clear if MAP_FIXED or other flags are valid
>>>>> (there are many map flags i don't know, but at least fixed should work
>>>>> and stack/growsdown. i'd expect anything that's not incompatible with
>>>>> private|anon to work).
>>>>
>>>> Just to clarify, this document tries to define the memory ranges from
>>>> where tagged addresses can be passed into the kernel in the context
>>>> of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN
>>>> should not affect this.
>>>
>>> yes, so either the text should list MAP_* flags that don't affect
>>> the pointer tagging semantics or specify private|anon mapping
>>> with different wording.
>>>
>>
>> Good point. Could you please propose a wording that would be suitable for this case?
> 
> i don't know all the MAP_ magic, but i think it's enough to change
> the "flags =" to
> 
> * flags have MAP_PRIVATE and MAP_ANONYMOUS set or
> * flags have MAP_PRIVATE set and the file descriptor refers to...
> 
> 

Fine by me. I will add it in the next iteration.

>>>>>> +  - a mapping below sbrk(0) done by the process itself
>>>>>
>>>>> doesn't the mmap rule cover this?
>>>>
>>>> IIUC it doesn't cover it as that's memory mapped by the kernel
>>>> automatically on access vs a pointer returned by mmap(). The statement
>>>> above talks about how the address is obtained by the user.
>>>
>>> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
>>> that happens to be below the heap area.
>>>
>>> i think "below sbrk(0)" is not the best term to use: there
>>> may be address range below the heap area that can be mmapped
>>> and thus below sbrk(0) and sbrk is a posix api not a linux
>>> syscall, the libc can implement it with mmap or whatever.
>>>
>>> i'm not sure what the right term for 'heap area' is
>>> (the address range between syscall(__NR_brk,0) at
>>> program startup and its current value?)
>>>
>>
>> I used sbrk(0) with the meaning of "end of the process's data segment" not
>> implying that this is a syscall, but just as a useful way to identify the mapping.
>> I agree that it is a posix function implemented by libc but when it is used with
>> 0 finds the current location of the program break, which can be changed by brk()
>> and depending on the new address passed to this syscall can have the effect of
>> allocating or deallocating memory.
>>
>> Will changing sbrk(0) with "end of the process's data segment" make it more clear?
> 
> i don't understand what's the relevance of the *end*
> of the data segment.
> 
> i'd expect the text to say something about the address
> range of the data segment.
> 
> i can do
> 
> mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
> 
> and it will be below the end of the data segment.
>

As far as I understand the data segment "lives" below the program break, hence
it is a way of describing the range from which the user can obtain a valid
tagged pointer.

Said that, I am not really sure on how do you want me to document this (my aim
is for this to be clear to the userspace developers). Could you please propose
something?

>>
>> I will add what you are suggesting about the heap area.
>>
Szabolcs Nagy June 13, 2019, 3:32 p.m. UTC | #12
On 13/06/2019 15:03, Vincenzo Frascino wrote:
> On 13/06/2019 13:28, Szabolcs Nagy wrote:
>> On 13/06/2019 12:16, Vincenzo Frascino wrote:
>>> On 13/06/2019 11:14, Szabolcs Nagy wrote:
>>>> On 13/06/2019 10:20, Catalin Marinas wrote:
>>>>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>>>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>>>>> +  - a mapping below sbrk(0) done by the process itself
>>>>>>
>>>>>> doesn't the mmap rule cover this?
>>>>>
>>>>> IIUC it doesn't cover it as that's memory mapped by the kernel
>>>>> automatically on access vs a pointer returned by mmap(). The statement
>>>>> above talks about how the address is obtained by the user.
>>>>
>>>> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
>>>> that happens to be below the heap area.
>>>>
>>>> i think "below sbrk(0)" is not the best term to use: there
>>>> may be address range below the heap area that can be mmapped
>>>> and thus below sbrk(0) and sbrk is a posix api not a linux
>>>> syscall, the libc can implement it with mmap or whatever.
>>>>
>>>> i'm not sure what the right term for 'heap area' is
>>>> (the address range between syscall(__NR_brk,0) at
>>>> program startup and its current value?)
>>>>
>>>
>>> I used sbrk(0) with the meaning of "end of the process's data segment" not
>>> implying that this is a syscall, but just as a useful way to identify the mapping.
>>> I agree that it is a posix function implemented by libc but when it is used with
>>> 0 finds the current location of the program break, which can be changed by brk()
>>> and depending on the new address passed to this syscall can have the effect of
>>> allocating or deallocating memory.
>>>
>>> Will changing sbrk(0) with "end of the process's data segment" make it more clear?
>>
>> i don't understand what's the relevance of the *end*
>> of the data segment.
>>
>> i'd expect the text to say something about the address
>> range of the data segment.
>>
>> i can do
>>
>> mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
>>
>> and it will be below the end of the data segment.
>>
> 
> As far as I understand the data segment "lives" below the program break, hence
> it is a way of describing the range from which the user can obtain a valid
> tagged pointer.
> 
> Said that, I am not really sure on how do you want me to document this (my aim
> is for this to be clear to the userspace developers). Could you please propose
> something?

[...], it is in the memory ranges privately owned by a
userspace process and it is obtained in one of the
following ways:

- mmap done by the process itself, [...]

- brk syscall done by the process itself.
  (i.e. the heap area between the initial location
  of the program break at process creation and its
  current location.)

- any memory mapped by the kernel [...]

the data segment that's part of the process image is
already covered by the last point.
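
As a rough sketch of the "heap area" in the second point: the range runs from the program break recorded at process start to the current break. The helper names below are hypothetical, the initial break has to be captured by the caller as early as possible, and the address passed in is assumed to be untagged already.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Ask the kernel directly for the current program break; on Linux,
 * brk(0) does not move the break and returns its current value. */
static uintptr_t raw_program_break(void)
{
	return (uintptr_t)syscall(SYS_brk, 0);
}

/*
 * Hypothetical helper: true if p lies in the brk heap area, i.e. between
 * the initial break (captured by the caller at startup) and the current
 * break. p must be an untagged address.
 */
static bool in_brk_heap(const void *p, uintptr_t initial_break)
{
	uintptr_t addr = (uintptr_t)p;

	return addr >= initial_break && addr < raw_program_break();
}
```

An mmap(MAP_FIXED) region that merely sits below the current break fails this test as long as it is also below the initial break, which is the distinction the wording above is trying to draw.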
Vincenzo Frascino June 13, 2019, 3:35 p.m. UTC | #13
On 13/06/2019 16:32, Szabolcs Nagy wrote:
> On 13/06/2019 15:03, Vincenzo Frascino wrote:
>> On 13/06/2019 13:28, Szabolcs Nagy wrote:
>>> On 13/06/2019 12:16, Vincenzo Frascino wrote:
>>>> On 13/06/2019 11:14, Szabolcs Nagy wrote:
>>>>> On 13/06/2019 10:20, Catalin Marinas wrote:
>>>>>> On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
>>>>>>> On 12/06/2019 15:21, Vincenzo Frascino wrote:
>>>>>>>> +  - a mapping below sbrk(0) done by the process itself
>>>>>>>
>>>>>>> doesn't the mmap rule cover this?
>>>>>>
>>>>>> IIUC it doesn't cover it as that's memory mapped by the kernel
>>>>>> automatically on access vs a pointer returned by mmap(). The statement
>>>>>> above talks about how the address is obtained by the user.
>>>>>
>>>>> ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED)
>>>>> that happens to be below the heap area.
>>>>>
>>>>> i think "below sbrk(0)" is not the best term to use: there
>>>>> may be address range below the heap area that can be mmapped
>>>>> and thus below sbrk(0) and sbrk is a posix api not a linux
>>>>> syscall, the libc can implement it with mmap or whatever.
>>>>>
>>>>> i'm not sure what the right term for 'heap area' is
>>>>> (the address range between syscall(__NR_brk,0) at
>>>>> program startup and its current value?)
>>>>>
>>>>
>>>> I used sbrk(0) with the meaning of "end of the process's data segment" not
>>>> implying that this is a syscall, but just as a useful way to identify the mapping.
>>>> I agree that it is a posix function implemented by libc but when it is used with
>>>> 0 finds the current location of the program break, which can be changed by brk()
>>>> and depending on the new address passed to this syscall can have the effect of
>>>> allocating or deallocating memory.
>>>>
>>>> Will changing sbrk(0) with "end of the process's data segment" make it more clear?
>>>
>>> i don't understand what's the relevance of the *end*
>>> of the data segment.
>>>
>>> i'd expect the text to say something about the address
>>> range of the data segment.
>>>
>>> i can do
>>>
>>> mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
>>>
>>> and it will be below the end of the data segment.
>>>
>>
>> As far as I understand the data segment "lives" below the program break, hence
>> it is a way of describing the range from which the user can obtain a valid
>> tagged pointer.
>> 
>> Said that, I am not really sure on how do you want me to document this (my aim
>> is for this to be clear to the userspace developers). Could you please propose
>> something?
> 
> [...], it is in the memory ranges privately owned by a
> userspace process and it is obtained in one of the
> following ways:
> 
> - mmap done by the process itself, [...]
> 
> - brk syscall done by the process itself.
>   (i.e. the heap area between the initial location
>   of the program break at process creation and its
>   current location.)
> 
> - any memory mapped by the kernel [...]
> 
> the data segment that's part of the process image is
> already covered by the last point.
> 

Thanks Szabolcs, I will update the document accordingly.
Catalin Marinas June 13, 2019, 3:39 p.m. UTC | #14
On Thu, Jun 13, 2019 at 02:23:43PM +0100, Dave P Martin wrote:
> On Thu, Jun 13, 2019 at 01:28:21PM +0100, Catalin Marinas wrote:
> > On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
> > > On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
> > > > On 12/06/2019 16:35, Catalin Marinas wrote:
> > > > > On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
> > > > >> +  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
> > > > >> +                             Address ABI.
> > [...]
> > > Is there a canonical way to detect whether this whole API/ABI is
> > > available?  (i.e., try to call this prctl / check for an HWCAP bit,
> > > etc.)
> > 
> > The canonical way is a prctl() call. HWCAP doesn't make sense since it's
> > not a hardware feature. If you really want a different way of detecting
> > this (which I don't think it's worth), we can reinstate the AT_FLAGS
> > bit.
> 
> > Sure, I think this probably makes sense -- I'm still getting my head around
> which parts of the design are directly related to MTE and which aren't.
> 
> I was a bit concerned about the interaction between
> PR_SET_TAGGED_ADDR_CTRL and the sysctl: the caller might conclude that
> this API is unavailable when actually tagged addresses are stuck on.
> 
> I'm not sure whether this matters, but it's a bit weird.
> 
> One option would be to change the semantics, so that the sysctl just
> forbids turning tagging from off to on.  Alternatively, we could return
> a different error code to distinguish this case.

This is the intention, just to forbid turning tagging on. We could
return -EPERM instead, though my original intent was to simply pretend
that the prctl does not exist like in an older kernel version.

Patch

diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
new file mode 100644
index 000000000000..96e149e2c55c
--- /dev/null
+++ b/Documentation/arm64/tagged-address-abi.txt
@@ -0,0 +1,111 @@ 
+ARM64 TAGGED ADDRESS ABI
+========================
+
+This document describes the usage and semantics of the Tagged Address
+ABI on arm64.
+
+1. Introduction
+---------------
+
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel,
+hence the userspace (EL0) is allowed to set a non-zero value in the top
+byte but the resulting pointers are not allowed at the user-kernel syscall
+ABI boundary.
+
+This document describes a relaxation of the ABI with which it is possible
+to pass tagged pointers to the syscalls, when these pointers are in
+memory ranges obtained as described in paragraph 2.
+
+Since it is not desirable to relax the ABI to allow tagged user addresses
+into the kernel indiscriminately, arm64 provides a new sysctl interface
+(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
+enabling the relaxed ABI and a new prctl() interface that can be used to
+enable or disable the relaxed ABI.
+
+The sysctl is meant also for testing purposes in order to provide a simple
+way for the userspace to verify the return error checking of the prctl()
+command without having to reconfigure the kernel.
+
+The ABI properties are inherited by threads of the same application and
+fork()'ed children but cleared when a new process is spawn (execve()).
+
+2. ARM64 Tagged Address ABI
+---------------------------
+
+From the kernel syscall interface prospective, we define, for the purposes
+of this document, a "valid tagged pointer" as a pointer that either it has
+a zero value set in the top byte or it has a non-zero value, it is in memory
+ranges privately owned by a userspace process and it is obtained in one of
+the following ways:
+  - mmap() done by the process itself, where either:
+    * flags = MAP_PRIVATE | MAP_ANONYMOUS
+    * flags = MAP_PRIVATE and the file descriptor refers to a regular
+      file or "/dev/zero"
+  - a mapping below sbrk(0) done by the process itself
+  - any memory mapped by the kernel in the process's address space during
+    creation and following the restrictions presented above (i.e. data, bss,
+    stack).
+
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can
+control it using the following prctl()s:
+  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
+  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
+                             Address ABI.
+
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an application,
+the ABI guarantees the following behaviours:
+
+  - Every current or newly introduced syscall can accept any valid tagged
+    pointers.
+
+  - If a non valid tagged pointer is passed to a syscall then the behaviour
+    is undefined.
+
+  - Every valid tagged pointer is expected to work as an untagged one.
+
+  - The kernel preserves any valid tagged pointers and returns them to the
+    userspace unchanged in all the cases except the ones documented in the
+    "Preserving tags" paragraph of tagged-pointers.txt.
+
+A definition of the meaning of tagged pointers on arm64 can be found in:
+Documentation/arm64/tagged-pointers.txt.
+
+3. ARM64 Tagged Address ABI Exceptions
+--------------------------------------
+
+The behaviours described in paragraph 2, with particular reference to the
+acceptance by the syscalls of any valid tagged pointer are not applicable
+to the following cases:
+  - mmap() addr parameter.
+  - mremap() new_address parameter.
+  - prctl_set_mm() struct prctl_map fields.
+  - prctl_set_mm_map() struct prctl_map fields.
+
+4. Example of correct usage
+---------------------------
+
+int main(void)
+{
+	static int tbi_enabled = 0;
+	unsigned long tag = 0;
+
+	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
+		  0, 0, 0) == 0)
+		tbi_enabled = 1;
+
+	if (ptr == MAP_FAILED)
+		return -1;
+
+	if (tbi_enabled)
+		tag = rand() & 0xff;
+
+	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+	*ptr = 'a';
+
+	...
+}
+
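
The example in the patch uses TAG_SHIFT without defining it. A minimal sketch of the pointer arithmetic involved, assuming the tag occupies the top byte (bits 63:56, so TAG_SHIFT is 56) and a 64-bit uintptr_t, might look like this. The helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumption: with TBI the tag lives in the top byte of a 64-bit pointer. */
#define TAG_SHIFT 56
#define TAG_MASK  ((uintptr_t)0xff << TAG_SHIFT)

/* Return ptr with its top byte replaced by tag. */
static inline void *set_tag(void *ptr, uint8_t tag)
{
	uintptr_t addr = (uintptr_t)ptr & ~TAG_MASK;

	return (void *)(addr | ((uintptr_t)tag << TAG_SHIFT));
}

/* Return ptr with its top byte cleared, e.g. before handing an address
 * to an interface listed under the ABI exceptions. */
static inline void *clear_tag(void *ptr)
{
	return (void *)((uintptr_t)ptr & ~TAG_MASK);
}

/* Extract the tag stored in the top byte of ptr. */
static inline uint8_t get_tag(const void *ptr)
{
	return (uint8_t)((uintptr_t)ptr >> TAG_SHIFT);
}
```

Dereferencing the result of set_tag() only works on hardware that ignores the top byte (TBI); on other architectures these helpers are safe to call, but the tagged pointer must not be accessed.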