diff mbox series

[RFC] arm: Don't trap conditional UDF instructions

Message ID b2042f19-9477-272c-0989-d6cab1572cca@strupe.net (mailing list archive)
State New, archived
Headers show
Series [RFC] arm: Don't trap conditional UDF instructions | expand

Commit Message

Fredrik Strupe May 13, 2020, 3:41 p.m. UTC
Hi,

This is more of a question than a patch, but I hope the attached patch makes
the issue a bit clearer.

The arm port of Linux supports hooking/trapping of undefined instructions. Some
parts of the code use this to trap UDF instructions with certain immediates in
order to use them for other purposes, like 'UDF #16' which is equivalent to a
BKPT instruction in A32.

Moreover, most of the undef hooks on UDF instructions assume that UDF is
conditional and mask out the condition prefix during matching. The attached
patch shows the locations where this happens. However, the Arm architecture
reference manual explicitly states that UDF is *not* conditional, making
any instruction encoding with a condition prefix other than 0xe (always
execute) unallocated.

My question is whether trapping invalid UDF instructions is intentional or
rather a bug resulting from an oversight. While the unallocated instructions
are not used for anything else and trapping them in addition to the legal UDF
probably doesn't hurt, it seems to be slightly inconsistent with the ISA
specification.

Sort of related, when looking for these kind of traps I came over the following
code in arch/arm/probes/uprobes/core.c:

    bpinsn = UPROBE_SWBP_ARM_INSN & 0x0fffffff;
    if (insn >= 0xe0000000)
        bpinsn |= 0xe0000000;  /* Unconditional instruction */
    else
        bpinsn |= insn & 0xf0000000;  /* Copy condition from insn */

    auprobe->bpinsn = bpinsn;

Where UPROBE_SWBP_ARM_INSN is equal to the encoding of 'UDF #25'. I might be
mistaken, but it seems like the condition of the UDF instruction is set from
another instruction with the expectation that it will execute conditionally,
which is not the case in practice.

While the above code is not directly related to the code in the patch and
possibly a more clear-cut bug, it might indicate that the conditional UDF
hooks also stem from a slight misunderstanding.

So just to reiterate the question: is trapping UDF instructions with invalid
condition prefixes considered a bug or not?

Thanks,
Fredrik

---
UDF instructions with a conditional prefix other than 0xe (always
execute) are unallocated as opposed to permanently undefined. They
should therefore not be used for instruction hooking/trapping.

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
---
 arch/arm/kernel/ptrace.c       |  6 +++---
 arch/arm/probes/kprobes/core.c |  2 +-
 arch/arm/probes/kprobes/core.h |  2 +-
 arch/arm/probes/uprobes/core.c | 16 +++++++---------
 4 files changed, 12 insertions(+), 14 deletions(-)

Comments

Russell King (Oracle) May 13, 2020, 6:12 p.m. UTC | #1
On Wed, May 13, 2020 at 05:41:58PM +0200, Fredrik Strupe wrote:
> Hi,
> 
> This is more of a question than a patch, but I hope the attached patch makes
> the issue a bit clearer.
> 
> The arm port of Linux supports hooking/trapping of undefined instructions. Some
> parts of the code use this to trap UDF instructions with certain immediates in
> order to use them for other purposes, like 'UDF #16' which is equivalent to a
> BKPT instruction in A32.
> 
> Moreover, most of the undef hooks on UDF instructions assume that UDF is
> conditional and mask out the condition prefix during matching. The attached
> patch shows the locations where this happens. However, the Arm architecture
> reference manual explicitly states that UDF is *not* conditional, making
> any instruction encoding with a condition prefix other than 0xe (always
> execute) unallocated.

The latest version of the ARM architecture reference manual may say
that, but earlier versions say different things. The latest reference
manual does not apply to earlier architectures, so if you're writing
code to cover multiple different architectures, you must have an
understanding of each of those architectures.

So, from the code:

	ARM:   xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx

From DDI0100E:

3.13.1 Undefined instruction space
       Instructions with the following opcodes are undefined
       instruction space:

       opcode[27:25] = 0b011
       opcode[4] = 1

       31 28 27 26 25 24                                     5 4 3     0
       cond  0  1  1  x  x x x x x x x x x x x x x x x x x x x 1 x x x x

So, in this version of the architecture, undefined instructions may
be conditional - and indeed that used to be the case.  The condition
code was always respected, and cond=1111 meant "never" (NV).

Hence, trapping them if the condition code is not 1110 (AL) is
entirely reasonable, legal and safe.  If an ARM CPU defines an
instruction coding that matches the above, then it won't take the
undefined instruction trap, and we'll never see it.

Now, as for UDF usage in the kernel, it may be quite correct that we
always use the AL condition code for them, but it would be very odd
for there to be an instruction implemented with a different (non-NV)
condition code that can't also have it's AL condition code encoding.
You could never execute such an instruction unconditionally.
Fredrik Strupe May 13, 2020, 6:56 p.m. UTC | #2
On 13.05.2020 20:12, Russell King - ARM Linux admin wrote:
> On Wed, May 13, 2020 at 05:41:58PM +0200, Fredrik Strupe wrote:
>> Hi,
>>
>> This is more of a question than a patch, but I hope the attached patch makes
>> the issue a bit clearer.
>>
>> The arm port of Linux supports hooking/trapping of undefined instructions. Some
>> parts of the code use this to trap UDF instructions with certain immediates in
>> order to use them for other purposes, like 'UDF #16' which is equivalent to a
>> BKPT instruction in A32.
>>
>> Moreover, most of the undef hooks on UDF instructions assume that UDF is
>> conditional and mask out the condition prefix during matching. The attached
>> patch shows the locations where this happens. However, the Arm architecture
>> reference manual explicitly states that UDF is *not* conditional, making
>> any instruction encoding with a condition prefix other than 0xe (always
>> execute) unallocated.
>
> The latest version of the ARM architecture reference manual may say
> that, but earlier versions say different things. The latest reference
> manual does not apply to earlier architectures, so if you're writing
> code to cover multiple different architectures, you must have an
> understanding of each of those architectures.
>
> So, from the code:
>
> 	ARM:   xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
>
> From DDI0100E:
>
> 3.13.1 Undefined instruction space
>        Instructions with the following opcodes are undefined
>        instruction space:
>
>        opcode[27:25] = 0b011
>        opcode[4] = 1
>
>        31 28 27 26 25 24                                     5 4 3     0
>        cond  0  1  1  x  x x x x x x x x x x x x x x x x x x x 1 x x x x
>
> So, in this version of the architecture, undefined instructions may
> be conditional - and indeed that used to be the case.  The condition
> code was always respected, and cond=1111 meant "never" (NV).
>
> Hence, trapping them if the condition code is not 1110 (AL) is
> entirely reasonable, legal and safe.  If an ARM CPU defines an
> instruction coding that matches the above, then it won't take the
> undefined instruction trap, and we'll never see it.
>
> Now, as for UDF usage in the kernel, it may be quite correct that we
> always use the AL condition code for them, but it would be very odd
> for there to be an instruction implemented with a different (non-NV)
> condition code that can't also have it's AL condition code encoding.
> You could never execute such an instruction unconditionally.
>

That makes sense. Thank you very much for a great answer!

Fredrik
Robin Murphy May 13, 2020, 8:57 p.m. UTC | #3
On 2020-05-13 7:56 pm, Fredrik Strupe wrote:
> On 13.05.2020 20:12, Russell King - ARM Linux admin wrote:
>> On Wed, May 13, 2020 at 05:41:58PM +0200, Fredrik Strupe wrote:
>>> Hi,
>>>
>>> This is more of a question than a patch, but I hope the attached patch makes
>>> the issue a bit clearer.
>>>
>>> The arm port of Linux supports hooking/trapping of undefined instructions. Some
>>> parts of the code use this to trap UDF instructions with certain immediates in
>>> order to use them for other purposes, like 'UDF #16' which is equivalent to a
>>> BKPT instruction in A32.
>>>
>>> Moreover, most of the undef hooks on UDF instructions assume that UDF is
>>> conditional and mask out the condition prefix during matching. The attached
>>> patch shows the locations where this happens. However, the Arm architecture
>>> reference manual explicitly states that UDF is *not* conditional, making
>>> any instruction encoding with a condition prefix other than 0xe (always
>>> execute) unallocated.
>>
>> The latest version of the ARM architecture reference manual may say
>> that, but earlier versions say different things. The latest reference
>> manual does not apply to earlier architectures, so if you're writing
>> code to cover multiple different architectures, you must have an
>> understanding of each of those architectures.
>>
>> So, from the code:
>>
>> 	ARM:   xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
>>
>>  From DDI0100E:
>>
>> 3.13.1 Undefined instruction space
>>         Instructions with the following opcodes are undefined
>>         instruction space:
>>
>>         opcode[27:25] = 0b011
>>         opcode[4] = 1
>>
>>         31 28 27 26 25 24                                     5 4 3     0
>>         cond  0  1  1  x  x x x x x x x x x x x x x x x x x x x 1 x x x x
>>
>> So, in this version of the architecture, undefined instructions may
>> be conditional - and indeed that used to be the case.  The condition
>> code was always respected, and cond=1111 meant "never" (NV).
>>
>> Hence, trapping them if the condition code is not 1110 (AL) is
>> entirely reasonable, legal and safe.  If an ARM CPU defines an
>> instruction coding that matches the above, then it won't take the
>> undefined instruction trap, and we'll never see it.
>>
>> Now, as for UDF usage in the kernel, it may be quite correct that we
>> always use the AL condition code for them, but it would be very odd
>> for there to be an instruction implemented with a different (non-NV)
>> condition code that can't also have it's AL condition code encoding.
>> You could never execute such an instruction unconditionally.
>>
> 
> That makes sense. Thank you very much for a great answer!

The other subtlety is that UDF is a *mnemonic* for a specific encoding 
with cond=0b1110, thus "conditional UDF" is an oxymoron. You can still 
have a conditional instruction from the "permanently undefined" space 
(although on Armv7-A it's implementation-defined whether it causes an 
exception regardless of the condition), it's just not a UDF.

Robin.
diff mbox series

Patch

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index b606cded90cd..a425691a41ad 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -50,7 +50,7 @@ 
  * reference manual guarantees that the following instruction space
  * will produce an undefined instruction exception on all CPUs:
  *
- *  ARM:   xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
+ *  ARM:   1110 0111 1111 xxxx xxxx xxxx 1111 xxxx
  *  Thumb: 1101 1110 xxxx xxxx
  */
 #define BREAKINST_ARM	0xe7f001f0
@@ -211,8 +211,8 @@  static int break_trap(struct pt_regs *regs, unsigned int instr)
 }
 
 static struct undef_hook arm_break_hook = {
-	.instr_mask	= 0x0fffffff,
-	.instr_val	= 0x07f001f0,
+	.instr_mask	= 0xffffffff,
+	.instr_val	= 0xe7f001f0,
 	.cpsr_mask	= PSR_T_BIT,
 	.cpsr_val	= 0,
 	.fn		= break_trap,
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 90b5bc723c83..77b8baa6638f 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -525,7 +525,7 @@  static struct undef_hook kprobes_thumb32_break_hook = {
 #else  /* !CONFIG_THUMB2_KERNEL */
 
 static struct undef_hook kprobes_arm_break_hook = {
-	.instr_mask	= 0x0fffffff,
+	.instr_mask	= 0xffffffff,
 	.instr_val	= KPROBE_ARM_BREAKPOINT_INSTRUCTION,
 	.cpsr_mask	= MODE_MASK,
 	.cpsr_val	= SVC_MODE,
diff --git a/arch/arm/probes/kprobes/core.h b/arch/arm/probes/kprobes/core.h
index c3db468650ce..ee77ae553690 100644
--- a/arch/arm/probes/kprobes/core.h
+++ b/arch/arm/probes/kprobes/core.h
@@ -18,7 +18,7 @@ 
  * These undefined instructions must be unique and
  * reserved solely for kprobes' use.
  */
-#define KPROBE_ARM_BREAKPOINT_INSTRUCTION	0x07f001f8
+#define KPROBE_ARM_BREAKPOINT_INSTRUCTION	0xe7f001f8
 #define KPROBE_THUMB16_BREAKPOINT_INSTRUCTION	0xde18
 #define KPROBE_THUMB32_BREAKPOINT_INSTRUCTION	0xf7f0a018
 
diff --git a/arch/arm/probes/uprobes/core.c b/arch/arm/probes/uprobes/core.c
index c4b49b322e8a..f574356284f4 100644
--- a/arch/arm/probes/uprobes/core.c
+++ b/arch/arm/probes/uprobes/core.c
@@ -22,8 +22,7 @@ 
 
 bool is_swbp_insn(uprobe_opcode_t *insn)
 {
-	return (__mem_to_opcode_arm(*insn) & 0x0fffffff) ==
-		(UPROBE_SWBP_ARM_INSN & 0x0fffffff);
+	return __mem_to_opcode_arm(*insn) == UPROBE_SWBP_ARM_INSN;
 }
 
 int set_swbp(struct arch_uprobe *auprobe, struct mm_struct *mm,
@@ -186,10 +185,9 @@  static int uprobe_trap_handler(struct pt_regs *regs, unsigned int instr)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	instr &= 0x0fffffff;
-	if (instr == (UPROBE_SWBP_ARM_INSN & 0x0fffffff))
+	if (instr == UPROBE_SWBP_ARM_INSN)
 		uprobe_pre_sstep_notifier(regs);
-	else if (instr == (UPROBE_SS_ARM_INSN & 0x0fffffff))
+	else if (instr == UPROBE_SS_ARM_INSN)
 		uprobe_post_sstep_notifier(regs);
 	local_irq_restore(flags);
 
@@ -202,16 +200,16 @@  unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
 }
 
 static struct undef_hook uprobes_arm_break_hook = {
-	.instr_mask	= 0x0fffffff,
-	.instr_val	= (UPROBE_SWBP_ARM_INSN & 0x0fffffff),
+	.instr_mask	= 0xffffffff,
+	.instr_val	= UPROBE_SWBP_ARM_INSN,
 	.cpsr_mask	= MODE_MASK,
 	.cpsr_val	= USR_MODE,
 	.fn		= uprobe_trap_handler,
 };
 
 static struct undef_hook uprobes_arm_ss_hook = {
-	.instr_mask	= 0x0fffffff,
-	.instr_val	= (UPROBE_SS_ARM_INSN & 0x0fffffff),
+	.instr_mask	= 0xffffffff,
+	.instr_val	= UPROBE_SS_ARM_INSN,
 	.cpsr_mask	= MODE_MASK,
 	.cpsr_val	= USR_MODE,
 	.fn		= uprobe_trap_handler,