
[-next,V3,1/2] riscv: jump_label: Fixup unaligned arch_static_branch function

Message ID 20230126170607.1489141-2-guoren@kernel.org (mailing list archive)
State Accepted
Commit 9ddfc3cd806081ce1f6c9c2f988cbb031f35d28f
Delegated to: Palmer Dabbelt
Series riscv: jump_label: Fixup & Optimization

Checks

Context                                   Check    Description
conchuod/cover_letter                     success  Series has a cover letter
conchuod/tree_selection                   success  Guessed tree name to be fixes
conchuod/fixes_present                    success  Fixes tag present in non-next series
conchuod/maintainers_pattern              success  MAINTAINERS pattern errors before the patch: 13 and now 13
conchuod/verify_signedoff                 success  Signed-off-by tag matches author and committer
conchuod/kdoc                             success  Errors and warnings before: 0 this patch: 0
conchuod/module_param                     success  Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig      success  Errors and warnings before: 2050 this patch: 2050
conchuod/alphanumeric_selects             success  Out of order selects before the patch: 57 and now 57
conchuod/build_rv32_defconfig             success  Build OK
conchuod/dtb_warn_rv64                    success  Errors and warnings before: 2 this patch: 2
conchuod/header_inline                    success  No static functions without inline keyword in header files
conchuod/checkpatch                       warning  WARNING: unnecessary whitespace before a quoted newline
conchuod/source_inline                    success  Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig  success  Build OK
conchuod/verify_fixes                     success  Fixes tag looks correct
conchuod/build_rv64_nommu_virt_defconfig  success  Build OK

Commit Message

Guo Ren Jan. 26, 2023, 5:06 p.m. UTC
From: Andy Chiu <andy.chiu@sifive.com>

Runtime code patching must be done at a naturally aligned address, or we
may execute on a partial instruction.

We ran into problems traced back to static jump functions during
testing. We toggled the tracer at random intervals of 1 to 5 seconds on
a dual-core QEMU setup and found the kernel stuck at a static branch
that jumped to itself.

The reason is that the static branch was 2-byte but not 4-byte aligned.
Then, the kernel would patch the instruction, either J or NOP, with two
half-word stores if the machine does not have efficient unaligned
accesses. Thus, moments exist where half of the NOP mixes with the other
half of the J when transitioning the branch. In our particular case, on
a little-endian machine, the upper half of the NOP was mixed with the
lower part of the J when enabling the branch, resulting in a jump that
jumped to itself. Conversely, it would result in a HINT instruction when
disabling the branch, but it might not be observable.
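
The torn-write scenario above can be reproduced on paper with a small
user-space sketch. It is not kernel code: the helper names are made up
for illustration, the branch offset of 8 bytes used in the note below
is a hypothetical value, and it assumes the lower half-word is the one
that has already been stored when another hart fetches the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* The canonical RISC-V NOP is "addi x0, x0, 0". */
#define RISCV_INSN_NOP 0x00000013u

/* Encode "jal x0, offset", i.e. the J pseudo-instruction (hypothetical helper). */
static uint32_t encode_j(int32_t offset)
{
	uint32_t imm = (uint32_t)offset;

	/* JAL takes an even byte offset within +/- 1 MiB. */
	assert((offset & 1) == 0 && offset >= -(1 << 20) && offset < (1 << 20));

	return (((imm >> 20) & 0x1u) << 31) |	/* imm[20]    */
	       (((imm >> 1) & 0x3ffu) << 21) |	/* imm[10:1]  */
	       (((imm >> 11) & 0x1u) << 20) |	/* imm[11]    */
	       (((imm >> 12) & 0xffu) << 12) |	/* imm[19:12] */
	       0x6fu;				/* rd = x0, opcode = JAL */
}

/* Recover the signed byte offset from a JAL encoding. */
static int32_t decode_j_offset(uint32_t insn)
{
	uint32_t imm = (((insn >> 31) & 0x1u) << 20) |
		       (((insn >> 21) & 0x3ffu) << 1) |
		       (((insn >> 20) & 0x1u) << 11) |
		       (((insn >> 12) & 0xffu) << 12);

	/* Sign-extend the 21-bit immediate. */
	if (imm & (1u << 20))
		return (int32_t)imm - (1 << 21);
	return (int32_t)imm;
}

/*
 * Value seen in memory between the two half-word stores: the low 16 bits
 * already hold the new instruction, the high 16 bits still hold the old one.
 */
static uint32_t torn(uint32_t new_insn, uint32_t old_insn)
{
	return (new_insn & 0xffffu) | (old_insn & 0xffff0000u);
}

/* "addi x0, x0, imm" with a non-zero immediate is a HINT, not the NOP. */
static int is_addi_hint(uint32_t insn)
{
	return (insn & 0x7fu) == 0x13u &&	/* opcode = OP-IMM */
	       ((insn >> 12) & 0x7u) == 0 &&	/* funct3 = ADDI   */
	       ((insn >> 7) & 0x1fu) == 0 &&	/* rd = x0         */
	       insn != RISCV_INSN_NOP;
}
```

With a target 8 bytes ahead, torn(encode_j(8), RISCV_INSN_NOP) gives
0x0000006f, which decodes as "jal x0, 0", the jump-to-self seen when
enabling the branch. The opposite transition,
torn(RISCV_INSN_NOP, encode_j(8)), gives 0x00800013, which
is_addi_hint() classifies as a HINT.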

ARM64 does not have this problem since all instructions must be 4-byte
aligned.

Fixes: ebc00dde8a97 ("riscv: Add jump-label implementation")
Link: https://lore.kernel.org/linux-riscv/20220913094252.3555240-6-andy.chiu@sifive.com/
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/jump_label.h | 2 ++
 1 file changed, 2 insertions(+)

Comments

Björn Töpel Jan. 30, 2023, 11:57 a.m. UTC | #1
guoren@kernel.org writes:

> From: Andy Chiu <andy.chiu@sifive.com>
>
> Runtime code patching must be done at a naturally aligned address, or we
> may execute on a partial instruction.
>
> We have encountered problems traced back to static jump functions during
> the test. We switched the tracer randomly for every 1~5 seconds on a
> dual-core QEMU setup and found the kernel sucking at a static branch
> where it jumps to itself.
>
> The reason is that the static branch was 2-byte but not 4-byte aligned.
> Then, the kernel would patch the instruction, either J or NOP, with two
> half-word stores if the machine does not have efficient unaligned
> accesses. Thus, moments exist where half of the NOP mixes with the other
> half of the J when transitioning the branch. In our particular case, on
> a little-endian machine, the upper half of the NOP was mixed with the
> lower part of the J when enabling the branch, resulting in a jump that
> jumped to itself. Conversely, it would result in a HINT instruction when
> disabling the branch, but it might not be observable.
>
> ARM64 does not have this problem since all instructions must be 4-byte
> aligned.

Reviewed-by: Björn Töpel <bjorn@kernel.org>

Nice catch! And I guess this is an issue for kprobes as well, no?
I.e., in general, replacing 32-bit insns with an ebreak. Is that only
valid for naturally aligned 32-bit insns?

@Guo I don't see the point of doing a series for this, and asking the
maintainers to "pick this patch to stable, and the other for
next". Isn't that just more work for the maintainers/reviewers?


Björn
Guo Ren Jan. 31, 2023, 1:35 p.m. UTC | #2
On Mon, Jan 30, 2023 at 7:57 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> guoren@kernel.org writes:
>
> > From: Andy Chiu <andy.chiu@sifive.com>
> >
> > Runtime code patching must be done at a naturally aligned address, or we
> > may execute on a partial instruction.
> >
> > We have encountered problems traced back to static jump functions during
> > the test. We switched the tracer randomly for every 1~5 seconds on a
> > dual-core QEMU setup and found the kernel sucking at a static branch
> > where it jumps to itself.
> >
> > The reason is that the static branch was 2-byte but not 4-byte aligned.
> > Then, the kernel would patch the instruction, either J or NOP, with two
> > half-word stores if the machine does not have efficient unaligned
> > accesses. Thus, moments exist where half of the NOP mixes with the other
> > half of the J when transitioning the branch. In our particular case, on
> > a little-endian machine, the upper half of the NOP was mixed with the
> > lower part of the J when enabling the branch, resulting in a jump that
> > jumped to itself. Conversely, it would result in a HINT instruction when
> > disabling the branch, but it might not be observable.
> >
> > ARM64 does not have this problem since all instructions must be 4-byte
> > aligned.
>
> Reviewed-by: Björn Töpel <bjorn@kernel.org>
>
> Nice catch! And I guess this is an issue for kprobes as well, no?
> I.e. in general replacing 32b insns with an ebreak. This is only valid
> for natural aligned 32b insns?
>
> @Guo I don't see the point of doing a series for this, and asking the
> maintainers to "pick this patch to stable, and the other for
> next". Isn't that just more work for the maintainers/reviewers?
If the two patches were sent separately, both would be fixes for that
issue and would compete with each other. Making my patch an optimization
means it must depend on this one. That's why I put them in one series.

>
>
> Björn
Björn Töpel Feb. 6, 2023, 8:09 a.m. UTC | #3
Trimming Cc.

Guo Ren <guoren@kernel.org> writes:

>> @Guo I don't see the point of doing a series for this, and asking the
>> maintainers to "pick this patch to stable, and the other for
>> next". Isn't that just more work for the maintainers/reviewers?
> If these two patches are separated, they are all fixup that issue and
> competed with each other. Making my patch an optimization patch must
> depend on it. That's why I put them in one series.

They are not dependent at all, and not "fixup". The first is a fix, and
should go into -fixes ASAP. The other patch is completely stand-alone,
and an optimization (maybe). If that goes in, it's for -next.

Having them as separate patches makes it easier for
reviewers/maintainers. Now, with your approach, there's more work
(cognitive and manual) for others.


Björn
Guo Ren Feb. 6, 2023, 8:41 a.m. UTC | #4
On Mon, Feb 6, 2023 at 4:09 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> Trimming Cc.
>
> Guo Ren <guoren@kernel.org> writes:
>
> >> @Guo I don't see the point of doing a series for this, and asking the
> >> maintainers to "pick this patch to stable, and the other for
> >> next". Isn't that just more work for the maintainers/reviewers?
> > If these two patches are separated, they are all fixup that issue and
> > competed with each other. Making my patch an optimization patch must
> > depend on it. That's why I put them in one series.
>
> They are not depedent at all, and not "fixup". The first is a fix, and
> should go into -fixes ASAP. The other patch is completely stand-alone,
> and an optimization (maybe). If that go in, it's for -next.
>
> Having them as separate patches, makes it easier for
> reviewers/maintainers. Now, with your approach there's more work
> (cognitive, and manual) for others.
Okay, I will separate them.

>
>
> Björn

Patch

diff --git a/arch/riscv/include/asm/jump_label.h b/arch/riscv/include/asm/jump_label.h
index 6d58bbb5da46..14a5ea8d8ef0 100644
--- a/arch/riscv/include/asm/jump_label.h
+++ b/arch/riscv/include/asm/jump_label.h
@@ -18,6 +18,7 @@  static __always_inline bool arch_static_branch(struct static_key * const key,
 					       const bool branch)
 {
 	asm_volatile_goto(
+		"	.align		2			\n\t"
 		"	.option push				\n\t"
 		"	.option norelax				\n\t"
 		"	.option norvc				\n\t"
@@ -39,6 +40,7 @@  static __always_inline bool arch_static_branch_jump(struct static_key * const ke
 						    const bool branch)
 {
 	asm_volatile_goto(
+		"	.align		2			\n\t"
 		"	.option push				\n\t"
 		"	.option norelax				\n\t"
 		"	.option norvc				\n\t"
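
For readers unfamiliar with the directive: as far as I understand the
GNU assembler on RISC-V, the argument of .align is a power of two, so
".align 2" requests 2^2 = 4-byte alignment and pads the location counter
if needed. The arithmetic can be sketched in plain C as below. The helper
names and the example addresses are hypothetical and only illustrate the
rounding, not what the assembler emits as padding.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of 2^log2_align (hypothetical helper). */
static uint64_t align_up_pow2(uint64_t addr, unsigned int log2_align)
{
	uint64_t mask;

	assert(log2_align < 64);
	mask = ((uint64_t)1 << log2_align) - 1;
	return (addr + mask) & ~mask;
}

/* Number of padding bytes needed to reach that alignment. */
static uint64_t padding_bytes(uint64_t addr, unsigned int log2_align)
{
	return align_up_pow2(addr, log2_align) - addr;
}

/* A 4-byte patch site is naturally aligned when its low two address bits are 0. */
static int is_naturally_aligned_insn(uint64_t addr)
{
	return (addr & 0x3u) == 0;
}
```

For a site that would otherwise start at the 2-byte aligned address
0x1002, align_up_pow2(0x1002, 2) returns 0x1004 with 2 bytes of padding,
so the 4-byte instruction no longer straddles a 4-byte boundary and can
be patched with a single aligned 32-bit store.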