Message ID | 20240625040500.1788-3-jszhang@kernel.org (mailing list archive) |
---|---|
State | Changes Requested, archived |
Series | riscv: uaccess: optimizations |
On Tue, Jun 25, 2024, at 06:04, Jisheng Zhang wrote:
> I believe the output constraints "=m" is not necessary, because
> the instruction itself is "write", we don't need the compiler
> to "write" for us. So tell compiler we read from memory instead
> of writing.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>

I think this is a bit too confusing: clearly there is no
read access from the __user pointer, so what you add in here
is not correct. There also needs to be a code comment about
why you do it this way, as it's not clear that this is
a workaround for old compilers without
CONFIG_CC_HAS_ASM_GOTO_OUTPUT.

> index 09d4ca37522c..84b084e388a7 100644
> --- a/arch/riscv/include/asm/uaccess.h
> +++ b/arch/riscv/include/asm/uaccess.h
> @@ -186,11 +186,11 @@ do { \
> __typeof__(*(ptr)) __x = x; \
> __asm__ __volatile__ ( \
> "1:\n" \
> - " " insn " %z2, %1\n" \
> + " " insn " %z1, %2\n" \
> "2:\n" \
> _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \
> - : "+r" (err), "=m" (*(ptr)) \
> - : "rJ" (__x)); \
> + : "+r" (err) \
> + : "rJ" (__x), "m"(*(ptr))); \
> } while (0)
>

I suspect this could just be a "r" constraint instead of
"m", treating the __user pointer as a plain integer.

For kernel pointers, using "m" and "=m" constraints
correctly is necessary since gcc will often access the
same data from C code as well. For __user pointers, we
can probably get away without it since no C code is
ever allowed to just dereference them. If you do that,
you may want to have the same thing in the __get_user
side.

Arnd
On Tue, Jun 25, 2024 at 07:54:30AM +0200, Arnd Bergmann wrote:
> On Tue, Jun 25, 2024, at 06:04, Jisheng Zhang wrote:
> > I believe the output constraints "=m" is not necessary, because
> > the instruction itself is "write", we don't need the compiler
> > to "write" for us. So tell compiler we read from memory instead
> > of writing.
> >
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
>
> I think this is a bit too confusing: clearly there is no
> read access from the __user pointer, so what you add in here
> is not correct. There also needs to be a code comment about

Here is my understanding: the __put_user is implemented with
sd(or its less wider variant, sw etc.), w/o considering the
ex_table, the previous code can be simplified as below:

	__asm__ __volatile__ (
		"sw %z2, %1\n"
		: "+r" (err), "=m" (*(ptr))
		: "rJ" (__x));

Here ptr is really an input, just tells gcc where to store,
And the "store" action is from the "sw" instruction, I don't
need the gcc generates "store" instruction for me. so IMHO,
there's no need to use output constraints here. so I changed
it to

	__asm__ __volatile__ (
		"sw %z1, %2\n"
		: "+r" (err)
		: "rJ" (__x), "m"(*(ptr)));

The key here: is this correct?

Here is the put_user piece code and comments from x86

/*
 * Tell gcc we read from memory instead of writing: this is because
 * we do not write to any memory gcc knows about, so there are no
 * aliasing issues.
 */
#define __put_user_goto(x, addr, itype, ltype, label) \
	asm goto("\n" \
		"1: mov"itype" %0,%1\n" \
		_ASM_EXTABLE_UA(1b, %l2) \
		: : ltype(x), "m" (__m(addr)) \
		: : label)

As can be seen, x86 also doesn't put the (addr) in output constraints,
I think x86 version did similar modification in history, but when I tried
to search the git history, the comment is there from the git first day.

Any hint or suggestion is appreciated!

> why you do it this way, as it's not clear that this is
> a workaround for old compilers without
> CONFIG_CC_HAS_ASM_GOTO_OUTPUT.
>
> > index 09d4ca37522c..84b084e388a7 100644
> > --- a/arch/riscv/include/asm/uaccess.h
> > +++ b/arch/riscv/include/asm/uaccess.h
> > @@ -186,11 +186,11 @@ do { \
> > __typeof__(*(ptr)) __x = x; \
> > __asm__ __volatile__ ( \
> > "1:\n" \
> > - " " insn " %z2, %1\n" \
> > + " " insn " %z1, %2\n" \
> > "2:\n" \
> > _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \
> > - : "+r" (err), "=m" (*(ptr)) \
> > - : "rJ" (__x)); \
> > + : "+r" (err) \
> > + : "rJ" (__x), "m"(*(ptr))); \
> > } while (0)
> >
>
> I suspect this could just be a "r" constraint instead of
> "m", treating the __user pointer as a plain integer.

I tried "r", the generated code is not as good as "m"

for example
__put_user(0x12, &frame->uc.uc_flags);

with "m", the generated code will be

	...
	csrs sstatus,a5
	li a4,18
	sd a4,128(s1)
	csrc sstatus,a5
	...

with "r", the generated code will be

	...
	csrs sstatus,a5
	li a4,18
	addi s1,s1,128
	sd a4,0(s1)
	csrc sstatus,a5
	...

As can be seen, "m" can make use of the 'offset' of
sd, so save one instruction.

>
> For kernel pointers, using "m" and "=m" constraints
> correctly is necessary since gcc will often access the
> same data from C code as well. For __user pointers, we
> can probably get away without it since no C code is
> ever allowed to just dereference them. If you do that,
> you may want to have the same thing in the __get_user
> side.
>
> Arnd
On Wed, Jun 26, 2024 at 08:32:38PM +0800, Jisheng Zhang wrote:
> On Tue, Jun 25, 2024 at 07:54:30AM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 25, 2024, at 06:04, Jisheng Zhang wrote:
> > > I believe the output constraints "=m" is not necessary, because
> > > the instruction itself is "write", we don't need the compiler
> > > to "write" for us. So tell compiler we read from memory instead
> > > of writing.
> > >
> > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> >
> > I think this is a bit too confusing: clearly there is no
> > read access from the __user pointer, so what you add in here
> > is not correct. There also needs to be a code comment about
>
> Here is my understanding: the __put_user is implemented with
> sd(or its less wider variant, sw etc.), w/o considering the
> ex_table, the previous code can be simplified as below:
>
> __asm__ __volatile__ (
> "sw %z2, %1\n"
> : "+r" (err), "=m" (*(ptr))
> : "rJ" (__x));
>
> Here ptr is really an input, just tells gcc where to store,
> And the "store" action is from the "sw" instruction, I don't
> need the gcc generates "store" instruction for me. so IMHO,
> there's no need to use output constraints here. so I changed
> it to
>
> __asm__ __volatile__ (
> "sw %z1, %2\n"
> : "+r" (err)
> : "rJ" (__x), "m"(*(ptr)));
>
> The key here: is this correct?
>
>
> Here is the put_user piece code and comments from x86
>
> /*
> * Tell gcc we read from memory instead of writing: this is because
> * we do not write to any memory gcc knows about, so there are no
> * aliasing issues.
> */
> #define __put_user_goto(x, addr, itype, ltype, label) \
> asm goto("\n" \
> "1: mov"itype" %0,%1\n" \
> _ASM_EXTABLE_UA(1b, %l2) \
> : : ltype(x), "m" (__m(addr)) \
> : : label)

Here is the simplified put_user piece code of arm64:

#define __put_mem_asm(store, reg, x, addr, err, type) \
	asm volatile( \
	"1: " store " " reg "1, [%2]\n" \
	"2:\n" \
	_ASM_EXTABLE_##type##ACCESS_ERR(1b, 2b, %w0) \
	: "+r" (err) \
	: "rZ" (x), "r" (addr))

no output constraints either. It just uses "r" input constraints to tell
gcc to read the store address into one proper GP reg.

>
>
> As can be seen, x86 also doesn't put the (addr) in output constraints,
> I think x86 version did similar modification in history, but when I tried
> to search the git history, the comment is there from the git first day.
>
> Any hint or suggestion is appreciated!
>
> > why you do it this way, as it's not clear that this is
> > a workaround for old compilers without
> > CONFIG_CC_HAS_ASM_GOTO_OUTPUT.
> >
> > > index 09d4ca37522c..84b084e388a7 100644
> > > --- a/arch/riscv/include/asm/uaccess.h
> > > +++ b/arch/riscv/include/asm/uaccess.h
> > > @@ -186,11 +186,11 @@ do { \
> > > __typeof__(*(ptr)) __x = x; \
> > > __asm__ __volatile__ ( \
> > > "1:\n" \
> > > - " " insn " %z2, %1\n" \
> > > + " " insn " %z1, %2\n" \
> > > "2:\n" \
> > > _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \
> > > - : "+r" (err), "=m" (*(ptr)) \
> > > - : "rJ" (__x)); \
> > > + : "+r" (err) \
> > > + : "rJ" (__x), "m"(*(ptr))); \
> > > } while (0)
> > >
> >
> > I suspect this could just be a "r" constraint instead of
> > "m", treating the __user pointer as a plain integer.
>
> I tried "r", the generated code is not as good as "m"
>
> for example
> __put_user(0x12, &frame->uc.uc_flags);
>
> with "m", the generated code will be
>
> ...
> csrs sstatus,a5
> li a4,18
> sd a4,128(s1)
> csrc sstatus,a5
> ...
>
>
> with "r", the generated code will be
>
> ...
> csrs sstatus,a5
> li a4,18
> addi s1,s1,128
> sd a4,0(s1)
> csrc sstatus,a5
> ...
>
> As can be seen, "m" can make use of the 'offset' of
> sd, so save one instruction.
>
> >
> > For kernel pointers, using "m" and "=m" constraints
> > correctly is necessary since gcc will often access the
> > same data from C code as well. For __user pointers, we
> > can probably get away without it since no C code is
> > ever allowed to just dereference them. If you do that,
> > you may want to have the same thing in the __get_user
> > side.
> >
> > Arnd
On Wed, Jun 26, 2024 at 03:12:50PM +0200, Andreas Schwab wrote:
> On Jun 25 2024, Jisheng Zhang wrote:
>
> > I believe the output constraints "=m" is not necessary, because
> > the instruction itself is "write", we don't need the compiler
> > to "write" for us.
>
> No, this is backwards. Being an output operand means that the *asm* is
> writing to it, and the compiler can read the value from there afterwards
> (and the previous value is dead before the asm).

Hi Andreas,

I compared tens of __put_user() caller's generated code between orig
version and patched version, they are the same. Sure maybe this is
not enough.

But your explanation can be applied to x86 and arm64 __put_user()
implementations, asm is also writing, then why there's no output
constraints there?(see the other two emails)? Could you please help
me to understand the tricky points?

Thanks in advance
On Jun 25 2024, Jisheng Zhang wrote:

> I believe the output constraints "=m" is not necessary, because
> the instruction itself is "write", we don't need the compiler
> to "write" for us.

No, this is backwards. Being an output operand means that the *asm* is
writing to it, and the compiler can read the value from there afterwards
(and the previous value is dead before the asm).
On Wed, Jun 26, 2024 at 08:49:59PM +0800, Jisheng Zhang wrote:
> On Wed, Jun 26, 2024 at 08:32:38PM +0800, Jisheng Zhang wrote:
> > On Tue, Jun 25, 2024 at 07:54:30AM +0200, Arnd Bergmann wrote:
> > > On Tue, Jun 25, 2024, at 06:04, Jisheng Zhang wrote:
> > > > I believe the output constraints "=m" is not necessary, because
> > > > the instruction itself is "write", we don't need the compiler
> > > > to "write" for us. So tell compiler we read from memory instead
> > > > of writing.
> > > >
> > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > >
> > > I think this is a bit too confusing: clearly there is no
> > > read access from the __user pointer, so what you add in here
> > > is not correct. There also needs to be a code comment about
> >
> > Here is my understanding: the __put_user is implemented with
> > sd(or its less wider variant, sw etc.), w/o considering the
> > ex_table, the previous code can be simplified as below:
> >
> > __asm__ __volatile__ (
> > "sw %z2, %1\n"
> > : "+r" (err), "=m" (*(ptr))
> > : "rJ" (__x));
> >
> > Here ptr is really an input, just tells gcc where to store,
> > And the "store" action is from the "sw" instruction, I don't
> > need the gcc generates "store" instruction for me. so IMHO,
> > there's no need to use output constraints here. so I changed
> > it to
> >
> > __asm__ __volatile__ (
> > "sw %z1, %2\n"
> > : "+r" (err)
> > : "rJ" (__x), "m"(*(ptr)));
> >
> > The key here: is this correct?
> >
> >
> > Here is the put_user piece code and comments from x86
> >
> > /*
> > * Tell gcc we read from memory instead of writing: this is because
> > * we do not write to any memory gcc knows about, so there are no
> > * aliasing issues.
> > */
> > #define __put_user_goto(x, addr, itype, ltype, label) \
> > asm goto("\n" \
> > "1: mov"itype" %0,%1\n" \
> > _ASM_EXTABLE_UA(1b, %l2) \
> > : : ltype(x), "m" (__m(addr)) \
> > : : label)
>
> Here is the simplified put_user piece code of arm64:
>
> #define __put_mem_asm(store, reg, x, addr, err, type) \
> asm volatile( \
> "1: " store " " reg "1, [%2]\n" \
> "2:\n" \
> _ASM_EXTABLE_##type##ACCESS_ERR(1b, 2b, %w0) \
> : "+r" (err) \
> : "rZ" (x), "r" (addr))
>
> no output constraints either. It just uses "r" input constraints to tell

make it accurate: by this I mean the "addr" of __put_user() isn't
in the output constraints.

> gcc to read the store address into one proper GP reg.
> >
> >
> > As can be seen, x86 also doesn't put the (addr) in output constraints,
> > I think x86 version did similar modification in history, but when I tried
> > to search the git history, the comment is there from the git first day.
> >
> > Any hint or suggestion is appreciated!
> >
> > > why you do it this way, as it's not clear that this is
> > > a workaround for old compilers without
> > > CONFIG_CC_HAS_ASM_GOTO_OUTPUT.
> > >
> > > > index 09d4ca37522c..84b084e388a7 100644
> > > > --- a/arch/riscv/include/asm/uaccess.h
> > > > +++ b/arch/riscv/include/asm/uaccess.h
> > > > @@ -186,11 +186,11 @@ do { \
> > > > __typeof__(*(ptr)) __x = x; \
> > > > __asm__ __volatile__ ( \
> > > > "1:\n" \
> > > > - " " insn " %z2, %1\n" \
> > > > + " " insn " %z1, %2\n" \
> > > > "2:\n" \
> > > > _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \
> > > > - : "+r" (err), "=m" (*(ptr)) \
> > > > - : "rJ" (__x)); \
> > > > + : "+r" (err) \
> > > > + : "rJ" (__x), "m"(*(ptr))); \
> > > > } while (0)
> > > >
> > >
> > > I suspect this could just be a "r" constraint instead of
> > > "m", treating the __user pointer as a plain integer.
> >
> > I tried "r", the generated code is not as good as "m"
> >
> > for example
> > __put_user(0x12, &frame->uc.uc_flags);
> >
> > with "m", the generated code will be
> >
> > ...
> > csrs sstatus,a5
> > li a4,18
> > sd a4,128(s1)
> > csrc sstatus,a5
> > ...
> >
> >
> > with "r", the generated code will be
> >
> > ...
> > csrs sstatus,a5
> > li a4,18
> > addi s1,s1,128
> > sd a4,0(s1)
> > csrc sstatus,a5
> > ...
> >
> > As can be seen, "m" can make use of the 'offset' of
> > sd, so save one instruction.
> >
> > >
> > > For kernel pointers, using "m" and "=m" constraints
> > > correctly is necessary since gcc will often access the
> > > same data from C code as well. For __user pointers, we
> > > can probably get away without it since no C code is
> > > ever allowed to just dereference them. If you do that,
> > > you may want to have the same thing in the __get_user
> > > side.
> > >
> > > Arnd
On Jun 26 2024, Jisheng Zhang wrote:

> no output constraints either. It just uses "r" input constraints to tell
> gcc to read the store address into one proper GP reg.

Again, this is backwards. Being an input operand means the asm is using
this operand as an input to the instructions. The compiler needs to
arrange to put the value in the allocated operand location according to
the constraint.
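The operand semantics described in this reply can be illustrated with a minimal sketch. It is not code from the thread or from the kernel: `pass_through` is a hypothetical helper, and the sketch assumes GCC or clang extended asm. Because the template is empty, no instruction is emitted, so the only effects are those the constraints declare.

```c
/* Hypothetical helper, not from the kernel: shows what input and output
 * operands mean to the compiler. Assumes GCC/clang extended asm. */
static inline long pass_through(long in)
{
	long out;

	/*
	 * Input "0" (in): the compiler must place `in` in the location
	 * allocated for operand 0 before the asm runs.
	 * Output "=r" (out): the asm is declared to write operand 0, and
	 * the compiler reads `out` from that register afterwards.
	 * The template is empty, so the register is left untouched.
	 */
	__asm__ __volatile__("" : "=r" (out) : "0" (in));
	return out;
}
```

Calling `pass_through(42)` hands back 42: the compiler did the loading for the input, and it trusts the declared output without looking at what the template does.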
On Wed, Jun 26, 2024 at 03:35:54PM +0200, Andreas Schwab wrote:
> On Jun 26 2024, Jisheng Zhang wrote:
>
> > no output constraints either. It just uses "r" input constraints to tell
> > gcc to read the store address into one proper GP reg.
>
> Again, this is backwards. Being an input operand means the asm is using
> this operand as an input to the instructions. The compiler needs to
> arrange to put the value in the allocated operand location according to
> the constraint.

Hi Andreas,

Your information is clearly received. What confused me is:
why x86 and arm64 don't put the "addr" of __put_user into output
constraints? Especially the following comments, why this is "read"
from memory?

 * Tell gcc we read from memory instead of writing: this is because
 * we do not write to any memory gcc knows about, so there are no
 * aliasing issues.

can you please kindly help me understand the tricky points here?

thanks

>
> --
> Andreas Schwab, SUSE Labs, schwab@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."
On Wed, Jun 26, 2024, at 15:12, Jisheng Zhang wrote:
> On Wed, Jun 26, 2024 at 03:12:50PM +0200, Andreas Schwab wrote:
>> On Jun 25 2024, Jisheng Zhang wrote:
>>
>> > I believe the output constraints "=m" is not necessary, because
>> > the instruction itself is "write", we don't need the compiler
>> > to "write" for us.
>>
>> No, this is backwards. Being an output operand means that the *asm* is
>> writing to it, and the compiler can read the value from there afterwards
>> (and the previous value is dead before the asm).
>
> Hi Andreas,
>
> I compared tens of __put_user() caller's generated code between orig
> version and patched version, they are the same. Sure maybe this is
> not enough.
>
> But your explanation can be applied to x86 and arm64 __put_user()
> implementations, asm is also writing, then why there's no output
> constraints there?(see the other two emails)? Could you please help
> me to understand the tricky points?

I think part of the reason for the specific way the x86
user access is written is to work around bugs in old
compiler versions, as well as to take advantage of the
complex addressing modes in x86 assembler, see this bit
that dates back to the earliest version of the x86_64
codebase and is still left in place:

/* FIXME: this hack is definitely wrong -AK */
struct __large_struct { unsigned long buf[100]; };
#define __m(x) (*(struct __large_struct __user *)(x))

Using the memory input constraint means that x86 can use
a load from a pointer plus offset, but riscv doesn't
actually do this. The __large_struct I think was needed
either to prevent the compiler from reading the data
outside of the assembly, or to tell the compiler about
the fact that there is an actual memory access if
__put_user() was pointed at kernel memory.

If you just copy from the arm64 version that uses an
"r"(address) constraint instead of the "m"(*address)
version, it should be fine for any user space access.

The output constraint is technically still needed
if we have code like this one where we actually write to
something in kernel space:

int f(void)
{
	int a = 1;
	int b = 2;
	__put_kernel_nofault(&a, &b, int, error);
	return a;
error:
	return -EFAULT;
}

In this case, __put_kernel_nofault() writes the value
of b into a, but the compiler can safely assume that
a is not changed by the assembly because there is no
memory output, and would likely just return a constant '1'.

For put_user(), this cannot happen because the compiler
doesn't know anything about the contents of the __user
pointer. For __put_kernel_nofault(), we rely on the
callers never using it on pointers they access, which
is probably a reasonable assumption, but not entirely
correct.

Arnd
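The f() scenario above can be tried outside the kernel with a small user-space sketch. Everything in it is an assumption made for illustration: the names `store_as_output`, `store_as_input`, `f_output` and `f_input` are hypothetical, the store instructions are picked per host architecture, and GCC or clang extended asm is assumed. It only contrasts declaring the stored-to object as an output with declaring it as an input; it does not model the exception table or any kernel macro.

```c
/* Sketch only: a 32-bit store for a few hosts, assuming GCC or clang.
 * All names below are hypothetical and not taken from the kernel. */
#if defined(__x86_64__) || defined(__i386__)
#define STORE32 "movl %[val], %[mem]"	/* AT&T syntax: source, destination */
#define VAL_C   "r"
#define MEM_OUT "=m"
#define MEM_IN  "m"
#elif defined(__riscv)
#define STORE32 "sw %z[val], %[mem]"
#define VAL_C   "rJ"
#define MEM_OUT "=m"
#define MEM_IN  "m"
#elif defined(__aarch64__)
#define STORE32 "str %w[val], %[mem]"
#define VAL_C   "r"
#define MEM_OUT "=Q"			/* base register only, no offset */
#define MEM_IN  "Q"
#endif

/* Output operand: the compiler is told that the asm writes *dst. */
static inline void store_as_output(int *dst, int val)
{
#ifdef STORE32
	__asm__ __volatile__(STORE32 : [mem] MEM_OUT (*dst) : [val] VAL_C (val));
#else
	*dst = val;	/* plain C fallback on other hosts */
#endif
}

/* Input-only operand: the compiler is told the asm merely reads *dst,
 * although the instruction in the template still stores to it. */
static inline void store_as_input(int *dst, int val)
{
#ifdef STORE32
	__asm__ __volatile__(STORE32 : : [val] VAL_C (val), [mem] MEM_IN (*dst));
#else
	*dst = val;	/* plain C fallback on other hosts */
#endif
}

int f_output(void)
{
	int a = 1;

	store_as_output(&a, 2);
	return a;	/* a is re-read after the asm: returns 2 */
}

int f_input(void)
{
	int a = 1;

	store_as_input(&a, 2);
	return a;	/* an optimizing compiler may fold this to 1 */
}
```

Only `f_output()` has a guaranteed result. What `f_input()` returns depends on the compiler and optimization level, which is the hazard described for writes to kernel memory; for a `__user` pointer the compiler has no such local knowledge to act on.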
On Wed, Jun 26, 2024 at 04:25:26PM +0200, Arnd Bergmann wrote:
> On Wed, Jun 26, 2024, at 15:12, Jisheng Zhang wrote:
> > On Wed, Jun 26, 2024 at 03:12:50PM +0200, Andreas Schwab wrote:
> >> On Jun 25 2024, Jisheng Zhang wrote:
> >>
> >> > I believe the output constraints "=m" is not necessary, because
> >> > the instruction itself is "write", we don't need the compiler
> >> > to "write" for us.
> >>
> >> No, this is backwards. Being an output operand means that the *asm* is
> >> writing to it, and the compiler can read the value from there afterwards
> >> (and the previous value is dead before the asm).
> >
> > Hi Andreas,
> >
> > I compared tens of __put_user() caller's generated code between orig
> > version and patched version, they are the same. Sure maybe this is
> > not enough.
> >
> > But your explanation can be applied to x86 and arm64 __put_user()
> > implementations, asm is also writing, then why there's no output
> > constraints there?(see the other two emails)? Could you please help
> > me to understand the tricky points?
>
> I think part of the reason for the specific way the x86
> user access is written is to work around bugs in old
> compiler versions, as well as to take advantage of the
> complex addressing modes in x86 assembler, see this bit
> that dates back to the earliest version of the x86_64
> codebase and is still left in place:
>
> /* FIXME: this hack is definitely wrong -AK */
> struct __large_struct { unsigned long buf[100]; };
> #define __m(x) (*(struct __large_struct __user *)(x))
>
> Using the memory input constraint means that x86 can use
> a load from a pointer plus offset, but riscv doesn't
> actually do this. The __large_struct I think was needed
> either to prevent the compiler from reading the data
> outside of the assembly, or to tell the compiler about
> the fact that there is an actual memory access if
> __put_user() was pointed at kernel memory.

Thank you so much, Arnd!

>
> If you just copy from the arm64 version that uses an
> "r"(address) constraint instead of the "m"(*address)

"m" version is better than "r", usually can save one
instruction.
I will try to combine other constraints with "r" to
see whether we can still generate the sd with offset
instruction. If can't, seems sticking with "m" and keeping
output constraints is better

> version, it should be fine for any user space access.

You only mention "user space access", so just curious, does
arm64 version still correctly work with below __put_kernel_nofault()
example?

>
> The output constraint is technically still needed
> if we have code like this one where we actually write to
> something in kernel space:
>
> int f(void)
> {
> int a = 1;
> int b = 2;
> __put_kernel_nofault(&a, &b, int, error);
> return a;
> error:
> return -EFAULT;
> }
>
> In this case, __put_kernel_nofault() writes the value
> of b into a, but the compiler can safely assume that
> a is not changed by the assembly because there is no
> memory output, and would likely just return a constant '1'.
>
> For put_user(), this cannot happen because the compiler
> doesn't know anything about the contents of the __user
> pointer. For __put_kernel_nofault(), we rely on the
> callers never using it on pointers they access, which
> is probably a reasonable assumption, but not entirely
> correct.
>
> Arnd

Well explained! Thanks a lot.
On Wed, Jun 26, 2024, at 18:02, Jisheng Zhang wrote:
>
> "m" version is better than "r", usually can save one
> instruction.
> I will try to combine other constraints with "r" to
> see whether we can still generate the sd with offset
> instruction. If can't, seems sticking with "m" and keeping
> output constraints is better

Ah, I see.

> You only mention "user space access", so just curious, does
> arm64 version still correctly work with below __put_kernel_nofault()
> example?

No, I think the example I gave would break for both x86
and arm64 without adding an output constraint.

My main concern about using an input constraint was that
it doesn't match what the code does. Maybe there is a way
to make it use the correct "=m" output when
CONFIG_CC_HAS_ASM_GOTO_OUTPUT is set but use either "r"
or "m" inputs on older gcc releases. After gcc-11 becomes
the minimum in a few years, the hack can be removed.

Arnd
From: Arnd Bergmann
> Sent: 26 June 2024 15:25
...
> If you just copy from the arm64 version that uses an
> "r"(address) constraint instead of the "m"(*address)
> version, it should be fine for any user space access.

Arm certainly has 'reg+offset' addressing and I'd have thought the
RISC-V would have it as well.
I'd guess that the compiler also knows when the offset is too big.
Probably noticeable when code is accessing structures in user memory.

OTOH I can't remember if "m" implies a memory clobber?
For user copies the memory clobber isn't needed and not having it
may well allow better instruction scheduling.

David
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 09d4ca37522c..84b084e388a7 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -186,11 +186,11 @@ do { \
 	__typeof__(*(ptr)) __x = x; \
 	__asm__ __volatile__ ( \
 		"1:\n" \
-		" " insn " %z2, %1\n" \
+		" " insn " %z1, %2\n" \
 		"2:\n" \
 		_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \
-		: "+r" (err), "=m" (*(ptr)) \
-		: "rJ" (__x)); \
+		: "+r" (err) \
+		: "rJ" (__x), "m"(*(ptr))); \
 } while (0)
 
 #ifdef CONFIG_64BIT
@@ -203,16 +203,16 @@ do { \
 	u64 __x = (__typeof__((x)-(x)))(x); \
 	__asm__ __volatile__ ( \
 		"1:\n" \
-		" sw %z3, %1\n" \
+		" sw %z1, %3\n" \
 		"2:\n" \
-		" sw %z4, %2\n" \
+		" sw %z2, %4\n" \
 		"3:\n" \
 		_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %0) \
 		_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %0) \
-		: "+r" (err), \
-		"=m" (__ptr[__LSW]), \
-		"=m" (__ptr[__MSW]) \
-		: "rJ" (__x), "rJ" (__x >> 32)); \
+		: "+r" (err) \
+		: "rJ" (__x), "rJ" (__x >> 32), \
+		"m" (__ptr[__LSW]), \
+		"m" (__ptr[__MSW])); \
 } while (0)
 
 #endif /* CONFIG_64BIT */
I believe the output constraints "=m" is not necessary, because
the instruction itself is "write", we don't need the compiler
to "write" for us. So tell compiler we read from memory instead
of writing.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 arch/riscv/include/asm/uaccess.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)