Message ID | 20190115004904.tr7zhrovzotrivle@ltop.local (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | [SPARSE] noderef & ASM statements | expand |
On Tue, Jan 15, 2019 at 12:49 PM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > The only simple temporary 'solution' is see for the moment would be > to disable 'noderef' warnings while expanding ASM. No, I think the real fix is to consider memory arguments to asm's to have an implicit "address_of" operation applied to them. Basically, the *compiler* never derefences the expression, it's just given as an address to the asm. But note that this would require looking at the type of argument to the asm. If it's a register that the compiler is supposed to fill in, then the compiler obviously *does* dereference the expression. But if the asm constraint is "m", then the compiler basically passes in just he address, and the asm is what *may* do the dereference. Note the *may*. It's entirely possible that we're passing a memory op to the inline asm in order for the asm to generate not a dereference, but something entirely different (like a cache flush, or in the case of the kernel, maybe a special user access). Is this somewhat annoying? Yes. But a memory op to an asm really is more like an address than an access. Linus
On Tue, Jan 15, 2019 at 04:50:05PM +1200, Linus Torvalds wrote: > On Tue, Jan 15, 2019 at 12:49 PM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > The only simple temporary 'solution' is see for the moment would be > > to disable 'noderef' warnings while expanding ASM. > > No, I think the real fix is to consider memory arguments to asm's to > have an implicit "address_of" operation applied to them. > > Basically, the *compiler* never derefences the expression, it's just > given as an address to the asm. > > But note that this would require looking at the type of argument to > the asm. If it's a register that the compiler is supposed to fill in, > then the compiler obviously *does* dereference the expression. > > But if the asm constraint is "m", then the compiler basically passes > in just he address, and the asm is what *may* do the dereference. > > Note the *may*. It's entirely possible that we're passing a memory op > to the inline asm in order for the asm to generate not a dereference, > but something entirely different (like a cache flush, or in the case > of the kernel, maybe a special user access). > > Is this somewhat annoying? Yes. But a memory op to an asm really is > more like an address than an access. Yes, I agree. It's especially visible when looking at uaccess.h's macro __m() (it dereferences put_user/get_user()'s address argument just before for 'm' asm operands). But yes, it's certainly annoying because: *) the constraints are possibly arch specific *) the linearization needs to be special cased for asm memory op too (but I suppose it must currently be considered as wrong for asm memory ops) *) I've always considered __noderef as really meaning "must not be dereferenced except by some special purpose operation which effectively do the real deref". In the present case (percpu & uaccess (on x86)), it's clear that the special operation is the asm memory op but isn't this essentially incidental? Thinking a bit more about it and looking at some dumps I made: It seems that most of these these 'noderef in asm operand' warnings (99.5%) come from percpu operations (which have an API using an lvalue and not a pointer, unlike the uaccess API). I begin to suspect that the warnings occuring with an uaccess operation come from some other kind of error (possibly a missing call to degenerate()). Anyway, I'm looking at it. Thanks, -- Luc
On Wed, Jan 16, 2019 at 10:59 AM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > *) I've always considered __noderef as really meaning "must not be > dereferenced except by some special purpose operation which > effectively do the real deref". In the present case (percpu & > uaccess (on x86)), it's clear that the special operation is > the asm memory op but isn't this essentially incidental? I agree that it's somewhat incidental rather than anything hugely by design. That said, the *design* of __noderef was literally to make sure that the *compiler* never derefences things, and the "special thing" that does dereference was literally meant to be an inline asm. The "{get,put}_user()" case was exactly what __noderef was designed for. So it *is* very much by design, but then the implementation not *checking* inline asm arguments was kind of an incidental thing. Which is why I very much admit that it's annoying. The good news is that the asm constraints aren't really all that architecture-specific. Yes, all the various register classes etc obviously are, but within the concept of "the compiler never dereferences it", the only asm constraints that matter are "m" and "p", I think. "m" means "memory", and "p" means "address". They are kind of the same, except for a gcc memory access ordering situation, I think. > Thinking a bit more about it and looking at some dumps I made: > It seems that most of these these 'noderef in asm operand' > warnings (99.5%) come from percpu operations (which have an API > using an lvalue and not a pointer, unlike the uaccess API). > > I begin to suspect that the warnings occuring with an uaccess > operation come from some other kind of error (possibly a missing > call to degenerate()). Hmm. The uaccess cases definitely pass the lvalue in some cases. Not the cases where we just end up doing a special "call" sequence (the normal get_user/put_user() case). But look closer: the cases where we actaully inline the access itself, we generate a "LKmode" lvalue with that "__m(addr)" thing, and the actual access is done by the inline asm. See __get_user_asm(), __get_user_asm_nozero(), __get_user_asm_ex() and __put_user_goto(). (Some of this has changed recently, so __put_user_goto() is a new version of __put_user_asm(), so if you don't look at current kernel -git, that's what you'll see instead). Linus
On Wed, Jan 16, 2019 at 05:22:48PM +1200, Linus Torvalds wrote: > On Wed, Jan 16, 2019 at 10:59 AM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > *) I've always considered __noderef as really meaning "must not be > > dereferenced except by some special purpose operation which > > effectively do the real deref". In the present case (percpu & > > uaccess (on x86)), it's clear that the special operation is > > the asm memory op but isn't this essentially incidental? > > I agree that it's somewhat incidental rather than anything hugely by design. > > That said, the *design* of __noderef was literally to make sure that > the *compiler* never derefences things, and the "special thing" that > does dereference was literally meant to be an inline asm. The > "{get,put}_user()" case was exactly what __noderef was designed for. > > So it *is* very much by design, OK, that's very clear. I'm much more familiar with __iomem accesses which I kinda used as reference. > The good news is that the asm constraints aren't really all that > architecture-specific. Yes, all the various register classes etc > obviously are, but within the concept of "the compiler never > dereferences it", the only asm constraints that matter are "m" and > "p", I think. > > "m" means "memory", and "p" means "address". They are kind of the > same, except for a gcc memory access ordering situation, I think. Well, it's alas more complicated :( The common constraints for memory operands are "m", "o" & "V". "p" is different, it must be a memory *address*, so there is no possible problem with noderef with it. So far so good but, for example: *) ARM also has "Q" as memory operand (while for ARM64, "Q" is a memory address) as well as "Uv", "Uy" & "Uq" *) PPC also has "wF", "wG", "wO", "es" "Q" & "Z" *) S390 has "Q", "R", "S" & "T" *) ... That's really annoying, especially the ARM vs ARM64 "Q". Of course, sparse could just take care of "m", "o" & "V" but that seems really wrong. ARM64 also use "Ump" for memory *address*, so simply scanning for the charatcer "m" will gives false positives. Sparse would need to known all constraints for all archs :( > > Thinking a bit more about it and looking at some dumps I made: > > It seems that most of these these 'noderef in asm operand' > > warnings (99.5%) come from percpu operations (which have an API > > using an lvalue and not a pointer, unlike the uaccess API). > > > > I begin to suspect that the warnings occuring with an uaccess > > operation come from some other kind of error (possibly a missing > > call to degenerate()). > > Hmm. The uaccess cases definitely pass the lvalue in some cases. Not > the cases where we just end up doing a special "call" sequence (the > normal get_user/put_user() case). But look closer: the cases where we > actaully inline the access itself, we generate a "LKmode" lvalue with > that "__m(addr)" thing, and the actual access is done by the inline > asm. > > See __get_user_asm(), __get_user_asm_nozero(), __get_user_asm_ex() and > __put_user_goto(). Yes, indeed I was misled by the relatively small number of these. That said, asm input operands really miss a call to degenerate(). -- Luc
diff --git a/expand.c b/expand.c index e8e50b080..fa410f39f 100644 --- a/expand.c +++ b/expand.c @@ -1155,6 +1155,22 @@ static int expand_if_statement(struct statement *stmt) return SIDE_EFFECTS; } +static int expand_asm_statement(struct statement *stmt) +{ + struct expression *op; + int cost = 0; + + FOR_EACH_PTR(stmt->asm_outputs, op) { + cost += expand_expression(op->expr); + } END_FOR_EACH_PTR(op); + + FOR_EACH_PTR(stmt->asm_inputs, op) { + cost += expand_expression(op->expr); + } END_FOR_EACH_PTR(op); + + return cost; +} + /* * Expanding a compound statement is really just * about adding up the costs of each individual @@ -1245,7 +1261,7 @@ static int expand_statement(struct statement *stmt) case STMT_NONE: break; case STMT_ASM: - /* FIXME! Do the asm parameter evaluation! */ + expand_asm_statement(stmt); break; case STMT_CONTEXT: expand_expression(stmt->expression);
I'm investigating some sparse's "warning: unknown expression (4 0)" on the current kernel. They are a few distinct cases but all of them are caused by some __builtin_types_compatible_p() not being expanded. In some of these cases the statement is not expanded because of an earlier error, for example in compatible_assignment_types(), but this statement is then linearized anyway. This needs to be be fixed but is not really a problem. Another case, much more annoying, is because input & output operands of ASM statements are not expanded. This is very easy to fix (see patch here below) but once fixed and used on the kernel, sparse then issues many thousands of "warning: dereference of noderef expression" (in fact, more than one million on defconfig + allyesconfig). These warnings are perfectly correct: a lot of ASM operands are indeed noderef variables or expressions (most warnings seem to come from put_user(), get_user() or one of the percpu operations). I'm quite reluctant to commit a patch that will add so much warnings. On the other hand, properly silenting these warnings in the kernel is a non-obvious job, especially given the fact that __force can't simply be used to 'ignore' __noderef (because the dereferencing kinda happens before the cast), you need to act via a pointer with something like: *(typeof(X) __kernel __force *) &(X); The only simple temporary 'solution' is see for the moment would be to disable 'noderef' warnings while expanding ASM. -- Luc Van Oostenryck From 89e3ba40948c9c98ada673f3451be0b11b5740c3 Mon Sep 17 00:00:00 2001 From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Date: Mon, 14 Jan 2019 16:02:06 +0100 Subject: [PATCH] fix expansion of asm statements --- expand.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)