Message ID | 26ff8f42-2a76-4f8d-9af6-5830b0aae739@suse.com (mailing list archive) |
---|---|
State | Superseded |
Series | gnttab: hypervisor side XSA-448 follow-up |
Hi Jan,

Title: I would add 'gnttab:' to clarify which subsystem you are modifying.

On 05/02/2024 11:03, Jan Beulich wrote:
> Along the line with observations in the context of XSA-448, besides
> "op" no field is relevant when the range to be flushed is empty, much
> like e.g. the pointers passed to memcpy() are irrelevant (and would
> never be "validated") when the passed length is zero. Split the existing
> condition validating "op", "offset", and "length", leaving only the "op"
> part ahead of the check for length being zero (or no flushing to be
> performed).

I am probably missing something here. I understand the theory behind
reducing the number of checks when len == 0. But an OS cannot rely on it:
  1) older hypervisor would still return an error if the check doesn't
pass)
  2) it does feel odd to allow "invalid" offset when len == 0 (at least.

So to me, it is better to keep those checks early. That said, I agree
this is a matter of opinion, so I will not Nack it but also I will not
Ack it.

>
> In the course of splitting also simplify the moved part of the condition
> from 3 to 2 conditionals, potentially (depending on the architecture)
> requiring one less (conditional) branch.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -3528,15 +3528,16 @@ static int _cache_flush(const gnttab_cac
>      void *v;
>      int ret;
>
> -    if ( (cflush->offset >= PAGE_SIZE) ||
> -         (cflush->length > PAGE_SIZE) ||
> -         (cflush->offset + cflush->length > PAGE_SIZE) ||
> -         (cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN)) )
> +    if ( cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN) )
>          return -EINVAL;
>
>      if ( cflush->length == 0 || cflush->op == 0 )
>          return !*cur_ref ? 0 : -EILSEQ;
>
> +    if ( (cflush->offset | cflush->length) > PAGE_SIZE ||

This is confusing. I understand you are trying to force the compiler to
optimize. But is it really worth it? After all, the rest of operation
will outweight this check (cache flush are quite expensive).

We probably should take a more generic decision (and encode in our
policy) because you seem to like this pattern and I dislike it :). Not
sure what the others think.

Cheers,

On 19.02.2024 23:22, Julien Grall wrote:
> Title: I would add 'gnttab:' to clarify which subsystem you are modifying.

That's how I actually have it here; it's not clear to me why I lost the
prefix when sending.

> On 05/02/2024 11:03, Jan Beulich wrote:
>> Along the line with observations in the context of XSA-448, besides
>> "op" no field is relevant when the range to be flushed is empty, much
>> like e.g. the pointers passed to memcpy() are irrelevant (and would
>> never be "validated") when the passed length is zero. Split the existing
>> condition validating "op", "offset", and "length", leaving only the "op"
>> part ahead of the check for length being zero (or no flushing to be
>> performed).
>
> I am probably missing something here. I understand the theory behind
> reducing the number of checks when len == 0. But an OS cannot rely on it:
>   1) older hypervisor would still return an error if the check doesn't
> pass)

Right, but that's no reason to keep the bogus earlier behavior.

>   2) it does feel odd to allow "invalid" offset when len == 0 (at least.

I'm puzzled: You've given R-b for patch 1 (thanks), where exactly the
same reasoning is used, i.e. similarly referring to memcpy() to
justify the (new / supposed) behavior.

>> In the course of splitting also simplify the moved part of the condition
>> from 3 to 2 conditionals, potentially (depending on the architecture)
>> requiring one less (conditional) branch.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -3528,15 +3528,16 @@ static int _cache_flush(const gnttab_cac
>>      void *v;
>>      int ret;
>>
>> -    if ( (cflush->offset >= PAGE_SIZE) ||
>> -         (cflush->length > PAGE_SIZE) ||
>> -         (cflush->offset + cflush->length > PAGE_SIZE) ||
>> -         (cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN)) )
>> +    if ( cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN) )
>>          return -EINVAL;
>>
>>      if ( cflush->length == 0 || cflush->op == 0 )
>>          return !*cur_ref ? 0 : -EILSEQ;
>>
>> +    if ( (cflush->offset | cflush->length) > PAGE_SIZE ||
>
> This is confusing. I understand you are trying to force the compiler to
> optimize. But is it really worth it? After all, the rest of operation
> will outweight this check (cache flush are quite expensive).

From purely a performance point of view it may not be worth it. From
code size angle (taken globally) I already view this differently.
Plus I think that we ought to aim at avoiding undesirable patterns,
just because people tend to clone existing code when they can. Thing
is that (as per below) the two of us apparently disagree on what
"undesirable" is in cases like this one.

> We probably should take a more generic decision (and encode in our
> policy) because you seem to like this pattern and I dislike it :). Not
> sure what the others think.

Perhaps. If the folding alone was the problem, I'd accept to split (or
even undo) that part. But the earlier aspect you raised also needs
sorting before I can decide whether to adjust or whether to consider
the patch rejected.

Jan

Hi Jan,

On 20/02/2024 08:26, Jan Beulich wrote:
> On 19.02.2024 23:22, Julien Grall wrote:
>> Title: I would add 'gnttab:' to clarify which subsystem you are modifying.
>
> That's how I actually have it here; it's not clear to me why I lost the
> prefix when sending.
>
>> On 05/02/2024 11:03, Jan Beulich wrote:
>>> Along the line with observations in the context of XSA-448, besides
>>> "op" no field is relevant when the range to be flushed is empty, much
>>> like e.g. the pointers passed to memcpy() are irrelevant (and would
>>> never be "validated") when the passed length is zero. Split the existing
>>> condition validating "op", "offset", and "length", leaving only the "op"
>>> part ahead of the check for length being zero (or no flushing to be
>>> performed).
>>
>> I am probably missing something here. I understand the theory behind
>> reducing the number of checks when len == 0. But an OS cannot rely on it:
>>   1) older hypervisor would still return an error if the check doesn't
>> pass)
>
> Right, but that's no reason to keep the bogus earlier behavior.

Hmmm... I am not sure why you say the behavior is bogus. From the commit
message, it seems this is just an optimization that have side effect
(ignoring the other fields).

>
>>   2) it does feel odd to allow "invalid" offset when len == 0 (at least.
>
> I'm puzzled: You've given R-b for patch 1 (thanks), where exactly the
> same reasoning is used, i.e. similarly referring to memcpy() to
> justify the (new / supposed) behavior.

I realize it. But I viewed it slightly different as you are adding the
check. I think it is a good idea to add the check and ideally it should
be after.

Here you don't seem to add any check and only re-order it. Hence why I
treated it differently.

Cheers,

On 20.02.2024 12:52, Julien Grall wrote:
> Hi Jan,
>
> On 20/02/2024 08:26, Jan Beulich wrote:
>> On 19.02.2024 23:22, Julien Grall wrote:
>>> Title: I would add 'gnttab:' to clarify which subsystem you are modifying.
>>
>> That's how I actually have it here; it's not clear to me why I lost the
>> prefix when sending.
>>
>>> On 05/02/2024 11:03, Jan Beulich wrote:
>>>> Along the line with observations in the context of XSA-448, besides
>>>> "op" no field is relevant when the range to be flushed is empty, much
>>>> like e.g. the pointers passed to memcpy() are irrelevant (and would
>>>> never be "validated") when the passed length is zero. Split the existing
>>>> condition validating "op", "offset", and "length", leaving only the "op"
>>>> part ahead of the check for length being zero (or no flushing to be
>>>> performed).
>>>
>>> I am probably missing something here. I understand the theory behind
>>> reducing the number of checks when len == 0. But an OS cannot rely on it:
>>>   1) older hypervisor would still return an error if the check doesn't
>>> pass)
>>
>> Right, but that's no reason to keep the bogus earlier behavior.
>
> Hmmm... I am not sure why you say the behavior is bogus. From the commit
> message, it seems this is just an optimization that have side effect
> (ignoring the other fields).

I don't view this as primarily an optimization; I'm in particular after
not raising errors for cases where there is no error to be raised.
Hence the comparison to memcpy(), which you can pass "bogus" pointers
so long as you pass zero size.

>>>   2) it does feel odd to allow "invalid" offset when len == 0 (at least.
>>
>> I'm puzzled: You've given R-b for patch 1 (thanks), where exactly the
>> same reasoning is used, i.e. similarly referring to memcpy() to
>> justify the (new / supposed) behavior.
>
> I realize it. But I viewed it slightly different as you are adding the
> check. I think it is a good idea to add the check and ideally it should
> be after.
>
> Here you don't seem to add any check and only re-order it. Hence why I
> treated it differently.

Right, there already was a zero-length check here. Just that zero length
requests still could have an error returned for no reason. So the
"optimization" part that you're talking about above was already there,
but as said, that's secondary to me.

Jan

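The behavioural difference debated above can be modelled in isolation. The following is only a sketch, not the Xen code: `struct flush_req`, `validate_old()`, `validate_new()`, the `CACHE_*` flag values and the 4 KiB `PAGE_SIZE` are hypothetical stand-ins, the `*cur_ref` / `-EILSEQ` handling of the real function is left out, and the range check is kept in its three-comparison form. A return value of 1 merely stands for "would go on to flush".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PAGE_SIZE   4096u        /* assumption: 4 KiB pages */
#define CACHE_CLEAN (1u << 0)    /* hypothetical stand-ins for the */
#define CACHE_INVAL (1u << 1)    /* GNTTAB_CACHE_* flag values     */

/* Hypothetical, trimmed-down stand-in for the hypercall argument. */
struct flush_req {
    uint16_t offset;
    uint16_t length;
    uint32_t op;
};

/* Pre-patch ordering: all fields are validated before the empty-range exit. */
static int validate_old(const struct flush_req *r)
{
    if ( r->offset >= PAGE_SIZE ||
         r->length > PAGE_SIZE ||
         (unsigned int)r->offset + r->length > PAGE_SIZE ||
         (r->op & ~(CACHE_INVAL | CACHE_CLEAN)) )
        return -EINVAL;

    if ( r->length == 0 || r->op == 0 )
        return 0;                /* nothing to flush */

    return 1;                    /* would go on to flush */
}

/* Patched ordering: only "op" is validated ahead of the empty-range exit. */
static int validate_new(const struct flush_req *r)
{
    if ( r->op & ~(CACHE_INVAL | CACHE_CLEAN) )
        return -EINVAL;

    if ( r->length == 0 || r->op == 0 )
        return 0;                /* nothing to flush; offset is never looked at */

    if ( r->offset >= PAGE_SIZE ||
         r->length > PAGE_SIZE ||
         (unsigned int)r->offset + r->length > PAGE_SIZE )
        return -EINVAL;

    return 1;                    /* would go on to flush */
}

/* Example: an empty range with an out-of-range offset is where they differ. */
static void show_difference(void)
{
    struct flush_req r = { .offset = PAGE_SIZE, .length = 0, .op = CACHE_CLEAN };

    assert(validate_old(&r) == -EINVAL);
    assert(validate_new(&r) == 0);
}
```

In this model the two orderings differ only for requests that flush nothing (zero length or zero op) while carrying an out-of-range offset or length; every request that actually flushes is treated identically.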
On Tue, Feb 20, 2024 at 4:26 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> +    if ( (cflush->offset | cflush->length) > PAGE_SIZE ||
> >
> > This is confusing. I understand you are trying to force the compiler to
> > optimize. But is it really worth it? After all, the rest of operation
> > will outweight this check (cache flush are quite expensive).
>
> From purely a performance point of view it may not be worth it. From
> code size angle (taken globally) I already view this differently.
> Plus I think that we ought to aim at avoiding undesirable patterns,
> just because people tend to clone existing code when they can. Thing
> is that (as per below) the two of us apparently disagree on what
> "undesirable" is in cases like this one.
>
> > We probably should take a more generic decision (and encode in our
> > policy) because you seem to like this pattern and I dislike it :). Not
> > sure what the others think.

This is similar to the policy question I raised among the x86
committers a few weeks ago: You're manually specifying a more specific
behavior than is required, rather than specifying what you want and
then letting the compiler optimize things. The problem with this is
twofold:

1. It's harder for humans to read and understand the intent

2. It ties the compiler's hands. If you write your intent, then the
compiler is free to apply the optimization or not, or apply a
different optimization. If you specify this optimization, then the
compiler has fewer ways that it's allowed to compile the code.

#1 by itself is probably enough to counterindicate this kind of
behavior. Add them together, and I'm inclined to say that we should
write a policy against such optimizations, without specific
justifications.

 -George

On 21.02.2024 03:32, George Dunlap wrote:
> On Tue, Feb 20, 2024 at 4:26 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> +    if ( (cflush->offset | cflush->length) > PAGE_SIZE ||
>>>
>>> This is confusing. I understand you are trying to force the compiler to
>>> optimize. But is it really worth it? After all, the rest of operation
>>> will outweight this check (cache flush are quite expensive).
>>
>> From purely a performance point of view it may not be worth it. From
>> code size angle (taken globally) I already view this differently.
>> Plus I think that we ought to aim at avoiding undesirable patterns,
>> just because people tend to clone existing code when they can. Thing
>> is that (as per below) the two of us apparently disagree on what
>> "undesirable" is in cases like this one.
>>
>>> We probably should take a more generic decision (and encode in our
>>> policy) because you seem to like this pattern and I dislike it :). Not
>>> sure what the others think.
>
> This is similar to the policy question I raised among the x86
> committers a few weeks ago: You're manually specifying a more specific
> behavior than is required, rather than specifying what you want and
> then letting the compiler optimize things. The problem with this is
> twofold:
>
> 1. It's harder for humans to read and understand the intent

Depends.

> 2. It ties the compiler's hands. If you write your intent, then the
> compiler is free to apply the optimization or not, or apply a
> different optimization. If you specify this optimization, then the
> compiler has fewer ways that it's allowed to compile the code.

I'm inclined to believe that no compiler will do this kind of
optimization, unless a specific request was raised against it. The
pattern may not seem overly complex, but to recognize it would require
effort that on the whole may simply not be justified by the gains
(from the compiler's perspective).

> #1 by itself is probably enough to counterindicate this kind of
> behavior. Add them together, and I'm inclined to say that we should
> write a policy against such optimizations, without specific
> justifications.

It's not like I didn't give any justification. So I guess you mean
without better (whatever that means) justification. But yes, I'll
undo that part of the change then and submit a v2, albeit with not
overly much hope for it to then be accepted.

Jan

On Wed, Feb 21, 2024 at 3:17 PM Jan Beulich <jbeulich@suse.com> wrote:
> > #1 by itself is probably enough to counterindicate this kind of
> > behavior. Add them together, and I'm inclined to say that we should
> > write a policy against such optimizations, without specific
> > justifications.
>
> It's not like I didn't give any justification. So I guess you mean
> without better (whatever that means) justification.

Sorry, what I meant was that the policy would have to include a sketch
for what sorts of justifications would be acceptable. For instance,
here's a justification I would consider for this sort of thing:

A. In use-case X, there is hard limit Y on the binary size. For X's
configuration, with a reasonably small number of features enabled, we
are already close to 90% of the way there. If we were to consistently
use this sort of manual code size optimization techniques across the
codebase, we could cut down the total size of the code base by 25%.

Here's a situation I would absolutely not consider worth it:

B. If we consistently use this sort of code size optimization
techniques across the codebase, we could cut down the entire size of
the codebase by 0.1%. There are no hard limits, we're just trying to
generally keep things smaller.

Filling our codebase with these sorts of logic puzzles ("Why are we
binary or-ing the offset and the length?") makes it more difficult for
people to understand the code base and increases the risk of someone
making a mistake as they try to change it. For instance, is this
change really equivalent, given that previously one of the comparisons
had >=? It turns out yes, but only because we filter out situations
where the length is 0; what if we were to move things around again,
such that we actually can get here with length 0?

Making the binary 0.1% smaller is absolutely not worth the cost of
that. I'm not sure even 5% would be worth that cost, given that we
don't really have any hard limits we're in danger of exceeding (at
least that I'm aware of).

But a minimum justification for allowing these sorts of things would
need to include a concrete prediction of the improvement we would get
by applying these sorts of things all over the place; not simply, "in
this instance it goes from three to two branches".

 -George

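The equivalence question raised above ("is this change really equivalent, given that previously one of the comparisons had >=?") can be checked mechanically. What follows is a minimal sketch under stated assumptions, not code from the Xen tree: it assumes a 4 KiB `PAGE_SIZE` and 16-bit `offset` / `length` fields, and the helper names `range_invalid_old()`, `range_invalid_new()`, `count_mismatches()` and `check_nonempty_equivalence()` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real PAGE_SIZE depends on the architecture. */
#define PAGE_SIZE 4096u

/* Range check in its pre-patch form (three comparisons). */
static bool range_invalid_old(uint16_t offset, uint16_t length)
{
    return offset >= PAGE_SIZE ||
           length > PAGE_SIZE ||
           (unsigned int)offset + length > PAGE_SIZE;
}

/* Range check in the OR-folded form proposed by the patch. */
static bool range_invalid_new(uint16_t offset, uint16_t length)
{
    return (unsigned int)(offset | length) > PAGE_SIZE ||
           (unsigned int)offset + length > PAGE_SIZE;
}

/*
 * Count the (offset, length) pairs on which the two forms disagree, for
 * 0 <= offset <= max and min_length <= length <= max. Callers must keep
 * max within the 16-bit range of the fields.
 */
static unsigned long count_mismatches(unsigned int min_length, unsigned int max)
{
    unsigned long mismatches = 0;

    for ( unsigned int off = 0; off <= max; off++ )
        for ( unsigned int len = min_length; len <= max; len++ )
            if ( range_invalid_old(off, len) != range_invalid_new(off, len) )
                mismatches++;

    return mismatches;
}

/* Example: the two forms agree whenever the range is non-empty. */
static void check_nonempty_equivalence(void)
{
    assert(count_mismatches(1, 2 * PAGE_SIZE) == 0);
}
```

Under these assumptions the two forms agree for every non-empty range and disagree only for `offset == PAGE_SIZE` with `length == 0`, a combination the patched function never evaluates because the zero-length exit comes first. That is exactly the dependency on check ordering described above.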
Hi,

On 20/02/2024 12:25, Jan Beulich wrote:
> On 20.02.2024 12:52, Julien Grall wrote:
>> Hi Jan,
>>
>> On 20/02/2024 08:26, Jan Beulich wrote:
>>> On 19.02.2024 23:22, Julien Grall wrote:
>>>> Title: I would add 'gnttab:' to clarify which subsystem you are modifying.
>>>
>>> That's how I actually have it here; it's not clear to me why I lost the
>>> prefix when sending.
>>>
>>>> On 05/02/2024 11:03, Jan Beulich wrote:
>>>>> Along the line with observations in the context of XSA-448, besides
>>>>> "op" no field is relevant when the range to be flushed is empty, much
>>>>> like e.g. the pointers passed to memcpy() are irrelevant (and would
>>>>> never be "validated") when the passed length is zero. Split the existing
>>>>> condition validating "op", "offset", and "length", leaving only the "op"
>>>>> part ahead of the check for length being zero (or no flushing to be
>>>>> performed).
>>>>
>>>> I am probably missing something here. I understand the theory behind
>>>> reducing the number of checks when len == 0. But an OS cannot rely on it:
>>>>   1) older hypervisor would still return an error if the check doesn't
>>>> pass)
>>>
>>> Right, but that's no reason to keep the bogus earlier behavior.
>>
>> Hmmm... I am not sure why you say the behavior is bogus. From the commit
>> message, it seems this is just an optimization that have side effect
>> (ignoring the other fields).
>
> I don't view this as primarily an optimization; I'm in particular after
> not raising errors for cases where there is no error to be raised.
> Hence the comparison to memcpy(), which you can pass "bogus" pointers
> so long as you pass zero size.

The part I am missing is why this approach is better than what we have.
So far what you described is just a matter of taste.

To give a concrete example, if tomorrow a contributor decides to send a
patch undoing what you did (IOW enforcing the check for zero-length or
replace | with two branches), then on what grounds I will be able to
refuse their patch?

Cheers,

On 21.02.2024 10:34, Julien Grall wrote:
> Hi,
>
> On 20/02/2024 12:25, Jan Beulich wrote:
>> On 20.02.2024 12:52, Julien Grall wrote:
>>> Hi Jan,
>>>
>>> On 20/02/2024 08:26, Jan Beulich wrote:
>>>> On 19.02.2024 23:22, Julien Grall wrote:
>>>>> Title: I would add 'gnttab:' to clarify which subsystem you are modifying.
>>>>
>>>> That's how I actually have it here; it's not clear to me why I lost the
>>>> prefix when sending.
>>>>
>>>>> On 05/02/2024 11:03, Jan Beulich wrote:
>>>>>> Along the line with observations in the context of XSA-448, besides
>>>>>> "op" no field is relevant when the range to be flushed is empty, much
>>>>>> like e.g. the pointers passed to memcpy() are irrelevant (and would
>>>>>> never be "validated") when the passed length is zero. Split the existing
>>>>>> condition validating "op", "offset", and "length", leaving only the "op"
>>>>>> part ahead of the check for length being zero (or no flushing to be
>>>>>> performed).
>>>>>
>>>>> I am probably missing something here. I understand the theory behind
>>>>> reducing the number of checks when len == 0. But an OS cannot rely on it:
>>>>>   1) older hypervisor would still return an error if the check doesn't
>>>>> pass)
>>>>
>>>> Right, but that's no reason to keep the bogus earlier behavior.
>>>
>>> Hmmm... I am not sure why you say the behavior is bogus. From the commit
>>> message, it seems this is just an optimization that have side effect
>>> (ignoring the other fields).
>>
>> I don't view this as primarily an optimization; I'm in particular after
>> not raising errors for cases where there is no error to be raised.
>> Hence the comparison to memcpy(), which you can pass "bogus" pointers
>> so long as you pass zero size.
>
> The part I am missing is why this approach is better than what we have.
> So far what you described is just a matter of taste.
>
> To give a concrete example, if tomorrow a contributor decides to send a
> patch undoing what you did (IOW enforcing the check for zero-length or
> replace | with two branches), then on what grounds I will be able to
> refuse their patch?

On the grounds of the argument I gave before: Consistency with other
more or less similar operations, where length 0 simply means "no-op",
up to and including "no errors from arguments specifying the
address(es) to operate on".

Jan

--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3528,15 +3528,16 @@ static int _cache_flush(const gnttab_cac
     void *v;
     int ret;
 
-    if ( (cflush->offset >= PAGE_SIZE) ||
-         (cflush->length > PAGE_SIZE) ||
-         (cflush->offset + cflush->length > PAGE_SIZE) ||
-         (cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN)) )
+    if ( cflush->op & ~(GNTTAB_CACHE_INVAL | GNTTAB_CACHE_CLEAN) )
         return -EINVAL;
 
     if ( cflush->length == 0 || cflush->op == 0 )
         return !*cur_ref ? 0 : -EILSEQ;
 
+    if ( (cflush->offset | cflush->length) > PAGE_SIZE ||
+         cflush->offset + cflush->length > PAGE_SIZE )
+        return -EINVAL;
+
     /* currently unimplemented */
     if ( cflush->op & GNTTAB_CACHE_SOURCE_GREF )
         return -EOPNOTSUPP;

Along the line with observations in the context of XSA-448, besides
"op" no field is relevant when the range to be flushed is empty, much
like e.g. the pointers passed to memcpy() are irrelevant (and would
never be "validated") when the passed length is zero. Split the existing
condition validating "op", "offset", and "length", leaving only the "op"
part ahead of the check for length being zero (or no flushing to be
performed).

In the course of splitting also simplify the moved part of the condition
from 3 to 2 conditionals, potentially (depending on the architecture)
requiring one less (conditional) branch.

Signed-off-by: Jan Beulich <jbeulich@suse.com>