Message ID | 710e9e6477270212136d6f2047fd15a033fa7d71.1660902588.git.rahul.singh@arm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xen/evtchn: implement static event channel signaling | expand |
On 19.08.2022 12:02, Rahul Singh wrote: > From: Julien Grall <jgrall@amazon.com> > > Since commit 01280dc19cf3 "evtchn: simplify port_is_valid()", the event > channels code assumes that all the buckets below d->valid_evtchns are > always allocated. > > This assumption hold in most of the situation because a guest is not > allowed to chose the port. Instead, it will be the first free from port > 0. > > When using Guest Transparent Migration and LiveUpdate, we will only > preserve ports that are currently in use. As a guest can open/close > event channels, this means the ports may be sparse. > > The existing implementation of evtchn_allocate_port() is not able to > deal with such situation and will end up to override bucket or/and leave > some bucket unallocated. The latter will result to a droplet crash if > the event channel belongs to an unallocated bucket. > > This can be solved by making sure that all the buckets below > d->valid_evtchns are allocated. There should be no impact for most of > the situation but LM/LU as only one bucket would be allocated. For > LM/LU, we may end up to allocate multiple buckets if ports in use are > sparse. > > A potential alternative is to check that the bucket is valid in > is_port_valid(). This should still possible to do it without taking > per-domain lock but will result a couple more of memory access. > > Signed-off-by: Julien Grall <jgrall@amazon.com> > Signed-off-by: Rahul Singh <rahul.singh@arm.com> While I'm mostly okay with the code, I think the description wants changing / amending as long as the features talked about above aren't anywhere near reaching upstream (afaict), to at least _also_ mention the goal you have with this. > Changes in v2: > - new patch in this version to fix the security issue I guess you mean "avoid", not "fix". > @@ -207,30 +216,35 @@ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) > } > else > { > - struct evtchn *chn; > - struct evtchn **grp; > - > - if ( !group_from_port(d, port) ) > + do > { > - grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); > - if ( !grp ) > - return -ENOMEM; > - group_from_port(d, port) = grp; > - } > + struct evtchn *chn; > + struct evtchn **grp; > + unsigned int alloc_port = read_atomic(&d->valid_evtchns); > > - chn = alloc_evtchn_bucket(d, port); > - if ( !chn ) > - return -ENOMEM; > - bucket_from_port(d, port) = chn; > + if ( !group_from_port(d, alloc_port) ) > + { > + grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); > + if ( !grp ) > + return -ENOMEM; > + group_from_port(d, alloc_port) = grp; > + } > > - /* > - * d->valid_evtchns is used to check whether the bucket can be > - * accessed without the per-domain lock. Therefore, > - * d->valid_evtchns should be seen *after* the new bucket has > - * been setup. > - */ > - smp_wmb(); > - write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET); > + chn = alloc_evtchn_bucket(d, alloc_port); > + if ( !chn ) > + return -ENOMEM; > + bucket_from_port(d, alloc_port) = chn; > + > + /* > + * d->valid_evtchns is used to check whether the bucket can be > + * accessed without the per-domain lock. Therefore, > + * d->valid_evtchns should be seen *after* the new bucket has > + * been setup. > + */ > + smp_wmb(); > + write_atomic(&d->valid_evtchns, > + d->valid_evtchns + EVTCHNS_PER_BUCKET); > + } while ( port >= read_atomic(&d->valid_evtchns) ); This updating of d->valid_evtchns looks a little inconsistent to me, wrt the uses of {read,write}_atomic(). To make obvious that there's an implicit expectation that no 2nd invocation of this function could race the updates, I'd recommend reading allocate_port ahead of the loop and then never again. Here you'd then do smp_wmb(); allocate_port += EVTCHNS_PER_BUCKET; write_atomic(&d->valid_evtchns, allocate_port); } while ( port >= allocate_port ); at the same time rendering the code (imo) a little more legible. Jan
Hi Jan, On 22/08/2022 14:08, Jan Beulich wrote: > On 19.08.2022 12:02, Rahul Singh wrote: >> From: Julien Grall <jgrall@amazon.com> >> >> Since commit 01280dc19cf3 "evtchn: simplify port_is_valid()", the event >> channels code assumes that all the buckets below d->valid_evtchns are >> always allocated. >> >> This assumption hold in most of the situation because a guest is not >> allowed to chose the port. Instead, it will be the first free from port >> 0. >> >> When using Guest Transparent Migration and LiveUpdate, we will only >> preserve ports that are currently in use. As a guest can open/close >> event channels, this means the ports may be sparse. >> >> The existing implementation of evtchn_allocate_port() is not able to >> deal with such situation and will end up to override bucket or/and leave >> some bucket unallocated. The latter will result to a droplet crash if >> the event channel belongs to an unallocated bucket. >> >> This can be solved by making sure that all the buckets below >> d->valid_evtchns are allocated. There should be no impact for most of >> the situation but LM/LU as only one bucket would be allocated. For >> LM/LU, we may end up to allocate multiple buckets if ports in use are >> sparse. >> >> A potential alternative is to check that the bucket is valid in >> is_port_valid(). This should still possible to do it without taking >> per-domain lock but will result a couple more of memory access. >> >> Signed-off-by: Julien Grall <jgrall@amazon.com> >> Signed-off-by: Rahul Singh <rahul.singh@arm.com> > > While I'm mostly okay with the code, I think the description wants > changing / amending as long as the features talked about above aren't > anywhere near reaching upstream (afaict), to at least _also_ mention > the goal you have with this. Correct, neither Guest Transparent Migration nor Live-Update is going to reach Xen in 4.17 :). Also, if we decide to continue to mention it, then we would need to s/Guest Transparent Migration/non-cooperative migration/ to match the name we decided to use in upstream (see docs/designs/non-cooperative-migration.md). > >> Changes in v2: >> - new patch in this version to fix the security issue > > I guess you mean "avoid", not "fix". > >> @@ -207,30 +216,35 @@ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) >> } >> else >> { >> - struct evtchn *chn; >> - struct evtchn **grp; >> - >> - if ( !group_from_port(d, port) ) >> + do >> { >> - grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); >> - if ( !grp ) >> - return -ENOMEM; >> - group_from_port(d, port) = grp; >> - } >> + struct evtchn *chn; >> + struct evtchn **grp; >> + unsigned int alloc_port = read_atomic(&d->valid_evtchns); >> >> - chn = alloc_evtchn_bucket(d, port); >> - if ( !chn ) >> - return -ENOMEM; >> - bucket_from_port(d, port) = chn; >> + if ( !group_from_port(d, alloc_port) ) >> + { >> + grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); >> + if ( !grp ) >> + return -ENOMEM; >> + group_from_port(d, alloc_port) = grp; >> + } >> >> - /* >> - * d->valid_evtchns is used to check whether the bucket can be >> - * accessed without the per-domain lock. Therefore, >> - * d->valid_evtchns should be seen *after* the new bucket has >> - * been setup. >> - */ >> - smp_wmb(); >> - write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET); >> + chn = alloc_evtchn_bucket(d, alloc_port); >> + if ( !chn ) >> + return -ENOMEM; >> + bucket_from_port(d, alloc_port) = chn; >> + >> + /* >> + * d->valid_evtchns is used to check whether the bucket can be >> + * accessed without the per-domain lock. Therefore, >> + * d->valid_evtchns should be seen *after* the new bucket has >> + * been setup. >> + */ >> + smp_wmb(); >> + write_atomic(&d->valid_evtchns, >> + d->valid_evtchns + EVTCHNS_PER_BUCKET); >> + } while ( port >= read_atomic(&d->valid_evtchns) ); > > This updating of d->valid_evtchns looks a little inconsistent to me, > wrt the uses of {read,write}_atomic(). To make obvious that there's > an implicit expectation that no 2nd invocation of this function > could race the updates, I'd recommend reading allocate_port ahead > of the loop and then never again. Here you'd then do > > smp_wmb(); > allocate_port += EVTCHNS_PER_BUCKET; > write_atomic(&d->valid_evtchns, allocate_port); > } while ( port >= allocate_port ); I know it is my code. But I agree with this comment :). > > at the same time rendering the code (imo) a little more legible. > > Jan
Hi Jan. > On 22 Aug 2022, at 2:08 pm, Jan Beulich <jbeulich@suse.com> wrote: > > On 19.08.2022 12:02, Rahul Singh wrote: >> From: Julien Grall <jgrall@amazon.com> >> >> Since commit 01280dc19cf3 "evtchn: simplify port_is_valid()", the event >> channels code assumes that all the buckets below d->valid_evtchns are >> always allocated. >> >> This assumption hold in most of the situation because a guest is not >> allowed to chose the port. Instead, it will be the first free from port >> 0. >> >> When using Guest Transparent Migration and LiveUpdate, we will only >> preserve ports that are currently in use. As a guest can open/close >> event channels, this means the ports may be sparse. >> >> The existing implementation of evtchn_allocate_port() is not able to >> deal with such situation and will end up to override bucket or/and leave >> some bucket unallocated. The latter will result to a droplet crash if >> the event channel belongs to an unallocated bucket. >> >> This can be solved by making sure that all the buckets below >> d->valid_evtchns are allocated. There should be no impact for most of >> the situation but LM/LU as only one bucket would be allocated. For >> LM/LU, we may end up to allocate multiple buckets if ports in use are >> sparse. >> >> A potential alternative is to check that the bucket is valid in >> is_port_valid(). This should still possible to do it without taking >> per-domain lock but will result a couple more of memory access. >> >> Signed-off-by: Julien Grall <jgrall@amazon.com> >> Signed-off-by: Rahul Singh <rahul.singh@arm.com> > > While I'm mostly okay with the code, I think the description wants > changing / amending as long as the features talked about above aren't > anywhere near reaching upstream (afaict), to at least _also_ mention > the goal you have with this. Ok. I will remove this and add that we need this patch to support static event channel. Something like: “ When static event channel support will be added for dom0less domains user can request to allocate the evtchn port numbers that are scattered in nature." > >> Changes in v2: >> - new patch in this version to fix the security issue > > I guess you mean "avoid", not "fix". Ack. > >> @@ -207,30 +216,35 @@ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) >> } >> else >> { >> - struct evtchn *chn; >> - struct evtchn **grp; >> - >> - if ( !group_from_port(d, port) ) >> + do >> { >> - grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); >> - if ( !grp ) >> - return -ENOMEM; >> - group_from_port(d, port) = grp; >> - } >> + struct evtchn *chn; >> + struct evtchn **grp; >> + unsigned int alloc_port = read_atomic(&d->valid_evtchns); >> >> - chn = alloc_evtchn_bucket(d, port); >> - if ( !chn ) >> - return -ENOMEM; >> - bucket_from_port(d, port) = chn; >> + if ( !group_from_port(d, alloc_port) ) >> + { >> + grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); >> + if ( !grp ) >> + return -ENOMEM; >> + group_from_port(d, alloc_port) = grp; >> + } >> >> - /* >> - * d->valid_evtchns is used to check whether the bucket can be >> - * accessed without the per-domain lock. Therefore, >> - * d->valid_evtchns should be seen *after* the new bucket has >> - * been setup. >> - */ >> - smp_wmb(); >> - write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET); >> + chn = alloc_evtchn_bucket(d, alloc_port); >> + if ( !chn ) >> + return -ENOMEM; >> + bucket_from_port(d, alloc_port) = chn; >> + >> + /* >> + * d->valid_evtchns is used to check whether the bucket can be >> + * accessed without the per-domain lock. Therefore, >> + * d->valid_evtchns should be seen *after* the new bucket has >> + * been setup. >> + */ >> + smp_wmb(); >> + write_atomic(&d->valid_evtchns, >> + d->valid_evtchns + EVTCHNS_PER_BUCKET); >> + } while ( port >= read_atomic(&d->valid_evtchns) ); > > This updating of d->valid_evtchns looks a little inconsistent to me, > wrt the uses of {read,write}_atomic(). To make obvious that there's > an implicit expectation that no 2nd invocation of this function > could race the updates, I'd recommend reading allocate_port ahead > of the loop and then never again. Here you'd then do > > smp_wmb(); > allocate_port += EVTCHNS_PER_BUCKET; > write_atomic(&d->valid_evtchns, allocate_port); > } while ( port >= allocate_port ); > > at the same time rendering the code (imo) a little more legible. > > Jan Ack. I will fix this as suggested by you I next version. Regards, Rahul
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index c2c6f8c151..dbe0a27311 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -193,6 +193,15 @@ static struct evtchn *alloc_evtchn_bucket(struct domain *d, unsigned int port) return NULL; } +/* + * Allocate a given port and ensure all the buckets up to that ports + * have been allocated. + * + * The last part is important because the rest of the event channel code + * relies on all the buckets up to d->valid_evtchns to be valid. However, + * event channels may be sparsed when restoring a domain during Guest + * Transparent Migration and Live Update. + */ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) { if ( port > d->max_evtchn_port || port >= max_evtchns(d) ) @@ -207,30 +216,35 @@ int evtchn_allocate_port(struct domain *d, evtchn_port_t port) } else { - struct evtchn *chn; - struct evtchn **grp; - - if ( !group_from_port(d, port) ) + do { - grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); - if ( !grp ) - return -ENOMEM; - group_from_port(d, port) = grp; - } + struct evtchn *chn; + struct evtchn **grp; + unsigned int alloc_port = read_atomic(&d->valid_evtchns); - chn = alloc_evtchn_bucket(d, port); - if ( !chn ) - return -ENOMEM; - bucket_from_port(d, port) = chn; + if ( !group_from_port(d, alloc_port) ) + { + grp = xzalloc_array(struct evtchn *, BUCKETS_PER_GROUP); + if ( !grp ) + return -ENOMEM; + group_from_port(d, alloc_port) = grp; + } - /* - * d->valid_evtchns is used to check whether the bucket can be - * accessed without the per-domain lock. Therefore, - * d->valid_evtchns should be seen *after* the new bucket has - * been setup. - */ - smp_wmb(); - write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET); + chn = alloc_evtchn_bucket(d, alloc_port); + if ( !chn ) + return -ENOMEM; + bucket_from_port(d, alloc_port) = chn; + + /* + * d->valid_evtchns is used to check whether the bucket can be + * accessed without the per-domain lock. Therefore, + * d->valid_evtchns should be seen *after* the new bucket has + * been setup. + */ + smp_wmb(); + write_atomic(&d->valid_evtchns, + d->valid_evtchns + EVTCHNS_PER_BUCKET); + } while ( port >= read_atomic(&d->valid_evtchns) ); } write_atomic(&d->active_evtchns, d->active_evtchns + 1);