Message ID | 5ea66595248c41a011ac465bfabd7a7a40dcd565.1655903088.git.rahul.singh@arm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xen/evtchn: implement static event channel signaling | expand |
Hi, On 22/06/2022 15:37, Rahul Singh wrote: > evtchn_alloc_unbound() always allocates the next available port. Static > event channel support for dom0less domains requires allocating a > specified port. > > Modify the evtchn_alloc_unbound() to accept the port number as an > argument and allocate the specified port if available. If the port > number argument is zero, the next available port will be allocated. I haven't yet fully reviewed this series. But I would like to point out that this opening a security hole (which I thought I had mention before) that could be exploited by a guest at runtime. You would need [1] or similar in order to fix the issue. I am wrote "similar" because the patch could potentially be a problem if you allow a guest to use FIFO (you may need to allocate a lot of memory to fill the hole). Cheers, [1] https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=commit;h=2d89486fcf11216331e58a21b367b8a9be1af725
On 22.06.2022 16:51, Julien Grall wrote: > On 22/06/2022 15:37, Rahul Singh wrote: >> evtchn_alloc_unbound() always allocates the next available port. Static >> event channel support for dom0less domains requires allocating a >> specified port. >> >> Modify the evtchn_alloc_unbound() to accept the port number as an >> argument and allocate the specified port if available. If the port >> number argument is zero, the next available port will be allocated. > > I haven't yet fully reviewed this series. But I would like to point out > that this opening a security hole (which I thought I had mention before) > that could be exploited by a guest at runtime. > > You would need [1] or similar in order to fix the issue. I am wrote > "similar" because the patch could potentially be a problem if you allow > a guest to use FIFO (you may need to allocate a lot of memory to fill > the hole). At least from an abstract pov this is an issue with the shim then as well, at the very least when shim's and the underlying Xen's alloc algorithms would differ. With the nature of the shim that's not a security concern, though. Jan
Hi Julien, > On 22 Jun 2022, at 3:51 pm, Julien Grall <julien@xen.org> wrote: > > Hi, > > On 22/06/2022 15:37, Rahul Singh wrote: >> evtchn_alloc_unbound() always allocates the next available port. Static >> event channel support for dom0less domains requires allocating a >> specified port. >> Modify the evtchn_alloc_unbound() to accept the port number as an >> argument and allocate the specified port if available. If the port >> number argument is zero, the next available port will be allocated. > > I haven't yet fully reviewed this series. But I would like to point out that this opening a security hole (which I thought I had mention before) that could be exploited by a guest at runtime. > > You would need [1] or similar in order to fix the issue. I am wrote "similar" because the patch could potentially be a problem if you allow a guest to use FIFO (you may need to allocate a lot of memory to fill the hole). > > Cheers, > > [1] https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=commit;h=2d89486fcf11216331e58a21b367b8a9be1af725 Thanks for sharing the patch. If you are okay I can use this patch in next version to fix the security hole. For the FIFO issue, we can introduce the new config option to restrict the maximum number of static port supported in Xen. We can check the user-defined static port when we parse the device tree and if a user-defined static port is greater than the maximum allowed static port will return an error to the user. In this way, we can avoid allocating a lot of memory to fill the hole. Let me know your view on this. config MAX_STATIC_PORT int "Maximum number of static ports” range 1 4095 help Controls the build-time maximum number of static port supported. Regards, Rahul
On 11/07/2022 17:08, Rahul Singh wrote: > Hi Julien, Hi Rahul, >> On 22 Jun 2022, at 3:51 pm, Julien Grall <julien@xen.org> wrote: >> >> Hi, >> >> On 22/06/2022 15:37, Rahul Singh wrote: >>> evtchn_alloc_unbound() always allocates the next available port. Static >>> event channel support for dom0less domains requires allocating a >>> specified port. >>> Modify the evtchn_alloc_unbound() to accept the port number as an >>> argument and allocate the specified port if available. If the port >>> number argument is zero, the next available port will be allocated. >> >> I haven't yet fully reviewed this series. But I would like to point out that this opening a security hole (which I thought I had mention before) that could be exploited by a guest at runtime. >> >> You would need [1] or similar in order to fix the issue. I am wrote "similar" because the patch could potentially be a problem if you allow a guest to use FIFO (you may need to allocate a lot of memory to fill the hole). >> >> Cheers, >> >> [1] https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=commit;h=2d89486fcf11216331e58a21b367b8a9be1af725 > > Thanks for sharing the patch. If you are okay I can use this patch in next version to fix the security hole. I am fine with that. > > For the FIFO issue, we can introduce the new config option to restrict the maximum number of static > port supported in Xen. We can check the user-defined static port when we parse the device tree and if > a user-defined static port is greater than the maximum allowed static port will return an error to the user. > In this way, we can avoid allocating a lot of memory to fill the hole. > > Let me know your view on this. > > config MAX_STATIC_PORT > int "Maximum number of static ports” > range 1 4095 > help > Controls the build-time maximum number of static port supported The problem is not exclusive to the static event channel. So I don't think this is right to introduce MAX_STATIC_PORT to mitigate the issue (even though this is the only user today). A few of alternative solutions: 1) Handle preemption in alloc_evtchn_bucket() 2) Allocate all the buckets when the domain is created (the max numbers event channel is known). We may need to think about preemption 3) Tweak is_port_valid() to check if the bucket is valid. This would introduce a couple of extra memory access (might be OK as the bucket would be accessed afterwards) and we would need to update some users. At the moment, 3) is appealing me the most. I would be interested to have an opionions from the other maintainers. Cheers,
On 12.07.2022 19:28, Julien Grall wrote: > On 11/07/2022 17:08, Rahul Singh wrote: >>> On 22 Jun 2022, at 3:51 pm, Julien Grall <julien@xen.org> wrote: >>> On 22/06/2022 15:37, Rahul Singh wrote: >>>> evtchn_alloc_unbound() always allocates the next available port. Static >>>> event channel support for dom0less domains requires allocating a >>>> specified port. >>>> Modify the evtchn_alloc_unbound() to accept the port number as an >>>> argument and allocate the specified port if available. If the port >>>> number argument is zero, the next available port will be allocated. >>> >>> I haven't yet fully reviewed this series. But I would like to point out that this opening a security hole (which I thought I had mention before) that could be exploited by a guest at runtime. >>> >>> You would need [1] or similar in order to fix the issue. I am wrote "similar" because the patch could potentially be a problem if you allow a guest to use FIFO (you may need to allocate a lot of memory to fill the hole). >>> >>> Cheers, >>> >>> [1] https://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=commit;h=2d89486fcf11216331e58a21b367b8a9be1af725 >> >> Thanks for sharing the patch. If you are okay I can use this patch in next version to fix the security hole. > > I am fine with that. > >> >> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >> In this way, we can avoid allocating a lot of memory to fill the hole. >> >> Let me know your view on this. >> >> config MAX_STATIC_PORT >> int "Maximum number of static ports” >> range 1 4095 >> help >> Controls the build-time maximum number of static port supported > > The problem is not exclusive to the static event channel. So I don't > think this is right to introduce MAX_STATIC_PORT to mitigate the issue > (even though this is the only user today). > > A few of alternative solutions: > 1) Handle preemption in alloc_evtchn_bucket() > 2) Allocate all the buckets when the domain is created (the max > numbers event channel is known). We may need to think about preemption > 3) Tweak is_port_valid() to check if the bucket is valid. This would > introduce a couple of extra memory access (might be OK as the bucket > would be accessed afterwards) and we would need to update some users. > > At the moment, 3) is appealing me the most. I would be interested to > have an opionions from the other maintainers. Fwiw of the named alternatives I would also prefer 3. Whether things really need generalizing at this point I'm not sure, though. Jan
Hi, On 13/07/2022 07:21, Jan Beulich wrote: >>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>> In this way, we can avoid allocating a lot of memory to fill the hole. >>> >>> Let me know your view on this. >>> >>> config MAX_STATIC_PORT >>> int "Maximum number of static ports” >>> range 1 4095 >>> help >>> Controls the build-time maximum number of static port supported >> >> The problem is not exclusive to the static event channel. So I don't >> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >> (even though this is the only user today). >> >> A few of alternative solutions: >> 1) Handle preemption in alloc_evtchn_bucket() >> 2) Allocate all the buckets when the domain is created (the max >> numbers event channel is known). We may need to think about preemption >> 3) Tweak is_port_valid() to check if the bucket is valid. This would >> introduce a couple of extra memory access (might be OK as the bucket >> would be accessed afterwards) and we would need to update some users. >> >> At the moment, 3) is appealing me the most. I would be interested to >> have an opionions from the other maintainers. > > Fwiw of the named alternatives I would also prefer 3. Whether things > really need generalizing at this point I'm not sure, though. I am worry that we may end up to forget that we had non-generaic way (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up to mistakenly introduce a security issue. However, my point was less about generalization but more about introducing CONFIG_MAX_STATIC_PORT. It seems strange to let the admin to decide the maximum number of static port supported. If we want to rely on non-generic mechanism, then I think the right way to go is to restrict max_evtchn_port for domUs to 4096 (it is -1 at the moment). If we want to give more flexibility then it should be a per-domain property in the DT. Cheers,
On 13.07.2022 11:35, Julien Grall wrote: > Hi, > > On 13/07/2022 07:21, Jan Beulich wrote: >>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>> >>>> Let me know your view on this. >>>> >>>> config MAX_STATIC_PORT >>>> int "Maximum number of static ports” >>>> range 1 4095 >>>> help >>>> Controls the build-time maximum number of static port supported >>> >>> The problem is not exclusive to the static event channel. So I don't >>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>> (even though this is the only user today). >>> >>> A few of alternative solutions: >>> 1) Handle preemption in alloc_evtchn_bucket() >>> 2) Allocate all the buckets when the domain is created (the max >>> numbers event channel is known). We may need to think about preemption >>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>> introduce a couple of extra memory access (might be OK as the bucket >>> would be accessed afterwards) and we would need to update some users. >>> >>> At the moment, 3) is appealing me the most. I would be interested to >>> have an opionions from the other maintainers. >> >> Fwiw of the named alternatives I would also prefer 3. Whether things >> really need generalizing at this point I'm not sure, though. > I am worry that we may end up to forget that we had non-generaic way > (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up > to mistakenly introduce a security issue. > > However, my point was less about generalization but more about > introducing CONFIG_MAX_STATIC_PORT. > > It seems strange to let the admin to decide the maximum number of static > port supported. Why (assuming s/admin/build admin/)? I view both a build time upper bound as well as (alternatively) a command line driven upper bound as reasonable (in the latter case there of course still would want/need to be an upper bound on what is legitimate to specify from the command line). Using static ports after all means there's a static largest port number. Determined across all intended uses of a build such an upper bound can be a feasible mechanism. Jan
> On 13 Jul 2022, at 10:53, Jan Beulich <jbeulich@suse.com> wrote: > > On 13.07.2022 11:35, Julien Grall wrote: >> Hi, >> >> On 13/07/2022 07:21, Jan Beulich wrote: >>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>> >>>>> Let me know your view on this. >>>>> >>>>> config MAX_STATIC_PORT >>>>> int "Maximum number of static ports” >>>>> range 1 4095 >>>>> help >>>>> Controls the build-time maximum number of static port supported >>>> >>>> The problem is not exclusive to the static event channel. So I don't >>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>> (even though this is the only user today). >>>> >>>> A few of alternative solutions: >>>> 1) Handle preemption in alloc_evtchn_bucket() >>>> 2) Allocate all the buckets when the domain is created (the max >>>> numbers event channel is known). We may need to think about preemption >>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>> introduce a couple of extra memory access (might be OK as the bucket >>>> would be accessed afterwards) and we would need to update some users. >>>> >>>> At the moment, 3) is appealing me the most. I would be interested to >>>> have an opionions from the other maintainers. >>> >>> Fwiw of the named alternatives I would also prefer 3. Whether things >>> really need generalizing at this point I'm not sure, though. >> I am worry that we may end up to forget that we had non-generaic way >> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >> to mistakenly introduce a security issue. >> >> However, my point was less about generalization but more about >> introducing CONFIG_MAX_STATIC_PORT. >> >> It seems strange to let the admin to decide the maximum number of static >> port supported. > > Why (assuming s/admin/build admin/)? I view both a build time upper bound > as well as (alternatively) a command line driven upper bound as reasonable > (in the latter case there of course still would want/need to be an upper > bound on what is legitimate to specify from the command line). Using > static ports after all means there's a static largest port number. > Determined across all intended uses of a build such an upper bound can be > a feasible mechanism. I agree with this. Right now all static port must be defined on boot so this is fully ok to have a maximum compilation value or something that can be customised on command line. In this case the admin must be fully aware of what he does from the start. Also whatever value we define as default will probably be far bigger than any real use case I think. Bertrand > > Jan
Hi Jan, On 13/07/2022 10:53, Jan Beulich wrote: > On 13.07.2022 11:35, Julien Grall wrote: >> Hi, >> >> On 13/07/2022 07:21, Jan Beulich wrote: >>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>> >>>>> Let me know your view on this. >>>>> >>>>> config MAX_STATIC_PORT >>>>> int "Maximum number of static ports” >>>>> range 1 4095 >>>>> help >>>>> Controls the build-time maximum number of static port supported >>>> >>>> The problem is not exclusive to the static event channel. So I don't >>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>> (even though this is the only user today). >>>> >>>> A few of alternative solutions: >>>> 1) Handle preemption in alloc_evtchn_bucket() >>>> 2) Allocate all the buckets when the domain is created (the max >>>> numbers event channel is known). We may need to think about preemption >>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>> introduce a couple of extra memory access (might be OK as the bucket >>>> would be accessed afterwards) and we would need to update some users. >>>> >>>> At the moment, 3) is appealing me the most. I would be interested to >>>> have an opionions from the other maintainers. >>> >>> Fwiw of the named alternatives I would also prefer 3. Whether things >>> really need generalizing at this point I'm not sure, though. >> I am worry that we may end up to forget that we had non-generaic way >> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >> to mistakenly introduce a security issue. >> >> However, my point was less about generalization but more about >> introducing CONFIG_MAX_STATIC_PORT. >> >> It seems strange to let the admin to decide the maximum number of static >> port supported. > > Why (assuming s/admin/build admin/)? I view both a build time upper bound > as well as (alternatively) a command line driven upper bound as reasonable > (in the latter case there of course still would want/need to be an upper > bound on what is legitimate to specify from the command line). Using > static ports after all means there's a static largest port number. > Determined across all intended uses of a build such an upper bound can be > a feasible mechanism. AFAICT, the limit is only here to mitigate the risk with the patch I proposed. With a proper fix, the limit would be articial and therefore it is not clear why the admin should be able to configure it (even temporarily). Instead, I think we want to have a limit that apply for both statically and dynamically allocated even channel. This is what d->max_evtchn_port is for. Cheers,
Hi, On 13/07/2022 11:03, Bertrand Marquis wrote: > > >> On 13 Jul 2022, at 10:53, Jan Beulich <jbeulich@suse.com> wrote: >> >> On 13.07.2022 11:35, Julien Grall wrote: >>> Hi, >>> >>> On 13/07/2022 07:21, Jan Beulich wrote: >>>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>>> >>>>>> Let me know your view on this. >>>>>> >>>>>> config MAX_STATIC_PORT >>>>>> int "Maximum number of static ports” >>>>>> range 1 4095 >>>>>> help >>>>>> Controls the build-time maximum number of static port supported >>>>> >>>>> The problem is not exclusive to the static event channel. So I don't >>>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>>> (even though this is the only user today). >>>>> >>>>> A few of alternative solutions: >>>>> 1) Handle preemption in alloc_evtchn_bucket() >>>>> 2) Allocate all the buckets when the domain is created (the max >>>>> numbers event channel is known). We may need to think about preemption >>>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>>> introduce a couple of extra memory access (might be OK as the bucket >>>>> would be accessed afterwards) and we would need to update some users. >>>>> >>>>> At the moment, 3) is appealing me the most. I would be interested to >>>>> have an opionions from the other maintainers. >>>> >>>> Fwiw of the named alternatives I would also prefer 3. Whether things >>>> really need generalizing at this point I'm not sure, though. >>> I am worry that we may end up to forget that we had non-generaic way >>> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >>> to mistakenly introduce a security issue. >>> >>> However, my point was less about generalization but more about >>> introducing CONFIG_MAX_STATIC_PORT. >>> >>> It seems strange to let the admin to decide the maximum number of static >>> port supported. >> >> Why (assuming s/admin/build admin/)? I view both a build time upper bound >> as well as (alternatively) a command line driven upper bound as reasonable >> (in the latter case there of course still would want/need to be an upper >> bound on what is legitimate to specify from the command line). Using >> static ports after all means there's a static largest port number. >> Determined across all intended uses of a build such an upper bound can be >> a feasible mechanism. > > I agree with this. Right now all static port must be defined on boot so this is fully > ok to have a maximum compilation value or something that can be customised > on command line. > In this case the admin must be fully aware of what he does from the start. Let me start that I think it would be a mistake to introduce a command line option (or Kconfig) that have high chance to disappear or would become moot. Today we already have a per-domain limit for the number of event channels (see d->max_evtchn_port). If we leave aside the issue in the event channel code, it is not clear to me why we need a separate limit for statically allocated event channel. Cheers,
On 13.07.2022 12:18, Julien Grall wrote: > On 13/07/2022 10:53, Jan Beulich wrote: >> On 13.07.2022 11:35, Julien Grall wrote: >>> On 13/07/2022 07:21, Jan Beulich wrote: >>>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>>> >>>>>> Let me know your view on this. >>>>>> >>>>>> config MAX_STATIC_PORT >>>>>> int "Maximum number of static ports” >>>>>> range 1 4095 >>>>>> help >>>>>> Controls the build-time maximum number of static port supported >>>>> >>>>> The problem is not exclusive to the static event channel. So I don't >>>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>>> (even though this is the only user today). >>>>> >>>>> A few of alternative solutions: >>>>> 1) Handle preemption in alloc_evtchn_bucket() >>>>> 2) Allocate all the buckets when the domain is created (the max >>>>> numbers event channel is known). We may need to think about preemption >>>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>>> introduce a couple of extra memory access (might be OK as the bucket >>>>> would be accessed afterwards) and we would need to update some users. >>>>> >>>>> At the moment, 3) is appealing me the most. I would be interested to >>>>> have an opionions from the other maintainers. >>>> >>>> Fwiw of the named alternatives I would also prefer 3. Whether things >>>> really need generalizing at this point I'm not sure, though. >>> I am worry that we may end up to forget that we had non-generaic way >>> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >>> to mistakenly introduce a security issue. >>> >>> However, my point was less about generalization but more about >>> introducing CONFIG_MAX_STATIC_PORT. >>> >>> It seems strange to let the admin to decide the maximum number of static >>> port supported. >> >> Why (assuming s/admin/build admin/)? I view both a build time upper bound >> as well as (alternatively) a command line driven upper bound as reasonable >> (in the latter case there of course still would want/need to be an upper >> bound on what is legitimate to specify from the command line). Using >> static ports after all means there's a static largest port number. >> Determined across all intended uses of a build such an upper bound can be >> a feasible mechanism. > > AFAICT, the limit is only here to mitigate the risk with the patch I > proposed. With a proper fix, the limit would be articial and therefore > it is not clear why the admin should be able to configure it (even > temporarily). The limit would be as artificial as other limits we enforce. I can't see why it would be wrong to have a more tight limit on static ports than on traditional ("dynamic") ones. Even if only to make sure so many dynamic ones are left. That said, ... > Instead, I think we want to have a limit that apply for both statically > and dynamically allocated even channel. This is what d->max_evtchn_port > is for. ... I also have no issue with following your way of thinking. I view both perspectives as valid ones to take. Jan
Hi Jan, On 13/07/2022 11:56, Jan Beulich wrote: > On 13.07.2022 12:18, Julien Grall wrote: >> On 13/07/2022 10:53, Jan Beulich wrote: >>> On 13.07.2022 11:35, Julien Grall wrote: >>>> On 13/07/2022 07:21, Jan Beulich wrote: >>>>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>>>> >>>>>>> Let me know your view on this. >>>>>>> >>>>>>> config MAX_STATIC_PORT >>>>>>> int "Maximum number of static ports” >>>>>>> range 1 4095 >>>>>>> help >>>>>>> Controls the build-time maximum number of static port supported >>>>>> >>>>>> The problem is not exclusive to the static event channel. So I don't >>>>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>>>> (even though this is the only user today). >>>>>> >>>>>> A few of alternative solutions: >>>>>> 1) Handle preemption in alloc_evtchn_bucket() >>>>>> 2) Allocate all the buckets when the domain is created (the max >>>>>> numbers event channel is known). We may need to think about preemption >>>>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>>>> introduce a couple of extra memory access (might be OK as the bucket >>>>>> would be accessed afterwards) and we would need to update some users. >>>>>> >>>>>> At the moment, 3) is appealing me the most. I would be interested to >>>>>> have an opionions from the other maintainers. >>>>> >>>>> Fwiw of the named alternatives I would also prefer 3. Whether things >>>>> really need generalizing at this point I'm not sure, though. >>>> I am worry that we may end up to forget that we had non-generaic way >>>> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >>>> to mistakenly introduce a security issue. >>>> >>>> However, my point was less about generalization but more about >>>> introducing CONFIG_MAX_STATIC_PORT. >>>> >>>> It seems strange to let the admin to decide the maximum number of static >>>> port supported. >>> >>> Why (assuming s/admin/build admin/)? I view both a build time upper bound >>> as well as (alternatively) a command line driven upper bound as reasonable >>> (in the latter case there of course still would want/need to be an upper >>> bound on what is legitimate to specify from the command line). Using >>> static ports after all means there's a static largest port number. >>> Determined across all intended uses of a build such an upper bound can be >>> a feasible mechanism. >> >> AFAICT, the limit is only here to mitigate the risk with the patch I >> proposed. With a proper fix, the limit would be articial and therefore >> it is not clear why the admin should be able to configure it (even >> temporarily). > > The limit would be as artificial as other limits we enforce. You are right. But we need to be cautious in adding new one that overlaps with existing one. > I can't > see why it would be wrong to have a more tight limit on static ports > than on traditional ("dynamic") ones. Even if only to make sure so > many dynamic ones are left. This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. Cheers,
Hi, > On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: > > Hi Jan, > > On 13/07/2022 11:56, Jan Beulich wrote: >> On 13.07.2022 12:18, Julien Grall wrote: >>> On 13/07/2022 10:53, Jan Beulich wrote: >>>> On 13.07.2022 11:35, Julien Grall wrote: >>>>> On 13/07/2022 07:21, Jan Beulich wrote: >>>>>>>> For the FIFO issue, we can introduce the new config option to restrict the maximum number of static >>>>>>>> port supported in Xen. We can check the user-defined static port when we parse the device tree and if >>>>>>>> a user-defined static port is greater than the maximum allowed static port will return an error to the user. >>>>>>>> In this way, we can avoid allocating a lot of memory to fill the hole. >>>>>>>> >>>>>>>> Let me know your view on this. >>>>>>>> >>>>>>>> config MAX_STATIC_PORT >>>>>>>> int "Maximum number of static ports” >>>>>>>> range 1 4095 >>>>>>>> help >>>>>>>> Controls the build-time maximum number of static port supported >>>>>>> >>>>>>> The problem is not exclusive to the static event channel. So I don't >>>>>>> think this is right to introduce MAX_STATIC_PORT to mitigate the issue >>>>>>> (even though this is the only user today). >>>>>>> >>>>>>> A few of alternative solutions: >>>>>>> 1) Handle preemption in alloc_evtchn_bucket() >>>>>>> 2) Allocate all the buckets when the domain is created (the max >>>>>>> numbers event channel is known). We may need to think about preemption >>>>>>> 3) Tweak is_port_valid() to check if the bucket is valid. This would >>>>>>> introduce a couple of extra memory access (might be OK as the bucket >>>>>>> would be accessed afterwards) and we would need to update some users. >>>>>>> >>>>>>> At the moment, 3) is appealing me the most. I would be interested to >>>>>>> have an opionions from the other maintainers. >>>>>> >>>>>> Fwiw of the named alternatives I would also prefer 3. Whether things >>>>>> really need generalizing at this point I'm not sure, though. >>>>> I am worry that we may end up to forget that we had non-generaic way >>>>> (e.g. MAX_STATIC_PORT) to prevent trigger the issue. So we could end up >>>>> to mistakenly introduce a security issue. >>>>> >>>>> However, my point was less about generalization but more about >>>>> introducing CONFIG_MAX_STATIC_PORT. >>>>> >>>>> It seems strange to let the admin to decide the maximum number of static >>>>> port supported. >>>> >>>> Why (assuming s/admin/build admin/)? I view both a build time upper bound >>>> as well as (alternatively) a command line driven upper bound as reasonable >>>> (in the latter case there of course still would want/need to be an upper >>>> bound on what is legitimate to specify from the command line). Using >>>> static ports after all means there's a static largest port number. >>>> Determined across all intended uses of a build such an upper bound can be >>>> a feasible mechanism. >>> >>> AFAICT, the limit is only here to mitigate the risk with the patch I >>> proposed. With a proper fix, the limit would be articial and therefore >>> it is not clear why the admin should be able to configure it (even >>> temporarily). >> The limit would be as artificial as other limits we enforce. > > You are right. But we need to be cautious in adding new one that overlaps with existing one. > >> I can't >> see why it would be wrong to have a more tight limit on static ports >> than on traditional ("dynamic") ones. Even if only to make sure so >> many dynamic ones are left. > > This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. Cheers Bertrand > > Cheers, > > -- > Julien Grall
On 13/07/2022 13:12, Bertrand Marquis wrote: >> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>> I can't >>> see why it would be wrong to have a more tight limit on static ports >>> than on traditional ("dynamic") ones. Even if only to make sure so >>> many dynamic ones are left. >> >> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. > > On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. > If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. It is not clear to me whether you are referring to a developper or admin here. On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. On the developper side, this could be resolved by better documentation in the code/interface. Cheers,
Hi , > On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: > > > > On 13/07/2022 13:12, Bertrand Marquis wrote: >>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>> I can't >>>> see why it would be wrong to have a more tight limit on static ports >>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>> many dynamic ones are left. >>> >>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. > > It is not clear to me whether you are referring to a developper or admin here. > > On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. > > On the developper side, this could be resolved by better documentation in the code/interface. > > Cheers, To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the max number of evtchn supported as suggested. diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 532e50e321..a8c5825a4f 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3422,7 +3422,7 @@ void __init create_domUs(void) struct xen_domctl_createdomain d_cfg = { .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE, .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, - .max_evtchn_port = -1, + .max_evtchn_port = MAX_EVTCHNS_PORT, .max_grant_frames = -1, .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), @@ -3582,7 +3582,7 @@ void __init create_dom0(void) struct domain *dom0; struct xen_domctl_createdomain dom0_cfg = { .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap, - .max_evtchn_port = -1, + .max_evtchn_port = MAX_EVTCHNS_PORT, .max_grant_frames = gnttab_dom0_frames(), .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index f08b07b8de..b1f95fbe1a 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -763,7 +763,7 @@ static struct domain *__init create_dom0(const module_t *image, { struct xen_domctl_createdomain dom0_cfg = { .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0, - .max_evtchn_port = -1, + .max_evtchn_port = MAX_EVTCHNS_PORT, .max_grant_frames = -1, .max_maptrack_frames = -1, .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version), diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index d6c029020f..783359f733 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -76,6 +76,8 @@ extern domid_t hardware_domid; /* Maximum number of event channels for any ABI. */ #define MAX_NR_EVTCHNS MAX(EVTCHN_2L_NR_CHANNELS, EVTCHN_FIFO_NR_CHANNELS) +#define MAX_EVTCHNS_PORT 4096 + #define EVTCHNS_PER_BUCKET (PAGE_SIZE / next_power_of_2(sizeof(struct evtchn))) #define EVTCHNS_PER_GROUP (BUCKETS_PER_GROUP * EVTCHNS_PER_BUCKET) #define NR_EVTCHN_GROUPS DIV_ROUND_UP(MAX_NR_EVTCHNS, EVTCHNS_PER_GROUP Regards, Rahul
Hi Rahul, On 20/07/2022 10:59, Rahul Singh wrote: >> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >> >> >> >> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>> I can't >>>>> see why it would be wrong to have a more tight limit on static ports >>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>> many dynamic ones are left. >>>> >>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >> >> It is not clear to me whether you are referring to a developper or admin here. >> >> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >> >> On the developper side, this could be resolved by better documentation in the code/interface. >> >> Cheers, > > To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the > max number of evtchn supported as suggested. I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). Cheers,
On 20.07.2022 13:16, Julien Grall wrote: > On 20/07/2022 10:59, Rahul Singh wrote: >>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>> I can't >>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>> many dynamic ones are left. >>>>> >>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>> >>> It is not clear to me whether you are referring to a developper or admin here. >>> >>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>> >>> On the developper side, this could be resolved by better documentation in the code/interface. >>> >>> Cheers, >> >> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >> max number of evtchn supported as suggested. > > I am fine if the limit for domU is fixed by Xen for now. However, for > dom0, 4096 is potentially too low if you have many PV drivers (each > backend will need a few event channels). So I don't think this wants to > be fixed by Xen. > > I am not entirely sure we want to limit the number of event channels for > dom0. But if you want to, then this will have to be done via a command > line option (or device-tree property). Imo there would need to be a very good reason to limit Dom0's port range. Jan
Hi Julien, > On 20 Jul 2022, at 12:16 pm, Julien Grall <julien@xen.org> wrote: > > Hi Rahul, > > On 20/07/2022 10:59, Rahul Singh wrote: >>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>> >>> >>> >>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>> I can't >>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>> many dynamic ones are left. >>>>> >>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>> >>> It is not clear to me whether you are referring to a developper or admin here. >>> >>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>> >>> On the developper side, this could be resolved by better documentation in the code/interface. >>> >>> Cheers, >> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >> max number of evtchn supported as suggested. > > I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. Agree. > > I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). We need to support the static event channel for dom0 also, in that case, we need to restrict the max number of evtchn for dom0 to mitigate the security issue. I am okay with introducing the new command line option "max_event_channels” for dom0 and setting the default value to 4096. If the admin set the value greater than 4096 (or what we agreed on) and static event channel support is enabled we will print the warning to the user related to fill the hole issue for FIFO ABI. Regards, Rahul
On 21/07/2022 13:50, Rahul Singh wrote: > Hi Julien, Hi Rahul, >> On 20 Jul 2022, at 12:16 pm, Julien Grall <julien@xen.org> wrote: >> >> Hi Rahul, >> >> On 20/07/2022 10:59, Rahul Singh wrote: >>>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>>> >>>> >>>> >>>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>>> I can't >>>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>>> many dynamic ones are left. >>>>>> >>>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>>> >>>> It is not clear to me whether you are referring to a developper or admin here. >>>> >>>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>>> >>>> On the developper side, this could be resolved by better documentation in the code/interface. >>>> >>>> Cheers, >>> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >>> max number of evtchn supported as suggested. >> >> I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. > Agree. >> >> I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). > > We need to support the static event channel for dom0 also, in that case, we need to restrict the max number of evtchn for dom0 to mitigate the security issue. It sounds like there are some misundertanding or I am missing some context. The static event channels will be allocated at boot, so the worse that can happen is it will be slower to boot. My point regarding fifo was more in the generic case of allowing the caller to select the port. This would be a concern in the context of non-cooperative live-migration. An easy way is to restrict the number of ports. For you, this is just an increase in boot time. Furthermore, there is an issue for dom0less domUs because we don't limit the number of port by default. This means that a domU can allocate a large amount of memory in Xen (we need some per-event channel state). Hence why I suggested to update max_evtchn_channel. > If the admin set the value greater than 4096 (or what we agreed on) and static event channel support is enabled we will print the warning to the user related to fill > the hole issue for FIFO ABI. See above. I don't see the need for a warning. The admin will notice that it is slower to boot. See above. Cheers, > > Regards, > Rahul >
Hi Julien, > On 21 Jul 2022, at 2:29 pm, Julien Grall <julien@xen.org> wrote: > > On 21/07/2022 13:50, Rahul Singh wrote: >> Hi Julien, > > Hi Rahul, > >>> On 20 Jul 2022, at 12:16 pm, Julien Grall <julien@xen.org> wrote: >>> >>> Hi Rahul, >>> >>> On 20/07/2022 10:59, Rahul Singh wrote: >>>>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>>>> >>>>> >>>>> >>>>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>>>> I can't >>>>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>>>> many dynamic ones are left. >>>>>>> >>>>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>>>> >>>>> It is not clear to me whether you are referring to a developper or admin here. >>>>> >>>>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>>>> >>>>> On the developper side, this could be resolved by better documentation in the code/interface. >>>>> >>>>> Cheers, >>>> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >>>> max number of evtchn supported as suggested. >>> >>> I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. >> Agree. >>> >>> I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). >> We need to support the static event channel for dom0 also, in that case, we need to restrict the max number of evtchn for dom0 to mitigate the security issue. > > It sounds like there are some misundertanding or I am missing some context. The static event channels will be allocated at boot, so the worse that can happen is it will be slower to boot. > > My point regarding fifo was more in the generic case of allowing the caller to select the port. This would be a concern in the context of non-cooperative live-migration. An easy way is to restrict the number of ports. For you, this is just an increase in boot time. > > Furthermore, there is an issue for dom0less domUs because we don't limit the number of port by default. This means that a domU can allocate a large amount of memory in Xen (we need some per-event channel state). Hence why I suggested to update max_evtchn_channel. Thanks for the clarification. > >> If the admin set the value greater than 4096 (or what we agreed on) and static event channel support is enabled we will print the warning to the user related to fill >> the hole issue for FIFO ABI. > > See above. I don't see the need for a warning. The admin will notice that it is slower to boot. Ok. I will not add the warning. Just to confirm again is that okay If I add new command line option "max_event_channels” in next version for dom0 to restrict the max number of evtchn. Regards, Rahul
On 21/07/2022 16:37, Rahul Singh wrote: > Hi Julien, Hi Rahul, >> On 21 Jul 2022, at 2:29 pm, Julien Grall <julien@xen.org> wrote: >> >> On 21/07/2022 13:50, Rahul Singh wrote: >>> Hi Julien, >> >> Hi Rahul, >> >>>> On 20 Jul 2022, at 12:16 pm, Julien Grall <julien@xen.org> wrote: >>>> >>>> Hi Rahul, >>>> >>>> On 20/07/2022 10:59, Rahul Singh wrote: >>>>>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>>>>> I can't >>>>>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>>>>> many dynamic ones are left. >>>>>>>> >>>>>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>>>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>>>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>>>>> >>>>>> It is not clear to me whether you are referring to a developper or admin here. >>>>>> >>>>>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>>>>> >>>>>> On the developper side, this could be resolved by better documentation in the code/interface. >>>>>> >>>>>> Cheers, >>>>> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >>>>> max number of evtchn supported as suggested. >>>> >>>> I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. >>> Agree. >>>> >>>> I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). >>> We need to support the static event channel for dom0 also, in that case, we need to restrict the max number of evtchn for dom0 to mitigate the security issue. >> >> It sounds like there are some misundertanding or I am missing some context. The static event channels will be allocated at boot, so the worse that can happen is it will be slower to boot. >> >> My point regarding fifo was more in the generic case of allowing the caller to select the port. This would be a concern in the context of non-cooperative live-migration. An easy way is to restrict the number of ports. For you, this is just an increase in boot time. >> >> Furthermore, there is an issue for dom0less domUs because we don't limit the number of port by default. This means that a domU can allocate a large amount of memory in Xen (we need some per-event channel state). Hence why I suggested to update max_evtchn_channel. > > Thanks for the clarification. >> >>> If the admin set the value greater than 4096 (or what we agreed on) and static event channel support is enabled we will print the warning to the user related to fill >>> the hole issue for FIFO ABI. >> >> See above. I don't see the need for a warning. The admin will notice that it is slower to boot. > > Ok. I will not add the warning. Just to confirm again is that okay If I add new command line option "max_event_channels” in > next version for dom0 to restrict the max number of evtchn. Personally I am fine with a command line option to *globally* restrict the number of events channel. But Jan seemed to have some reservation. Quoting what he wrote previously: "Imo there would need to be a very good reason to limit Dom0's port range. " Cheers,
Hi Julien > On 26 Jul 2022, at 6:37 pm, Julien Grall <julien@xen.org> wrote: > > > > On 21/07/2022 16:37, Rahul Singh wrote: >> Hi Julien, > > Hi Rahul, > >>> On 21 Jul 2022, at 2:29 pm, Julien Grall <julien@xen.org> wrote: >>> >>> On 21/07/2022 13:50, Rahul Singh wrote: >>>> Hi Julien, >>> >>> Hi Rahul, >>> >>>>> On 20 Jul 2022, at 12:16 pm, Julien Grall <julien@xen.org> wrote: >>>>> >>>>> Hi Rahul, >>>>> >>>>> On 20/07/2022 10:59, Rahul Singh wrote: >>>>>>> On 13 Jul 2022, at 1:29 pm, Julien Grall <julien@xen.org> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 13/07/2022 13:12, Bertrand Marquis wrote: >>>>>>>>> On 13 Jul 2022, at 12:31, Julien Grall <julien@xen.org> wrote: >>>>>>>>>> I can't >>>>>>>>>> see why it would be wrong to have a more tight limit on static ports >>>>>>>>>> than on traditional ("dynamic") ones. Even if only to make sure so >>>>>>>>>> many dynamic ones are left. >>>>>>>>> >>>>>>>>> This is similar to Xen forbidding to close a static port: it is not the hypervisor business to check that there are enough event channel ports freed for dynamic allocation. >>>>>>>> On other side we need to be cautious not to add too much complexity in the code by trying to make things always magically work. >>>>>>>> If you want Xen to be accessible to non expert by magically working all the time, there would be a lot of work to do. >>>>>>> >>>>>>> It is not clear to me whether you are referring to a developper or admin here. >>>>>>> >>>>>>> On the admin side, we need to make sure they have an easy way to configure event channels. One knob is always going to easier than two knobs. >>>>>>> >>>>>>> On the developper side, this could be resolved by better documentation in the code/interface. >>>>>>> >>>>>>> Cheers, >>>>>> To conclude the discussion, If everyone agree I will add the below patch or similar in the next version to restrict the >>>>>> max number of evtchn supported as suggested. >>>>> >>>>> I am fine if the limit for domU is fixed by Xen for now. However, for dom0, 4096 is potentially too low if you have many PV drivers (each backend will need a few event channels). So I don't think this wants to be fixed by Xen. >>>> Agree. >>>>> >>>>> I am not entirely sure we want to limit the number of event channels for dom0. But if you want to, then this will have to be done via a command line option (or device-tree property). >>>> We need to support the static event channel for dom0 also, in that case, we need to restrict the max number of evtchn for dom0 to mitigate the security issue. >>> >>> It sounds like there are some misundertanding or I am missing some context. The static event channels will be allocated at boot, so the worse that can happen is it will be slower to boot. >>> >>> My point regarding fifo was more in the generic case of allowing the caller to select the port. This would be a concern in the context of non-cooperative live-migration. An easy way is to restrict the number of ports. For you, this is just an increase in boot time. >>> >>> Furthermore, there is an issue for dom0less domUs because we don't limit the number of port by default. This means that a domU can allocate a large amount of memory in Xen (we need some per-event channel state). Hence why I suggested to update max_evtchn_channel. >> Thanks for the clarification. >>> >>>> If the admin set the value greater than 4096 (or what we agreed on) and static event channel support is enabled we will print the warning to the user related to fill >>>> the hole issue for FIFO ABI. >>> >>> See above. I don't see the need for a warning. The admin will notice that it is slower to boot. >> Ok. I will not add the warning. Just to confirm again is that okay If I add new command line option "max_event_channels” in >> next version for dom0 to restrict the max number of evtchn. > > Personally I am fine with a command line option to *globally* restrict the number of events channel. But Jan seemed to have some reservation. Quoting what he wrote previously: > > "Imo there would need to be a very good reason to limit Dom0's port range. As you mentioned, if we don’t restrict the number of events channel for the dom0 system will boot slower. This is a good reason to restrict the number of event channels for dom0. @Jan: I need your input on this before I send the next version of the patch series. Regards, Rahul
On 28.07.2022 17:37, Rahul Singh wrote: >> On 26 Jul 2022, at 6:37 pm, Julien Grall <julien@xen.org> wrote: >> On 21/07/2022 16:37, Rahul Singh wrote: >>> Ok. I will not add the warning. Just to confirm again is that okay If I add new command line option "max_event_channels” in >>> next version for dom0 to restrict the max number of evtchn. >> >> Personally I am fine with a command line option to *globally* restrict the number of events channel. But Jan seemed to have some reservation. Quoting what he wrote previously: >> >> "Imo there would need to be a very good reason to limit Dom0's port range. > > As you mentioned, if we don’t restrict the number of events channel for the dom0 system will boot slower. > This is a good reason to restrict the number of event channels for dom0. > > @Jan: I need your input on this before I send the next version of the patch series. I have to admit that it's not clear to me what you expect - I continue to think the way expressed before. Jan
Hi Rahul, On 28/07/2022 16:37, Rahul Singh wrote: > As you mentioned, if we don’t restrict the number of events channel for the dom0 system will boot slower. > This is a good reason to restrict the number of event channels for dom0. Let me start that I am still fine if you want to push for a new parameter (so long it is not Arm specific). However, I am afraid that I will not be able to argue for it because I don't see a strict need for it. Let me play the devil's advocate for a bit. AFAIU, you would like to introduce the new parameter just to tell the admin the boot is going to be slower if you use a event channel ID higher than N. To me this sounds like the same as if an admin decide to use 10GB rather than 1GB. There will be slow down. This slowness is only boot specific and will not vary. So one could argue this is easily noticeable and an admin can take remediation. Given Jan's objection, I would like to propose to document it in the bindings instead (a concerned admin will likely read it). Below a rough proposal for the documentation: "It is recommended to use low event channel ID." Would that be suitable for you? Cheers,
Hi Julien > On 28 Jul 2022, at 9:50 pm, Julien Grall <julien@xen.org> wrote: > > Hi Rahul, > > On 28/07/2022 16:37, Rahul Singh wrote: >> As you mentioned, if we don’t restrict the number of events channel for the dom0 system will boot slower. >> This is a good reason to restrict the number of event channels for dom0. > Let me start that I am still fine if you want to push for a new parameter (so long it is not Arm specific). However, I am afraid that I will not be able to argue for it because I don't see a strict need for it. > > Let me play the devil's advocate for a bit. AFAIU, you would like to introduce the new parameter just to tell the admin the boot is going to be slower if you use a event channel ID higher than N. > > To me this sounds like the same as if an admin decide to use 10GB rather than 1GB. There will be slow down. > > This slowness is only boot specific and will not vary. So one could argue this is easily noticeable and an admin can take remediation. > > Given Jan's objection, I would like to propose to document it in the bindings instead (a concerned admin will likely read it). Below a rough proposal for the documentation: > > "It is recommended to use low event channel ID." > > Would that be suitable for you? Yes, that will works for me. I will restrict the max event channel for domU only and also add the comment in "docs/misc/arm/device-tree/booting.txt” as suggested by you. Regards, Rahul
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 7ddd16c26d..5f97d9d181 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -3171,7 +3171,7 @@ static int __init alloc_xenstore_evtchn(struct domain *d) alloc.dom = d->domain_id; alloc.remote_dom = hardware_domain->domain_id; - rc = evtchn_alloc_unbound(&alloc); + rc = evtchn_alloc_unbound(&alloc, 0); if ( rc ) { printk("Failed allocating event channel for domain\n"); diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 8cbe9681da..80a88c1544 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -290,11 +290,15 @@ void evtchn_free(struct domain *d, struct evtchn *chn) xsm_evtchn_close_post(chn); } -int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) +/* + * If port is zero get the next free port and allocate. If port is non-zero + * allocate the specified port. + */ +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, evtchn_port_t port) { struct evtchn *chn; struct domain *d; - int port, rc; + int rc; domid_t dom = alloc->dom; d = rcu_lock_domain_by_any_id(dom); @@ -303,8 +307,20 @@ int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) spin_lock(&d->event_lock); - if ( (port = get_free_port(d)) < 0 ) - ERROR_EXIT_DOM(port, d); + if ( port != 0 ) + { + if ( (rc = evtchn_allocate_port(d, port)) != 0 ) + ERROR_EXIT_DOM(rc, d); + } + else + { + int alloc_port = get_free_port(d); + + if ( alloc_port < 0 ) + ERROR_EXIT_DOM(alloc_port, d); + port = alloc_port; + } + chn = evtchn_from_port(d, port); rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom); @@ -1206,7 +1222,7 @@ long cf_check do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) struct evtchn_alloc_unbound alloc_unbound; if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 ) return -EFAULT; - rc = evtchn_alloc_unbound(&alloc_unbound); + rc = evtchn_alloc_unbound(&alloc_unbound, 0); if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) ) rc = -EFAULT; /* Cleaning up here would be a mess! */ break; diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h index 61615ebbe3..48820e393e 100644 --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -72,7 +72,8 @@ void evtchn_free(struct domain *d, struct evtchn *chn); int evtchn_allocate_port(struct domain *d, unsigned int port); /* Allocate a new event channel */ -int __must_check evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc); +int __must_check evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, + evtchn_port_t port); /* Bind an event channel port to interdomain */ int __must_check evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind);
evtchn_alloc_unbound() always allocates the next available port. Static event channel support for dom0less domains requires allocating a specified port. Modify the evtchn_alloc_unbound() to accept the port number as an argument and allocate the specified port if available. If the port number argument is zero, the next available port will be allocated. Signed-off-by: Rahul Singh <rahul.singh@arm.com> --- xen/arch/arm/domain_build.c | 2 +- xen/common/event_channel.c | 26 +++++++++++++++++++++----- xen/include/xen/event.h | 3 ++- 3 files changed, 24 insertions(+), 7 deletions(-)