diff mbox

[3/3] tools: introduce parameter max_ranges.

Message ID 1453195678-25944-4-git-send-email-yu.c.zhang@linux.intel.com
State New, archived
Headers show

Commit Message

Yu Zhang Jan. 19, 2016, 9:27 a.m. UTC
A new parameter - max_ranges is added to set the upper limit of ranges
to be tracked inside one ioreq server rangeset.

Ioreq server uses a group of rangesets to track the I/O or memory
resources to be emulated. The default value of this limit is set to
256. Yet there are circumstances under which the limit should exceed
the default one. E.g. in XenGT, when tracking the per-process graphic
translation tables on intel broadwell platforms, the number of page
tables concerned will be several thousand(normally in this case, 8192
could be a big enough value). Users who set his item explicitly are
supposed to know the specific scenarios that necessitate this
configuration.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
 docs/man/xl.cfg.pod.5           | 17 +++++++++++++++++
 tools/libxl/libxl_dom.c         |  3 +++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        |  4 ++++
 xen/arch/x86/hvm/hvm.c          |  7 ++++++-
 xen/include/public/hvm/params.h |  5 ++++-
 6 files changed, 35 insertions(+), 2 deletions(-)

Comments

Wei Liu Jan. 19, 2016, 11:53 a.m. UTC | #1
On Tue, Jan 19, 2016 at 05:27:58PM +0800, Yu Zhang wrote:
> A new parameter - max_ranges is added to set the upper limit of ranges
> to be tracked inside one ioreq server rangeset.
> 
> Ioreq server uses a group of rangesets to track the I/O or memory
> resources to be emulated. The default value of this limit is set to
> 256. Yet there are circumstances under which the limit should exceed
> the default one. E.g. in XenGT, when tracking the per-process graphic
> translation tables on intel broadwell platforms, the number of page
> tables concerned will be several thousand(normally in this case, 8192
> could be a big enough value). Users who set his item explicitly are
> supposed to know the specific scenarios that necessitate this
> configuration.
> 
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> ---
>  docs/man/xl.cfg.pod.5           | 17 +++++++++++++++++
>  tools/libxl/libxl_dom.c         |  3 +++
>  tools/libxl/libxl_types.idl     |  1 +
>  tools/libxl/xl_cmdimpl.c        |  4 ++++
>  xen/arch/x86/hvm/hvm.c          |  7 ++++++-
>  xen/include/public/hvm/params.h |  5 ++++-
>  6 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 8899f75..562563d 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -962,6 +962,23 @@ FIFO-based event channel ABI support up to 131,071 event channels.
>  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
>  x86).
>  
> +=item B<max_ranges=N>
> +

This name is too generic. I don't have better suggestion though.

> +Limit the maximum ranges that can be tracked inside one ioreq server
> +rangeset.
> +
> +Ioreq server uses a group of rangesets to track the I/O or memory
> +resources to be emulated. By default, this item is not set. Not
> +configuring this item, or setting its value to 0 will result in the
> +upper limit set to its default value - 256. Yet there are circumstances

No need to say 256 because this might change in the future in
hypervisor.

> +under which the upper limit inside one rangeset should exceed the
> +default one. E.g. in XenGT, when tracking the per-process graphic
> +translation tables on intel broadwell platforms, the number of page
> +tables concerned will be several thousand(normally in this case, 8192
> +could be a big enough value). Users who set his item explicitly are
> +supposed to know the specific scenarios that necessitate this
> +configuration.
> +
>  =back
>  
>  =head2 Paravirtualised (PV) Guest Specific Options
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 47971a9..607b0c4 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -288,6 +288,9 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
>                      libxl_defbool_val(info->u.hvm.nested_hvm));
>      xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
>                      libxl_defbool_val(info->u.hvm.altp2m));
> +    if (info->u.hvm.max_ranges > 0)
> +        xc_hvm_param_set(handle, domid, HVM_PARAM_MAX_RANGES,
> +                        info->u.hvm.max_ranges);
>  }
>  
>  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9ad7eba..c936265 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -518,6 +518,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                         ("serial_list",      libxl_string_list),
>                                         ("rdm", libxl_rdm_reserve),
>                                         ("rdm_mem_boundary_memkb", MemKB),
> +                                       ("max_ranges", uint32),

This also needs a better name.

Another thing is that you need to define LIBXL_HAVE_XXX in libxl.h to
indicate we introduce a new field.

Wei.
Paul Durrant Jan. 19, 2016, 1:54 p.m. UTC | #2
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 19 January 2016 11:54
> To: Yu Zhang
> Cc: xen-devel@lists.xen.org; Kevin Tian; Keir (Xen.org); Stefano Stabellini;
> Andrew Cooper; Paul Durrant; zhiyuan.lv@intel.com; jbeulich@suse.com;
> Wei Liu
> Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> max_ranges.
> 
> On Tue, Jan 19, 2016 at 05:27:58PM +0800, Yu Zhang wrote:
> > A new parameter - max_ranges is added to set the upper limit of ranges
> > to be tracked inside one ioreq server rangeset.
> >
> > Ioreq server uses a group of rangesets to track the I/O or memory
> > resources to be emulated. The default value of this limit is set to
> > 256. Yet there are circumstances under which the limit should exceed
> > the default one. E.g. in XenGT, when tracking the per-process graphic
> > translation tables on intel broadwell platforms, the number of page
> > tables concerned will be several thousand(normally in this case, 8192
> > could be a big enough value). Users who set his item explicitly are
> > supposed to know the specific scenarios that necessitate this
> > configuration.
> >
> > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > ---
> >  docs/man/xl.cfg.pod.5           | 17 +++++++++++++++++
> >  tools/libxl/libxl_dom.c         |  3 +++
> >  tools/libxl/libxl_types.idl     |  1 +
> >  tools/libxl/xl_cmdimpl.c        |  4 ++++
> >  xen/arch/x86/hvm/hvm.c          |  7 ++++++-
> >  xen/include/public/hvm/params.h |  5 ++++-
> >  6 files changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> > index 8899f75..562563d 100644
> > --- a/docs/man/xl.cfg.pod.5
> > +++ b/docs/man/xl.cfg.pod.5
> > @@ -962,6 +962,23 @@ FIFO-based event channel ABI support up to
> 131,071 event channels.
> >  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
> >  x86).
> >
> > +=item B<max_ranges=N>
> > +
> 
> This name is too generic. I don't have better suggestion though.
> 

I think the increased limit for XenGT need only be applied to wp mem ranges so perhaps the parameter name could be 'max_wp_memory_ranges'?

  Paul

> > +Limit the maximum ranges that can be tracked inside one ioreq server
> > +rangeset.
> > +
> > +Ioreq server uses a group of rangesets to track the I/O or memory
> > +resources to be emulated. By default, this item is not set. Not
> > +configuring this item, or setting its value to 0 will result in the
> > +upper limit set to its default value - 256. Yet there are circumstances
> 
> No need to say 256 because this might change in the future in
> hypervisor.
> 
> > +under which the upper limit inside one rangeset should exceed the
> > +default one. E.g. in XenGT, when tracking the per-process graphic
> > +translation tables on intel broadwell platforms, the number of page
> > +tables concerned will be several thousand(normally in this case, 8192
> > +could be a big enough value). Users who set his item explicitly are
> > +supposed to know the specific scenarios that necessitate this
> > +configuration.
> > +
> >  =back
> >
> >  =head2 Paravirtualised (PV) Guest Specific Options
> > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> > index 47971a9..607b0c4 100644
> > --- a/tools/libxl/libxl_dom.c
> > +++ b/tools/libxl/libxl_dom.c
> > @@ -288,6 +288,9 @@ static void hvm_set_conf_params(xc_interface
> *handle, uint32_t domid,
> >                      libxl_defbool_val(info->u.hvm.nested_hvm));
> >      xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
> >                      libxl_defbool_val(info->u.hvm.altp2m));
> > +    if (info->u.hvm.max_ranges > 0)
> > +        xc_hvm_param_set(handle, domid, HVM_PARAM_MAX_RANGES,
> > +                        info->u.hvm.max_ranges);
> >  }
> >
> >  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index 9ad7eba..c936265 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -518,6 +518,7 @@ libxl_domain_build_info =
> Struct("domain_build_info",[
> >                                         ("serial_list",      libxl_string_list),
> >                                         ("rdm", libxl_rdm_reserve),
> >                                         ("rdm_mem_boundary_memkb", MemKB),
> > +                                       ("max_ranges", uint32),
> 
> This also needs a better name.
> 
> Another thing is that you need to define LIBXL_HAVE_XXX in libxl.h to
> indicate we introduce a new field.
> 
> Wei.
Wei Liu Jan. 19, 2016, 2:37 p.m. UTC | #3
On Tue, Jan 19, 2016 at 01:54:42PM +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > Sent: 19 January 2016 11:54
> > To: Yu Zhang
> > Cc: xen-devel@lists.xen.org; Kevin Tian; Keir (Xen.org); Stefano Stabellini;
> > Andrew Cooper; Paul Durrant; zhiyuan.lv@intel.com; jbeulich@suse.com;
> > Wei Liu
> > Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> > max_ranges.
> > 
> > On Tue, Jan 19, 2016 at 05:27:58PM +0800, Yu Zhang wrote:
> > > A new parameter - max_ranges is added to set the upper limit of ranges
> > > to be tracked inside one ioreq server rangeset.
> > >
> > > Ioreq server uses a group of rangesets to track the I/O or memory
> > > resources to be emulated. The default value of this limit is set to
> > > 256. Yet there are circumstances under which the limit should exceed
> > > the default one. E.g. in XenGT, when tracking the per-process graphic
> > > translation tables on intel broadwell platforms, the number of page
> > > tables concerned will be several thousand(normally in this case, 8192
> > > could be a big enough value). Users who set his item explicitly are
> > > supposed to know the specific scenarios that necessitate this
> > > configuration.
> > >
> > > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > > ---
> > >  docs/man/xl.cfg.pod.5           | 17 +++++++++++++++++
> > >  tools/libxl/libxl_dom.c         |  3 +++
> > >  tools/libxl/libxl_types.idl     |  1 +
> > >  tools/libxl/xl_cmdimpl.c        |  4 ++++
> > >  xen/arch/x86/hvm/hvm.c          |  7 ++++++-
> > >  xen/include/public/hvm/params.h |  5 ++++-
> > >  6 files changed, 35 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> > > index 8899f75..562563d 100644
> > > --- a/docs/man/xl.cfg.pod.5
> > > +++ b/docs/man/xl.cfg.pod.5
> > > @@ -962,6 +962,23 @@ FIFO-based event channel ABI support up to
> > 131,071 event channels.
> > >  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
> > >  x86).
> > >
> > > +=item B<max_ranges=N>
> > > +
> > 
> > This name is too generic. I don't have better suggestion though.
> > 
> 
> I think the increased limit for XenGT need only be applied to wp mem ranges so perhaps the parameter name could be 'max_wp_memory_ranges'?
> 

What does "WP" mean? "Write Protected"?

Is this parameter closely related to IOREQ server? Should it contain
"ioreq" somehow?

Wei.
Paul Durrant Jan. 19, 2016, 2:47 p.m. UTC | #4
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 19 January 2016 14:37
> To: Paul Durrant
> Cc: Wei Liu; Yu Zhang; xen-devel@lists.xen.org; Kevin Tian; Keir (Xen.org);
> Stefano Stabellini; Andrew Cooper; zhiyuan.lv@intel.com;
> jbeulich@suse.com
> Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> max_ranges.
> 
> On Tue, Jan 19, 2016 at 01:54:42PM +0000, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > Sent: 19 January 2016 11:54
> > > To: Yu Zhang
> > > Cc: xen-devel@lists.xen.org; Kevin Tian; Keir (Xen.org); Stefano Stabellini;
> > > Andrew Cooper; Paul Durrant; zhiyuan.lv@intel.com; jbeulich@suse.com;
> > > Wei Liu
> > > Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> > > max_ranges.
> > >
> > > On Tue, Jan 19, 2016 at 05:27:58PM +0800, Yu Zhang wrote:
> > > > A new parameter - max_ranges is added to set the upper limit of
> ranges
> > > > to be tracked inside one ioreq server rangeset.
> > > >
> > > > Ioreq server uses a group of rangesets to track the I/O or memory
> > > > resources to be emulated. The default value of this limit is set to
> > > > 256. Yet there are circumstances under which the limit should exceed
> > > > the default one. E.g. in XenGT, when tracking the per-process graphic
> > > > translation tables on intel broadwell platforms, the number of page
> > > > tables concerned will be several thousand(normally in this case, 8192
> > > > could be a big enough value). Users who set his item explicitly are
> > > > supposed to know the specific scenarios that necessitate this
> > > > configuration.
> > > >
> > > > Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> > > > ---
> > > >  docs/man/xl.cfg.pod.5           | 17 +++++++++++++++++
> > > >  tools/libxl/libxl_dom.c         |  3 +++
> > > >  tools/libxl/libxl_types.idl     |  1 +
> > > >  tools/libxl/xl_cmdimpl.c        |  4 ++++
> > > >  xen/arch/x86/hvm/hvm.c          |  7 ++++++-
> > > >  xen/include/public/hvm/params.h |  5 ++++-
> > > >  6 files changed, 35 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> > > > index 8899f75..562563d 100644
> > > > --- a/docs/man/xl.cfg.pod.5
> > > > +++ b/docs/man/xl.cfg.pod.5
> > > > @@ -962,6 +962,23 @@ FIFO-based event channel ABI support up to
> > > 131,071 event channels.
> > > >  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
> > > >  x86).
> > > >
> > > > +=item B<max_ranges=N>
> > > > +
> > >
> > > This name is too generic. I don't have better suggestion though.
> > >
> >
> > I think the increased limit for XenGT need only be applied to wp mem
> ranges so perhaps the parameter name could be
> 'max_wp_memory_ranges'?
> >
> 
> What does "WP" mean? "Write Protected"?
> 

Yes.

> Is this parameter closely related to IOREQ server? Should it contain
> "ioreq" somehow?
> 

It is closely related but ioreq server is an implementation detail so do we want to expose it as a tunable? The concept we need to capture is that the toolstack can tune the limit of the maximum number of pages in the VM that can be set such that writes are emulated (but reads are as for normal ram). Or I guess we could get very specific and call it something like 'max_gtt_shadows'?

> Wei.
Wei Liu Jan. 19, 2016, 3:04 p.m. UTC | #5
On Tue, Jan 19, 2016 at 02:47:40PM +0000, Paul Durrant wrote:
[...]
> > ranges so perhaps the parameter name could be
> > 'max_wp_memory_ranges'?
> > >
> > 
> > What does "WP" mean? "Write Protected"?
> > 
> 
> Yes.
> 
> > Is this parameter closely related to IOREQ server? Should it contain
> > "ioreq" somehow?
> > 
> 
> It is closely related but ioreq server is an implementation detail so
> do we want to expose it as a tunable? The concept we need to capture
> is that the toolstack can tune the limit of the maximum number of
> pages in the VM that can be set such that writes are emulated (but
> reads are as for normal ram). Or I guess we could get very specific
> and call it something like 'max_gtt_shadows'?

I would prefer generic concept in this case ("wp"). Let's wait a bit for
other people to voice their opinion.

Whichever one we pick it the meaning of the acronym needs to be clearly
documented...

Wei.

> 
> > Wei.
Ian Campbell Jan. 19, 2016, 3:18 p.m. UTC | #6
On Tue, 2016-01-19 at 15:04 +0000, Wei Liu wrote:

This patch doesn't seem to have been CCd to the tools maintainers, adding
Ian too, I think everyone else was picked up along the way.

Please use ./scripts/get_maintainers.pl in the future.

> On Tue, Jan 19, 2016 at 02:47:40PM +0000, Paul Durrant wrote:
> [...]
> > > ranges so perhaps the parameter name could be
> > > 'max_wp_memory_ranges'?
> > > > 
> > > 
> > > What does "WP" mean? "Write Protected"?
> > > 
> > 
> > Yes.
> > 
> > > Is this parameter closely related to IOREQ server? Should it contain
> > > "ioreq" somehow?
> > > 
> > 
> > It is closely related but ioreq server is an implementation detail so
> > do we want to expose it as a tunable? The concept we need to capture
> > is that the toolstack can tune the limit of the maximum number of
> > pages in the VM that can be set such that writes are emulated (but
> > reads are as for normal ram). Or I guess we could get very specific
> > and call it something like 'max_gtt_shadows'?
> 
> I would prefer generic concept in this case ("wp"). Let's wait a bit for
> other people to voice their opinion.
> 
> Whichever one we pick it the meaning of the acronym needs to be clearly
> documented...

I've got no ideas for a better name, "max_ranges" is clearly too generic
though.

One thought -- does XenGT require some other configuration option to enable
it or maybe a privilege which the target domain must necessarily have?
Could we use something like one of those to cause the t/stack to just DTRT
without the user having to micromanage the amount of pages which are
allowed to have this property?

> 
> Wei.
> 
> > 
> > > Wei.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Tian, Kevin Jan. 20, 2016, 3:14 a.m. UTC | #7
> From: Ian Campbell [mailto:ian.campbell@citrix.com]
> Sent: Tuesday, January 19, 2016 11:19 PM
> 
> On Tue, 2016-01-19 at 15:04 +0000, Wei Liu wrote:
> 
> This patch doesn't seem to have been CCd to the tools maintainers, adding
> Ian too, I think everyone else was picked up along the way.
> 
> Please use ./scripts/get_maintainers.pl in the future.
> 
> > On Tue, Jan 19, 2016 at 02:47:40PM +0000, Paul Durrant wrote:
> > [...]
> > > > ranges so perhaps the parameter name could be
> > > > 'max_wp_memory_ranges'?
> > > > >
> > > >
> > > > What does "WP" mean? "Write Protected"?
> > > >
> > >
> > > Yes.
> > >
> > > > Is this parameter closely related to IOREQ server? Should it contain
> > > > "ioreq" somehow?
> > > >
> > >
> > > It is closely related but ioreq server is an implementation detail so
> > > do we want to expose it as a tunable? The concept we need to capture
> > > is that the toolstack can tune the limit of the maximum number of
> > > pages in the VM that can be set such that writes are emulated (but
> > > reads are as for normal ram). Or I guess we could get very specific
> > > and call it something like 'max_gtt_shadows'?
> >
> > I would prefer generic concept in this case ("wp"). Let's wait a bit for
> > other people to voice their opinion.
> >
> > Whichever one we pick it the meaning of the acronym needs to be clearly
> > documented...
> 
> I've got no ideas for a better name, "max_ranges" is clearly too generic
> though.
> 
> One thought -- does XenGT require some other configuration option to enable
> it or maybe a privilege which the target domain must necessarily have?
> Could we use something like one of those to cause the t/stack to just DTRT
> without the user having to micromanage the amount of pages which are
> allowed to have this property?
> 

Using "wp" is clear to me.

As a feature this write-protection has nothing to be GPU virtualization specific.
In the future the same mediated pass-through idea used in XenGT may be
used on other I/O devices which need to shadow some structure w/ requirement
to write-protect guest memory. So it's not good to tie this to either XenGT
or GTT.

Thanks
Kevin
Yu Zhang Jan. 20, 2016, 3:33 a.m. UTC | #8
On 1/20/2016 11:14 AM, Tian, Kevin wrote:
>> From: Ian Campbell [mailto:ian.campbell@citrix.com]
>> Sent: Tuesday, January 19, 2016 11:19 PM
>>
>> On Tue, 2016-01-19 at 15:04 +0000, Wei Liu wrote:
>>
>> This patch doesn't seem to have been CCd to the tools maintainers, adding
>> Ian too, I think everyone else was picked up along the way.
>>
>> Please use ./scripts/get_maintainers.pl in the future.
>>
>>> On Tue, Jan 19, 2016 at 02:47:40PM +0000, Paul Durrant wrote:
>>> [...]
>>>>> ranges so perhaps the parameter name could be
>>>>> 'max_wp_memory_ranges'?
>>>>>>
>>>>>
>>>>> What does "WP" mean? "Write Protected"?
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>> Is this parameter closely related to IOREQ server? Should it contain
>>>>> "ioreq" somehow?
>>>>>
>>>>
>>>> It is closely related but ioreq server is an implementation detail so
>>>> do we want to expose it as a tunable? The concept we need to capture
>>>> is that the toolstack can tune the limit of the maximum number of
>>>> pages in the VM that can be set such that writes are emulated (but
>>>> reads are as for normal ram). Or I guess we could get very specific
>>>> and call it something like 'max_gtt_shadows'?
>>>
>>> I would prefer generic concept in this case ("wp"). Let's wait a bit for
>>> other people to voice their opinion.
>>>
>>> Whichever one we pick it the meaning of the acronym needs to be clearly
>>> documented...
>>
>> I've got no ideas for a better name, "max_ranges" is clearly too generic
>> though.
>>
>> One thought -- does XenGT require some other configuration option to enable
>> it or maybe a privilege which the target domain must necessarily have?
>> Could we use something like one of those to cause the t/stack to just DTRT
>> without the user having to micromanage the amount of pages which are
>> allowed to have this property?
>>
>
> Using "wp" is clear to me.
>
Thank you all. :)
So how about "max_wp_ram_ranges"? And the "wp" shall be well explained 
in documentation.

> As a feature this write-protection has nothing to be GPU virtualization specific.
> In the future the same mediated pass-through idea used in XenGT may be
> used on other I/O devices which need to shadow some structure w/ requirement
> to write-protect guest memory. So it's not good to tie this to either XenGT
> or GTT.
>
Thank you, Kevin.
Well, if this parameter is not supposed to be xengt specific, we do not 
need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1". 
Hence the user will have to configure the max_wp_ram_ranges himself,
right?

B.R.
Yu

> Thanks
> Kevin
>
Tian, Kevin Jan. 20, 2016, 3:58 a.m. UTC | #9
> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: Wednesday, January 20, 2016 11:33 AM
> > As a feature this write-protection has nothing to be GPU virtualization specific.
> > In the future the same mediated pass-through idea used in XenGT may be
> > used on other I/O devices which need to shadow some structure w/ requirement
> > to write-protect guest memory. So it's not good to tie this to either XenGT
> > or GTT.
> >
> Thank you, Kevin.
> Well, if this parameter is not supposed to be xengt specific, we do not
> need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> Hence the user will have to configure the max_wp_ram_ranges himself,
> right?
> 

Not always. The option can be configured manually by the user, or 
automatically set in the code when "vgt=1" is recognized.

Thanks
Kevin
Yu Zhang Jan. 20, 2016, 5:02 a.m. UTC | #10
On 1/20/2016 11:58 AM, Tian, Kevin wrote:
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: Wednesday, January 20, 2016 11:33 AM
>>> As a feature this write-protection has nothing to be GPU virtualization specific.
>>> In the future the same mediated pass-through idea used in XenGT may be
>>> used on other I/O devices which need to shadow some structure w/ requirement
>>> to write-protect guest memory. So it's not good to tie this to either XenGT
>>> or GTT.
>>>
>> Thank you, Kevin.
>> Well, if this parameter is not supposed to be xengt specific, we do not
>> need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
>> Hence the user will have to configure the max_wp_ram_ranges himself,
>> right?
>>
>
> Not always. The option can be configured manually by the user, or
> automatically set in the code when "vgt=1" is recognized.

OK. That sounds more reasonable. :)
To give a summary, I'll do the following changes in next version:

1> rename this new parameter to "max_wp_ram_ranges", then use this
parameter as the wp-ram rangeset limit, for the I/O rangeset, keep
MAX_NR_IO_RANGES as its limit;
2> clear the documentation part;
3> define a LIBXL_HAVE_XXX in libxl.h to indicate a new field in the
build info;
4> We do not introduce the xengt flag by now, and will add code to
automatically set the "max_wp_ram_ranges" after this flag is accepted
in the future.

Does anyone have more suggestions? :)

B.R.
Yu
>
> Thanks
> Kevin
>
Ian Campbell Jan. 20, 2016, 10:16 a.m. UTC | #11
On Wed, 2016-01-20 at 03:58 +0000, Tian, Kevin wrote:
> > From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> > Sent: Wednesday, January 20, 2016 11:33 AM
> > > As a feature this write-protection has nothing to be GPU
> > > virtualization specific.
> > > In the future the same mediated pass-through idea used in XenGT may
> > > be
> > > used on other I/O devices which need to shadow some structure w/
> > > requirement
> > > to write-protect guest memory. So it's not good to tie this to either
> > > XenGT
> > > or GTT.
> > > 
> > Thank you, Kevin.
> > Well, if this parameter is not supposed to be xengt specific, we do not
> > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> > Hence the user will have to configure the max_wp_ram_ranges himself,
> > right?
> > 
> 
> Not always. The option can be configured manually by the user, or 
> automatically set in the code when "vgt=1" is recognized.

Is the latter approach not always sufficient? IOW, if it can be done
automatically, why would the user need to tweak it?

Ian.
Wei Liu Jan. 20, 2016, 10:17 a.m. UTC | #12
On Wed, Jan 20, 2016 at 01:02:39PM +0800, Yu, Zhang wrote:
> 
> 
> On 1/20/2016 11:58 AM, Tian, Kevin wrote:
> >>From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> >>Sent: Wednesday, January 20, 2016 11:33 AM
> >>>As a feature this write-protection has nothing to be GPU virtualization specific.
> >>>In the future the same mediated pass-through idea used in XenGT may be
> >>>used on other I/O devices which need to shadow some structure w/ requirement
> >>>to write-protect guest memory. So it's not good to tie this to either XenGT
> >>>or GTT.
> >>>
> >>Thank you, Kevin.
> >>Well, if this parameter is not supposed to be xengt specific, we do not
> >>need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> >>Hence the user will have to configure the max_wp_ram_ranges himself,
> >>right?
> >>
> >
> >Not always. The option can be configured manually by the user, or
> >automatically set in the code when "vgt=1" is recognized.
> 
> OK. That sounds more reasonable. :)
> To give a summary, I'll do the following changes in next version:
> 
> 1> rename this new parameter to "max_wp_ram_ranges", then use this
> parameter as the wp-ram rangeset limit, for the I/O rangeset, keep
> MAX_NR_IO_RANGES as its limit;
> 2> clear the documentation part;
> 3> define a LIBXL_HAVE_XXX in libxl.h to indicate a new field in the
> build info;
> 4> We do not introduce the xengt flag by now, and will add code to
> automatically set the "max_wp_ram_ranges" after this flag is accepted
> in the future.
> 
> Does anyone have more suggestions? :)
> 

Ian posted an enquiry earlier:

"Could we use something like one of those to cause the t/stack to just
DTRT without the user having to micromanage the amount of pages which
are allowed to have this property?"

Is that possible?

Wei.


> B.R.
> Yu
> >
> >Thanks
> >Kevin
> >
Paul Durrant Jan. 20, 2016, 10:18 a.m. UTC | #13
> -----Original Message-----
> From: Ian Campbell [mailto:ian.campbell@citrix.com]
> Sent: 20 January 2016 10:16
> To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant
> Cc: Keir (Xen.org); jbeulich@suse.com; Andrew Cooper; xen-
> devel@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini
> Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> max_ranges.
> 
> On Wed, 2016-01-20 at 03:58 +0000, Tian, Kevin wrote:
> > > From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
> > > Sent: Wednesday, January 20, 2016 11:33 AM
> > > > As a feature this write-protection has nothing to be GPU
> > > > virtualization specific.
> > > > In the future the same mediated pass-through idea used in XenGT may
> > > > be
> > > > used on other I/O devices which need to shadow some structure w/
> > > > requirement
> > > > to write-protect guest memory. So it's not good to tie this to either
> > > > XenGT
> > > > or GTT.
> > > >
> > > Thank you, Kevin.
> > > Well, if this parameter is not supposed to be xengt specific, we do not
> > > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> > > Hence the user will have to configure the max_wp_ram_ranges himself,
> > > right?
> > >
> >
> > Not always. The option can be configured manually by the user, or
> > automatically set in the code when "vgt=1" is recognized.
> 
> Is the latter approach not always sufficient? IOW, if it can be done
> automatically, why would the user need to tweak it?
>

I think latter is sufficient for now. We always have the option of adding a specific wp_ram_ranges parameter in future if there is a need.

  Paul
 
> Ian.
Yu Zhang Jan. 20, 2016, 11:13 a.m. UTC | #14
On 1/20/2016 6:18 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Ian Campbell [mailto:ian.campbell@citrix.com]
>> Sent: 20 January 2016 10:16
>> To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant
>> Cc: Keir (Xen.org); jbeulich@suse.com; Andrew Cooper; xen-
>> devel@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini
>> Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
>> max_ranges.
>>
>> On Wed, 2016-01-20 at 03:58 +0000, Tian, Kevin wrote:
>>>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>> Sent: Wednesday, January 20, 2016 11:33 AM
>>>>> As a feature this write-protection has nothing to be GPU
>>>>> virtualization specific.
>>>>> In the future the same mediated pass-through idea used in XenGT may
>>>>> be
>>>>> used on other I/O devices which need to shadow some structure w/
>>>>> requirement
>>>>> to write-protect guest memory. So it's not good to tie this to either
>>>>> XenGT
>>>>> or GTT.
>>>>>
>>>> Thank you, Kevin.
>>>> Well, if this parameter is not supposed to be xengt specific, we do not
>>>> need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
>>>> Hence the user will have to configure the max_wp_ram_ranges himself,
>>>> right?
>>>>
>>>
>>> Not always. The option can be configured manually by the user, or
>>> automatically set in the code when "vgt=1" is recognized.
>>
>> Is the latter approach not always sufficient? IOW, if it can be done
>> automatically, why would the user need to tweak it?
>>
>
> I think latter is sufficient for now. We always have the option of adding a specific wp_ram_ranges parameter in future if there is a need

Thank you all for your reply.
Well, I believe the latter option is only sufficient for most
usage models on BDW due to rangeset's ability to merge continuous
pages into one range, but there might be some extreme cases, e.g.
too many graphic related applications in one VM, which create a
great deal of per-process graphic translation tables. And also,
future cpu platforms might provide even more PPGGTs. So, I suggest
we use this max_wp_ram_ranges, and give the control to the system
administrator. Besides, like Kevin said, XenGT's mediated pass-thru
idea can also be adopted to other devices, and this parameter may
also help.
Also, we have plans to upstream the tool-stack changes later this
year. If this max_wp_ram_ranges is not convenient, we can introduce
a method to automatically set its default value.

B.R.
Yu
>
>    Paul
>
>> Ian.
diff mbox

Patch

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 8899f75..562563d 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -962,6 +962,23 @@  FIFO-based event channel ABI support up to 131,071 event channels.
 Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
 x86).
 
+=item B<max_ranges=N>
+
+Limit the maximum ranges that can be tracked inside one ioreq server
+rangeset.
+
+Ioreq server uses a group of rangesets to track the I/O or memory
+resources to be emulated. By default, this item is not set. Not
+configuring this item, or setting its value to 0 will result in the
+upper limit set to its default value - 256. Yet there are circumstances
+under which the upper limit inside one rangeset should exceed the
+default one. E.g. in XenGT, when tracking the per-process graphic
+translation tables on intel broadwell platforms, the number of page
+tables concerned will be several thousand(normally in this case, 8192
+could be a big enough value). Users who set his item explicitly are
+supposed to know the specific scenarios that necessitate this
+configuration.
+
 =back
 
 =head2 Paravirtualised (PV) Guest Specific Options
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 47971a9..607b0c4 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -288,6 +288,9 @@  static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
     xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
                     libxl_defbool_val(info->u.hvm.altp2m));
+    if (info->u.hvm.max_ranges > 0)
+        xc_hvm_param_set(handle, domid, HVM_PARAM_MAX_RANGES,
+                        info->u.hvm.max_ranges);
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9ad7eba..c936265 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -518,6 +518,7 @@  libxl_domain_build_info = Struct("domain_build_info",[
                                        ("serial_list",      libxl_string_list),
                                        ("rdm", libxl_rdm_reserve),
                                        ("rdm_mem_boundary_memkb", MemKB),
+                                       ("max_ranges", uint32),
                                        ])),
                  ("pv", Struct(None, [("kernel", string),
                                       ("slack_memkb", MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 25507c7..9359de7 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1626,6 +1626,10 @@  static void parse_config_data(const char *config_source,
 
         if (!xlu_cfg_get_long (config, "rdm_mem_boundary", &l, 0))
             b_info->u.hvm.rdm_mem_boundary_memkb = l * 1024;
+
+        if (!xlu_cfg_get_long (config, "max_ranges", &l, 0))
+            b_info->u.hvm.max_ranges = l;
+
         break;
     case LIBXL_DOMAIN_TYPE_PV:
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d59e7bc..2f85089 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -943,6 +943,10 @@  static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
 {
     unsigned int i;
     int rc;
+    unsigned int max_ranges =
+        ( s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_RANGES] > 0 ) ?
+        s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_RANGES] :
+        MAX_NR_IO_RANGES;
 
     if ( is_default )
         goto done;
@@ -965,7 +969,7 @@  static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
         if ( !s->range[i] )
             goto fail;
 
-        rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
+        rangeset_limit(s->range[i], max_ranges);
     }
 
  done:
@@ -6012,6 +6016,7 @@  static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
     case HVM_PARAM_ALTP2M:
+    case HVM_PARAM_MAX_RANGES:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 81f9451..7732087 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -210,6 +210,9 @@ 
 /* Boolean: Enable altp2m */
 #define HVM_PARAM_ALTP2M       35
 
-#define HVM_NR_PARAMS          36
+/* Maximum ranges to be tracked in one rangeset by ioreq server */
+#define HVM_PARAM_MAX_RANGES  36
+
+#define HVM_NR_PARAMS          37
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */