Message ID | 20201209161618.309-3-jgross@suse.com (mailing list archive)
---|---
State | New, archived
Series | xen: add hypfs per-domain abi-features
Hi Juergen,

On 09/12/2020 16:16, Juergen Gross wrote:
> Add /domain/<domid> directories to hypfs. Those are completely
> dynamic, so the related hypfs access functions need to be implemented.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V3:
> - new patch
> ---
>  docs/misc/hypfs-paths.pandoc |  10 +++
>  xen/common/Makefile          |   1 +
>  xen/common/hypfs_dom.c       | 137 +++++++++++++++++++++++++++++++++++
>  3 files changed, 148 insertions(+)
>  create mode 100644 xen/common/hypfs_dom.c
>
> diff --git a/docs/misc/hypfs-paths.pandoc b/docs/misc/hypfs-paths.pandoc
> index e86f7d0dbe..116642e367 100644
> --- a/docs/misc/hypfs-paths.pandoc
> +++ b/docs/misc/hypfs-paths.pandoc
> @@ -34,6 +34,7 @@ not containing any '/' character. The names "." and ".." are reserved
>  for file system internal use.
>
>  VALUES are strings and can take the following forms (note that this represents
> +>>>>>>> patched

This seems to be a left-over of a merge.

>  only the syntax used in this document):
>
>  * STRING -- an arbitrary 0-delimited byte string.
> @@ -191,6 +192,15 @@ The scheduling granularity of a cpupool.
>  Writing a value is allowed only for cpupools with no cpu assigned and if the
>  architecture is supporting different scheduling granularities.

[...]

> +
> +static int domain_dir_read(const struct hypfs_entry *entry,
> +                           XEN_GUEST_HANDLE_PARAM(void) uaddr)
> +{
> +    int ret = 0;
> +    const struct domain *d;
> +
> +    for_each_domain ( d )

This is definitely going to be an issue if you have a lot of domains
running, as Xen is not preemptible.

I think the first step is to make sure that HYPFS can scale without
hogging a pCPU for a long time.

Cheers,
On 09.12.20 17:37, Julien Grall wrote:
> Hi Juergen,
>
> On 09/12/2020 16:16, Juergen Gross wrote:
>> Add /domain/<domid> directories to hypfs. Those are completely
>> dynamic, so the related hypfs access functions need to be implemented.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V3:
>> - new patch
>> ---
>>  docs/misc/hypfs-paths.pandoc |  10 +++
>>  xen/common/Makefile          |   1 +
>>  xen/common/hypfs_dom.c       | 137 +++++++++++++++++++++++++++++++++++
>>  3 files changed, 148 insertions(+)
>>  create mode 100644 xen/common/hypfs_dom.c
>>
>> diff --git a/docs/misc/hypfs-paths.pandoc b/docs/misc/hypfs-paths.pandoc
>> index e86f7d0dbe..116642e367 100644
>> --- a/docs/misc/hypfs-paths.pandoc
>> +++ b/docs/misc/hypfs-paths.pandoc
>> @@ -34,6 +34,7 @@ not containing any '/' character. The names "." and
>> ".." are reserved
>>  for file system internal use.
>>
>>  VALUES are strings and can take the following forms (note that this
>> represents
>> +>>>>>>> patched
>
> This seems to be a left-over of a merge.

Oh, interesting that I wasn't warned about that.

>>  only the syntax used in this document):
>>
>>  * STRING -- an arbitrary 0-delimited byte string.
>> @@ -191,6 +192,15 @@ The scheduling granularity of a cpupool.
>>  Writing a value is allowed only for cpupools with no cpu assigned
>> and if the
>>  architecture is supporting different scheduling granularities.
>
> [...]
>
>> +
>> +static int domain_dir_read(const struct hypfs_entry *entry,
>> +                           XEN_GUEST_HANDLE_PARAM(void) uaddr)
>> +{
>> +    int ret = 0;
>> +    const struct domain *d;
>> +
>> +    for_each_domain ( d )
>
> This is definitely going to be an issue if you have a lot of domains
> running, as Xen is not preemptible.

In general this is correct, but in this case I don't think it will
be a problem. The execution time for each loop iteration should be
rather short (in the microsecond range?), so even with 32000 guests
we would stay way below one second.

And on rather slow cpus I don't think we'd have thousands of guests
anyway.

> I think the first step is to make sure that HYPFS can scale without
> hogging a pCPU for a long time.

I agree this would be best.

Juergen
On 10/12/2020 07:54, Jürgen Groß wrote:
> On 09.12.20 17:37, Julien Grall wrote:
>>>  only the syntax used in this document):
>>>
>>>  * STRING -- an arbitrary 0-delimited byte string.
>>> @@ -191,6 +192,15 @@ The scheduling granularity of a cpupool.
>>>  Writing a value is allowed only for cpupools with no cpu assigned
>>> and if the
>>>  architecture is supporting different scheduling granularities.
>>
>> [...]
>>
>>> +
>>> +static int domain_dir_read(const struct hypfs_entry *entry,
>>> +                           XEN_GUEST_HANDLE_PARAM(void) uaddr)
>>> +{
>>> +    int ret = 0;
>>> +    const struct domain *d;
>>> +
>>> +    for_each_domain ( d )
>>
>> This is definitely going to be an issue if you have a lot of domains
>> running, as Xen is not preemptible.
>
> In general this is correct, but in this case I don't think it will
> be a problem. The execution time for each loop iteration should be
> rather short (in the microsecond range?), so even with 32000 guests
> we would stay way below one second.

Scheduling slices are usually measured in milliseconds, not seconds
(though this depends on your scheduler). It would be unacceptable to me
if another vCPU cannot run for a second because dom0 is trying to list
the domains via HYPFS.

Cheers,
On 10.12.20 12:51, Julien Grall wrote:
>
> On 10/12/2020 07:54, Jürgen Groß wrote:
>> On 09.12.20 17:37, Julien Grall wrote:
>>>>  only the syntax used in this document):
>>>>
>>>>  * STRING -- an arbitrary 0-delimited byte string.
>>>> @@ -191,6 +192,15 @@ The scheduling granularity of a cpupool.
>>>>  Writing a value is allowed only for cpupools with no cpu assigned
>>>> and if the
>>>>  architecture is supporting different scheduling granularities.
>>>
>>> [...]
>>>
>>>> +
>>>> +static int domain_dir_read(const struct hypfs_entry *entry,
>>>> +                           XEN_GUEST_HANDLE_PARAM(void) uaddr)
>>>> +{
>>>> +    int ret = 0;
>>>> +    const struct domain *d;
>>>> +
>>>> +    for_each_domain ( d )
>>>
>>> This is definitely going to be an issue if you have a lot of domains
>>> running, as Xen is not preemptible.
>>
>> In general this is correct, but in this case I don't think it will
>> be a problem. The execution time for each loop iteration should be
>> rather short (in the microsecond range?), so even with 32000 guests
>> we would stay way below one second.
>
> Scheduling slices are usually measured in milliseconds, not seconds
> (though this depends on your scheduler). It would be unacceptable to me
> if another vCPU cannot run for a second because dom0 is trying to list
> the domains via HYPFS.

Okay, I did a test. The worrying operation is the reading of /domain/
with lots of domains.

"xenhypfs ls /domain" with 500 domains running needed 231 us of real
time for the library call, while "xenhypfs ls /" needed about 70 us.

This makes about 3 domains per usec, resulting in roughly 10 ms with
30000 domains.

Juergen
diff --git a/docs/misc/hypfs-paths.pandoc b/docs/misc/hypfs-paths.pandoc
index e86f7d0dbe..116642e367 100644
--- a/docs/misc/hypfs-paths.pandoc
+++ b/docs/misc/hypfs-paths.pandoc
@@ -34,6 +34,7 @@ not containing any '/' character. The names "." and ".." are reserved
 for file system internal use.
 
 VALUES are strings and can take the following forms (note that this represents
+>>>>>>> patched
 only the syntax used in this document):
 
 * STRING -- an arbitrary 0-delimited byte string.
@@ -191,6 +192,15 @@ The scheduling granularity of a cpupool.
 Writing a value is allowed only for cpupools with no cpu assigned and if the
 architecture is supporting different scheduling granularities.
 
+#### /domain/
+
+A directory of all current domains.
+
+#### /domain/*/
+
+The individual domains. Each entry is a directory with the name being the
+domain-id (e.g. /domain/0/).
+
 #### /params/
 
 A directory of runtime parameters.
diff --git a/xen/common/Makefile b/xen/common/Makefile
index d109f279a4..e88a9ee91e 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_GRANT_TABLE) += grant_table.o
 obj-y += guestcopy.o
 obj-bin-y += gunzip.init.o
 obj-$(CONFIG_HYPFS) += hypfs.o
+obj-$(CONFIG_HYPFS) += hypfs_dom.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/hypfs_dom.c b/xen/common/hypfs_dom.c
new file mode 100644
index 0000000000..241e379b24
--- /dev/null
+++ b/xen/common/hypfs_dom.c
@@ -0,0 +1,137 @@
+/******************************************************************************
+ *
+ * hypfs_dom.c
+ *
+ * Per domain hypfs nodes.
+ */
+
+#include <xen/err.h>
+#include <xen/hypfs.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
+static const struct hypfs_entry *domain_domdir_enter(
+    const struct hypfs_entry *entry)
+{
+    struct hypfs_dyndir_id *data;
+    struct domain *d;
+
+    data = hypfs_get_dyndata();
+    d = get_domain_by_id(data->id);
+    data->data = d;
+    if ( !d )
+        return ERR_PTR(-ENOENT);
+
+    return entry;
+}
+
+static void domain_domdir_exit(const struct hypfs_entry *entry)
+{
+    struct hypfs_dyndir_id *data;
+    struct domain *d;
+
+    data = hypfs_get_dyndata();
+    d = data->data;
+    put_domain(d);
+}
+
+static const struct hypfs_funcs domain_domdir_funcs = {
+    .enter = domain_domdir_enter,
+    .exit = domain_domdir_exit,
+    .read = hypfs_read_dir,
+    .write = hypfs_write_deny,
+    .getsize = hypfs_getsize,
+    .findentry = hypfs_dir_findentry,
+};
+
+static HYPFS_DIR_INIT_FUNC(domain_domdir, "%u", &domain_domdir_funcs);
+
+static int domain_dir_read(const struct hypfs_entry *entry,
+                           XEN_GUEST_HANDLE_PARAM(void) uaddr)
+{
+    int ret = 0;
+    const struct domain *d;
+
+    for_each_domain ( d )
+    {
+        ret = hypfs_read_dyndir_id_entry(&domain_domdir, d->domain_id,
+                                         !d->next_in_list, &uaddr);
+        if ( ret )
+            break;
+    }
+
+    return ret;
+}
+
+static unsigned int domain_dir_getsize(const struct hypfs_entry *entry)
+{
+    const struct domain *d;
+    unsigned int size = 0;
+
+    for_each_domain ( d )
+        size += hypfs_dynid_entry_size(entry, d->domain_id);
+
+    return size;
+}
+
+static const struct hypfs_entry *domain_dir_enter(
+    const struct hypfs_entry *entry)
+{
+    struct hypfs_dyndir_id *data;
+
+    data = hypfs_alloc_dyndata(struct hypfs_dyndir_id);
+    if ( !data )
+        return ERR_PTR(-ENOMEM);
+    data->id = DOMID_SELF;
+
+    rcu_read_lock(&domlist_read_lock);
+
+    return entry;
+}
+
+static void domain_dir_exit(const struct hypfs_entry *entry)
+{
+    rcu_read_unlock(&domlist_read_lock);
+
+    hypfs_free_dyndata();
+}
+
+static struct hypfs_entry *domain_dir_findentry(
+    const struct hypfs_entry_dir *dir, const char *name,
+    unsigned int name_len)
+{
+    unsigned long id;
+    const char *end;
+    struct domain *d;
+
+    id = simple_strtoul(name, &end, 10);
+    if ( end != name + name_len )
+        return ERR_PTR(-ENOENT);
+
+    d = rcu_lock_domain_by_id(id);
+    if ( !d )
+        return ERR_PTR(-ENOENT);
+
+    rcu_unlock_domain(d);
+
+    return hypfs_gen_dyndir_id_entry(&domain_domdir, id, d);
+}
+
+static const struct hypfs_funcs domain_dir_funcs = {
+    .enter = domain_dir_enter,
+    .exit = domain_dir_exit,
+    .read = domain_dir_read,
+    .write = hypfs_write_deny,
+    .getsize = domain_dir_getsize,
+    .findentry = domain_dir_findentry,
+};
+
+static HYPFS_DIR_INIT_FUNC(domain_dir, "domain", &domain_dir_funcs);
+
+static int __init domhypfs_init(void)
+{
+    hypfs_add_dir(&hypfs_root, &domain_dir, true);
+    hypfs_add_dyndir(&domain_dir, &domain_domdir);
+
+    return 0;
+}
+__initcall(domhypfs_init);
Add /domain/<domid> directories to hypfs. Those are completely
dynamic, so the related hypfs access functions need to be implemented.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V3:
- new patch
---
 docs/misc/hypfs-paths.pandoc |  10 +++
 xen/common/Makefile          |   1 +
 xen/common/hypfs_dom.c       | 137 +++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+)
 create mode 100644 xen/common/hypfs_dom.c