Message ID | E1kDR8W-0001s4-Sr@lizzy.crudebyte.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | 9pfs: log warning if msize <= 8192 | expand |
On Wed, Sep 02, 2020 at 01:22:49PM +0200, Christian Schoenebeck wrote: > It is essential to choose a reasonable high value for 'msize' to avoid > severe degraded file I/O performance. This parameter has to be chosen > on client/guest side, and a Linux client defaults to an 'msize' of only > 8192 if the user did not explicitly specify a value for 'msize'. > > Unfortunately many users are not aware that they should specify an > appropriate value for 'msize' to avoid severe performance issues, so > log a performance warning on host side in that case to make it more > clear. What is a more reasonable "msize" value to pick instead of 8k ? ie at what msize is I/O not several degraded ? If there a reason that Linux can't pick a better default ? > > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> > --- > hw/9pfs/9p.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > index 7bb994bbf2..33e948d636 100644 > --- a/hw/9pfs/9p.c > +++ b/hw/9pfs/9p.c > @@ -1353,6 +1353,14 @@ static void coroutine_fn v9fs_version(void *opaque) > goto out; > } > > + /* 8192 is the default msize of Linux clients */ > + if (s->msize <= 8192) { > + warn_report_once( > + "9p: degraded performance: a reasonable high msize should be " > + "chosen on client/guest side (chosen msize is <= 8192)." > + ); > + } > + > marshal: > err = pdu_marshal(pdu, offset, "ds", s->msize, &version); > if (err < 0) { > -- > 2.20.1 > > Regards, Daniel
On Mittwoch, 2. September 2020 14:25:47 CEST Daniel P. Berrangé wrote: > On Wed, Sep 02, 2020 at 01:22:49PM +0200, Christian Schoenebeck wrote: > > It is essential to choose a reasonable high value for 'msize' to avoid > > severe degraded file I/O performance. This parameter has to be chosen > > on client/guest side, and a Linux client defaults to an 'msize' of only > > 8192 if the user did not explicitly specify a value for 'msize'. > > > > Unfortunately many users are not aware that they should specify an > > appropriate value for 'msize' to avoid severe performance issues, so > > log a performance warning on host side in that case to make it more > > clear. > > What is a more reasonable "msize" value to pick instead of 8k ? > ie at what msize is I/O not several degraded ? A good value depends on the file I/O potential of the underlying storage on host side, and then you still would need to trade off between performance profit and additional RAM costs, i.e. with growing msize (RAM occupation), performance still increases, but performance delta will shrink continuously. So in practice you might e.g. choose anything between 10MiB ... >100MiB for a SATA spindle disk storage, and a much higher value for anything PCIe based flash storage. So a user probably should benchmark and decide what's reasonable for the intended use case. > If there a reason that Linux can't pick a better default ? I was not involved when that default value was picked on Linux side, so I don't know why exactly this value (8192) had been chosen as default 'msize' years ago. It certainly (sh/c)ould be higher, as it is already close to the theoreticaly minimum msize of 4096. However how should a Linux guest automatically pick a reasonable msize if it does not have any knowlege of host's storage features? But even if this will be addressed on Linux kernel side, I still think users of old kernels should be made aware of this issue, as it is not obvious to the user. Best regards, Christian Schoenebeck
On Wed, 02 Sep 2020 14:52:33 +0200 Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > On Mittwoch, 2. September 2020 14:25:47 CEST Daniel P. Berrangé wrote: > > On Wed, Sep 02, 2020 at 01:22:49PM +0200, Christian Schoenebeck wrote: > > > It is essential to choose a reasonable high value for 'msize' to avoid > > > severe degraded file I/O performance. This parameter has to be chosen > > > on client/guest side, and a Linux client defaults to an 'msize' of only > > > 8192 if the user did not explicitly specify a value for 'msize'. > > > > > > Unfortunately many users are not aware that they should specify an > > > appropriate value for 'msize' to avoid severe performance issues, so > > > log a performance warning on host side in that case to make it more > > > clear. > > > > What is a more reasonable "msize" value to pick instead of 8k ? > > ie at what msize is I/O not several degraded ? > > A good value depends on the file I/O potential of the underlying storage on > host side, and then you still would need to trade off between performance > profit and additional RAM costs, i.e. with growing msize (RAM occupation), > performance still increases, but performance delta will shrink continuously. > > So in practice you might e.g. choose anything between 10MiB ... >100MiB for a > SATA spindle disk storage, and a much higher value for anything PCIe based > flash storage. So a user probably should benchmark and decide what's > reasonable for the intended use case. > > > If there a reason that Linux can't pick a better default ? > > I was not involved when that default value was picked on Linux side, so I > don't know why exactly this value (8192) had been chosen as default 'msize' > years ago. > The original size back in 2005 was 9000: [greg@bahia kernel-linus]$ git show 9e82cf6a802a7 | grep 9000 + v9ses->maxdata = 9000; + if (v9ses->maxdata != 9000) which was later converted to 8192. I couldn't find any hint on why such a small size was chosen. Maybe you can try to contact 9pfs father ? Eric Van Hensbergen <ericvh@gmail.com> > It certainly (sh/c)ould be higher, as it is already close to the theoreticaly > minimum msize of 4096. However how should a Linux guest automatically pick a > reasonable msize if it does not have any knowlege of host's storage features? > > But even if this will be addressed on Linux kernel side, I still think users > of old kernels should be made aware of this issue, as it is not obvious to the > user. > I tend to agree. Until linux decides of a better default, we should only warn the user if they decide to go below the current one. Cheers, -- Greg > Best regards, > Christian Schoenebeck > >
On Wed, Sep 02, 2020 at 03:39:34PM +0200, Greg Kurz wrote: > On Wed, 02 Sep 2020 14:52:33 +0200 > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > > > On Mittwoch, 2. September 2020 14:25:47 CEST Daniel P. Berrangé wrote: > > > On Wed, Sep 02, 2020 at 01:22:49PM +0200, Christian Schoenebeck wrote: > > > > It is essential to choose a reasonable high value for 'msize' to avoid > > > > severe degraded file I/O performance. This parameter has to be chosen > > > > on client/guest side, and a Linux client defaults to an 'msize' of only > > > > 8192 if the user did not explicitly specify a value for 'msize'. > > > > > > > > Unfortunately many users are not aware that they should specify an > > > > appropriate value for 'msize' to avoid severe performance issues, so > > > > log a performance warning on host side in that case to make it more > > > > clear. > > > > > > What is a more reasonable "msize" value to pick instead of 8k ? > > > ie at what msize is I/O not several degraded ? > > > > A good value depends on the file I/O potential of the underlying storage on > > host side, and then you still would need to trade off between performance > > profit and additional RAM costs, i.e. with growing msize (RAM occupation), > > performance still increases, but performance delta will shrink continuously. > > > > So in practice you might e.g. choose anything between 10MiB ... >100MiB for a > > SATA spindle disk storage, and a much higher value for anything PCIe based > > flash storage. So a user probably should benchmark and decide what's > > reasonable for the intended use case. > > > > > If there a reason that Linux can't pick a better default ? > > > > I was not involved when that default value was picked on Linux side, so I > > don't know why exactly this value (8192) had been chosen as default 'msize' > > years ago. > > > > The original size back in 2005 was 9000: > > [greg@bahia kernel-linus]$ git show 9e82cf6a802a7 | grep 9000 > + v9ses->maxdata = 9000; > + if (v9ses->maxdata != 9000) > > which was later converted to 8192. > > I couldn't find any hint on why such a small size was chosen. > > Maybe you can try to contact 9pfs father ? > > Eric Van Hensbergen <ericvh@gmail.com> > > > It certainly (sh/c)ould be higher, as it is already close to the theoreticaly > > minimum msize of 4096. However how should a Linux guest automatically pick a > > reasonable msize if it does not have any knowlege of host's storage features? > > > > But even if this will be addressed on Linux kernel side, I still think users > > of old kernels should be made aware of this issue, as it is not obvious to the > > user. > > > > I tend to agree. Until linux decides of a better default, we should only > warn the user if they decide to go below the current one. To be clear, I'm not objecting to warning - just that the proposed warning doesn't give any useful information about what is considered to be a sensible alternative size, and nor does the commit message. Just like to see the commit message provide the background info above, and ideally have the warning message at least give the user a suggestion that is in the sensible order of magnitude they should be looking at. eg tell them to aim for 1 MB (or whatever value) as a starting point to tuning. Regards, Daniel
Hi Eric, do you remember any specific reason why the default 'msize' for the Linux kernel's 9P client was chosen such low as 8 kiB? (see QEMU discussion below). Best regards, Christian Schoenebeck On Mittwoch, 2. September 2020 15:39:34 CEST Greg Kurz wrote: > On Wed, 02 Sep 2020 14:52:33 +0200 > > Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > > On Mittwoch, 2. September 2020 14:25:47 CEST Daniel P. Berrangé wrote: > > > On Wed, Sep 02, 2020 at 01:22:49PM +0200, Christian Schoenebeck wrote: > > > > It is essential to choose a reasonable high value for 'msize' to avoid > > > > severe degraded file I/O performance. This parameter has to be chosen > > > > on client/guest side, and a Linux client defaults to an 'msize' of > > > > only > > > > 8192 if the user did not explicitly specify a value for 'msize'. > > > > > > > > Unfortunately many users are not aware that they should specify an > > > > appropriate value for 'msize' to avoid severe performance issues, so > > > > log a performance warning on host side in that case to make it more > > > > clear. > > > > > > What is a more reasonable "msize" value to pick instead of 8k ? > > > ie at what msize is I/O not several degraded ? > > > > A good value depends on the file I/O potential of the underlying storage > > on > > host side, and then you still would need to trade off between performance > > profit and additional RAM costs, i.e. with growing msize (RAM occupation), > > performance still increases, but performance delta will shrink > > continuously. > > > > So in practice you might e.g. choose anything between 10MiB ... >100MiB > > for a SATA spindle disk storage, and a much higher value for anything > > PCIe based flash storage. So a user probably should benchmark and decide > > what's reasonable for the intended use case. > > > > > If there a reason that Linux can't pick a better default ? > > > > I was not involved when that default value was picked on Linux side, so I > > don't know why exactly this value (8192) had been chosen as default > > 'msize' > > years ago. > > The original size back in 2005 was 9000: > > [greg@bahia kernel-linus]$ git show 9e82cf6a802a7 | grep 9000 > + v9ses->maxdata = 9000; > + if (v9ses->maxdata != 9000) > > which was later converted to 8192. > > I couldn't find any hint on why such a small size was chosen. > > Maybe you can try to contact 9pfs father ? > > Eric Van Hensbergen <ericvh@gmail.com> > > > It certainly (sh/c)ould be higher, as it is already close to the > > theoreticaly minimum msize of 4096. However how should a Linux guest > > automatically pick a reasonable msize if it does not have any knowlege of > > host's storage features? > > > > But even if this will be addressed on Linux kernel side, I still think > > users of old kernels should be made aware of this issue, as it is not > > obvious to the user. > > I tend to agree. Until linux decides of a better default, we should only > warn the user if they decide to go below the current one. > > Cheers, > > -- > Greg > > > Best regards, > > Christian Schoenebeck
On Mittwoch, 2. September 2020 15:45:03 CEST Daniel P. Berrangé wrote: > To be clear, I'm not objecting to warning - just that the proposed warning > doesn't give any useful information about what is considered to be a > sensible alternative size, and nor does the commit message. > > Just like to see the commit message provide the background info above, > and ideally have the warning message at least give the user a suggestion > that is in the sensible order of magnitude they should be looking at. > > eg tell them to aim for 1 MB (or whatever value) as a starting point to > tuning. Yeah, I know, but the problem is I don't see how I would squeeze the relevant information into only one log message; and even "what's a good starting point" is already questionable. For that reason my plan was: - logging this warning - describing the 'msize' issue in detail on the QEMU wiki (what's the point, how would you benchmark it) So my idea was: user sees the message, "what is 'msize?'" -> Google "msize qemu" -> click 'QEMU wiki' -> read all the details. But how about this: I put a QEMU wiki link directly into the log message? P.S. I don't have a QEMU wiki account yet BTW, so if somebody could hook me up, very much appreciated! Best regards, Christian Schoenebeck
On Wed, Sep 02, 2020 at 04:08:48PM +0200, Christian Schoenebeck wrote: > On Mittwoch, 2. September 2020 15:45:03 CEST Daniel P. Berrangé wrote: > > To be clear, I'm not objecting to warning - just that the proposed warning > > doesn't give any useful information about what is considered to be a > > sensible alternative size, and nor does the commit message. > > > > Just like to see the commit message provide the background info above, > > and ideally have the warning message at least give the user a suggestion > > that is in the sensible order of magnitude they should be looking at. > > > > eg tell them to aim for 1 MB (or whatever value) as a starting point to > > tuning. > > Yeah, I know, but the problem is I don't see how I would squeeze the relevant > information into only one log message; and even "what's a good starting point" > is already questionable. > > For that reason my plan was: > > - logging this warning > > - describing the 'msize' issue in detail on the QEMU wiki (what's the > point, how would you benchmark it) > > So my idea was: user sees the message, "what is 'msize?'" -> Google "msize > qemu" -> click 'QEMU wiki' -> read all the details. > > But how about this: I put a QEMU wiki link directly into the log message? Rather than that, how about putting it in the QEMU man page, and then just add "See 'man 1 qemu' for further guidance". Regards, Daniel
On Mittwoch, 2. September 2020 16:10:35 CEST Daniel P. Berrangé wrote: > > Yeah, I know, but the problem is I don't see how I would squeeze the > > relevant information into only one log message; and even "what's a good > > starting point" is already questionable. > > > > For that reason my plan was: > > - logging this warning > > > > - describing the 'msize' issue in detail on the QEMU wiki (what's the > > > > point, how would you benchmark it) > > > > So my idea was: user sees the message, "what is 'msize?'" -> Google "msize > > qemu" -> click 'QEMU wiki' -> read all the details. > > > > But how about this: I put a QEMU wiki link directly into the log message? > > Rather than that, how about putting it in the QEMU man page, and then > just add "See 'man 1 qemu' for further guidance". Well, I can do that of course. But somehow I fear users get lost by just pointing them to "man 1 qemu" in the log message. It already starts that e.g. on Debian there is no "man qemu", it is "man qemu-system" there instead. Next issue is that qemu man page is currently not structured in a way that would allow me to directly point them to the relevant man heading like: man --pager='less -p ^9P-msize' qemu So they would need to scroll their way through the entire man page by themselfes and find confusing sections like "-fsdev -device virtio-9p-pci" vs. "-virtfs", etc. I can imagine some people will struggle with that. With a link like "https://wiki.qemu.org/Documentation/9psetup#msize" the thing would be crystal clear within seconds. Just my opinion. Greg? Best regards, Christian Schoenebeck
On Wed, Sep 02, 2020 at 06:03:12PM +0200, Christian Schoenebeck wrote: > On Mittwoch, 2. September 2020 16:10:35 CEST Daniel P. Berrangé wrote: > > > Yeah, I know, but the problem is I don't see how I would squeeze the > > > relevant information into only one log message; and even "what's a good > > > starting point" is already questionable. > > > > > > For that reason my plan was: > > > - logging this warning > > > > > > - describing the 'msize' issue in detail on the QEMU wiki (what's the > > > > > > point, how would you benchmark it) > > > > > > So my idea was: user sees the message, "what is 'msize?'" -> Google "msize > > > qemu" -> click 'QEMU wiki' -> read all the details. > > > > > > But how about this: I put a QEMU wiki link directly into the log message? > > > > Rather than that, how about putting it in the QEMU man page, and then > > just add "See 'man 1 qemu' for further guidance". > > Well, I can do that of course. But somehow I fear users get lost by just > pointing them to "man 1 qemu" in the log message. It already starts that e.g. > on Debian there is no "man qemu", it is "man qemu-system" there instead. Next > issue is that qemu man page is currently not structured in a way that would > allow me to directly point them to the relevant man heading like: > > man --pager='less -p ^9P-msize' qemu > > So they would need to scroll their way through the entire man page by > themselfes and find confusing sections like "-fsdev -device virtio-9p-pci" vs. > "-virtfs", etc. I can imagine some people will struggle with that. > > With a link like "https://wiki.qemu.org/Documentation/9psetup#msize" the thing > would be crystal clear within seconds. I don't feel strongly either way as long as we have docs somewhere, so I won't object to either approach. Regards, Daniel
On Wed, 02 Sep 2020 18:03:12 +0200 Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > On Mittwoch, 2. September 2020 16:10:35 CEST Daniel P. Berrangé wrote: > > > Yeah, I know, but the problem is I don't see how I would squeeze the > > > relevant information into only one log message; and even "what's a good > > > starting point" is already questionable. > > > > > > For that reason my plan was: > > > - logging this warning > > > > > > - describing the 'msize' issue in detail on the QEMU wiki (what's the > > > > > > point, how would you benchmark it) > > > > > > So my idea was: user sees the message, "what is 'msize?'" -> Google "msize > > > qemu" -> click 'QEMU wiki' -> read all the details. > > > > > > But how about this: I put a QEMU wiki link directly into the log message? > > > > Rather than that, how about putting it in the QEMU man page, and then > > just add "See 'man 1 qemu' for further guidance". > > Well, I can do that of course. But somehow I fear users get lost by just > pointing them to "man 1 qemu" in the log message. It already starts that e.g. > on Debian there is no "man qemu", it is "man qemu-system" there instead. Next > issue is that qemu man page is currently not structured in a way that would > allow me to directly point them to the relevant man heading like: > > man --pager='less -p ^9P-msize' qemu > > So they would need to scroll their way through the entire man page by > themselfes and find confusing sections like "-fsdev -device virtio-9p-pci" vs. > "-virtfs", etc. I can imagine some people will struggle with that. > > With a link like "https://wiki.qemu.org/Documentation/9psetup#msize" the thing > would be crystal clear within seconds. > > Just my opinion. Greg? > Fine by me for this patch. For a longer term, maybe we should find a way to advertise some hint for msize to the guest... Not sure how to do that though. Cheers, -- Greg > Best regards, > Christian Schoenebeck > >
On Mittwoch, 2. September 2020 18:54:22 CEST Greg Kurz wrote: > > Well, I can do that of course. But somehow I fear users get lost by just > > pointing them to "man 1 qemu" in the log message. It already starts that > > e.g. on Debian there is no "man qemu", it is "man qemu-system" there > > instead. Next issue is that qemu man page is currently not structured in > > a way that would> > > allow me to directly point them to the relevant man heading like: > > man --pager='less -p ^9P-msize' qemu > > > > So they would need to scroll their way through the entire man page by > > themselfes and find confusing sections like "-fsdev -device virtio-9p-pci" > > vs. "-virtfs", etc. I can imagine some people will struggle with that. > > > > With a link like "https://wiki.qemu.org/Documentation/9psetup#msize" the > > thing would be crystal clear within seconds. > > > > Just my opinion. Greg? > > Fine by me for this patch. > > For a longer term, maybe we should find a way to advertise some hint > for msize to the guest... Not sure how to do that though. On the long term that would be possible, however only with a protocol change allowing server to send minimum, maximum and recommended msize to client. As you know, right now server only has a say in maximum msize. Best regards, Christian Schoenebeck
On Thu, 03 Sep 2020 10:20:40 +0200 Christian Schoenebeck <qemu_oss@crudebyte.com> wrote: > On Mittwoch, 2. September 2020 18:54:22 CEST Greg Kurz wrote: > > > Well, I can do that of course. But somehow I fear users get lost by just > > > pointing them to "man 1 qemu" in the log message. It already starts that > > > e.g. on Debian there is no "man qemu", it is "man qemu-system" there > > > instead. Next issue is that qemu man page is currently not structured in > > > a way that would> > > > allow me to directly point them to the relevant man heading like: > > > man --pager='less -p ^9P-msize' qemu > > > > > > So they would need to scroll their way through the entire man page by > > > themselfes and find confusing sections like "-fsdev -device virtio-9p-pci" > > > vs. "-virtfs", etc. I can imagine some people will struggle with that. > > > > > > With a link like "https://wiki.qemu.org/Documentation/9psetup#msize" the > > > thing would be crystal clear within seconds. > > > > > > Just my opinion. Greg? > > > > Fine by me for this patch. > > > > For a longer term, maybe we should find a way to advertise some hint > > for msize to the guest... Not sure how to do that though. > > On the long term that would be possible, however only with a protocol change > allowing server to send minimum, maximum and recommended msize to client. > Hmm... not sure adding a new 9P protocol version for this is the way to go. Not speaking of all the hustle this would cause, these msizes rather look like properties of the device that the guest can use to configure the 9P session. What about adding them to the virtio-9p device config along with the mount 'tag' and teach trans_virtio.c in linux to expose them as well in the sysfs entry of the device ? This could also be the occasion to describe virtio-9p in the virtio spec [1]. Something that has been sitting on my TODO list for years but I could never find time to consider... [1] https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html > As you know, right now server only has a say in maximum msize. > I guess you mean minimum size ? > Best regards, > Christian Schoenebeck > >
On Donnerstag, 3. September 2020 11:35:14 CEST Greg Kurz wrote: > > On the long term that would be possible, however only with a protocol > > change allowing server to send minimum, maximum and recommended msize to > > client. > Hmm... not sure adding a new 9P protocol version for this is the > way to go. Not speaking of all the hustle this would cause, these > msizes rather look like properties of the device that the guest > can use to configure the 9P session. > > What about adding them to the virtio-9p device config along with > the mount 'tag' and teach trans_virtio.c in linux to expose them > as well in the sysfs entry of the device ? > > This could also be the occasion to describe virtio-9p in the virtio > spec [1]. Something that has been sitting on my TODO list for years > but I could never find time to consider... > > [1] https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html Sounds like a good idea! > > As you know, right now server only has a say in maximum msize. > > I guess you mean minimum size ? Well, it's ambiguous, but we mean the same thing: server may optionally lower the msize previously suggested by client; server must not raise client's suggested msize though. If you see it in the context of Rversion response handling then that's a "minimum" operation, yes. If you see it as prose then it is "maximum msize", e.g.: Server's max. supported msize: 20 MiB <- covered by 9P protocol Server's min. supported msize: 4 kiB <- not officially covered by 9P Client suggests msize 100 MiB -> server lowers that to min(100MiB, 20MiB) = 20MiB. Best regards, Christian Schoenebeck
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 7bb994bbf2..33e948d636 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -1353,6 +1353,14 @@ static void coroutine_fn v9fs_version(void *opaque) goto out; } + /* 8192 is the default msize of Linux clients */ + if (s->msize <= 8192) { + warn_report_once( + "9p: degraded performance: a reasonable high msize should be " + "chosen on client/guest side (chosen msize is <= 8192)." + ); + } + marshal: err = pdu_marshal(pdu, offset, "ds", s->msize, &version); if (err < 0) {
It is essential to choose a reasonable high value for 'msize' to avoid severe degraded file I/O performance. This parameter has to be chosen on client/guest side, and a Linux client defaults to an 'msize' of only 8192 if the user did not explicitly specify a value for 'msize'. Unfortunately many users are not aware that they should specify an appropriate value for 'msize' to avoid severe performance issues, so log a performance warning on host side in that case to make it more clear. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> --- hw/9pfs/9p.c | 8 ++++++++ 1 file changed, 8 insertions(+)