Message ID | 20190327173206.9065-1-paul.durrant@citrix.com (mailing list archive) |
---|---|
Series | xen-block: fix sector size confusion |
On 27/03/2019 17:32, Paul Durrant wrote:
> The Xen blkif protocol is confusing but discussion with the maintainer
> has clarified that sector based quantities in requests and the 'sectors'
> value advertized in xenstore should always be in terms of 512-byte
> units and not the advertised logical 'sector-size' value.
>
> This series fixes xen-block to adhere to the spec.

I thought we agreed that hardcoding things to 512 bytes was the wrong
thing to do.

I was expecting something like:

1) Clarify the spec with the intended meaning (which is what some
   implementations actually use already), which won't cripple 4k datapaths.
2) Introduce a compatibility key for "I don't rely on sector-size being
   512", which fixed implementations should advertise.
3) Specify that because of bugs in the spec which got out into the wild,
   drivers which don't find the key being advertised by the other end
   should emulate sector-size=512 for compatibility with broken
   implementations.

Whatever the eventual way out, the first thing which needs to happen is
an update to the spec, before actions are taken to alter existing
implementations.

~Andrew
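The three-step proposal above could look roughly like the following on the backend side. This is only a sketch under assumptions: the `peer_features` structure, its `sector_size_aware` flag (standing in for the compatibility key, which has no agreed name in this thread) and `negotiated_sector_size()` are all hypothetical and are not part of xen-block or the blkif headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: set when the other end advertised the (not yet specified)
 * compatibility key meaning "I don't rely on sector-size being 512". */
struct peer_features {
    bool sector_size_aware;
};

/* Choose the sector-size value to use with this peer.
 * A peer that did not advertise the key is treated as a possibly broken
 * implementation, so sector-size=512 is emulated for it (step 3). */
static uint32_t negotiated_sector_size(const struct peer_features *peer,
                                       uint32_t logical_block_size)
{
    /* Assumption: block sizes are non-zero multiples of 512. */
    assert(logical_block_size >= 512 && logical_block_size % 512 == 0);

    if (!peer->sector_size_aware) {
        return 512;
    }
    return logical_block_size;
}
```

With this shape, a 4k device is only exposed as 4k to a peer that has explicitly opted in; every other peer keeps seeing 512.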
> -----Original Message-----
> From: Andrew Cooper
> Sent: 27 March 2019 18:20
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org; qemu-block@nongnu.org;
> qemu-devel@nongnu.org
> Cc: Kevin Wolf <kwolf@redhat.com>; Stefano Stabellini <sstabellini@kernel.org>; Max Reitz
> <mreitz@redhat.com>; Stefan Hajnoczi <stefanha@redhat.com>; Anthony Perard <anthony.perard@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 0/2] xen-block: fix sector size confusion
>
> On 27/03/2019 17:32, Paul Durrant wrote:
> > The Xen blkif protocol is confusing but discussion with the maintainer
> > has clarified that sector based quantities in requests and the 'sectors'
> > value advertized in xenstore should always be in terms of 512-byte
> > units and not the advertised logical 'sector-size' value.
> >
> > This series fixes xen-block to adhere to the spec.
>
> I thought we agreed that hardcoding things to 512 bytes was the wrong
> thing to do.

To some extent we decided it was the *only* thing to do.

>
> I was expecting something like:
>
> 1) Clarify the spec with the intended meaning, (which is what some
> implementations actually use already) and wont cripple 4k datapaths.
> 2) Introduce a compatibility key for "I don't rely on sector-size being
> 512", which fixed implementations should advertise.
> 3) Specify that because of bugs in the spec which got out into the wild,
> drivers which don't find the key being advertised by the other end
> should emulate sector-size=512 for compatibility with broken
> implementations.

Yes, that's how we are going to fix things.

>
> Whatever the eventual way out, the first thing which needs to happen is
> an update to the spec, before actions are taken to alter existing
> implementations.

Well the implementation is currently wrong w.r.t. the spec and these
patches fix that. As long as sector-size remains at 512 then no existing
frontend should break, so I guess you could argue that patch #2 should
also make sure that sector-size is also 512... but that is not yet in the
spec.

I guess I'm ok to defer patch #2 until a revised spec. is agreed, but the
ship has already sailed as far as patch #1 goes.

Anthony, thoughts?

  Paul

>
> ~Andrew
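To make the unit question concrete, here is a minimal sketch of the interpretation described in the cover letter, where the 'sectors' value and the sector numbers in requests are always counted in 512-byte units regardless of the advertised 'sector-size'. The macro and function names are made up for illustration and are not taken from xen-block or the blkif headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the fixed protocol unit discussed in this thread. */
#define BLKIF_UNIT_BYTES 512u

/* Value to advertise as 'sectors': device size in 512-byte units,
 * independent of the logical 'sector-size'. */
static uint64_t sectors_to_advertise(uint64_t device_bytes)
{
    assert(device_bytes % BLKIF_UNIT_BYTES == 0);
    return device_bytes / BLKIF_UNIT_BYTES;
}

/* Byte offset addressed by a request's sector number, again in
 * 512-byte units rather than units of 'sector-size'. */
static uint64_t request_byte_offset(uint64_t sector_number)
{
    return sector_number * BLKIF_UNIT_BYTES;
}
```

Under this reading, sector number 8 on a device advertising sector-size=4096 refers to byte offset 4096, not 32768; the two interpretations only agree when sector-size is 512.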
On Wed, Mar 27, 2019 at 08:32:28PM +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Andrew Cooper
> > Sent: 27 March 2019 18:20
[snip]
> > Whatever the eventual way out, the first thing which needs to happen is
> > an update to the spec, before actions are taken to alter existing
> > implementations.
>
> Well the implementation is currently wrong w.r.t. the spec and these
> patches fix that. As long as sector-size remains at 512 then no existing
> frontend should break, so I guess you could argue that patch #2 should
> also make sure that sector-size is also 512... but that is not yet in the
> spec.
> I guess I'm ok to defer patch #2 until a revised spec. is agreed, but the
> ship has already sailed as far as patch #1 goes.
>
> Anthony, thoughts?

So QEMU used to always set "sector-size" to 512, and used that for
requests. The new implementation (not released yet) doesn't do that
anymore, and may set "sector-size" to a different value and use that
for requests.

patch #1 is one way to fix the requests (and avoid regression) and
more clearly spell out the weird thing about the spec.

I also think patch #2 is too soon and should point to a commit in
xen.git instead of a thread on xen-devel.

In the meantime, we should probably set "sector-size" to 512, like QEMU
used to do anyway, with a comment about the fact that different
implementations use sector-size differently and a value of 512 would
work fine.
On 28/03/2019 11:40, Anthony PERARD wrote:
> On Wed, Mar 27, 2019 at 08:32:28PM +0000, Paul Durrant wrote:
[snip]
>> Anthony, thoughts?
> So QEMU used to always set "sector-size" to 512, and used that for
> requests. The new implementation (not released yet) doesn't do that
> anymore, and may set "sector-size" to a different value and use that
> for requests.
>
> patch #1 is one way to fix the requests (and avoid regression) and
> more clearly spell out the weird thing about the spec.
>
> I also think patch #2 is too soon and should point to a commit in
> xen.git instead of a thread on xen-devel.
>
> In the meantime, we should probably set "sector-size" to 512, like QEMU
> used to do anyway, with a comment about the fact that different
> implementations use sector-size differently and a value of 512 would
> work fine.

Hmm - I hadn't realised this is an unreleased issue in qemu.

So, Qemu used to unconditionally set sector-size=512, and your work to
qdev-ify everything introduced a change which has identified a
spec/protocol issue?

If so, then I think it is fine for this series to state (much more
clearly than it does) that it is returning qemu's behaviour to match the
currently released version, because we've discovered an issue in the
spec/protocol, and that we will subsequently work to address the issue in
the spec and provide a forwards path which doesn't involve nailing our
feet to the floor.

~Andrew
On 28.03.2019 at 12:46, Andrew Cooper wrote:
> On 28/03/2019 11:40, Anthony PERARD wrote:
[snip]
> Hmm - I hadn't realised this is an unreleased issue in qemu.
>
> So, Qemu used to unconditionally set sector-size=512, and your work to
> qdev-ify everything introduced a change which has identified a
> spec/protocol issue?

The old implementation has the sector size hardcoded:

    #define BLOCK_SIZE 512

Whereas the qdevified version uses DEFINE_BLOCK_PROPERTIES(), which
includes user-visible options for logical/physical_block_size.

So before, you couldn't even define a different sector size and the
question whether 512 or the sector size should be used didn't make a
difference anyway.

> If so, then I think it is fine for this series to state (much more
> clearly than it does) that it is returning qemu's behaviour to match the
> currently released version, because we've discovered an issue in the
> spec/protocol, and that we will subsequently work to address the issue in
> the spec and provide a forwards path which doesn't involve nailing our
> feet to the floor.

The closest thing to returning to the old behaviour would be erroring
out during device initialisation if logical_block_size != 512.

Kevin
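The check suggested above can be pictured as a small validation step run while the device is being set up. The sketch below is standalone and illustrative only: `check_logical_block_size()` and its error-buffer convention are invented here, and real QEMU code would report the failure through its own error-handling machinery instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reject any logical block size other than 512,
 * which matches what the old hardcoded BLOCK_SIZE allowed in practice.
 * Returns 0 on success, -1 on failure with a message written to err. */
static int check_logical_block_size(uint32_t logical_block_size,
                                    char *err, size_t err_len)
{
    assert(err != NULL && err_len > 0);

    if (logical_block_size != 512) {
        snprintf(err, err_len,
                 "logical_block_size must be 512 (got %u)",
                 (unsigned)logical_block_size);
        return -1;
    }
    err[0] = '\0';
    return 0;
}
```

A caller would run this before the device is exposed to the guest and abort initialisation on a non-zero return, so the user-visible property exists but only its old effective value is accepted.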
> -----Original Message-----
> From: Andrew Cooper
> Sent: 28 March 2019 11:46
> To: Anthony Perard <anthony.perard@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; qemu-block@nongnu.org; qemu-devel@nongnu.org; Kevin Wolf
> <kwolf@redhat.com>; Stefano Stabellini <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>; Stefan
> Hajnoczi <stefanha@redhat.com>
> Subject: Re: [Xen-devel] [PATCH v2 0/2] xen-block: fix sector size confusion
>
> On 28/03/2019 11:40, Anthony PERARD wrote:
[snip]
> > In the meantime, we should probably set "sector-size" to 512, like QEMU
> > used to do anyway, with a comment about the fact that different
> > implementations use sector-size differently and a value of 512 would
> > work fine.
>
> Hmm - I hadn't realised this is an unreleased issue in qemu.
>
> So, Qemu used to unconditionally set sector-size=512, and your work to
> qdev-ify everything introduced a change which has identified a
> spec/protocol issue?

Basically, yes. I had not realized at the time how bad the spec. is. I
was referring to prevailing implementations, which seemed to use
sector-size as the multiple and, in the Windows case, appear to cope with
a logical sector size other than 512. Given what I know now, fixing
sector-size to be 512 seems like the only way to avoid regressions.

>
> If so, then I think it is fine for this series to state (much more
> clearly than it does) that it is returning qemu's behaviour to match the
> currently released version, because we've discovered an issue in the
> spec/protocol, and that we will subsequently work to address the issue in
> the spec and provide a forwards path which doesn't involve nailing our
> feet to the floor.

Ok, I'll expand the commit comment and state that the spec. will be
revised to allow greater logical block sizes.

  Paul

>
> ~Andrew
> -----Original Message-----
> From: Kevin Wolf [mailto:kwolf@redhat.com]
> Sent: 28 March 2019 11:56
> To: Andrew Cooper <Andrew.Cooper3@citrix.com>
> Cc: Anthony Perard <anthony.perard@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>;
> xen-devel@lists.xenproject.org; qemu-block@nongnu.org; qemu-devel@nongnu.org; Stefano Stabellini
> <sstabellini@kernel.org>; Max Reitz <mreitz@redhat.com>; Stefan Hajnoczi <stefanha@redhat.com>
> Subject: Re: [Xen-devel] [PATCH v2 0/2] xen-block: fix sector size confusion
>
[snip]
> The old implementation has the sector size hardcoded:
>
>     #define BLOCK_SIZE 512
>
> Whereas the qdevified version uses DEFINE_BLOCK_PROPERTIES(), which
> includes user-visible options for logical/physical_block_size.
>
> So before, you couldn't even define a different sector size and the
> question whether 512 or the sector size should be used didn't make a
> difference anyway.
>
[snip]
> The closest thing to returning to the old behaviour would be erroring
> out during device initialisation if logical_block_size != 512.

One thing I've not figured out... If I create a blockdev in QEMU that is
pointing at a real device with a logical_block_size of 4k, will the QEMU
block layer perform the necessary read-modify-write cycles for accesses
< 4k? IOW would it be safe to always advertise a size of 512 to a
frontend?

The problem with erroring out during device init is that it does not give
us a way of fixing things in future, as the frontend has not started at
that time and thus we'd have no idea whether it could use whatever
protocol fix we come up with. I think the only thing the backend could do
is refuse to connect to an old frontend if logical_block_size != 512.

  Paul

>
> Kevin
On 01.04.2019 at 11:01, Paul Durrant wrote:
> > -----Original Message-----
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > Sent: 28 March 2019 11:56
[snip]
> > The closest thing to returning to the old behaviour would be erroring
> > out during device initialisation if logical_block_size != 512.
>
> One thing I've not figured out... If I create a blockdev in QEMU that
> is pointing at a real device with a logical_block_size of 4k, will the
> QEMU block layer perform the necessary read-modify-write cycles for
> accesses < 4k? IOW would it be safe to always advertise a size of 512
> to a frontend?

Yes, for 512 accesses on native 4k disks with O_DIRECT, the QEMU block
layer performs the necessary RMW. Of course, it still comes with a
performance penalty, so you want to avoid such setups, but they do work.

> The problem with erroring out during device init is that it does not
> give us a way of fixing things in future, as the frontend has not
> started at that time and thus we'd have no idea whether it could use
> whatever protocol fix we come up with. I think the only thing the
> backend could do is refuse to connect to an old frontend if
> logical_block_size != 512.

I was just thinking of getting back to the old state, with a quick fix
(by making the problematic new setting inaccessible) for the bug in 4.0
that could possibly be merged today or tomorrow for rc2.

What you need to do for actually supporting 4k disks in the long term
(QEMU 4.1 or later) depends on what the drivers look like currently and
is a separate discussion.

Kevin
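For readers unfamiliar with the read-modify-write (RMW) cycle mentioned above, the sketch below shows only the alignment arithmetic behind it: a 512-byte access to a device with 4k blocks has to be widened to whole blocks, which are read, patched and written back. This is a generic illustration with made-up names and is not how the QEMU block layer is actually structured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the block-aligned byte range [*start, *end)
 * that fully covers a request of len bytes at offset. block_size is
 * assumed to be a power of two (e.g. 4096). */
static void rmw_aligned_range(uint64_t offset, uint64_t len,
                              uint32_t block_size,
                              uint64_t *start, uint64_t *end)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    uint64_t mask = (uint64_t)block_size - 1;

    *start = offset & ~mask;               /* round down to a block boundary */
    *end = (offset + len + mask) & ~mask;  /* round up to a block boundary */
}
```

A single 512-byte write at offset 1536 on a 4k device therefore touches the whole range 0..4096, which is one way to picture where the performance penalty of such setups comes from.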
> -----Original Message-----
[snip]
> > >
> > > The old implementation has the sector size hardcoded:
> > >
> > >     #define BLOCK_SIZE 512
> > >
> > > Whereas the qdevified version uses DEFINE_BLOCK_PROPERTIES(), which
> > > includes user-visible options for logical/physical_block_size.
> > >
> > > So before, you couldn't even define a different sector size and the
> > > question whether 512 or the sector size should be used didn't make a
> > > difference anyway.
> > >
> > > > If so, then I think it is fine for this series to state (much more
> > > > clearly than it does) that it is returning qemu's behaviour to match the
> > > > currently released version, because we've discovered an issue in the
> > > > spec/protocol, and that we will subsequently work to address the issue in
> > > > the spec and provide a forwards path which doesn't involve nailing our
> > > > feet to the floor.
> > >
> > > The closest thing to returning to the old behaviour would be erroring
> > > out during device initialisation if logical_block_size != 512.
> >
> > One thing I've not figured out... If I create a blockdev in QEMU that
> > is pointing at a real device with a logical_block_size of 4k, will the
> > QEMU block layer perform the necessary read-modify-write cycles for
> > accesses < 4k? IOW would it be safe to always advertise a size of 512
> > to a frontend?
>
> Yes, for 512 accesses on native 4k disks with O_DIRECT, the QEMU block
> layer performs the necessary RMW. Of course, it still comes with a
> performance penalty, so you want to avoid such setups, but they do work.
>

Ok, that's good. Thanks.

> > The problem with erroring out during device init is that it does not
> > give us a way of fixing things in future, as the frontend has not
> > started at that time and thus we'd have no idea whether it could use
> > whatever protocol fix we come up with. I think the only thing the
> > backend could do is refuse to connect to an old frontend if
> > logical_block_size != 512.
>
> I was just thinking of getting back to the old state, with a quick fix
> (by making the problematic new setting inaccessible) for the bug in 4.0
> that could possibly be merged today or tomorrow for rc2.
>

Ok, I see what you mean. I'll modify and resubmit patch #1 today.

> What you need to do for actually supporting 4k disks in the long term
> (QEMU 4.1 or later) depends on what the drivers look like currently and
> is a separate discussion.
>

Yes, agreed.

  Paul

> Kevin
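To make the read-modify-write point concrete: when a 512-byte request lands on a device whose logical block size is 4k, the whole enclosing 4k block has to be read, patched and written back. The sketch below only shows the alignment arithmetic. It is not taken from the QEMU block layer, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte range [start, end) that must actually be transferred. */
struct aligned_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Widen a request of len bytes at offset to the enclosing range aligned
 * to block_size, which must be a power of two. Illustrative only.
 */
static struct aligned_range widen_to_blocks(uint64_t offset, uint64_t len,
                                            uint64_t block_size)
{
    struct aligned_range r;

    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

    /* Round the start down and the end up to a block boundary. */
    r.start = offset & ~(block_size - 1);
    r.end = (offset + len + block_size - 1) & ~(block_size - 1);
    return r;
}

/* A write needs read-modify-write if it does not cover whole blocks. */
static int needs_rmw(uint64_t offset, uint64_t len, uint64_t block_size)
{
    struct aligned_range r = widen_to_blocks(offset, len, block_size);

    return r.start != offset || r.end != offset + len;
}
```

For example, a 512-byte write at byte offset 1536 (512-byte sector 3) on a 4096-byte-block device widens to the range [0, 4096), so 4096 bytes are read and rewritten to change 512 of them. That extra traffic is the performance penalty mentioned above.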
On 01.04.2019 11:34, Kevin Wolf wrote:
>
> Yes, for 512 accesses on native 4k disks with O_DIRECT, the QEMU block
> layer performs the necessary RMW. Of course, it still comes with a
> performance penalty, so you want to avoid such setups, but they do work.

I suspect that the roughly 1/10th disk speed I see in domU compared to
dom0 (e.g. 28389 vs 258719 K/sec sequential block output) must be due to
this. I have the dom0 and domU drives as partitions on an md raid6. The
performance is sub-optimal, to put it mildly.

Here is one example bonnie++ run on a guest:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   3     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gt-credit2-on-p 16G  1062  90 28389   2 20233   2  3950  82 199633  10  1195  17
Latency               19003us     140ms     652ms   20811us     602ms   33189us
Version  1.97       ------Sequential Create------ --------Random Create--------
gt-credit2-on-pt    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                127 34445  26 +++++ +++ 42424  27 67418  51 +++++ +++ 52590  35
Latency               12979us     329us    2903us     304us      38us     668us
1.97,1.97,gt-credit2-on-pt,3,1550928114,16G,,1062,90,28389,2,20233,2,3950,82,199633,10,1195,17,127,,,,,34445,26,+++++,+++,42424,27,67418,51,+++++,+++,52590,35,19003us,140ms,652ms,20811us,602ms,33189us,12979us,329us,2903us,304us,38us,668us

Same on dom0:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   3     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
XEN-4.12-DOM0   16G   475  91 258719  20 151508  19   585  99 324725  22 392.3  14
Latency               16093us    1097ms     406ms   18136us     160ms     155ms
Version  1.97       ------Sequential Create------ --------Random Create--------
XEN-4.12-DOM0       -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  8  6563  11 +++++ +++ 13522  20 10969  18 +++++ +++  9560  13
Latency                 104us      68us   18179us     120us      20us      63us
1.97,1.97,XEN-4.12-DOM0,3,1547961396,16G,,475,91,258719,20,151508,19,585,99,324725,22,392.3,14,8,,,,,6563,11,+++++,+++,13522,20,10969,18,+++++,+++,9560,13,16093us,1097ms,406ms,18136us,160ms,155ms,104us,68us,18179us,120us,20us,63us