Message ID: 20230709202859.138387-1-michael.christie@oracle.com (mailing list archive)
Series: vhost-scsi: Fix IO hangs when using windows
On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote:
> The following patches were made over Linus's tree and fix an issue
> where windows guests will send iovecs with offset/lengths that result
> in IOs that are not aligned to 512. The LIO layer will then send them
> to Linux's FS/block layer but it requires 512 byte alignment, so
> depending on the FS/block driver being used we will get IO errors or
> hung IO.
>
> The following patches have vhost-scsi detect when windows sends these
> IOs and copy them to a bounce buffer. It then does some cleanup in
> the related code.

Thanks, tagged!

Mike, you are the main developer on vhost/scsi recently.
Do you want to be listed in MAINTAINERS too?
This implies you will be expected to review patches/bug reports though.

Thanks,
On 7/10/23 12:52 AM, Michael S. Tsirkin wrote:
> On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote:
>> The following patches were made over Linus's tree and fix an issue
>> where windows guests will send iovecs with offset/lengths that result
>> in IOs that are not aligned to 512. The LIO layer will then send them
>> to Linux's FS/block layer but it requires 512 byte alignment, so
>> depending on the FS/block driver being used we will get IO errors or
>> hung IO.
>>
>> The following patches have vhost-scsi detect when windows sends these
>> IOs and copy them to a bounce buffer. It then does some cleanup in
>> the related code.
>
> Thanks, tagged!
> Mike, you are the main developer on vhost/scsi recently.
> Do you want to be listed in MAINTAINERS too?
> This implies you will be expected to review patches/bug reports
> though.

That sounds good. Thanks.
On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote:
> The following patches were made over Linus's tree and fix an issue
> where windows guests will send iovecs with offset/lengths that result
> in IOs that are not aligned to 512. The LIO layer will then send them
> to Linux's FS/block layer but it requires 512 byte alignment, so
> depending on the FS/block driver being used we will get IO errors or
> hung IO.
>
> The following patches have vhost-scsi detect when windows sends these
> IOs and copy them to a bounce buffer. It then does some cleanup in
> the related code.

Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must
follow the usual constraints on SCSI block limits. Would Windows send
mis-aligned I/O to a non-virtio-scsi SCSI HBA?

Are you sure this is not a bug in the Windows guest driver where block
limits are being misconfigured?

Stefan
On 7/11/23 1:34 PM, Stefan Hajnoczi wrote:
> On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote:
>> The following patches were made over Linus's tree and fix an issue
>> where windows guests will send iovecs with offset/lengths that result
>> in IOs that are not aligned to 512. The LIO layer will then send them
>> to Linux's FS/block layer but it requires 512 byte alignment, so
>> depending on the FS/block driver being used we will get IO errors or
>> hung IO.
>>
>> The following patches have vhost-scsi detect when windows sends these
>> IOs and copy them to a bounce buffer. It then does some cleanup in
>> the related code.
>
> Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must
> follow the usual constraints on SCSI block limits. Would Windows send
> mis-aligned I/O to a non-virtio-scsi SCSI HBA?

It's like Linux, where you can configure settings like that.

> Are you sure this is not a bug in the Windows guest driver where block
> limits are being misconfigured?

From what our Windows dev told us, the guest drivers here:

https://github.com/virtio-win

don't set the Windows AlignmentMask to 512. They tried that and it
resulted in the Windows crash dump crashing, because it doesn't like the
hard alignment requirement.

We thought other apps would have trouble as well, so we tried to add
bounce buffer support to the Windows driver, but I think people thought
it was going to be uglier than this patch, and in the normal alignment
case it might also affect performance. There were some Windows
driver/layering and buffer/cmd details that I don't fully understand and
took their word for, because I don't know a lot about Windows.

In the end we still have to add checks to vhost-scsi to protect against
bad drivers, so we thought we might as well just add bounce buffer
support to vhost-scsi.
On Tue, Jul 11, 2023 at 04:01:22PM -0500, Mike Christie wrote:
> On 7/11/23 1:34 PM, Stefan Hajnoczi wrote:
>> [...]
>>
>> Are you sure this is not a bug in the Windows guest driver where block
>> limits are being misconfigured?
>
> From what our windows dev told us the guest drivers like here:
>
> https://github.com/virtio-win
>
> don't set the windows AlignmentMask to 512. They tried that and it
> resulted in windows crash dump crashing because it doesn't like the
> hard alignment requirement.
>
> We thought other apps would have trouble as well, so we tried to add
> bounce buffer support to the windows driver, but I think people thought
> it was going to be uglier than this patch and in the normal alignment
> case might also affect performance. There was some windows driver/layering
> and buffer/cmd details that I don't fully understand and took their word
> for because I don't know a lot about windows.
>
> In the end we still have to add checks to vhost-scsi to protect against
> bad drivers, so we thought we might as well just add bounce buffer support
> to vhost-scsi.

CCing virtio-win developers so they can confirm how the vioscsi driver
is supposed to handle request alignment.

My expectation is that the virtio-scsi device will fail mis-aligned I/O
requests.

Stefan
On 7/12/23 9:26 AM, Stefan Hajnoczi wrote:
> On Tue, Jul 11, 2023 at 04:01:22PM -0500, Mike Christie wrote:
>> [...]
>>
>> In the end we still have to add checks to vhost-scsi to protect against
>> bad drivers, so we thought we might as well just add bounce buffer support
>> to vhost-scsi.
>
> CCing virtio-win developers so they can confirm how the vioscsi driver
> is supposed to handle request alignment.
>
> My expectation is that the virtio-scsi device will fail mis-aligned I/O
> requests.

I don't think you can just change the driver's behavior to fail now,
because apps send mis-aligned IO and it's working as long as they have
less than 256 bio vecs.

We see mis-aligned IOs during boot and also from random non-Windows apps.
If we just start to fail then it would be a regression when the app no
longer works or the OS fails to start up.
On Wed, Jul 12, 2023 at 11:05:11AM -0500, Mike Christie wrote:
> On 7/12/23 9:26 AM, Stefan Hajnoczi wrote:
>> [...]
>>
>> My expectation is that the virtio-scsi device will fail mis-aligned I/O
>> requests.
>
> I don't think you can just change the driver's behavior to fail now,
> because apps send mis-aligned IO and its working as long as they have less
> than 256 bio vecs.
>
> We see mis-aligned IOs during boot and also from random non window's apps.
> If we just start to fail then it would be a regression when the app no
> longer works or the OS fails to start up.

I was wrong: the virtio-scsi specification contains no alignment
requirements for I/O buffers. It is fine for the driver to submit iovecs
with any memory alignment.

The QEMU code allocates a bounce buffer if the iovecs submitted by the
driver do not match the minimum alignment requirements on the host
(e.g. O_DIRECT requirements). It makes sense that vhost-scsi needs to
use a bounce buffer in cases where the underlying storage has stricter
memory alignment requirements.

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
On Thu, Jul 13, 2023 at 03:55:45PM +1000, Vadim Rozenfeld wrote:
> Currently we use 4-byte alignment (FILE_LONG_ALIGNMENT) in both Windows
> virtio blk and scsi miniport drivers.
> It shouldn't be a problem to change it to 512 by setting the AlignmentMask
> field of the PORT_CONFIGURATION_INFORMATION structure
> (https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/storport/ns-storport-_port_configuration_information)
> to FILE_512_BYTE_ALIGNMENT.
> I don't see any problem with changing the alignment parameter in our
> drivers. But it will take us some time to test it properly.

Hi Vadim,

After looking at this again, I confused memory address/size alignment
with request offset/length alignment. The virtio-scsi device does not
have memory alignment constraints, so FILE_LONG_ALIGNMENT is okay and
there's no need to change it.

Thanks,
Stefan