Message ID | 20210408081109.56537-2-mgurtovoy@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2,1/3] virtio: update reset callback to return status | expand |
在 2021/4/8 下午4:11, Max Gurtovoy 写道: > According to the spec after writing 0 to device_status, the driver MUST > wait for a read of device_status to return 0 before reinitializing the > device. In case we have a device that won't return 0, the reset > operation will loop forever and cause the host/vm to stuck. Set timeout > for 3 minutes before giving up on the device. > > Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> > --- > drivers/virtio/virtio_pci_modern.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c > index cc3412a96a17..dcee616e8d21 100644 > --- a/drivers/virtio/virtio_pci_modern.c > +++ b/drivers/virtio/virtio_pci_modern.c > @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) > { > struct virtio_pci_device *vp_dev = to_vp_device(vdev); > struct virtio_pci_modern_device *mdev = &vp_dev->mdev; > + unsigned long timeout = jiffies + msecs_to_jiffies(180000); > > /* 0 status means a reset. */ > vp_modern_set_status(mdev, 0); > @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) > * device_status to return 0 before reinitializing the device. > * This will flush out the status write, and flush in device writes, > * including MSI-X interrupts, if any. > + * Set a timeout before giving up on the device. > */ > - while (vp_modern_get_status(mdev)) > + while (vp_modern_get_status(mdev)) { > + if (time_after(jiffies, timeout)) { What happens if the device finish the rest after the timeout? Thanks > + dev_err(&vdev->dev, "virtio: device not ready. " > + "Aborting. Try again later\n"); > + return -EAGAIN; > + } > msleep(1); > + } > /* Flush pending VQ/configuration callbacks. */ > vp_synchronize_vectors(vdev); > return 0;
On 4/8/2021 12:01 PM, Jason Wang wrote: > > 在 2021/4/8 下午4:11, Max Gurtovoy 写道: >> According to the spec after writing 0 to device_status, the driver MUST >> wait for a read of device_status to return 0 before reinitializing the >> device. In case we have a device that won't return 0, the reset >> operation will loop forever and cause the host/vm to stuck. Set timeout >> for 3 minutes before giving up on the device. >> >> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> >> --- >> drivers/virtio/virtio_pci_modern.c | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/virtio/virtio_pci_modern.c >> b/drivers/virtio/virtio_pci_modern.c >> index cc3412a96a17..dcee616e8d21 100644 >> --- a/drivers/virtio/virtio_pci_modern.c >> +++ b/drivers/virtio/virtio_pci_modern.c >> @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) >> { >> struct virtio_pci_device *vp_dev = to_vp_device(vdev); >> struct virtio_pci_modern_device *mdev = &vp_dev->mdev; >> + unsigned long timeout = jiffies + msecs_to_jiffies(180000); >> /* 0 status means a reset. */ >> vp_modern_set_status(mdev, 0); >> @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) >> * device_status to return 0 before reinitializing the device. >> * This will flush out the status write, and flush in device >> writes, >> * including MSI-X interrupts, if any. >> + * Set a timeout before giving up on the device. >> */ >> - while (vp_modern_get_status(mdev)) >> + while (vp_modern_get_status(mdev)) { >> + if (time_after(jiffies, timeout)) { > > > What happens if the device finish the rest after the timeout? The driver will set VIRTIO_CONFIG_S_FAILED and one can re-probe it later on (e.g by re-scanning the pci bus). > > Thanks > > >> + dev_err(&vdev->dev, "virtio: device not ready. " >> + "Aborting. Try again later\n"); >> + return -EAGAIN; >> + } >> msleep(1); >> + } >> /* Flush pending VQ/configuration callbacks. */ >> vp_synchronize_vectors(vdev); >> return 0; >
在 2021/4/8 下午5:44, Max Gurtovoy 写道: > > On 4/8/2021 12:01 PM, Jason Wang wrote: >> >> 在 2021/4/8 下午4:11, Max Gurtovoy 写道: >>> According to the spec after writing 0 to device_status, the driver MUST >>> wait for a read of device_status to return 0 before reinitializing the >>> device. In case we have a device that won't return 0, the reset >>> operation will loop forever and cause the host/vm to stuck. Set timeout >>> for 3 minutes before giving up on the device. >>> >>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> >>> --- >>> drivers/virtio/virtio_pci_modern.c | 10 +++++++++- >>> 1 file changed, 9 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/virtio/virtio_pci_modern.c >>> b/drivers/virtio/virtio_pci_modern.c >>> index cc3412a96a17..dcee616e8d21 100644 >>> --- a/drivers/virtio/virtio_pci_modern.c >>> +++ b/drivers/virtio/virtio_pci_modern.c >>> @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) >>> { >>> struct virtio_pci_device *vp_dev = to_vp_device(vdev); >>> struct virtio_pci_modern_device *mdev = &vp_dev->mdev; >>> + unsigned long timeout = jiffies + msecs_to_jiffies(180000); >>> /* 0 status means a reset. */ >>> vp_modern_set_status(mdev, 0); >>> @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) >>> * device_status to return 0 before reinitializing the device. >>> * This will flush out the status write, and flush in device >>> writes, >>> * including MSI-X interrupts, if any. >>> + * Set a timeout before giving up on the device. >>> */ >>> - while (vp_modern_get_status(mdev)) >>> + while (vp_modern_get_status(mdev)) { >>> + if (time_after(jiffies, timeout)) { >> >> >> What happens if the device finish the rest after the timeout? > > > The driver will set VIRTIO_CONFIG_S_FAILED and one can re-probe it > later on (e.g by re-scanning the pci bus). Ok, so do we need the flush through vp_synchronize_vectors() here? Thanks > > >> >> Thanks >> >> >>> + dev_err(&vdev->dev, "virtio: device not ready. " >>> + "Aborting. Try again later\n"); >>> + return -EAGAIN; >>> + } >>> msleep(1); >>> + } >>> /* Flush pending VQ/configuration callbacks. */ >>> vp_synchronize_vectors(vdev); >>> return 0; >> >
On 4/8/2021 3:45 PM, Jason Wang wrote: > > 在 2021/4/8 下午5:44, Max Gurtovoy 写道: >> >> On 4/8/2021 12:01 PM, Jason Wang wrote: >>> >>> 在 2021/4/8 下午4:11, Max Gurtovoy 写道: >>>> According to the spec after writing 0 to device_status, the driver >>>> MUST >>>> wait for a read of device_status to return 0 before reinitializing the >>>> device. In case we have a device that won't return 0, the reset >>>> operation will loop forever and cause the host/vm to stuck. Set >>>> timeout >>>> for 3 minutes before giving up on the device. >>>> >>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> >>>> --- >>>> drivers/virtio/virtio_pci_modern.c | 10 +++++++++- >>>> 1 file changed, 9 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/virtio/virtio_pci_modern.c >>>> b/drivers/virtio/virtio_pci_modern.c >>>> index cc3412a96a17..dcee616e8d21 100644 >>>> --- a/drivers/virtio/virtio_pci_modern.c >>>> +++ b/drivers/virtio/virtio_pci_modern.c >>>> @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) >>>> { >>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev); >>>> struct virtio_pci_modern_device *mdev = &vp_dev->mdev; >>>> + unsigned long timeout = jiffies + msecs_to_jiffies(180000); >>>> /* 0 status means a reset. */ >>>> vp_modern_set_status(mdev, 0); >>>> @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) >>>> * device_status to return 0 before reinitializing the device. >>>> * This will flush out the status write, and flush in device >>>> writes, >>>> * including MSI-X interrupts, if any. >>>> + * Set a timeout before giving up on the device. >>>> */ >>>> - while (vp_modern_get_status(mdev)) >>>> + while (vp_modern_get_status(mdev)) { >>>> + if (time_after(jiffies, timeout)) { >>> >>> >>> What happens if the device finish the rest after the timeout? >> >> >> The driver will set VIRTIO_CONFIG_S_FAILED and one can re-probe it >> later on (e.g by re-scanning the pci bus). > > > Ok, so do we need the flush through vp_synchronize_vectors() here? If the device didn't write 0 to status I guess we don't need that. The device shouldn't raise any interrupt before negotiation finish successfully. MST, is that correct ? > > Thanks > > >> >> >>> >>> Thanks >>> >>> >>>> + dev_err(&vdev->dev, "virtio: device not ready. " >>>> + "Aborting. Try again later\n"); >>>> + return -EAGAIN; >>>> + } >>>> msleep(1); >>>> + } >>>> /* Flush pending VQ/configuration callbacks. */ >>>> vp_synchronize_vectors(vdev); >>>> return 0; >>> >> >
在 2021/4/8 下午8:57, Max Gurtovoy 写道: > > On 4/8/2021 3:45 PM, Jason Wang wrote: >> >> 在 2021/4/8 下午5:44, Max Gurtovoy 写道: >>> >>> On 4/8/2021 12:01 PM, Jason Wang wrote: >>>> >>>> 在 2021/4/8 下午4:11, Max Gurtovoy 写道: >>>>> According to the spec after writing 0 to device_status, the driver >>>>> MUST >>>>> wait for a read of device_status to return 0 before reinitializing >>>>> the >>>>> device. In case we have a device that won't return 0, the reset >>>>> operation will loop forever and cause the host/vm to stuck. Set >>>>> timeout >>>>> for 3 minutes before giving up on the device. >>>>> >>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> >>>>> --- >>>>> drivers/virtio/virtio_pci_modern.c | 10 +++++++++- >>>>> 1 file changed, 9 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/virtio/virtio_pci_modern.c >>>>> b/drivers/virtio/virtio_pci_modern.c >>>>> index cc3412a96a17..dcee616e8d21 100644 >>>>> --- a/drivers/virtio/virtio_pci_modern.c >>>>> +++ b/drivers/virtio/virtio_pci_modern.c >>>>> @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) >>>>> { >>>>> struct virtio_pci_device *vp_dev = to_vp_device(vdev); >>>>> struct virtio_pci_modern_device *mdev = &vp_dev->mdev; >>>>> + unsigned long timeout = jiffies + msecs_to_jiffies(180000); >>>>> /* 0 status means a reset. */ >>>>> vp_modern_set_status(mdev, 0); >>>>> @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) >>>>> * device_status to return 0 before reinitializing the device. >>>>> * This will flush out the status write, and flush in device >>>>> writes, >>>>> * including MSI-X interrupts, if any. >>>>> + * Set a timeout before giving up on the device. >>>>> */ >>>>> - while (vp_modern_get_status(mdev)) >>>>> + while (vp_modern_get_status(mdev)) { >>>>> + if (time_after(jiffies, timeout)) { >>>> >>>> >>>> What happens if the device finish the rest after the timeout? >>> >>> >>> The driver will set VIRTIO_CONFIG_S_FAILED and one can re-probe it >>> later on (e.g by re-scanning the pci bus). >> >> >> Ok, so do we need the flush through vp_synchronize_vectors() here? > > If the device didn't write 0 to status I guess we don't need that. > > The device shouldn't raise any interrupt before negotiation finish > successfully. The reset could be triggered in other places like driver removing. Thanks > > MST, is that correct ? > >> >> Thanks >> >> >>> >>> >>>> >>>> Thanks >>>> >>>> >>>>> + dev_err(&vdev->dev, "virtio: device not ready. " >>>>> + "Aborting. Try again later\n"); >>>>> + return -EAGAIN; >>>>> + } >>>>> msleep(1); >>>>> + } >>>>> /* Flush pending VQ/configuration callbacks. */ >>>>> vp_synchronize_vectors(vdev); >>>>> return 0; >>>> >>> >> >
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index cc3412a96a17..dcee616e8d21 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -162,6 +162,7 @@ static int vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtio_pci_modern_device *mdev = &vp_dev->mdev; + unsigned long timeout = jiffies + msecs_to_jiffies(180000); /* 0 status means a reset. */ vp_modern_set_status(mdev, 0); @@ -169,9 +170,16 @@ static int vp_reset(struct virtio_device *vdev) * device_status to return 0 before reinitializing the device. * This will flush out the status write, and flush in device writes, * including MSI-X interrupts, if any. + * Set a timeout before giving up on the device. */ - while (vp_modern_get_status(mdev)) + while (vp_modern_get_status(mdev)) { + if (time_after(jiffies, timeout)) { + dev_err(&vdev->dev, "virtio: device not ready. " + "Aborting. Try again later\n"); + return -EAGAIN; + } msleep(1); + } /* Flush pending VQ/configuration callbacks. */ vp_synchronize_vectors(vdev); return 0;
According to the spec after writing 0 to device_status, the driver MUST wait for a read of device_status to return 0 before reinitializing the device. In case we have a device that won't return 0, the reset operation will loop forever and cause the host/vm to stuck. Set timeout for 3 minutes before giving up on the device. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> --- drivers/virtio/virtio_pci_modern.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)