diff mbox series

[2/2] virtio_pmem: set device ready in probe()

Message ID 20220620081519.1494-2-jasowang@redhat.com (mailing list archive)
State Superseded
Headers show
Series [1/2] virtio_pmem: initialize provider_data through nd_region_desc | expand

Commit Message

Jason Wang June 20, 2022, 8:15 a.m. UTC
The NVDIMM region could be available before the virtio_device_ready()
that is called by virtio_dev_probe(). This means the driver tries to
use device before DRIVER_OK which violates the spec, fixing this by
set device ready before the nvdimm_pmem_region_create().

Note that this means the virtio_pmem_host_ack() could be triggered
before the creation of the nd region, this is safe since the
virtio_pmem_host_ack() since pmem_lock has been initialized and we
check if we've added any buffer before trying to proceed.

Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Michael S. Tsirkin June 20, 2022, 8:32 a.m. UTC | #1
I think you should CC the maintainer, Pankaj Gupta.

On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
> 
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +	/*
> +	 * The NVDIMM region could be available before the
> +	 * virtio_device_ready() that is called by
> +	 * virtio_dev_probe(), so we set device ready here.
> +	 *

virtio_dev_probe is not to blame here, right?
I don't like copying its logic here as we won't remember to fix
it if we change virtio_dev_probe to e.g. not call virtio_device_ready.

is it nvdimm_pmem_region_create what makes it possible for
the region to become available?
Then "The NVDIMM region could become available immediately
after the call to nvdimm_pmem_region_create.
Tell device we are ready to handle this case."

> +	 * The callback - virtio_pmem_host_ack() is safe to be called
> +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> +	 * has been initialized and legality of a used buffer is
> +	 * validated before moving forward.
> +	 */
> +	virtio_device_ready(vdev);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>  	if (!nd_region) {
>  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	}
>  	return 0;
>  out_nd:
> +	virtio_reset_device(vdev);


Does this fix cleanup too?

>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>  	vdev->config->del_vqs(vdev);
> -- 
> 2.25.1
Jason Wang June 20, 2022, 8:39 a.m. UTC | #2
On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> I think you should CC the maintainer, Pankaj Gupta.

Yes, I miss him accidentally.

>
> On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.provider_data = vdev;
> >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +     /*
> > +      * The NVDIMM region could be available before the
> > +      * virtio_device_ready() that is called by
> > +      * virtio_dev_probe(), so we set device ready here.
> > +      *
>
> virtio_dev_probe is not to blame here, right?

Yes and actually it's not to blame, it just describes what can happen now.

> I don't like copying its logic here as we won't remember to fix
> it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
>
> is it nvdimm_pmem_region_create what makes it possible for
> the region to become available?

I think so.

> Then "The NVDIMM region could become available immediately
> after the call to nvdimm_pmem_region_create.
> Tell device we are ready to handle this case."

That's fine.

>
> > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > +      * has been initialized and legality of a used buffer is
> > +      * validated before moving forward.
> > +      */
> > +     virtio_device_ready(vdev);
> >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >       if (!nd_region) {
> >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       }
> >       return 0;
> >  out_nd:
> > +     virtio_reset_device(vdev);
>
>
> Does this fix cleanup too?

Not sure I get this, we make the device ready before
nvdimm_pmem_region_create(), so we need to reset if
nvdimm_pmem_region_create() fails?

Thanks

>
> >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >       vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
>
Michael S. Tsirkin June 20, 2022, 8:53 a.m. UTC | #3
On Mon, Jun 20, 2022 at 04:39:27PM +0800, Jason Wang wrote:
> On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > I think you should CC the maintainer, Pankaj Gupta.
> 
> Yes, I miss him accidentally.
> 
> >
> > On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       ndr_desc.provider_data = vdev;
> > >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > +     /*
> > > +      * The NVDIMM region could be available before the
> > > +      * virtio_device_ready() that is called by
> > > +      * virtio_dev_probe(), so we set device ready here.
> > > +      *
> >
> > virtio_dev_probe is not to blame here, right?
> 
> Yes and actually it's not to blame, it just describes what can happen now.
> 
> > I don't like copying its logic here as we won't remember to fix
> > it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
> >
> > is it nvdimm_pmem_region_create what makes it possible for
> > the region to become available?
> 
> I think so.
> 
> > Then "The NVDIMM region could become available immediately
> > after the call to nvdimm_pmem_region_create.
> > Tell device we are ready to handle this case."
> 
> That's fine.
> 
> >
> > > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > > +      * has been initialized and legality of a used buffer is
> > > +      * validated before moving forward.
> > > +      */
> > > +     virtio_device_ready(vdev);
> > >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > >       if (!nd_region) {
> > >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       }
> > >       return 0;
> > >  out_nd:
> > > +     virtio_reset_device(vdev);
> >
> >
> > Does this fix cleanup too?
> 
> Not sure I get this, we make the device ready before
> nvdimm_pmem_region_create(), so we need to reset if
> nvdimm_pmem_region_create() fails?
> 
> Thanks

Oh, right.

> >
> > >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > >  out_vq:
> > >       vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> >
Pankaj Gupta June 21, 2022, 12:34 p.m. UTC | #4
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
>
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +       /*
> +        * The NVDIMM region could be available before the
> +        * virtio_device_ready() that is called by
> +        * virtio_dev_probe(), so we set device ready here.
> +        *
> +        * The callback - virtio_pmem_host_ack() is safe to be called
> +        * before the nvdimm_pmem_region_create() since the pmem_lock
> +        * has been initialized and legality of a used buffer is
> +        * validated before moving forward.
> +        */
> +       virtio_device_ready(vdev);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>         if (!nd_region) {
>                 dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         }
>         return 0;
>  out_nd:
> +       virtio_reset_device(vdev);
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>         vdev->config->del_vqs(vdev);

IIRC Similar fix was submitted by msft in the past while proposing support for
PCI BAR with virtio pmem and I tested it. Feel free to add.

Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>
Dan Williams June 21, 2022, 10:38 p.m. UTC | #5
Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().

Can you clarify the failure path. What race is virtio_device_ready()
losing?

> 
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.

I got a little bit lost with the usage of "we" here. Can you clarify
which function / context is making which guarantee?

> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +	/*
> +	 * The NVDIMM region could be available before the
> +	 * virtio_device_ready() that is called by
> +	 * virtio_dev_probe(), so we set device ready here.
> +	 *
> +	 * The callback - virtio_pmem_host_ack() is safe to be called
> +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> +	 * has been initialized and legality of a used buffer is
> +	 * validated before moving forward.

This comment feels like changelog material. Just document why
virtio_device_ready() must be called before device_add() of the
nd_region.

> +	 */
> +	virtio_device_ready(vdev);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>  	if (!nd_region) {
>  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	}
>  	return 0;
>  out_nd:
> +	virtio_reset_device(vdev);
>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>  	vdev->config->del_vqs(vdev);
> -- 
> 2.25.1
>
Jason Wang June 22, 2022, 3:34 a.m. UTC | #6
On Wed, Jun 22, 2022 at 6:38 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
>
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?

So it's something like this:

1) virtio_device_ready() will set DRIVER_OK to the device.
2) virtio spec disallow device to process a virtqueue without DRIVER_OK to set

But the nd_region is available to user after nd_region_create(), and a
flush could be issued before virtio_device_ready().

This means the hypervisor gets a kick on the virtqueue before
DRIVER_OK. The hypervisor should choose not to respond to that kick
according to the spec. This will result in infinite wait in
virtio_pmem_flush().

Fortunately, qemu doesn't check DRIVER_OK and can process the
virtqueue without DRIVER_OK (which is kind of a spec violation), so we
survive for the past few years. But there's no guarantee it can work
for other hypervisor.

So we need to set DRIVER_OK before nd_region_create() to make sure flush works.

>
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
>
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?

By "we" I meant the callback for the req_vq that is virtio_pmem_host_ack().

If we do virtio_device_ready() before nd_region_create(). A buggy or
malicious hypervisor can raise the notification before
nd_region_create(). We need to make sure virtio_pmem_host_ack() can
survive from this. And since we've checked whether we've submitted any
request before, so in the case where guest memory is protected from
the host, we're safe here.

>
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.provider_data = vdev;
> >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +     /*
> > +      * The NVDIMM region could be available before the
> > +      * virtio_device_ready() that is called by
> > +      * virtio_dev_probe(), so we set device ready here.
> > +      *
> > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > +      * has been initialized and legality of a used buffer is
> > +      * validated before moving forward.
>
> This comment feels like changelog material.

I had this in the changelog:

> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.

> Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.

This comment wants to explain the side effect of having
virtio_device_ready() before nvdimm_pmem_region_create() and why we
can survive from that.

Thanks

>
> > +      */
> > +     virtio_device_ready(vdev);
> >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >       if (!nd_region) {
> >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       }
> >       return 0;
> >  out_nd:
> > +     virtio_reset_device(vdev);
> >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >       vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
> >
>
Michael S. Tsirkin June 22, 2022, 6:29 a.m. UTC | #7
On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
> 
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?
> 
> > 
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> 
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?
> 
> > 
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >  	ndr_desc.provider_data = vdev;
> >  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +	/*
> > +	 * The NVDIMM region could be available before the
> > +	 * virtio_device_ready() that is called by
> > +	 * virtio_dev_probe(), so we set device ready here.
> > +	 *
> > +	 * The callback - virtio_pmem_host_ack() is safe to be called
> > +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> > +	 * has been initialized and legality of a used buffer is
> > +	 * validated before moving forward.
> 
> This comment feels like changelog material. Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.

Agree here. More specifically if you are documenting why is it
safe to invoke each callback then that belongs to the callback itself.

> > +	 */
> > +	virtio_device_ready(vdev);
> >  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >  	if (!nd_region) {
> >  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >  	}
> >  	return 0;
> >  out_nd:
> > +	virtio_reset_device(vdev);
> >  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >  	vdev->config->del_vqs(vdev);
> > -- 
> > 2.25.1
> >
Jason Wang June 22, 2022, 7:24 a.m. UTC | #8
On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> >
> > Can you clarify the failure path. What race is virtio_device_ready()
> > losing?
> >
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> >
> > I got a little bit lost with the usage of "we" here. Can you clarify
> > which function / context is making which guarantee?
> >
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >     ndr_desc.provider_data = vdev;
> > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > +   /*
> > > +    * The NVDIMM region could be available before the
> > > +    * virtio_device_ready() that is called by
> > > +    * virtio_dev_probe(), so we set device ready here.
> > > +    *
> > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > +    * has been initialized and legality of a used buffer is
> > > +    * validated before moving forward.
> >
> > This comment feels like changelog material. Just document why
> > virtio_device_ready() must be called before device_add() of the
> > nd_region.
>
> Agree here. More specifically if you are documenting why is it
> safe to invoke each callback then that belongs to the callback itself.

Ok, so I will move it to the callback and leave a simple comment like

" See comment in virtio_pmem_host_ack(), it is safe to be called
before nvdimm_pmem_region_create()"

Thanks

>
> > > +    */
> > > +   virtio_device_ready(vdev);
> > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > >     if (!nd_region) {
> > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >     }
> > >     return 0;
> > >  out_nd:
> > > +   virtio_reset_device(vdev);
> > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > >  out_vq:
> > >     vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> > >
>
Michael S. Tsirkin June 22, 2022, 12:31 p.m. UTC | #9
On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > Jason Wang wrote:
> > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Can you clarify the failure path. What race is virtio_device_ready()
> > > losing?
> > >
> > > >
> > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > before the creation of the nd region, this is safe since the
> > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > check if we've added any buffer before trying to proceed.
> > >
> > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > which function / context is making which guarantee?
> > >
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > >  1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 48f8327d0431..173f2f5adaea 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >     ndr_desc.provider_data = vdev;
> > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > +   /*
> > > > +    * The NVDIMM region could be available before the
> > > > +    * virtio_device_ready() that is called by
> > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > +    *
> > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > +    * has been initialized and legality of a used buffer is
> > > > +    * validated before moving forward.
> > >
> > > This comment feels like changelog material. Just document why
> > > virtio_device_ready() must be called before device_add() of the
> > > nd_region.
> >
> > Agree here. More specifically if you are documenting why is it
> > safe to invoke each callback then that belongs to the callback itself.
> 
> Ok, so I will move it to the callback and leave a simple comment like
> 
> " See comment in virtio_pmem_host_ack(), it is safe to be called
> before nvdimm_pmem_region_create()"
> 
> Thanks

No, just document why virtio_device_ready() must be called before device_add()

I don't think the idea of working around these issues by adding code
to  virtio_device_ready worked so far, not at all sure this approach
is here to stay.


> >
> > > > +    */
> > > > +   virtio_device_ready(vdev);
> > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > >     if (!nd_region) {
> > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >     }
> > > >     return 0;
> > > >  out_nd:
> > > > +   virtio_reset_device(vdev);
> > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > >  out_vq:
> > > >     vdev->config->del_vqs(vdev);
> > > > --
> > > > 2.25.1
> > > >
> >
Jason Wang June 23, 2022, 1:29 a.m. UTC | #10
On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > Jason Wang wrote:
> > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > set device ready before the nvdimm_pmem_region_create().
> > > >
> > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > losing?
> > > >
> > > > >
> > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > before the creation of the nd region, this is safe since the
> > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > check if we've added any buffer before trying to proceed.
> > > >
> > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > which function / context is making which guarantee?
> > > >
> > > > >
> > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > >  1 file changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > >     ndr_desc.provider_data = vdev;
> > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > +   /*
> > > > > +    * The NVDIMM region could be available before the
> > > > > +    * virtio_device_ready() that is called by
> > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > +    *
> > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > +    * has been initialized and legality of a used buffer is
> > > > > +    * validated before moving forward.
> > > >
> > > > This comment feels like changelog material. Just document why
> > > > virtio_device_ready() must be called before device_add() of the
> > > > nd_region.
> > >
> > > Agree here. More specifically if you are documenting why is it
> > > safe to invoke each callback then that belongs to the callback itself.
> >
> > Ok, so I will move it to the callback and leave a simple comment like
> >
> > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > before nvdimm_pmem_region_create()"
> >
> > Thanks
>
> No, just document why virtio_device_ready() must be called before device_add()
>
> I don't think the idea of working around these issues by adding code
> to  virtio_device_ready worked so far,

Any issue you found in this approach?

> not at all sure this approach
> is here to stay.

Or do you have other ideas to fix this issue?

Thanks

>
>
> > >
> > > > > +    */
> > > > > +   virtio_device_ready(vdev);
> > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > >     if (!nd_region) {
> > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > >     }
> > > > >     return 0;
> > > > >  out_nd:
> > > > > +   virtio_reset_device(vdev);
> > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > >  out_vq:
> > > > >     vdev->config->del_vqs(vdev);
> > > > > --
> > > > > 2.25.1
> > > > >
> > >
>
Jason Wang June 23, 2022, 3:57 a.m. UTC | #11
On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > Jason Wang wrote:
> > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > >
> > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > losing?
> > > > >
> > > > > >
> > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > before the creation of the nd region, this is safe since the
> > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > check if we've added any buffer before trying to proceed.
> > > > >
> > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > which function / context is making which guarantee?
> > > > >
> > > > > >
> > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > ---
> > > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > >  1 file changed, 12 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > >     ndr_desc.provider_data = vdev;
> > > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > +   /*
> > > > > > +    * The NVDIMM region could be available before the
> > > > > > +    * virtio_device_ready() that is called by
> > > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > > +    *
> > > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > +    * has been initialized and legality of a used buffer is
> > > > > > +    * validated before moving forward.
> > > > >
> > > > > This comment feels like changelog material. Just document why
> > > > > virtio_device_ready() must be called before device_add() of the
> > > > > nd_region.
> > > >
> > > > Agree here. More specifically if you are documenting why is it
> > > > safe to invoke each callback then that belongs to the callback itself.
> > >
> > > Ok, so I will move it to the callback and leave a simple comment like
> > >
> > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > before nvdimm_pmem_region_create()"
> > >
> > > Thanks
> >
> > No, just document why virtio_device_ready() must be called before device_add()
> >
> > I don't think the idea of working around these issues by adding code
> > to  virtio_device_ready worked so far,
>
> Any issue you found in this approach?
>
> > not at all sure this approach
> > is here to stay.
>
> Or do you have other ideas to fix this issue?

Or do you think we can do something similar to harden the config
interrupt (down the road with the kconfig option)?

virtio_device_ready(); // set driver ok but delay the vring interrupt
subsystem_register();
virtio_enable_vq_callback(); // enable vring interrupt and raised
delayed interrupt

Thanks

>
> Thanks
>
> >
> >
> > > >
> > > > > > +    */
> > > > > > +   virtio_device_ready(vdev);
> > > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > >     if (!nd_region) {
> > > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > >     }
> > > > > >     return 0;
> > > > > >  out_nd:
> > > > > > +   virtio_reset_device(vdev);
> > > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > >  out_vq:
> > > > > >     vdev->config->del_vqs(vdev);
> > > > > > --
> > > > > > 2.25.1
> > > > > >
> > > >
> >
Michael S. Tsirkin June 24, 2022, 6:44 a.m. UTC | #12
On Thu, Jun 23, 2022 at 11:57:26AM +0800, Jason Wang wrote:
> On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > > Jason Wang wrote:
> > > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > > >
> > > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > > losing?
> > > > > >
> > > > > > >
> > > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > > before the creation of the nd region, this is safe since the
> > > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > > check if we've added any buffer before trying to proceed.
> > > > > >
> > > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > > which function / context is making which guarantee?
> > > > > >
> > > > > > >
> > > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > ---
> > > > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > > >  1 file changed, 12 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > >     ndr_desc.provider_data = vdev;
> > > > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > > +   /*
> > > > > > > +    * The NVDIMM region could be available before the
> > > > > > > +    * virtio_device_ready() that is called by
> > > > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > > > +    *
> > > > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > > +    * has been initialized and legality of a used buffer is
> > > > > > > +    * validated before moving forward.
> > > > > >
> > > > > > This comment feels like changelog material. Just document why
> > > > > > virtio_device_ready() must be called before device_add() of the
> > > > > > nd_region.
> > > > >
> > > > > Agree here. More specifically if you are documenting why is it
> > > > > safe to invoke each callback then that belongs to the callback itself.
> > > >
> > > > Ok, so I will move it to the callback and leave a simple comment like
> > > >
> > > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > > before nvdimm_pmem_region_create()"
> > > >
> > > > Thanks
> > >
> > > No, just document why virtio_device_ready() must be called before device_add()
> > >
> > > I don't think the idea of working around these issues by adding code
> > > to  virtio_device_ready worked so far,
> >
> > Any issue you found in this approach?
> >
> > > not at all sure this approach
> > > is here to stay.
> >
> > Or do you have other ideas to fix this issue?
> 
> Or do you think we can do something similar to harden the config
> interrupt (down the road with the kconfig option)?
> 
> virtio_device_ready(); // set driver ok but delay the vring interrupt
> subsystem_register();
> virtio_enable_vq_callback(); // enable vring interrupt and raised
> delayed interrupt
> 
> Thanks

Yes and from API POV I think we should do

virtio_disable_vq_callback();
virtio_device_ready();
subsystem_register();
virtio_enable_vq_callback();

this way we won't break all drivers that aren't careful like
previous hardening patches did.


> >
> > Thanks
> >
> > >
> > >
> > > > >
> > > > > > > +    */
> > > > > > > +   virtio_device_ready(vdev);
> > > > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > > >     if (!nd_region) {
> > > > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > >     }
> > > > > > >     return 0;
> > > > > > >  out_nd:
> > > > > > > +   virtio_reset_device(vdev);
> > > > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > > >  out_vq:
> > > > > > >     vdev->config->del_vqs(vdev);
> > > > > > > --
> > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >
diff mbox series

Patch

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 48f8327d0431..173f2f5adaea 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -84,6 +84,17 @@  static int virtio_pmem_probe(struct virtio_device *vdev)
 	ndr_desc.provider_data = vdev;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
 	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
+	/*
+	 * The NVDIMM region could be available before the
+	 * virtio_device_ready() that is called by
+	 * virtio_dev_probe(), so we set device ready here.
+	 *
+	 * The callback - virtio_pmem_host_ack() is safe to be called
+	 * before the nvdimm_pmem_region_create() since the pmem_lock
+	 * has been initialized and legality of a used buffer is
+	 * validated before moving forward.
+	 */
+	virtio_device_ready(vdev);
 	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
 	if (!nd_region) {
 		dev_err(&vdev->dev, "failed to create nvdimm region\n");
@@ -92,6 +103,7 @@  static int virtio_pmem_probe(struct virtio_device *vdev)
 	}
 	return 0;
 out_nd:
+	virtio_reset_device(vdev);
 	nvdimm_bus_unregister(vpmem->nvdimm_bus);
 out_vq:
 	vdev->config->del_vqs(vdev);