diff mbox series

[20/34] iio: inkern: only relase the device node when done with it

Message ID 20220610084545.547700-21-nuno.sa@analog.com (mailing list archive)
State Superseded
Headers show
Series make iio inkern interface firmware agnostic | expand

Commit Message

Nuno Sa June 10, 2022, 8:45 a.m. UTC
'of_node_put()' can potentially release the memory pointed to by
'iiospec.np' which would leave us with an invalid pointer (and we would
still pass it in 'of_xlate()'). As such, we can only release the node
after we are done with it.

Fixes: 17d82b47a215d ("iio: Add OF support")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
 drivers/iio/inkern.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Andy Shevchenko June 10, 2022, 2:56 p.m. UTC | #1
On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com> wrote:
>
> 'of_node_put()' can potentially release the memory pointed to by
> 'iiospec.np' which would leave us with an invalid pointer (and we would
> still pass it in 'of_xlate()'). As such, we can only release the node
> after we are done with it.

The question you should answer in the commit message is the following:
"Can an OF node, attached to a struct device, be gone before the
device itself?" If it so, then patch is good, otherwise there is no
point in this patch in the first place.
Nuno Sá June 10, 2022, 8:08 p.m. UTC | #2
On Fri, 2022-06-10 at 16:56 +0200, Andy Shevchenko wrote:
> On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com> wrote:
> > 
> > 'of_node_put()' can potentially release the memory pointed to by
> > 'iiospec.np' which would leave us with an invalid pointer (and we
> > would
> > still pass it in 'of_xlate()'). As such, we can only release the
> > node
> > after we are done with it.
> 
> The question you should answer in the commit message is the
> following:
> "Can an OF node, attached to a struct device, be gone before the
> device itself?" If it so, then patch is good, otherwise there is no
> point in this patch in the first place.
> 

Yeah, I might be wrong but from a quick look... yes, I think the node
can be gone before the device. Take a look on the spi or i2c of_notify
handling and you can see that the nodes are get/put on the add/remove
notifcation. Meaning that the node lifespan is not really attached to
the device lifespan. If it was, I would expect to see of_node_put() on
the device release() function...

Again, I might be wrong and I admit I was not sure about including this
patch because it's a very unlikely scenario even though I think, in
theory, a possible one.

- Nuno Sá
Jonathan Cameron June 11, 2022, 2:59 p.m. UTC | #3
+Cc Mark Brown for a query on ordering in device tree based SPI setup.

On Fri, 10 Jun 2022 22:08:41 +0200
Nuno Sá <noname.nuno@gmail.com> wrote:

> On Fri, 2022-06-10 at 16:56 +0200, Andy Shevchenko wrote:
> > On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com> wrote:  
> > > 
> > > 'of_node_put()' can potentially release the memory pointed to by
> > > 'iiospec.np' which would leave us with an invalid pointer (and we
> > > would
> > > still pass it in 'of_xlate()'). As such, we can only release the
> > > node
> > > after we are done with it.  
> > 
> > The question you should answer in the commit message is the
> > following:
> > "Can an OF node, attached to a struct device, be gone before the
> > device itself?" If it so, then patch is good, otherwise there is no
> > point in this patch in the first place.
> >   
> 
> Yeah, I might be wrong but from a quick look... yes, I think the node
> can be gone before the device. Take a look on the spi or i2c of_notify
> handling and you can see that the nodes are get/put on the add/remove
> notifcation. Meaning that the node lifespan is not really attached to
> the device lifespan. If it was, I would expect to see of_node_put() on
> the device release() function...

I had a look at spi_of_notify() and indeed via spi_unregister_device()
the node is put just before device_del() so I agree that at first glance
it seems like there may be a race there against the useage here.
Mark (+CC) out of interest why are the node gets before the device_add()
in spi_add_device() called from of_register_spi_device() but the matching
node puts before the device_del() in spi_unregister_device()?
Seems like inconsistent ordering...

Which is not to say we shouldn't fix the IIO usage as this patch does!

> 
> Again, I might be wrong and I admit I was not sure about including this
> patch because it's a very unlikely scenario even though I think, in
> theory, a possible one.

The patch is currently valid even if it's not a 'real' bug.
Given we are doing a put on that device_node, it makes sense for that
to occur after the local use has finished - we shouldn't be relying on
what happens to be the case for lifetimes today.

Now, I did wonder if any drivers actually use it in their xlate callbacks.
One does for an error print, so this is potentially real (if very unlikely!)

This isn't a 'fix' I'd expect to rush in, or necessarily backport to stable
but I think it's a valid fix.

Perhaps add a little more detail to the patch description for v2.

Thanks,

Jonathan


> 
> - Nuno Sá
>
Nuno Sá June 13, 2022, 7:20 a.m. UTC | #4
On Sat, 2022-06-11 at 15:59 +0100, Jonathan Cameron wrote:
> 
> +Cc Mark Brown for a query on ordering in device tree based SPI
> setup.
> 
> On Fri, 10 Jun 2022 22:08:41 +0200
> Nuno Sá <noname.nuno@gmail.com> wrote:
> 
> > On Fri, 2022-06-10 at 16:56 +0200, Andy Shevchenko wrote:
> > > On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com>
> > > wrote:  
> > > > 
> > > > 'of_node_put()' can potentially release the memory pointed to
> > > > by
> > > > 'iiospec.np' which would leave us with an invalid pointer (and
> > > > we
> > > > would
> > > > still pass it in 'of_xlate()'). As such, we can only release
> > > > the
> > > > node
> > > > after we are done with it.  
> > > 
> > > The question you should answer in the commit message is the
> > > following:
> > > "Can an OF node, attached to a struct device, be gone before the
> > > device itself?" If it so, then patch is good, otherwise there is
> > > no
> > > point in this patch in the first place.
> > >   
> > 
> > Yeah, I might be wrong but from a quick look... yes, I think the
> > node
> > can be gone before the device. Take a look on the spi or i2c
> > of_notify
> > handling and you can see that the nodes are get/put on the
> > add/remove
> > notifcation. Meaning that the node lifespan is not really attached
> > to
> > the device lifespan. If it was, I would expect to see of_node_put()
> > on
> > the device release() function...
> 
> I had a look at spi_of_notify() and indeed via
> spi_unregister_device()
> the node is put just before device_del() so I agree that at first
> glance
> it seems like there may be a race there against the useage here.
> Mark (+CC) out of interest why are the node gets before the
> device_add()
> in spi_add_device() called from of_register_spi_device() but the
> matching
> node puts before the device_del() in spi_unregister_device()?
> Seems like inconsistent ordering...
> 
> Which is not to say we shouldn't fix the IIO usage as this patch
> does!
> 

Just to add something that came to my attention. In the IIO case, it
does not even matter if the parent device has the OF node lifetime
"linked" to it (as it actually happens for platform devices). The
reason is that iio_dev only has a weak reference to it's parent and (I
think) the parent can actually go away while the iio_dev is still
around (eg: someone has an open fd to the iio_dev cdev).

> > 
> > Again, I might be wrong and I admit I was not sure about including
> > this
> > patch because it's a very unlikely scenario even though I think, in
> > theory, a possible one.
> 
> The patch is currently valid even if it's not a 'real' bug.
> Given we are doing a put on that device_node, it makes sense for that
> to occur after the local use has finished - we shouldn't be relying
> on
> what happens to be the case for lifetimes today.
> 
> Now, I did wonder if any drivers actually use it in their xlate
> callbacks.
> One does for an error print, so this is potentially real (if very
> unlikely!)
> 
> This isn't a 'fix' I'd expect to rush in, or necessarily backport to
> stable
> but I think it's a valid fix.
> 

Should I drop the fixes tag?

- Nuno Sá
Jonathan Cameron June 18, 2022, 2:03 p.m. UTC | #5
On Mon, 13 Jun 2022 09:20:26 +0200
Nuno Sá <noname.nuno@gmail.com> wrote:

> On Sat, 2022-06-11 at 15:59 +0100, Jonathan Cameron wrote:
> > 
> > +Cc Mark Brown for a query on ordering in device tree based SPI
> > setup.
> > 
> > On Fri, 10 Jun 2022 22:08:41 +0200
> > Nuno Sá <noname.nuno@gmail.com> wrote:
> >   
> > > On Fri, 2022-06-10 at 16:56 +0200, Andy Shevchenko wrote:  
> > > > On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com>
> > > > wrote:    
> > > > > 
> > > > > 'of_node_put()' can potentially release the memory pointed to
> > > > > by
> > > > > 'iiospec.np' which would leave us with an invalid pointer (and
> > > > > we
> > > > > would
> > > > > still pass it in 'of_xlate()'). As such, we can only release
> > > > > the
> > > > > node
> > > > > after we are done with it.    
> > > > 
> > > > The question you should answer in the commit message is the
> > > > following:
> > > > "Can an OF node, attached to a struct device, be gone before the
> > > > device itself?" If it so, then patch is good, otherwise there is
> > > > no
> > > > point in this patch in the first place.
> > > >     
> > > 
> > > Yeah, I might be wrong but from a quick look... yes, I think the
> > > node
> > > can be gone before the device. Take a look on the spi or i2c
> > > of_notify
> > > handling and you can see that the nodes are get/put on the
> > > add/remove
> > > notifcation. Meaning that the node lifespan is not really attached
> > > to
> > > the device lifespan. If it was, I would expect to see of_node_put()
> > > on
> > > the device release() function...  
> > 
> > I had a look at spi_of_notify() and indeed via
> > spi_unregister_device()
> > the node is put just before device_del() so I agree that at first
> > glance
> > it seems like there may be a race there against the useage here.
> > Mark (+CC) out of interest why are the node gets before the
> > device_add()
> > in spi_add_device() called from of_register_spi_device() but the
> > matching
> > node puts before the device_del() in spi_unregister_device()?
> > Seems like inconsistent ordering...
> > 
> > Which is not to say we shouldn't fix the IIO usage as this patch
> > does!
> >   
> 
> Just to add something that came to my attention. In the IIO case, it
> does not even matter if the parent device has the OF node lifetime
> "linked" to it (as it actually happens for platform devices). The
> reason is that iio_dev only has a weak reference to it's parent and (I
> think) the parent can actually go away while the iio_dev is still
> around (eg: someone has an open fd to the iio_dev cdev).

I chased through this as well and agree. The device_del()
hiding in cdev_device_del() will drop the parent reference on the parent.

I'm not sure it actually maters much though given almost all of_node
useage is not tied up with the anything that might be using the iio_dev
after the iio_device_unregister() call.

Maybe there is a race condition somewhere...

> 
> > > 
> > > Again, I might be wrong and I admit I was not sure about including
> > > this
> > > patch because it's a very unlikely scenario even though I think, in
> > > theory, a possible one.  
> > 
> > The patch is currently valid even if it's not a 'real' bug.
> > Given we are doing a put on that device_node, it makes sense for that
> > to occur after the local use has finished - we shouldn't be relying
> > on
> > what happens to be the case for lifetimes today.
> > 
> > Now, I did wonder if any drivers actually use it in their xlate
> > callbacks.
> > One does for an error print, so this is potentially real (if very
> > unlikely!)
> > 
> > This isn't a 'fix' I'd expect to rush in, or necessarily backport to
> > stable
> > but I think it's a valid fix.
> >   
> 
> Should I drop the fixes tag?
> 
Keep the tag. It's a fix, just a low priority one.

Jonathan

> - Nuno Sá
> 
> 
> 
>
Jonathan Cameron June 18, 2022, 2:13 p.m. UTC | #6
On Mon, 13 Jun 2022 09:20:26 +0200
Nuno Sá <noname.nuno@gmail.com> wrote:

> On Sat, 2022-06-11 at 15:59 +0100, Jonathan Cameron wrote:
> > 
> > +Cc Mark Brown for a query on ordering in device tree based SPI
> > setup.
> > 
> > On Fri, 10 Jun 2022 22:08:41 +0200
> > Nuno Sá <noname.nuno@gmail.com> wrote:
> >   
> > > On Fri, 2022-06-10 at 16:56 +0200, Andy Shevchenko wrote:  
> > > > On Fri, Jun 10, 2022 at 10:48 AM Nuno Sá <nuno.sa@analog.com>
> > > > wrote:    
> > > > > 
> > > > > 'of_node_put()' can potentially release the memory pointed to
> > > > > by
> > > > > 'iiospec.np' which would leave us with an invalid pointer (and
> > > > > we
> > > > > would
> > > > > still pass it in 'of_xlate()'). As such, we can only release
> > > > > the
> > > > > node
> > > > > after we are done with it.    
> > > > 
> > > > The question you should answer in the commit message is the
> > > > following:
> > > > "Can an OF node, attached to a struct device, be gone before the
> > > > device itself?" If it so, then patch is good, otherwise there is
> > > > no
> > > > point in this patch in the first place.
> > > >     
> > > 
> > > Yeah, I might be wrong but from a quick look... yes, I think the
> > > node
> > > can be gone before the device. Take a look on the spi or i2c
> > > of_notify
> > > handling and you can see that the nodes are get/put on the
> > > add/remove
> > > notifcation. Meaning that the node lifespan is not really attached
> > > to
> > > the device lifespan. If it was, I would expect to see of_node_put()
> > > on
> > > the device release() function...  
> > 
> > I had a look at spi_of_notify() and indeed via
> > spi_unregister_device()
> > the node is put just before device_del() so I agree that at first
> > glance
> > it seems like there may be a race there against the useage here.
> > Mark (+CC) out of interest why are the node gets before the
> > device_add()
> > in spi_add_device() called from of_register_spi_device() but the
> > matching
> > node puts before the device_del() in spi_unregister_device()?
> > Seems like inconsistent ordering...
> > 
> > Which is not to say we shouldn't fix the IIO usage as this patch
> > does!
> >   
> 
> Just to add something that came to my attention. In the IIO case, it
> does not even matter if the parent device has the OF node lifetime
> "linked" to it (as it actually happens for platform devices). The
> reason is that iio_dev only has a weak reference to it's parent and (I
> think) the parent can actually go away while the iio_dev is still
> around (eg: someone has an open fd to the iio_dev cdev).
> 
> > > 
> > > Again, I might be wrong and I admit I was not sure about including
> > > this
> > > patch because it's a very unlikely scenario even though I think, in
> > > theory, a possible one.  
> > 
> > The patch is currently valid even if it's not a 'real' bug.
> > Given we are doing a put on that device_node, it makes sense for that
> > to occur after the local use has finished - we shouldn't be relying
> > on
> > what happens to be the case for lifetimes today.
> > 
> > Now, I did wonder if any drivers actually use it in their xlate
> > callbacks.
> > One does for an error print, so this is potentially real (if very
> > unlikely!)
> > 
> > This isn't a 'fix' I'd expect to rush in, or necessarily backport to
> > stable
> > but I think it's a valid fix.
> >   
> 
> Should I drop the fixes tag?
> 
Nope,  Tag still relevant and may be useful to someone down the
line!

Jonathan

> - Nuno Sá
> 
> 
> 
>
Jonathan Cameron June 18, 2022, 5:30 p.m. UTC | #7
On Fri, 10 Jun 2022 10:45:31 +0200
Nuno Sá <nuno.sa@analog.com> wrote:

Typo in patch title (just noticed whilst scrolling past)

release


> 'of_node_put()' can potentially release the memory pointed to by
> 'iiospec.np' which would leave us with an invalid pointer (and we would
> still pass it in 'of_xlate()'). As such, we can only release the node
> after we are done with it.
> 
> Fixes: 17d82b47a215d ("iio: Add OF support")
> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> ---
>  drivers/iio/inkern.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
> index df74765d33dc..9d87057794fc 100644
> --- a/drivers/iio/inkern.c
> +++ b/drivers/iio/inkern.c
> @@ -165,9 +165,10 @@ static int __of_iio_channel_get(struct iio_channel *channel,
>  
>  	idev = bus_find_device(&iio_bus_type, NULL, iiospec.np,
>  			       iio_dev_node_match);
> -	of_node_put(iiospec.np);
> -	if (idev == NULL)
> +	if (idev == NULL) {
> +		of_node_put(iiospec.np);
>  		return -EPROBE_DEFER;
> +	}
>  
>  	indio_dev = dev_to_iio_dev(idev);
>  	channel->indio_dev = indio_dev;
> @@ -175,6 +176,7 @@ static int __of_iio_channel_get(struct iio_channel *channel,
>  		index = indio_dev->info->of_xlate(indio_dev, &iiospec);
>  	else
>  		index = __of_iio_simple_xlate(indio_dev, &iiospec);
> +	of_node_put(iiospec.np);
>  	if (index < 0)
>  		goto err_put;
>  	channel->channel = &indio_dev->channels[index];
diff mbox series

Patch

diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
index df74765d33dc..9d87057794fc 100644
--- a/drivers/iio/inkern.c
+++ b/drivers/iio/inkern.c
@@ -165,9 +165,10 @@  static int __of_iio_channel_get(struct iio_channel *channel,
 
 	idev = bus_find_device(&iio_bus_type, NULL, iiospec.np,
 			       iio_dev_node_match);
-	of_node_put(iiospec.np);
-	if (idev == NULL)
+	if (idev == NULL) {
+		of_node_put(iiospec.np);
 		return -EPROBE_DEFER;
+	}
 
 	indio_dev = dev_to_iio_dev(idev);
 	channel->indio_dev = indio_dev;
@@ -175,6 +176,7 @@  static int __of_iio_channel_get(struct iio_channel *channel,
 		index = indio_dev->info->of_xlate(indio_dev, &iiospec);
 	else
 		index = __of_iio_simple_xlate(indio_dev, &iiospec);
+	of_node_put(iiospec.np);
 	if (index < 0)
 		goto err_put;
 	channel->channel = &indio_dev->channels[index];