diff mbox

[v2] MAX1111: Fix Race condition causing NULL pointer exception

Message ID 1310410051-13127-1-git-send-email-morpheus.ibis@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Pavel Herrmann July 11, 2011, 6:47 p.m. UTC
spi_sync call uses its spi_message parameter to keep completion information,
having this structure static is not thread-safe, potentially causing one
thread having pointers to memory on or above other threads stack. use mutex
to prevent multiple access

Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Cyril Hrubis <metan@ucw.cz>
---
 drivers/hwmon/max1111.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

Comments

Jean Delvare July 11, 2011, 8:11 p.m. UTC | #1
Hi Pavel,

On Mon, 11 Jul 2011 20:47:31 +0200, Pavel Herrmann wrote:
> spi_sync call uses its spi_message parameter to keep completion information,
> having this structure static is not thread-safe, potentially causing one
> thread having pointers to memory on or above other threads stack. use mutex
> to prevent multiple access

This has nothing to do with static, as a matter of fact the structure
is dynamically allocated. The bottom line is that the driver structure
is such that calls to max1111_read() must be serialized.
> 
> Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
> Acked-by: Pavel Machek <pavel@ucw.cz>
> Acked-by: Marek Vasut <marek.vasut@gmail.com>
> Acked-by: Cyril Hrubis <metan@ucw.cz>
> ---
>  drivers/hwmon/max1111.c |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/hwmon/max1111.c b/drivers/hwmon/max1111.c
> index 12a54aa..d872f57 100644
> --- a/drivers/hwmon/max1111.c
> +++ b/drivers/hwmon/max1111.c
> @@ -40,6 +40,7 @@ struct max1111_data {
>  	struct spi_transfer	xfer[2];
>  	uint8_t *tx_buf;
>  	uint8_t *rx_buf;
> +	struct mutex		msg_lock_mutex;

"lock" and "mutex" being kind of synonyms, this is a rather bad name.
And you should add a comment explaining what is protected. As I
understand it, it's more than just msg, it's also protecting xfer,
tx_buf and rx_buf.

>  };
>  
>  static int max1111_read(struct device *dev, int channel)
> @@ -48,6 +49,11 @@ static int max1111_read(struct device *dev, int channel)
>  	uint8_t v1, v2;
>  	int err;
>  
> +	/* spi_sync requires data not to be freed before function returns
> +	 * for static data, any access is dangerous, use locks
> +	 */

This has nothing to do with "freeing data". max1111_read() doesn't free
anything. It is making use of a data structure, the access to which
must be serialized. Easy as that. And no, access isn't dangerous ;)

> +	mutex_lock(&data->msg_lock_mutex);
> +
>  	data->tx_buf[0] = (channel << MAX1111_CTRL_SEL_SH) |
>  		MAX1111_CTRL_PD0 | MAX1111_CTRL_PD1 |
>  		MAX1111_CTRL_SGL | MAX1111_CTRL_UNI | MAX1111_CTRL_STR;
> @@ -55,12 +61,15 @@ static int max1111_read(struct device *dev, int channel)
>  	err = spi_sync(data->spi, &data->msg);
>  	if (err < 0) {
>  		dev_err(dev, "spi_sync failed with %d\n", err);
> +		mutex_unlock(&data->msg_lock_mutex);
>  		return err;
>  	}
>  
>  	v1 = data->rx_buf[0];
>  	v2 = data->rx_buf[1];
>  
> +	mutex_unlock(&data->msg_lock_mutex);
> +
>  	if ((v1 & 0xc0) || (v2 & 0x3f))
>  		return -EINVAL;
>  
> @@ -176,6 +185,8 @@ static int __devinit max1111_probe(struct spi_device *spi)
>  	if (err)
>  		goto err_free_data;
>  
> +	mutex_init(&data->msg_lock_mutex);
> +
>  	data->spi = spi;
>  	spi_set_drvdata(spi, data);
>  
> @@ -213,6 +224,7 @@ static int __devexit max1111_remove(struct spi_device *spi)
>  
>  	hwmon_device_unregister(data->hwmon_dev);
>  	sysfs_remove_group(&spi->dev.kobj, &max1111_attr_group);
> +	mutex_destroy(data->msg_lock_mutex);

I didn't know about this function, interesting. No other hwmon driver
calls it, not sure if it makes sense when the underlying memory is
going to be freed immediately afterwards anyway. But of course it can't
hurt.

>  	kfree(data->rx_buf);
>  	kfree(data->tx_buf);
>  	kfree(data);

Unrelated to your patch, but these 3 separate allocs are really insane,
memory usage in this driver could be much improved IMHO.

Please respin your patch with a better struct member name and improved
description and comments, and I'll be happy to apply it.

Thanks,
Pavel Herrmann July 11, 2011, 8:36 p.m. UTC | #2
On Monday 11 of July 2011 22:11:48 Jean Delvare wrote:
> > spi_sync call uses its spi_message parameter to keep completion
> > information, having this structure static is not thread-safe,
> > potentially causing one thread having pointers to memory on or above
> > other threads stack. use mutex to prevent multiple access
> 
> This has nothing to do with static, as a matter of fact the structure
> is dynamically allocated. The bottom line is that the driver structure
> is such that calls to max1111_read() must be serialized.

the structure is dynamically allocated, but the pointer used to hold it is a 
static global var.
"static" in this context meant "shared by all threads"

> > +	/* spi_sync requires data not to be freed before function returns
> > +	 * for static data, any access is dangerous, use locks
> > +	 */
> 
> This has nothing to do with "freeing data". max1111_read() doesn't free
> anything. It is making use of a data structure, the access to which
> must be serialized. Easy as that. And no, access isn't dangerous ;)

as spi_message contains a pointer to completion (created and waited on by 
spi_sync()), witch gets rewritten and causes the NULL exception, writing to it 
while the call is in progress is bad idea. also changing the message sent 
half-way would not be very nice.
reading would be fine, though

> Please respin your patch with a better struct member name and improved
> description and comments, and I'll be happy to apply it.

on it
Stanislav Brabec July 11, 2011, 8:56 p.m. UTC | #3
Pavel Herrmann wrote:
> spi_sync call uses its spi_message parameter to keep completion information,
> having this structure static is not thread-safe, potentially causing one
> thread having pointers to memory on or above other threads stack. use mutex
> to prevent multiple access
> 
> Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
> Acked-by: Pavel Machek <pavel@ucw.cz>
> Acked-by: Marek Vasut <marek.vasut@gmail.com>
> Acked-by: Cyril Hrubis <metan@ucw.cz>
Tested-by: Stanislav Brabec <utx@penguin.cz>

Tested on spitz, fixed previously reproducible OOPSes and kernel panic.

(See thread "kernel panic in spi_complete() on spitz (PXA270)" in
linux-arm-kernel@lists.infradead.org.)
Guenter Roeck July 11, 2011, 9:03 p.m. UTC | #4
On Mon, 2011-07-11 at 16:36 -0400, Pavel Herrmann wrote:
> On Monday 11 of July 2011 22:11:48 Jean Delvare wrote:
> > > spi_sync call uses its spi_message parameter to keep completion
> > > information, having this structure static is not thread-safe,
> > > potentially causing one thread having pointers to memory on or above
> > > other threads stack. use mutex to prevent multiple access
> > 
> > This has nothing to do with static, as a matter of fact the structure
> > is dynamically allocated. The bottom line is that the driver structure
> > is such that calls to max1111_read() must be serialized.
> 
> the structure is dynamically allocated, but the pointer used to hold it is a 
> static global var.

This is true only if CONFIG_SHARPSL_PM is defined, and it assumes that
the driver is instantiated exactly once. That is pretty badly broken
(the commit introducing it even admits that), and should be fixed. This
does not happen CONFIG_SHARPSL_PM is not defined. If CONFIG_SHARPSL_PM
_is_ defined in your environment, and you do have multiple instances of
the driver (ie if you have multiple MAX1111 chips in your system), a
severe problem is that max1111_read_channel() does not identify the
driver instance. That can not be fixed with a mutex.

> "static" in this context meant "shared by all threads"
> 
I think it would make sense to stick with common terminology. In your
definition, almost all global variables of all programs out there would
be defined as "static".

Guenter
Pavel Herrmann July 11, 2011, 9:49 p.m. UTC | #5
On Monday 11 of July 2011 14:03:13 Guenter Roeck wrote:
> On Mon, 2011-07-11 at 16:36 -0400, Pavel Herrmann wrote:
> > the structure is dynamically allocated, but the pointer used to hold it
> > is a static global var.
> 
> This is true only if CONFIG_SHARPSL_PM is defined, and it assumes that
> the driver is instantiated exactly once. That is pretty badly broken
> (the commit introducing it even admits that), and should be fixed. This
> does not happen CONFIG_SHARPSL_PM is not defined. If CONFIG_SHARPSL_PM
> _is_ defined in your environment, and you do have multiple instances of
> the driver (ie if you have multiple MAX1111 chips in your system), a
> severe problem is that max1111_read_channel() does not identify the
> driver instance. That can not be fixed with a mutex.

if you don't have CONFIG_SHARPSL_PM then there is nothing calling 
max1111_read, and thus any of the discussed doesn't matter
AFAIK max1111 is only used in sharpsl devices (according to kernel drivers 
anyways), and only one a piece.
this patch is meant to fix a crash, not make the driver code pretty just in 
case someone else decides to use it. this patch also doesn't present any more 
challenges for solving the multiple devices issue and would be necessary 
either way, as drvdata is not thread-safe anyways (or I am badly mistaken)

> I think it would make sense to stick with common terminology. In your
> definition, almost all global variables of all programs out there would
> be defined as "static".

yeah, sorry for that
Jean Delvare July 12, 2011, 6:48 a.m. UTC | #6
On Mon, 11 Jul 2011 23:49:10 +0200, Pavel Herrmann wrote:
> On Monday 11 of July 2011 14:03:13 Guenter Roeck wrote:
> > On Mon, 2011-07-11 at 16:36 -0400, Pavel Herrmann wrote:
> > > the structure is dynamically allocated, but the pointer used to hold it
> > > is a static global var.
> > 
> > This is true only if CONFIG_SHARPSL_PM is defined, and it assumes that
> > the driver is instantiated exactly once. That is pretty badly broken
> > (the commit introducing it even admits that), and should be fixed. This
> > does not happen CONFIG_SHARPSL_PM is not defined. If CONFIG_SHARPSL_PM
> > _is_ defined in your environment, and you do have multiple instances of
> > the driver (ie if you have multiple MAX1111 chips in your system), a
> > severe problem is that max1111_read_channel() does not identify the
> > driver instance. That can not be fixed with a mutex.
> 
> if you don't have CONFIG_SHARPSL_PM then there is nothing calling 
> max1111_read, and thus any of the discussed doesn't matter

This assumption of yours is incorrect. Even with CONFIG_SHARPSL_PM
disabled, the max1111 driver creates sysfs attributes which, when read,
call max1111_read(). What CONFIG_SHARPSL_PM adds is the in-kernel
access.

> AFAIK max1111 is only used in sharpsl devices (according to kernel drivers 
> anyways), and only one a piece.
> this patch is meant to fix a crash, not make the driver code pretty just in 
> case someone else decides to use it. this patch also doesn't present any more 
> challenges for solving the multiple devices issue and would be necessary 
> either way, as drvdata is not thread-safe anyways (or I am badly mistaken)

You are right, drvdata is not thread-safe, and this is the most obvious
reason why your patch is needed.
diff mbox

Patch

diff --git a/drivers/hwmon/max1111.c b/drivers/hwmon/max1111.c
index 12a54aa..d872f57 100644
--- a/drivers/hwmon/max1111.c
+++ b/drivers/hwmon/max1111.c
@@ -40,6 +40,7 @@  struct max1111_data {
 	struct spi_transfer	xfer[2];
 	uint8_t *tx_buf;
 	uint8_t *rx_buf;
+	struct mutex		msg_lock_mutex;
 };
 
 static int max1111_read(struct device *dev, int channel)
@@ -48,6 +49,11 @@  static int max1111_read(struct device *dev, int channel)
 	uint8_t v1, v2;
 	int err;
 
+	/* spi_sync requires data not to be freed before function returns
+	 * for static data, any access is dangerous, use locks
+	 */
+	mutex_lock(&data->msg_lock_mutex);
+
 	data->tx_buf[0] = (channel << MAX1111_CTRL_SEL_SH) |
 		MAX1111_CTRL_PD0 | MAX1111_CTRL_PD1 |
 		MAX1111_CTRL_SGL | MAX1111_CTRL_UNI | MAX1111_CTRL_STR;
@@ -55,12 +61,15 @@  static int max1111_read(struct device *dev, int channel)
 	err = spi_sync(data->spi, &data->msg);
 	if (err < 0) {
 		dev_err(dev, "spi_sync failed with %d\n", err);
+		mutex_unlock(&data->msg_lock_mutex);
 		return err;
 	}
 
 	v1 = data->rx_buf[0];
 	v2 = data->rx_buf[1];
 
+	mutex_unlock(&data->msg_lock_mutex);
+
 	if ((v1 & 0xc0) || (v2 & 0x3f))
 		return -EINVAL;
 
@@ -176,6 +185,8 @@  static int __devinit max1111_probe(struct spi_device *spi)
 	if (err)
 		goto err_free_data;
 
+	mutex_init(&data->msg_lock_mutex);
+
 	data->spi = spi;
 	spi_set_drvdata(spi, data);
 
@@ -213,6 +224,7 @@  static int __devexit max1111_remove(struct spi_device *spi)
 
 	hwmon_device_unregister(data->hwmon_dev);
 	sysfs_remove_group(&spi->dev.kobj, &max1111_attr_group);
+	mutex_destroy(data->msg_lock_mutex);
 	kfree(data->rx_buf);
 	kfree(data->tx_buf);
 	kfree(data);