Message ID | CACVXFVOZ=4HFAy38q_8ZUjOPffcv_mMTi7KCdh3HcX=MCLw5SQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote: > On Sun, Sep 16, 2012 at 4:25 PM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > > > It isn't. As I said, it's a race condition due to lack of locking - the > > driver hasn't been added to the list of drivers at this point: > > > > int bus_add_driver(struct device_driver *drv) > > { > > ... > > if (drv->bus->p->drivers_autoprobe) { > > error = driver_attach(drv); > > if (error) > > goto out_unregister; > > } > > klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > > ... > > } > > > > Notice that the attaching is done _before_ the driver is added to the > > bus driver list. > > Yes, it is a problem since a new device may be added to bus > and bus_probe_device() may not see the new added driver. > > So looks klist_add_tail() should complete before driver_attach() > inside bus_add_driver(). > > The attached one line change should fix the problem because the > below way can guarantee that no probe(dev) may be lost. > > > CPU0 CPU1 > driver_register > ... > write(bus->driver_list) > smp_mb() > read(bus->device_list) > ... > device_add > /* bus_add_device */ > write(bus->device_list) > smp_mb() > /* bus_probe_device*/ > read(bus->driver_list) > > And the smp_mb() has been implicit by UNLOCK+LOCK > of 'klist' according to 'VARIETIES OF MEMORY BARRIER' part > of Documentation/memory-barriers.txt. > > Could you test the below patch to see if it can fix your problem? > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c > index 181ed26..17d7437 100644 > --- a/drivers/base/bus.c > +++ b/drivers/base/bus.c > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv) > if (error) > goto out_unregister; > > + klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > if (drv->bus->p->drivers_autoprobe) { > error = driver_attach(drv); > if (error) > goto out_unregister; > } > - klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > module_add_driver(drv->owner, drv); > > error = driver_create_file(drv, &driver_attr_uevent); > > > Did the above patch ever prove to solve the issue or not? thanks, greg k-h
On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote: > On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote: > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c > > index 181ed26..17d7437 100644 > > --- a/drivers/base/bus.c > > +++ b/drivers/base/bus.c > > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv) > > if (error) > > goto out_unregister; > > > > + klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > > if (drv->bus->p->drivers_autoprobe) { > > error = driver_attach(drv); > > if (error) > > goto out_unregister; > > } > > - klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > > module_add_driver(drv->owner, drv); > > > > error = driver_create_file(drv, &driver_attr_uevent); > > > > > > > > Did the above patch ever prove to solve the issue or not? To be honest, I've not bothered to test the above patch, and now when I look at it, I notice it's broken - in that on error it will corrupt the driver list. Take a look at the error path. priv is drv->p. We add priv->knode_bus to the driver list. If driver_attach() returns an error, then we go to out_unregister, which does: out_unregister: kobject_put(&priv->kobj); kfree(drv->p); drv->p = NULL; thereby freeing the node we just added to the driver list without first removing it. I suspect it will fix the problem, but let's get the patch to be correct before it gets tested...
On Wed, Sep 26, 2012 at 09:23:21PM +0100, Russell King - ARM Linux wrote: > On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote: > > On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote: > > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c > > > index 181ed26..17d7437 100644 > > > --- a/drivers/base/bus.c > > > +++ b/drivers/base/bus.c > > > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv) > > > if (error) > > > goto out_unregister; > > > > > > + klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > > > if (drv->bus->p->drivers_autoprobe) { > > > error = driver_attach(drv); > > > if (error) > > > goto out_unregister; > > > } > > > - klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); > > > module_add_driver(drv->owner, drv); > > > > > > error = driver_create_file(drv, &driver_attr_uevent); > > > > > > > > > > > > > Did the above patch ever prove to solve the issue or not? > > To be honest, I've not bothered to test the above patch, and now when I > look at it, I notice it's broken - in that on error it will corrupt the > driver list. Take a look at the error path. > > priv is drv->p. We add priv->knode_bus to the driver list. If > driver_attach() returns an error, then we go to out_unregister, which > does: > > out_unregister: > kobject_put(&priv->kobj); > kfree(drv->p); > drv->p = NULL; > > thereby freeing the node we just added to the driver list without first > removing it. > > I suspect it will fix the problem, but let's get the patch to be correct > before it gets tested... Good catch. Ming, care to redo this? greg k-h
On Thu, Sep 27, 2012 at 4:23 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Wed, Sep 26, 2012 at 01:08:33PM -0700, Greg Kroah-Hartman wrote: >> On Sun, Sep 16, 2012 at 09:24:43PM +0800, Ming Lei wrote: >> > diff --git a/drivers/base/bus.c b/drivers/base/bus.c >> > index 181ed26..17d7437 100644 >> > --- a/drivers/base/bus.c >> > +++ b/drivers/base/bus.c >> > @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv) >> > if (error) >> > goto out_unregister; >> > >> > + klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); >> > if (drv->bus->p->drivers_autoprobe) { >> > error = driver_attach(drv); >> > if (error) >> > goto out_unregister; >> > } >> > - klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); >> > module_add_driver(drv->owner, drv); >> > >> > error = driver_create_file(drv, &driver_attr_uevent); >> > >> > >> > >> >> Did the above patch ever prove to solve the issue or not? > > To be honest, I've not bothered to test the above patch, and now when I > look at it, I notice it's broken - in that on error it will corrupt the > driver list. Take a look at the error path. > > priv is drv->p. We add priv->knode_bus to the driver list. If > driver_attach() returns an error, then we go to out_unregister, which In fact, driver_attach() always returns zero, so it does __not__ affect your test. I knew the failure shouldn't be handled in theory because the probe failure on one device should not cause failure of driver_register, and it should be fixed, IMO. > does: > > out_unregister: > kobject_put(&priv->kobj); > kfree(drv->p); > drv->p = NULL; > > thereby freeing the node we just added to the driver list without first > removing it. > > I suspect it will fix the problem, but let's get the patch to be correct > before it gets tested... Trust me, please go ahead to test and it doesn't affect it... Thanks,
diff --git a/drivers/base/bus.c b/drivers/base/bus.c index 181ed26..17d7437 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -714,12 +714,12 @@ int bus_add_driver(struct device_driver *drv) if (error) goto out_unregister; + klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); if (drv->bus->p->drivers_autoprobe) { error = driver_attach(drv); if (error) goto out_unregister; } - klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers); module_add_driver(drv->owner, drv); error = driver_create_file(drv, &driver_attr_uevent);