-next regression: "driver core: handle -EPROBE_DEFER from bus_type.match()"

Message ID CAPcyv4hOJuT=i2A_0vDQHsL7MXNgVN_p17q3UW6EcTkxsCMmeg@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Dan Williams Dec. 17, 2015, 3:51 p.m. UTC
The commit below causes the libnvdimm sub-system to stop loading.
This is due to the fact that nvdimm_bus_match() returns the result of
test_bit() which may be negative.  If there are any other bus match
functions using test_bit they may be similarly impacted.

Can we queue a fixup like the following to libnvdimm, and maybe
others, ahead of this driver core change?

Other ideas?

commit 09a14906a26e454cad7ff0ad96af40fc4cd90eb0
Author: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Date:   Tue Dec 8 10:00:45 2015 +0100

   ARM: 8472/1: driver core: handle -EPROBE_DEFER from bus_type.match()

   Allow implementations of the match() callback in struct bus_type to
   return errors and if it's -EPROBE_DEFER then queue the device for
   deferred probing.

   This is useful to buses such as AMBA in which devices are registered
   before their matching information can be retrieved from the HW
   (typically because a clock driver hasn't probed yet).

   [changed if-else code structure, adjusted documentation to match the code,
   extended comments]

   Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
   Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
   Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
   Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
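
A rough userspace model of the interpretation this commit message describes may help here. It assumes that zero means "no match", -EPROBE_DEFER means "defer", any other negative value is an error, and any positive value is a match. The enum and function names are hypothetical and are not taken from the driver core.

```c
#include <assert.h>

#define EPROBE_DEFER 517	/* same numeric value as the kernel's errno */

/* Hypothetical outcome labels; the driver core has no such enum. */
enum match_outcome { NO_MATCH, MATCHED, DEFERRED, MATCH_ERROR };

/*
 * Model of how a caller might interpret the match() return value once
 * negative values are treated as errors.
 */
enum match_outcome interpret_match(int ret)
{
	if (ret == 0)
		return NO_MATCH;
	if (ret == -EPROBE_DEFER)
		return DEFERRED;	/* queue the device for deferred probing */
	if (ret < 0)
		return MATCH_ERROR;	/* any other negative value is an error */
	return MATCHED;
}

/* A "true" result that happens to be -1 is no longer seen as a match. */
void interpret_match_demo(void)
{
	assert(interpret_match(1) == MATCHED);
	assert(interpret_match(-EPROBE_DEFER) == DEFERRED);
	assert(interpret_match(-1) == MATCH_ERROR);
}
```

Under this model, a match function that reports a set bit as -1 is classified as an error rather than a match, which is consistent with the regression reported above.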

Comments

Dan Williams Dec. 17, 2015, 4:48 p.m. UTC | #1
On Thu, Dec 17, 2015 at 7:51 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> The commit below causes the libnvdimm sub-system to stop loading.
> This is due to the fact that nvdimm_bus_match() returns the result of
> test_bit() which may be negative.  If there are any other bus match
> functions using test_bit they may be similarly impacted.
>
> Can we queue a fixup like the following to libnvdimm, and maybe
> others, ahead of this driver core change?
>
> diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
> index 7e2c43f701bc..2b2181cdeb63 100644
> --- a/drivers/nvdimm/bus.c
> +++ b/drivers/nvdimm/bus.c
> @@ -62,7 +62,7 @@ static int nvdimm_bus_match(struct device *dev,
> struct device_driver *drv)
> {
>        struct nd_device_driver *nd_drv = to_nd_device_driver(drv);
>
> -       return test_bit(to_nd_device_type(dev), &nd_drv->type);
> +       return !!test_bit(to_nd_device_type(dev), &nd_drv->type);
> }
>
> static struct module *to_bus_provider(struct device *dev)
>
>
>
>
> Other ideas?

How about just checking for -EPROBE_DEFER and treating every other non-zero value as success?
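
As a sketch of that alternative, assuming only -EPROBE_DEFER is special and every other non-zero value counts as a match (the helper names are hypothetical, not driver core API):

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517	/* same numeric value as the kernel's errno */

/* Hypothetical helper: does the match result ask for deferred probing? */
bool match_wants_defer(int ret)
{
	return ret == -EPROBE_DEFER;
}

/* Hypothetical helper: any non-zero value other than -EPROBE_DEFER matches. */
bool match_succeeded(int ret)
{
	return ret != 0 && ret != -EPROBE_DEFER;
}

void match_rule_demo(void)
{
	assert(match_succeeded(1));
	assert(match_succeeded(-1));	/* a raw negative "true" still matches */
	assert(!match_succeeded(-EPROBE_DEFER));
	assert(match_wants_defer(-EPROBE_DEFER));
}
```

With this rule a raw -1 from a bit test is still treated as a match, so existing match functions would not need to normalise their return values.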
Matthew Wilcox Dec. 17, 2015, 5:07 p.m. UTC | #2
On Thu, Dec 17, 2015 at 08:48:09AM -0800, Dan Williams wrote:
> On Thu, Dec 17, 2015 at 7:51 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> How about just checking for EPROBE_DEFER, every other non-zero value is success.

How about using IS_ERR_VALUE()?  If it's good enough for IS_ERR(),
it's good enough for this case.
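
For reference, a userspace rendition of that range check, assuming MAX_ERRNO is 4095 as in the kernel and leaving out the unlikely() hint. The match_is_error() wrapper is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Userspace rendition of the kernel's range check, minus unlikely(). */
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* Hypothetical wrapper: is a match() return value in the errno range? */
bool match_is_error(int ret)
{
	/* Widen to long first so negative values sign-extend. */
	return IS_ERR_VALUE((long)ret);
}

void match_is_error_demo(void)
{
	assert(!match_is_error(0));
	assert(!match_is_error(1));
	assert(match_is_error(-517));	/* -EPROBE_DEFER */
	assert(match_is_error(-1));	/* numerically -EPERM */
}
```

Note that under this sketch a raw -1 also falls inside the error range, so the check would pair most naturally with match functions that return 0 or 1.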
Ross Zwisler Dec. 17, 2015, 5:50 p.m. UTC | #3
On Thu, Dec 17, 2015 at 8:51 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> The commit below causes the libnvdimm sub-system to stop loading.
> This is due to the fact that nvdimm_bus_match() returns the result of
> test_bit() which may be negative.  If there are any other bus match
> functions using test_bit they may be similarly impacted.
>
> Can we queue a fixup like the following to libnvdimm, and maybe
> others, ahead of this driver core change?
>
> diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
> index 7e2c43f701bc..2b2181cdeb63 100644
> --- a/drivers/nvdimm/bus.c
> +++ b/drivers/nvdimm/bus.c
> @@ -62,7 +62,7 @@ static int nvdimm_bus_match(struct device *dev,
> struct device_driver *drv)
> {
>        struct nd_device_driver *nd_drv = to_nd_device_driver(drv);
>
> -       return test_bit(to_nd_device_type(dev), &nd_drv->type);
> +       return !!test_bit(to_nd_device_type(dev), &nd_drv->type);

How is this call to test_bit() returning a negative value?  That can
only happen if the bit number we supply is 63, correct?  I only see
to_nd_device_type() returning 1-6 for the defines ND_DEVICE_DIMM thru
ND_DEVICE_NAMESPACE_BLK?
Russell King - ARM Linux Dec. 17, 2015, 6:46 p.m. UTC | #4
On Thu, Dec 17, 2015 at 07:51:14AM -0800, Dan Williams wrote:
> The commit below causes the libnvdimm sub-system to stop loading.
> This is due to the fact that nvdimm_bus_match() returns the result of
> test_bit() which may be negative.  If there are any other bus match
> functions using test_bit they may be similarly impacted.
> 
> Can we queue a fixup like the following to libnvdimm, and maybe
> others, ahead of this driver core change?

This is rather annoying.  Have we uncovered a latent bug in other
architectures?  Well, looking through the test_bit() implementations,
it looks like it.

I'll drop the patch set for the time being, we can't go around breaking
stuff like this.  However, I think the test_bit() result should be
regularised across different architectures - it _looks_ to me like most
implementations return 0/1 values, but there may be some that don't
(maybe the assembly versions?)

Here's the list I've pulled out so far from the "easy" cases, which all
look like they're returning 0/1 values.

asm-generic: 0/1

/**
 * test_bit - Determine whether a bit is set
 * @nr: bit number to test
 * @addr: Address to start counting from
 */
static inline int test_bit(int nr, const volatile unsigned long *addr)
{
        return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
}

alpha: 0/1

static inline int
test_bit(int nr, const volatile void * addr)
{
        return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
}

arm: 0/1

test_bit(unsigned int nr, const volatile unsigned long *addr)
{
        unsigned long mask;

        addr += nr >> 5;

        mask = 1UL << (nr & 0x1f);

        return ((mask & *addr) != 0);
}

blackfin: 0/1

static inline int test_bit(int nr, const volatile unsigned long *addr)
{
        volatile const unsigned long *a = addr + (nr >> 5);
        return __raw_bit_test_asm(a, nr & 0x1f) != 0;
}

frv: 0/1

static inline int
__constant_test_bit(unsigned long nr, const volatile void *addr)
{
        return ((1UL << (nr & 31)) & (((const volatile unsigned int *) addr)[nr >> 5])) != 0;
}
(and similar for __test_bit)

h8300 uses assembly... no idea

hexagon uses assembly as well... no idea

ia64: 0/1

static __inline__ int
test_bit (int nr, const volatile void *addr)
{
        return 1 & (((const volatile __u32 *) addr)[nr >> 5] >> (nr & 31));
}

m68k: 0/1

static inline int test_bit(int nr, const unsigned long *vaddr)
{
        return (vaddr[nr >> 5] & (1UL << (nr & 31))) != 0;
}

mn10300: 0/1

static inline int test_bit(unsigned long nr, const volatile void *addr)
{
        return 1UL & (((const volatile unsigned int *) addr)[nr >> 5] >> (nr & 31));
}

s390: 0/1

static inline int test_bit(unsigned long nr, const volatile unsigned long *ptr)
{
        const volatile unsigned char *addr;

        addr = ((const volatile unsigned char *)ptr);
        addr += (nr ^ (BITS_PER_LONG - 8)) >> 3;
        return (*addr >> (nr & 7)) & 1;
}

x86: 0/1 for constant, ? for variable

static __always_inline int constant_test_bit(long nr, const volatile unsigned long *addr)
{
        return ((1UL << (nr & (BITS_PER_LONG-1))) &
                (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
}
(presumably variable_test_bit is the same, but I don't know)
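
The two C styles in the list above can be compared side by side in userspace. This sketch assumes the word size is derived from sizeof(unsigned long) instead of the hard-coded 32-bit shifts in some of the quoted versions; the function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Shift-and-mask style, as in the asm-generic version quoted above. */
int test_bit_shift(unsigned int nr, const unsigned long *addr)
{
	return 1UL & (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG));
}

/* Mask-and-compare style, as in the arm and m68k versions quoted above. */
int test_bit_compare(unsigned int nr, const unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);

	return (addr[nr / BITS_PER_LONG] & mask) != 0;
}

void test_bit_styles_demo(void)
{
	/* Bits 0 and 2 set in word 0, top bit set in word 1. */
	unsigned long map[2] = { 0x5UL, 1UL << (BITS_PER_LONG - 1) };
	unsigned int top = 2 * BITS_PER_LONG - 1;

	assert(test_bit_shift(2, map) == 1 && test_bit_compare(2, map) == 1);
	assert(test_bit_shift(1, map) == 0 && test_bit_compare(1, map) == 0);
	/* Even the top bit of a word comes back as 1, never negative. */
	assert(test_bit_shift(top, map) == 1 && test_bit_compare(top, map) == 1);
}
```

Both styles reduce the result to 0 or 1 before it is converted to int, which is why these C implementations cannot return a negative value.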
Dan Williams Dec. 17, 2015, 8:35 p.m. UTC | #5
[Adding Dave Howells who tried to correct this situation earlier this year]

On Thu, Dec 17, 2015 at 10:46 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Dec 17, 2015 at 07:51:14AM -0800, Dan Williams wrote:
>> The commit below causes the libnvdimm sub-system to stop loading.
>> This is due to the fact that nvdimm_bus_match() returns the result of
>> test_bit() which may be negative.  If there are any other bus match
>> functions using test_bit they may be similarly impacted.
>>
>> Can we queue a fixup like the following to libnvdimm, and maybe
>> others, ahead of this driver core change?
>
> This is rather annoying.  Have we uncovered a latent bug in other
> architectures?  Well, looking through the test_bit() implementations,
> it looks like it.
>
> I'll drop the patch set for the time being, we can't go around breaking
> stuff like this.

...or make the interpretation from the return value of ->match() be 0,
-EPROBE_DEFER, or other non-zero value for success?  Although that's
fairly subtle.

> However, I think the test_bit() result should be
> regularised across different architectures - it _looks_ to me like most
> implementations return 0/1 values, but there may be some that don't
> (maybe the assembly versions?)

Correct.  All the constant versions return 0 or 1, but the assembly
versions return 0 or non-zero.

Here's a link to Dave's rework.

https://lwn.net/Articles/642437/
Yoshinori Sato Dec. 18, 2015, 4:18 a.m. UTC | #6
On Fri, 18 Dec 2015 03:46:41 +0900,
Russell King - ARM Linux wrote:
> 
> On Thu, Dec 17, 2015 at 07:51:14AM -0800, Dan Williams wrote:
> > The commit below causes the libnvdimm sub-system to stop loading.
> > This is due to the fact that nvdimm_bus_match() returns the result of
> > test_bit() which may be negative.  If there are any other bus match
> > functions using test_bit they may be similarly impacted.
> > 
> > Can we queue a fixup like the following to libnvdimm, and maybe
> > others, ahead of this driver core change?
> 
> This is rather annoying.  Have we uncovered a latent bug in other
> architectures?  Well, looking through the test_bit() implementations,
> it looks like it.
> 
> I'll drop the patch set for the time being, we can't go around breaking
> stuff like this.  However, I think the test_bit() result should be
> regularised across different architectures - it _looks_ to me like most
> implementations return 0/1 values, but there may be some that don't
> (maybe the assembly versions?)
> 
> Here's the list I've pulled out so far from the "easy" cases, which all
> look like they're returning 0/1 values.
> 
> asm-generic: 0/1
> 
> /**
>  * test_bit - Determine whether a bit is set
>  * @nr: bit number to test
>  * @addr: Address to start counting from
>  */
> static inline int test_bit(int nr, const volatile unsigned long *addr)
> {
>         return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
> }
> 
> alpha: 0/1
> 
> static inline int
> test_bit(int nr, const volatile void * addr)
> {
>         return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
> }
> 
> arm: 0/1
> 
> test_bit(unsigned int nr, const volatile unsigned long *addr)
> {
>         unsigned long mask;
> 
>         addr += nr >> 5;
> 
>         mask = 1UL << (nr & 0x1f);
> 
>         return ((mask & *addr) != 0);
> }
> 
> blackfin: 0/1
> 
> static inline int test_bit(int nr, const volatile unsigned long *addr)
> {
>         volatile const unsigned long *a = addr + (nr >> 5);
>         return __raw_bit_test_asm(a, nr & 0x1f) != 0;
> }
> 
> frv: 0/1
> 
> static inline int
> __constant_test_bit(unsigned long nr, const volatile void *addr)
> {
>         return ((1UL << (nr & 31)) & (((const volatile unsigned int *) addr)[nr >> 5])) != 0;
> }
> (and similar for __test_bit)
> 
> h8300 uses assembly... no idea
0/1.
I think it returns the same values as the other architectures.

> hexagon uses assembly as well... no idea
> 
> ia64: 0/1
> 
> static __inline__ int
> test_bit (int nr, const volatile void *addr)
> {
>         return 1 & (((const volatile __u32 *) addr)[nr >> 5] >> (nr & 31));
> }
> 
> m68k: 0/1
> 
> static inline int test_bit(int nr, const unsigned long *vaddr)
> {
>         return (vaddr[nr >> 5] & (1UL << (nr & 31))) != 0;
> }
> 
> mn10300: 0/1
> 
> static inline int test_bit(unsigned long nr, const volatile void *addr)
> {
>         return 1UL & (((const volatile unsigned int *) addr)[nr >> 5] >> (nr & 31));
> }
> 
> s390: 0/1
> 
> static inline int test_bit(unsigned long nr, const volatile unsigned long *ptr)
> {
>         const volatile unsigned char *addr;
> 
>         addr = ((const volatile unsigned char *)ptr);
>         addr += (nr ^ (BITS_PER_LONG - 8)) >> 3;
>         return (*addr >> (nr & 7)) & 1;
> }
> 
> x86: 0/1 for constant, ? for variable
> 
> static __always_inline int constant_test_bit(long nr, const volatile unsigned long *addr)
> {
>         return ((1UL << (nr & (BITS_PER_LONG-1))) &
>                 (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
> }
> (presumably variable_test_bit is the same, but I don't know)

Patch

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 7e2c43f701bc..2b2181cdeb63 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -62,7 +62,7 @@  static int nvdimm_bus_match(struct device *dev,
struct device_driver *drv)
{
       struct nd_device_driver *nd_drv = to_nd_device_driver(drv);

-       return test_bit(to_nd_device_type(dev), &nd_drv->type);
+       return !!test_bit(to_nd_device_type(dev), &nd_drv->type);
}
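
The effect of the `!!` in this patch can be shown with a small userspace mock. It assumes a bit test that reports "set" as -1, which is how a borrow-based assembly sequence would behave; that detail is an assumption here and is not confirmed in the thread. The mock_* structures and function names are hypothetical stand-ins, not the libnvdimm types.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a bit test that returns 0 or non-zero,
 * here reporting a set bit as -1.
 */
int raw_test_bit(int nr, const unsigned long *addr)
{
	return ((*addr >> nr) & 1UL) ? -1 : 0;
}

/* Simplified stand-ins for the driver and device structures. */
struct mock_driver { unsigned long type; };	/* bitmask of supported types */
struct mock_device { int type; };		/* device type number */

/* Before the patch: the raw result leaks out of match(). */
int match_unpatched(const struct mock_device *dev,
		    const struct mock_driver *drv)
{
	return raw_test_bit(dev->type, &drv->type);
}

/* After the patch: !! collapses any non-zero value to 1. */
int match_patched(const struct mock_device *dev,
		  const struct mock_driver *drv)
{
	return !!raw_test_bit(dev->type, &drv->type);
}

void match_patch_demo(void)
{
	struct mock_driver drv = { .type = (1UL << 1) | (1UL << 3) };
	struct mock_device hit = { .type = 3 }, miss = { .type = 2 };

	assert(match_unpatched(&hit, &drv) == -1);
	assert(match_patched(&hit, &drv) == 1);
	assert(match_patched(&miss, &drv) == 0);
}
```

The double negation leaves the "no match" case at 0 and maps every non-zero result to 1, so the return value can no longer be mistaken for an error code.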