diff mbox

acpi: pci_root: fix NULL pointer deref after resume from suspend

Message ID 1254119480-9730-1-git-send-email-dfeng@redhat.com (mailing list archive)
State RFC, archived
Headers show

Commit Message

Xiaotian Feng Sept. 28, 2009, 6:31 a.m. UTC
commit 275582 introduces acpi_get_pci_dev(), but pdev->subordinate
can be NULL, then a NULL was passed to pci_get_slot, this results
the kernel oops when resume from suspend.

This patch resolves following kernel oops:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
---
 drivers/acpi/pci_root.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

Comments

Alexander Chiang Sept. 28, 2009, 5:38 p.m. UTC | #1
Hi Xiaotian,

Thanks for the bug report.

* Xiaotian Feng <dfeng@redhat.com>:
> commit 275582 introduces acpi_get_pci_dev(), but pdev->subordinate
> can be NULL, then a NULL was passed to pci_get_slot, this results
> the kernel oops when resume from suspend.
> 
> This patch resolves following kernel oops:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
> 
> Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
> ---
>  drivers/acpi/pci_root.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 3112221..3c35144 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
>  		if (!pdev || hnd == handle)
>  			break;
>  
> -		pbus = pdev->subordinate;
> +		if (pdev->subordinate)
> +			pbus = pdev->subordinate;
> +		else
> +			pbus = pdev->bus;
> +

I'm a little confused by this. If we start from the PCI root
bridge and walk back down the hierarchy, shouldn't everything
between the root and the device be a P2P bridge?

What is special about suspend/resume that causes the subordinate
bus to become NULL?

Can you send the full stacktrace?

Thanks.

/ac

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki Sept. 28, 2009, 8:43 p.m. UTC | #2
On Monday 28 September 2009, Alex Chiang wrote:
> Hi Xiaotian,
> 
> Thanks for the bug report.
> 
> * Xiaotian Feng <dfeng@redhat.com>:
> > commit 275582 introduces acpi_get_pci_dev(), but pdev->subordinate
> > can be NULL, then a NULL was passed to pci_get_slot, this results
> > the kernel oops when resume from suspend.
> > 
> > This patch resolves following kernel oops:
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> > IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
> > 
> > Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
> > ---
> >  drivers/acpi/pci_root.c |    6 +++++-
> >  1 files changed, 5 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > index 3112221..3c35144 100644
> > --- a/drivers/acpi/pci_root.c
> > +++ b/drivers/acpi/pci_root.c
> > @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
> >  		if (!pdev || hnd == handle)
> >  			break;
> >  
> > -		pbus = pdev->subordinate;
> > +		if (pdev->subordinate)
> > +			pbus = pdev->subordinate;
> > +		else
> > +			pbus = pdev->bus;
> > +
> 
> I'm a little confused by this. If we start from the PCI root
> bridge and walk back down the hierarchy, shouldn't everything
> between the root and the device be a P2P bridge?

Well, if my reading of the code is correct, there's no guarantee that
pci_get_slot() will always return either the right device or a bridge.  If it
returns anything that's not a bridge or the device we're looking for, we'll hit
the NULL pointer deref.

In fact it looks like we should also clear pdev if no device was found.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki Sept. 28, 2009, 9:05 p.m. UTC | #3
On Monday 28 September 2009, Rafael J. Wysocki wrote:
> On Monday 28 September 2009, Alex Chiang wrote:
> > Hi Xiaotian,
> > 
> > Thanks for the bug report.
> > 
> > * Xiaotian Feng <dfeng@redhat.com>:
> > > commit 275582 introduces acpi_get_pci_dev(), but pdev->subordinate
> > > can be NULL, then a NULL was passed to pci_get_slot, this results
> > > the kernel oops when resume from suspend.
> > > 
> > > This patch resolves following kernel oops:
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> > > IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
> > > 
> > > Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
> > > ---
> > >  drivers/acpi/pci_root.c |    6 +++++-
> > >  1 files changed, 5 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > > index 3112221..3c35144 100644
> > > --- a/drivers/acpi/pci_root.c
> > > +++ b/drivers/acpi/pci_root.c
> > > @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
> > >  		if (!pdev || hnd == handle)
> > >  			break;
> > >  
> > > -		pbus = pdev->subordinate;
> > > +		if (pdev->subordinate)
> > > +			pbus = pdev->subordinate;
> > > +		else
> > > +			pbus = pdev->bus;
> > > +
> > 
> > I'm a little confused by this. If we start from the PCI root
> > bridge and walk back down the hierarchy, shouldn't everything
> > between the root and the device be a P2P bridge?
> 
> Well, if my reading of the code is correct, there's no guarantee that
> pci_get_slot() will always return either the right device or a bridge.

I should have been more precise.

If devfn of node happens to be the same as devfn of a non-bridge device on
pbus, the pci_get_slot() will return a valid pointer to it, but
pdev->subordinate will be NULL.  Is it impossible for some reason?

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Chiang Sept. 28, 2009, 10:20 p.m. UTC | #4
* Rafael J. Wysocki <rjw@sisk.pl>:
> On Monday 28 September 2009, Rafael J. Wysocki wrote:
> > On Monday 28 September 2009, Alex Chiang wrote:
> > > * Xiaotian Feng <dfeng@redhat.com>:
> > > > --- a/drivers/acpi/pci_root.c
> > > > +++ b/drivers/acpi/pci_root.c
> > > > @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
> > > >  		if (!pdev || hnd == handle)
> > > >  			break;
> > > >  
> > > > -		pbus = pdev->subordinate;
> > > > +		if (pdev->subordinate)
> > > > +			pbus = pdev->subordinate;
> > > > +		else
> > > > +			pbus = pdev->bus;
> > > > +
> > > 
> > > I'm a little confused by this. If we start from the PCI root
> > > bridge and walk back down the hierarchy, shouldn't everything
> > > between the root and the device be a P2P bridge?
> > 
> > Well, if my reading of the code is correct, there's no guarantee that
> > pci_get_slot() will always return either the right device or a bridge.
> 
> I should have been more precise.
> 
> If devfn of node happens to be the same as devfn of a non-bridge device on
> pbus, the pci_get_slot() will return a valid pointer to it, but
> pdev->subordinate will be NULL.  Is it impossible for some reason?

Hm, that's a good thought, but I'm still confused. Here's the
first part of the full function (acpi_get_pci_dev):

        phandle = handle;
        while (!acpi_is_root_bridge(phandle)) {
                node = kzalloc(sizeof(struct acpi_handle_node), GFP_KERNEL);
                if (!node)
                        goto out;

                INIT_LIST_HEAD(&node->node);
                node->handle = phandle;
                list_add(&node->node, &device_list);

                status = acpi_get_parent(phandle, &phandle);
                if (ACPI_FAILURE(status))
                        goto out;
        }

phandle starts off as the input parameter, and we make successive
calls to acpi_get_parent() to walk up the ACPI device tree until
we get to a root bridge.

My assumption here is that all those ACPI devices must be P2P
bridges.

        root = acpi_pci_find_root(phandle);
        if (!root)
                goto out;

        pbus = root->bus;

Now we've got an acpi_pci_root() which has a struct pci_bus, and
we can start walking back down the PCI tree. Except what we're
really doing is iterating across the device_list which we created
above.

device_list should only contain P2P bridges, based on my
assumption above.

        list_for_each_entry(node, &device_list, node) {
                acpi_handle hnd = node->handle;
                status = acpi_evaluate_integer(hnd, "_ADR", NULL, &adr);
                if (ACPI_FAILURE(status))
                        goto out;
                dev = (adr >> 16) & 0xffff;
                fn  = adr & 0xffff;

                pdev = pci_get_slot(pbus, PCI_DEVFN(dev, fn));
                if (!pdev || hnd == handle)
                        break;

                pbus = pdev->subordinate;
                pci_dev_put(pdev);
        }

The point you raise about collision between the devfn of 'node'
and some non-bridge device returned by pci_get_slot() seems like
it really shouldn't happen, because we evaluate _ADR for each
node on device_list, in the reverse order that we found them, and
based on my assumption, all those nodes should be bridges.

I'm not saying that Xiaotian's patch is wrong. I'm saying I'd
like to be educated as to why my basic assumption was wrong,
because now you're making me think that this code is pretty
fragile. :-/

Thanks.

/ac


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki Sept. 28, 2009, 10:50 p.m. UTC | #5
On Tuesday 29 September 2009, Alex Chiang wrote:
> * Rafael J. Wysocki <rjw@sisk.pl>:
> > On Monday 28 September 2009, Rafael J. Wysocki wrote:
> > > On Monday 28 September 2009, Alex Chiang wrote:
> > > > * Xiaotian Feng <dfeng@redhat.com>:
> > > > > --- a/drivers/acpi/pci_root.c
> > > > > +++ b/drivers/acpi/pci_root.c
> > > > > @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
> > > > >  		if (!pdev || hnd == handle)
> > > > >  			break;
> > > > >  
> > > > > -		pbus = pdev->subordinate;
> > > > > +		if (pdev->subordinate)
> > > > > +			pbus = pdev->subordinate;
> > > > > +		else
> > > > > +			pbus = pdev->bus;
> > > > > +
> > > > 
> > > > I'm a little confused by this. If we start from the PCI root
> > > > bridge and walk back down the hierarchy, shouldn't everything
> > > > between the root and the device be a P2P bridge?
> > > 
> > > Well, if my reading of the code is correct, there's no guarantee that
> > > pci_get_slot() will always return either the right device or a bridge.
> > 
> > I should have been more precise.
> > 
> > If devfn of node happens to be the same as devfn of a non-bridge device on
> > pbus, the pci_get_slot() will return a valid pointer to it, but
> > pdev->subordinate will be NULL.  Is it impossible for some reason?
> 
> Hm, that's a good thought, but I'm still confused. Here's the
> first part of the full function (acpi_get_pci_dev):
> 
>         phandle = handle;
>         while (!acpi_is_root_bridge(phandle)) {
>                 node = kzalloc(sizeof(struct acpi_handle_node), GFP_KERNEL);
>                 if (!node)
>                         goto out;
> 
>                 INIT_LIST_HEAD(&node->node);
>                 node->handle = phandle;
>                 list_add(&node->node, &device_list);
> 
>                 status = acpi_get_parent(phandle, &phandle);
>                 if (ACPI_FAILURE(status))
>                         goto out;
>         }
> 
> phandle starts off as the input parameter, and we make successive
> calls to acpi_get_parent() to walk up the ACPI device tree until
> we get to a root bridge.
> 
> My assumption here is that all those ACPI devices must be P2P
> bridges.
> 
>         root = acpi_pci_find_root(phandle);
>         if (!root)
>                 goto out;
> 
>         pbus = root->bus;
> 
> Now we've got an acpi_pci_root() which has a struct pci_bus, and
> we can start walking back down the PCI tree. Except what we're
> really doing is iterating across the device_list which we created
> above.
> 
> device_list should only contain P2P bridges, based on my
> assumption above.
> 
>         list_for_each_entry(node, &device_list, node) {
>                 acpi_handle hnd = node->handle;
>                 status = acpi_evaluate_integer(hnd, "_ADR", NULL, &adr);
>                 if (ACPI_FAILURE(status))
>                         goto out;
>                 dev = (adr >> 16) & 0xffff;
>                 fn  = adr & 0xffff;
> 
>                 pdev = pci_get_slot(pbus, PCI_DEVFN(dev, fn));
>                 if (!pdev || hnd == handle)
>                         break;
> 
>                 pbus = pdev->subordinate;
>                 pci_dev_put(pdev);
>         }
> 
> The point you raise about collision between the devfn of 'node'
> and some non-bridge device returned by pci_get_slot() seems like
> it really shouldn't happen, because we evaluate _ADR for each
> node on device_list, in the reverse order that we found them, and
> based on my assumption, all those nodes should be bridges.

You seem to be right, but if the Xiaotian's patch actually fixes the NULL
pointer deref, one of the assumptions is clearly wrong.

> I'm not saying that Xiaotian's patch is wrong. I'm saying I'd
> like to be educated as to why my basic assumption was wrong,
> because now you're making me think that this code is pretty
> fragile. :-/

Perhaps Xiaotian can add some printk()s on top of his patch that will show us
exactly in what conditions pbus becomes NULL.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaotian Feng Sept. 29, 2009, 1:44 a.m. UTC | #6
On 09/29/2009 01:38 AM, Alex Chiang wrote:
> Hi Xiaotian,
>
> Thanks for the bug report.
>
> * Xiaotian Feng<dfeng@redhat.com>:
>    
>> commit 275582 introduces acpi_get_pci_dev(), but pdev->subordinate
>> can be NULL, then a NULL was passed to pci_get_slot, this results
>> the kernel oops when resume from suspend.
>>
>> This patch resolves following kernel oops:
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
>> IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
>>
>> Signed-off-by: Xiaotian Feng<dfeng@redhat.com>
>> ---
>>   drivers/acpi/pci_root.c |    6 +++++-
>>   1 files changed, 5 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>> index 3112221..3c35144 100644
>> --- a/drivers/acpi/pci_root.c
>> +++ b/drivers/acpi/pci_root.c
>> @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
>>   		if (!pdev || hnd == handle)
>>   			break;
>>
>> -		pbus = pdev->subordinate;
>> +		if (pdev->subordinate)
>> +			pbus = pdev->subordinate;
>> +		else
>> +			pbus = pdev->bus;
>> +
>>      
> I'm a little confused by this. If we start from the PCI root
> bridge and walk back down the hierarchy, shouldn't everything
> between the root and the device be a P2P bridge?
>
> What is special about suspend/resume that causes the subordinate
> bus to become NULL?
>
> Can you send the full stacktrace?
>
> Thanks.
>
> /ac
>
>
>    
the full call trace is here:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
PGD 208b9d067 PUD 208a89067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/power/state
CPU 0
Modules linked in: fuse radeon ttm drm_kms_helper drm i2c_algo_bit sco 
bridge stp llc bnep l2cap bluetooth sunrpc ip6t_REJECT nf_conntrack_ipv6 
ip6table_filter ip6_tables ipv6 dm_multipath uinput snd_hda_codec_analog 
snd_hda_intel snd_hda_codec snd_hwdep e1000e snd_pcm snd_timer i2c_i801 
i2c_core snd soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support 
serio_raw ppdev parport_pc parport pcspkr dcdbas ata_generic pata_acpi 
[last unloaded: speedstep_lib]
Pid: 35, comm: kacpi_hotplug Not tainted 2.6.32-rc2 #3 OptiPlex 760
RIP: 0010:[<ffffffff812217e7>]  [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
RSP: 0018:ffff88022ee69aa0  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88022e9b1090 RCX: 00000000000000a0
RDX: 000000000000002f RSI: ffffffff8168ab38 RDI: ffffffff8168ab38
RBP: ffff88022ee69ac0 R08: ffffffff8168ab30 R09: ffff880100000000
R10: ffffffff8168ab50 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffff88022f712000 R15: ffff88022f710dd0
FS:  0000000000000000(0000) GS:ffff880028200000(0000) 
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 00000001fc298000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kacpi_hotplug (pid: 35, threadinfo ffff88022ee68000, task 
ffff88022eefc120)
Stack:
  0000000000000018 ffff88022e9b1090 ffff88020880e9c0 0000000000000000
<0> ffff88022ee69b30 ffffffff81254193 0000000000000000 ffff88022ee69ae0
<0> ffff88020880e340 ffff88020880ee38 ffff88022f710208 0000000000000001
Call Trace:
  [<ffffffff81254193>] acpi_get_pci_dev+0x106/0x167
  [<ffffffff8125545a>] acpi_pci_bind+0x1c/0x86
  [<ffffffff8116230a>] ? sysfs_create_file+0x2a/0x2c
  [<ffffffff8125141f>] acpi_add_single_object+0x964/0xa0c
  [<ffffffff812515a7>] acpi_bus_check_add+0xe0/0x138
  [<ffffffff81251667>] acpi_bus_scan+0x68/0xa0
  [<ffffffff812516f4>] acpi_bus_add+0x2a/0x2e
  [<ffffffff81252c59>] hotplug_dock_devices+0x114/0x13e
  [<ffffffff8125301a>] acpi_dock_deferred_cb+0xbf/0x192
  [<ffffffff8124d6ca>] acpi_os_execute_deferred+0x29/0x36
  [<ffffffff8106a244>] worker_thread+0x251/0x347
  [<ffffffff8106a1ef>] ? worker_thread+0x1fc/0x347
  [<ffffffff8124d6a1>] ? acpi_os_execute_deferred+0x0/0x36
  [<ffffffff8106e426>] ? autoremove_wake_function+0x0/0x39
  [<ffffffff81069ff3>] ? worker_thread+0x0/0x347
  [<ffffffff8106e0e0>] kthread+0x7f/0x87
  [<ffffffff81012cea>] child_rip+0xa/0x20
  [<ffffffff81012650>] ? restore_args+0x0/0x30
  [<ffffffff8106e061>] ? kthread+0x0/0x87
  [<ffffffff81012ce0>] ? child_rip+0x0/0x20
Code: ff 49 89 fc 41 89 f5 a9 00 ff ff 07 74 11 be 87 00 00 00 48 c7 c7 
45 6d 5a 81 e8 f6 2b e3 ff 48 c7 c7 30 ab 68 81 e8 29 77 20 00 <49> 8b 
5c 24 28 49 83 c4 28 eb 09 44 39 6b 38 74 10 48 89 c3 48
RIP  [<ffffffff812217e7>] pci_get_slot+0x4c/0x8c
  RSP <ffff88022ee69aa0>
CR2: 0000000000000028
---[ end trace b5a7793bd9db2a4d ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiaotian Feng Sept. 29, 2009, 10:11 a.m. UTC | #7
On 09/29/2009 06:50 AM, Rafael J. Wysocki wrote:
> On Tuesday 29 September 2009, Alex Chiang wrote:
>> * Rafael J. Wysocki<rjw@sisk.pl>:
>>> On Monday 28 September 2009, Rafael J. Wysocki wrote:
>>>> On Monday 28 September 2009, Alex Chiang wrote:
>>>>> * Xiaotian Feng<dfeng@redhat.com>:
>>>>>> --- a/drivers/acpi/pci_root.c
>>>>>> +++ b/drivers/acpi/pci_root.c
>>>>>> @@ -387,7 +387,11 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
>>>>>>   		if (!pdev || hnd == handle)
>>>>>>   			break;
>>>>>>
>>>>>> -		pbus = pdev->subordinate;
>>>>>> +		if (pdev->subordinate)
>>>>>> +			pbus = pdev->subordinate;
>>>>>> +		else
>>>>>> +			pbus = pdev->bus;
>>>>>> +
>>>>>
>>>>> I'm a little confused by this. If we start from the PCI root
>>>>> bridge and walk back down the hierarchy, shouldn't everything
>>>>> between the root and the device be a P2P bridge?
>>>>
>>>> Well, if my reading of the code is correct, there's no guarantee that
>>>> pci_get_slot() will always return either the right device or a bridge.
>>>
>>> I should have been more precise.
>>>
>>> If devfn of node happens to be the same as devfn of a non-bridge device on
>>> pbus, the pci_get_slot() will return a valid pointer to it, but
>>> pdev->subordinate will be NULL.  Is it impossible for some reason?
>>
>> Hm, that's a good thought, but I'm still confused. Here's the
>> first part of the full function (acpi_get_pci_dev):
>>
>>          phandle = handle;
>>          while (!acpi_is_root_bridge(phandle)) {
>>                  node = kzalloc(sizeof(struct acpi_handle_node), GFP_KERNEL);
>>                  if (!node)
>>                          goto out;
>>
>>                  INIT_LIST_HEAD(&node->node);
>>                  node->handle = phandle;
>>                  list_add(&node->node,&device_list);
>>
>>                  status = acpi_get_parent(phandle,&phandle);
>>                  if (ACPI_FAILURE(status))
>>                          goto out;
>>          }
>>
>> phandle starts off as the input parameter, and we make successive
>> calls to acpi_get_parent() to walk up the ACPI device tree until
>> we get to a root bridge.
>>
>> My assumption here is that all those ACPI devices must be P2P
>> bridges.
>>
>>          root = acpi_pci_find_root(phandle);
>>          if (!root)
>>                  goto out;
>>
>>          pbus = root->bus;
>>
>> Now we've got an acpi_pci_root() which has a struct pci_bus, and
>> we can start walking back down the PCI tree. Except what we're
>> really doing is iterating across the device_list which we created
>> above.
>>
>> device_list should only contain P2P bridges, based on my
>> assumption above.
>>
>>          list_for_each_entry(node,&device_list, node) {
>>                  acpi_handle hnd = node->handle;
>>                  status = acpi_evaluate_integer(hnd, "_ADR", NULL,&adr);
>>                  if (ACPI_FAILURE(status))
>>                          goto out;
>>                  dev = (adr>>  16)&  0xffff;
>>                  fn  = adr&  0xffff;
>>
>>                  pdev = pci_get_slot(pbus, PCI_DEVFN(dev, fn));
>>                  if (!pdev || hnd == handle)
>>                          break;
>>
>>                  pbus = pdev->subordinate;
>>                  pci_dev_put(pdev);
>>          }
>>
>> The point you raise about collision between the devfn of 'node'
>> and some non-bridge device returned by pci_get_slot() seems like
>> it really shouldn't happen, because we evaluate _ADR for each
>> node on device_list, in the reverse order that we found them, and
>> based on my assumption, all those nodes should be bridges.
>
> You seem to be right, but if the Xiaotian's patch actually fixes the NULL
> pointer deref, one of the assumptions is clearly wrong.
>
>> I'm not saying that Xiaotian's patch is wrong. I'm saying I'd
>> like to be educated as to why my basic assumption was wrong,
>> because now you're making me think that this code is pretty
>> fragile. :-/
>
> Perhaps Xiaotian can add some printk()s on top of his patch that will show us
> exactly in what conditions pbus becomes NULL.
>
> Thanks,
> Rafael
>
Is there any cases that pdev->subordinate is NULL while pdev is bridge 
device?
 From pci_slot.c::walk_p2p_bridge, there's code like following:

         dev = pci_get_slot(pci_bus, PCI_DEVFN(device, function));
         if (!dev || !dev->subordinate)
                 goto out;

It looks like dev->subordinate can be NULL even if in p2p bridge, right?

Thanks
Xiaotian

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 3112221..3c35144 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -387,7 +387,11 @@  struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
 		if (!pdev || hnd == handle)
 			break;
 
-		pbus = pdev->subordinate;
+		if (pdev->subordinate)
+			pbus = pdev->subordinate;
+		else
+			pbus = pdev->bus;
+
 		pci_dev_put(pdev);
 	}
 out: