[v4,05/17] PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y

Message ID: 67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de
State: Handled Elsewhere
Series: Collection of DOE material

Commit Message

Lukas Wunner March 11, 2023, 2:40 p.m. UTC
Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
probing because pci_doe_submit_task() invokes INIT_WORK() instead of
INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.

All callers of pci_doe_submit_task() allocate the work_struct on the
stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
short-term fix.

The long-term fix implemented by a subsequent commit is to move to a
synchronous API which allocates the work_struct internally in the DOE
library.

Stacktrace for posterity:

WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
 pci_doe_submit_task+0x5d/0xd0
 pci_doe_discovery+0xb4/0x100
 pcim_doe_create_mb+0x219/0x290
 cxl_pci_probe+0x192/0x430
 local_pci_probe+0x41/0x80
 pci_device_probe+0xb3/0x220
 really_probe+0xde/0x380
 __driver_probe_device+0x78/0x170
 driver_probe_device+0x1f/0x90
 __driver_attach_async_helper+0x5c/0xe0
 async_run_entry_fn+0x30/0x130
 process_one_work+0x294/0x5b0

Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
Reported-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
Cc: stable@vger.kernel.org # v6.0+
---
 drivers/pci/doe.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Alexey Kardashevskiy March 21, 2023, 3:42 a.m. UTC | #1
On 12/3/23 01:40, Lukas Wunner wrote:
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
                                                   ^^^^^

huwei? :)
Jonathan Cameron March 21, 2023, 9:05 a.m. UTC | #2
On Tue, 21 Mar 2023 14:42:01 +1100
Alexey Kardashevskiy <aik@amd.com> wrote:

> On 12/3/23 01:40, Lukas Wunner wrote:
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>  
>                                                    ^^^^^
> 
> huwei? :)
Doh.  I normally type my own name wrong ;)

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks,

Jonathan

Lukas Wunner April 4, 2023, 9:01 a.m. UTC | #3
On Tue, Mar 21, 2023 at 02:42:01PM +1100, Alexey Kardashevskiy wrote:
> On 12/3/23 01:40, Lukas Wunner wrote:
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
>                                                   ^^^^^
> 
> huwei? :)

Thanks for spotting this, Alexey.

Dan fixed it up when he applied the patch to cxl/fixes yesterday:
https://git.kernel.org/cxl/cxl/c/92dc899c3b49

Patch

diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 6f097932ccbf..c14ffdf23f87 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -523,6 +523,8 @@  EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
  * task->complete will be called when the state machine is done processing this
  * task.
  *
+ * @task must be allocated on the stack.
+ *
  * Excess data will be discarded.
  *
  * RETURNS: 0 when task has been successfully queued, -ERRNO on error
@@ -544,7 +546,7 @@  int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
 		return -EIO;
 
 	task->doe_mb = doe_mb;
-	INIT_WORK(&task->work, doe_statemachine_work);
+	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
 	queue_work(doe_mb->work_queue, &task->work);
 	return 0;
 }
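
For readers unfamiliar with the check behind the splat, here is a minimal
userspace sketch of the idea described in the commit message. It is not
kernel code and not the debugobjects implementation: every model_* and
MODEL_* name is hypothetical, and a static array stands in for the task
stack. It only assumes that the check compares where the object actually
lives against how the init macro annotated it, and warns on a mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's work_struct. */
struct model_work;
typedef void (*model_work_fn)(struct model_work *work);

struct model_work {
	model_work_fn func;
};

#define MODEL_NR 4

/* This array plays the role of the task stack, the other one does not. */
static struct model_work model_stack_area[MODEL_NR];
static struct model_work model_heap_area[MODEL_NR];
static int model_warnings;

/* True if addr lies inside the region the model treats as the stack. */
static bool model_on_stack(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t lo = (uintptr_t)&model_stack_area[0];
	uintptr_t hi = (uintptr_t)&model_stack_area[MODEL_NR];

	return a >= lo && a < hi;
}

/* Warn when the caller's annotation disagrees with the object's location. */
static void model_init_work(struct model_work *work, model_work_fn func,
			    bool annotated_onstack)
{
	bool on_stack;

	assert(work != NULL && func != NULL);

	on_stack = model_on_stack(work);
	if (on_stack != annotated_onstack) {
		model_warnings++;
		fprintf(stderr, "model: object %p is %s stack, but %s annotated\n",
			(void *)work, on_stack ? "on" : "NOT on",
			annotated_onstack ? "" : "NOT");
	}
	work->func = func;
}

/* Hypothetical analogues of INIT_WORK() and INIT_WORK_ONSTACK(). */
#define MODEL_INIT_WORK(w, f)		model_init_work((w), (f), false)
#define MODEL_INIT_WORK_ONSTACK(w, f)	model_init_work((w), (f), true)

static void model_noop(struct model_work *work)
{
	(void)work;
}

/* Returns how many warnings the three initialisations below produce. */
static int model_demo(void)
{
	struct model_work *task_work = &model_stack_area[0];

	model_warnings = 0;
	MODEL_INIT_WORK(task_work, model_noop);		/* pre-patch: mismatch */
	MODEL_INIT_WORK_ONSTACK(task_work, model_noop);	/* post-patch: matches */
	MODEL_INIT_WORK(&model_heap_area[0], model_noop); /* off-stack: matches */
	return model_warnings;
}
```

Calling model_demo() from a main() of your own counts one warning, raised
by the first, pre-patch style initialisation. In the kernel proper, an
on-stack work item would presumably also be paired with
destroy_work_on_stack() before the function returns; this patch does not
touch the callers, so that part is left out of the sketch.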