diff mbox series

[v2,01/10] PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y

Message ID cc4b61809e2520d835cf3d4f62e7d5ed00a9d031.1674468099.git.lukas@wunner.de
State Superseded
Delegated to: Bjorn Helgaas
Series Collection of DOE material

Commit Message

Lukas Wunner Jan. 23, 2023, 10:11 a.m. UTC
Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
probing because pci_doe_submit_task() invokes INIT_WORK() instead of
INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.

All callers of pci_doe_submit_task() allocate the work_struct on the
stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
short-term fix.

Stacktrace for posterity:

WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
 pci_doe_submit_task+0x5d/0xd0
 pci_doe_discovery+0xb4/0x100
 pcim_doe_create_mb+0x219/0x290
 cxl_pci_probe+0x192/0x430
 local_pci_probe+0x41/0x80
 pci_device_probe+0xb3/0x220
 really_probe+0xde/0x380
 __driver_probe_device+0x78/0x170
 driver_probe_device+0x1f/0x90
 __driver_attach_async_helper+0x5c/0xe0
 async_run_entry_fn+0x30/0x130
 process_one_work+0x294/0x5b0

Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
Reported-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v6.0+
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 Changes v1 -> v2:
  * Add note in kernel-doc of pci_doe_submit_task() that pci_doe_task must
    be allocated on the stack (Jonathan)

 drivers/pci/doe.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
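
The WARN described in the commit message comes from a consistency check between where an object lives and how its initializer annotated it. The following is a rough userspace model of that idea only, not the kernel's code: `stack_region`, `addr_on_stack()` and `init_would_warn()` are hypothetical names invented for this illustration, and the "stack" is just an address range supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the address range of the current stack. */
struct stack_region {
	uintptr_t lo;	/* lowest address, inclusive */
	uintptr_t hi;	/* highest address, exclusive */
};

/* Does @addr fall inside the modelled stack range? */
static bool addr_on_stack(const struct stack_region *s, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	return a >= s->lo && a < s->hi;
}

/*
 * Model of the check performed at init time.
 * @annotated_onstack is true for an "_ONSTACK" style initializer and
 * false for the plain one. Returns true if a warning would be emitted,
 * i.e. if the annotation disagrees with the object's real location.
 */
static bool init_would_warn(const struct stack_region *s, const void *obj,
			    bool annotated_onstack)
{
	bool is_on_stack = addr_on_stack(s, obj);

	if (is_on_stack == annotated_onstack)
		return false;

	if (annotated_onstack)
		fprintf(stderr, "object %p is NOT on stack, but annotated\n",
			(void *)obj);
	else
		fprintf(stderr, "object %p is on stack, but NOT annotated\n",
			(void *)obj);
	return true;
}
```

In this model, the situation from the report corresponds to an object inside the stack range initialized with `annotated_onstack == false`; switching the annotation to `true`, which is what the INIT_WORK_ONSTACK() change does in the real code, makes the two agree.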

Comments

Ira Weiny Jan. 24, 2023, 12:33 a.m. UTC | #1
Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> 
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
> 
> Stacktrace for posterity:
> 
> WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
>  pci_doe_submit_task+0x5d/0xd0
>  pci_doe_discovery+0xb4/0x100
>  pcim_doe_create_mb+0x219/0x290
>  cxl_pci_probe+0x192/0x430
>  local_pci_probe+0x41/0x80
>  pci_device_probe+0xb3/0x220
>  really_probe+0xde/0x380
>  __driver_probe_device+0x78/0x170
>  driver_probe_device+0x1f/0x90
>  __driver_attach_async_helper+0x5c/0xe0
>  async_run_entry_fn+0x30/0x130
>  process_one_work+0x294/0x5b0
> 
> Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> Reported-by: Gregory Price <gregory.price@memverge.com>
> Tested-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v6.0+
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  Changes v1 -> v2:
>   * Add note in kernel-doc of pci_doe_submit_task() that pci_doe_task must
>     be allocated on the stack (Jonathan)
> 
>  drivers/pci/doe.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> index 66d9ab288646..12a6752351bf 100644
> --- a/drivers/pci/doe.c
> +++ b/drivers/pci/doe.c
> @@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
>   * task->complete will be called when the state machine is done processing this
>   * task.
>   *
> + * @task must be allocated on the stack.
> + *
>   * Excess data will be discarded.
>   *
>   * RETURNS: 0 when task has been successfully queued, -ERRNO on error
> @@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
>  		return -EIO;
>  
>  	task->doe_mb = doe_mb;
> -	INIT_WORK(&task->work, doe_statemachine_work);
> +	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
>  	queue_work(doe_mb->work_queue, &task->work);
>  	return 0;
>  }
> -- 
> 2.39.1
>
Jonathan Cameron Jan. 24, 2023, 10:32 a.m. UTC | #2
On Mon, 23 Jan 2023 16:33:36 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Lukas Wunner wrote:
> > Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> > probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> > INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> > 
> > All callers of pci_doe_submit_task() allocate the work_struct on the
> > stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> > short-term fix.
> > 
> > Stacktrace for posterity:
> > 
> > WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> > CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> >  pci_doe_submit_task+0x5d/0xd0
> >  pci_doe_discovery+0xb4/0x100
> >  pcim_doe_create_mb+0x219/0x290
> >  cxl_pci_probe+0x192/0x430
> >  local_pci_probe+0x41/0x80
> >  pci_device_probe+0xb3/0x220
> >  really_probe+0xde/0x380
> >  __driver_probe_device+0x78/0x170
> >  driver_probe_device+0x1f/0x90
> >  __driver_attach_async_helper+0x5c/0xe0
> >  async_run_entry_fn+0x30/0x130
> >  process_one_work+0x294/0x5b0
> > 
> > Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> > Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> > Reported-by: Gregory Price <gregory.price@memverge.com>
> > Tested-by: Ira Weiny <ira.weiny@intel.com>  
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>

It's an unusual requirement, but this is indeed the minimal fix
given current users.  Obviously becomes more sensible later in the
series once you make the API synchronous only.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > Cc: stable@vger.kernel.org # v6.0+
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  Changes v1 -> v2:
> >   * Add note in kernel-doc of pci_doe_submit_task() that pci_doe_task must
> >     be allocated on the stack (Jonathan)
> > 
> >  drivers/pci/doe.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> > index 66d9ab288646..12a6752351bf 100644
> > --- a/drivers/pci/doe.c
> > +++ b/drivers/pci/doe.c
> > @@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
> >   * task->complete will be called when the state machine is done processing this
> >   * task.
> >   *
> > + * @task must be allocated on the stack.
> > + *
> >   * Excess data will be discarded.
> >   *
> >   * RETURNS: 0 when task has been successfully queued, -ERRNO on error
> > @@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
> >  		return -EIO;
> >  
> >  	task->doe_mb = doe_mb;
> > -	INIT_WORK(&task->work, doe_statemachine_work);
> > +	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
> >  	queue_work(doe_mb->work_queue, &task->work);
> >  	return 0;
> >  }
> > -- 
> > 2.39.1
> >   
> 
>
Gregory Price Jan. 24, 2023, 4:18 p.m. UTC | #3
On Mon, Jan 23, 2023 at 11:11:00AM +0100, Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> 
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
>
> ... snip ...
> Reported-by: Gregory Price <gregory.price@memverge.com>

Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Lukas Wunner Jan. 25, 2023, 9:05 p.m. UTC | #4
On Tue, Jan 24, 2023 at 10:32:08AM +0000, Jonathan Cameron wrote:
> On Mon, 23 Jan 2023 16:33:36 -0800 Ira Weiny <ira.weiny@intel.com> wrote:
> > Lukas Wunner wrote:
> > > Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> > > probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> > > INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> > > 
> > > All callers of pci_doe_submit_task() allocate the work_struct on the
> > > stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> > > short-term fix.
[...]
> It's an unusual requirement, but this is indeed the minimal fix
> given current users.  Obviously becomes more sensible later in the
> series once you make the API synchronous only.

Okay, I'll amend the commit message as follows when respinning
to make more obvious what's being done here:

    The long-term fix implemented by a subsequent commit is to move to a
    synchronous API which allocates the work_struct internally in the DOE
    library.

Thanks,

Lukas
Dan Williams Feb. 10, 2023, 11:50 p.m. UTC | #5
Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> 
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
> 
> Stacktrace for posterity:
> 
> WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
>  pci_doe_submit_task+0x5d/0xd0
>  pci_doe_discovery+0xb4/0x100
>  pcim_doe_create_mb+0x219/0x290
>  cxl_pci_probe+0x192/0x430
>  local_pci_probe+0x41/0x80
>  pci_device_probe+0xb3/0x220
>  really_probe+0xde/0x380
>  __driver_probe_device+0x78/0x170
>  driver_probe_device+0x1f/0x90
>  __driver_attach_async_helper+0x5c/0xe0
>  async_run_entry_fn+0x30/0x130
>  process_one_work+0x294/0x5b0
> 
> Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> Reported-by: Gregory Price <gregory.price@memverge.com>
> Tested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v6.0+
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Patch

diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 66d9ab288646..12a6752351bf 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
  * task->complete will be called when the state machine is done processing this
  * task.
  *
+ * @task must be allocated on the stack.
+ *
  * Excess data will be discarded.
  *
  * RETURNS: 0 when task has been successfully queued, -ERRNO on error
@@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
 		return -EIO;
 
 	task->doe_mb = doe_mb;
-	INIT_WORK(&task->work, doe_statemachine_work);
+	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
 	queue_work(doe_mb->work_queue, &task->work);
 	return 0;
 }
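
The thread mentions that the long-term fix is a synchronous API which allocates the work_struct internally. The following single-threaded userspace sketch illustrates only the shape of that idea and why an on-stack work item is safe when the submitter waits for it. Everything here is hypothetical and invented for illustration (`struct work`, `struct workqueue`, `wq_queue()`, `wq_flush()`, `struct doe_task`, `doe_exchange_sync()`); it is not the DOE library's API, and the "exchange" is a placeholder computation. In the kernel the wait would typically be a completion rather than an inline flush.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a work item and a FIFO queue of such items. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

struct workqueue {
	struct work *head;
	struct work **tail;
};

static void wq_init(struct workqueue *wq)
{
	wq->head = NULL;
	wq->tail = &wq->head;
}

static void wq_queue(struct workqueue *wq, struct work *w)
{
	w->next = NULL;
	*wq->tail = w;
	wq->tail = &w->next;
}

/* Run every queued item; each item is unlinked before its handler runs. */
static void wq_flush(struct workqueue *wq)
{
	while (wq->head) {
		struct work *w = wq->head;

		wq->head = w->next;
		if (!wq->head)
			wq->tail = &wq->head;
		w->fn(w);
	}
}

/* Hypothetical task embedding its work item, as pci_doe_task embeds one. */
struct doe_task {
	struct work work;
	int request;
	int response;
	int done;
};

static void doe_work_fn(struct work *w)
{
	struct doe_task *t =
		(struct doe_task *)((char *)w - offsetof(struct doe_task, work));

	t->response = t->request + 1;	/* placeholder for the real exchange */
	t->done = 1;
}

/*
 * Synchronous wrapper: the task lives in this stack frame, and the
 * function does not return until the work has run, so the queue never
 * holds a pointer into a dead frame.
 */
static int doe_exchange_sync(struct workqueue *wq, int request, int *response)
{
	struct doe_task task = {
		.work = { .fn = doe_work_fn },
		.request = request,
	};

	wq_queue(wq, &task.work);
	wq_flush(wq);		/* stands in for waiting on completion */
	assert(task.done);
	*response = task.response;
	return 0;
}
```

Because the caller never sees the task, the "must be allocated on the stack" requirement stops being part of the interface and becomes an internal detail of the wrapper.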