cxl/mbox: Fix debug message print

Message ID	20230731-cxl-fix-clear-event-debug-print-v1-1-42c068f500d1@intel.com
State	New, archived
Headers	show Return-Path: <linux-cxl-owner@vger.kernel.org> From: Ira Weiny <ira.weiny@intel.com> Date: Mon, 31 Jul 2023 13:52:27 -0700 Subject: [PATCH] cxl/mbox: Fix debug message print MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20230731-cxl-fix-clear-event-debug-print-v1-1-42c068f500d1@intel.com> To: Davidlohr Bueso <dave@stgolabs.net>, Jonathan Cameron <jonathan.cameron@huawei.com>, Dave Jiang <dave.jiang@intel.com>, Alison Schofield <alison.schofield@intel.com>, Vishal Verma <vishal.l.verma@intel.com>, Dan Williams <dan.j.williams@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>, linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org, Ira Weiny <ira.weiny@intel.com> Precedence: bulk
Series	cxl/mbox: Fix debug message print \| expand cxl/mbox: Fix debug message print

Ira Weiny July 31, 2023, 8:52 p.m. UTC

The handle value used to report an event being cleared by dev_dbg() is
incorrect due to a post increment of the payload handle index.

Delay the increment of the index until after the print.  Also add the
debugging for event processing which was useful in finding this bug.

To: Davidlohr Bueso <dave@stgolabs.net>
To: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
To: Alison Schofield <alison.schofield@intel.com>
To: Vishal Verma <vishal.l.verma@intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-cxl@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
NOTE: This does fix a bug in the patch referenced below.  However, I
don't think that warrants back porting because this is only a debug
print.

Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
---
 drivers/cxl/core/mbox.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)


---
base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c

Best regards,

Dave Jiang July 31, 2023, 9:53 p.m. UTC | #1

On 7/31/23 13:52, Ira Weiny wrote:
> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
> 
> Delay the increment of the index until after the print.  Also add the
> debugging for event processing which was useful in finding this bug.
> 
> To: Davidlohr Bueso <dave@stgolabs.net>
> To: Jonathan Cameron <jonathan.cameron@huawei.com>
> To: Dave Jiang <dave.jiang@intel.com>
> To: Alison Schofield <alison.schofield@intel.com>
> To: Vishal Verma <vishal.l.verma@intel.com>
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> NOTE: This does fix a bug in the patch referenced below.  However, I
> don't think that warrants back porting because this is only a debug
> print.
> 
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
>   drivers/cxl/core/mbox.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>   	 */
>   	i = 0;
>   	for (cnt = 0; cnt < total; cnt++) {
> -		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> +		payload->handles[i] = get_pl->records[cnt].hdr.handle;
>   		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
>   			le16_to_cpu(payload->handles[i]));
> +		i++;
>   
>   		if (i == max_handles) {
>   			payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>   		if (!nr_rec)
>   			break;
>   
> -		for (i = 0; i < nr_rec; i++)
> +		for (i = 0; i < nr_rec; i++) {
> +			dev_dbg(dev, "Event log %d: processing handle %u\n",
> +				type,
> +				le16_to_cpu(payload->records[i].hdr.handle));
>   			cxl_event_trace_record(cxlmd, type,
>   					       &payload->records[i]);
> +		}
>   
>   		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>   			trace_cxl_overflow(cxlmd, type, payload);
> 
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
> 
> Best regards,

Dan Williams July 31, 2023, 10:12 p.m. UTC | #2

Ira Weiny wrote:
> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
> 
> Delay the increment of the index until after the print.  Also add the
> debugging for event processing which was useful in finding this bug.
> 
> To: Davidlohr Bueso <dave@stgolabs.net>
> To: Jonathan Cameron <jonathan.cameron@huawei.com>
> To: Dave Jiang <dave.jiang@intel.com>
> To: Alison Schofield <alison.schofield@intel.com>
> To: Vishal Verma <vishal.l.verma@intel.com>
> To: Dan Williams <dan.j.williams@intel.com>

Is this some new process recommendation to use "To:", I would only
expect Cc: for maintainers.

> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Jonathan gets listed twice because?

> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

I assume this is because b4 takes its address from the patch itself?

It just feels a bit too noisy.

> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
> NOTE: This does fix a bug in the patch referenced below.  However, I
> don't think that warrants back porting because this is only a debug
> print.

I don't think that's a reason not to flag for backport. Smaller things
than this have been backported, and if it saves someone getting confused
in the field its worth it.

> 
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
>  drivers/cxl/core/mbox.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>  	 */
>  	i = 0;
>  	for (cnt = 0; cnt < total; cnt++) {
> -		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> +		payload->handles[i] = get_pl->records[cnt].hdr.handle;
>  		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
>  			le16_to_cpu(payload->handles[i]));
> +		i++;
>  
>  		if (i == max_handles) {
>  			payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  		if (!nr_rec)
>  			break;
>  
> -		for (i = 0; i < nr_rec; i++)
> +		for (i = 0; i < nr_rec; i++) {
> +			dev_dbg(dev, "Event log %d: processing handle %u\n",
> +				type,
> +				le16_to_cpu(payload->records[i].hdr.handle));
>  			cxl_event_trace_record(cxlmd, type,
>  					       &payload->records[i]);

This looks unrelated to the fix.

> +		}
>  
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> 
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
> 
> Best regards,
> -- 
> Ira Weiny <ira.weiny@intel.com>
>

Verma, Vishal L July 31, 2023, 10:25 p.m. UTC | #3

On Mon, 2023-07-31 at 15:12 -0700, Dan Williams wrote:
> Ira Weiny wrote:
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> > 
> > Delay the increment of the index until after the print.  Also add the
> > debugging for event processing which was useful in finding this bug.
> > 
> > To: Davidlohr Bueso <dave@stgolabs.net>
> > To: Jonathan Cameron <jonathan.cameron@huawei.com>
> > To: Dave Jiang <dave.jiang@intel.com>
> > To: Alison Schofield <alison.schofield@intel.com>
> > To: Vishal Verma <vishal.l.verma@intel.com>
> > To: Dan Williams <dan.j.williams@intel.com>
> 
> Is this some new process recommendation to use "To:", I would only
> expect Cc: for maintainers.
> 
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Jonathan gets listed twice because?
> 
> > Cc: linux-cxl@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> I assume this is because b4 takes its address from the patch itself?
> 
> It just feels a bit too noisy.

Yes - but the better way to do this is to add the To: and Cc: lines in
b4 prep --edit-cover, even for a single patch branch. That way b4 uses
them to figure out where to send/cc, but other doesn't add them to
commit messages (or even the cover letter in case there is one).

Alison Schofield Aug. 1, 2023, 1:24 a.m. UTC | #4

On Mon, Jul 31, 2023 at 01:52:27PM -0700, Ira Weiny wrote:

Looks like this is rolling for the distro list, so..

Please be more specific than this:
[PATCH] cxl/mbox: Fix debug message print

Something like this is more useful to folks scanning the one-liners:
[PATCH] cxl/mbox: Use correct handle in events debug message

> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
> 
> Delay the increment of the index until after the print.  Also add the
> debugging for event processing which was useful in finding this bug.
> 

"Also" always smells like something that should be a separate patch.

I guess you could rewrite the commit message and keep it in one patch:
[PATCH] cxl/mbox: Improve event dev_dbg() messages

Alison


> To: Davidlohr Bueso <dave@stgolabs.net>
> To: Jonathan Cameron <jonathan.cameron@huawei.com>
> To: Dave Jiang <dave.jiang@intel.com>
> To: Alison Schofield <alison.schofield@intel.com>
> To: Vishal Verma <vishal.l.verma@intel.com>
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
> NOTE: This does fix a bug in the patch referenced below.  However, I
> don't think that warrants back porting because this is only a debug
> print.
> 
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
>  drivers/cxl/core/mbox.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>  	 */
>  	i = 0;
>  	for (cnt = 0; cnt < total; cnt++) {
> -		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> +		payload->handles[i] = get_pl->records[cnt].hdr.handle;
>  		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
>  			le16_to_cpu(payload->handles[i]));
> +		i++;
>  
>  		if (i == max_handles) {
>  			payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  		if (!nr_rec)
>  			break;
>  
> -		for (i = 0; i < nr_rec; i++)
> +		for (i = 0; i < nr_rec; i++) {
> +			dev_dbg(dev, "Event log %d: processing handle %u\n",
> +				type,
> +				le16_to_cpu(payload->records[i].hdr.handle));
>  			cxl_event_trace_record(cxlmd, type,
>  					       &payload->records[i]);
> +		}
>  
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> 
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
> 
> Best regards,
> -- 
> Ira Weiny <ira.weiny@intel.com>
>

Ira Weiny Aug. 1, 2023, 6:45 p.m. UTC | #5

Dan Williams wrote:
> Ira Weiny wrote:
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> > 
> > Delay the increment of the index until after the print.  Also add the
> > debugging for event processing which was useful in finding this bug.
> > 
> > To: Davidlohr Bueso <dave@stgolabs.net>
> > To: Jonathan Cameron <jonathan.cameron@huawei.com>
> > To: Dave Jiang <dave.jiang@intel.com>
> > To: Alison Schofield <alison.schofield@intel.com>
> > To: Vishal Verma <vishal.l.verma@intel.com>
> > To: Dan Williams <dan.j.williams@intel.com>
> 
> Is this some new process recommendation to use "To:", I would only
> expect Cc: for maintainers.

It is just what b4 does because all those folks are listed as maintainers.

> 
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Jonathan gets listed twice because?

That is odd.  I did not notice that.

> 
> > Cc: linux-cxl@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> I assume this is because b4 takes its address from the patch itself?
> 
> It just feels a bit too noisy.

Yea b4 generated the list and I just copied it here.  Vishal is correct
that this is automatically put in the tracking cover letter commit b4 uses
but I copied it here because it was not a series.  I'll skip this in the
future.

> 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> > NOTE: This does fix a bug in the patch referenced below.  However, I
> > don't think that warrants back porting because this is only a debug
> > print.
> 
> I don't think that's a reason not to flag for backport. Smaller things
> than this have been backported, and if it saves someone getting confused
> in the field its worth it.

Ok.

> 
> > 
> > Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> > ---
> >  drivers/cxl/core/mbox.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index d6d067fbee97..f052d5f174ee 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> >  	 */
> >  	i = 0;
> >  	for (cnt = 0; cnt < total; cnt++) {
> > -		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> > +		payload->handles[i] = get_pl->records[cnt].hdr.handle;
> >  		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> >  			le16_to_cpu(payload->handles[i]));
> > +		i++;
> >  
> >  		if (i == max_handles) {
> >  			payload->nr_recs = i;
> > @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> >  		if (!nr_rec)
> >  			break;
> >  
> > -		for (i = 0; i < nr_rec; i++)
> > +		for (i = 0; i < nr_rec; i++) {
> > +			dev_dbg(dev, "Event log %d: processing handle %u\n",
> > +				type,
> > +				le16_to_cpu(payload->records[i].hdr.handle));
> >  			cxl_event_trace_record(cxlmd, type,
> >  					       &payload->records[i]);
> 
> This looks unrelated to the fix.

It helped me to debug the issue.  Since you want this marked as a fix for
backport I will split this out.  Potentially just squash it into some
other patch or make it stand alone.

Ira

Ira Weiny Aug. 1, 2023, 6:48 p.m. UTC | #6

Alison Schofield wrote:
> On Mon, Jul 31, 2023 at 01:52:27PM -0700, Ira Weiny wrote:
> 
> Looks like this is rolling for the distro list, so..
> 
> Please be more specific than this:
> [PATCH] cxl/mbox: Fix debug message print
> 
> Something like this is more useful to folks scanning the one-liners:
> [PATCH] cxl/mbox: Use correct handle in events debug message

That is better thanks!

> 
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> > 
> > Delay the increment of the index until after the print.  Also add the
> > debugging for event processing which was useful in finding this bug.
> > 
> 
> "Also" always smells like something that should be a separate patch.

It could be but I did not see this as something which warranted its own
patch as it lead me to fix the bug.  Since Dan feels the fix should be
backported that changes my attitude.

> 
> I guess you could rewrite the commit message and keep it in one patch:
> [PATCH] cxl/mbox: Improve event dev_dbg() messages

Except that misses that this is a bug fix.

Ira

cxl/mbox: Fix debug message print

Commit Message

Comments

Patch