[v4,27/32] cxlflash: Fix to prevent stale AFU RRQ

Message ID	1443223144-9962-1-git-send-email-mrochs@linux.vnet.ibm.com (mailing list archive)
State	Superseded, archived
Headers	show Return-Path: <linux-scsi-owner@kernel.org> Gateway: Authorized Use Only! Violators will be prosecuted for <linux-scsi@vger.kernel.org> from <mrochs@linux.vnet.ibm.com>; Fri, 25 Sep 2015 19:19:47 -0400 Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 25 Sep 2015 19:19:46 -0400 From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com> To: linux-scsi@vger.kernel.org, James Bottomley <James.Bottomley@HansenPartnership.com>, "Nicholas A. Bellinger" <nab@linux-iscsi.org>, Brian King <brking@linux.vnet.ibm.com>, Ian Munsie <imunsie@au1.ibm.com>, Daniel Axtens <dja@ozlabs.au.ibm.com>, Andrew Donnellan <andrew.donnellan@au1.ibm.com>, Tomas Henzl <thenzl@redhat.com>, David Laight <David.Laight@ACULAB.COM> Cc: Michael Neuling <mikey@neuling.org>, linuxppc-dev@lists.ozlabs.org, "Manoj N. Kumar" <manoj@linux.vnet.ibm.com> Subject: [PATCH v4 27/32] cxlflash: Fix to prevent stale AFU RRQ Date: Fri, 25 Sep 2015 18:19:04 -0500 Message-Id: <1443223144-9962-1-git-send-email-mrochs@linux.vnet.ibm.com> In-Reply-To: <1443222593-8828-1-git-send-email-mrochs@linux.vnet.ibm.com> References: <1443222593-8828-1-git-send-email-mrochs@linux.vnet.ibm.com> Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk

Message ID

1443223144-9962-1-git-send-email-mrochs@linux.vnet.ibm.com (mailing list archive)

State

Superseded, archived

Headers

From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Nicholas A. Bellinger" <nab@linux-iscsi.org>,
	Brian King <brking@linux.vnet.ibm.com>, Ian Munsie <imunsie@au1.ibm.com>,
	Daniel Axtens <dja@ozlabs.au.ibm.com>,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	Tomas Henzl <thenzl@redhat.com>, David Laight <David.Laight@ACULAB.COM>
Cc: Michael Neuling <mikey@neuling.org>, linuxppc-dev@lists.ozlabs.org,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>
Subject: [PATCH v4 27/32] cxlflash: Fix to prevent stale AFU RRQ
Date: Fri, 25 Sep 2015 18:19:04 -0500
Message-Id: <1443223144-9962-1-git-send-email-mrochs@linux.vnet.ibm.com>
In-Reply-To: <1443222593-8828-1-git-send-email-mrochs@linux.vnet.ibm.com>
References: <1443222593-8828-1-git-send-email-mrochs@linux.vnet.ibm.com>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk

Commit Message

Matthew R. Ochs Sept. 25, 2015, 11:19 p.m. UTC

Following an adapter reset, the AFU RRQ that resides in host memory
holds stale data. This can lead to a condition where the RRQ interrupt
handler tries to process stale entries and/or endlessly loops due to an
out of sync generation bit.

To fix, the AFU RRQ in host memory needs to be cleared after each reset.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Daniel Axtens Sept. 29, 2015, 1:36 a.m. UTC | #1

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com> writes:

> Following an adapter reset, the AFU RRQ that resides in host memory
> holds stale data. This can lead to a condition where the RRQ interrupt
> handler tries to process stale entries and/or endlessly loops due to an
> out of sync generation bit.
>
> To fix, the AFU RRQ in host memory needs to be cleared after each reset.

This looks good. Do you need anything to bail out of cxlflash_rrq_irq if
the data goes stale or to all Fs while that function is running?

Daniel

>
> Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
> Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/main.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 24aedfb..ab11ee6 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -1598,6 +1598,9 @@ static int start_afu(struct cxlflash_cfg *cfg)
>  
>  	init_pcr(cfg);
>  
> +	/* After an AFU reset, RRQ entries are stale, clear them */
> +	memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry));
> +
>  	/* Initialize RRQ pointers */
>  	afu->hrrq_start = &afu->rrq_entry[0];
>  	afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];
> -- 
> 2.1.0
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJWCesgAAoJEPC3R3P2I92FE4kP+wYTEdCDP50RfKZPNCpoY//T
5tDNV58W1RasTrJLVGrpeGBboI4WTzRlKYAYX1iLC6ZSkX1rQS5rBqgImQQAg3i5
N9qoAdNCch06UVO7psceptT9653xt0OzitHW/tmB8pOMII65UIm8AtydSs2+J0e8
htw9Jnw098jls8NJtzz32T48QoM+kV1w7vfJJN36Z6FjL36KJLQCAL2/gLvv+3Zt
2ueeP6MM0q0iFxnQJwPFdf3DebggbMqeVbNorNAyCSkNNoUh2QR8NMHeFS8OoXCl
nrzjFoeSFDXh4c4bRzVoGRalIFzP+VY9guk9d/VtAt8ZkkrIyfI3K9lJTgDq2gqj
ShnT/WB4lskUVc/FjdkyIc+Nyg6pPkm4OftCnN7sf7CSd2+myO/d6S1f1qJlCn7s
3KBLXEFrKXKHl9TVDqkZSZVvyJOKmxoMqT98I00CWhXFkRgO126tnqsMsmrqgzxu
QxZQSo+/Oi7av6YuVtTP2c9SeYbGBfhOMOH5Dx7z9VY57rWvE5CVsduYJoG2D59P
Td5ToL1H3UOt8G1vzeqmkb7AWK882py2OLS7WG0PUDRDa2dfE6+8NOCJmuX9tW2x
/NnLlInDTiVTpz3iiJpe9Efu3SkL2KeMEBxi4jdt31hFO0BPeFWnZHKjwOa+uQ/w
OSaHCUZEZ6LRdbeJ1GLk
=O6Zk
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Matthew R. Ochs Sept. 29, 2015, 8:22 p.m. UTC | #2

> On Sep 28, 2015, at 8:36 PM, Daniel Axtens <dja@axtens.net> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com> writes:
> 
>> Following an adapter reset, the AFU RRQ that resides in host memory
>> holds stale data. This can lead to a condition where the RRQ interrupt
>> handler tries to process stale entries and/or endlessly loops due to an
>> out of sync generation bit.
>> 
>> To fix, the AFU RRQ in host memory needs to be cleared after each reset.
> 
> This looks good. Do you need anything to bail out of cxlflash_rrq_irq if
> the data goes stale or to all Fs while that function is running?

We're not performing an MMIO here, so I'm not sure how the all Fs check
would apply. We're also protected fairly well by the generation bit. I suppose
we could look at adding some type of 'max iterations' count to protect against
a runaway handler but that would be in a future patch.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Daniel Axtens Sept. 30, 2015, 11:51 p.m. UTC | #3

>>> Following an adapter reset, the AFU RRQ that resides in host memory
>>> holds stale data. This can lead to a condition where the RRQ interrupt
>>> handler tries to process stale entries and/or endlessly loops due to an
>>> out of sync generation bit.
>>> 
>>> To fix, the AFU RRQ in host memory needs to be cleared after each reset.
>> 
>> This looks good. Do you need anything to bail out of cxlflash_rrq_irq if
>> the data goes stale or to all Fs while that function is running?
>
> We're not performing an MMIO here, so I'm not sure how the all Fs check
> would apply. We're also protected fairly well by the generation bit. I suppose
> we could look at adding some type of 'max iterations' count to protect against
> a runaway handler but that would be in a future patch.

Ah, right you are. I had confused all Fs with UEs.

Reviewed-by: Daniel Axtens <dja@axtens.net>

Regards,
Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 24aedfb..ab11ee6 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -1598,6 +1598,9 @@  static int start_afu(struct cxlflash_cfg *cfg)
 
 	init_pcr(cfg);
 
+	/* After an AFU reset, RRQ entries are stale, clear them */
+	memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry));
+
 	/* Initialize RRQ pointers */
 	afu->hrrq_start = &afu->rrq_entry[0];
 	afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];

[v4,27/32] cxlflash: Fix to prevent stale AFU RRQ

Commit Message

Comments

Patch