Message ID | 1443223144-9962-1-git-send-email-mrochs@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com> writes: > Following an adapter reset, the AFU RRQ that resides in host memory > holds stale data. This can lead to a condition where the RRQ interrupt > handler tries to process stale entries and/or endlessly loops due to an > out of sync generation bit. > > To fix, the AFU RRQ in host memory needs to be cleared after each reset. This looks good. Do you need anything to bail out of cxlflash_rrq_irq if the data goes stale or to all Fs while that function is running? Daniel > > Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> > Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> > Reviewed-by: Brian King <brking@linux.vnet.ibm.com> > --- > drivers/scsi/cxlflash/main.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c > index 24aedfb..ab11ee6 100644 > --- a/drivers/scsi/cxlflash/main.c > +++ b/drivers/scsi/cxlflash/main.c > @@ -1598,6 +1598,9 @@ static int start_afu(struct cxlflash_cfg *cfg) > > init_pcr(cfg); > > + /* After an AFU reset, RRQ entries are stale, clear them */ > + memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry)); > + > /* Initialize RRQ pointers */ > afu->hrrq_start = &afu->rrq_entry[0]; > afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1]; > -- > 2.1.0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJWCesgAAoJEPC3R3P2I92FE4kP+wYTEdCDP50RfKZPNCpoY//T 5tDNV58W1RasTrJLVGrpeGBboI4WTzRlKYAYX1iLC6ZSkX1rQS5rBqgImQQAg3i5 N9qoAdNCch06UVO7psceptT9653xt0OzitHW/tmB8pOMII65UIm8AtydSs2+J0e8 htw9Jnw098jls8NJtzz32T48QoM+kV1w7vfJJN36Z6FjL36KJLQCAL2/gLvv+3Zt 2ueeP6MM0q0iFxnQJwPFdf3DebggbMqeVbNorNAyCSkNNoUh2QR8NMHeFS8OoXCl nrzjFoeSFDXh4c4bRzVoGRalIFzP+VY9guk9d/VtAt8ZkkrIyfI3K9lJTgDq2gqj ShnT/WB4lskUVc/FjdkyIc+Nyg6pPkm4OftCnN7sf7CSd2+myO/d6S1f1qJlCn7s 3KBLXEFrKXKHl9TVDqkZSZVvyJOKmxoMqT98I00CWhXFkRgO126tnqsMsmrqgzxu QxZQSo+/Oi7av6YuVtTP2c9SeYbGBfhOMOH5Dx7z9VY57rWvE5CVsduYJoG2D59P Td5ToL1H3UOt8G1vzeqmkb7AWK882py2OLS7WG0PUDRDa2dfE6+8NOCJmuX9tW2x /NnLlInDTiVTpz3iiJpe9Efu3SkL2KeMEBxi4jdt31hFO0BPeFWnZHKjwOa+uQ/w OSaHCUZEZ6LRdbeJ1GLk =O6Zk -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Sep 28, 2015, at 8:36 PM, Daniel Axtens <dja@axtens.net> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com> writes: > >> Following an adapter reset, the AFU RRQ that resides in host memory >> holds stale data. This can lead to a condition where the RRQ interrupt >> handler tries to process stale entries and/or endlessly loops due to an >> out of sync generation bit. >> >> To fix, the AFU RRQ in host memory needs to be cleared after each reset. > > This looks good. Do you need anything to bail out of cxlflash_rrq_irq if > the data goes stale or to all Fs while that function is running? We're not performing an MMIO here, so I'm not sure how the all Fs check would apply. We're also protected fairly well by the generation bit. I suppose we could look at adding some type of 'max iterations' count to protect against a runaway handler but that would be in a future patch. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> Following an adapter reset, the AFU RRQ that resides in host memory >>> holds stale data. This can lead to a condition where the RRQ interrupt >>> handler tries to process stale entries and/or endlessly loops due to an >>> out of sync generation bit. >>> >>> To fix, the AFU RRQ in host memory needs to be cleared after each reset. >> >> This looks good. Do you need anything to bail out of cxlflash_rrq_irq if >> the data goes stale or to all Fs while that function is running? > > We're not performing an MMIO here, so I'm not sure how the all Fs check > would apply. We're also protected fairly well by the generation bit. I suppose > we could look at adding some type of 'max iterations' count to protect against > a runaway handler but that would be in a future patch. Ah, right you are. I had confused all Fs with UEs. Reviewed-by: Daniel Axtens <dja@axtens.net> Regards, Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index 24aedfb..ab11ee6 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -1598,6 +1598,9 @@ static int start_afu(struct cxlflash_cfg *cfg) init_pcr(cfg); + /* After an AFU reset, RRQ entries are stale, clear them */ + memset(&afu->rrq_entry, 0, sizeof(afu->rrq_entry)); + /* Initialize RRQ pointers */ afu->hrrq_start = &afu->rrq_entry[0]; afu->hrrq_end = &afu->rrq_entry[NUM_RRQ_ENTRY - 1];