Message ID | 20200124151607.31375-1-johannes.thumshirn@wdc.com (mailing list archive) |
---|---|
State | Rejected |
Series | scsi: don't panic host on invalid sgtable count |
On Sat, 2020-01-25 at 00:16 +0900, Johannes Thumshirn wrote:
> If we have an invalid number of entries mapped an sg table, there's
> no need to panic the host, instead we can spit out a warning in dmesg
> and gracefully return an I/O error.

Can we? This is an assertion failure which should never happen. If it
does, it's likely an indicator that a system has gone seriously out of
spec for some reason, like internal compromise, CPU/Memory failure or
something else.

The HA view is that panic is appropriate for conditions that should
never happen because it helps the machine fail fast.

James

> While we're at it fix a trailing whitespace in the comment above.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>  drivers/scsi/scsi_lib.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 3e7a45d0daca..9bddf54e3def 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -992,12 +992,15 @@ static blk_status_t scsi_init_sgtable(struct request *req,
>  					SCSI_INLINE_SG_CNT)))
>  		return BLK_STS_RESOURCE;
>
> -	/* 
> +	/*
>  	 * Next, walk the list, and fill in the addresses and sizes of
>  	 * each segment.
>  	 */
>  	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
> -	BUG_ON(count > sdb->table.nents);
> +	if (WARN_ON_ONCE(count > sdb->table.nents)) {
> +		sg_free_table_chained(&sdb->table, SCSI_INLINE_SG_CNT);
> +		return BLK_STS_IOERR;
> +	}
>  	sdb->table.nents = count;
>  	sdb->length = blk_rq_payload_bytes(req);
>  	return BLK_STS_OK;
On 24/01/2020 16:23, James Bottomley wrote:
> On Sat, 2020-01-25 at 00:16 +0900, Johannes Thumshirn wrote:
>> If we have an invalid number of entries mapped an sg table, there's
>> no need to panic the host, instead we can spit out a warning in dmesg
>> and gracefully return an I/O error.
>
> Can we? This is an assertion failure which should never happen. If it
> does, it's likely an indicator that a system has gone seriously out of
> spec for some reason, like internal compromise, CPU/Memory failure or
> something else.
>
> The HA view is that panic is appropriate for conditions that should
> never happen because it helps the machine fail fast.

Yes but an HA setup could still set panic_on_oops and retain the fail
fast portion.

Anyway it's just something that popped up when I was looking up
something unrelated in scsi_lib.c. It's not that I'm married to this
cleanup.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3e7a45d0daca..9bddf54e3def 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -992,12 +992,15 @@ static blk_status_t scsi_init_sgtable(struct request *req,
 					SCSI_INLINE_SG_CNT)))
 		return BLK_STS_RESOURCE;

-	/* 
+	/*
 	 * Next, walk the list, and fill in the addresses and sizes of
 	 * each segment.
 	 */
 	count = blk_rq_map_sg(req->q, req, sdb->table.sgl);
-	BUG_ON(count > sdb->table.nents);
+	if (WARN_ON_ONCE(count > sdb->table.nents)) {
+		sg_free_table_chained(&sdb->table, SCSI_INLINE_SG_CNT);
+		return BLK_STS_IOERR;
+	}
 	sdb->table.nents = count;
 	sdb->length = blk_rq_payload_bytes(req);
 	return BLK_STS_OK;
If we have an invalid number of entries mapped an sg table, there's
no need to panic the host, instead we can spit out a warning in dmesg
and gracefully return an I/O error.

While we're at it fix a trailing whitespace in the comment above.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 drivers/scsi/scsi_lib.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
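
For readers who want to try the proposed behaviour outside the kernel, here is a minimal userspace sketch of the pattern in the patch: warn once and hand back an error instead of aborting. Everything in it is a hypothetical stand-in. `sim_table`, `sim_init_table()` and the `SIM_STS_*` values are not the kernel's `struct sg_table`, `scsi_init_sgtable()` or `blk_status_t`, and clearing `allocated` only imitates the call to `sg_free_table_chained()`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes standing in for BLK_STS_OK / BLK_STS_IOERR. */
enum sim_status { SIM_STS_OK, SIM_STS_IOERR };

/* Hypothetical, heavily simplified stand-in for an allocated sg table. */
struct sim_table {
	unsigned int nents;	/* number of entries the table can hold */
	bool allocated;		/* false once the table has been released */
};

/*
 * 'mapped' plays the role of the count returned by blk_rq_map_sg().
 * If it exceeds the table size, print a warning the first time only,
 * release the table and report an I/O error instead of aborting.
 */
static enum sim_status sim_init_table(struct sim_table *table,
				      unsigned int mapped)
{
	static bool warned;	/* imitates the "once" in WARN_ON_ONCE */

	if (mapped > table->nents) {
		if (!warned) {
			warned = true;
			fprintf(stderr,
				"warning: mapped %u entries into a table of %u\n",
				mapped, table->nents);
		}
		/* Stand-in for sg_free_table_chained() in the patch. */
		table->allocated = false;
		table->nents = 0;
		return SIM_STS_IOERR;
	}

	table->nents = mapped;
	return SIM_STS_OK;
}
```

Under the existing `BUG_ON()` semantics the oversized case would stop execution at that point instead of returning, which is the fail-fast behaviour argued for in the reply above.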