Message ID | 20200309181416.10665-1-emilne@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | scsi: avoid repetitive logging of device offline messages | expand |
On Mon, 2020-03-09 at 14:14 -0400, Ewan D. Milne wrote: > Large queues of I/O to offline devices that are eventually > submitted when devices are unblocked result in a many repeated > "rejecting I/O to offline device" messages. These messages > can fill up the dmesg buffer in crash dumps so no useful > prior messages remain. In addition, if a serial console > is used, the flood of messages can cause a hard lockup in > the console code. > > Introduce a flag indicating the message has already been logged > for the device, and reset the flag when scsi_device_set_state() > changes the device state. > > Signed-off-by: Ewan D. Milne <emilne@redhat.com> > --- > drivers/scsi/scsi_lib.c | 8 ++++++-- > include/scsi/scsi_device.h | 2 ++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 610ee41..d3a6d97 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device > *sdev, struct request *req) > * commands. The device must be brought online > * before trying any recovery commands. > */ > - sdev_printk(KERN_ERR, sdev, > - "rejecting I/O to offline device\n"); > + if (!sdev->offline_already) { > + sdev->offline_already = 1; > + sdev_printk(KERN_ERR, sdev, > + "rejecting I/O to offline > device\n"); > + } Offline->online is a legal transition, so you'll need to clear this flag when that happens so we get another offline message if it goes offline again. James
On 3/9/20 11:14 AM, Ewan D. Milne wrote: > Large queues of I/O to offline devices that are eventually > submitted when devices are unblocked result in a many repeated > "rejecting I/O to offline device" messages. These messages > can fill up the dmesg buffer in crash dumps so no useful > prior messages remain. In addition, if a serial console > is used, the flood of messages can cause a hard lockup in > the console code. > > Introduce a flag indicating the message has already been logged > for the device, and reset the flag when scsi_device_set_state() > changes the device state. > > Signed-off-by: Ewan D. Milne <emilne@redhat.com> > --- > drivers/scsi/scsi_lib.c | 8 ++++++-- > include/scsi/scsi_device.h | 2 ++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 610ee41..d3a6d97 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req) > * commands. The device must be brought online > * before trying any recovery commands. > */ > - sdev_printk(KERN_ERR, sdev, > - "rejecting I/O to offline device\n"); > + if (!sdev->offline_already) { > + sdev->offline_already = 1; > + sdev_printk(KERN_ERR, sdev, > + "rejecting I/O to offline device\n"); > + } > return BLK_STS_IOERR; > case SDEV_DEL: > /* > @@ -2340,6 +2343,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state) > break; > > } > + sdev->offline_already = 0; > sdev->sdev_state = state; > return 0; > > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > index f8312a3..72987a0 100644 > --- a/include/scsi/scsi_device.h > +++ b/include/scsi/scsi_device.h > @@ -204,6 +204,8 @@ struct scsi_device { > unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */ > unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device > * creation time */ > + unsigned offline_already:1; /* Device offline message logged */ > + > atomic_t disk_events_disable_depth; /* disable depth for disk events */ > > DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */ Bitfields are troublesome in multithreaded software. Has it been considered to use rate-limiting instead of introducing a new bitfield member? Thanks, Bart.
On Mon, 2020-03-09 at 12:05 -0700, James Bottomley wrote: > On Mon, 2020-03-09 at 14:14 -0400, Ewan D. Milne wrote: > > Large queues of I/O to offline devices that are eventually > > submitted when devices are unblocked result in a many repeated > > "rejecting I/O to offline device" messages. These messages > > can fill up the dmesg buffer in crash dumps so no useful > > prior messages remain. In addition, if a serial console > > is used, the flood of messages can cause a hard lockup in > > the console code. > > > > Introduce a flag indicating the message has already been logged > > for the device, and reset the flag when scsi_device_set_state() > > changes the device state. > > > > Signed-off-by: Ewan D. Milne <emilne@redhat.com> > > --- > > drivers/scsi/scsi_lib.c | 8 ++++++-- > > include/scsi/scsi_device.h | 2 ++ > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > index 610ee41..d3a6d97 100644 > > --- a/drivers/scsi/scsi_lib.c > > +++ b/drivers/scsi/scsi_lib.c > > @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device > > *sdev, struct request *req) > > * commands. The device must be brought online > > * before trying any recovery commands. > > */ > > - sdev_printk(KERN_ERR, sdev, > > - "rejecting I/O to offline device\n"); > > + if (!sdev->offline_already) { > > + sdev->offline_already = 1; > > + sdev_printk(KERN_ERR, sdev, > > + "rejecting I/O to offline > > device\n"); > > + } > > Offline->online is a legal transition, so you'll need to clear this > flag when that happens so we get another offline message if it goes > offline again. > > James > That's what resetting the flag in scsi_device_set_state() does. The only thing that worries me is if some driver manipulates the sdev->sdev_state value directly. -Ewan
On Mon, 2020-03-09 at 12:36 -0700, Bart Van Assche wrote: > On 3/9/20 11:14 AM, Ewan D. Milne wrote: > > Large queues of I/O to offline devices that are eventually > > submitted when devices are unblocked result in a many repeated > > "rejecting I/O to offline device" messages. These messages > > can fill up the dmesg buffer in crash dumps so no useful > > prior messages remain. In addition, if a serial console > > is used, the flood of messages can cause a hard lockup in > > the console code. > > > > Introduce a flag indicating the message has already been logged > > for the device, and reset the flag when scsi_device_set_state() > > changes the device state. > > > > Signed-off-by: Ewan D. Milne <emilne@redhat.com> > > --- > > drivers/scsi/scsi_lib.c | 8 ++++++-- > > include/scsi/scsi_device.h | 2 ++ > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > index 610ee41..d3a6d97 100644 > > --- a/drivers/scsi/scsi_lib.c > > +++ b/drivers/scsi/scsi_lib.c > > @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req) > > * commands. The device must be brought online > > * before trying any recovery commands. > > */ > > - sdev_printk(KERN_ERR, sdev, > > - "rejecting I/O to offline device\n"); > > + if (!sdev->offline_already) { > > + sdev->offline_already = 1; > > + sdev_printk(KERN_ERR, sdev, > > + "rejecting I/O to offline device\n"); > > + } > > return BLK_STS_IOERR; > > case SDEV_DEL: > > /* > > @@ -2340,6 +2343,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state) > > break; > > > > } > > + sdev->offline_already = 0; > > sdev->sdev_state = state; > > return 0; > > > > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > > index f8312a3..72987a0 100644 > > --- a/include/scsi/scsi_device.h > > +++ b/include/scsi/scsi_device.h > > @@ -204,6 +204,8 @@ struct scsi_device { > > unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */ > > unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device > > * creation time */ > > + unsigned offline_already:1; /* Device offline message logged */ > > + > > atomic_t disk_events_disable_depth; /* disable depth for disk events */ > > > > DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */ > > Bitfields are troublesome in multithreaded software. Has it been > considered to use rate-limiting instead of introducing a new bitfield > member? > > Thanks, > > Bart. > I did but printk_ratelimited() does not do what is desired here. What we want is only a single message per-device. If we ratelimit the message instance itself we lose information in the log about which devices were affected (which makes debugging issues with multipath I/O much harder). The only purpose of the flag is to try to suppress duplicate messages, in the common case it is a single thread submitting the queued I/O which is going to get rejected. If multiple threads submit I/O there might be duplicated messages but that is not all that critical. Hence the lack of locking on the flag. -Ewan
On Mon, 2020-03-09 at 14:14 -0400, Ewan D. Milne wrote: > Large queues of I/O to offline devices that are eventually > submitted when devices are unblocked result in a many repeated > "rejecting I/O to offline device" messages. These messages > can fill up the dmesg buffer in crash dumps so no useful > prior messages remain. In addition, if a serial console > is used, the flood of messages can cause a hard lockup in > the console code. > > Introduce a flag indicating the message has already been logged > for the device, and reset the flag when scsi_device_set_state() > changes the device state. > > Signed-off-by: Ewan D. Milne <emilne@redhat.com> > --- > drivers/scsi/scsi_lib.c | 8 ++++++-- > include/scsi/scsi_device.h | 2 ++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 610ee41..d3a6d97 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device > *sdev, struct request *req) > * commands. The device must be brought online > * before trying any recovery commands. > */ > - sdev_printk(KERN_ERR, sdev, > - "rejecting I/O to offline device\n"); > + if (!sdev->offline_already) { > + sdev->offline_already = 1; > + sdev_printk(KERN_ERR, sdev, > + "rejecting I/O to offline > device\n"); > + } > return BLK_STS_IOERR; > case SDEV_DEL: > /* > @@ -2340,6 +2343,7 @@ scsi_device_set_state(struct scsi_device *sdev, > enum scsi_device_state state) > break; > > } > + sdev->offline_already = 0; > sdev->sdev_state = state; > return 0; > > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > index f8312a3..72987a0 100644 > --- a/include/scsi/scsi_device.h > +++ b/include/scsi/scsi_device.h > @@ -204,6 +204,8 @@ struct scsi_device { > unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for > WRITE SAME */ > unsigned rpm_autosuspend:1; /* Enable runtime autosuspend > at device > * creation time */ > + unsigned offline_already:1; /* Device offline message > logged */ > + > atomic_t disk_events_disable_depth; /* disable depth for disk > events */ > > DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* > supported events */ I tested this on a customer issue. If its accepted please use Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com>
On 2020-03-09 13:54, Ewan D. Milne wrote: > The only purpose of the flag is to try to suppress duplicate messages, > in the common case it is a single thread submitting the queued I/O which > is going to get rejected. If multiple threads submit I/O there might > be duplicated messages but that is not all that critical. Hence the > lack of locking on the flag. Hi Ewan, My concern is that scsi_prep_state_check() may be called from more than one thread at the same time and that bitfield changes are not thread-safe. Thanks, Bart.
On 2020-03-09 4:49 p.m., Ewan D. Milne wrote: > On Mon, 2020-03-09 at 12:05 -0700, James Bottomley wrote: >> On Mon, 2020-03-09 at 14:14 -0400, Ewan D. Milne wrote: >>> Large queues of I/O to offline devices that are eventually >>> submitted when devices are unblocked result in a many repeated >>> "rejecting I/O to offline device" messages. These messages >>> can fill up the dmesg buffer in crash dumps so no useful >>> prior messages remain. In addition, if a serial console >>> is used, the flood of messages can cause a hard lockup in >>> the console code. >>> >>> Introduce a flag indicating the message has already been logged >>> for the device, and reset the flag when scsi_device_set_state() >>> changes the device state. >>> >>> Signed-off-by: Ewan D. Milne <emilne@redhat.com> >>> --- >>> drivers/scsi/scsi_lib.c | 8 ++++++-- >>> include/scsi/scsi_device.h | 2 ++ >>> 2 files changed, 8 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >>> index 610ee41..d3a6d97 100644 >>> --- a/drivers/scsi/scsi_lib.c >>> +++ b/drivers/scsi/scsi_lib.c >>> @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device >>> *sdev, struct request *req) >>> * commands. The device must be brought online >>> * before trying any recovery commands. >>> */ >>> - sdev_printk(KERN_ERR, sdev, >>> - "rejecting I/O to offline device\n"); >>> + if (!sdev->offline_already) { >>> + sdev->offline_already = 1; >>> + sdev_printk(KERN_ERR, sdev, >>> + "rejecting I/O to offline >>> device\n"); >>> + } >> >> Offline->online is a legal transition, so you'll need to clear this >> flag when that happens so we get another offline message if it goes >> offline again. >> >> James >> > > That's what resetting the flag in scsi_device_set_state() does. > The only thing that worries me is if some driver manipulates the > sdev->sdev_state value directly. Perhaps they could sneak in some c++isms and allow the definition of struct fields to be preceded by [[protected]]. The idea would be to complain loudly if that field is directly accessed. The accessor functions might need something similar (or the same decoration) to defeat that noise. In the meantime: Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
On Mon, 2020-03-09 at 19:02 -0700, Bart Van Assche wrote: > On 2020-03-09 13:54, Ewan D. Milne wrote: > > The only purpose of the flag is to try to suppress duplicate messages, > > in the common case it is a single thread submitting the queued I/O which > > is going to get rejected. If multiple threads submit I/O there might > > be duplicated messages but that is not all that critical. Hence the > > lack of locking on the flag. > > Hi Ewan, > > My concern is that scsi_prep_state_check() may be called from more than > one thread at the same time and that bitfield changes are not thread-safe. > > Thanks, > > Bart. Yes, I agree that the flag is not thread-safe, but I don't think that is a concern. Because if we get multiple rejecting I/O messages until the other threads see the flag change we are no worse off than before, and once the sdev state changes back to SDEV_RUNNING we won't call scsi_prep_state_check() and examine the flag. -Ewan
On 2020-03-10 09:44, Ewan D. Milne wrote: > On Mon, 2020-03-09 at 19:02 -0700, Bart Van Assche wrote: >> On 2020-03-09 13:54, Ewan D. Milne wrote: >>> The only purpose of the flag is to try to suppress duplicate messages, >>> in the common case it is a single thread submitting the queued I/O which >>> is going to get rejected. If multiple threads submit I/O there might >>> be duplicated messages but that is not all that critical. Hence the >>> lack of locking on the flag. >> >> My concern is that scsi_prep_state_check() may be called from more than >> one thread at the same time and that bitfield changes are not thread-safe. > > Yes, I agree that the flag is not thread-safe, but I don't think that > is a concern. Because if we get multiple rejecting I/O messages until > the other threads see the flag change we are no worse off than before, > and once the sdev state changes back to SDEV_RUNNING we won't call > scsi_prep_state_check() and examine the flag. Hi Ewan, This is the entire list of bitfields in struct scsi_device: unsigned removable:1; unsigned changed:1; unsigned busy:1; unsigned lockable:1; unsigned locked:1; unsigned borken:1; unsigned disconnect:1; unsigned soft_reset:1; unsigned sdtr:1; unsigned wdtr:1; unsigned ppr:1; unsigned tagged_supported:1; unsigned simple_tags:1; unsigned was_reset:1; unsigned expecting_cc_ua:1; unsigned use_10_for_rw:1; unsigned use_10_for_ms:1; unsigned set_dbd_for_ms:1; unsigned no_report_opcodes:1; unsigned no_write_same:1; unsigned use_16_for_rw:1; unsigned skip_ms_page_8:1; unsigned skip_ms_page_3f:1; unsigned skip_vpd_pages:1; unsigned try_vpd_pages:1; unsigned use_192_bytes_for_3f:1; unsigned no_start_on_add:1; unsigned allow_restart:1; unsigned manage_start_stop:1; unsigned start_stop_pwr_cond:1; unsigned no_uld_attach:1; unsigned select_no_atn:1; unsigned fix_capacity:1; unsigned guess_capacity:1; unsigned retry_hwerror:1; unsigned last_sector_bug:1; unsigned no_read_disc_info:1; unsigned no_read_capacity_16:1; unsigned try_rc_10_first:1; unsigned security_supported:1; unsigned is_visible:1; unsigned wce_default_on:1; unsigned no_dif:1; unsigned broken_fua:1; unsigned lun_in_cdb:1; unsigned unmap_limit_for_ws:1; unsigned rpm_autosuspend:1; If any thread modifies any of these existing bitfields concurrently with scsi_prep_state_check(), one of the two modifications will be lost. That is because compilers implement bitfield changes as follows: new_value = (old_value & ~(1 << ...)) | (1 << ...); If two such assignment statements are executed concurrently, both start from the same 'old_value' and only one of the two changes will happen. I'm concerned that this patch introduces a maintenance problem in the long term. Has it been considered to make 'offline_already' a 32-bits integer variable or a boolean instead of a bitfield? I think such variables can be modified without discarding values written from another thread. Thanks, Bart.
On 2020-03-11 12:01 a.m., Bart Van Assche wrote: > On 2020-03-10 09:44, Ewan D. Milne wrote: >> On Mon, 2020-03-09 at 19:02 -0700, Bart Van Assche wrote: >>> On 2020-03-09 13:54, Ewan D. Milne wrote: >>>> The only purpose of the flag is to try to suppress duplicate messages, >>>> in the common case it is a single thread submitting the queued I/O which >>>> is going to get rejected. If multiple threads submit I/O there might >>>> be duplicated messages but that is not all that critical. Hence the >>>> lack of locking on the flag. >>> >>> My concern is that scsi_prep_state_check() may be called from more than >>> one thread at the same time and that bitfield changes are not thread-safe. >> >> Yes, I agree that the flag is not thread-safe, but I don't think that >> is a concern. Because if we get multiple rejecting I/O messages until >> the other threads see the flag change we are no worse off than before, >> and once the sdev state changes back to SDEV_RUNNING we won't call >> scsi_prep_state_check() and examine the flag. > > Hi Ewan, > > This is the entire list of bitfields in struct scsi_device: > > unsigned removable:1; > unsigned changed:1; > unsigned busy:1; > unsigned lockable:1; > unsigned locked:1; > unsigned borken:1; > unsigned disconnect:1; > unsigned soft_reset:1; > unsigned sdtr:1; > unsigned wdtr:1; > unsigned ppr:1; > unsigned tagged_supported:1; > unsigned simple_tags:1; > unsigned was_reset:1; > unsigned expecting_cc_ua:1; > unsigned use_10_for_rw:1; > unsigned use_10_for_ms:1; > unsigned set_dbd_for_ms:1; > unsigned no_report_opcodes:1; > unsigned no_write_same:1; > unsigned use_16_for_rw:1; > unsigned skip_ms_page_8:1; > unsigned skip_ms_page_3f:1; > unsigned skip_vpd_pages:1; > unsigned try_vpd_pages:1; > unsigned use_192_bytes_for_3f:1; > unsigned no_start_on_add:1; > unsigned allow_restart:1; > unsigned manage_start_stop:1; > unsigned start_stop_pwr_cond:1; > unsigned no_uld_attach:1; > unsigned select_no_atn:1; > unsigned fix_capacity:1; > unsigned guess_capacity:1; > unsigned retry_hwerror:1; > unsigned last_sector_bug:1; > unsigned no_read_disc_info:1; > unsigned no_read_capacity_16:1; > unsigned try_rc_10_first:1; > unsigned security_supported:1; > unsigned is_visible:1; > unsigned wce_default_on:1; > unsigned no_dif:1; > unsigned broken_fua:1; > unsigned lun_in_cdb:1; > unsigned unmap_limit_for_ws:1; > unsigned rpm_autosuspend:1; > > If any thread modifies any of these existing bitfields concurrently with > scsi_prep_state_check(), one of the two modifications will be lost. That > is because compilers implement bitfield changes as follows: > > new_value = (old_value & ~(1 << ...)) | (1 << ...); > > If two such assignment statements are executed concurrently, both start > from the same 'old_value' and only one of the two changes will happen. > I'm concerned that this patch introduces a maintenance problem in the > long term. Has it been considered to make 'offline_already' a 32-bits > integer variable or a boolean instead of a bitfield? I think such > variables can be modified without discarding values written from another > thread. Most of the stuff there is slow moving data, typically set once at device creation time. The design predates atomic bitops. I can't see why the addition of an extra bitfield, whose overwriting is not going to cause a calamity ***, warrants the proposer having the rewrite this patch. Doug Gilbert *** setting offline_already effectively invalidates a lot of other bitfields. Still, if the offline_already fails to be set, due to a clash (with what?), then at worst the warning message is repeated and the patch code tries again to set that bitfield.
On Tue, 2020-03-10 at 21:01 -0700, Bart Van Assche wrote: > ... > If any thread modifies any of these existing bitfields concurrently with > scsi_prep_state_check(), one of the two modifications will be lost. That > is because compilers implement bitfield changes as follows: > > new_value = (old_value & ~(1 << ...)) | (1 << ...); > > If two such assignment statements are executed concurrently, both start > from the same 'old_value' and only one of the two changes will happen. > I'm concerned that this patch introduces a maintenance problem in the > long term. Has it been considered to make 'offline_already' a 32-bits > integer variable or a boolean instead of a bitfield? I think such > variables can be modified without discarding values written from another > thread. > > Thanks, > > Bart. > I understand your concern about long-term maintenance, and introducing a pattern that could be used by other code that needs more accurate state. (Although, it is likely that a lock or an atomic operation might be required in other cases). I'll post a v2 changing this to a boolean. -Ewan
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 610ee41..d3a6d97 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req) * commands. The device must be brought online * before trying any recovery commands. */ - sdev_printk(KERN_ERR, sdev, - "rejecting I/O to offline device\n"); + if (!sdev->offline_already) { + sdev->offline_already = 1; + sdev_printk(KERN_ERR, sdev, + "rejecting I/O to offline device\n"); + } return BLK_STS_IOERR; case SDEV_DEL: /* @@ -2340,6 +2343,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state) break; } + sdev->offline_already = 0; sdev->sdev_state = state; return 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index f8312a3..72987a0 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -204,6 +204,8 @@ struct scsi_device { unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */ unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device * creation time */ + unsigned offline_already:1; /* Device offline message logged */ + atomic_t disk_events_disable_depth; /* disable depth for disk events */ DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */
Large queues of I/O to offline devices that are eventually submitted when devices are unblocked result in a many repeated "rejecting I/O to offline device" messages. These messages can fill up the dmesg buffer in crash dumps so no useful prior messages remain. In addition, if a serial console is used, the flood of messages can cause a hard lockup in the console code. Introduce a flag indicating the message has already been logged for the device, and reset the flag when scsi_device_set_state() changes the device state. Signed-off-by: Ewan D. Milne <emilne@redhat.com> --- drivers/scsi/scsi_lib.c | 8 ++++++-- include/scsi/scsi_device.h | 2 ++ 2 files changed, 8 insertions(+), 2 deletions(-)