diff mbox series

Revert "scsi: sd: Do not repeat the starting disk message"

Message ID 20240716161101.30692-1-johan+linaro@kernel.org (mailing list archive)
State Accepted
Commit da3e19ef0b3de0aa4b25595bdc214c02a04f19b8
Headers show
Series Revert "scsi: sd: Do not repeat the starting disk message" | expand

Commit Message

Johan Hovold July 16, 2024, 4:11 p.m. UTC
This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.

The offending commit tried to suppress a double "Starting disk" message
for some drivers, but instead started spamming the log with bogus
messages every five seconds:

	[  311.798956] sd 0:0:0:0: [sda] Starting disk
	[  316.919103] sd 0:0:0:0: [sda] Starting disk
	[  322.040775] sd 0:0:0:0: [sda] Starting disk
	[  327.161140] sd 0:0:0:0: [sda] Starting disk
	[  332.281352] sd 0:0:0:0: [sda] Starting disk
	[  337.401878] sd 0:0:0:0: [sda] Starting disk
	[  342.521527] sd 0:0:0:0: [sda] Starting disk
	[  345.850401] sd 0:0:0:0: [sda] Starting disk
	[  350.967132] sd 0:0:0:0: [sda] Starting disk
	[  356.090454] sd 0:0:0:0: [sda] Starting disk
	...

on machines that do not actually stop the disk on runtime suspend (e.g.
the Qualcomm sc8280xp CRD with UFS).

Let's just revert for now to address the regression.

Fixes: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message")
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
 drivers/scsi/sd.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)


Hi,

I just noticed this regression that snuck into 6.10-final and tracked it
down to 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk
message").

I wanted to get this out ASAP to address the immediate regression while
someone who cares enough can work out a proper fix for the double start
message (which seems less annoying).

Note that the offending commit is marked for stable.

Johan

Comments

Bart Van Assche July 16, 2024, 9:44 p.m. UTC | #1
On 7/16/24 9:11 AM, Johan Hovold wrote:
> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Damien Le Moal July 16, 2024, 10:48 p.m. UTC | #2
On 7/17/24 01:11, Johan Hovold wrote:
> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
> 
> The offending commit tried to suppress a double "Starting disk" message
> for some drivers, but instead started spamming the log with bogus
> messages every five seconds:
> 
> 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
> 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
> 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
> 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
> 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
> 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
> 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
> 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
> 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
> 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
> 	...
> 
> on machines that do not actually stop the disk on runtime suspend (e.g.
> the Qualcomm sc8280xp CRD with UFS).

This is odd. If the disk is not being being suspended, why does the platform
even enable runtime PM for it ? Are you sure about this ? Or is it simply that
the runtime pm timer is set to a very low interval ?

It almost sound like what we need to do here is suppress this message for the
runtime resume case, so something like:

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2e933fd1de70..4261128bf1f3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -4220,7 +4220,8 @@ static int sd_resume_common(struct device *dev, bool runtime)
        if (!sdkp)      /* E.g.: runtime resume at the start of sd_probe() */
                return 0;

-       sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
+       if (!runtime)
+               sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");

        if (!sd_do_start_stop(sdkp->device, runtime)) {
                sdkp->suspended = false;

However, I would like to make sure that this platform is not calling
sd_resume_runtime() for nothing every 5s. If that is the case, then there is a
more fundamental problem here and reverting this patch is only hiding that.

> 
> Let's just revert for now to address the regression.
> 
> Fixes: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message")
> Cc: stable@vger.kernel.org
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>  drivers/scsi/sd.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> 
> Hi,
> 
> I just noticed this regression that snuck into 6.10-final and tracked it
> down to 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk
> message").
> 
> I wanted to get this out ASAP to address the immediate regression while
> someone who cares enough can work out a proper fix for the double start
> message (which seems less annoying).
> 
> Note that the offending commit is marked for stable.
> 
> Johan
> 
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 1b7561abe05d..6b64af7d4927 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -4119,6 +4119,8 @@ static int sd_resume(struct device *dev)
>  {
>  	struct scsi_disk *sdkp = dev_get_drvdata(dev);
>  
> +	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
> +
>  	if (opal_unlock_from_suspend(sdkp->opal_dev)) {
>  		sd_printk(KERN_NOTICE, sdkp, "OPAL unlock failed\n");
>  		return -EIO;
> @@ -4135,13 +4137,12 @@ static int sd_resume_common(struct device *dev, bool runtime)
>  	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
>  		return 0;
>  
> -	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
> -
>  	if (!sd_do_start_stop(sdkp->device, runtime)) {
>  		sdkp->suspended = false;
>  		return 0;
>  	}
>  
> +	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
>  	ret = sd_start_stop_device(sdkp, 1);
>  	if (!ret) {
>  		sd_resume(dev);
Johan Hovold July 17, 2024, 9 a.m. UTC | #3
On Wed, Jul 17, 2024 at 07:48:26AM +0900, Damien Le Moal wrote:
> On 7/17/24 01:11, Johan Hovold wrote:
> > This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
> > 
> > The offending commit tried to suppress a double "Starting disk" message
> > for some drivers, but instead started spamming the log with bogus
> > messages every five seconds:
> > 
> > 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
> > 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
> > 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
> > 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
> > 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
> > 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
> > 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
> > 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
> > 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
> > 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
> > 	...
> > 
> > on machines that do not actually stop the disk on runtime suspend (e.g.
> > the Qualcomm sc8280xp CRD with UFS).
> 
> This is odd. If the disk is not being being suspended, why does the platform
> even enable runtime PM for it ? 

This is clearly intended to be supported as sd_do_start_stop() returns
false and that prevents sd_start_stop_device() from being called on
resume (and similarly on suspend which is why there are no matching
stopping disk messages above):

	[   32.822189] sd 0:0:0:0: sd_resume_common - runtime = 1, sd_do_start_stop = 0, manage_runtime_start_stop = 0

> Are you sure about this ? Or is it simply that
> the runtime pm timer is set to a very low interval ?

I haven't tried to determine why runtime pm is used this way, but your
patch is clearly broken as it prints a message about starting the disk
even when sd_do_start_stop() returns false.

> It almost sound like what we need to do here is suppress this message for the
> runtime resume case, so something like:

No, that would only make things worse as I assume you'd have a stopped
disk message without a matching start message for driver that do end up
stopping the disk here.

> However, I would like to make sure that this platform is not calling
> sd_resume_runtime() for nothing every 5s. If that is the case, then there is a
> more fundamental problem here and reverting this patch is only hiding that.

This is with the Qualcomm UFS driver, but it seems it just relies on the
generic ufshcd_pltfrm_init() implementation.

Also not sure why anyone would want to see these messages on every
runtime suspend (for drivers that end up taking this path), but that's a
separate discussion.

Johan
Damien Le Moal July 17, 2024, 10:46 a.m. UTC | #4
On 7/17/24 18:00, Johan Hovold wrote:
> On Wed, Jul 17, 2024 at 07:48:26AM +0900, Damien Le Moal wrote:
>> On 7/17/24 01:11, Johan Hovold wrote:
>>> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
>>>
>>> The offending commit tried to suppress a double "Starting disk" message
>>> for some drivers, but instead started spamming the log with bogus
>>> messages every five seconds:
>>>
>>> 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
>>> 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
>>> 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
>>> 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
>>> 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
>>> 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
>>> 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
>>> 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
>>> 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
>>> 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
>>> 	...
>>>
>>> on machines that do not actually stop the disk on runtime suspend (e.g.
>>> the Qualcomm sc8280xp CRD with UFS).
>>
>> This is odd. If the disk is not being being suspended, why does the platform
>> even enable runtime PM for it ? 
> 
> This is clearly intended to be supported as sd_do_start_stop() returns
> false and that prevents sd_start_stop_device() from being called on
> resume (and similarly on suspend which is why there are no matching
> stopping disk messages above):
> 
> 	[   32.822189] sd 0:0:0:0: sd_resume_common - runtime = 1, sd_do_start_stop = 0, manage_runtime_start_stop = 0

Yes, so we can suppress the "Starting disk" message for runtime resume, to match
the runtime suspend not having the message.

> 
>> Are you sure about this ? Or is it simply that
>> the runtime pm timer is set to a very low interval ?
> 
> I haven't tried to determine why runtime pm is used this way, but your
> patch is clearly broken as it prints a message about starting the disk
> even when sd_do_start_stop() returns false.

The patch is not *that* broken, because sd_do_start_stop() returning false mean
only that the disk will *not* be started using a START STOP UNIT command. But
the underlying LLD must start the drive. So the message is not wrong, even
though it is probably best to suppress it for the runtime case.

The point here is that sd_runtime_resume() should NOT be called every 5s unless
there is also a runtime suspend in between the calls. As mentioned, this can
happen if the autosuspend timer is set to a very low timeout to aggressively
suspend the disk after a short idle time. That of course makes absolutely no
sense for HDDs given the spinup time needed, but I guess that is a possiblity
for UFS drives.

> 
>> It almost sound like what we need to do here is suppress this message for the
>> runtime resume case, so something like:
> 
> No, that would only make things worse as I assume you'd have a stopped
> disk message without a matching start message for driver that do end up
> stopping the disk here.

OK. so let's revert this patch and I will rework that message to be displayed
only on device removal, system suspend and system shutdown.

>> However, I would like to make sure that this platform is not calling
>> sd_resume_runtime() for nothing every 5s. If that is the case, then there is a
>> more fundamental problem here and reverting this patch is only hiding that.
> 
> This is with the Qualcomm UFS driver, but it seems it just relies on the
> generic ufshcd_pltfrm_init() implementation.
> 
> Also not sure why anyone would want to see these messages on every
> runtime suspend (for drivers that end up taking this path), but that's a
> separate discussion.

Not really. As mentioned above, it is probably best to suppress the start/stop
messages for runtime suspend. The separate discussion is why sd_runtime_resume
is called that often for this UFS drive: bug or aggressive autosuspend ? Given
that I do not have this hardware, I will let someone else look into that.
Johan Hovold July 17, 2024, 11:25 a.m. UTC | #5
On Wed, Jul 17, 2024 at 07:46:14PM +0900, Damien Le Moal wrote:
> On 7/17/24 18:00, Johan Hovold wrote:
> > On Wed, Jul 17, 2024 at 07:48:26AM +0900, Damien Le Moal wrote:
> >> On 7/17/24 01:11, Johan Hovold wrote:
> >>> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
> >>>
> >>> The offending commit tried to suppress a double "Starting disk" message
> >>> for some drivers, but instead started spamming the log with bogus
> >>> messages every five seconds:
> >>>
> >>> 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
> >>> 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
> >>> 	...
> >>>
> >>> on machines that do not actually stop the disk on runtime suspend (e.g.
> >>> the Qualcomm sc8280xp CRD with UFS).
> >>
> >> This is odd. If the disk is not being being suspended, why does the platform
> >> even enable runtime PM for it ? 
> > 
> > This is clearly intended to be supported as sd_do_start_stop() returns
> > false and that prevents sd_start_stop_device() from being called on
> > resume (and similarly on suspend which is why there are no matching
> > stopping disk messages above):
> > 
> > 	[   32.822189] sd 0:0:0:0: sd_resume_common - runtime = 1, sd_do_start_stop = 0, manage_runtime_start_stop = 0
> 
> Yes, so we can suppress the "Starting disk" message for runtime resume, to match
> the runtime suspend not having the message.

No, the point is that the stopping disk message is also suppressed when
sd_do_start_stop() returns false (i.e. when sd_start_stop_device() is
never called). See sd_suspend_common().

> >> Are you sure about this ? Or is it simply that
> >> the runtime pm timer is set to a very low interval ?
> > 
> > I haven't tried to determine why runtime pm is used this way, but your
> > patch is clearly broken as it prints a message about starting the disk
> > even when sd_do_start_stop() returns false.
> 
> The patch is not *that* broken, because sd_do_start_stop() returning false mean
> only that the disk will *not* be started using a START STOP UNIT command. But
> the underlying LLD must start the drive. So the message is not wrong, even
> though it is probably best to suppress it for the runtime case.

From a quick look at the code I interpret the (original) intention to be
to only print these messages in cases were sd_start_stop_device() is
actually called.
 
> The point here is that sd_runtime_resume() should NOT be called every 5s unless
> there is also a runtime suspend in between the calls. As mentioned, this can
> happen if the autosuspend timer is set to a very low timeout to aggressively
> suspend the disk after a short idle time. That of course makes absolutely no
> sense for HDDs given the spinup time needed, but I guess that is a possiblity
> for UFS drives.

I don't see anything obviously wrong with this for things like UFS.

Here's what some printk reveal for the Qualcomm platform in question:

[   50.659451] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   50.669756] sd 0:0:0:0: sd_resume_runtime
[   52.911603] sd 0:0:0:0: sd_suspend_runtime
[   52.921707] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
[   53.472894] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   53.481464] sd 0:0:0:0: sd_resume_runtime
[   55.550493] sd 0:0:0:0: sd_suspend_runtime
[   55.559697] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
[   58.595554] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   58.607868] sd 0:0:0:0: sd_resume_runtime
[   60.667330] sd 0:0:0:0: sd_suspend_runtime
[   60.677623] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
[   63.714149] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   63.724498] sd 0:0:0:0: sd_resume_runtime
[   65.772893] sd 0:0:0:0: sd_suspend_runtime
[   65.784696] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
[   68.836015] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   68.849576] sd 0:0:0:0: sd_resume_runtime
[   71.359102] sd 0:0:0:0: sd_suspend_runtime
[   71.368928] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
[   73.955031] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
[   73.963040] sd 0:0:0:0: sd_resume_runtime
[   76.032153] sd 0:0:0:0: sd_suspend_runtime
[   76.042100] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend

Looks like a 2-second autosuspend timeout somewhere, and the controller
stays suspended for 1-3 seconds in between.

> >> It almost sound like what we need to do here is suppress this message for the
> >> runtime resume case, so something like:
> > 
> > No, that would only make things worse as I assume you'd have a stopped
> > disk message without a matching start message for driver that do end up
> > stopping the disk here.
> 
> OK. so let's revert this patch and I will rework that message to be displayed
> only on device removal, system suspend and system shutdown.

Sounds good.

Johan
Damien Le Moal July 17, 2024, 12:04 p.m. UTC | #6
On 7/17/24 20:25, Johan Hovold wrote:
> On Wed, Jul 17, 2024 at 07:46:14PM +0900, Damien Le Moal wrote:
>> On 7/17/24 18:00, Johan Hovold wrote:
>>> On Wed, Jul 17, 2024 at 07:48:26AM +0900, Damien Le Moal wrote:
>>>> On 7/17/24 01:11, Johan Hovold wrote:
>>>>> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
>>>>>
>>>>> The offending commit tried to suppress a double "Starting disk" message
>>>>> for some drivers, but instead started spamming the log with bogus
>>>>> messages every five seconds:
>>>>>
>>>>> 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
>>>>> 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
>>>>> 	...
>>>>>
>>>>> on machines that do not actually stop the disk on runtime suspend (e.g.
>>>>> the Qualcomm sc8280xp CRD with UFS).
>>>>
>>>> This is odd. If the disk is not being being suspended, why does the platform
>>>> even enable runtime PM for it ? 
>>>
>>> This is clearly intended to be supported as sd_do_start_stop() returns
>>> false and that prevents sd_start_stop_device() from being called on
>>> resume (and similarly on suspend which is why there are no matching
>>> stopping disk messages above):
>>>
>>> 	[   32.822189] sd 0:0:0:0: sd_resume_common - runtime = 1, sd_do_start_stop = 0, manage_runtime_start_stop = 0
>>
>> Yes, so we can suppress the "Starting disk" message for runtime resume, to match
>> the runtime suspend not having the message.
> 
> No, the point is that the stopping disk message is also suppressed when
> sd_do_start_stop() returns false (i.e. when sd_start_stop_device() is
> never called). See sd_suspend_common().
> 
>>>> Are you sure about this ? Or is it simply that
>>>> the runtime pm timer is set to a very low interval ?
>>>
>>> I haven't tried to determine why runtime pm is used this way, but your
>>> patch is clearly broken as it prints a message about starting the disk
>>> even when sd_do_start_stop() returns false.
>>
>> The patch is not *that* broken, because sd_do_start_stop() returning false mean
>> only that the disk will *not* be started using a START STOP UNIT command. But
>> the underlying LLD must start the drive. So the message is not wrong, even
>> though it is probably best to suppress it for the runtime case.
> 
> From a quick look at the code I interpret the (original) intention to be
> to only print these messages in cases were sd_start_stop_device() is
> actually called.
>  
>> The point here is that sd_runtime_resume() should NOT be called every 5s unless
>> there is also a runtime suspend in between the calls. As mentioned, this can
>> happen if the autosuspend timer is set to a very low timeout to aggressively
>> suspend the disk after a short idle time. That of course makes absolutely no
>> sense for HDDs given the spinup time needed, but I guess that is a possiblity
>> for UFS drives.
> 
> I don't see anything obviously wrong with this for things like UFS.
> 
> Here's what some printk reveal for the Qualcomm platform in question:
> 
> [   50.659451] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   50.669756] sd 0:0:0:0: sd_resume_runtime
> [   52.911603] sd 0:0:0:0: sd_suspend_runtime
> [   52.921707] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> [   53.472894] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   53.481464] sd 0:0:0:0: sd_resume_runtime
> [   55.550493] sd 0:0:0:0: sd_suspend_runtime
> [   55.559697] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> [   58.595554] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   58.607868] sd 0:0:0:0: sd_resume_runtime
> [   60.667330] sd 0:0:0:0: sd_suspend_runtime
> [   60.677623] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> [   63.714149] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   63.724498] sd 0:0:0:0: sd_resume_runtime
> [   65.772893] sd 0:0:0:0: sd_suspend_runtime
> [   65.784696] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> [   68.836015] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   68.849576] sd 0:0:0:0: sd_resume_runtime
> [   71.359102] sd 0:0:0:0: sd_suspend_runtime
> [   71.368928] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> [   73.955031] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume
> [   73.963040] sd 0:0:0:0: sd_resume_runtime
> [   76.032153] sd 0:0:0:0: sd_suspend_runtime
> [   76.042100] ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_suspend
> 
> Looks like a 2-second autosuspend timeout somewhere, and the controller
> stays suspended for 1-3 seconds in between.

OK. So all good and nothing suspicious with this. That is only very aggressive
autosuspend. As I said, let's revert and I will rework the start/stop messages.

> 
>>>> It almost sound like what we need to do here is suppress this message for the
>>>> runtime resume case, so something like:
>>>
>>> No, that would only make things worse as I assume you'd have a stopped
>>> disk message without a matching start message for driver that do end up
>>> stopping the disk here.
>>
>> OK. so let's revert this patch and I will rework that message to be displayed
>> only on device removal, system suspend and system shutdown.
> 
> Sounds good.
> 
> Johan
>
Damien Le Moal July 17, 2024, 12:05 p.m. UTC | #7
On 7/17/24 01:11, Johan Hovold wrote:
> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
> 
> The offending commit tried to suppress a double "Starting disk" message
> for some drivers, but instead started spamming the log with bogus
> messages every five seconds:
> 
> 	[  311.798956] sd 0:0:0:0: [sda] Starting disk
> 	[  316.919103] sd 0:0:0:0: [sda] Starting disk
> 	[  322.040775] sd 0:0:0:0: [sda] Starting disk
> 	[  327.161140] sd 0:0:0:0: [sda] Starting disk
> 	[  332.281352] sd 0:0:0:0: [sda] Starting disk
> 	[  337.401878] sd 0:0:0:0: [sda] Starting disk
> 	[  342.521527] sd 0:0:0:0: [sda] Starting disk
> 	[  345.850401] sd 0:0:0:0: [sda] Starting disk
> 	[  350.967132] sd 0:0:0:0: [sda] Starting disk
> 	[  356.090454] sd 0:0:0:0: [sda] Starting disk
> 	...
> 
> on machines that do not actually stop the disk on runtime suspend (e.g.
> the Qualcomm sc8280xp CRD with UFS).
> 
> Let's just revert for now to address the regression.
> 
> Fixes: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message")
> Cc: stable@vger.kernel.org
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Martin K. Petersen July 23, 2024, 12:30 a.m. UTC | #8
Johan,

> This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.

Applied to 6.11/scsi-staging, thanks!
diff mbox series

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 1b7561abe05d..6b64af7d4927 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -4119,6 +4119,8 @@  static int sd_resume(struct device *dev)
 {
 	struct scsi_disk *sdkp = dev_get_drvdata(dev);
 
+	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
+
 	if (opal_unlock_from_suspend(sdkp->opal_dev)) {
 		sd_printk(KERN_NOTICE, sdkp, "OPAL unlock failed\n");
 		return -EIO;
@@ -4135,13 +4137,12 @@  static int sd_resume_common(struct device *dev, bool runtime)
 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;
 
-	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
-
 	if (!sd_do_start_stop(sdkp->device, runtime)) {
 		sdkp->suspended = false;
 		return 0;
 	}
 
+	sd_printk(KERN_NOTICE, sdkp, "Starting disk\n");
 	ret = sd_start_stop_device(sdkp, 1);
 	if (!ret) {
 		sd_resume(dev);