[1/2] libsas: remove irq save in sas_ata_qc_issue()

Message ID	20180504145011.5752-1-bigeasy@linutronix.de (mailing list archive)
State	Changes Requested
Headers	show Return-Path: <linux-scsi-owner@kernel.org> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> To: linux-scsi@vger.kernel.org Cc: tglx@linutronix.de, Hannes Reinecke <hare@suse.com>, "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>, "Martin K. Petersen" <martin.petersen@oracle.com>, Sebastian Andrzej Siewior <bigeasy@linutronix.de> Subject: [PATCH 1/2] libsas: remove irq save in sas_ata_qc_issue() Date: Fri, 4 May 2018 16:50:10 +0200 Message-Id: <20180504145011.5752-1-bigeasy@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk

Sebastian Andrzej Siewior May 4, 2018, 2:50 p.m. UTC

Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
management duties from lldds") the sas_ata_qc_issue() function unlocks
the ata_port.lock and disables interrupts before doing so.
That lock is always taken with disabled interrupts so at this point, the
interrupts are already disabled. There is no need to disable the
interrupts before the unlock operation because they are already
disabled.
Restoring the interrupt state later does not change anything because
they were disabled and remain disabled. Therefore remove the operations
which do not change the behaviour.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/scsi/libsas/sas_ata.c | 3 ---
 1 file changed, 3 deletions(-)

John Garry May 18, 2018, 1:31 p.m. UTC | #1

On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
> Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
> management duties from lldds") the sas_ata_qc_issue() function unlocks
> the ata_port.lock and disables interrupts before doing so.
> That lock is always taken with disabled interrupts so at this point, the
> interrupts are already disabled. There is no need to disable the
> interrupts before the unlock operation because they are already
> disabled.

Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host 
lock) enabled?

> Restoring the interrupt state later does not change anything because
> they were disabled and remain disabled. Therefore remove the operations
> which do not change the behaviour.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  drivers/scsi/libsas/sas_ata.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> index 0cc1567eacc1..bc5d917ff643 100644
> --- a/drivers/scsi/libsas/sas_ata.c
> +++ b/drivers/scsi/libsas/sas_ata.c
> @@ -176,7 +176,6 @@ static void sas_ata_task_done(struct sas_task *task)
>
>  static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>  {
> -	unsigned long flags;
>  	struct sas_task *task;
>  	struct scatterlist *sg;
>  	int ret = AC_ERR_SYSTEM;
> @@ -190,7 +189,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>  	/* TODO: audit callers to ensure they are ready for qc_issue to
>  	 * unconditionally re-enable interrupts
>  	 */

Is the "TODO" comment still relevant?

cheers,

> -	local_irq_save(flags);
>  	spin_unlock(ap->lock);
>
>  	/* If the device fell off, no sense in issuing commands */
> @@ -252,7 +250,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>
>   out:
>  	spin_lock(ap->lock);
> -	local_irq_restore(flags);
>  	return ret;
>  }
>
>

Sebastian Andrzej Siewior May 22, 2018, 5:31 p.m. UTC | #2

On 2018-05-18 14:31:27 [+0100], John Garry wrote:
> On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
> > Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
> > management duties from lldds") the sas_ata_qc_issue() function unlocks
> > the ata_port.lock and disables interrupts before doing so.
> > That lock is always taken with disabled interrupts so at this point, the
> > interrupts are already disabled. There is no need to disable the
> > interrupts before the unlock operation because they are already
> > disabled.
> 
> Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host lock)
> enabled?

I don't understand the question. 

> > Restoring the interrupt state later does not change anything because
> > they were disabled and remain disabled. Therefore remove the operations
> > which do not change the behaviour.
> > 
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  drivers/scsi/libsas/sas_ata.c | 3 ---
> >  1 file changed, 3 deletions(-)
> > 
> > diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> > index 0cc1567eacc1..bc5d917ff643 100644
> > --- a/drivers/scsi/libsas/sas_ata.c
> > +++ b/drivers/scsi/libsas/sas_ata.c
> > @@ -176,7 +176,6 @@ static void sas_ata_task_done(struct sas_task *task)
> > 
> >  static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
> >  {
> > -	unsigned long flags;
> >  	struct sas_task *task;
> >  	struct scatterlist *sg;
> >  	int ret = AC_ERR_SYSTEM;
> > @@ -190,7 +189,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
> >  	/* TODO: audit callers to ensure they are ready for qc_issue to
> >  	 * unconditionally re-enable interrupts
> >  	 */
> 
> Is the "TODO" comment still relevant?

can't tell. The interrupts are never enabled during the unlock.

> cheers,
> 
> > -	local_irq_save(flags);
> >  	spin_unlock(ap->lock);
> > 
> >  	/* If the device fell off, no sense in issuing commands */
> > @@ -252,7 +250,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
> > 
> >   out:
> >  	spin_lock(ap->lock);
> > -	local_irq_restore(flags);
> >  	return ret;
> >  }

Sebastian

John Garry May 23, 2018, 8:46 a.m. UTC | #3

On 22/05/2018 18:31, Sebastian Andrzej Siewior wrote:
> On 2018-05-18 14:31:27 [+0100], John Garry wrote:
>> On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
>>> Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
>>> management duties from lldds") the sas_ata_qc_issue() function unlocks
>>> the ata_port.lock and disables interrupts before doing so.
>>> That lock is always taken with disabled interrupts so at this point, the
>>> interrupts are already disabled. There is no need to disable the
>>> interrupts before the unlock operation because they are already
>>> disabled.
>>
>> Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host lock)
>> enabled?
>
> I don't understand the question.

It seems to me that ap->lock can be taken in interrupt context, so then 
we know it should be locked with interrupts disabled always.

I was just really asking how you know interrupts are disabled already? 
Maybe it's obvious.

cheers,

>
>>> Restoring the interrupt state later does not change anything because
>>> they were disabled and remain disabled. Therefore remove the operations
>>> which do not change the behaviour.
>>>
>>> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>>> ---
>>>  drivers/scsi/libsas/sas_ata.c | 3 ---
>>>  1 file changed, 3 deletions(-)
>>>
>>> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
>>> index 0cc1567eacc1..bc5d917ff643 100644
>>> --- a/drivers/scsi/libsas/sas_ata.c
>>> +++ b/drivers/scsi/libsas/sas_ata.c
>>> @@ -176,7 +176,6 @@ static void sas_ata_task_done(struct sas_task *task)
>>>
>>>  static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>>>  {
>>> -	unsigned long flags;
>>>  	struct sas_task *task;
>>>  	struct scatterlist *sg;
>>>  	int ret = AC_ERR_SYSTEM;
>>> @@ -190,7 +189,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>>>  	/* TODO: audit callers to ensure they are ready for qc_issue to
>>>  	 * unconditionally re-enable interrupts
>>>  	 */
>>
>> Is the "TODO" comment still relevant?
>
> can't tell. The interrupts are never enabled during the unlock.
>
>> cheers,
>>
>>> -	local_irq_save(flags);
>>>  	spin_unlock(ap->lock);
>>>
>>>  	/* If the device fell off, no sense in issuing commands */
>>> @@ -252,7 +250,6 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
>>>
>>>   out:
>>>  	spin_lock(ap->lock);
>>> -	local_irq_restore(flags);
>>>  	return ret;
>>>  }
>
> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 23, 2018, 9:24 a.m. UTC | #4

On 2018-05-23 09:46:30 [+0100], John Garry wrote:
> On 22/05/2018 18:31, Sebastian Andrzej Siewior wrote:
> > On 2018-05-18 14:31:27 [+0100], John Garry wrote:
> > > On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
> > > > Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
> > > > management duties from lldds") the sas_ata_qc_issue() function unlocks
> > > > the ata_port.lock and disables interrupts before doing so.
> > > > That lock is always taken with disabled interrupts so at this point, the
> > > > interrupts are already disabled. There is no need to disable the
> > > > interrupts before the unlock operation because they are already
> > > > disabled.
> > > 
> > > Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host lock)
> > > enabled?
> > 
> > I don't understand the question.
> 
> It seems to me that ap->lock can be taken in interrupt context, so then we
> know it should be locked with interrupts disabled always.

yes, the comment above the function says so, too.

> I was just really asking how you know interrupts are disabled already? Maybe
> it's obvious.

The lock has to be taken with interrupts disabled. If the interrupts
were enabled at the time sas_ata_qc_issue() was invoked then the lock
was already taken in a bad way and disabling interrupts before the
unlock does not make it any better.
To verify that the locking is okay you can build the kernel with lockdep
enabled and CONFIG_PROVE_LOCKING=y set. That way lockdep will inform you
as soon as it notices the lock taken in interrupt context and in
process-context with enabled interrupts.

> cheers,

Sebastian

Jason Yan May 24, 2018, 2:58 a.m. UTC | #5

On 2018/5/23 17:24, Sebastian Andrzej Siewior wrote:
> On 2018-05-23 09:46:30 [+0100], John Garry wrote:
>> On 22/05/2018 18:31, Sebastian Andrzej Siewior wrote:
>>> On 2018-05-18 14:31:27 [+0100], John Garry wrote:
>>>> On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
>>>>> Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
>>>>> management duties from lldds") the sas_ata_qc_issue() function unlocks
>>>>> the ata_port.lock and disables interrupts before doing so.
>>>>> That lock is always taken with disabled interrupts so at this point, the
>>>>> interrupts are already disabled. There is no need to disable the
>>>>> interrupts before the unlock operation because they are already
>>>>> disabled.
>>>>
>>>> Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host lock)
>>>> enabled?
>>>
>>> I don't understand the question.
>>
>> It seems to me that ap->lock can be taken in interrupt context, so then we
>> know it should be locked with interrupts disabled always.
>
> yes, the comment above the function says so, too.
>
>> I was just really asking how you know interrupts are disabled already? Maybe
>> it's obvious.
>
> The lock has to be taken with interrupts disabled. If the interrupts
> were enabled at the time sas_ata_qc_issue() was invoked then the lock
> was already taken in a bad way and disabling interrupts before the
> unlock does not make it any better.
> To verify that the locking is okay you can build the kernel with lockdep
> enabled and CONFIG_PROVE_LOCKING=y set. That way lockdep will inform you
> as soon as it notices the lock taken in interrupt context and in
> process-context with enabled interrupts.
>

For libsas, ata_qc_issue() now have three callers:
1. sas_queuecommand(), the normal io patch, have taken ap->lock with
irq disabled
2. ata_exec_internal_sg(), from EH context, have taken ap->lock with
irq disabled too
3. __ata_qc_complete(), from io completion path, have taken ap->lock
with irq disabled too(this may be in interrupts context)

I think it's fine to delete this irq save code. As for the "TODO"
comment, I think we can add:

BUG_ON(!irqs_disabled());

or

WARN_ON_ONCE(!irqs_disabled());

and then delete the "TODO" comment.

>> cheers,
>
> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 24, 2018, 7:15 a.m. UTC | #6

On 2018-05-24 10:58:44 [+0800], Jason Yan wrote:
> I think it's fine to delete this irq save code. As for the "TODO"
> comment, I think we can add:
> 
> BUG_ON(!irqs_disabled());
> 
> or
> 
> WARN_ON_ONCE(!irqs_disabled());

no, please don't do this. Please add instead
	lockdep_assert_held()

on the lock in question and let lockdep to its work. Lockdep has way
better coverage than your irqs_disabled() check which also breaks RT.

> and then delete the "TODO" comment.

Sebastian

Jason Yan May 24, 2018, 7:38 a.m. UTC | #7

On 2018/5/24 15:15, Sebastian Andrzej Siewior wrote:
> On 2018-05-24 10:58:44 [+0800], Jason Yan wrote:
>> I think it's fine to delete this irq save code. As for the "TODO"
>> comment, I think we can add:
>>
>> BUG_ON(!irqs_disabled());
>>
>> or
>>
>> WARN_ON_ONCE(!irqs_disabled());
>
> no, please don't do this. Please add instead
> 	lockdep_assert_held()
>
> on the lock in question and let lockdep to its work. Lockdep has way
> better coverage than your irqs_disabled() check which also breaks RT.
>

lockdep_assert_held() cannot detect the irq state, it can only detect
whether we have held the lock.

Because you are trying to delete the irq-save code which will make the
comment hard to understand, I'm just giving an advise. It's welcome
if you have a better way.

Thanks.

>> and then delete the "TODO" comment.
>
> Sebastian
>
> .
>

Johannes Thumshirn May 24, 2018, 7:50 a.m. UTC | #8

On Thu, May 24, 2018 at 03:38:51PM +0800, Jason Yan wrote:
> 
> 
> On 2018/5/24 15:15, Sebastian Andrzej Siewior wrote:
> > On 2018-05-24 10:58:44 [+0800], Jason Yan wrote:
> > > I think it's fine to delete this irq save code. As for the "TODO"
> > > comment, I think we can add:
> > > 
> > > BUG_ON(!irqs_disabled());
> > > 
> > > or
> > > 
> > > WARN_ON_ONCE(!irqs_disabled());
> > 
> > no, please don't do this. Please add instead
> > 	lockdep_assert_held()
> > 
> > on the lock in question and let lockdep to its work. Lockdep has way
> > better coverage than your irqs_disabled() check which also breaks RT.
> > 
> 
> lockdep_assert_held() cannot detect the irq state, it can only detect
> whether we have held the lock.

I think Sebastian wanted to say lockdep_assert_irqs_disabled().
 
Either way, please never ever use BUG_ON() (even WARN_ON() is
questionable as some people actually use panic_on_oops). It causes
nasty bugs at customer sites.

Byte,
	Johannes

Sebastian Andrzej Siewior May 24, 2018, 7:52 a.m. UTC | #9

On 2018-05-24 15:38:51 [+0800], Jason Yan wrote:
> > no, please don't do this. Please add instead
> > 	lockdep_assert_held()
> > 
> > on the lock in question and let lockdep to its work. Lockdep has way
> > better coverage than your irqs_disabled() check which also breaks RT.
> > 
> 
> lockdep_assert_held() cannot detect the irq state, it can only detect
> whether we have held the lock.

yes, correct. But if the lock is acquired with interrupts enabled or the
interrupts are enabled while the lock is held then lockdep will notify
you. So you ensure that the lock is held at this point then and lockdep
will do the validation (however, no need to do so in sas_ata_qc_issue()
because lockdep do the evaluation during unlock/lock).

> Because you are trying to delete the irq-save code which will make the
> comment hard to understand, I'm just giving an advise. It's welcome
> if you have a better way.

I am removing local_irq_save() prior an unlock of a lock which has to be
taken with disabled interrupts. Therefore the interrupts are disabling
as part of the locking procedure and the additional disabling is a nop.

What comment are you talking about? sas_ata_qc_issue() has no comments
except for "If the device fell …".

> Thanks.
> 
Sebastian

Sebastian Andrzej Siewior May 24, 2018, 7:54 a.m. UTC | #10

On 2018-05-24 09:50:07 [+0200], Johannes Thumshirn wrote:
> > lockdep_assert_held() cannot detect the irq state, it can only detect
> > whether we have held the lock.
> 
> I think Sebastian wanted to say lockdep_assert_irqs_disabled().

nope, meant the right thing.

> Byte,
> 	Johannes

Sebastian

Jason Yan May 24, 2018, 7:58 a.m. UTC | #11

On 2018/5/24 15:50, Johannes Thumshirn wrote:
> On Thu, May 24, 2018 at 03:38:51PM +0800, Jason Yan wrote:
>>
>>
>> On 2018/5/24 15:15, Sebastian Andrzej Siewior wrote:
>>> On 2018-05-24 10:58:44 [+0800], Jason Yan wrote:
>>>> I think it's fine to delete this irq save code. As for the "TODO"
>>>> comment, I think we can add:
>>>>
>>>> BUG_ON(!irqs_disabled());
>>>>
>>>> or
>>>>
>>>> WARN_ON_ONCE(!irqs_disabled());
>>>
>>> no, please don't do this. Please add instead
>>> 	lockdep_assert_held()
>>>
>>> on the lock in question and let lockdep to its work. Lockdep has way
>>> better coverage than your irqs_disabled() check which also breaks RT.
>>>
>>
>> lockdep_assert_held() cannot detect the irq state, it can only detect
>> whether we have held the lock.
>
> I think Sebastian wanted to say lockdep_assert_irqs_disabled().
>

OK, good idea.

> Either way, please never ever use BUG_ON() (even WARN_ON() is
> questionable as some people actually use panic_on_oops). It causes
> nasty bugs at customer sites.
>

It's true that BUG_ON() is not recommended. But WARN_ON_ONCE() is
a possible way to replace it, panic_on_oops have no effect on it.
Actually lockdep_assert_irqs_disabled() uses WARN_ONCE() too.

> Byte,
> 	Johannes
>

Jason Yan May 24, 2018, 8 a.m. UTC | #12

On 2018/5/24 15:52, Sebastian Andrzej Siewior wrote:
> On 2018-05-24 15:38:51 [+0800], Jason Yan wrote:
>>> no, please don't do this. Please add instead
>>> 	lockdep_assert_held()
>>>
>>> on the lock in question and let lockdep to its work. Lockdep has way
>>> better coverage than your irqs_disabled() check which also breaks RT.
>>>
>>
>> lockdep_assert_held() cannot detect the irq state, it can only detect
>> whether we have held the lock.
>
> yes, correct. But if the lock is acquired with interrupts enabled or the
> interrupts are enabled while the lock is held then lockdep will notify
> you. So you ensure that the lock is held at this point then and lockdep
> will do the validation (however, no need to do so in sas_ata_qc_issue()
> because lockdep do the evaluation during unlock/lock).
>
>> Because you are trying to delete the irq-save code which will make the
>> comment hard to understand, I'm just giving an advise. It's welcome
>> if you have a better way.
>
> I am removing local_irq_save() prior an unlock of a lock which has to be
> taken with disabled interrupts. Therefore the interrupts are disabling
> as part of the locking procedure and the additional disabling is a nop.
>
> What comment are you talking about? sas_ata_qc_issue() has no comments
> except for "If the device fell …".
>

I mean the "TODO" comment above the local_irq_save().

>> Thanks.
>>
> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 24, 2018, 8:18 a.m. UTC | #13

On 2018-05-24 16:00:47 [+0800], Jason Yan wrote:
> > > Because you are trying to delete the irq-save code which will make the
> > > comment hard to understand, I'm just giving an advise. It's welcome
> > > if you have a better way.
> > 
> > I am removing local_irq_save() prior an unlock of a lock which has to be
> > taken with disabled interrupts. Therefore the interrupts are disabling
> > as part of the locking procedure and the additional disabling is a nop.
> > 
> > What comment are you talking about? sas_ata_qc_issue() has no comments
> > except for "If the device fell …".
> > 
> 
> I mean the "TODO" comment above the local_irq_save().
this?
/*
 * The local_irq_*() APIs are equal to the raw_local_irq*()
 * if !TRACE_IRQFLAGS.
 */

Sebastian

Jason Yan May 24, 2018, 8:20 a.m. UTC | #14

On 2018/5/24 16:18, Sebastian Andrzej Siewior wrote:
> On 2018-05-24 16:00:47 [+0800], Jason Yan wrote:
>>>> Because you are trying to delete the irq-save code which will make the
>>>> comment hard to understand, I'm just giving an advise. It's welcome
>>>> if you have a better way.
>>>
>>> I am removing local_irq_save() prior an unlock of a lock which has to be
>>> taken with disabled interrupts. Therefore the interrupts are disabling
>>> as part of the locking procedure and the additional disabling is a nop.
>>>
>>> What comment are you talking about? sas_ata_qc_issue() has no comments
>>> except for "If the device fell …".
>>>
>>
>> I mean the "TODO" comment above the local_irq_save().
> this?
> /*
>   * The local_irq_*() APIs are equal to the raw_local_irq*()
>   * if !TRACE_IRQFLAGS.
>   */
>

NO, this:

	/* TODO: audit callers to ensure they are ready for qc_issue to
	 * unconditionally re-enable interrupts
	 */
	local_irq_save(flags);
	spin_unlock(ap->lock);

> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 24, 2018, 8:27 a.m. UTC | #15

On 2018-05-24 16:20:33 [+0800], Jason Yan wrote:
> NO, this:
> 
> 	/* TODO: audit callers to ensure they are ready for qc_issue to
> 	 * unconditionally re-enable interrupts
> 	 */
> 	local_irq_save(flags);
> 	spin_unlock(ap->lock);

indeed, I have no idea how I could have overseen this. At least as of
today, the interrupts are never "re-enabled" in this function. I
*assumed* that the intention was to audit the code for this

 	spin_unlock_irq(ap->lock);

change instead. But if this is or was never intended than I could indeed
remove the TODO comment.

Sebastian

John Garry May 30, 2018, 9:34 a.m. UTC | #16

On 23/05/2018 10:24, Sebastian Andrzej Siewior wrote:
> On 2018-05-23 09:46:30 [+0100], John Garry wrote:
>> On 22/05/2018 18:31, Sebastian Andrzej Siewior wrote:
>>> On 2018-05-18 14:31:27 [+0100], John Garry wrote:
>>>> On 04/05/2018 15:50, Sebastian Andrzej Siewior wrote:
>>>>> Since commit 312d3e56119a ("[SCSI] libsas: remove ata_port.lock
>>>>> management duties from lldds") the sas_ata_qc_issue() function unlocks
>>>>> the ata_port.lock and disables interrupts before doing so.
>>>>> That lock is always taken with disabled interrupts so at this point, the
>>>>> interrupts are already disabled. There is no need to disable the
>>>>> interrupts before the unlock operation because they are already
>>>>> disabled.
>>>>
>>>> Is the only caller ata_qc_issue(), which has spin_lock_irqsave(host lock)
>>>> enabled?
>>>
>>> I don't understand the question.
>>
>> It seems to me that ap->lock can be taken in interrupt context, so then we
>> know it should be locked with interrupts disabled always.
>
> yes, the comment above the function says so, too.
>
>> I was just really asking how you know interrupts are disabled already? Maybe
>> it's obvious.
>
> The lock has to be taken with interrupts disabled. If the interrupts
> were enabled at the time sas_ata_qc_issue() was invoked then the lock
> was already taken in a bad way and disabling interrupts before the
> unlock does not make it any better.
> To verify that the locking is okay you can build the kernel with lockdep
> enabled and CONFIG_PROVE_LOCKING=y set. That way lockdep will inform you
> as soon as it notices the lock taken in interrupt context and in
> process-context with enabled interrupts.
>
>> cheers,
>

Sorry, but personally I don't see much value in this change. I think 
it's better for safety to be consistent in how we lock & unlock the 
spinlock, i.e. use irqsave variant (or similar).

And even if we have audited the callers to know the irqstate when this 
function is called, since sas_ata_qc_issue() is a callback for 
ata_port_operations.qc_issue, then it should be documented that 
interrupts should always be disabled for this callback.

All the best,
John

> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 30, 2018, 10:16 a.m. UTC | #17

On 2018-05-30 10:34:12 [+0100], John Garry wrote:
> Sorry, but personally I don't see much value in this change. I think it's
> better for safety to be consistent in how we lock & unlock the spinlock,
> i.e. use irqsave variant (or similar).

The lock should do irqsave() and unlock irqrestore(). This
local_irqsave() + unlock() is not correct.

> And even if we have audited the callers to know the irqstate when this
> function is called, since sas_ata_qc_issue() is a callback for
> ata_port_operations.qc_issue, then it should be documented that interrupts
> should always be disabled for this callback.

If the lock has been acquired without disabling the interrupts then it
was already wrong. This is just duct tape and does not fix the
problematic caller. Lockdep will yell.

> All the best,
> John

Sebastian

John Garry May 30, 2018, 11:16 a.m. UTC | #18

On 30/05/2018 11:16, Sebastian Andrzej Siewior wrote:
> On 2018-05-30 10:34:12 [+0100], John Garry wrote:
>> Sorry, but personally I don't see much value in this change. I think it's
>> better for safety to be consistent in how we lock & unlock the spinlock,
>> i.e. use irqsave variant (or similar).
>
> The lock should do irqsave() and unlock irqrestore(). This
> local_irqsave() + unlock() is not correct.
>

Aren't they the same, i.e. local_irq_save()+spin_lock() = 
spin_lock_irqsave()? Both give state of lock held, interrupts and 
preemption disabled.

>> And even if we have audited the callers to know the irqstate when this
>> function is called, since sas_ata_qc_issue() is a callback for
>> ata_port_operations.qc_issue, then it should be documented that interrupts
>> should always be disabled for this callback.
>
> If the lock has been acquired without disabling the interrupts then it
> was already wrong. This is just duct tape and does not fix the
> problematic caller. Lockdep will yell.
>

John

>
> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 30, 2018, 11:22 a.m. UTC | #19

On 2018-05-30 12:16:23 [+0100], John Garry wrote:
> On 30/05/2018 11:16, Sebastian Andrzej Siewior wrote:
> > On 2018-05-30 10:34:12 [+0100], John Garry wrote:
> > > Sorry, but personally I don't see much value in this change. I think it's
> > > better for safety to be consistent in how we lock & unlock the spinlock,
> > > i.e. use irqsave variant (or similar).
> > 
> > The lock should do irqsave() and unlock irqrestore(). This
> > local_irqsave() + unlock() is not correct.
> > 
> 
> Aren't they the same, i.e. local_irq_save()+spin_lock() =
> spin_lock_irqsave()? Both give state of lock held, interrupts and preemption
> disabled.

Yes, there are the same on vanilla and not on RT. However my point is
that the code does this instead:
	local_irq_save();
	spin_unlock();

and this is wrong. There is no spin_unlock_irqsave().

> 
> John
> 
Sebastian

John Garry May 30, 2018, 1:37 p.m. UTC | #20

On 30/05/2018 12:22, Sebastian Andrzej Siewior wrote:
> On 2018-05-30 12:16:23 [+0100], John Garry wrote:
>> On 30/05/2018 11:16, Sebastian Andrzej Siewior wrote:
>>> On 2018-05-30 10:34:12 [+0100], John Garry wrote:
>>>> Sorry, but personally I don't see much value in this change. I think it's
>>>> better for safety to be consistent in how we lock & unlock the spinlock,
>>>> i.e. use irqsave variant (or similar).
>>>
>>> The lock should do irqsave() and unlock irqrestore(). This
>>> local_irqsave() + unlock() is not correct.
>>>
>>
>> Aren't they the same, i.e. local_irq_save()+spin_lock() =
>> spin_lock_irqsave()? Both give state of lock held, interrupts and preemption
>> disabled.
>
> Yes, there are the same on vanilla and not on RT. However my point is
> that the code does this instead:
> 	local_irq_save();
> 	spin_unlock();

Ah, I just noticed that this is spin_unlock().

So about the "TODO", which you mention "I *assumed* that the intention 
was to audit the code for this

  	spin_unlock_irq(ap->lock);

change instead. But if this is or was never intended than I could indeed
remove the TODO comment."

As I see, we're dropping the lock but maintaining the irq posture for 
holding that lock (disabled), which seems inefficient.

John

Sebastian Andrzej Siewior May 30, 2018, 1:45 p.m. UTC | #21

On 2018-05-30 14:37:50 [+0100], John Garry wrote:
> On 30/05/2018 12:22, Sebastian Andrzej Siewior wrote:
> > Yes, there are the same on vanilla and not on RT. However my point is
> > that the code does this instead:
> > 	local_irq_save();
> > 	spin_unlock();
> 
> Ah, I just noticed that this is spin_unlock().
> 
> So about the "TODO", which you mention "I *assumed* that the intention was
> to audit the code for this
> 
>  	spin_unlock_irq(ap->lock);
> 
> change instead. But if this is or was never intended than I could indeed
> remove the TODO comment."
> 
> As I see, we're dropping the lock but maintaining the irq posture for
> holding that lock (disabled), which seems inefficient.

excellent. So no more objections from your side or is this a complaint I
didn't fully decode?

> John
> 
Sebastian

John Garry May 30, 2018, 2:08 p.m. UTC | #22

On 30/05/2018 14:45, Sebastian Andrzej Siewior wrote:
> On 2018-05-30 14:37:50 [+0100], John Garry wrote:
>> On 30/05/2018 12:22, Sebastian Andrzej Siewior wrote:
>>> Yes, there are the same on vanilla and not on RT. However my point is
>>> that the code does this instead:
>>> 	local_irq_save();
>>> 	spin_unlock();
>>
>> Ah, I just noticed that this is spin_unlock().
>>
>> So about the "TODO", which you mention "I *assumed* that the intention was
>> to audit the code for this
>>
>>  	spin_unlock_irq(ap->lock);
>>
>> change instead. But if this is or was never intended than I could indeed
>> remove the TODO comment."
>>
>> As I see, we're dropping the lock but maintaining the irq posture for
>> holding that lock (disabled), which seems inefficient.
>
> excellent. So no more objections from your side or is this a complaint I
> didn't fully decode?

I think the original code is not great since we're dropping the lock 
but maintaining the irq posture as disabled.

Do you plan to add a lockdep_assert_irqs_disabled() check in addition? 
It's not needed if we work on the basis that the lock is always taken 
with irqs disabled. And is this lockdep function even intended to be 
used outside core kernel code?

Personally I think that solving the issue in the original code would be 
better, i.e. keeping irqs disabled needlessly. Or else the "TODO" can 
stay (or be amended to improve clarity).

John

>
>> John
>>
> Sebastian
>
> .
>

Sebastian Andrzej Siewior May 30, 2018, 2:28 p.m. UTC | #23

On 2018-05-30 15:08:37 [+0100], John Garry wrote:
> On 30/05/2018 14:45, Sebastian Andrzej Siewior wrote:
> > excellent. So no more objections from your side or is this a complaint I
> > didn't fully decode?
> 
> I think the original code is not great since we're dropping the lock but
> maintaining the irq posture as disabled.

the patch does not change the locking as it exists today.

> Do you plan to add a lockdep_assert_irqs_disabled() check in addition? It's
> not needed if we work on the basis that the lock is always taken with irqs
> disabled. And is this lockdep function even intended to be used outside core
> kernel code?

I don't plan to add lockdep_assert_irqs_disabled(). You should only need
to use lockdep_assert_held() to verify that the lock is held as
expected. Whether or not the interrupts need to be disabled is best
verified with lockdep. Lockdep would even complain here if you intend to
invoke spin_unlock() on a lock which was not acquired earlier (I'm just
trying to say that lockdep_assert_held() is not required here either).

> Personally I think that solving the issue in the original code would be
> better, i.e. keeping irqs disabled needlessly. Or else the "TODO" can stay
> (or be amended to improve clarity).

I would like to get rid of that local_irq_save() here because it breaks
-RT and is not the correct thing to do for !RT (but it does not break
things).
I posted v1 with just the removal of the local_irq_sav() and v2 with
TODO removed as suggested here in the thread. The interrupts still
remain disabled.
So based on that it seems that you are in favour of the initial v1.

> John

Sebastian

John Garry May 30, 2018, 2:54 p.m. UTC | #24

On 30/05/2018 15:28, Sebastian Andrzej Siewior wrote:
> On 2018-05-30 15:08:37 [+0100], John Garry wrote:
>> On 30/05/2018 14:45, Sebastian Andrzej Siewior wrote:
>>> excellent. So no more objections from your side or is this a complaint I
>>> didn't fully decode?
>>
>> I think the original code is not great since we're dropping the lock but
>> maintaining the irq posture as disabled.
>
> the patch does not change the locking as it exists today.

Right

>
>> Do you plan to add a lockdep_assert_irqs_disabled() check in addition? It's
>> not needed if we work on the basis that the lock is always taken with irqs
>> disabled. And is this lockdep function even intended to be used outside core
>> kernel code?
>
> I don't plan to add lockdep_assert_irqs_disabled(). You should only need
> to use lockdep_assert_held() to verify that the lock is held as
> expected. Whether or not the interrupts need to be disabled is best
> verified with lockdep. Lockdep would even complain here if you intend to
> invoke spin_unlock() on a lock which was not acquired earlier (I'm just
> trying to say that lockdep_assert_held() is not required here either).
>
>> Personally I think that solving the issue in the original code would be
>> better, i.e. keeping irqs disabled needlessly. Or else the "TODO" can stay
>> (or be amended to improve clarity).
>
> I would like to get rid of that local_irq_save() here because it breaks
> -RT and is not the correct thing to do for !RT (but it does not break
> things).
> I posted v1 with just the removal of the local_irq_sav() and v2 with
> TODO removed as suggested here in the thread. The interrupts still
> remain disabled.
> So based on that it seems that you are in favour of the initial v1.
>

In fact, if this functional change was accepted then the "TODO" would 
need to be updated for clarity since it would be less clear in meaning 
when the local_irq_save() function is removed.

However, since !RT is not broken, then I am not sure if we accept 
changes to fix the RT kernel only. You will need to defer to the James 
and Martin for that.

Thanks,
John

>> John
>
> Sebastian
>
> .
>

[1/2] libsas: remove irq save in sas_ata_qc_issue()

Commit Message

Comments

Patch