diff mbox series

[12/28] lustre: ptlrpc: do not wakeup every second

Message ID 1539543498-29105-13-git-send-email-jsimmons@infradead.org (mailing list archive)
State New, archived
Headers show
Series lustre: more assorted fixes for lustre 2.10 | expand

Commit Message

James Simmons Oct. 14, 2018, 6:58 p.m. UTC
From: Alex Zhuravlev <bzzz@whamcloud.com>

Even if there are no RPC requests on the set, there is no need to
wake up every second. The thread is woken up when a request is added
to the set or when the STOP bit is set, so it is sufficient to only
wake up when there are requests on the set to worry about.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
WC-bug-id: https://jira.whamcloud.com/browse/LU-9660
Reviewed-on: https://review.whamcloud.com/28776
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

NeilBrown Oct. 29, 2018, 12:03 a.m. UTC | #1
On Sun, Oct 14 2018, James Simmons wrote:

> From: Alex Zhuravlev <bzzz@whamcloud.com>
>
> Even if there are no RPC requests on the set, there is no need to
> wake up every second. The thread is woken up when a request is added
> to the set or when the STOP bit is set, so it is sufficient to only
> wake up when there are requests on the set to worry about.
>
> Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
> WC-bug-id: https://jira.whamcloud.com/browse/LU-9660
> Reviewed-on: https://review.whamcloud.com/28776
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> Reviewed-by: Patrick Farrell <paf@cray.com>
> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> index c201a88..5b4977b 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> @@ -371,7 +371,7 @@ static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
>  		}
>  	}
>  
> -	return rc;
> +	return rc || test_bit(LIOD_STOP, &pc->pc_flags);
>  }
>  
>  /**
> @@ -441,7 +441,7 @@ static int ptlrpcd(void *arg)
>  		lu_context_enter(env.le_ses);
>  		if (wait_event_idle_timeout(set->set_waitq,
>  					    ptlrpcd_check(&env, pc),
> -					    (timeout ? timeout : 1) * HZ) == 0)
> +					    timeout * HZ) == 0)
>  			ptlrpc_expired_set(set);

This is incorrect.
A timeout of zero means the timeout happens after zero jiffies
(immediately), it doesn't mean there is no timeout.
If we want a "timeout" of zero to mean "Wait forever", we need something
like:

  wait_event_idle_timeout(.....,
                          timeout ? (timeout * HZ) : MAX_SCHEDULE_TIMEOUT) == 0

I've updated the patch accordingly.

Thanks,
NeilBrown

>  
>  		lu_context_exit(&env.le_ctx);
> -- 
> 1.8.3.1
Patrick Farrell Oct. 29, 2018, 1:35 a.m. UTC | #2
Neil,

Does your statement imply this would spin?  It definitely doesn’t just spin (that behavior in a main “wait for work” spot of a (depending on settings) ~per-CPU daemon would render systems unusable and this patch has been in testing for a while).  So what is the detailed behavior of a “timeout that expires immediately”?

- Patrick
NeilBrown Oct. 29, 2018, 2:41 a.m. UTC | #3
On Mon, Oct 29 2018, Patrick Farrell wrote:

> Neil,
>
> Does your statement imply this would spin?  It definitely doesn’t just
> spin (that behavior in a main “wait for work” spot of a (depending on
> settings) ~per-CPU daemon would render systems unusable and this patch
> has been in testing for a while).  So what is the detailed behavior of
> a “timeout that expires immediately”?

Hi Patrick,
 it definitely spins for me.

 I should have clarified that the SFS patch

   e81847bd0651 LU-9660 ptlrpc: do not wakeup every second

 is correct, as __l_wait_event() treats a timeout value of 0 as meaning an
 indefinite timeout.
 The error was in the conversion to wait_event_idle_timeout().  The
 various wait_event*timeout() functions treat 0 as 1 less than 1.
 If you want to not have a timeout, you need to not use the *_timeout()
 version.
 If a timeout is undesirable rather than fatal, then
 MAX_SCHEDULE_TIMEOUT can be used.  In this case, that seemed best.

Thanks,
NeilBrown


>
> - Patrick
>
>
> ________________________________
> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> Sent: Sunday, October 28, 2018 7:03:02 PM
> To: James Simmons; Andreas Dilger; Oleg Drokin
> Cc: Lustre Development List
> Subject: Re: [lustre-devel] [PATCH 12/28] lustre: ptlrpc: do not wakeup every second
>
> On Sun, Oct 14 2018, James Simmons wrote:
>
>> From: Alex Zhuravlev <bzzz@whamcloud.com>
>>
>> Even if there are no RPC requests on the set, there is no need to
>> wake up every second. The thread is woken up when a request is added
>> to the set or when the STOP bit is set, so it is sufficient to only
>> wake up when there are requests on the set to worry about.
>>
>> Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
>> WC-bug-id: https://jira.whamcloud.com/browse/LU-9660
>> Reviewed-on: https://review.whamcloud.com/28776
>> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
>> Reviewed-by: Patrick Farrell <paf@cray.com>
>> Reviewed-by: Oleg Drokin <green@whamcloud.com>
>> Signed-off-by: James Simmons <jsimmons@infradead.org>
>> ---
>>  drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
>> index c201a88..5b4977b 100644
>> --- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
>> +++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
>> @@ -371,7 +371,7 @@ static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
>>                }
>>        }
>>
>> -     return rc;
>> +     return rc || test_bit(LIOD_STOP, &pc->pc_flags);
>>  }
>>
>>  /**
>> @@ -441,7 +441,7 @@ static int ptlrpcd(void *arg)
>>                lu_context_enter(env.le_ses);
>>                if (wait_event_idle_timeout(set->set_waitq,
>>                                            ptlrpcd_check(&env, pc),
>> -                                         (timeout ? timeout : 1) * HZ) == 0)
>> +                                         timeout * HZ) == 0)
>>                        ptlrpc_expired_set(set);
>
> This is incorrect.
> A timeout of zero means the timeout happens after zero jiffies
> (immediately), it doesn't mean there is no timeout.
> If we want a "timeout" of zero to mean "Wait forever", we need something
> like:
>
>   wait_event_idle_timeout(.....,
>                           timeout ? (timeout * HZ) : MAX_SCHEDULE_TIMEOUT) == 0
>
> I've updated the patch accordingly.
>
> Thanks,
> NeilBrown
>
>>
>>                lu_context_exit(&env.le_ctx);
>> --
>> 1.8.3.1
James Simmons Oct. 29, 2018, 3:42 a.m. UTC | #4
> > Neil,
> >
> > Does your statement imply this would spin?  It definitely doesn’t just
> > spin (that behavior in a main “wait for work” spot of a (depending on
> > settings) ~per-CPU daemon would render systems unusable and this patch
> > has been in testing for a while).  So what is the detailed behavior of
> > a “timeout that expires immediately”?
> 
> Hi Patrick,
>  it definitely spins for me.

Ah that is where the high cpu load is coming from.
 
>  I should have clarified that the SFS patch
> 
>    e81847bd0651 LU-9660 ptlrpc: do not wakeup every second
> 
>  is correct, as __l_wait_event() treats a timeout value of 0 as meaning an
>  indefinite timeout.
>  The error was in the conversion to wait_event_idle_timeout().  The
>  various wait_event*timeout() functions treat 0 as 1 less than 1.
>  If you want to not have a timeout, you need to not use the *_timeout()
>  version.
>  If a timeout is undesirable rather than fatal, then
>  MAX_SCHEDULE_TIMEOUT can be used.  In this case, that seemed best.

I missed that in the conversion. Now that I know that the mapping I will
do the future server code port correctly ;-)

> > - Patrick
> >
> >
> > ________________________________
> > From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> > Sent: Sunday, October 28, 2018 7:03:02 PM
> > To: James Simmons; Andreas Dilger; Oleg Drokin
> > Cc: Lustre Development List
> > Subject: Re: [lustre-devel] [PATCH 12/28] lustre: ptlrpc: do not wakeup every second
> >
> > On Sun, Oct 14 2018, James Simmons wrote:
> >
> >> From: Alex Zhuravlev <bzzz@whamcloud.com>
> >>
> >> Even if there are no RPC requests on the set, there is no need to
> >> wake up every second. The thread is woken up when a request is added
> >> to the set or when the STOP bit is set, so it is sufficient to only
> >> wake up when there are requests on the set to worry about.
> >>
> >> Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
> >> WC-bug-id: https://jira.whamcloud.com/browse/LU-9660
> >> Reviewed-on: https://review.whamcloud.com/28776
> >> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> >> Reviewed-by: Patrick Farrell <paf@cray.com>
> >> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> >> Signed-off-by: James Simmons <jsimmons@infradead.org>
> >> ---
> >>  drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> index c201a88..5b4977b 100644
> >> --- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> +++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> @@ -371,7 +371,7 @@ static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
> >>                }
> >>        }
> >>
> >> -     return rc;
> >> +     return rc || test_bit(LIOD_STOP, &pc->pc_flags);
> >>  }
> >>
> >>  /**
> >> @@ -441,7 +441,7 @@ static int ptlrpcd(void *arg)
> >>                lu_context_enter(env.le_ses);
> >>                if (wait_event_idle_timeout(set->set_waitq,
> >>                                            ptlrpcd_check(&env, pc),
> >> -                                         (timeout ? timeout : 1) * HZ) == 0)
> >> +                                         timeout * HZ) == 0)
> >>                        ptlrpc_expired_set(set);
> >
> > This is incorrect.
> > A timeout of zero means the timeout happens after zero jiffies
> > (immediately), it doesn't mean there is no timeout.
> > If we want a "timeout" of zero to mean "Wait forever", we need something
> > like:
> >
> >   wait_event_idle_timeout(.....,
> >                           timeout ? (timeout * HZ) : MAX_SCHEDULE_TIMEOUT) == 0
> >
> > I've updated the patch accordingly.
> >
> > Thanks,
> > NeilBrown
> >
> >>
> >>                lu_context_exit(&env.le_ctx);
> >> --
> >> 1.8.3.1
>
Patrick Farrell Oct. 29, 2018, 2:17 p.m. UTC | #5
Ah, OK!

Thanks, Neil.  That makes sense.  I just knew there was no way that had been missed in the WC branch - Just as I'm sure you two noticed it immediately in your testing.


Also useful to know about the timeout functions and 0.


Regards,

Patrick
James Simmons Nov. 4, 2018, 8:53 p.m. UTC | #6
> > Neil,
> >
> > Does your statement imply this would spin?  It definitely doesn’t just
> > spin (that behavior in a main “wait for work” spot of a (depending on
> > settings) ~per-CPU daemon would render systems unusable and this patch
> > has been in testing for a while).  So what is the detailed behavior of
> > a “timeout that expires immediately”?
> 
> Hi Patrick,
>  it definitely spins for me.
> 
>  I should have clarified that the SFS patch
> 
>    e81847bd0651 LU-9660 ptlrpc: do not wakeup every second
> 
>  is correct, as __l_wait_event() treats a timeout value of 0 as meaning an
>  indefinite timeout.
>  The error was in the conversion to wait_event_idle_timeout().  The
>  various wait_event*timeout() functions treat 0 as 1 less than 1.
>  If you want to not have a timeout, you need to not use the *_timeout()
>  version.
>  If a timeout is undesirable rather than fatal, then
>  MAX_SCHEDULE_TIMEOUT can be used.  In this case, that seemed best.
> 
> Thanks,
> NeilBrown
> 
> 
> >
> > - Patrick
> >
> >
> > ________________________________
> > From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> > Sent: Sunday, October 28, 2018 7:03:02 PM
> > To: James Simmons; Andreas Dilger; Oleg Drokin
> > Cc: Lustre Development List
> > Subject: Re: [lustre-devel] [PATCH 12/28] lustre: ptlrpc: do not wakeup every second
> >
> > On Sun, Oct 14 2018, James Simmons wrote:
> >
> >> From: Alex Zhuravlev <bzzz@whamcloud.com>
> >>
> >> Even if there are no RPC requests on the set, there is no need to
> >> wake up every second. The thread is woken up when a request is added
> >> to the set or when the STOP bit is set, so it is sufficient to only
> >> wake up when there are requests on the set to worry about.
> >>
> >> Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
> >> WC-bug-id: https://jira.whamcloud.com/browse/LU-9660
> >> Reviewed-on: https://review.whamcloud.com/28776
> >> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> >> Reviewed-by: Patrick Farrell <paf@cray.com>
> >> Reviewed-by: Oleg Drokin <green@whamcloud.com>
> >> Signed-off-by: James Simmons <jsimmons@infradead.org>
> >> ---
> >>  drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> index c201a88..5b4977b 100644
> >> --- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> +++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
> >> @@ -371,7 +371,7 @@ static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
> >>                }
> >>        }
> >>
> >> -     return rc;
> >> +     return rc || test_bit(LIOD_STOP, &pc->pc_flags);
> >>  }
> >>
> >>  /**
> >> @@ -441,7 +441,7 @@ static int ptlrpcd(void *arg)
> >>                lu_context_enter(env.le_ses);
> >>                if (wait_event_idle_timeout(set->set_waitq,
> >>                                            ptlrpcd_check(&env, pc),
> >> -                                         (timeout ? timeout : 1) * HZ) == 0)
> >> +                                         timeout * HZ) == 0)
> >>                        ptlrpc_expired_set(set);
> >
> > This is incorrect.
> > A timeout of zero means the timeout happens after zero jiffies
> > (immediately), it doesn't mean there is no timeout.
> > If we want a "timeout" of zero to mean "Wait forever", we need something
> > like:
> >
> >   wait_event_idle_timeout(.....,
> >                           timeout ? (timeout * HZ) : MAX_SCHEDULE_TIMEOUT) == 0
> >
> > I've updated the patch accordingly.

I did that change locally as well and my CPU load problem went away. 
Thanks for figuring it out. Will be more careful in the future.
diff mbox series

Patch

diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index c201a88..5b4977b 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -371,7 +371,7 @@  static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
 		}
 	}
 
-	return rc;
+	return rc || test_bit(LIOD_STOP, &pc->pc_flags);
 }
 
 /**
@@ -441,7 +441,7 @@  static int ptlrpcd(void *arg)
 		lu_context_enter(env.le_ses);
 		if (wait_event_idle_timeout(set->set_waitq,
 					    ptlrpcd_check(&env, pc),
-					    (timeout ? timeout : 1) * HZ) == 0)
+					    timeout * HZ) == 0)
 			ptlrpc_expired_set(set);
 
 		lu_context_exit(&env.le_ctx);