mm/swap: piggyback lru_add_drain_all() calls

Message ID	157018386639.6110.3058050375244904201.stgit@buzz (mailing list archive)
State	New, archived
Headers
  Subject: [PATCH] mm/swap: piggyback lru_add_drain_all() calls
  From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
  To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
  Cc: linux-kernel@vger.kernel.org
  Date: Fri, 04 Oct 2019 13:11:06 +0300

Konstantin Khlebnikov Oct. 4, 2019, 10:11 a.m. UTC

This is a very slow operation. There is no reason to do it again if somebody
else has already drained all per-cpu vectors while we waited for the lock.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 mm/swap.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Matthew Wilcox Oct. 4, 2019, 12:10 p.m. UTC | #1

On Fri, Oct 04, 2019 at 01:11:06PM +0300, Konstantin Khlebnikov wrote:
> This is a very slow operation. There is no reason to do it again if somebody
> else has already drained all per-cpu vectors while we waited for the lock.
> +	seq = raw_read_seqcount_latch(&seqcount);
> +
>  	mutex_lock(&lock);
> +
> +	/* Piggyback on drain done by somebody else. */
> +	if (__read_seqcount_retry(&seqcount, seq))
> +		goto done;
> +
> +	raw_write_seqcount_latch(&seqcount);
> +

Do we really need the seqcount to do this?  Wouldn't a mutex_trylock()
have the same effect?

Konstantin Khlebnikov Oct. 4, 2019, 12:24 p.m. UTC | #2

On 04/10/2019 15.10, Matthew Wilcox wrote:
> On Fri, Oct 04, 2019 at 01:11:06PM +0300, Konstantin Khlebnikov wrote:
>> This is a very slow operation. There is no reason to do it again if somebody
>> else has already drained all per-cpu vectors while we waited for the lock.
>> +	seq = raw_read_seqcount_latch(&seqcount);
>> +
>>   	mutex_lock(&lock);
>> +
>> +	/* Piggyback on drain done by somebody else. */
>> +	if (__read_seqcount_retry(&seqcount, seq))
>> +		goto done;
>> +
>> +	raw_write_seqcount_latch(&seqcount);
>> +
> 
> Do we really need the seqcount to do this?  Wouldn't a mutex_trylock()
> have the same effect?
> 

No, these are completely different semantics.

The operation can be safely skipped only if somebody else started and
finished a drain after the current task called this function.
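
The rule above can be sketched in userspace C. This is only an illustration under stated assumptions: a plain generation counter stands in for the seqcount in the patch, a pthread mutex stands in for the kernel mutex, and every name here (drain_enter, drain_finish, drain_gen, slow_drain) is hypothetical rather than taken from mm/swap.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
static pthread_mutex_t drain_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_uint drain_gen;     /* bumped, under the lock, when a drain starts */
static unsigned int drains_done;  /* counts full drains; protected by drain_lock */

/* Stand-in for the expensive per-CPU drain. */
static void slow_drain(void)
{
	drains_done++;
}

/* Step 1: record the generation on entry, before waiting for the lock. */
static unsigned int drain_enter(void)
{
	return atomic_load(&drain_gen);
}

/*
 * Step 2: take the lock, then either piggyback or drain.
 * Returns true if this caller ran the drain itself.
 */
static bool drain_finish(unsigned int seen)
{
	bool drained = false;

	pthread_mutex_lock(&drain_lock);
	/*
	 * A changed generation means another drain started after
	 * drain_enter() and, because we now hold the lock, has finished.
	 */
	if (atomic_load(&drain_gen) != seen)
		goto done;

	atomic_fetch_add(&drain_gen, 1);
	slow_drain();
	drained = true;
done:
	pthread_mutex_unlock(&drain_lock);
	return drained;
}
```

If two callers both run drain_enter() before either takes the lock, the first drain_finish() performs the drain and the second returns false without draining. A caller that enters after a drain has already started sees the bumped generation on entry, so it does not skip and runs its own drain.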

Michal Hocko Oct. 4, 2019, 12:27 p.m. UTC | #3

On Fri 04-10-19 05:10:17, Matthew Wilcox wrote:
> On Fri, Oct 04, 2019 at 01:11:06PM +0300, Konstantin Khlebnikov wrote:
> > This is a very slow operation. There is no reason to do it again if somebody
> > else has already drained all per-cpu vectors while we waited for the lock.
> > +	seq = raw_read_seqcount_latch(&seqcount);
> > +
> >  	mutex_lock(&lock);
> > +
> > +	/* Piggyback on drain done by somebody else. */
> > +	if (__read_seqcount_retry(&seqcount, seq))
> > +		goto done;
> > +
> > +	raw_write_seqcount_latch(&seqcount);
> > +
> 
> Do we really need the seqcount to do this?  Wouldn't a mutex_trylock()
> have the same effect?

Yeah, this makes sense. From a correctness point of view it should be ok
because no caller can expect that per-cpu pvecs are empty on return.
This might have some runtime effects in that some paths might retry more,
e.g. the offlining path drains pcp pvecs before migrating the range away; if
there are pages still waiting for a worker to drain them then the
migration would fail and we would retry. But this is not a correctness
issue.

Konstantin Khlebnikov Oct. 4, 2019, 12:32 p.m. UTC | #4

On 04/10/2019 15.27, Michal Hocko wrote:
> On Fri 04-10-19 05:10:17, Matthew Wilcox wrote:
>> On Fri, Oct 04, 2019 at 01:11:06PM +0300, Konstantin Khlebnikov wrote:
>>> This is a very slow operation. There is no reason to do it again if somebody
>>> else has already drained all per-cpu vectors while we waited for the lock.
>>> +	seq = raw_read_seqcount_latch(&seqcount);
>>> +
>>>   	mutex_lock(&lock);
>>> +
>>> +	/* Piggyback on drain done by somebody else. */
>>> +	if (__read_seqcount_retry(&seqcount, seq))
>>> +		goto done;
>>> +
>>> +	raw_write_seqcount_latch(&seqcount);
>>> +
>>
>> Do we really need the seqcount to do this?  Wouldn't a mutex_trylock()
>> have the same effect?
> 
> Yeah, this makes sense. From a correctness point of view it should be ok
> because no caller can expect that per-cpu pvecs are empty on return.
> This might have some runtime effects in that some paths might retry more,
> e.g. the offlining path drains pcp pvecs before migrating the range away; if
> there are pages still waiting for a worker to drain them then the
> migration would fail and we would retry. But this is not a correctness
> issue.
> 

A caller might expect that pages it added earlier have been drained.
Exiting after a failed mutex_trylock() will not guarantee that.

For example, POSIX_FADV_DONTNEED relies on that.
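
A toy model of the failure mode described above, as a sketch only: a single simulated per-CPU vector is reduced to a counter, and a C11 atomic_flag stands in for mutex_trylock(). The names add_page, drain_all_trylock, drain_busy and pending_pages are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; names are illustrative, not kernel APIs. */
static atomic_flag drain_busy = ATOMIC_FLAG_INIT;  /* stand-in for the mutex */
static int pending_pages;                          /* pages in the per-CPU vector */

/* A caller queues one page in the per-CPU vector. */
static void add_page(void)
{
	pending_pages++;
}

/* Try-lock variant: returns false without draining if a drain is in flight. */
static bool drain_all_trylock(void)
{
	if (atomic_flag_test_and_set(&drain_busy))
		return false;           /* somebody else is draining; skip */

	pending_pages = 0;              /* the actual drain */
	atomic_flag_clear(&drain_busy);
	return true;
}
```

If another task is mid-drain and has already processed this CPU when the caller adds its page, drain_all_trylock() returns false and pending_pages stays at 1. The in-flight drain started before the page was added, so nothing guarantees that the caller's page is flushed by the time the function returns.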

Michal Hocko Oct. 4, 2019, 12:37 p.m. UTC | #5

On Fri 04-10-19 15:32:01, Konstantin Khlebnikov wrote:
> 
> 
> On 04/10/2019 15.27, Michal Hocko wrote:
> > On Fri 04-10-19 05:10:17, Matthew Wilcox wrote:
> > > On Fri, Oct 04, 2019 at 01:11:06PM +0300, Konstantin Khlebnikov wrote:
> > > > This is a very slow operation. There is no reason to do it again if somebody
> > > > else has already drained all per-cpu vectors while we waited for the lock.
> > > > +	seq = raw_read_seqcount_latch(&seqcount);
> > > > +
> > > >   	mutex_lock(&lock);
> > > > +
> > > > +	/* Piggyback on drain done by somebody else. */
> > > > +	if (__read_seqcount_retry(&seqcount, seq))
> > > > +		goto done;
> > > > +
> > > > +	raw_write_seqcount_latch(&seqcount);
> > > > +
> > > 
> > > Do we really need the seqcount to do this?  Wouldn't a mutex_trylock()
> > > have the same effect?
> > 
> > Yeah, this makes sense. From a correctness point of view it should be ok
> > because no caller can expect that per-cpu pvecs are empty on return.
> > This might have some runtime effects in that some paths might retry more,
> > e.g. the offlining path drains pcp pvecs before migrating the range away; if
> > there are pages still waiting for a worker to drain them then the
> > migration would fail and we would retry. But this is not a correctness
> > issue.
> > 
> 
> A caller might expect that pages it added earlier have been drained.
> Exiting after a failed mutex_trylock() will not guarantee that.
> 
> For example, POSIX_FADV_DONTNEED relies on that.

OK, I was not aware of this case. Please make sure to document that in
the changelog; a comment in the code wouldn't hurt either. It would
certainly explain more than "Piggyback on drain done by somebody
else.".

Thanks!
