mm: Fixup the condition whether the page cache is free

Message ID CAFNq8R7tq9kvD9LyhZJ-Cj0kexQfDsPhB4iQYyZ9s9+8Jo82QA@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Li Haifeng March 4, 2013, 1:54 a.m. UTC
When a page cache page is about to be reclaimed, we have to decide
whether it is freeable.
IMO, the reference count that marks a page cache page as freeable
during page frame reclaim should be 3. The reasons are listed below.

When a page is allocated, page->_count is 1 (see code-1). When the
page is allocated for reading a file from an external disk,
page->_count is incremented by 1 via page_cache_get() in
add_to_page_cache_locked() (see code-2). When the page is about to be
reclaimed, isolating it from the LRU list also increments
page->_count (see code-3).

For these reasons, when a file page is freeable, page->_count should
be 3 instead of 2.

<code-1>
buffered_rmqueue -> prep_new_page -> set_page_refcounted:
/*
 * Turn a non-refcounted page (->_count == 0) into refcounted with
 * a count of one.
 */
static inline void set_page_refcounted(struct page *page)
{
        VM_BUG_ON(PageTail(page));
        VM_BUG_ON(atomic_read(&page->_count));
        set_page_count(page, 1);
}

<code-2>
do_generic_file_read -> add_to_page_cache_lru -> add_to_page_cache ->
add_to_page_cache_locked:
int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
                pgoff_t offset, gfp_t gfp_mask)
{
…
                page_cache_get(page);
                page->mapping = mapping;
                page->index = offset;

                spin_lock_irq(&mapping->tree_lock);
                error = radix_tree_insert(&mapping->page_tree, offset, page);
                if (likely(!error)) {
                        mapping->nrpages++;
                        __inc_zone_page_state(page, NR_FILE_PAGES);
                        spin_unlock_irq(&mapping->tree_lock);
…
}

<code-3>
static noinline_for_stack unsigned long
shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
                     struct scan_control *sc, int priority, int file)
{
…
        nr_taken = isolate_lru_pages(nr_to_scan, mz, &page_list, &nr_scanned,
                                     sc, isolate_mode, 0, file);
…
        nr_reclaimed = shrink_page_list(&page_list, mz, sc, priority,
                                        &nr_dirty, &nr_writeback);
}
Remarks for code-3:
isolate_lru_pages() ultimately calls get_page_unless_zero(), which
increases page->_count by 1.
shrink_page_list() eventually calls is_page_cache_freeable() to check
whether the page cache page is freeable.
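
The "increment unless zero" behaviour mentioned in the remarks can be illustrated outside the kernel. What follows is only a user-space sketch using C11 atomics, not the kernel implementation; struct rc_page and rc_get_unless_zero() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the count is modelled. */
struct rc_page {
        atomic_int count;
};

/*
 * Take a reference only if the count is not already zero, in the
 * spirit of get_page_unless_zero(). Returns true if a reference
 * was taken.
 */
static bool rc_get_unless_zero(struct rc_page *page)
{
        int old = atomic_load(&page->count);

        while (old != 0) {
                /* On failure, old is refreshed with the current value. */
                if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
                        return true;
        }
        return false;
}
```

A page whose count has already reached zero is on its way to being freed, so the helper refuses to resurrect it. That refusal is what makes it safe for an isolator to find the page on a list without already holding a reference.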

From 59b25b5e0163dcb120d913b570c1b8b5b0c47c5d Mon Sep 17 00:00:00 2001
From: Haifeng Li <hfli@marvell.com>
Date: Mon, 4 Mar 2013 09:42:53 +0800
Subject: [PATCH] mm: Fixup the condition whether the page cache is free

When a page is allocated, its reference count is 1. If the page is
inserted into the page cache tree, the reference count is increased
by 1. In the reclaim path, the page is also referenced by the
isolating caller. So the reference count that marks the page as
freeable should be 3.

Signed-off-by: Haifeng Li <omycle@gmail.com>
---
 mm/vmscan.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--
1.7.9.5

Comments

Johannes Weiner March 4, 2013, 3:09 p.m. UTC | #1
On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
> When a page cache is to reclaim, we should to decide whether the page
> cache is free.
> IMO, the condition whether a page cache is free should be 3 in page
> frame reclaiming. The reason lists as below.
> 
> When page is allocated, the page->_count is 1(code fragment is code-1 ).
> And when the page is allocated for reading files from extern disk, the
> page->_count will increment 1 by page_cache_get() in
> add_to_page_cache_locked()(code fragment is code-2). When the page is to
> reclaim, the isolated LRU list also increase the page->_count(code
> fragment is code-3).

The page count is initialized to 1, but that does not stay with the
object.  It's a reference that is passed to the allocating task, which
drops it again when it's done with the page.  I.e. the pattern is like
this:

instantiation:
page = page_cache_alloc()	/* instantiator reference -> 1 */
add_to_page_cache(page, mapping, offset)
  get_page(page)		/* page cache reference -> 2 */
lru_cache_add(page)
  get_page(page)		/* pagevec reference -> 3 */
/* ...initiate read, write, associate buffers, ... */
page_cache_release(page)	/* drop instantiator reference -> 2 + private */

reclaim:
lru_add_drain()
  page_cache_release(page)	/* drop pagevec reference -> 1 + private */
__isolate_lru_page(page)
  page_cache_get(page)		/* reclaim reference -> 2 + private */
is_page_cache_freeable(page)
try_to_free_buffers()		/* drop buffer ref -> 2 */
__remove_mapping()		/* drop page cache and isolator ref -> 0 */
free_hot_cold_page()
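
The counts in the trace above can be replayed with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct life_page and the life_*() helpers are hypothetical, and the optional buffer heads are assumed to take exactly one extra reference, matching the "+ private" annotations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a count plus a flag for buffers at page->private. */
struct life_page {
        int count;
        bool has_private;
};

static void life_get(struct life_page *p) { p->count++; }

static void life_put(struct life_page *p)
{
        assert(p->count > 0);
        p->count--;
}

/* Same arithmetic as the unpatched is_page_cache_freeable(). */
static bool life_is_freeable(const struct life_page *p)
{
        return p->count - (p->has_private ? 1 : 0) == 2;
}

/* Replay instantiation and reclaim; return the count seen by the check. */
static int life_replay(bool with_buffers)
{
        struct life_page page = { .count = 1, .has_private = false };

        life_get(&page);                /* page cache reference -> 2 */
        life_get(&page);                /* pagevec reference -> 3 */
        if (with_buffers) {
                page.has_private = true;
                life_get(&page);        /* buffer reference -> 4 */
        }
        life_put(&page);                /* drop instantiator -> 2 + private */
        life_put(&page);                /* drain pagevec -> 1 + private */
        life_get(&page);                /* isolation -> 2 + private */

        assert(life_is_freeable(&page));
        return page.count;
}
```

In this model life_replay(false) reaches the check with a count of 2 and life_replay(true) with a count of 3, so the existing "== 2" comparison holds in both cases once page_has_private() is subtracted.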
Li Haifeng March 5, 2013, 1:51 a.m. UTC | #2
Thanks very much for your explanation. :-)

2013/3/4 Johannes Weiner <hannes@cmpxchg.org>:
> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>> When a page cache is to reclaim, we should to decide whether the page
>> cache is free.
>> IMO, the condition whether a page cache is free should be 3 in page
>> frame reclaiming. The reason lists as below.
>>
>> When page is allocated, the page->_count is 1(code fragment is code-1 ).
>> And when the page is allocated for reading files from extern disk, the
>> page->_count will increment 1 by page_cache_get() in
>> add_to_page_cache_locked()(code fragment is code-2). When the page is to
>> reclaim, the isolated LRU list also increase the page->_count(code
>> fragment is code-3).
>
> The page count is initialized to 1, but that does not stay with the
> object.  It's a reference that is passed to the allocating task, which
> drops it again when it's done with the page.  I.e. the pattern is like
> this:
>
> instantiation:
> page = page_cache_alloc()       /* instantiator reference -> 1 */
> add_to_page_cache(page, mapping, offset)
>   get_page(page)                /* page cache reference -> 2 */
> lru_cache_add(page)
>   get_page(page)                /* pagevec reference -> 3 */
> /* ...initiate read, write, associate buffers, ... */
> page_cache_release(page)        /* drop instantiator reference -> 2 + private */
>
> reclaim:
> lru_add_drain()
>   page_cache_release(page)      /* drop pagevec reference -> 1 + private */
> __isolate_lru_page(page)
>   page_cache_get(page)          /* reclaim reference -> 2 + private */
> is_page_cache_freeable(page)
> try_to_free_buffers()           /* drop buffer ref -> 2 */
> __remove_mapping()              /* drop page cache and isolator ref -> 0 */
> free_hot_cold_page()
Simon Jeons March 6, 2013, 1:04 a.m. UTC | #3
Hi Johannes,
On 03/04/2013 11:09 PM, Johannes Weiner wrote:
> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>> When a page cache is to reclaim, we should to decide whether the page
>> cache is free.
>> IMO, the condition whether a page cache is free should be 3 in page
>> frame reclaiming. The reason lists as below.
>>
>> When page is allocated, the page->_count is 1(code fragment is code-1 ).
>> And when the page is allocated for reading files from extern disk, the
>> page->_count will increment 1 by page_cache_get() in
>> add_to_page_cache_locked()(code fragment is code-2). When the page is to
>> reclaim, the isolated LRU list also increase the page->_count(code
>> fragment is code-3).
> The page count is initialized to 1, but that does not stay with the
> object.  It's a reference that is passed to the allocating task, which
> drops it again when it's done with the page.  I.e. the pattern is like
> this:
>
> instantiation:
> page = page_cache_alloc()	/* instantiator reference -> 1 */
> add_to_page_cache(page, mapping, offset)
>    get_page(page)		/* page cache reference -> 2 */
> lru_cache_add(page)
>    get_page(page)		/* pagevec reference -> 3 */
> /* ...initiate read, write, associate buffers, ... */
> page_cache_release(page)	/* drop instantiator reference -> 2 + private */
>
> reclaim:
> lru_add_drain()
>    page_cache_release(page)	/* drop pagevec reference -> 1 + private */

IIUC, adding a page to the LRU first adds it to a pagevec, and the
pagevec takes one reference. When the page moves from the pagevec to
the LRU, does the LRU take over the reference held by the pagevec? Or
is the reference just dropped, so that the LRU holds no reference to
the page?

> __isolate_lru_page(page)
>    page_cache_get(page)		/* reclaim reference -> 2 + private */
> is_page_cache_freeable(page)
> try_to_free_buffers()		/* drop buffer ref -> 2 */
> __remove_mapping()		/* drop page cache and isolator ref -> 0 */
> free_hot_cold_page()
Johannes Weiner March 6, 2013, 7:47 p.m. UTC | #4
On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
> Hi Johannes,
> On 03/04/2013 11:09 PM, Johannes Weiner wrote:
> >On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
> >>When a page cache is to reclaim, we should to decide whether the page
> >>cache is free.
> >>IMO, the condition whether a page cache is free should be 3 in page
> >>frame reclaiming. The reason lists as below.
> >>
> >>When page is allocated, the page->_count is 1(code fragment is code-1 ).
> >>And when the page is allocated for reading files from extern disk, the
> >>page->_count will increment 1 by page_cache_get() in
> >>add_to_page_cache_locked()(code fragment is code-2). When the page is to
> >>reclaim, the isolated LRU list also increase the page->_count(code
> >>fragment is code-3).
> >The page count is initialized to 1, but that does not stay with the
> >object.  It's a reference that is passed to the allocating task, which
> >drops it again when it's done with the page.  I.e. the pattern is like
> >this:
> >
> >instantiation:
> >page = page_cache_alloc()	/* instantiator reference -> 1 */
> >add_to_page_cache(page, mapping, offset)
> >   get_page(page)		/* page cache reference -> 2 */
> >lru_cache_add(page)
> >   get_page(page)		/* pagevec reference -> 3 */
> >/* ...initiate read, write, associate buffers, ... */
> >page_cache_release(page)	/* drop instantiator reference -> 2 + private */
> >
> >reclaim:
> >lru_add_drain()
> >   page_cache_release(page)	/* drop pagevec reference -> 1 + private */
> 
> IIUC, when add page to lru will lead to add to pagevec firstly, and
> pagevec will take one reference, so if lru will take over the
> reference taken by pagevec when page transmit from pagevec to lru?
> or just drop the reference and lru will not take reference for page?

The LRU does not hold a reference, it would not make sense.  The
pagevec only needs one because it would be awkward to remove a
concurrently freed page out of a pagevec, but unlinking a page from
the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
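
Why the LRU can go without a reference can be sketched in user space as follows. The names struct lru_page, lru_init(), lru_add() and lru_put() are hypothetical, and the real __page_cache_release() also takes the LRU lock and does accounting that is left out here. The point is only that a page with intrusive list links can unlink itself on the final put.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page with intrusive LRU links, loosely like page->lru. */
struct lru_page {
        int count;
        bool on_lru;
        bool freed;
        struct lru_page *prev, *next;
};

/* Circular list head; an empty list points at itself. */
static void lru_init(struct lru_page *head)
{
        head->prev = head->next = head;
}

/* Link the page at the front of the list. No reference is taken. */
static void lru_add(struct lru_page *head, struct lru_page *p)
{
        p->next = head->next;
        p->prev = head;
        head->next->prev = p;
        head->next = p;
        p->on_lru = true;
}

/* Drop a reference; the last put unlinks the page from the LRU itself. */
static void lru_put(struct lru_page *p)
{
        assert(p->count > 0);
        if (--p->count > 0)
                return;
        if (p->on_lru) {
                p->prev->next = p->next;
                p->next->prev = p->prev;
                p->on_lru = false;
        }
        p->freed = true;
}
```

Because the page carries its own prev/next pointers, the final put can remove it from the list in constant time, so the list never ends up pointing at freed memory.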
Simon Jeons March 7, 2013, 1:05 a.m. UTC | #5
Hi Johannes,
On 03/07/2013 03:47 AM, Johannes Weiner wrote:
> On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
>> Hi Johannes,
>> On 03/04/2013 11:09 PM, Johannes Weiner wrote:
>>> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>>>> When a page cache is to reclaim, we should to decide whether the page
>>>> cache is free.
>>>> IMO, the condition whether a page cache is free should be 3 in page
>>>> frame reclaiming. The reason lists as below.
>>>>
>>>> When page is allocated, the page->_count is 1(code fragment is code-1 ).
>>>> And when the page is allocated for reading files from extern disk, the
>>>> page->_count will increment 1 by page_cache_get() in
>>>> add_to_page_cache_locked()(code fragment is code-2). When the page is to
>>>> reclaim, the isolated LRU list also increase the page->_count(code
>>>> fragment is code-3).
>>> The page count is initialized to 1, but that does not stay with the
>>> object.  It's a reference that is passed to the allocating task, which
>>> drops it again when it's done with the page.  I.e. the pattern is like
>>> this:
>>>
>>> instantiation:
>>> page = page_cache_alloc()	/* instantiator reference -> 1 */
>>> add_to_page_cache(page, mapping, offset)
>>>    get_page(page)		/* page cache reference -> 2 */
>>> lru_cache_add(page)
>>>    get_page(page)		/* pagevec reference -> 3 */
>>> /* ...initiate read, write, associate buffers, ... */
>>> page_cache_release(page)	/* drop instantiator reference -> 2 + private */
>>>
>>> reclaim:
>>> lru_add_drain()
>>>    page_cache_release(page)	/* drop pagevec reference -> 1 + private */
>> IIUC, when add page to lru will lead to add to pagevec firstly, and
>> pagevec will take one reference, so if lru will take over the
>> reference taken by pagevec when page transmit from pagevec to lru?
>> or just drop the reference and lru will not take reference for page?
> The LRU does not hold a reference, it would not make sense.  The
> pagevec only needs one because it would be awkward to remove a
> concurrently freed page out of a pagevec, but unlinking a page from
> the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.

Since the pagevec is per-CPU, when can removing a concurrently freed
page from a pagevec happen?
Simon Jeons March 8, 2013, 2:13 a.m. UTC | #6
Ping, :-)
On 03/07/2013 09:05 AM, Simon Jeons wrote:
> Hi Johannes,
> On 03/07/2013 03:47 AM, Johannes Weiner wrote:
>> On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
>>> Hi Johannes,
>>> On 03/04/2013 11:09 PM, Johannes Weiner wrote:
>>>> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>>>>> When a page cache is to reclaim, we should to decide whether the page
>>>>> cache is free.
>>>>> IMO, the condition whether a page cache is free should be 3 in page
>>>>> frame reclaiming. The reason lists as below.
>>>>>
>>>>> When page is allocated, the page->_count is 1(code fragment is 
>>>>> code-1 ).
>>>>> And when the page is allocated for reading files from extern disk, 
>>>>> the
>>>>> page->_count will increment 1 by page_cache_get() in
>>>>> add_to_page_cache_locked()(code fragment is code-2). When the page 
>>>>> is to
>>>>> reclaim, the isolated LRU list also increase the page->_count(code
>>>>> fragment is code-3).
>>>> The page count is initialized to 1, but that does not stay with the
>>>> object.  It's a reference that is passed to the allocating task, which
>>>> drops it again when it's done with the page.  I.e. the pattern is like
>>>> this:
>>>>
>>>> instantiation:
>>>> page = page_cache_alloc()    /* instantiator reference -> 1 */
>>>> add_to_page_cache(page, mapping, offset)
>>>>    get_page(page)        /* page cache reference -> 2 */
>>>> lru_cache_add(page)
>>>>    get_page(page)        /* pagevec reference -> 3 */
>>>> /* ...initiate read, write, associate buffers, ... */
>>>> page_cache_release(page)    /* drop instantiator reference -> 2 + 
>>>> private */
>>>>
>>>> reclaim:
>>>> lru_add_drain()
>>>>    page_cache_release(page)    /* drop pagevec reference -> 1 + 
>>>> private */
>>> IIUC, when add page to lru will lead to add to pagevec firstly, and
>>> pagevec will take one reference, so if lru will take over the
>>> reference taken by pagevec when page transmit from pagevec to lru?
>>> or just drop the reference and lru will not take reference for page?
>> The LRU does not hold a reference, it would not make sense.  The
>> pagevec only needs one because it would be awkward to remove a
>> concurrently freed page out of a pagevec, but unlinking a page from
>> the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
>
> Since pagevec is per cpu, when can remove a concurrently freed page 
> out of a pagevec happen?
>
>
Johannes Weiner March 8, 2013, 2:37 a.m. UTC | #7
On Fri, Mar 08, 2013 at 10:13:25AM +0800, Simon Jeons wrote:
> Ping, :-)
> On 03/07/2013 09:05 AM, Simon Jeons wrote:
> >Hi Johannes,
> >On 03/07/2013 03:47 AM, Johannes Weiner wrote:
> >>On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
> >>>Hi Johannes,
> >>>On 03/04/2013 11:09 PM, Johannes Weiner wrote:
> >>>>On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
> >>>>>When a page cache is to reclaim, we should to decide whether the page
> >>>>>cache is free.
> >>>>>IMO, the condition whether a page cache is free should be 3 in page
> >>>>>frame reclaiming. The reason lists as below.
> >>>>>
> >>>>>When page is allocated, the page->_count is 1(code
> >>>>>fragment is code-1 ).
> >>>>>And when the page is allocated for reading files from
> >>>>>extern disk, the
> >>>>>page->_count will increment 1 by page_cache_get() in
> >>>>>add_to_page_cache_locked()(code fragment is code-2). When
> >>>>>the page is to
> >>>>>reclaim, the isolated LRU list also increase the page->_count(code
> >>>>>fragment is code-3).
> >>>>The page count is initialized to 1, but that does not stay with the
> >>>>object.  It's a reference that is passed to the allocating task, which
> >>>>drops it again when it's done with the page.  I.e. the pattern is like
> >>>>this:
> >>>>
> >>>>instantiation:
> >>>>page = page_cache_alloc()    /* instantiator reference -> 1 */
> >>>>add_to_page_cache(page, mapping, offset)
> >>>>   get_page(page)        /* page cache reference -> 2 */
> >>>>lru_cache_add(page)
> >>>>   get_page(page)        /* pagevec reference -> 3 */
> >>>>/* ...initiate read, write, associate buffers, ... */
> >>>>page_cache_release(page)    /* drop instantiator reference
> >>>>-> 2 + private */
> >>>>
> >>>>reclaim:
> >>>>lru_add_drain()
> >>>>   page_cache_release(page)    /* drop pagevec reference ->
> >>>>1 + private */
> >>>IIUC, when add page to lru will lead to add to pagevec firstly, and
> >>>pagevec will take one reference, so if lru will take over the
> >>>reference taken by pagevec when page transmit from pagevec to lru?
> >>>or just drop the reference and lru will not take reference for page?
> >>The LRU does not hold a reference, it would not make sense.  The
> >>pagevec only needs one because it would be awkward to remove a
> >>concurrently freed page out of a pagevec, but unlinking a page from
> >>the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
> >
> >Since pagevec is per cpu, when can remove a concurrently freed
> >page out of a pagevec happen?

It doesn't because the pagevec holds a reference, as I wrote above.

Feel free to consult the code as well for questions like these ;-)
Simon Jeons March 8, 2013, 2:48 a.m. UTC | #8
On 03/08/2013 10:37 AM, Johannes Weiner wrote:
> On Fri, Mar 08, 2013 at 10:13:25AM +0800, Simon Jeons wrote:
>> Ping, :-)
>> On 03/07/2013 09:05 AM, Simon Jeons wrote:
>>> Hi Johannes,
>>> On 03/07/2013 03:47 AM, Johannes Weiner wrote:
>>>> On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
>>>>> Hi Johannes,
>>>>> On 03/04/2013 11:09 PM, Johannes Weiner wrote:
>>>>>> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>>>>>>> When a page cache is to reclaim, we should to decide whether the page
>>>>>>> cache is free.
>>>>>>> IMO, the condition whether a page cache is free should be 3 in page
>>>>>>> frame reclaiming. The reason lists as below.
>>>>>>>
>>>>>>> When page is allocated, the page->_count is 1(code
>>>>>>> fragment is code-1 ).
>>>>>>> And when the page is allocated for reading files from
>>>>>>> extern disk, the
>>>>>>> page->_count will increment 1 by page_cache_get() in
>>>>>>> add_to_page_cache_locked()(code fragment is code-2). When
>>>>>>> the page is to
>>>>>>> reclaim, the isolated LRU list also increase the page->_count(code
>>>>>>> fragment is code-3).
>>>>>> The page count is initialized to 1, but that does not stay with the
>>>>>> object.  It's a reference that is passed to the allocating task, which
>>>>>> drops it again when it's done with the page.  I.e. the pattern is like
>>>>>> this:
>>>>>>
>>>>>> instantiation:
>>>>>> page = page_cache_alloc()    /* instantiator reference -> 1 */
>>>>>> add_to_page_cache(page, mapping, offset)
>>>>>>    get_page(page)        /* page cache reference -> 2 */
>>>>>> lru_cache_add(page)
>>>>>>    get_page(page)        /* pagevec reference -> 3 */
>>>>>> /* ...initiate read, write, associate buffers, ... */
>>>>>> page_cache_release(page)    /* drop instantiator reference
>>>>>> -> 2 + private */
>>>>>>
>>>>>> reclaim:
>>>>>> lru_add_drain()
>>>>>>    page_cache_release(page)    /* drop pagevec reference ->
>>>>>> 1 + private */
>>>>> IIUC, when add page to lru will lead to add to pagevec firstly, and
>>>>> pagevec will take one reference, so if lru will take over the
>>>>> reference taken by pagevec when page transmit from pagevec to lru?
>>>>> or just drop the reference and lru will not take reference for page?
>>>> The LRU does not hold a reference, it would not make sense.  The
>>>> pagevec only needs one because it would be awkward to remove a
>>>> concurrently freed page out of a pagevec, but unlinking a page from
>>>> the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
>>> Since pagevec is per cpu, when can remove a concurrently freed
>>> page out of a pagevec happen?
> It doesn't because the pagevec holds a reference, as I wrote above.

I mean, since the pagevec is per-CPU, how could a concurrently freed
page ever need to be removed from a pagevec? If that can't happen,
the pagevec doesn't need to hold a reference. :-)

>
> Feel free to consult the code as well for questions like these ;-)
Johannes Weiner March 8, 2013, 3:16 a.m. UTC | #9
On Fri, Mar 08, 2013 at 10:48:31AM +0800, Simon Jeons wrote:
> On 03/08/2013 10:37 AM, Johannes Weiner wrote:
> >On Fri, Mar 08, 2013 at 10:13:25AM +0800, Simon Jeons wrote:
> >>Ping, :-)
> >>On 03/07/2013 09:05 AM, Simon Jeons wrote:
> >>>Hi Johannes,
> >>>On 03/07/2013 03:47 AM, Johannes Weiner wrote:
> >>>>On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
> >>>>>Hi Johannes,
> >>>>>On 03/04/2013 11:09 PM, Johannes Weiner wrote:
> >>>>>>On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
> >>>>>>>When a page cache is to reclaim, we should to decide whether the page
> >>>>>>>cache is free.
> >>>>>>>IMO, the condition whether a page cache is free should be 3 in page
> >>>>>>>frame reclaiming. The reason lists as below.
> >>>>>>>
> >>>>>>>When page is allocated, the page->_count is 1(code
> >>>>>>>fragment is code-1 ).
> >>>>>>>And when the page is allocated for reading files from
> >>>>>>>extern disk, the
> >>>>>>>page->_count will increment 1 by page_cache_get() in
> >>>>>>>add_to_page_cache_locked()(code fragment is code-2). When
> >>>>>>>the page is to
> >>>>>>>reclaim, the isolated LRU list also increase the page->_count(code
> >>>>>>>fragment is code-3).
> >>>>>>The page count is initialized to 1, but that does not stay with the
> >>>>>>object.  It's a reference that is passed to the allocating task, which
> >>>>>>drops it again when it's done with the page.  I.e. the pattern is like
> >>>>>>this:
> >>>>>>
> >>>>>>instantiation:
> >>>>>>page = page_cache_alloc()    /* instantiator reference -> 1 */
> >>>>>>add_to_page_cache(page, mapping, offset)
> >>>>>>   get_page(page)        /* page cache reference -> 2 */
> >>>>>>lru_cache_add(page)
> >>>>>>   get_page(page)        /* pagevec reference -> 3 */
> >>>>>>/* ...initiate read, write, associate buffers, ... */
> >>>>>>page_cache_release(page)    /* drop instantiator reference
> >>>>>>-> 2 + private */
> >>>>>>
> >>>>>>reclaim:
> >>>>>>lru_add_drain()
> >>>>>>   page_cache_release(page)    /* drop pagevec reference ->
> >>>>>>1 + private */
> >>>>>IIUC, when add page to lru will lead to add to pagevec firstly, and
> >>>>>pagevec will take one reference, so if lru will take over the
> >>>>>reference taken by pagevec when page transmit from pagevec to lru?
> >>>>>or just drop the reference and lru will not take reference for page?
> >>>>The LRU does not hold a reference, it would not make sense.  The
> >>>>pagevec only needs one because it would be awkward to remove a
> >>>>concurrently freed page out of a pagevec, but unlinking a page from
> >>>>the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
> >>>Since pagevec is per cpu, when can remove a concurrently freed
> >>>page out of a pagevec happen?
> >It doesn't because the pagevec holds a reference, as I wrote above.
> 
> I mean since pagevec is per cpu, how can remove a concurrently freed
> page out of a pagevec happen? If it doesn't happen pagevec don't
> need to hold a reference. :-)

It has nothing to do with the pagevec being per CPU.  The page may get
truncated or reclaimed and have every other reference being dropped
while it sits on the pagevec.
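
This scenario can be sketched in user space too. It is an illustration only: struct vec_page, struct toy_pagevec, the vec_*() helpers and the VEC_SLOTS capacity are all hypothetical, and the real drain path moves pages onto the LRU before releasing them.

```c
#include <assert.h>
#include <stdbool.h>

#define VEC_SLOTS 14    /* hypothetical capacity for this sketch */

struct vec_page {
        int count;
        bool freed;
};

/* A pagevec slot is a bare pointer; the page has no link back to it. */
struct toy_pagevec {
        unsigned int nr;
        struct vec_page *pages[VEC_SLOTS];
};

static void vec_put(struct vec_page *p)
{
        assert(p->count > 0);
        if (--p->count == 0)
                p->freed = true;
}

/* Queue a page; the extra reference pins it while it sits in the slot. */
static void vec_add(struct toy_pagevec *pvec, struct vec_page *p)
{
        assert(pvec->nr < VEC_SLOTS);
        p->count++;
        pvec->pages[pvec->nr++] = p;
}

/* Drain: every stored pointer is still valid thanks to the pin. */
static void vec_drain(struct toy_pagevec *pvec)
{
        for (unsigned int i = 0; i < pvec->nr; i++) {
                assert(!pvec->pages[i]->freed);
                vec_put(pvec->pages[i]);
        }
        pvec->nr = 0;
}
```

If the instantiator and the page cache both drop their references while the page is queued, the count stays at 1 and the page is freed only by vec_drain(). Without the pin, the slot would be left pointing at freed memory, whichever CPU owns the pagevec.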
Simon Jeons March 12, 2013, 3:19 a.m. UTC | #10
Hi Hugh and Johannes,
On 03/08/2013 11:16 AM, Johannes Weiner wrote:
> On Fri, Mar 08, 2013 at 10:48:31AM +0800, Simon Jeons wrote:
>> On 03/08/2013 10:37 AM, Johannes Weiner wrote:
>>> On Fri, Mar 08, 2013 at 10:13:25AM +0800, Simon Jeons wrote:
>>>> Ping, :-)
>>>> On 03/07/2013 09:05 AM, Simon Jeons wrote:
>>>>> Hi Johannes,
>>>>> On 03/07/2013 03:47 AM, Johannes Weiner wrote:
>>>>>> On Wed, Mar 06, 2013 at 09:04:55AM +0800, Simon Jeons wrote:
>>>>>>> Hi Johannes,
>>>>>>> On 03/04/2013 11:09 PM, Johannes Weiner wrote:
>>>>>>>> On Mon, Mar 04, 2013 at 09:54:26AM +0800, Li Haifeng wrote:
>>>>>>>>> When a page cache is to reclaim, we should to decide whether the page
>>>>>>>>> cache is free.
>>>>>>>>> IMO, the condition whether a page cache is free should be 3 in page
>>>>>>>>> frame reclaiming. The reason lists as below.
>>>>>>>>>
>>>>>>>>> When page is allocated, the page->_count is 1(code
>>>>>>>>> fragment is code-1 ).
>>>>>>>>> And when the page is allocated for reading files from
>>>>>>>>> extern disk, the
>>>>>>>>> page->_count will increment 1 by page_cache_get() in
>>>>>>>>> add_to_page_cache_locked()(code fragment is code-2). When
>>>>>>>>> the page is to
>>>>>>>>> reclaim, the isolated LRU list also increase the page->_count(code
>>>>>>>>> fragment is code-3).
>>>>>>>> The page count is initialized to 1, but that does not stay with the
>>>>>>>> object.  It's a reference that is passed to the allocating task, which
>>>>>>>> drops it again when it's done with the page.  I.e. the pattern is like
>>>>>>>> this:
>>>>>>>>
>>>>>>>> instantiation:
>>>>>>>> page = page_cache_alloc()    /* instantiator reference -> 1 */
>>>>>>>> add_to_page_cache(page, mapping, offset)
>>>>>>>>    get_page(page)        /* page cache reference -> 2 */
>>>>>>>> lru_cache_add(page)
>>>>>>>>    get_page(page)        /* pagevec reference -> 3 */
>>>>>>>> /* ...initiate read, write, associate buffers, ... */
>>>>>>>> page_cache_release(page)    /* drop instantiator reference
>>>>>>>> -> 2 + private */
>>>>>>>>
>>>>>>>> reclaim:
>>>>>>>> lru_add_drain()
>>>>>>>>    page_cache_release(page)    /* drop pagevec reference ->
>>>>>>>> 1 + private */
>>>>>>> IIUC, when add page to lru will lead to add to pagevec firstly, and
>>>>>>> pagevec will take one reference, so if lru will take over the
>>>>>>> reference taken by pagevec when page transmit from pagevec to lru?
>>>>>>> or just drop the reference and lru will not take reference for page?
>>>>>> The LRU does not hold a reference, it would not make sense.  The
>>>>>> pagevec only needs one because it would be awkward to remove a
>>>>>> concurrently freed page out of a pagevec, but unlinking a page from
>>>>>> the LRU is easy.  See mm/swap.c::__page_cache_release() and friends.
>>>>> Since pagevec is per cpu, when can remove a concurrently freed
>>>>> page out of a pagevec happen?
>>> It doesn't because the pagevec holds a reference, as I wrote above.
>> I mean since pagevec is per cpu, how can remove a concurrently freed
>> page out of a pagevec happen? If it doesn't happen pagevec don't
>> need to hold a reference. :-)
> It has nothing to do with the pagevec being per CPU.  The page may get
> truncated or reclaimed and have every other reference being dropped
> while it sits on the pagevec.

In shmem_replace_page(), page_cache_release() is called twice for
oldpage: once for the reference from prep_new_page and once for the
page cache reference. But if the page is still in a pagevec, the
pagevec holds one more reference and oldpage can't be freed. Is this
a bug?

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6759993..b588378 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -400,11 +400,12 @@  static void reset_reclaim_mode(struct scan_control *sc)
 static inline int is_page_cache_freeable(struct page *page)
 {
        /*
-        * A freeable page cache page is referenced only by the caller
-        * that isolated the page, the page cache radix tree and
-        * optional buffer heads at page->private.
+        * A freeable page cache page, _count of which is
+        * initialized by 1. And it is also referenced only
+        * by the caller that isolated the page, the page cache
+        * radix tree and optional buffer heads at page->private.
         */
-       return page_count(page) - page_has_private(page) == 2;
+       return page_count(page) - page_has_private(page) == 3;
 }

 static int may_write_to_queue(struct backing_dev_info *bdi,