diff mbox

[-mm,-V3,03/21] mm, THP, swap: Support PMD swap mapping in swap_duplicate()

Message ID 20180611204231.ojhlyrbmda6pouxb@ca-dmjordan1.us.oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Jordan June 11, 2018, 8:42 p.m. UTC
Hi,

The series up to and including this patch doesn't build.  For this patch we
need:



On Wed, May 23, 2018 at 04:26:07PM +0800, Huang, Ying wrote:
> @@ -3516,11 +3512,39 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)

Two comments about this part of __swap_duplicate as long as you're moving it to
another function:

   } else if (count || has_cache) {
   
   	if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)          /* #1   */
   		count += usage;
   	else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)     /* #2   */
   		err = -EINVAL;

#1:  __swap_duplicate_locked might use

    VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);

to document the unstated assumption that usage is 1 (otherwise count could
overflow).

#2:  We've masked off SWAP_HAS_CACHE and COUNT_CONTINUED, and already checked
for SWAP_MAP_BAD, so I think condition #2 always fails and can just be removed.

> +#ifdef CONFIG_THP_SWAP
> +static int __swap_duplicate_cluster(swp_entry_t *entry, unsigned char usage)
...
> +	} else {
> +		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
> +retry:
> +			err = __swap_duplicate_locked(si, offset + i, 1);

I guess usage is assumed to be 1 at this point (__swap_duplicate_locked makes
the same assumption).  Maybe make this explicit with

			err = __swap_duplicate_locked(si, offset + i, usage);

, use 'usage' in cluster_set_count and __swap_entry_free too, and then
earlier have a

       VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);

?

> +#else
> +static inline int __swap_duplicate_cluster(swp_entry_t *entry,

This doesn't need inline.


Not related to your changes, but while we're here, the comment with
SWAP_HAS_CONT in swap_count() could be deleted: I don't think there ever was a
SWAP_HAS_CONT.

The rest looks ok up to this point.

Comments

Huang, Ying June 12, 2018, 1:23 a.m. UTC | #1
Hi, Daniel,

Thanks for your effort to review this series.

Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> Hi,
>
> The series up to and including this patch doesn't build.  For this patch we
> need:
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index c6b3eab73fde..2f2d07627113 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -433,7 +433,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>                 /*
>                  * Swap entry may have been freed since our caller observed it.
>                  */
> -               err = swapcache_prepare(entry);
> +               err = swapcache_prepare(entry, false);
>                 if (err == -EEXIST) {
>                         radix_tree_preload_end();
>                         /*

Thanks for pointing this out!  Will change in the next version.

>
> On Wed, May 23, 2018 at 04:26:07PM +0800, Huang, Ying wrote:
>> @@ -3516,11 +3512,39 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
>
> Two comments about this part of __swap_duplicate as long as you're moving it to
> another function:
>
>    } else if (count || has_cache) {
>    
>    	if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)          /* #1   */
>    		count += usage;
>    	else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)     /* #2   */
>    		err = -EINVAL;
>
> #1:  __swap_duplicate_locked might use
>
>     VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);
>
> to document the unstated assumption that usage is 1 (otherwise count could
> overflow).

Sounds good.  Will do this.

> #2:  We've masked off SWAP_HAS_CACHE and COUNT_CONTINUED, and already checked
> for SWAP_MAP_BAD, so I think condition #2 always fails and can just be removed.

I think this is used to check some software bug.  For example,
SWAP_MAP_SHMEM will yield true here.

>> +#ifdef CONFIG_THP_SWAP
>> +static int __swap_duplicate_cluster(swp_entry_t *entry, unsigned char usage)
> ...
>> +	} else {
>> +		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
>> +retry:
>> +			err = __swap_duplicate_locked(si, offset + i, 1);
>
> I guess usage is assumed to be 1 at this point (__swap_duplicate_locked makes
> the same assumption).  Maybe make this explicit with
>
> 			err = __swap_duplicate_locked(si, offset + i, usage);
>
> , use 'usage' in cluster_set_count and __swap_entry_free too, and then
> earlier have a
>
>        VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);
>
> ?

Yes.  I will fix this.  And we can just check it in
__swap_duplicate_locked() and all these will be covered.

>> +#else
>> +static inline int __swap_duplicate_cluster(swp_entry_t *entry,
>
> This doesn't need inline.

Why not?  This is just a one line stub.

> Not related to your changes, but while we're here, the comment with
> SWAP_HAS_CONT in swap_count() could be deleted: I don't think there ever was a
> SWAP_HAS_CONT.

Yes.  We should correct this.  Because this should go to a separate patch,
would you mind to submit a patch to fix it?

> The rest looks ok up to this point.

Thanks!

Best Regards,
Huang, Ying
Huang, Ying June 12, 2018, 3:15 a.m. UTC | #2
"Huang, Ying" <ying.huang@intel.com> writes:
>> On Wed, May 23, 2018 at 04:26:07PM +0800, Huang, Ying wrote:
>>> @@ -3516,11 +3512,39 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
>>
>> Two comments about this part of __swap_duplicate as long as you're moving it to
>> another function:
>>
>>    } else if (count || has_cache) {
>>    
>>    	if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)          /* #1   */
>>    		count += usage;
>>    	else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)     /* #2   */
>>    		err = -EINVAL;
>>
>> #1:  __swap_duplicate_locked might use
>>
>>     VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);
>>
>> to document the unstated assumption that usage is 1 (otherwise count could
>> overflow).
>
> Sounds good.  Will do this.

Found usage parameter of __swap_duplicate() could be SWAP_MAP_SHMEM too.
We can improve the parameter checking.  But that appears not belong to
this series.

Best Regards,
Huang, Ying
Daniel Jordan June 12, 2018, 12:04 p.m. UTC | #3
On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> > #2:  We've masked off SWAP_HAS_CACHE and COUNT_CONTINUED, and already checked
> > for SWAP_MAP_BAD, so I think condition #2 always fails and can just be removed.
> 
> I think this is used to check some software bug.  For example,
> SWAP_MAP_SHMEM will yield true here.

So it does!  And so __swap_duplicate returns -EINVAL in that case, which
swap_shmem_alloc just ignores.  Confusing, and an explicit check for
SWAP_MAP_SHMEM would be cleaner, but why fix what isn't broken.

> 
> >> +#ifdef CONFIG_THP_SWAP
> >> +static int __swap_duplicate_cluster(swp_entry_t *entry, unsigned char usage)
> > ...
> >> +	} else {
> >> +		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
> >> +retry:
> >> +			err = __swap_duplicate_locked(si, offset + i, 1);
> >
> > I guess usage is assumed to be 1 at this point (__swap_duplicate_locked makes
> > the same assumption).  Maybe make this explicit with
> >
> > 			err = __swap_duplicate_locked(si, offset + i, usage);
> >
> > , use 'usage' in cluster_set_count and __swap_entry_free too, and then
> > earlier have a
> >
> >        VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);
> >
> > ?
> 
> Yes.  I will fix this.  And we can just check it in
> __swap_duplicate_locked() and all these will be covered.

I'll respond to your other mail.

> > Not related to your changes, but while we're here, the comment with
> > SWAP_HAS_CONT in swap_count() could be deleted: I don't think there ever was a
> > SWAP_HAS_CONT.
> 
> Yes.  We should correct this.  Because this should go to a separate patch,
> would you mind to submit a patch to fix it?

Sure, I'll do that.

Daniel
Daniel Jordan June 12, 2018, 12:05 p.m. UTC | #4
On Tue, Jun 12, 2018 at 11:15:28AM +0800, Huang, Ying wrote:
> "Huang, Ying" <ying.huang@intel.com> writes:
> >> On Wed, May 23, 2018 at 04:26:07PM +0800, Huang, Ying wrote:
> >>> @@ -3516,11 +3512,39 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
> >>
> >> Two comments about this part of __swap_duplicate as long as you're moving it to
> >> another function:
> >>
> >>    } else if (count || has_cache) {
> >>    
> >>    	if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)          /* #1   */
> >>    		count += usage;
> >>    	else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)     /* #2   */
> >>    		err = -EINVAL;
> >>
> >> #1:  __swap_duplicate_locked might use
> >>
> >>     VM_BUG_ON(usage != SWAP_HAS_CACHE && usage != 1);
> >>
> >> to document the unstated assumption that usage is 1 (otherwise count could
> >> overflow).
> >
> > Sounds good.  Will do this.
> 
> Found usage parameter of __swap_duplicate() could be SWAP_MAP_SHMEM too.
> We can improve the parameter checking.  But that appears not belong to
> this series.

Fair enough, I'll see about adding this along with the other patch I'm sending.

Daniel
Daniel Jordan June 12, 2018, 9:44 p.m. UTC | #5
On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> >> +#else
> >> +static inline int __swap_duplicate_cluster(swp_entry_t *entry,
> >
> > This doesn't need inline.
> 
> Why not?  This is just a one line stub.

Forgot to respond to this.  The compiler will likely choose to optimize out
calls to an empty function like this.  Checking, this is indeed what it does in
this case on my machine, with or without inline.


By the way, when building without CONFIG_THP_SWAP, we get

  linux/mm/swapfile.c:933:13: warning: ‘__swap_free_cluster’ defined but not used [-Wunused-function]
   static void __swap_free_cluster(struct swap_info_struct *si, unsigned long idx)
               ^~~~~~~~~~~~~~~~~~~
Huang, Ying June 13, 2018, 1:26 a.m. UTC | #6
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote:
>> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
>> >> +#else
>> >> +static inline int __swap_duplicate_cluster(swp_entry_t *entry,
>> >
>> > This doesn't need inline.
>> 
>> Why not?  This is just a one line stub.
>
> Forgot to respond to this.  The compiler will likely choose to optimize out
> calls to an empty function like this.  Checking, this is indeed what it does in
> this case on my machine, with or without inline.

Yes.  I believe a decent compiler will inline the function in any way.
And it does no harm to keep "inline" too, Yes?

> By the way, when building without CONFIG_THP_SWAP, we get
>
>   linux/mm/swapfile.c:933:13: warning: ‘__swap_free_cluster’ defined but not used [-Wunused-function]
>    static void __swap_free_cluster(struct swap_info_struct *si, unsigned long idx)
>                ^~~~~~~~~~~~~~~~~~~

Thanks!  I will fix this.  Don't know why 0-Day didn't catch this.

Best Regards,
Huang, Ying
Daniel Jordan June 13, 2018, 11:49 a.m. UTC | #7
On Wed, Jun 13, 2018 at 09:26:54AM +0800, Huang, Ying wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> 
> > On Tue, Jun 12, 2018 at 09:23:19AM +0800, Huang, Ying wrote:
> >> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> >> >> +#else
> >> >> +static inline int __swap_duplicate_cluster(swp_entry_t *entry,
> >> >
> >> > This doesn't need inline.
> >> 
> >> Why not?  This is just a one line stub.
> >
> > Forgot to respond to this.  The compiler will likely choose to optimize out
> > calls to an empty function like this.  Checking, this is indeed what it does in
> > this case on my machine, with or without inline.
> 
> Yes.  I believe a decent compiler will inline the function in any way.
> And it does no harm to keep "inline" too, Yes?

Right, it does no harm, it's just a matter of style.
diff mbox

Patch

diff --git a/mm/swap_state.c b/mm/swap_state.c
index c6b3eab73fde..2f2d07627113 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -433,7 +433,7 @@  struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
                /*
                 * Swap entry may have been freed since our caller observed it.
                 */
-               err = swapcache_prepare(entry);
+               err = swapcache_prepare(entry, false);
                if (err == -EEXIST) {
                        radix_tree_preload_end();
                        /*