mm/swapfile: unuse_pte can map random data if swap read fails

Message ID 20220401072926.45051-1-linmiaohe@huawei.com (mailing list archive)
State New

Commit Message

Miaohe Lin April 1, 2022, 7:29 a.m. UTC
There is a bug in unuse_pte(): when a swap page happens to be unreadable,
a page filled with random data is mapped into the user address space. The
fix is to check PageUptodate() and fail swapoff in case of error.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/swapfile.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

David Hildenbrand April 4, 2022, 1:37 p.m. UTC | #1
On 01.04.22 09:29, Miaohe Lin wrote:
> There is a bug in unuse_pte(): when swap page happens to be unreadable,
> page filled with random data is mapped into user address space. The fix
> is to check for PageUptodate and fail swapoff in case of error.
> 
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/swapfile.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 63c61f8b2611..e72a35de7a0f 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		ret = 0;
>  		goto out;
>  	}
> +	if (unlikely(!PageUptodate(page))) {
> +		ret = -EIO;
> +		goto out;
> +	}

Yeah, we have the same handling in do_swap_page(), whereby we send a
SIGBUS because we're dealing with an actual access.

Interestingly, folio_test_uptodate() states:

"Anonymous and CoW folios are always uptodate."

@Willy, is that true or is the swapin case not documented there?
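
To make the comparison with do_swap_page() concrete, here is a minimal
userspace sketch of the two paths. It is not kernel code: struct model_page,
enum model_fault and both model_* functions are hypothetical stand-ins, and
the only thing modelled is the PageUptodate() check. The fault path has a
task to signal, so it can report SIGBUS; the swapoff path has no accessing
task, so with this patch it can only return -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct page; only the flag that matters here. */
struct model_page {
	bool uptodate;	/* models PageUptodate(): the swap read succeeded */
};

/* Hypothetical result codes; the kernel fault path uses VM_FAULT_* bits. */
enum model_fault { MODEL_FAULT_OK = 0, MODEL_FAULT_SIGBUS = 1 };

/* Fault path: a task touched the address, so there is someone to signal. */
enum model_fault model_do_swap_page(const struct model_page *page)
{
	assert(page != NULL);
	if (!page->uptodate)
		return MODEL_FAULT_SIGBUS;
	return MODEL_FAULT_OK;
}

/* swapoff path: nobody touched the address, so only an errno is available. */
int model_unuse_pte(const struct model_page *page, bool *mapped)
{
	assert(page != NULL && mapped != NULL);
	*mapped = false;
	if (!page->uptodate)
		return -EIO;	/* the check this patch adds */
	*mapped = true;		/* contents are valid, safe to map */
	return 0;
}
```

In this model, calling model_unuse_pte() on a page with uptodate == false
leaves *mapped false and returns -EIO, which is the behaviour the patch
proposes for swapoff.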
Matthew Wilcox April 4, 2022, 2:11 p.m. UTC | #2
On Mon, Apr 04, 2022 at 03:37:36PM +0200, David Hildenbrand wrote:
> On 01.04.22 09:29, Miaohe Lin wrote:
> > There is a bug in unuse_pte(): when swap page happens to be unreadable,
> > page filled with random data is mapped into user address space. The fix
> > is to check for PageUptodate and fail swapoff in case of error.
> > 
> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> > ---
> >  mm/swapfile.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 63c61f8b2611..e72a35de7a0f 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
> >  		ret = 0;
> >  		goto out;
> >  	}
> > +	if (unlikely(!PageUptodate(page))) {
> > +		ret = -EIO;
> > +		goto out;
> > +	}
> 
> Yeah, we have the same handling in do_swap_page(), whereby we send a
> SIGBUS because we're dealing with an actual access.
> 
> Interestingly, folio_test_uptodate() states:
> 
> "Anonymous and CoW folios are always uptodate."
> 
> @Willy, is that true or is the swapin case not documented there?

Why do we keep a !Uptodate page in the swap cache?  If it can't be
read in from swap, I thought we just freed the page.  Since Miaohe
has observed that not happening, I guess it doesn't work that way,
but why not make it work that way?
Andrew Morton April 4, 2022, 10:53 p.m. UTC | #3
On Fri, 1 Apr 2022 15:29:26 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> There is a bug in unuse_pte(): when swap page happens to be unreadable,
> page filled with random data is mapped into user address space. The fix
> is to check for PageUptodate and fail swapoff in case of error.
> 
> ...
>
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		ret = 0;
>  		goto out;
>  	}
> +	if (unlikely(!PageUptodate(page))) {
> +		ret = -EIO;
> +		goto out;
> +	}
>  
>  	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>  	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);

Failing the swapoff after -EIO seems a bit rude.  The user ends up with
a permanently mounted swap because a sector was bad?

That would be like failing truncate() or close() or umount after -EIO
on a regular file.  Somewhat.

Can we do something better?  Such as shooting down the page anyway and
permitting the swapoff to proceed?  Worst case, just leak the dang page
with an apologetic message.
Miaohe Lin April 6, 2022, 8:44 a.m. UTC | #4
On 2022/4/5 6:53, Andrew Morton wrote:
> On Fri, 1 Apr 2022 15:29:26 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> 
>> There is a bug in unuse_pte(): when swap page happens to be unreadable,
>> page filled with random data is mapped into user address space. The fix
>> is to check for PageUptodate and fail swapoff in case of error.
>>
>> ...
>>
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>  		ret = 0;
>>  		goto out;
>>  	}
>> +	if (unlikely(!PageUptodate(page))) {
>> +		ret = -EIO;
>> +		goto out;
>> +	}
>>  
>>  	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>  	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
> 
> Failing the swapoff after -EIO seems a bit rude.  The user ends up with
> a permanently mounted swap because a sector was bad?
> 

This is really unfortunate. :(

> That would be like failing truncate() or close() or umount after -EIO
> on a regular file.  Somewhat.
> 
> Can we do something better?  Such as shooting down the page anyway and
> permitting the swapoff to proceed?  Worst case, just leak the dang page
> with an apologetic message.

We must have a way to prevent the user from accessing the wrong data. One way
is to keep the page in the swap cache and kill the user process when the page
is accessed. But this will end up with a permanently mounted swap.
Another way I can think of is to set the page table entry to a special swap
entry, such as SWP_EIO, similar to SWP_HWPOISON. We could then kill the user
process when the page is accessed, while swapoff can still proceed. But this
makes the code more complicated... Any suggestions?

Many thanks!
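
The special-entry idea above could be sketched roughly as follows. This is
a userspace model under stated assumptions, not a proposed implementation:
model_pte_t, the 56-bit type/offset split, SWP_EIO_MODEL and every model_*
function are hypothetical names made up for illustration. The point is only
that swapoff replaces the entry for an unreadable page with a marker and
returns success, and a later access sees the marker and gets SIGBUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry types; SWP_EIO_MODEL stands in for the proposed SWP_EIO. */
enum model_swp_type { SWP_NORMAL_MODEL = 0, SWP_EIO_MODEL = 1 };

/* Arbitrary encoding for the model: top 8 bits are the type, rest the offset. */
#define MODEL_TYPE_SHIFT	56
#define MODEL_OFFSET_MASK	((UINT64_C(1) << MODEL_TYPE_SHIFT) - 1)

typedef struct {
	uint64_t val;	/* swap entry while !present (a real PTE would hold a PFN once present) */
	bool present;
} model_pte_t;

uint64_t model_swp_entry(enum model_swp_type type, uint64_t offset)
{
	assert(offset <= MODEL_OFFSET_MASK);
	return ((uint64_t)type << MODEL_TYPE_SHIFT) | offset;
}

enum model_swp_type model_swp_type_of(uint64_t entry)
{
	return (enum model_swp_type)(entry >> MODEL_TYPE_SHIFT);
}

/* swapoff side: returns 0 in both cases, so swapoff is never blocked. */
int model_unuse_pte(model_pte_t *pte, bool page_uptodate)
{
	assert(pte != NULL && !pte->present);
	if (!page_uptodate) {
		/* Do not map garbage; leave a marker for the fault path. */
		pte->val = model_swp_entry(SWP_EIO_MODEL, 0);
		return 0;
	}
	pte->present = true;	/* valid data, map it as before */
	return 0;
}

/* Fault side: true if the access must be answered with SIGBUS. */
bool model_fault_is_sigbus(const model_pte_t *pte)
{
	assert(pte != NULL);
	return !pte->present && model_swp_type_of(pte->val) == SWP_EIO_MODEL;
}
```

Compared with returning -EIO from unuse_pte(), this trades some extra
handling in the fault path for a swapoff that can complete even when a
sector is bad.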
Miaohe Lin April 6, 2022, 8:47 a.m. UTC | #5
On 2022/4/4 22:11, Matthew Wilcox wrote:
> On Mon, Apr 04, 2022 at 03:37:36PM +0200, David Hildenbrand wrote:
>> On 01.04.22 09:29, Miaohe Lin wrote:
>>> There is a bug in unuse_pte(): when swap page happens to be unreadable,
>>> page filled with random data is mapped into user address space. The fix
>>> is to check for PageUptodate and fail swapoff in case of error.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>  mm/swapfile.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>>> index 63c61f8b2611..e72a35de7a0f 100644
>>> --- a/mm/swapfile.c
>>> +++ b/mm/swapfile.c
>>> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>>  		ret = 0;
>>>  		goto out;
>>>  	}
>>> +	if (unlikely(!PageUptodate(page))) {
>>> +		ret = -EIO;
>>> +		goto out;
>>> +	}
>>
>> Yeah, we have the same handling in do_swap_page(), whereby we send a
>> SIGBUS because we're dealing with an actual access.
>>
>> Interestingly, folio_test_uptodate() states:
>>
>> "Anonymous and CoW folios are always uptodate."
>>
>> @Willy, is that true or is the swapin case not documented there?
> 
> Why do we keep a !Uptodate page in the swap cache?  If it can't be
> read in from swap, I thought we just freed the page.  Since Miaohe

We could free the bad page. But we still need a way to prevent the user from
accessing the wrong data.

> has observed that not happening, I guess it doesn't work that way,
> but why not make it work that way?

How could we make it work that way? Could you please explain in more detail?
Or do you have any suggestions?

Many thanks!


Patch

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63c61f8b2611..e72a35de7a0f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1795,6 +1795,10 @@  static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		ret = 0;
 		goto out;
 	}
+	if (unlikely(!PageUptodate(page))) {
+		ret = -EIO;
+		goto out;
+	}
 
 	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
 	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);