[v2,1/1] KVM: s390: fix race in gmap_make_secure

Message ID 20230426134834.35199-2-imbrenda@linux.ibm.com (mailing list archive)
State New, archived
Series fix race in gmap_make_secure

Commit Message

Claudio Imbrenda April 26, 2023, 1:48 p.m. UTC
This patch fixes a potential race in gmap_make_secure and removes the
last user of follow_page without FOLL_GET.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes: 214d9bbcd3a6 ("s390/mm: provide memory management functions for protected KVM guests")
---
 arch/s390/kernel/uv.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

Comments

Heiko Carstens April 27, 2023, 10:53 a.m. UTC | #1
It would be helpful if this were a bit more descriptive. "Fix
race" is not very helpful :)

What race does this fix?
When can this happen?
What are the consequences if the race window is being hit?
Claudio Imbrenda April 27, 2023, 11:46 a.m. UTC | #2
We are locking something we don't have a reference to, and as explained
by Jason and David in this thread <Y9J4P/RNvY1Ztn0Q@nvidia.com> it can
lead to all kinds of bad things, including the page getting
unmapped (MADV_DONTNEED), freed, and reallocated as a larger folio, in
which case the unlock_page() would target the wrong bit.

There is also a second race, involving FOLL_WRITE: the mapping could
change between the follow_page and the get_locked_pte.

The main point of the patch is to remove the last follow_page without
FOLL_GET or FOLL_PIN; removing the races can be considered a nice bonus.
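
As a rough illustration of the pattern the patch moves to (look the page
up and lock it while the PTE lock is still held, and only ever *try* the
page lock), here is a minimal userspace sketch. Everything in it is a
hypothetical stand-in: the toy_* names, the two structs and the use of C11
atomic_flag are assumptions made for the example, not kernel APIs and not
the code from the patch. It only mirrors the return-code flow of the new
gmap_make_secure() hunk (-ENXIO, -EAGAIN, or the result of the conversion).

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of this is kernel code. */
struct toy_page {
	atomic_flag locked;	/* plays the role of the page lock bit */
	bool secure;		/* set once the "conversion" has happened */
};

struct toy_pte {
	atomic_flag ptl;	/* plays the role of the PTE spinlock */
	struct toy_page *page;	/* NULL: nothing mapped */
	bool writable;
};

static void toy_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* busy-wait until the flag is released */
}

static void toy_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns true if the page lock was taken, false if it is already held. */
static bool toy_trylock_page(struct toy_page *page)
{
	return !atomic_flag_test_and_set_explicit(&page->locked,
						  memory_order_acquire);
}

static void toy_unlock_page(struct toy_page *page)
{
	atomic_flag_clear_explicit(&page->locked, memory_order_release);
}

/*
 * The lookup, the writability check and the page lock all happen under the
 * same table lock, so the mapping cannot change in between. The page lock
 * is only tried, never waited on, because a spinlock is already held.
 */
static int toy_make_secure(struct toy_pte *pte)
{
	int rc = -ENXIO;

	assert(pte != NULL);
	toy_spin_lock(&pte->ptl);
	if (pte->page && pte->writable) {
		struct toy_page *page = pte->page;

		rc = -EAGAIN;
		if (toy_trylock_page(page)) {
			page->secure = true;	/* the "conversion" step */
			rc = 0;
			toy_unlock_page(page);
		}
	}
	toy_spin_unlock(&pte->ptl);
	return rc;
}
```

In this sketch a caller gets 0 when the page was converted, -EAGAIN when
the page lock is currently held by someone else, and -ENXIO when nothing
writable is mapped. How the real callers react to -EAGAIN is not covered
here.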
Heiko Carstens April 27, 2023, 12:01 p.m. UTC | #3
I've seen that discussion. What I'm actually asking for is that all of
this information should be added to the commit description. Nobody
will remember any of the details in one year.
Claudio Imbrenda April 27, 2023, 12:17 p.m. UTC | #4
I will put it in the patch description.

Do you think the text above is enough?
Heiko Carstens April 27, 2023, 12:45 p.m. UTC | #5
Fine with me. With a proper Link: tag this is much better than before.
Thanks!
Patch

diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 9f18a4af9c13..cb2ee06df286 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -192,21 +192,10 @@  static int expected_page_refs(struct page *page)
 	return res;
 }
 
-static int make_secure_pte(pte_t *ptep, unsigned long addr,
-			   struct page *exp_page, struct uv_cb_header *uvcb)
+static int make_page_secure(struct page *page, struct uv_cb_header *uvcb)
 {
-	pte_t entry = READ_ONCE(*ptep);
-	struct page *page;
 	int expected, cc = 0;
 
-	if (!pte_present(entry))
-		return -ENXIO;
-	if (pte_val(entry) & _PAGE_INVALID)
-		return -ENXIO;
-
-	page = pte_page(entry);
-	if (page != exp_page)
-		return -ENXIO;
 	if (PageWriteback(page))
 		return -EAGAIN;
 	expected = expected_page_refs(page);
@@ -304,17 +293,18 @@  int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
 		goto out;
 
 	rc = -ENXIO;
-	page = follow_page(vma, uaddr, FOLL_WRITE);
-	if (IS_ERR_OR_NULL(page))
-		goto out;
-
-	lock_page(page);
 	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
-	if (should_export_before_import(uvcb, gmap->mm))
-		uv_convert_from_secure(page_to_phys(page));
-	rc = make_secure_pte(ptep, uaddr, page, uvcb);
+	if (pte_present(*ptep) && !(pte_val(*ptep) & _PAGE_INVALID) && pte_write(*ptep)) {
+		page = pte_page(*ptep);
+		rc = -EAGAIN;
+		if (trylock_page(page)) {
+			if (should_export_before_import(uvcb, gmap->mm))
+				uv_convert_from_secure(page_to_phys(page));
+			rc = make_page_secure(page, uvcb);
+			unlock_page(page);
+		}
+	}
 	pte_unmap_unlock(ptep, ptelock);
-	unlock_page(page);
 out:
 	mmap_read_unlock(gmap->mm);