From patchwork Thu Oct 10 16:12:41 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrea Arcangeli
X-Patchwork-Id: 3017461
From: Andrea Arcangeli
To: Andrew Morton
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Gleb Natapov, Mel Gorman, Rik van Riel,
 Hugh Dickins
Subject: [PATCH] mm: hugetlb: initialize PG_reserved for tail pages of
 gigantic compound pages
Date: Thu, 10 Oct 2013 18:12:41 +0200
Message-Id: <1381421561-10203-2-git-send-email-aarcange@redhat.com>
In-Reply-To: <1381421561-10203-1-git-send-email-aarcange@redhat.com>
References: <1381421561-10203-1-git-send-email-aarcange@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

Commit 11feeb498086a3a5907b8148bdf1786a9b18fc55 introduced a memory
leak when KVM is run on gigantic compound pages.

That commit depends on the assumption that PG_reserved is identical
for all head and tail pages of a compound page, so that if
get_user_pages() returns a tail page, we don't need to check the head
page in order to know whether we are dealing with a reserved page that
requires different refcounting.

The assumption that PG_reserved is the same for head and tail pages is
certainly correct for THP and regular hugepages, but gigantic
hugepages allocated through bootmem don't clear PG_reserved on the
tail pages (PG_reserved is cleared later, and only if the gigantic
hugepage is freed).

This patch corrects the gigantic compound page initialization so that
we can retain the optimization in
11feeb498086a3a5907b8148bdf1786a9b18fc55. The cacheline was already
modified in order to set PG_tail, so this won't affect the boot time
of large memory systems.
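
The leak described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not kernel
code: struct toy_page and every toy_* function are hypothetical names
invented for illustration, the "pin lands on the head page" rule is a
simplification of compound page refcounting, and toy_release() merely
stands in for a driver release path that inspects PG_reserved on the
page it was handed without looking at the head page.

```c
/* Toy userspace model of the leak; NOT kernel code.
 * Every name here (toy_page, toy_*) is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 4	/* tiny stand-in for a gigantic compound page */

struct toy_page {
	bool reserved;		/* models PG_reserved */
	int refcount;		/* only meaningful on the head page */
	struct toy_page *head;	/* every page points back at its head */
};

/* Model compound page setup for bootmem memory. The head always has
 * PG_reserved cleared; the tails only do if clear_tail_reserved is set. */
static void toy_prep_compound(struct toy_page *pages, bool clear_tail_reserved)
{
	pages[0].reserved = false;
	pages[0].refcount = 1;	/* reference held by the hugepage pool */
	pages[0].head = &pages[0];
	for (size_t i = 1; i < NR_PAGES; i++) {
		pages[i].reserved = !clear_tail_reserved;
		pages[i].refcount = 0;
		pages[i].head = &pages[0];
	}
}

/* Model get_user_pages() handing back a tail page: in this simplified
 * model the reference is accounted on the head page. */
static struct toy_page *toy_pin(struct toy_page *pages, size_t idx)
{
	assert(idx < NR_PAGES);
	pages[idx].head->refcount++;
	return &pages[idx];
}

/* Model a release path that checks only the page it was given. */
static void toy_release(struct toy_page *p)
{
	if (p->reserved)
		return;		/* looks reserved: skip the put, pin leaks */
	p->head->refcount--;
}

/* Number of references leaked by one pin/release cycle on a tail page. */
int toy_leaked_refs(bool clear_tail_reserved)
{
	struct toy_page pages[NR_PAGES];
	int before;

	toy_prep_compound(pages, clear_tail_reserved);
	before = pages[0].refcount;
	toy_release(toy_pin(pages, NR_PAGES - 1));
	return pages[0].refcount - before;
}
```

In this model, toy_leaked_refs(false) corresponds to the behaviour
before this patch (tail pages keep PG_reserved and the pin is never
dropped), while toy_leaked_refs(true) corresponds to the behaviour
with tail pages cleared, where the release path can trust the flag on
the tail page alone.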

Reported-by: andy123
Signed-off-by: Andrea Arcangeli
Acked-by: Rik van Riel
Acked-by: Rafael Aquini
---
 mm/hugetlb.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b49579c..315450e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -695,8 +695,24 @@ static void prep_compound_gigantic_page(struct page *page, unsigned long order)
 	/* we rely on prep_new_huge_page to set the destructor */
 	set_compound_order(page, order);
 	__SetPageHead(page);
+	__ClearPageReserved(page);
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
 		__SetPageTail(p);
+		/*
+		 * For gigantic hugepages allocated through bootmem at
+		 * boot, it's safer to be consistent with the
+		 * not-gigantic hugepages and to clear the PG_reserved
+		 * bit from all tail pages too.  Otherwise drivers using
+		 * get_user_pages() to access tail pages may get the
+		 * reference counting wrong if they see the
+		 * PG_reserved bitflag set on a tail page (despite the
+		 * head page not having PG_reserved set).  Enforcing
+		 * this consistency between head and tail pages
+		 * allows drivers to optimize away a check on the head
+		 * page when they need to know if put_page is needed
+		 * after get_user_pages() or not.
+		 */
+		__ClearPageReserved(p);
 		set_page_count(p, 0);
 		p->first_page = page;
 	}
@@ -1329,9 +1345,9 @@ static void __init gather_bootmem_prealloc(void)
 #else
 		page = virt_to_page(m);
 #endif
-		__ClearPageReserved(page);
 		WARN_ON(page_count(page) != 1);
 		prep_compound_huge_page(page, h->order);
+		WARN_ON(PageReserved(page));
 		prep_new_huge_page(h, page, page_to_nid(page));
 		/*
 		 * If we had gigantic hugepages allocated at boot time, we need