From patchwork Tue Jul 26 00:35:19 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A . Shutemov" X-Patchwork-Id: 9247419 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 2717860B17 for ; Tue, 26 Jul 2016 00:42:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 16C9727AC2 for ; Tue, 26 Jul 2016 00:42:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0B89827BF8; Tue, 26 Jul 2016 00:42:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AC95427AC2 for ; Tue, 26 Jul 2016 00:42:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756503AbcGZAkm (ORCPT ); Mon, 25 Jul 2016 20:40:42 -0400 Received: from mga01.intel.com ([192.55.52.88]:3568 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755307AbcGZAgD (ORCPT ); Mon, 25 Jul 2016 20:36:03 -0400 Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga101.fm.intel.com with ESMTP; 25 Jul 2016 17:36:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.28,421,1464678000"; d="scan'208";a="739693609" Received: from black.fi.intel.com ([10.237.72.93]) by FMSMGA003.fm.intel.com with ESMTP; 25 Jul 2016 17:35:57 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 7D088966; Tue, 26 Jul 2016 03:35:47 +0300 (EEST) From: "Kirill A. Shutemov" To: "Theodore Ts'o" , Andreas Dilger , Jan Kara Cc: Alexander Viro , Hugh Dickins , Andrea Arcangeli , Andrew Morton , Dave Hansen , Vlastimil Babka , Matthew Wilcox , Ross Zwisler , linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv1, RFC 17/33] HACK: readahead: alloc huge pages, if allowed Date: Tue, 26 Jul 2016 03:35:19 +0300 Message-Id: <1469493335-3622-18-git-send-email-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com> References: <1469493335-3622-1-git-send-email-kirill.shutemov@linux.intel.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Most page cache allocation happens via readahead (sync or async), so if we want to have significant number of huge pages in page cache we need to find a ways to allocate them from readahead. Unfortunately, huge pages doesn't fit into current readahead design: 128 max readahead window, assumption on page size, PageReadahead() to track hit/miss. I haven't found a ways to get it right yet. This patch just allocates huge page if allowed, but doesn't really provide any readahead if huge page is allocated. We read out 2M a time and I would expect spikes in latancy without readahead. Therefore HACK. Any suggestions are welcome. Signed-off-by: Kirill A. Shutemov --- mm/readahead.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/mm/readahead.c b/mm/readahead.c index 65ec288dc057..3d7742a687f2 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -173,6 +173,20 @@ int __do_page_cache_readahead(struct address_space *mapping, struct file *filp, if (page_offset > end_index) break; + if ((!page_idx || page_offset % HPAGE_PMD_NR == 0) && + page_cache_allow_huge(mapping, page_offset)) { + page = __page_cache_alloc_order(gfp_mask | __GFP_COMP, + HPAGE_PMD_ORDER); + if (page) { + prep_transhuge_page(page); + page->index = round_down(page_offset, + HPAGE_PMD_NR); + list_add(&page->lru, &page_pool); + ret++; + goto start_io; + } + } + rcu_read_lock(); page = radix_tree_lookup(&mapping->page_tree, page_offset); rcu_read_unlock(); @@ -188,7 +202,7 @@ int __do_page_cache_readahead(struct address_space *mapping, struct file *filp, SetPageReadahead(page); ret++; } - +start_io: /* * Now start the IO. We ignore I/O errors - if the page is not * uptodate then the caller will launch readpage again, and