From patchwork Fri Oct 20 02:39:29 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 10018769 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A711A60234 for ; Fri, 20 Oct 2017 02:45:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 90D7128E8B for ; Fri, 20 Oct 2017 02:45:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 85C4428E8D; Fri, 20 Oct 2017 02:45:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 124F628E8B for ; Fri, 20 Oct 2017 02:45:56 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id E269C202E616F; Thu, 19 Oct 2017 19:42:16 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=192.55.52.43; helo=mga05.intel.com; envelope-from=dan.j.williams@intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 998F92034C087 for ; Thu, 19 Oct 2017 19:42:15 -0700 (PDT) Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP; 19 Oct 2017 19:45:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.43,404,1503385200"; d="scan'208";a="165296487" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.125]) by fmsmga006.fm.intel.com with ESMTP; 19 Oct 2017 19:45:53 -0700 Subject: [PATCH v3 06/13] dax: store pfns in the radix From: Dan Williams To: akpm@linux-foundation.org Date: Thu, 19 Oct 2017 19:39:29 -0700 Message-ID: <150846716928.24336.1310590439872030859.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <150846713528.24336.4459262264611579791.stgit@dwillia2-desk3.amr.corp.intel.com> References: <150846713528.24336.4459262264611579791.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jan Kara , linux-nvdimm@lists.01.org, Matthew Wilcox , linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, hch@lst.de Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP In preparation for examining the busy state of dax pages in the truncate path, switch from sectors to pfns in the radix. Cc: Jan Kara Cc: Jeff Moyer Cc: Christoph Hellwig Cc: Matthew Wilcox Cc: Ross Zwisler Signed-off-by: Dan Williams --- fs/dax.c | 75 +++++++++++++++++++++++--------------------------------------- 1 file changed, 28 insertions(+), 47 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index f001d8c72a06..ac6497dcfebd 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -72,16 +72,15 @@ fs_initcall(init_dax_wait_table); #define RADIX_DAX_ZERO_PAGE (1 << (RADIX_TREE_EXCEPTIONAL_SHIFT + 2)) #define RADIX_DAX_EMPTY (1 << (RADIX_TREE_EXCEPTIONAL_SHIFT + 3)) -static unsigned long dax_radix_sector(void *entry) +static unsigned long dax_radix_pfn(void *entry) { return (unsigned long)entry >> RADIX_DAX_SHIFT; } -static void *dax_radix_locked_entry(sector_t sector, unsigned long flags) +static void *dax_radix_locked_entry(unsigned long pfn, unsigned long flags) { return (void *)(RADIX_TREE_EXCEPTIONAL_ENTRY | flags | - ((unsigned long)sector << RADIX_DAX_SHIFT) | - RADIX_DAX_ENTRY_LOCK); + (pfn << RADIX_DAX_SHIFT) | RADIX_DAX_ENTRY_LOCK); } static unsigned int dax_radix_order(void *entry) @@ -525,12 +524,13 @@ static int copy_user_dax(struct block_device *bdev, struct dax_device *dax_dev, */ static void *dax_insert_mapping_entry(struct address_space *mapping, struct vm_fault *vmf, - void *entry, sector_t sector, + void *entry, pfn_t pfn_t, unsigned long flags) { struct radix_tree_root *page_tree = &mapping->page_tree; - void *new_entry; + unsigned long pfn = pfn_t_to_pfn(pfn_t); pgoff_t index = vmf->pgoff; + void *new_entry; if (vmf->flags & FAULT_FLAG_WRITE) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); @@ -547,7 +547,7 @@ static void *dax_insert_mapping_entry(struct address_space *mapping, } spin_lock_irq(&mapping->tree_lock); - new_entry = dax_radix_locked_entry(sector, flags); + new_entry = dax_radix_locked_entry(pfn, flags); if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) { /* @@ -653,17 +653,14 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, i_mmap_unlock_read(mapping); } -static int dax_writeback_one(struct block_device *bdev, - struct dax_device *dax_dev, struct address_space *mapping, - pgoff_t index, void *entry) +static int dax_writeback_one(struct dax_device *dax_dev, + struct address_space *mapping, pgoff_t index, void *entry) { struct radix_tree_root *page_tree = &mapping->page_tree; - void *entry2, **slot, *kaddr; - long ret = 0, id; - sector_t sector; - pgoff_t pgoff; + void *entry2, **slot; + unsigned long pfn; + long ret = 0; size_t size; - pfn_t pfn; /* * A page got tagged dirty in DAX mapping? Something is seriously @@ -682,7 +679,7 @@ static int dax_writeback_one(struct block_device *bdev, * compare sectors as we must not bail out due to difference in lockbit * or entry type. */ - if (dax_radix_sector(entry2) != dax_radix_sector(entry)) + if (dax_radix_pfn(entry2) != dax_radix_pfn(entry)) goto put_unlocked; if (WARN_ON_ONCE(dax_is_empty_entry(entry) || dax_is_zero_entry(entry))) { @@ -712,29 +709,11 @@ static int dax_writeback_one(struct block_device *bdev, * 'entry'. This allows us to flush for PMD_SIZE and not have to * worry about partial PMD writebacks. */ - sector = dax_radix_sector(entry); + pfn = dax_radix_pfn(entry); size = PAGE_SIZE << dax_radix_order(entry); - id = dax_read_lock(); - ret = bdev_dax_pgoff(bdev, sector, size, &pgoff); - if (ret) - goto dax_unlock; - - /* - * dax_direct_access() may sleep, so cannot hold tree_lock over - * its invocation. - */ - ret = dax_direct_access(dax_dev, pgoff, size / PAGE_SIZE, &kaddr, &pfn); - if (ret < 0) - goto dax_unlock; - - if (WARN_ON_ONCE(ret < size / PAGE_SIZE)) { - ret = -EIO; - goto dax_unlock; - } - - dax_mapping_entry_mkclean(mapping, index, pfn_t_to_pfn(pfn)); - dax_flush(dax_dev, kaddr, size); + dax_mapping_entry_mkclean(mapping, index, pfn); + dax_flush(dax_dev, page_address(pfn_to_page(pfn)), size); /* * After we have flushed the cache, we can clear the dirty tag. There * cannot be new dirty data in the pfn after the flush has completed as @@ -745,8 +724,6 @@ static int dax_writeback_one(struct block_device *bdev, radix_tree_tag_clear(page_tree, index, PAGECACHE_TAG_DIRTY); spin_unlock_irq(&mapping->tree_lock); trace_dax_writeback_one(mapping->host, index, size >> PAGE_SHIFT); - dax_unlock: - dax_read_unlock(id); put_locked_mapping_entry(mapping, index); return ret; @@ -804,8 +781,8 @@ int dax_writeback_mapping_range(struct address_space *mapping, break; } - ret = dax_writeback_one(bdev, dax_dev, mapping, - indices[i], pvec.pages[i]); + ret = dax_writeback_one(dax_dev, mapping, indices[i], + pvec.pages[i]); if (ret < 0) { mapping_set_error(mapping, ret); goto out; @@ -843,7 +820,7 @@ static int dax_insert_mapping(struct address_space *mapping, } dax_read_unlock(id); - ret = dax_insert_mapping_entry(mapping, vmf, entry, sector, 0); + ret = dax_insert_mapping_entry(mapping, vmf, entry, pfn, 0); if (IS_ERR(ret)) return PTR_ERR(ret); @@ -852,6 +829,7 @@ static int dax_insert_mapping(struct address_space *mapping, return vm_insert_mixed_mkwrite(vma, vaddr, pfn); else return vm_insert_mixed(vma, vaddr, pfn); + return rc; } /* @@ -869,6 +847,7 @@ static int dax_load_hole(struct address_space *mapping, void *entry, int ret = VM_FAULT_NOPAGE; struct page *zero_page; void *entry2; + pfn_t pfn; zero_page = ZERO_PAGE(0); if (unlikely(!zero_page)) { @@ -876,14 +855,15 @@ static int dax_load_hole(struct address_space *mapping, void *entry, goto out; } - entry2 = dax_insert_mapping_entry(mapping, vmf, entry, 0, + pfn = page_to_pfn_t(zero_page); + entry2 = dax_insert_mapping_entry(mapping, vmf, entry, pfn, RADIX_DAX_ZERO_PAGE); if (IS_ERR(entry2)) { ret = VM_FAULT_SIGBUS; goto out; } - vm_insert_mixed(vmf->vma, vaddr, page_to_pfn_t(zero_page)); + vm_insert_mixed(vmf->vma, vaddr, pfn); out: trace_dax_load_hole(inode, vmf, ret); return ret; @@ -1250,8 +1230,7 @@ static int dax_pmd_insert_mapping(struct vm_fault *vmf, struct iomap *iomap, goto unlock_fallback; dax_read_unlock(id); - ret = dax_insert_mapping_entry(mapping, vmf, entry, sector, - RADIX_DAX_PMD); + ret = dax_insert_mapping_entry(mapping, vmf, entry, pfn, RADIX_DAX_PMD); if (IS_ERR(ret)) goto fallback; @@ -1276,13 +1255,15 @@ static int dax_pmd_load_hole(struct vm_fault *vmf, struct iomap *iomap, void *ret = NULL; spinlock_t *ptl; pmd_t pmd_entry; + pfn_t pfn; zero_page = mm_get_huge_zero_page(vmf->vma->vm_mm); if (unlikely(!zero_page)) goto fallback; - ret = dax_insert_mapping_entry(mapping, vmf, entry, 0, + pfn = page_to_pfn_t(zero_page); + ret = dax_insert_mapping_entry(mapping, vmf, entry, pfn, RADIX_DAX_PMD | RADIX_DAX_ZERO_PAGE); if (IS_ERR(ret)) goto fallback;