From patchwork Mon Jul 19 10:35:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 12385513 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3BE4C07E9B for ; Mon, 19 Jul 2021 11:04:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 99B2A61029 for ; Mon, 19 Jul 2021 11:04:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236361AbhGSKX6 (ORCPT ); Mon, 19 Jul 2021 06:23:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58676 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235440AbhGSKX5 (ORCPT ); Mon, 19 Jul 2021 06:23:57 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90DF5C061574; Mon, 19 Jul 2021 03:13:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=Mri/mZGJNVqNUZOZpDacK9KS7AXJtfgkxhLzH1GJHvs=; b=W6UsLolXM36iNim1LnOQsmSUhu /8V4Kx34B3/BwW5Rk07ZmE81dJX33AEGp/h46PvvX1HdSK3npcKD3wK+0hU+gJi8cE5+FfEbJJYVr yNVRgywtOPNqGcww5BHLRFZiYX1HgqtRPbieo+B7eh0OW+KqCsdgu1UaOZzKf3+qkyu99ajuiaO/P IHXZ7KXMVDDYei3z6nK0vd/XL/EbgGt6gGmNffvjOBkpwwS4cKicflQK5+XDIB6yRYcmg2/HJHPyH Vj2cdFrvMRSj3fxoudq2t71hn9PaPiO39l/PTWAhNX4n9mtUEAFCxvf6tE93Rd217rn/JEXMvk/Ap tQrKW6Aw==; Received: from [2001:4bb8:193:7660:d2a4:8d57:2e55:21d0] (helo=localhost) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1m5R2c-006mPg-VX; Mon, 19 Jul 2021 11:02:41 +0000 From: Christoph Hellwig To: "Darrick J. Wong" Cc: Dan Williams , Matthew Wilcox , Andreas Gruenbacher , Shiyang Ruan , linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, nvdimm@lists.linux.dev, cluster-devel@redhat.com, Ritesh Harjani Subject: [PATCH 24/27] fsdax: factor out helpers to simplify the dax fault code Date: Mon, 19 Jul 2021 12:35:17 +0200 Message-Id: <20210719103520.495450-25-hch@lst.de> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210719103520.495450-1-hch@lst.de> References: <20210719103520.495450-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by casper.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Shiyang Ruan The dax page fault code is too long and a bit difficult to read. And it is hard to understand when we trying to add new features. Some of the PTE/PMD codes have similar logic. So, factor out helper functions to simplify the code. Signed-off-by: Shiyang Ruan Reviewed-by: Ritesh Harjani Reviewed-by: Darrick J. Wong [hch: minor cleanups] Signed-off-by: Christoph Hellwig --- fs/dax.c | 153 ++++++++++++++++++++++++++++++------------------------- 1 file changed, 84 insertions(+), 69 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 51da45301350a6..c09d721629d167 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1255,6 +1255,53 @@ static bool dax_fault_is_synchronous(unsigned long flags, && (iomap->flags & IOMAP_F_DIRTY); } +/* + * When handling a synchronous page fault and the inode need a fsync, we can + * insert the PTE/PMD into page tables only after that fsync happened. Skip + * insertion for now and return the pfn so that caller can insert it after the + * fsync is done. + */ +static vm_fault_t dax_fault_synchronous_pfnp(pfn_t *pfnp, pfn_t pfn) +{ + if (WARN_ON_ONCE(!pfnp)) + return VM_FAULT_SIGBUS; + *pfnp = pfn; + return VM_FAULT_NEEDDSYNC; +} + +static vm_fault_t dax_fault_cow_page(struct vm_fault *vmf, struct iomap *iomap, + loff_t pos) +{ + sector_t sector = dax_iomap_sector(iomap, pos); + unsigned long vaddr = vmf->address; + vm_fault_t ret; + int error = 0; + + switch (iomap->type) { + case IOMAP_HOLE: + case IOMAP_UNWRITTEN: + clear_user_highpage(vmf->cow_page, vaddr); + break; + case IOMAP_MAPPED: + error = copy_cow_page_dax(iomap->bdev, iomap->dax_dev, sector, + vmf->cow_page, vaddr); + break; + default: + WARN_ON_ONCE(1); + error = -EIO; + break; + } + + if (error) + return dax_fault_return(error); + + __SetPageUptodate(vmf->cow_page); + ret = finish_fault(vmf); + if (!ret) + return VM_FAULT_DONE_COW; + return ret; +} + static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops) { @@ -1323,30 +1370,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, } if (vmf->cow_page) { - sector_t sector = dax_iomap_sector(&iomap, pos); - - switch (iomap.type) { - case IOMAP_HOLE: - case IOMAP_UNWRITTEN: - clear_user_highpage(vmf->cow_page, vaddr); - break; - case IOMAP_MAPPED: - error = copy_cow_page_dax(iomap.bdev, iomap.dax_dev, - sector, vmf->cow_page, vaddr); - break; - default: - WARN_ON_ONCE(1); - error = -EIO; - break; - } - - if (error) - goto error_finish_iomap; - - __SetPageUptodate(vmf->cow_page); - ret = finish_fault(vmf); - if (!ret) - ret = VM_FAULT_DONE_COW; + ret = dax_fault_cow_page(vmf, &iomap, pos); goto finish_iomap; } @@ -1366,19 +1390,8 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, 0, write && !sync); - /* - * If we are doing synchronous page fault and inode needs fsync, - * we can insert PTE into page tables only after that happens. - * Skip insertion for now and return the pfn so that caller can - * insert it after fsync is done. - */ if (sync) { - if (WARN_ON_ONCE(!pfnp)) { - error = -EIO; - goto error_finish_iomap; - } - *pfnp = pfn; - ret = VM_FAULT_NEEDDSYNC | major; + ret = dax_fault_synchronous_pfnp(pfnp, pfn); goto finish_iomap; } trace_dax_insert_mapping(inode, vmf, entry); @@ -1477,13 +1490,45 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, return VM_FAULT_FALLBACK; } +static bool dax_fault_check_fallback(struct vm_fault *vmf, struct xa_state *xas, + pgoff_t max_pgoff) +{ + unsigned long pmd_addr = vmf->address & PMD_MASK; + bool write = vmf->flags & FAULT_FLAG_WRITE; + + /* + * Make sure that the faulting address's PMD offset (color) matches + * the PMD offset from the start of the file. This is necessary so + * that a PMD range in the page table overlaps exactly with a PMD + * range in the page cache. + */ + if ((vmf->pgoff & PG_PMD_COLOUR) != + ((vmf->address >> PAGE_SHIFT) & PG_PMD_COLOUR)) + return true; + + /* Fall back to PTEs if we're going to COW */ + if (write && !(vmf->vma->vm_flags & VM_SHARED)) + return true; + + /* If the PMD would extend outside the VMA */ + if (pmd_addr < vmf->vma->vm_start) + return true; + if ((pmd_addr + PMD_SIZE) > vmf->vma->vm_end) + return true; + + /* If the PMD would extend beyond the file size */ + if ((xas->xa_index | PG_PMD_COLOUR) >= max_pgoff) + return true; + + return false; +} + static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, const struct iomap_ops *ops) { struct vm_area_struct *vma = vmf->vma; struct address_space *mapping = vma->vm_file->f_mapping; XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, PMD_ORDER); - unsigned long pmd_addr = vmf->address & PMD_MASK; bool write = vmf->flags & FAULT_FLAG_WRITE; bool sync; unsigned int iomap_flags = (write ? IOMAP_WRITE : 0) | IOMAP_FAULT; @@ -1506,33 +1551,12 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, trace_dax_pmd_fault(inode, vmf, max_pgoff, 0); - /* - * Make sure that the faulting address's PMD offset (color) matches - * the PMD offset from the start of the file. This is necessary so - * that a PMD range in the page table overlaps exactly with a PMD - * range in the page cache. - */ - if ((vmf->pgoff & PG_PMD_COLOUR) != - ((vmf->address >> PAGE_SHIFT) & PG_PMD_COLOUR)) - goto fallback; - - /* Fall back to PTEs if we're going to COW */ - if (write && !(vma->vm_flags & VM_SHARED)) - goto fallback; - - /* If the PMD would extend outside the VMA */ - if (pmd_addr < vma->vm_start) - goto fallback; - if ((pmd_addr + PMD_SIZE) > vma->vm_end) - goto fallback; - if (xas.xa_index >= max_pgoff) { result = VM_FAULT_SIGBUS; goto out; } - /* If the PMD would extend beyond the file size */ - if ((xas.xa_index | PG_PMD_COLOUR) >= max_pgoff) + if (dax_fault_check_fallback(vmf, &xas, max_pgoff)) goto fallback; /* @@ -1584,17 +1608,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, entry = dax_insert_entry(&xas, mapping, vmf, entry, pfn, DAX_PMD, write && !sync); - /* - * If we are doing synchronous page fault and inode needs fsync, - * we can insert PMD into page tables only after that happens. - * Skip insertion for now and return the pfn so that caller can - * insert it after fsync is done. - */ if (sync) { - if (WARN_ON_ONCE(!pfnp)) - goto finish_iomap; - *pfnp = pfn; - result = VM_FAULT_NEEDDSYNC; + result = dax_fault_synchronous_pfnp(pfnp, pfn); goto finish_iomap; }