From patchwork Tue Oct  6 22:28:48 2015
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, Andrew Morton, Dan Williams, Dave Chinner, Jan Kara,
    "Kirill A. Shutemov", linux-nvdimm@lists.01.org, Matthew Wilcox
Subject: [PATCH v4 1/2] Revert "mm: take i_mmap_lock in unmap_mapping_range() for DAX"
Date: Tue, 6 Oct 2015 16:28:48 -0600
Message-Id: <1444170529-12814-2-git-send-email-ross.zwisler@linux.intel.com>
In-Reply-To: <1444170529-12814-1-git-send-email-ross.zwisler@linux.intel.com>
References: <1444170529-12814-1-git-send-email-ross.zwisler@linux.intel.com>

This reverts commits 46c043ede4711e8d598b9d63c5616c1fedb0605e and
8346c416d17bf5b4ea1508662959bb62e73fd6a5.

The following two locking commits in the DAX code:

commit 843172978bb9 ("dax: fix race between simultaneous faults")
commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")

introduced a number of deadlocks and other issues, and need to be
reverted for the v4.3 kernel.  The list of issues in DAX after these
commits (some newly introduced by the commits, some preexisting) can be
found here:

https://lkml.org/lkml/2015/9/25/602

This revert keeps the PMEM API changes to the zeroing code in
__dax_pmd_fault(), which were added by this commit:

commit d77e92e270ed ("dax: update PMD fault handler with PMEM API")

It also keeps the code that drops mapping->i_mmap_rwsem before calling
unmap_mapping_range(), but converts it to a read lock, since that is
what the rest of __dax_pmd_fault() now uses.  This is needed to avoid
recursively acquiring mapping->i_mmap_rwsem: once with a read lock in
__dax_pmd_fault() and once with a write lock in unmap_mapping_range().
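For illustration, here is a minimal userspace sketch of that self-deadlock
(not kernel code, and not part of this patch), assuming POSIX rwlocks,
which, like the kernel's rw_semaphore, do not let a thread that holds the
lock for read then acquire the same lock for write:

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

int main(void)
{
	/* stands in for i_mmap_lock_read() in __dax_pmd_fault() */
	pthread_rwlock_rdlock(&lock);
	printf("read lock held; now taking the write lock...\n");

	/*
	 * stands in for i_mmap_lock_write() in unmap_mapping_range():
	 * the calling thread already holds the read lock, so this
	 * blocks forever (POSIX leaves it undefined; glibc deadlocks).
	 * This is the recursive acquisition the patch avoids.
	 */
	pthread_rwlock_wrlock(&lock);
	printf("never reached\n");
	return 0;
}

The revert sidesteps this by having unmap_mapping_range() skip the write
lock entirely for DAX inodes, as shown in the mm/memory.c hunk below.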
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
 fs/dax.c    | 37 +++++++++++++------------------------
 mm/memory.c | 11 +++++++++--
 2 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index bcfb14b..f665bc9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -569,36 +569,14 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
 		goto fallback;
 
-	sector = bh.b_blocknr << (blkbits - 9);
-
-	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
-		int i;
-
-		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
-						bh.b_size);
-		if (length < 0) {
-			result = VM_FAULT_SIGBUS;
-			goto out;
-		}
-		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
-			goto fallback;
-
-		for (i = 0; i < PTRS_PER_PMD; i++)
-			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
-		wmb_pmem();
-		count_vm_event(PGMAJFAULT);
-		mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
-		result |= VM_FAULT_MAJOR;
-	}
-
 	/*
 	 * If we allocated new storage, make sure no process has any
 	 * zero pages covering this hole
 	 */
 	if (buffer_new(&bh)) {
-		i_mmap_unlock_write(mapping);
+		i_mmap_unlock_read(mapping);
 		unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, PMD_SIZE, 0);
-		i_mmap_lock_write(mapping);
+		i_mmap_lock_read(mapping);
 	}
 
 	/*
@@ -635,6 +613,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		result = VM_FAULT_NOPAGE;
 		spin_unlock(ptl);
 	} else {
+		sector = bh.b_blocknr << (blkbits - 9);
 		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
 				bh.b_size);
 		if (length < 0) {
@@ -644,6 +623,16 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
 			goto fallback;
 
+		if (buffer_unwritten(&bh) || buffer_new(&bh)) {
+			int i;
+			for (i = 0; i < PTRS_PER_PMD; i++)
+				clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
+			wmb_pmem();
+			count_vm_event(PGMAJFAULT);
+			mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
+			result |= VM_FAULT_MAJOR;
+		}
+
 		result |= vmf_insert_pfn_pmd(vma, address, pmd, pfn, write);
 	}
 
diff --git a/mm/memory.c b/mm/memory.c
index 9cb2747..5ec066f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2426,10 +2426,17 @@ void unmap_mapping_range(struct address_space *mapping,
 	if (details.last_index < details.first_index)
 		details.last_index = ULONG_MAX;
 
-	i_mmap_lock_write(mapping);
+
+	/*
+	 * DAX already holds i_mmap_lock to serialise file truncate vs
+	 * page fault and page fault vs page fault.
+	 */
+	if (!IS_DAX(mapping->host))
+		i_mmap_lock_write(mapping);
 	if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap)))
 		unmap_mapping_range_tree(&mapping->i_mmap, &details);
-	i_mmap_unlock_write(mapping);
+	if (!IS_DAX(mapping->host))
+		i_mmap_unlock_write(mapping);
 }
 EXPORT_SYMBOL(unmap_mapping_range);