From patchwork Wed Sep 2 09:13:21 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 7110141 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id C41829F32B for ; Wed, 2 Sep 2015 09:13:48 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id BFC4F2065A for ; Wed, 2 Sep 2015 09:13:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B04D320654 for ; Wed, 2 Sep 2015 09:13:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753509AbbIBJN1 (ORCPT ); Wed, 2 Sep 2015 05:13:27 -0400 Received: from mail-wi0-f170.google.com ([209.85.212.170]:33729 "EHLO mail-wi0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752194AbbIBJNY (ORCPT ); Wed, 2 Sep 2015 05:13:24 -0400 Received: by wicmc4 with SMTP id mc4so58859177wic.0 for ; Wed, 02 Sep 2015 02:13:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=5S/Lgcm5q9fomTW0txXA3jQl8GRv0XwZLORh/knFy2Q=; b=lx3UWuZh5BQHFDzr0BK16BRrwH9Vu5uXNHpIbp0WvOVt0Rp393Eam4+u03SS0QaTol D1x35ECFVgAUOj4L+luJVjrXBQv0yA8EkhjXQ1QdbBBOI/XlvreCRYnr3sFO0Tm5IitM ZF6W182KxjKIaYn77Os8nBUJiY5ZozhYBw7TWODaaoV2VDbf/J3DZ6zoEeZRdr6Hxvza NZQqwsjyEVSy92EuuoCgtEwxi75WxmxLue59udj4107mvgQ7ypDQw3GCeAG6uA6XNeC5 +lya8G/xwxcK89rlMRD7NrqyBERdCXO0/bdrzlNucMpi9koSM+w9+SjoduefQ8f0hgv0 zbKg== X-Gm-Message-State: ALoCoQlCaR0kVsxVoMbxtDoo7QY3kinDm8aJSQvZXo0F932i7ftRK04c58NdLMSRH6Ze58u520Cp X-Received: by 10.194.176.201 with SMTP id ck9mr38172330wjc.108.1441185203423; Wed, 02 Sep 2015 02:13:23 -0700 (PDT) Received: from node.shutemov.name (dsl-espbrasgw1-54f9d4-72.dhcp.inet.fi. [84.249.212.72]) by smtp.gmail.com with ESMTPSA id fb3sm2527911wib.21.2015.09.02.02.13.22 (version=TLS1_2 cipher=AES128-GCM-SHA256 bits=128/128); Wed, 02 Sep 2015 02:13:22 -0700 (PDT) Received: by node.shutemov.name (Postfix, from userid 1000) id C46466490E70; Wed, 2 Sep 2015 12:13:21 +0300 (EEST) Date: Wed, 2 Sep 2015 12:13:21 +0300 From: "Kirill A. Shutemov" To: Dave Chinner Cc: Ross Zwisler , linux-kernel@vger.kernel.org, Alexander Viro , Andrew Morton , Christoph Hellwig , Dave Hansen , "H. Peter Anvin" , Hugh Dickins , Ingo Molnar , "Kirill A. Shutemov" , linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org, Matthew Wilcox , Peter Zijlstra , Thomas Gleixner , x86@kernel.org Subject: Re: [PATCH] dax, pmem: add support for msync Message-ID: <20150902091321.GA2323@node.dhcp.inet.fi> References: <1441047584-14664-1-git-send-email-ross.zwisler@linux.intel.com> <20150831233803.GO3902@dastard> <20150901100804.GA7045@node.dhcp.inet.fi> <20150901224922.GR3902@dastard> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20150901224922.GR3902@dastard> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Wed, Sep 02, 2015 at 08:49:22AM +1000, Dave Chinner wrote: > On Tue, Sep 01, 2015 at 01:08:04PM +0300, Kirill A. Shutemov wrote: > > On Tue, Sep 01, 2015 at 09:38:03AM +1000, Dave Chinner wrote: > > > On Mon, Aug 31, 2015 at 12:59:44PM -0600, Ross Zwisler wrote: > > > Even for DAX, msync has to call vfs_fsync_range() for the filesystem to commit > > > the backing store allocations to stable storage, so there's not > > > getting around the fact msync is the wrong place to be flushing > > > DAX mappings to persistent storage. > > > > Why? > > IIUC, msync() doesn't have any requirements wrt metadata, right? > > Of course it does. If the backing store allocation has not been > committed, then after a crash there will be a hole in file and > so it will read as zeroes regardless of what data was written and > flushed. Any reason why backing store allocation cannot be committed on *_mkwrite? > > > I pointed this out almost 6 months ago (i.e. that fsync was broken) > > > anf hinted at how to solve it. Fix fsync, and msync gets fixed for > > > free: > > > > > > https://lists.01.org/pipermail/linux-nvdimm/2015-March/000341.html > > > > > > I've also reported to Willy that DAX write page faults don't work > > > correctly, either. xfstests generic/080 exposes this: a read > > > from a page followed immediately by a write to that page does not > > > result in ->page_mkwrite being called on the write and so > > > backing store is not allocated for the page, nor are the timestamps > > > for the file updated. This will also result in fsync (and msync) > > > not working properly. > > > > Is that because XFS doesn't provide vm_ops->pfn_mkwrite? > > I didn't know that had been committed. I don't recall seeing a pull > request with that in it It went though -mm tree. > none of the XFS DAX patches conflicted > against it and there's been no runtime errors. I'll fix it up. > > As such, shouldn't there be a check in the VM (in ->mmap callers) > that if we have the vma is returned with VM_MIXEDMODE enabled that > ->pfn_mkwrite is not NULL? It's now clear to me that any filesystem > that sets VM_MIXEDMODE needs to support both page_mkwrite and > pfn_mkwrite, and such a check would have caught this immediately... I guess it's "both or none" case. We have VM_MIXEDMAP users who don't care about *_mkwrite. I'm not yet sure it would be always correct, but something like this will catch the XFS case, without false-positive on other stuff in my KVM setup: diff --git a/mm/mmap.c b/mm/mmap.c index 3f78bceefe5a..f2e29a541e14 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1645,6 +1645,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr, vma->vm_ops = &dummy_ops; } + /* + * Make sure that for VM_MIXEDMAP VMA has both + * vm_ops->page_mkwrite and vm_ops->pfn_mkwrite or has none. + */ + if ((vma->vm_ops->page_mkwrite || vma->vm_ops->pfn_mkwrite) && + vma->vm_flags & VM_MIXEDMAP) { + VM_BUG_ON_VMA(!vma->vm_ops->page_mkwrite, vma); + VM_BUG_ON_VMA(!vma->vm_ops->pfn_mkwrite, vma); + } addr = vma->vm_start; vm_flags = vma->vm_flags; } else if (vm_flags & VM_SHARED) {