From patchwork Mon Mar 23 12:54:40 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boaz Harrosh
X-Patchwork-Id: 6072991
Message-ID: <55100D10.6090902@plexistor.com>
Date: Mon, 23 Mar 2015 14:54:40 +0200
From: Boaz Harrosh
To: Dave Chinner, Matthew Wilcox, Andrew Morton, "Kirill A. Shutemov",
 Jan Kara, Hugh Dickins, Mel Gorman, linux-mm@kvack.org, linux-nvdimm,
 linux-fsdevel, Eryu Guan
References: <55100B78.501@plexistor.com>
In-Reply-To: <55100B78.501@plexistor.com>
Subject: [Linux-nvdimm] [PATCH 3/3] RFC: dax: dax_prepare_freeze
List-Id: "Linux-nvdimm developer list."

From: Boaz Harrosh

When freezing an FS, we must write-protect every IS_DAX() inode that
has an mmap mapping. Otherwise applications will still be able to
modify previously faulted-in file pages.

I am actually doing a full unmap_mapping_range() because there is no
readily available "mapping_write_protect"-like functionality.
I do not think it is worth defining one just for this caller, only to
save some extra read faults after an fs_freeze.

How hot a path is fs_freeze anyway?

CC: Jan Kara
CC: Matthew Wilcox
CC: Andrew Morton
Signed-off-by: Boaz Harrosh
---
 fs/dax.c           | 30 ++++++++++++++++++++++++++++++
 fs/super.c         |  3 +++
 include/linux/fs.h |  1 +
 3 files changed, 34 insertions(+)

diff --git a/fs/dax.c b/fs/dax.c
index d0bd1f4..f3fc28b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -549,3 +549,33 @@ int dax_truncate_page(struct inode *inode, loff_t from, get_block_t get_block)
 	return dax_zero_page_range(inode, from, length, get_block);
 }
 EXPORT_SYMBOL_GPL(dax_truncate_page);
+
+/* This is meant to be called as part of freeze_super; otherwise we might
+ * need some extra locking before calling here.
+ */
+void dax_prepare_freeze(struct super_block *sb)
+{
+	struct inode *inode;
+
+	/* TODO: each DAX fs has some private mount option to enable DAX. If
+	 * we made that option a generic MS_DAX_ENABLE super_block flag we could
+	 * avoid the 95% extra unneeded loop-on-all-inodes every freeze.
+	 * if (!(sb->s_flags & MS_DAX_ENABLE))
+	 *	return;
+	 */
+
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		/* TODO: For freezing we can actually do with write-protecting
+		 * the page. But I cannot find a ready made function that does
+		 * that for a given mapping (with all the proper locking).
+		 * How performance sensitive is the whole sb_freeze API?
+		 * For now we can just unmap the whole mapping, and pay extra
+		 * on read faults.
+		 */
+		/* NOTE: Do not unmap private COW-mapped pages; they will not
+		 * modify the FS.
+		 */
+		if (IS_DAX(inode))
+			unmap_mapping_range(inode->i_mapping, 0, 0, 0);
+	}
+}
diff --git a/fs/super.c b/fs/super.c
index 2b7dc90..9ef490c 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1329,6 +1329,9 @@ int freeze_super(struct super_block *sb)
 	/* All writers are done so after syncing there won't be dirty data */
 	sync_filesystem(sb);
 
+	/* Need to take care of DAX mmaped inodes */
+	dax_prepare_freeze(sb);
+
 	/* Now wait for internal filesystem counter */
 	sb->s_writers.frozen = SB_FREEZE_FS;
 	smp_wmb();
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 24af817..3b943d4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2599,6 +2599,7 @@ int dax_truncate_page(struct inode *, loff_t from, get_block_t);
 int dax_fault(struct vm_area_struct *, struct vm_fault *, get_block_t);
 int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *);
 #define dax_mkwrite(vma, vmf, gb)	dax_fault(vma, vmf, gb)
+void dax_prepare_freeze(struct super_block *sb);
 
 #ifdef CONFIG_BLOCK
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
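
A note on the MS_DAX_ENABLE TODO in dax_prepare_freeze(): the sketch
below is a userspace model, not kernel code, of how a superblock-level
flag could let the freeze path skip the walk over every inode. All of
the names in it (model_sb, model_inode, MODEL_MS_DAX_ENABLE,
model_prepare_freeze, model_demo) are hypothetical stand-ins, and
MS_DAX_ENABLE itself is only proposed in the TODO. It assumes the flag
is set whenever the filesystem was mounted with DAX enabled.

```c
/* Userspace model only: none of these types are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, mirroring the MS_DAX_ENABLE idea from the TODO. */
#define MODEL_MS_DAX_ENABLE 0x1UL

struct model_inode {
	bool is_dax;			/* stands in for IS_DAX(inode) */
	bool mapped;			/* true while user mappings exist */
	struct model_inode *next;	/* stands in for i_sb_list */
};

struct model_sb {
	unsigned long s_flags;
	struct model_inode *inodes;	/* stands in for sb->s_inodes */
};

/* Returns the number of inodes whose mappings were dropped. */
size_t model_prepare_freeze(struct model_sb *sb)
{
	size_t dropped = 0;

	assert(sb != NULL);

	/* Early exit: a non-DAX mount never walks the inode list. */
	if (!(sb->s_flags & MODEL_MS_DAX_ENABLE))
		return 0;

	for (struct model_inode *i = sb->inodes; i; i = i->next) {
		if (i->is_dax && i->mapped) {
			/* stands in for unmap_mapping_range() */
			i->mapped = false;
			dropped++;
		}
	}
	return dropped;
}

/* Builds a three-inode superblock (two DAX, one not) and freezes it. */
size_t model_demo(unsigned long s_flags)
{
	struct model_inode c = { .is_dax = true,  .mapped = true, .next = NULL };
	struct model_inode b = { .is_dax = false, .mapped = true, .next = &c };
	struct model_inode a = { .is_dax = true,  .mapped = true, .next = &b };
	struct model_sb sb = { .s_flags = s_flags, .inodes = &a };

	return model_prepare_freeze(&sb);
}
```

With the flag set, model_demo() drops the mappings of the two DAX
inodes and leaves the non-DAX one alone; with the flag clear it returns
before touching the list, which is the saving the TODO is after.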