From patchwork Sat May 16 00:55:58 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jaegeuk Kim
X-Patchwork-Id: 6418641
Return-Path:
X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
 by patchwork2.web.kernel.org (Postfix) with ESMTP id 9B599C0432
 for ; Sat, 16 May 2015 00:56:08 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
 by mail.kernel.org (Postfix) with ESMTP id A388220573
 for ; Sat, 16 May 2015 00:56:07 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.kernel.org (Postfix) with ESMTP id 85C152049E
 for ; Sat, 16 May 2015 00:56:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2992534AbbEPA4F (ORCPT ); Fri, 15 May 2015 20:56:05 -0400
Received: from mail.kernel.org ([198.145.29.136]:36976 "EHLO mail.kernel.org"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S934918AbbEPA4D (ORCPT ); Fri, 15 May 2015 20:56:03 -0400
Received: from mail.kernel.org (localhost [127.0.0.1])
 by mail.kernel.org (Postfix) with ESMTP id 8ADB520490;
 Sat, 16 May 2015 00:56:01 +0000 (UTC)
Received: from localhost (mobile-166-171-250-138.mycingular.net
 [166.171.250.138])
 (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by mail.kernel.org (Postfix) with ESMTPSA id 4994C20573;
 Sat, 16 May 2015 00:56:00 +0000 (UTC)
Date: Fri, 15 May 2015 17:55:58 -0700
From: Jaegeuk Kim
To: Chao Yu
Cc: 'hujianyang', linux-fsdevel@vger.kernel.org,
 linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] Space leak in f2fs
Message-ID: <20150516005436.GA10530@jaegeuk-mac02.mot.com>
References: <5552FA7D.7000704@huawei.com>
 <20150513174417.GA56247@jaegeuk-mac02.mot.com>
 <20150514002417.GC68412@jaegeuk-mac02>
 <5553FD09.9030508@huawei.com>
 <20150514211250.GA76424@jaegeuk-mac02.mot.com>
 <015e01d08ee9$ac8e1060$05aa3120$@samsung.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <015e01d08ee9$ac8e1060$05aa3120$@samsung.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
 T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Hi Chao,

On Fri, May 15, 2015 at 04:31:43PM +0800, Chao Yu wrote:
> Hi Jaegeuk,
>
[snip]
> > +	/* if orphan inode, we don't need to write its data */
> > +	if (is_orphan_inode(sbi, inode->i_ino))
> > +		goto out;
>
> When a user creates a temp file by invoking open with O_TMPFILE flag,
> in ->tmpfile our temp file will be added into orphan list as its
> nlink is zero.
>
> If we skip writing out data for this orphan inode, later, even though
> we add nlink/directory entry for orphan inode by calling linkat,
> our file will contain inconsistent data between in-memory and on-disk.
>
> So how about considering for this case?

Right.
How about the below patch?

>
> BTW, the previous fixing patch looks good to me.

But my new concern here is memory pressure.
If we do not drop the inode when iput is called, we need to wait for another
time slot to reclaim its memory.

Thanks,

---
 fs/f2fs/checkpoint.c | 19 +++++++++++++++++++
 fs/f2fs/data.c       |  8 ++++++++
 fs/f2fs/dir.c        |  1 +
 fs/f2fs/f2fs.h       |  2 ++
 fs/f2fs/super.c      | 14 +++++++++++++-
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 7b7a9d8..74875fb 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -378,6 +378,20 @@ static void __remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
 	spin_unlock(&im->ino_lock);
 }
 
+static bool __exist_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type)
+{
+	struct inode_management *im = &sbi->im[type];
+	struct ino_entry *e;
+	bool exist = false;
+
+	spin_lock(&im->ino_lock);
+	e = radix_tree_lookup(&im->ino_root, ino);
+	if (e)
+		exist = true;
+	spin_unlock(&im->ino_lock);
+	return exist;
+}
+
 void add_dirty_inode(struct f2fs_sb_info *sbi, nid_t ino, int type)
 {
 	/* add new dirty ino entry into list */
@@ -458,6 +472,11 @@ void remove_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
 	__remove_ino_entry(sbi, ino, ORPHAN_INO);
 }
 
+bool is_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
+{
+	return __exist_ino_entry(sbi, ino, ORPHAN_INO);
+}
+
 static void recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino)
 {
 	struct inode *inode = f2fs_iget(sbi->sb, ino);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b0cc2aa..d883c14 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1749,6 +1749,14 @@ write:
 		goto out;
 	}
 
+	/*
+	 * if orphan inode, we don't need to write its data,
+	 * but, tmpfile is not the case.
+	 */
+	if (is_orphan_inode(sbi, inode->i_ino) &&
+			!is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE))
+		goto out;
+
 	if (!wbc->for_reclaim)
 		need_balance_fs = true;
 	else if (has_not_enough_free_secs(sbi, 0))
diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index 3e92376..a2ea1b9 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -648,6 +648,7 @@ int f2fs_do_tmpfile(struct inode *inode, struct inode *dir)
 	update_inode(inode, page);
 	f2fs_put_page(page, 1);
 
+	set_inode_flag(F2FS_I(inode), FI_TMP_INODE);
 	clear_inode_flag(F2FS_I(inode), FI_NEW_INODE);
 fail:
 	up_write(&F2FS_I(inode)->i_sem);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index cdcae06..de21d38 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1337,6 +1337,7 @@ static inline void f2fs_change_bit(unsigned int nr, char *addr)
 /* used for f2fs_inode_info->flags */
 enum {
 	FI_NEW_INODE,		/* indicate newly allocated inode */
+	FI_TMP_INODE,		/* indicate tmpfile */
 	FI_DIRTY_INODE,		/* indicate inode is dirty or not */
 	FI_DIRTY_DIR,		/* indicate directory has dirty pages */
 	FI_INC_LINK,		/* need to increment i_nlink */
@@ -1726,6 +1727,7 @@ int acquire_orphan_inode(struct f2fs_sb_info *);
 void release_orphan_inode(struct f2fs_sb_info *);
 void add_orphan_inode(struct f2fs_sb_info *, nid_t);
 void remove_orphan_inode(struct f2fs_sb_info *, nid_t);
+bool is_orphan_inode(struct f2fs_sb_info *, nid_t);
 void recover_orphan_inodes(struct f2fs_sb_info *);
 int get_valid_checkpoint(struct f2fs_sb_info *);
 void update_dirty_page(struct inode *, struct page *);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7464d08..98af3bf 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -430,9 +430,21 @@ static int f2fs_drop_inode(struct inode *inode)
 	 *  - f2fs_write_data_page
 	 *    - f2fs_gc -> iput -> evict
 	 *       - inode_wait_for_writeback(inode)
+	 * In order to avoid that, f2fs_write_data_page does not write data
+	 * pages for orphan inode except tmpfile.
+	 * Nevertheless, we need to truncate the tmpfile's data to avoid
+	 * needless cleaning.
 	 */
-	if (!inode_unhashed(inode) && inode->i_state & I_SYNC)
+	if (is_inode_flag_set(F2FS_I(inode), FI_TMP_INODE) &&
+					inode->i_state & I_SYNC) {
+		spin_unlock(&inode->i_lock);
+		i_size_write(inode, 0);
+
+		if (F2FS_HAS_BLOCKS(inode))
+			f2fs_truncate(inode);
+		spin_lock(&inode->i_lock);
 		return 0;
+	}
 	return generic_drop_inode(inode);
 }
 
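
For reference, the tmpfile case Chao raised can be exercised from user space roughly as below. This is only a sketch and not part of the patch: the helper name tmpfile_write_then_link() and any paths passed to it are made up for illustration, and it assumes a kernel and filesystem with O_TMPFILE support and a mounted /proc. The file has nlink == 0 while the data is written, so on f2fs it sits on the orphan list at that point; the linkat() afterwards gives it a name, which is why its dirty pages must not be skipped.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for O_TMPFILE */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): create an unnamed file in @dir,
 * write @text into it, then link it into the namespace as @target.
 * Returns 0 on success or a negative errno value on failure.
 */
static int tmpfile_write_then_link(const char *dir, const char *target,
				   const char *text)
{
	char proc_path[64];
	size_t len;
	ssize_t n;
	int fd, err = 0;

	assert(dir && target && text);
	len = strlen(text);

	/* No directory entry yet, so the new inode has nlink == 0. */
	fd = open(dir, O_TMPFILE | O_WRONLY, 0600);
	if (fd < 0)
		return -errno;

	/* Dirty data that writeback has to keep, since the file gets linked. */
	n = write(fd, text, len);
	if (n < 0) {
		err = -errno;
		goto out;
	}
	if ((size_t)n != len) {
		err = -EIO;
		goto out;
	}

	/* Give the file a name, as described in open(2) for O_TMPFILE. */
	snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
	if (linkat(AT_FDCWD, proc_path, AT_FDCWD, target,
		   AT_SYMLINK_FOLLOW) < 0) {
		err = -errno;
		goto out;
	}

	/* Push the data to disk now that the file is reachable by name. */
	if (fsync(fd) < 0)
		err = -errno;
out:
	close(fd);
	return err;
}
```

Calling it with a directory on an f2fs mount and a not-yet-existing target path in the same filesystem, then reading the target back after a remount, should show whether the written data survived.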