From patchwork Tue Oct 6 00:48:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 11819241 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 79F971920 for ; Tue, 6 Oct 2020 21:14:06 +0000 (UTC) Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3BDEF206BE for ; Tue, 6 Oct 2020 21:14:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3BDEF206BE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=canonical.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=ocfs2-devel-bounces@oss.oracle.com Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LBNdc124040; Tue, 6 Oct 2020 21:13:29 GMT Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2120.oracle.com with ESMTP id 33xhxmxfa7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 06 Oct 2020 21:13:29 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LA9FR176212; Tue, 6 Oct 2020 21:13:29 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userp3030.oracle.com with ESMTP id 33y37xkgyg-1 (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO); Tue, 06 Oct 2020 21:13:28 +0000 Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPuH1-00054K-Vp; Tue, 06 Oct 2020 14:13:27 -0700 Received: from userp3020.oracle.com ([156.151.31.79]) by oss.oracle.com with esmtp 
(Exim 4.63) (envelope-from ) id 1kPbA0-0003fg-Mv for ocfs2-devel@oss.oracle.com; Mon, 05 Oct 2020 17:48:56 -0700 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960V74Z058478 for ; Tue, 6 Oct 2020 00:48:56 GMT Received: from userp2040.oracle.com (userp2040.oracle.com [156.151.31.90]) by userp3020.oracle.com with ESMTP id 33yyjeqm4j-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 06 Oct 2020 00:48:56 +0000 Received: from pps.filterd (userp2040.oracle.com [127.0.0.1]) by userp2040.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960WvjK034601 for ; Tue, 6 Oct 2020 00:48:55 GMT Received: from youngberry.canonical.com (youngberry.canonical.com [91.189.89.112]) by userp2040.oracle.com with ESMTP id 33xfwtet45-1 for ; Tue, 06 Oct 2020 00:48:55 +0000 Received: from mail-qt1-f200.google.com ([209.85.160.200]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kPb9x-0004V6-C0 for ocfs2-devel@oss.oracle.com; Tue, 06 Oct 2020 00:48:53 +0000 Received: by mail-qt1-f200.google.com with SMTP id 60so7943203qtf.21 for ; Mon, 05 Oct 2020 17:48:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QDJEdG2z20N75SAmFr4Af1C1v9pHGHwyahIO8UDCZkA=; b=HmG4EgobaU1b4O1+psRfmjJnbI3+tVFYqAsD4KPKI6KwNbpGcv78ueGyRnRMrktUTW ca4oAXmyFc9nD78l9Y3g/1uudPN7Cmnykef7Fkh2TVNqZc44WOdkMXAED1I+yNFp4zrn JYKPeQzs80n+cyUPRmSqyQHiWgDijAJNisr60lTNuTa+Gc2fWXkErpvILQpb3zoTf7dI fdmMtvh3/7vaY0BAstwk6AzpLkR/cXSP4uILc8M2hUaEymyGKf3vRUyVQmUms437zZkx 3/NuYEmedMkZeGESaTR3QvSFR0BEBZjI29BoTZAxLw/1WWsXSFS2Fm5WAoDCWs9uzl0B JjTA== X-Gm-Message-State: AOAM531rIngFdJY5EKaifrYFF7qSfZNYxY2dzNzuZvPVEEyylH0NBPDi 0ru7KeNixhx9Ce0XgqT94lRll/p/ddTK9QyT4UoqP03f8E3YbB6DCHxEdY+SJs8njIcIGR2WcVQ 
VlJUbs98LbaS/kooVOrIRXGYd4zZjM4X1Ot8sGTo= X-Received: by 2002:a05:620a:4151:: with SMTP id k17mr2754713qko.433.1601945332280; Mon, 05 Oct 2020 17:48:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyKojxWB43glkP12uUdQAcKn67juGvI+RCd1iQ8YaQipWlkUBZKM/cuVDdsBaX9b+dNh4ZTCw== X-Received: by 2002:a05:620a:4151:: with SMTP id k17mr2754699qko.433.1601945332038; Mon, 05 Oct 2020 17:48:52 -0700 (PDT) Received: from localhost.localdomain ([201.82.49.101]) by smtp.gmail.com with ESMTPSA id l125sm1355322qke.23.2020.10.05.17.48.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Oct 2020 17:48:51 -0700 (PDT) From: Mauricio Faria de Oliveira To: linux-ext4@vger.kernel.org, ocfs2-devel@oss.oracle.com Date: Mon, 5 Oct 2020 21:48:38 -0300 Message-Id: <20201006004841.600488-2-mfo@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201006004841.600488-1-mfo@canonical.com> References: <20201006004841.600488-1-mfo@canonical.com> MIME-Version: 1.0 X-PDR: PASS X-Source-IP: 91.189.89.112 X-ServerName: youngberry.canonical.com X-Proofpoint-SPF-Result: None X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9765 signatures=668680 X-Proofpoint-Spam-Details: rule=tap_notspam policy=tap score=0 malwarescore=0 lowpriorityscore=0 phishscore=0 impostorscore=0 mlxlogscore=999 priorityscore=85 clxscore=191 bulkscore=0 suspectscore=2 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060001 X-Spam: Clean X-Mailman-Approved-At: Tue, 06 Oct 2020 14:13:27 -0700 Cc: Andreas Dilger , dann frazier , Jan Kara Subject: [Ocfs2-devel] [PATCH v5 1/4] jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers() X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com 
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 malwarescore=0 suspectscore=0 adultscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060140 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 malwarescore=0 bulkscore=0 impostorscore=0 lowpriorityscore=0 suspectscore=0 phishscore=0 mlxlogscore=999 adultscore=0 clxscore=1011 spamscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060140 Export functions that implement the current behavior done for an inode in journal_submit|finish_inode_data_buffers(). No functional change. Signed-off-by: Mauricio Faria de Oliveira Suggested-by: Jan Kara Reviewed-by: Jan Kara Reviewed-by: Andreas Dilger --- fs/jbd2/commit.c | 36 ++++++++++++++++-------------------- fs/jbd2/journal.c | 2 ++ include/linux/jbd2.h | 4 ++++ 3 files changed, 22 insertions(+), 20 deletions(-) diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 6d2da8ad0e6f..f79b86b4241f 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -187,19 +187,17 @@ static int journal_wait_on_commit_record(journal_t *journal, * use writepages() because with delayed allocation we may be doing * block allocation in writepages(). 
*/ -static int journal_submit_inode_data_buffers(struct address_space *mapping, - loff_t dirty_start, loff_t dirty_end) +int jbd2_journal_submit_inode_data_buffers(struct jbd2_inode *jinode) { - int ret; + struct address_space *mapping = jinode->i_vfs_inode->i_mapping; struct writeback_control wbc = { .sync_mode = WB_SYNC_ALL, .nr_to_write = mapping->nrpages * 2, - .range_start = dirty_start, - .range_end = dirty_end, + .range_start = jinode->i_dirty_start, + .range_end = jinode->i_dirty_end, }; - ret = generic_writepages(mapping, &wbc); - return ret; + return generic_writepages(mapping, &wbc); } /* @@ -215,16 +213,11 @@ static int journal_submit_data_buffers(journal_t *journal, { struct jbd2_inode *jinode; int err, ret = 0; - struct address_space *mapping; spin_lock(&journal->j_list_lock); list_for_each_entry(jinode, &commit_transaction->t_inode_list, i_list) { - loff_t dirty_start = jinode->i_dirty_start; - loff_t dirty_end = jinode->i_dirty_end; - if (!(jinode->i_flags & JI_WRITE_DATA)) continue; - mapping = jinode->i_vfs_inode->i_mapping; jinode->i_flags |= JI_COMMIT_RUNNING; spin_unlock(&journal->j_list_lock); /* @@ -234,8 +227,7 @@ static int journal_submit_data_buffers(journal_t *journal, * only allocated blocks here. */ trace_jbd2_submit_inode_data(jinode->i_vfs_inode); - err = journal_submit_inode_data_buffers(mapping, dirty_start, - dirty_end); + err = jbd2_journal_submit_inode_data_buffers(jinode); if (!ret) ret = err; spin_lock(&journal->j_list_lock); @@ -248,6 +240,15 @@ static int journal_submit_data_buffers(journal_t *journal, return ret; } +int jbd2_journal_finish_inode_data_buffers(struct jbd2_inode *jinode) +{ + struct address_space *mapping = jinode->i_vfs_inode->i_mapping; + + return filemap_fdatawait_range_keep_errors(mapping, + jinode->i_dirty_start, + jinode->i_dirty_end); +} + /* * Wait for data submitted for writeout, refile inodes to proper * transaction if needed. 
@@ -262,16 +263,11 @@ static int journal_finish_inode_data_buffers(journal_t *journal, /* For locking, see the comment in journal_submit_data_buffers() */ spin_lock(&journal->j_list_lock); list_for_each_entry(jinode, &commit_transaction->t_inode_list, i_list) { - loff_t dirty_start = jinode->i_dirty_start; - loff_t dirty_end = jinode->i_dirty_end; - if (!(jinode->i_flags & JI_WAIT_DATA)) continue; jinode->i_flags |= JI_COMMIT_RUNNING; spin_unlock(&journal->j_list_lock); - err = filemap_fdatawait_range_keep_errors( - jinode->i_vfs_inode->i_mapping, dirty_start, - dirty_end); + err = jbd2_journal_finish_inode_data_buffers(jinode); if (!ret) ret = err; spin_lock(&journal->j_list_lock); diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 17fdc482f554..c0600405e7a2 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -91,6 +91,8 @@ EXPORT_SYMBOL(jbd2_journal_try_to_free_buffers); EXPORT_SYMBOL(jbd2_journal_force_commit); EXPORT_SYMBOL(jbd2_journal_inode_ranged_write); EXPORT_SYMBOL(jbd2_journal_inode_ranged_wait); +EXPORT_SYMBOL(jbd2_journal_submit_inode_data_buffers); +EXPORT_SYMBOL(jbd2_journal_finish_inode_data_buffers); EXPORT_SYMBOL(jbd2_journal_init_jbd_inode); EXPORT_SYMBOL(jbd2_journal_release_jbd_inode); EXPORT_SYMBOL(jbd2_journal_begin_ordered_truncate); diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 08f904943ab2..2865a5475888 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1421,6 +1421,10 @@ extern int jbd2_journal_inode_ranged_write(handle_t *handle, extern int jbd2_journal_inode_ranged_wait(handle_t *handle, struct jbd2_inode *inode, loff_t start_byte, loff_t length); +extern int jbd2_journal_submit_inode_data_buffers( + struct jbd2_inode *jinode); +extern int jbd2_journal_finish_inode_data_buffers( + struct jbd2_inode *jinode); extern int jbd2_journal_begin_ordered_truncate(journal_t *journal, struct jbd2_inode *inode, loff_t new_size); extern void jbd2_journal_init_jbd_inode(struct jbd2_inode *jinode, struct 
inode *inode); From patchwork Tue Oct 6 00:48:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 11819247 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9FE9A13B2 for ; Tue, 6 Oct 2020 21:15:45 +0000 (UTC) Received: from aserp2130.oracle.com (aserp2130.oracle.com [141.146.126.79]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5DCB2206BE for ; Tue, 6 Oct 2020 21:15:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5DCB2206BE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=canonical.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=ocfs2-devel-bounces@oss.oracle.com Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LE9le157373; Tue, 6 Oct 2020 21:15:30 GMT Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by aserp2130.oracle.com with ESMTP id 33xetaxq99-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 06 Oct 2020 21:15:29 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LA9Hq176213; Tue, 6 Oct 2020 21:13:29 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userp3030.oracle.com with ESMTP id 33y37xkgyj-1 (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO); Tue, 06 Oct 2020 21:13:29 +0000 Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPuH2-00054p-21; Tue, 06 Oct 2020 14:13:28 -0700 Received: from aserp3030.oracle.com ([141.146.126.71]) by 
oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPbA4-0003fz-7l for ocfs2-devel@oss.oracle.com; Mon, 05 Oct 2020 17:49:00 -0700 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960VCW7190372 for ; Tue, 6 Oct 2020 00:49:00 GMT Received: from userp2040.oracle.com (userp2040.oracle.com [156.151.31.90]) by aserp3030.oracle.com with ESMTP id 33y2vm9u7t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 06 Oct 2020 00:48:59 +0000 Received: from pps.filterd (userp2040.oracle.com [127.0.0.1]) by userp2040.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960WuLq034557 for ; Tue, 6 Oct 2020 00:48:59 GMT Received: from youngberry.canonical.com (youngberry.canonical.com [91.189.89.112]) by userp2040.oracle.com with ESMTP id 33xfwtet4u-1 for ; Tue, 06 Oct 2020 00:48:58 +0000 Received: from mail-qk1-f197.google.com ([209.85.222.197]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kPbA0-0004Vd-AE for ocfs2-devel@oss.oracle.com; Tue, 06 Oct 2020 00:48:56 +0000 Received: by mail-qk1-f197.google.com with SMTP id s9so8059530qks.21 for ; Mon, 05 Oct 2020 17:48:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5wW6u9KhIdzkZL6+0fxNb+/kiqeWxLKg1rVDMYJmums=; b=AEJT1kt+Cx4lyHeIKjnbk86v+9AiCRVEs5aqiaLC4M78KBijbvTjfecsiXr8clpe4Q fmeL6O2iljVHegAwgeFH7+azPJkoz1dVphzhfDUzx1LFH9CMyOx/xuqcLYgP7MH02V8z B85Jv/5pUkj9QSZKd/5IRwJKGpUw27KAeVqBwThwltYCG+PBYanSuYkD6mpjx9Zj+CFJ vnUOiUeYiDOZduLUd4f35vHL+Nnd/02NWjq67SaSJ94sFu83sAHeGRw8p1wliRlLfJSq HHAkVtkP429srE0f1+eb0jz58LXwU3fvUGDUElZHqhQUDpgkJ8GIhqLowBcBMOMxAzRq xZ9A== X-Gm-Message-State: AOAM533nJpKCS9EMz0KTGDYh47pTrCgSgb5vJNnK70yy7Nm/pAXjqRRx 
IYBOpUDOfbgVCKewVMh3RzpdO5hWNWGCqK82xUeTT5Nzj2kRsmzEt6vvPqE90HdchbAF0mKVe4u CvwKLzLCS8sob3NekArKTxUCecfjs2yp93RAbERc= X-Received: by 2002:ac8:3ac4:: with SMTP id x62mr2729799qte.279.1601945335337; Mon, 05 Oct 2020 17:48:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwdMucG6vaINfiXNjEeZJAfO7IqGRJS5/76sQsYvjr0lPJUqN2NbG71SIOSUfr0cDWO8hqllw== X-Received: by 2002:ac8:3ac4:: with SMTP id x62mr2729781qte.279.1601945335088; Mon, 05 Oct 2020 17:48:55 -0700 (PDT) Received: from localhost.localdomain ([201.82.49.101]) by smtp.gmail.com with ESMTPSA id l125sm1355322qke.23.2020.10.05.17.48.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Oct 2020 17:48:54 -0700 (PDT) From: Mauricio Faria de Oliveira To: linux-ext4@vger.kernel.org, ocfs2-devel@oss.oracle.com Date: Mon, 5 Oct 2020 21:48:39 -0300 Message-Id: <20201006004841.600488-3-mfo@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201006004841.600488-1-mfo@canonical.com> References: <20201006004841.600488-1-mfo@canonical.com> MIME-Version: 1.0 X-PDR: PASS X-Source-IP: 91.189.89.112 X-ServerName: youngberry.canonical.com X-Proofpoint-SPF-Result: None X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9765 signatures=668680 X-Proofpoint-Spam-Details: rule=tap_notspam policy=tap score=0 malwarescore=0 lowpriorityscore=0 phishscore=0 impostorscore=0 mlxlogscore=833 priorityscore=85 clxscore=192 bulkscore=0 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060001 X-Spam: Clean X-Mailman-Approved-At: Tue, 06 Oct 2020 14:13:27 -0700 Cc: Andreas Dilger , dann frazier , Jan Kara Subject: [Ocfs2-devel] [PATCH v5 2/4] jbd2, ext4, ocfs2: introduce/use journal callbacks j_submit|finish_inode_data_buffers() X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: 
ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 malwarescore=0 suspectscore=0 adultscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060140 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 spamscore=0 mlxscore=0 clxscore=1015 priorityscore=1501 adultscore=0 mlxlogscore=999 phishscore=0 impostorscore=0 malwarescore=0 suspectscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060141

Introduce journal callbacks to allow different behaviors for an inode in journal_submit|finish_inode_data_buffers().

The existing users of the current behavior (ext4, ocfs2) are adapted to use the previously exported functions that implement the current behavior.

Users are callers of jbd2_journal_inode_ranged_write|wait(), which add the inode to the transaction's inode list with the JI_WRITE|WAIT_DATA flags. In-tree, these are only ext4 and ocfs2.

Both CONFIG_EXT4_FS and CONFIG_OCFS2_FS select CONFIG_JBD2, which builds fs/jbd2/commit.c and journal.c that define and export the functions, so we can call them directly in ext4/ocfs2.
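
As an illustration only (not part of the patch), the callback scheme described above can be sketched in userspace C. All demo_* names are hypothetical stand-ins for journal_t, struct jbd2_inode and the exported helpers; the sketch assumes nothing about jbd2 beyond what this message states: the callback is optional, it is called once per inode, and the first error is the one reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct jbd2_inode. */
struct demo_inode {
	int result;	/* value the default callback returns for this inode */
};

/* Hypothetical stand-in for journal_t, reduced to one callback. */
struct demo_journal {
	/* Optional per-inode callback; NULL means "skip this step". */
	int (*submit_inode_data_buffers)(struct demo_inode *);
};

/* Plays the role of the exported default implementation. */
int demo_default_submit(struct demo_inode *inode)
{
	return inode->result;
}

/*
 * Walk all inodes and call the journal's callback, if one is set.
 * Later inodes are still processed after a failure, but only the
 * first error is kept, as in the "if (!ret) ret = err;" pattern.
 */
int demo_submit_all(struct demo_journal *journal,
		    struct demo_inode *inodes, size_t count)
{
	int err, ret = 0;
	size_t i;

	assert(journal != NULL);

	for (i = 0; i < count; i++) {
		if (journal->submit_inode_data_buffers) {
			err = journal->submit_inode_data_buffers(&inodes[i]);
			if (!ret)
				ret = err;
		}
	}
	return ret;
}
```

A filesystem that wants the current behavior assigns the default function to the callback when it sets up the journal, as ext4 and ocfs2 do in this patch; one that leaves the pointer NULL gets no per-inode call at all.
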
Signed-off-by: Mauricio Faria de Oliveira Suggested-by: Jan Kara Reviewed-by: Jan Kara Reviewed-by: Andreas Dilger --- fs/ext4/super.c | 4 ++++ fs/jbd2/commit.c | 30 ++++++++++++++++++------------ fs/ocfs2/journal.c | 4 ++++ include/linux/jbd2.h | 25 ++++++++++++++++++++++++- 4 files changed, 50 insertions(+), 13 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index ea425b49b345..a14c1ed39aa3 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4646,6 +4646,10 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) set_task_ioprio(sbi->s_journal->j_task, journal_ioprio); sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; + sbi->s_journal->j_submit_inode_data_buffers = + jbd2_journal_submit_inode_data_buffers; + sbi->s_journal->j_finish_inode_data_buffers = + jbd2_journal_finish_inode_data_buffers; no_journal: if (!test_opt(sb, NO_MBCACHE)) { diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index f79b86b4241f..6252b4c50666 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -197,6 +197,12 @@ int jbd2_journal_submit_inode_data_buffers(struct jbd2_inode *jinode) .range_end = jinode->i_dirty_end, }; + /* + * submit the inode data buffers. We use writepage + * instead of writepages. Because writepages can do + * block allocation with delalloc. We need to write + * only allocated blocks here. + */ return generic_writepages(mapping, &wbc); } @@ -220,16 +226,13 @@ static int journal_submit_data_buffers(journal_t *journal, continue; jinode->i_flags |= JI_COMMIT_RUNNING; spin_unlock(&journal->j_list_lock); - /* - * submit the inode data buffers. We use writepage - * instead of writepages. Because writepages can do - * block allocation with delalloc. We need to write - * only allocated blocks here. - */ + /* submit the inode data buffers. 
*/ trace_jbd2_submit_inode_data(jinode->i_vfs_inode); - err = jbd2_journal_submit_inode_data_buffers(jinode); - if (!ret) - ret = err; + if (journal->j_submit_inode_data_buffers) { + err = journal->j_submit_inode_data_buffers(jinode); + if (!ret) + ret = err; + } spin_lock(&journal->j_list_lock); J_ASSERT(jinode->i_transaction == commit_transaction); jinode->i_flags &= ~JI_COMMIT_RUNNING; @@ -267,9 +270,12 @@ static int journal_finish_inode_data_buffers(journal_t *journal, continue; jinode->i_flags |= JI_COMMIT_RUNNING; spin_unlock(&journal->j_list_lock); - err = jbd2_journal_finish_inode_data_buffers(jinode); - if (!ret) - ret = err; + /* wait for the inode data buffers writeout. */ + if (journal->j_finish_inode_data_buffers) { + err = journal->j_finish_inode_data_buffers(jinode); + if (!ret) + ret = err; + } spin_lock(&journal->j_list_lock); jinode->i_flags &= ~JI_COMMIT_RUNNING; smp_mb(); diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index b425f0b01dce..b9a9d69dde7e 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -883,6 +883,10 @@ int ocfs2_journal_init(struct ocfs2_journal *journal, int *dirty) OCFS2_JOURNAL_DIRTY_FL); journal->j_journal = j_journal; + journal->j_journal->j_submit_inode_data_buffers = + jbd2_journal_submit_inode_data_buffers; + journal->j_journal->j_finish_inode_data_buffers = + jbd2_journal_finish_inode_data_buffers; journal->j_inode = inode; journal->j_bh = bh; diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 2865a5475888..4aaa408c0ca7 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -629,7 +629,9 @@ struct transaction_s struct journal_head *t_shadow_list; /* - * List of inodes whose data we've modified in data=ordered mode. + * List of inodes associated with the transaction; e.g., ext4 uses + * this to track inodes in data=ordered and data=journal mode that + * need special handling on transaction commit; also used by ocfs2. 
* [j_list_lock] */ struct list_head t_inode_list; @@ -1111,6 +1113,27 @@ struct journal_s void (*j_commit_callback)(journal_t *, transaction_t *); + /** + * @j_submit_inode_data_buffers: + * + * This function is called for all inodes associated with the + * committing transaction marked with JI_WRITE_DATA flag + * before we start to write out the transaction to the journal. + */ + int (*j_submit_inode_data_buffers) + (struct jbd2_inode *); + + /** + * @j_finish_inode_data_buffers: + * + * This function is called for all inodes associated with the + * committing transaction marked with JI_WAIT_DATA flag + * after we have written the transaction to the journal + * but before we write out the commit block. + */ + int (*j_finish_inode_data_buffers) + (struct jbd2_inode *); + /* * Journal statistics */ From patchwork Tue Oct 6 00:48:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 11819249 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 43CE46CB for ; Tue, 6 Oct 2020 21:18:55 +0000 (UTC) Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EB004206BE for ; Tue, 6 Oct 2020 21:18:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EB004206BE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=canonical.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=ocfs2-devel-bounces@oss.oracle.com Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LFS9x141921; Tue, 6 Oct 2020 21:18:38 GMT Received: from userp3030.oracle.com (userp3030.oracle.com 
[156.151.31.80]) by userp2120.oracle.com with ESMTP id 33xhxmxfy3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 06 Oct 2020 21:18:38 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LG9ap193711; Tue, 6 Oct 2020 21:16:38 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userp3030.oracle.com with ESMTP id 33y37xkmnc-1 (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO); Tue, 06 Oct 2020 21:16:38 +0000 Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPuH2-00055K-4O; Tue, 06 Oct 2020 14:13:28 -0700 Received: from aserp3030.oracle.com ([141.146.126.71]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPbA7-0003gD-Ni for ocfs2-devel@oss.oracle.com; Mon, 05 Oct 2020 17:49:03 -0700 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960VFxv190910 for ; Tue, 6 Oct 2020 00:49:03 GMT Received: from userp2030.oracle.com (userp2030.oracle.com [156.151.31.89]) by aserp3030.oracle.com with ESMTP id 33y2vm9ua1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 06 Oct 2020 00:49:03 +0000 Received: from pps.filterd (userp2030.oracle.com [127.0.0.1]) by userp2030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960YD1Q027109 for ; Tue, 6 Oct 2020 00:49:02 GMT Received: from youngberry.canonical.com (youngberry.canonical.com [91.189.89.112]) by userp2030.oracle.com with ESMTP id 33xfbgw3fc-1 for ; Tue, 06 Oct 2020 00:49:02 +0000 Received: from mail-qv1-f71.google.com ([209.85.219.71]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kPbA3-0004WJ-PR for ocfs2-devel@oss.oracle.com; Tue, 06 Oct 2020 00:48:59 +0000 Received: by mail-qv1-f71.google.com with SMTP id t7so7165939qvz.5 for ; 
Mon, 05 Oct 2020 17:48:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=M4L0gxyNWJ44GCTGybDq1XuqV+pDl5wyu3pRKrP7PvM=; b=Qrx8W0rfgN7zfH0mCT0Gvnm1Un78p1qoPf6ds6YBxzZqf2aM9ifLQpBHwD/C/tkjxw txwXuHoLczYH0pJusxdUgpLTMn6GXJjLtp5QynIQlkYUWVBfhEEWgNlwuTKG4X4rGApF t3Cyx6MAii3rdoUvs55uxLM6CR45uDQ1PNizKorxe4TFbv0HlJ8m9OGWNhYCgzG8zddC cR4ovrEqazIOwtdRIEGvtg5SOGLeujT5+dPIHd4cX6GY12oHEYjjD8JtLOSSP6nImbTm OFmlvfMvsw2LalMdB7NSXfJLURXywDG+ueyIlPhmbb1mNjIWTQvcj+lNCFVAjTVtfoS1 ldHw== X-Gm-Message-State: AOAM533BxaYH9ayBz9q9cSHT48a4CLyGrQfFzDRi/fqicJ4AqhjJaQcF zhwBQ5IczOkygnf4v1aAsL3dHzp3tdh83sS7FlRpkrAWlNwuyFwIRd/oWs6bQqsdR9zz7gd4R0R wvLTbZuS6IzHQnh1qcLPZJNWeoFO0tgvfF8xOkVY= X-Received: by 2002:aed:2786:: with SMTP id a6mr2675405qtd.92.1601945338558; Mon, 05 Oct 2020 17:48:58 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyDhAj1WgfHdPqHMpVrxVlAVWpeyTVJh0BtqWvI5sQK6pUvWSC9VjTkcgDY7J9EiHW+mNYLIA== X-Received: by 2002:aed:2786:: with SMTP id a6mr2675388qtd.92.1601945338263; Mon, 05 Oct 2020 17:48:58 -0700 (PDT) Received: from localhost.localdomain ([201.82.49.101]) by smtp.gmail.com with ESMTPSA id l125sm1355322qke.23.2020.10.05.17.48.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Oct 2020 17:48:57 -0700 (PDT) From: Mauricio Faria de Oliveira To: linux-ext4@vger.kernel.org, ocfs2-devel@oss.oracle.com Date: Mon, 5 Oct 2020 21:48:40 -0300 Message-Id: <20201006004841.600488-4-mfo@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201006004841.600488-1-mfo@canonical.com> References: <20201006004841.600488-1-mfo@canonical.com> MIME-Version: 1.0 X-PDR: PASS X-Source-IP: 91.189.89.112 X-ServerName: youngberry.canonical.com X-Proofpoint-SPF-Result: None X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9765 signatures=668680 X-Proofpoint-Spam-Details: rule=tap_notspam policy=tap 
score=0 spamscore=0 mlxscore=0 clxscore=255 impostorscore=0 lowpriorityscore=0 priorityscore=100 adultscore=0 bulkscore=0 malwarescore=0 suspectscore=0 phishscore=0 mlxlogscore=809 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060001 X-Spam: Clean X-Mailman-Approved-At: Tue, 06 Oct 2020 14:13:27 -0700 Cc: Andreas Dilger , dann frazier , Jan Kara Subject: [Ocfs2-devel] [PATCH v5 3/4] ext4: data=journal: fixes for ext4_page_mkwrite() X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 malwarescore=0 suspectscore=0 adultscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060141 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 malwarescore=0 bulkscore=0 impostorscore=0 lowpriorityscore=0 suspectscore=0 phishscore=0 mlxlogscore=999 adultscore=0 clxscore=1015 spamscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060141

These are two fixes for data journalling required by the next patch, discovered while testing it.

First, the optimization to return early if all buffers are mapped is not appropriate for the next patch: the inode _must_ be added to the transaction's list in data=journal mode (so as to write-protect pages on commit), thus we cannot return early there.

Second, once that optimization to reduce transactions was disabled for data=journal mode, more transactions happened, and occasionally hit this warning message: 'JBD2: Spotted dirty metadata buffer'.

The reason is that block_page_mkwrite() will set_buffer_dirty() before do_journal_get_write_access(), which is there to prevent it. This issue was masked by the optimization.

So, on data=journal use __block_write_begin() instead. This also requires page locking and len recalculation (see block_page_mkwrite() for implementation details).

Finally, as Jan noted, there is little sharing between data=journal and other modes in ext4_page_mkwrite(). However, a prototype of ext4_journalled_page_mkwrite() showed there would still be tens of duplicated lines, which didn't seem worth it.

Thus this patch ends up with an ugly goto to skip all non-data journalling code (to avoid long indentations, but that can be changed) in the beginning, and just a conditional in the transaction section.

We do skip one part that is common to data journalling, the page truncated check, but we do it again after ext4_journal_start() when we re-acquire the page lock (so as not to acquire the page lock twice needlessly for data journalling).

Signed-off-by: Mauricio Faria de Oliveira Suggested-by: Jan Kara Reviewed-by: Jan Kara Reviewed-by: Andreas Dilger --- fs/ext4/inode.c | 51 ++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 44 insertions(+), 7 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index bf596467c234..ac153e340a6f 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5977,9 +5977,17 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) if (err) goto out_ret; + /* + * On data journalling we skip straight to the transaction handle: + * there's no delalloc; page truncated will be checked later; the + * early return w/ all buffers mapped (calculates size/len) can't + * be used; and there's no dioread_nolock, so only ext4_get_block. 
+ */ + if (ext4_should_journal_data(inode)) + goto retry_alloc; + /* Delalloc case is easy... */ if (test_opt(inode->i_sb, DELALLOC) && - !ext4_should_journal_data(inode) && !ext4_nonda_switch(inode->i_sb)) { do { err = block_page_mkwrite(vma, vmf, @@ -6005,6 +6013,9 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) /* * Return if we have all the buffers mapped. This avoids the need to do * journal_start/journal_stop which can block and take a long time + * + * This cannot be done for data journalling, as we have to add the + * inode to the transaction's list to writeprotect pages on commit. */ if (page_has_buffers(page)) { if (!ext4_walk_page_buffers(NULL, page_buffers(page), @@ -6029,16 +6040,42 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) ret = VM_FAULT_SIGBUS; goto out; } - err = block_page_mkwrite(vma, vmf, get_block); - if (!err && ext4_should_journal_data(inode)) { - if (ext4_walk_page_buffers(handle, page_buffers(page), 0, - PAGE_SIZE, NULL, do_journal_get_write_access)) { + /* + * Data journalling can't use block_page_mkwrite() because it + * will set_buffer_dirty() before do_journal_get_write_access() + * thus might hit warning messages for dirty metadata buffers. + */ + if (!ext4_should_journal_data(inode)) { + err = block_page_mkwrite(vma, vmf, get_block); + } else { + lock_page(page); + size = i_size_read(inode); + /* Page got truncated from under us? 
*/ + if (page->mapping != mapping || page_offset(page) > size) { unlock_page(page); - ret = VM_FAULT_SIGBUS; + ret = VM_FAULT_NOPAGE; ext4_journal_stop(handle); goto out; } - ext4_set_inode_state(inode, EXT4_STATE_JDATA); + + if (page->index == size >> PAGE_SHIFT) + len = size & ~PAGE_MASK; + else + len = PAGE_SIZE; + + err = __block_write_begin(page, 0, len, ext4_get_block); + if (!err) { + if (ext4_walk_page_buffers(handle, page_buffers(page), + 0, len, NULL, do_journal_get_write_access)) { + unlock_page(page); + ret = VM_FAULT_SIGBUS; + ext4_journal_stop(handle); + goto out; + } + ext4_set_inode_state(inode, EXT4_STATE_JDATA); + } else { + unlock_page(page); + } } ext4_journal_stop(handle); if (err == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) From patchwork Tue Oct 6 00:48:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 11819243 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E8D301752 for ; Tue, 6 Oct 2020 21:14:25 +0000 (UTC) Received: from aserp2130.oracle.com (aserp2130.oracle.com [141.146.126.79]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7EEA0206BE for ; Tue, 6 Oct 2020 21:14:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7EEA0206BE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=canonical.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=ocfs2-devel-bounces@oss.oracle.com Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LBRrP141110; Tue, 6 Oct 2020 21:13:30 GMT Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by 
aserp2130.oracle.com with ESMTP id 33xetaxq1m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 06 Oct 2020 21:13:30 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 096LA91g176145; Tue, 6 Oct 2020 21:13:29 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by userp3030.oracle.com with ESMTP id 33y37xkgyq-1 (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO); Tue, 06 Oct 2020 21:13:29 +0000 Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPuH2-00055q-84; Tue, 06 Oct 2020 14:13:28 -0700 Received: from aserp3030.oracle.com ([141.146.126.71]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1kPbAA-0003gV-DD for ocfs2-devel@oss.oracle.com; Mon, 05 Oct 2020 17:49:06 -0700 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960VD1S190377 for ; Tue, 6 Oct 2020 00:49:06 GMT Received: from userp2040.oracle.com (userp2040.oracle.com [156.151.31.90]) by aserp3030.oracle.com with ESMTP id 33y2vm9uar-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 06 Oct 2020 00:49:06 +0000 Received: from pps.filterd (userp2040.oracle.com [127.0.0.1]) by userp2040.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 0960Wvf2034631 for ; Tue, 6 Oct 2020 00:49:05 GMT Received: from youngberry.canonical.com (youngberry.canonical.com [91.189.89.112]) by userp2040.oracle.com with ESMTP id 33xfwtet6d-1 for ; Tue, 06 Oct 2020 00:49:05 +0000 Received: from mail-qk1-f199.google.com ([209.85.222.199]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1kPbA6-0004XL-Us for ocfs2-devel@oss.oracle.com; Tue, 06 Oct 2020 00:49:03 +0000 Received: by mail-qk1-f199.google.com with SMTP id y17so8164297qky.0 for ; Mon, 05 Oct 
2020 17:49:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=UC5Ocn3Tw5Ct8smHnaeE3PO2yoiTUfAo3AuOvhlc9Ng=; b=GsSW/DHa/v50qdeW4aMIK3TtqalbBLegrSa0lkxonzIp0zZhDy/7zCzk7D1CIrcYJh QbieGc1098XimMVmzwRAy8RgL2n5jGGDhHG3qBVLl6KQWGrJ2SudsGbZK1QFQWVrrhTN Ehx2XeRqdTJcJqd8uXzkUoFkv/6GW2N650sTKbreALgAXuKuCmPrqaK5hjRrXHgi6Z5c HdIyBwXtkd+kpD5PylRX370x1z8HdfQrQcB7pnA/PJi7KVrQclIWK7c1sF+DpcrtP7R3 BrnjjKVocjAZ4S8jWzJURl8IJYhll05YmAIl69wyDHSq3Ds4wXTI14QTqOtB1ESkpuVF 2UGw== X-Gm-Message-State: AOAM5336dzqSEZT9RsOgKwLqHxRIXNQ+kX+pvfo5AGAJ1kreYXi1MQDw VojG8X5SSLm0yv/s5wMk0sNjZeEmf5Bco/YCO1gFOGLfPzgsdHUaBWedDD+nu2yLQ9g4bd+JUs+ ntM9EGs6HD5lv3/GfyqSxw2PULK2XLkIe3gDtNag= X-Received: by 2002:a05:620a:2e7:: with SMTP id a7mr2772974qko.48.1601945341841; Mon, 05 Oct 2020 17:49:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJytiNpfCNTo4NWRlfBdxl012ldpFf+gxvi/Vf25rkVA+kZFpWUShcB2cCZyOmiIutEdc4YJug== X-Received: by 2002:a05:620a:2e7:: with SMTP id a7mr2772961qko.48.1601945341552; Mon, 05 Oct 2020 17:49:01 -0700 (PDT) Received: from localhost.localdomain ([201.82.49.101]) by smtp.gmail.com with ESMTPSA id l125sm1355322qke.23.2020.10.05.17.48.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Oct 2020 17:49:00 -0700 (PDT) From: Mauricio Faria de Oliveira To: linux-ext4@vger.kernel.org, ocfs2-devel@oss.oracle.com Date: Mon, 5 Oct 2020 21:48:41 -0300 Message-Id: <20201006004841.600488-5-mfo@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201006004841.600488-1-mfo@canonical.com> References: <20201006004841.600488-1-mfo@canonical.com> MIME-Version: 1.0 X-PDR: PASS X-Source-IP: 91.189.89.112 X-ServerName: youngberry.canonical.com X-Proofpoint-SPF-Result: None X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9765 signatures=668680 X-Proofpoint-Spam-Details: rule=tap_notspam policy=tap 
score=0 malwarescore=0 lowpriorityscore=0 phishscore=0 impostorscore=0 mlxlogscore=999 priorityscore=85 clxscore=194 bulkscore=0 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060001 X-Spam: Clean X-Mailman-Approved-At: Tue, 06 Oct 2020 14:13:27 -0700 Cc: Andreas Dilger , dann frazier , Jan Kara Subject: [Ocfs2-devel] [PATCH v5 4/4] ext4: data=journal: write-protect pages on j_submit_inode_data_buffers() X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 malwarescore=0 suspectscore=0 adultscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060140 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9766 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 spamscore=0 mlxscore=0 clxscore=1015 priorityscore=1501 adultscore=0 mlxlogscore=999 phishscore=0 impostorscore=0 malwarescore=0 suspectscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2010060140 This implements journal callbacks j_submit|finish_inode_data_buffers() with different behavior for data=journal: to write-protect pages under commit, preventing changes to buffers writeably mapped to userspace. If a buffer's content changes between commit's checksum calculation and write-out to disk, it can cause journal recovery/mount failures upon a kernel crash or power loss. 
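To illustrate the failure mode described above, here is a minimal userspace sketch, not kernel code. It assumes a toy multiplicative checksum (toy_checksum(), hypothetical, not the checksum JBD2 actually uses) and a hypothetical 16-byte "block". The helper commit_block_is_consistent() is likewise an illustrative name. It only models the ordering problem: the checksum is taken first, the buffer may change through a writable mapping, and then the changed contents are written out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy checksum; NOT the checksum JBD2 uses. */
static uint32_t toy_checksum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = sum * 31 + buf[i];
	return sum;
}

/*
 * Models "checksum at commit, then write-out". Returns 1 if the block
 * that reached the "disk" still matches the checksum recorded earlier,
 * 0 otherwise. If modify_between is nonzero, the buffer is changed after
 * the checksum is taken, as a store through a writable userspace mapping
 * could do when the page is not write-protected.
 */
static int commit_block_is_consistent(int modify_between)
{
	unsigned char buf[16] = "journalled data";
	unsigned char disk[16];
	uint32_t recorded;

	/* Commit computes the checksum over the buffer contents. */
	recorded = toy_checksum(buf, sizeof(buf));

	/* Racing write that write-protection is meant to prevent. */
	if (modify_between)
		buf[0] ^= 0xff;

	/* Write-out copies whatever the buffer holds now. */
	memcpy(disk, buf, sizeof(buf));

	/* Recovery recomputes the checksum over what is on "disk". */
	return toy_checksum(disk, sizeof(disk)) == recorded;
}
```

With these assumptions, commit_block_is_consistent(0) models a write-protected page and returns 1, while commit_block_is_consistent(1) models the race and returns 0, the situation in which recovery would report an invalid checksum.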
[ 27.334874] EXT4-fs: Warning: mounting with data=journal disables delayed allocation, dioread_nolock, and O_DIRECT support! [ 27.339492] JBD2: Invalid checksum recovering data block 8705 in log [ 27.342716] JBD2: recovery failed [ 27.343316] EXT4-fs (loop0): error loading journal mount: /ext4: can't read superblock on /dev/loop0. In j_submit_inode_data_buffers() we write-protect the inode's pages with write_cache_pages() and redirty w/ writepage callback if needed. In j_finish_inode_data_buffers() there is nothing to do. And in order to use the callbacks, inodes are added to the inode list in transaction in __ext4_journalled_writepage() and ext4_page_mkwrite(). In ext4_page_mkwrite() we must make sure that the buffers are attached to the transaction as jbddirty with write_end_fn(), as already done in __ext4_journalled_writepage(). Signed-off-by: Mauricio Faria de Oliveira Reported-by: Dann Frazier Reported-by: kernel test robot # wbc.nr_to_write Suggested-by: Jan Kara Reviewed-by: Jan Kara --- fs/ext4/inode.c | 25 +++++++++----- fs/ext4/super.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 101 insertions(+), 11 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index ac153e340a6f..af5de62c1214 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1910,6 +1910,9 @@ static int __ext4_journalled_writepage(struct page *page, err = ext4_walk_page_buffers(handle, page_bufs, 0, len, NULL, write_end_fn); } + if (ret == 0) + ret = err; + err = ext4_jbd2_inode_add_write(handle, inode, 0, len); if (ret == 0) ret = err; EXT4_I(inode)->i_datasync_tid = handle->h_transaction->t_tid; @@ -6052,10 +6055,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) size = i_size_read(inode); /* Page got truncated from under us?
*/ if (page->mapping != mapping || page_offset(page) > size) { - unlock_page(page); ret = VM_FAULT_NOPAGE; - ext4_journal_stop(handle); - goto out; + goto out_error; } if (page->index == size >> PAGE_SHIFT) @@ -6065,13 +6066,15 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) err = __block_write_begin(page, 0, len, ext4_get_block); if (!err) { + ret = VM_FAULT_SIGBUS; if (ext4_walk_page_buffers(handle, page_buffers(page), - 0, len, NULL, do_journal_get_write_access)) { - unlock_page(page); - ret = VM_FAULT_SIGBUS; - ext4_journal_stop(handle); - goto out; - } + 0, len, NULL, do_journal_get_write_access)) + goto out_error; + if (ext4_walk_page_buffers(handle, page_buffers(page), + 0, len, NULL, write_end_fn)) + goto out_error; + if (ext4_jbd2_inode_add_write(handle, inode, 0, len)) + goto out_error; ext4_set_inode_state(inode, EXT4_STATE_JDATA); } else { unlock_page(page); @@ -6086,6 +6089,10 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) up_read(&EXT4_I(inode)->i_mmap_sem); sb_end_pagefault(inode->i_sb); return ret; +out_error: + unlock_page(page); + ext4_journal_stop(handle); + goto out; } vm_fault_t ext4_filemap_fault(struct vm_fault *vmf) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index a14c1ed39aa3..a2fc62a6d3b7 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -472,6 +472,89 @@ static void ext4_journal_commit_callback(journal_t *journal, transaction_t *txn) spin_unlock(&sbi->s_md_lock); } +/* + * This writepage callback for write_cache_pages() + * takes care of a few cases after page cleaning. + * + * write_cache_pages() already checks for dirty pages + * and calls clear_page_dirty_for_io(), which we want, + * to write protect the pages. + * + * However, we may have to redirty a page (see below.) 
+ */ +static int ext4_journalled_writepage_callback(struct page *page, + struct writeback_control *wbc, + void *data) +{ + transaction_t *transaction = (transaction_t *) data; + struct buffer_head *bh, *head; + struct journal_head *jh; + + bh = head = page_buffers(page); + do { + /* + * We have to redirty a page in these cases: + * 1) If buffer is dirty, it means the page was dirty because it + * contains a buffer that needs checkpointing. So the dirty bit + * needs to be preserved so that checkpointing writes the buffer + * properly. + * 2) If buffer is not part of the committing transaction + * (we may have just accidentally come across this buffer because + * inode range tracking is not exact) or if the currently running + * transaction already contains this buffer as well, dirty bit + * needs to be preserved so that the buffer gets writeprotected + * properly on running transaction's commit. + */ + jh = bh2jh(bh); + if (buffer_dirty(bh) || + (jh && (jh->b_transaction != transaction || + jh->b_next_transaction))) { + redirty_page_for_writepage(wbc, page); + goto out; + } + } while ((bh = bh->b_this_page) != head); + +out: + return AOP_WRITEPAGE_ACTIVATE; +} + +static int ext4_journalled_submit_inode_data_buffers(struct jbd2_inode *jinode) +{ + struct address_space *mapping = jinode->i_vfs_inode->i_mapping; + struct writeback_control wbc = { + .sync_mode = WB_SYNC_ALL, + .nr_to_write = LONG_MAX, + .range_start = jinode->i_dirty_start, + .range_end = jinode->i_dirty_end, + }; + + return write_cache_pages(mapping, &wbc, + ext4_journalled_writepage_callback, + jinode->i_transaction); +} + +static int ext4_journal_submit_inode_data_buffers(struct jbd2_inode *jinode) +{ + int ret; + + if (ext4_should_journal_data(jinode->i_vfs_inode)) + ret = ext4_journalled_submit_inode_data_buffers(jinode); + else + ret = jbd2_journal_submit_inode_data_buffers(jinode); + + return ret; +} + +static int ext4_journal_finish_inode_data_buffers(struct jbd2_inode *jinode) +{ + int ret = 
0; + + if (!ext4_should_journal_data(jinode->i_vfs_inode)) + ret = jbd2_journal_finish_inode_data_buffers(jinode); + + return ret; +} + static bool system_going_down(void) { return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF @@ -4647,9 +4730,9 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; sbi->s_journal->j_submit_inode_data_buffers = - jbd2_journal_submit_inode_data_buffers; + ext4_journal_submit_inode_data_buffers; sbi->s_journal->j_finish_inode_data_buffers = - jbd2_journal_finish_inode_data_buffers; + ext4_journal_finish_inode_data_buffers; no_journal: if (!test_opt(sb, NO_MBCACHE)) {