From patchwork Thu Nov 16 01:27:07 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10060573 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5A5E760230 for ; Thu, 16 Nov 2017 01:27:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4E2E52A47F for ; Thu, 16 Nov 2017 01:27:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 42B3C2A475; Thu, 16 Nov 2017 01:27:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 333C22A475 for ; Thu, 16 Nov 2017 01:27:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933365AbdKPB1R (ORCPT ); Wed, 15 Nov 2017 20:27:17 -0500 Received: from aserp1040.oracle.com ([141.146.126.69]:40659 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933338AbdKPB1Q (ORCPT ); Wed, 15 Nov 2017 20:27:16 -0500 Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id vAG1RA84019920 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 16 Nov 2017 01:27:10 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id vAG1R9Tp025374 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 16 Nov 2017 01:27:09 GMT Received: from abhmp0006.oracle.com (abhmp0006.oracle.com [141.146.116.12]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id vAG1R85Q008724; Thu, 16 Nov 2017 01:27:08 GMT Received: from localhost (/73.25.142.12) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 15 Nov 2017 17:27:08 -0800 Date: Wed, 15 Nov 2017 17:27:07 -0800 From: "Darrick J. Wong" To: xfs Cc: linux-fsdevel , Dave Chinner , Brian Foster , holger@applied-asynchrony.com Subject: [PATCH] iomap: report collisions between directio and buffered writes to userspace Message-ID: <20171116012707.GB5130@magnolia> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.24 (2015-08-30) X-Source-IP: aserv0022.oracle.com [141.146.126.234] Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong If two programs simultaneously try to write to the same part of a file via direct IO and buffered IO, there's a chance that the post-diowrite pagecache invalidation will fail on the dirty page. When this happens, the dio write succeeded, which means that the page cache is no longer coherent with the disk! Programs are not supposed to mix IO types and this is a clear case of data corruption, so store an EIO which will be reflected to userspace during the next fsync. Get rid of the WARN_ON to assuage the fuzz-tester complaints. Signed-off-by: Darrick J. Wong --- fs/iomap.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/fs/iomap.c b/fs/iomap.c index 5011a96..9f0e8d4 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -711,6 +711,19 @@ struct iomap_dio { }; }; +static void iomap_warn_stale_pagecache(struct inode *inode) +{ + static DEFINE_RATELIMIT_STATE(_rs, + DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); + + errseq_set(&inode->i_mapping->wb_err, -EIO); + if (__ratelimit(&_rs)) { + pr_crit("Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!\n"); + dump_stack_print_info(KERN_CRIT); + } +} + static ssize_t iomap_dio_complete(struct iomap_dio *dio) { struct kiocb *iocb = dio->iocb; @@ -753,7 +766,8 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio) err = invalidate_inode_pages2_range(inode->i_mapping, offset >> PAGE_SHIFT, (offset + dio->size - 1) >> PAGE_SHIFT); - WARN_ON_ONCE(err); + if (err) + iomap_warn_stale_pagecache(inode); } inode_dio_end(file_inode(iocb->ki_filp)); @@ -1012,9 +1026,16 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, if (ret) goto out_free_dio; + /* + * Try to invalidate cache pages for the range we're direct + * writing. If this invalidation fails, tough, the write will + * still work, but racing two incompatible write paths is a + * pretty crazy thing to do, so we don't support it 100%. + */ ret = invalidate_inode_pages2_range(mapping, start >> PAGE_SHIFT, end >> PAGE_SHIFT); - WARN_ON_ONCE(ret); + if (ret) + iomap_warn_stale_pagecache(inode); ret = 0; if (iov_iter_rw(iter) == WRITE && !is_sync_kiocb(iocb) &&