From patchwork Fri Mar 22 16:52:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 10866261 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AA9FE922 for ; Fri, 22 Mar 2019 16:52:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 928B52A960 for ; Fri, 22 Mar 2019 16:52:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 874E72A963; Fri, 22 Mar 2019 16:52:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EE9AE2A960 for ; Fri, 22 Mar 2019 16:52:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728284AbfCVQwn (ORCPT ); Fri, 22 Mar 2019 12:52:43 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53428 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728274AbfCVQwn (ORCPT ); Fri, 22 Mar 2019 12:52:43 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 290863082E60; Fri, 22 Mar 2019 16:52:43 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id B13E75D9D2; Fri, 22 Mar 2019 16:52:42 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Cc: Dave Chinner , Zorro Lang Subject: [PATCH] xfs: serialize unaligned dio writes against all other dio writes Date: Fri, 22 Mar 2019 12:52:42 -0400 Message-Id: <20190322165242.40662-1-bfoster@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Fri, 22 Mar 2019 16:52:43 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP XFS applies more strict serialization constraints to unaligned direct writes to accommodate things like direct I/O layer zeroing, unwritten extent conversion, etc. Unaligned submissions acquire the exclusive iolock and wait for in-flight dio to complete to ensure multiple submissions do not race on the same block and cause data corruption. This generally works in the case of an aligned dio followed by an unaligned dio, but the serialization is lost if I/Os occur in the opposite order. If an unaligned write is submitted first and immediately followed by an overlapping, aligned write, the latter submits without the typical unaligned serialization barriers because there is no indication of an unaligned dio still in-flight. This can lead to unpredictable results. To provide proper unaligned dio serialization, require that such direct writes are always the only dio allowed in-flight at one time for a particular inode. We already acquire the exclusive iolock and drain pending dio before submitting the unaligned dio. Wait once more after the dio submission to hold the iolock across the I/O and prevent further submissions until the unaligned I/O completes. This is heavy handed, but consistent with the current pre-submission serialization for unaligned direct writes. Signed-off-by: Brian Foster Reviewed-by: Allison Henderson --- I was originally going to deal with this problem by hacking in an inode flag to track unaligned dio writes in-flight and use that to block any follow on dio writes until cleared. Dave suggested we could use the iolock to serialize by converting unaligned async dio writes to sync dio writes and just letting the unaligned dio itself always block. That seemed reasonable to me, but I morphed the approach slightly to just use inode_dio_wait() because it seemed a bit cleaner. Thoughts? Zorro, You reproduced this problem originally. It addresses the problem in the test case that reproduced for me. Care to confirm whether this patch fixes the problem for you? Thanks. Brian fs/xfs/xfs_file.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 770cc2edf777..8b2aaed82343 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -529,18 +529,19 @@ xfs_file_dio_aio_write( count = iov_iter_count(from); /* - * If we are doing unaligned IO, wait for all other IO to drain, - * otherwise demote the lock if we had to take the exclusive lock - * for other reasons in xfs_file_aio_write_checks. + * If we are doing unaligned IO, we can't allow any other IO in-flight + * at the same time or we risk data corruption. Wait for all other IO to + * drain, submit and wait for completion before we release the iolock. + * + * If the IO is aligned, demote the iolock if we had to take the + * exclusive lock in xfs_file_aio_write_checks() for other reasons. */ if (unaligned_io) { - /* If we are going to wait for other DIO to finish, bail */ - if (iocb->ki_flags & IOCB_NOWAIT) { - if (atomic_read(&inode->i_dio_count)) - return -EAGAIN; - } else { + /* unaligned dio always waits, bail */ + if (iocb->ki_flags & IOCB_NOWAIT) + return -EAGAIN; + else inode_dio_wait(inode); - } } else if (iolock == XFS_IOLOCK_EXCL) { xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); iolock = XFS_IOLOCK_SHARED; @@ -548,6 +549,8 @@ xfs_file_dio_aio_write( trace_xfs_file_direct_write(ip, count, iocb->ki_pos); ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io); + if (unaligned_io && !is_sync_kiocb(iocb)) + inode_dio_wait(inode); out: xfs_iunlock(ip, iolock);