From patchwork Sat May 7 18:31:50 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eryu Guan
X-Patchwork-Id: 9038401
From: Eryu Guan
To: linux-fsdevel@vger.kernel.org
Cc: linux-ext4@vger.kernel.org, jmoyer@redhat.com, Eryu Guan
Subject: [PATCH v2 2/2] direct-io: fix stale data exposure from concurrent buffered read
Date: Sun, 8 May 2016 02:31:50 +0800
Message-Id: <1462645910-23290-2-git-send-email-guaneryu@gmail.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1462645910-23290-1-git-send-email-guaneryu@gmail.com>
References: <1462645910-23290-1-git-send-email-guaneryu@gmail.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Currently direct writes inside i_size on a DIO_SKIP_HOLES filesystem
are not allowed to allocate blocks (get_more_blocks() sets 'create' to 0
before calling the get_block() callback). If the file is sparse, direct
writes fall back to buffered writes to avoid stale data exposure from
a concurrent buffered read.

But there are two cases that can result in stale data exposure and are
not correctly detected.

1. The detection of "writing inside i_size" is not sufficient: writes
can wrongly be treated as "extending writes". For example, consider a
direct write of 1FSB to a 1FSB sparse file on ext2/3/4, starting from
offset 0. This is a write inside i_size, but 'create' is non-zero,
because 'block_in_file' and 'i_size_read(inode) >> blkbits' are both
zero.

2. Direct writes starting from or beyond i_size (not inside i_size)
could also trigger block allocation and expose stale data. For example,
consider a sparse file with an i_size of 2k, and a write to offset 2k
or 3k into the file, with a filesystem block size of 4k. (Thanks to
Jeff Moyer for pointing this case out.)

The first problem can be demonstrated by running the ltp-aiodio test
ADSP045 many times. When testing on extN filesystems, I see occasional
test failures where a buffered read returns non-zero (stale) data.

ADSP045: dio_sparse -a 4k -w 4k -s 2k -n 1
dio_sparse 0 TINFO : Dirtying free blocks
dio_sparse 0 TINFO : Starting I/O tests
non zero buffer at buf[0] => 0xffffffaa,ffffffaa,ffffffaa,ffffffaa
non-zero read at offset 0
dio_sparse 0 TINFO : Killing childrens(s)
dio_sparse 1 TFAIL : dio_sparse.c:191: 1 children(s) exited abnormally

The second problem can also be reproduced easily with a hacked
dio_sparse program that accepts an option to specify the write offset.

What we should really do is disable block allocation for writes that
could result in filling holes inside i_size.
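
The difference between the old and new checks can be sketched with a
small userspace model. This is only an illustration under stated
assumptions, not the kernel code: the function names are hypothetical,
i_size is assumed to be greater than zero, and 'fs_startblk' is assumed
to be 'block_in_file' shifted right by (i_blkbits - blkbits). For the
unaligned writes in case 2, the model further assumes that 'blkbits'
is 9 (512-byte logical blocks) while 'i_blkbits' is 12.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only (hypothetical helper names, not kernel code):
 * the two predicates that decide whether 'create' is forced to 0.
 */

/* Old check: compares in units of (1 << blkbits) bytes. */
bool old_check_forbids_create(int64_t offset, int64_t i_size,
			      unsigned int blkbits)
{
	int64_t block_in_file = offset >> blkbits;

	return block_in_file < (i_size >> blkbits);
}

/*
 * New check: compares the filesystem block where the write starts with
 * the filesystem block that holds the last byte of the file.
 * Assumes i_size > 0 and blkbits <= i_blkbits.
 */
bool new_check_forbids_create(int64_t offset, int64_t i_size,
			      unsigned int blkbits, unsigned int i_blkbits)
{
	int64_t block_in_file = offset >> blkbits;
	int64_t fs_startblk = block_in_file >> (i_blkbits - blkbits);

	return fs_startblk <= ((i_size - 1) >> i_blkbits);
}
```

With the ADSP045 parameters (2k file, 4k write at offset 0, 4k blocks),
the old predicate compares 0 < 0 and lets allocation through, while the
new one compares 0 <= 0 and forbids it. A write starting at offset 4k
into the same file still passes the new check, so a write that starts
in a block entirely beyond i_size can still allocate.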

Signed-off-by: Eryu Guan
Reviewed-by: Jan Kara
---
v2:
- Fix the case Jeff pointed out as well
- Update commit log

 fs/direct-io.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 9d5aff9..5c13bbf 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -632,8 +632,10 @@ static int get_more_blocks(struct dio *dio, struct dio_submit *sdio,
 		map_bh->b_size = fs_count << i_blkbits;
 
 		/*
-		 * For writes inside i_size on a DIO_SKIP_HOLES filesystem we
-		 * forbid block creations: only overwrites are permitted.
+		 * For writes that could fill holes inside i_size on a
+		 * DIO_SKIP_HOLES filesystem we forbid block creations: only
+		 * overwrites are permitted.
+		 *
 		 * We will return early to the caller once we see an
 		 * unmapped buffer head returned, and the caller will fall
 		 * back to buffered I/O.
@@ -644,7 +646,7 @@ static int get_more_blocks(struct dio *dio, struct dio_submit *sdio,
 		 */
 		create = dio->rw & WRITE;
 		if (dio->flags & DIO_SKIP_HOLES) {
-			if (block_in_file < (i_size_read(inode) >> blkbits))
+			if (fs_startblk <= ((i_size_read(inode) - 1) >> i_blkbits))
 				create = 0;
 		}