From patchwork Fri Nov 22 23:53:22 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 11258675
From: Andreas Gruenbacher
To: Linus Torvalds
Cc: Steven Whitehouse, Konstantin Khlebnikov, "Kirill A. Shutemov",
    linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Alexander Viro, Johannes Weiner,
    cluster-devel@redhat.com, Ronnie Sahlberg, Steve French,
    Bob Peterson, Andreas Gruenbacher
Subject: [RFC PATCH 1/3] fs: Add IOCB_CACHED flag for generic_file_read_iter
Date: Sat, 23 Nov 2019 00:53:22 +0100
Message-Id: <20191122235324.17245-2-agruenba@redhat.com>
In-Reply-To: <20191122235324.17245-1-agruenba@redhat.com>
References: <20191122235324.17245-1-agruenba@redhat.com>

Add an IOCB_CACHED flag which indicates to generic_file_read_iter that
it should only look at the page cache, without triggering any filesystem
I/O for the actual request or for readahead.  When filesystem I/O would
be triggered, an error code should be returned instead.

This allows the caller to perform a tentative read out of the page
cache, and to retry the read after taking the necessary steps when the
requested pages are not cached.  When readahead would be triggered, we
return -ECANCELED instead of -EAGAIN; this allows the caller to
distinguish attempted readaheads from attempted reads (with
IOCB_NOWAIT).

Signed-off-by: Andreas Gruenbacher
---
 include/linux/fs.h |  1 +
 mm/filemap.c       | 17 ++++++++++++++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e0d909d35763..4ca5e2885452 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -314,6 +314,7 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+#define IOCB_CACHED		(1 << 8)
 
 struct kiocb {
 	struct file		*ki_filp;
diff --git a/mm/filemap.c b/mm/filemap.c
index 85b7d087eb45..024ff0b5fcb6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2046,7 +2046,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		page = find_get_page(mapping, index);
 		if (!page) {
-			if (iocb->ki_flags & IOCB_NOWAIT)
+			if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_CACHED))
 				goto would_block;
 			page_cache_sync_readahead(mapping,
 					ra, filp,
@@ -2056,12 +2056,16 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 			goto no_cached_page;
 		}
 		if (PageReadahead(page)) {
+			if (iocb->ki_flags & IOCB_CACHED) {
+				error = -ECANCELED;
+				goto out;
+			}
 			page_cache_async_readahead(mapping,
 					ra, filp, page,
 					index, last_index - index);
 		}
 		if (!PageUptodate(page)) {
-			if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_CACHED)) {
 				put_page(page);
 				goto would_block;
 			}
@@ -2266,6 +2270,13 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
  *
  * This is the "read_iter()" routine for all filesystems
  * that can use the page cache directly.
+ *
+ * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN should be
+ * returned if completing the request would require I/O; this does not
+ * prevent readahead.  The IOCB_CACHED flag indicates that -EAGAIN should be
+ * returned as under the IOCB_NOWAIT flag, and that -ECANCELED should be
+ * returned when readahead would be triggered.
+ *
  * Return:
  * * number of bytes copied, even for partial reads
  * * negative error code if nothing was read
@@ -2286,7 +2297,7 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 		loff_t size;
 
 		size = i_size_read(inode);
-		if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_CACHED)) {
 			if (filemap_range_has_page(mapping, iocb->ki_pos,
 						   iocb->ki_pos + count - 1))
 				return -EAGAIN;
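
The tentative-read-then-retry pattern this enables, as a minimal sketch.
This is not part of the patch: example_file_read_iter() and the
example_fs_lock()/example_fs_unlock() helpers are assumed placeholders
for whatever lock a filesystem needs before doing I/O (gfs2 uses its
glock, see patch 3, which also continues a partial cached read under
the lock instead of returning it directly as done here):

static ssize_t example_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	ssize_t ret;

	/* Fast path: serve the read from the page cache only. */
	iocb->ki_flags |= IOCB_CACHED;
	ret = generic_file_read_iter(iocb, to);
	iocb->ki_flags &= ~IOCB_CACHED;

	/*
	 * -EAGAIN: completing the read would require I/O; -ECANCELED:
	 * readahead would be triggered.  Anything else is a final result.
	 */
	if (ret != -EAGAIN && ret != -ECANCELED)
		return ret;
	if (ret == -EAGAIN && (iocb->ki_flags & IOCB_NOWAIT))
		return ret;	/* the caller asked us not to block */

	/* Slow path: take the filesystem lock and repeat the read. */
	ret = example_fs_lock(iocb->ki_filp);		/* assumed helper */
	if (ret)
		return ret;
	ret = generic_file_read_iter(iocb, to);
	example_fs_unlock(iocb->ki_filp);		/* assumed helper */
	return ret;
}
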
From patchwork Fri Nov 22 23:53:23 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 11258679
From: Andreas Gruenbacher
To: Linus Torvalds
Cc: Steven Whitehouse, Konstantin Khlebnikov, "Kirill A. Shutemov",
    linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Alexander Viro, Johannes Weiner,
    cluster-devel@redhat.com, Ronnie Sahlberg, Steve French,
    Bob Peterson, Andreas Gruenbacher
Subject: [RFC PATCH 2/3] fs: Add FAULT_FLAG_CACHED flag for filemap_fault
Date: Sat, 23 Nov 2019 00:53:23 +0100
Message-Id: <20191122235324.17245-3-agruenba@redhat.com>
In-Reply-To: <20191122235324.17245-1-agruenba@redhat.com>
References: <20191122235324.17245-1-agruenba@redhat.com>

Add a FAULT_FLAG_CACHED flag which indicates to filemap_fault that it
should only look at the page cache, without triggering filesystem I/O
for the actual request or for readahead.  When filesystem I/O would be
triggered, VM_FAULT_RETRY should be returned instead.

This allows the caller to tentatively satisfy a minor page fault out of
the page cache, and to retry the operation after taking the necessary
steps when that isn't possible.

Signed-off-by: Andreas Gruenbacher
---
 include/linux/mm.h |  4 +++-
 mm/filemap.c       | 43 ++++++++++++++++++++++++++++++-------------
 2 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a2adf95b3f9c..b3317e4b2607 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -392,6 +392,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
 #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
 #define FAULT_FLAG_INSTRUCTION	0x100	/* The fault was during an instruction fetch */
+#define FAULT_FLAG_CACHED	0x200	/* Only look at the page cache */
 
 #define FAULT_FLAG_TRACE \
 	{ FAULT_FLAG_WRITE,		"WRITE" }, \
@@ -402,7 +403,8 @@ extern pgprot_t protection_map[16];
 	{ FAULT_FLAG_TRIED,		"TRIED" }, \
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
-	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }
+	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
+	{ FAULT_FLAG_CACHED,		"CACHED" }
 
 /*
  * vm_fault is filled by the the pagefault handler and passed to the vma's
diff --git a/mm/filemap.c b/mm/filemap.c
index 024ff0b5fcb6..2297fad3b03a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2383,7 +2383,7 @@ static int lock_page_maybe_drop_mmap(struct vm_fault *vmf, struct page *page,
 	 * the mmap_sem still held. That's how FAULT_FLAG_RETRY_NOWAIT
 	 * is supposed to work. We have way too many special cases..
 	 */
-	if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
+	if (vmf->flags & (FAULT_FLAG_RETRY_NOWAIT | FAULT_FLAG_CACHED))
 		return 0;
 
 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
@@ -2460,26 +2460,28 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 * so we want to possibly extend the readahead further.  We return the file that
 * was pinned if we have to drop the mmap_sem in order to do IO.
 */
-static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
-					    struct page *page)
+static vm_fault_t do_async_mmap_readahead(struct vm_fault *vmf,
+					  struct page *page,
+					  struct file **fpin)
 {
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
-	struct file *fpin = NULL;
 	pgoff_t offset = vmf->pgoff;
 
 	/* If we don't want any read-ahead, don't bother */
 	if (vmf->vma->vm_flags & VM_RAND_READ)
-		return fpin;
+		return 0;
 	if (ra->mmap_miss > 0)
 		ra->mmap_miss--;
 	if (PageReadahead(page)) {
-		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
+		if (vmf->flags & FAULT_FLAG_CACHED)
+			return VM_FAULT_RETRY;
+		*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
 		page_cache_async_readahead(mapping, ra, file,
 					   page, offset, ra->ra_pages);
 	}
-	return fpin;
+	return 0;
 }
 
 /**
@@ -2495,8 +2497,11 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 *
 * vma->vm_mm->mmap_sem must be held on entry.
 *
- * If our return value has VM_FAULT_RETRY set, it's because the mmap_sem
- * may be dropped before doing I/O or by lock_page_maybe_drop_mmap().
+ * This function may drop the mmap_sem before doing I/O or waiting for a page
+ * lock; this is indicated by the VM_FAULT_RETRY flag in our return value.
+ * Setting FAULT_FLAG_CACHED or FAULT_FLAG_RETRY_NOWAIT in vmf->flags will
+ * prevent dropping the mmap_sem; in that case, VM_FAULT_RETRY indicates that
+ * the mmap_sem would have been dropped.
 *
 * If our return value does not have VM_FAULT_RETRY set, the mmap_sem
 * has not been released.
@@ -2518,9 +2523,15 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 	struct page *page;
 	vm_fault_t ret = 0;
 
-	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
-	if (unlikely(offset >= max_off))
-		return VM_FAULT_SIGBUS;
+	/*
+	 * FAULT_FLAG_CACHED indicates that the inode size is only guaranteed
+	 * to be valid when the page we are looking for is in the page cache.
+	 */
+	if (!(vmf->flags & FAULT_FLAG_CACHED)) {
+		max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+		if (unlikely(offset >= max_off))
+			return VM_FAULT_SIGBUS;
+	}
 
 	/*
 	 * Do we have something in the page cache already?
@@ -2531,8 +2542,14 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 		 * We found the page, so try async readahead before
 		 * waiting for the lock.
 		 */
-		fpin = do_async_mmap_readahead(vmf, page);
+		ret = do_async_mmap_readahead(vmf, page, &fpin);
+		if (ret) {
+			put_page(page);
+			return ret;
+		}
 	} else if (!page) {
+		if (vmf->flags & FAULT_FLAG_CACHED)
+			goto out_retry;
 		/* No page in the page cache at all */
 		count_vm_event(PGMAJFAULT);
 		count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
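
The ->fault side of the same pattern, in sketch form; it anticipates the
gfs2 conversion in the next patch.  The example_fault() wrapper and the
example_fs_lock()/example_fs_unlock() helpers are assumed placeholders,
and the VM_FAULT_SIGBUS on lock failure is a simplification (gfs2 maps
the error through block_page_mkwrite_return instead):

static vm_fault_t example_fault(struct vm_fault *vmf)
{
	vm_fault_t ret;

	/* First pass: page cache only; no i_size check, no I/O, no readahead. */
	vmf->flags |= FAULT_FLAG_CACHED;
	ret = filemap_fault(vmf);
	vmf->flags &= ~FAULT_FLAG_CACHED;
	if (!(ret & VM_FAULT_RETRY))
		return ret;		/* satisfied from the page cache */

	/* Second pass: take the lock needed for I/O, then fault for real. */
	if (example_fs_lock(vmf->vma->vm_file))		/* assumed helper */
		return VM_FAULT_SIGBUS;
	ret = filemap_fault(vmf);
	example_fs_unlock(vmf->vma->vm_file);		/* assumed helper */
	return ret;
}
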
From patchwork Fri Nov 22 23:53:24 2019
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 11258683
From: Andreas Gruenbacher
To: Linus Torvalds
Cc: Steven Whitehouse, Konstantin Khlebnikov, "Kirill A. Shutemov",
    linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Alexander Viro, Johannes Weiner,
    cluster-devel@redhat.com, Ronnie Sahlberg, Steve French,
    Bob Peterson, Andreas Gruenbacher
Subject: [RFC PATCH 3/3] gfs2: Rework read and page fault locking
Date: Sat, 23 Nov 2019 00:53:24 +0100
Message-Id: <20191122235324.17245-4-agruenba@redhat.com>
In-Reply-To: <20191122235324.17245-1-agruenba@redhat.com>
References: <20191122235324.17245-1-agruenba@redhat.com>

Move the glock lock taking code from the ->readpage and ->readpages
address space operations to the ->read_iter file and ->fault vm
operations.  To avoid taking the lock even when an operation can be
satisfied out of the page cache, try the operation without taking the
lock first.  When that fails, take the lock and repeat the operation.

Signed-off-by: Andreas Gruenbacher
---
 fs/gfs2/aops.c | 36 ++++++---------------------
 fs/gfs2/file.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 31 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index b9fe975d7625..4aa6c952eb90 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -515,26 +515,10 @@ static int __gfs2_readpage(void *file, struct page *page)
 
 static int gfs2_readpage(struct file *file, struct page *page)
 {
-	struct address_space *mapping = page->mapping;
-	struct gfs2_inode *ip = GFS2_I(mapping->host);
-	struct gfs2_holder gh;
 	int error;
 
-	unlock_page(page);
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	error = gfs2_glock_nq(&gh);
-	if (unlikely(error))
-		goto out;
-	error = AOP_TRUNCATED_PAGE;
-	lock_page(page);
-	if (page->mapping == mapping && !PageUptodate(page))
-		error = __gfs2_readpage(file, page);
-	else
-		unlock_page(page);
-	gfs2_glock_dq(&gh);
-out:
-	gfs2_holder_uninit(&gh);
-	if (error && error != AOP_TRUNCATED_PAGE)
+	error = __gfs2_readpage(file, page);
+	if (error)
 		lock_page(page);
 	return error;
 }
@@ -602,18 +586,12 @@ static int gfs2_readpages(struct file *file, struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
-	struct gfs2_holder gh;
-	int ret;
+	int ret = 0;
 
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	ret = gfs2_glock_nq(&gh);
-	if (unlikely(ret))
-		goto out_uninit;
-	if (!gfs2_is_stuffed(ip))
-		ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map);
-	gfs2_glock_dq(&gh);
-out_uninit:
-	gfs2_holder_uninit(&gh);
+	if (gfs2_is_stuffed(ip))
+		goto out;
+	ret = mpage_readpages(mapping, pages, nr_pages, gfs2_block_map);
+out:
 	if (unlikely(test_bit(SDF_WITHDRAWN, &sdp->sd_flags)))
 		ret = -EIO;
 	return ret;
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 997b326247e2..207d39996353 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -524,8 +524,34 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
 	return block_page_mkwrite_return(ret);
 }
 
+static vm_fault_t gfs2_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder gh;
+	vm_fault_t ret;
+	int err;
+
+	vmf->flags |= FAULT_FLAG_CACHED;
+	ret = filemap_fault(vmf);
+	vmf->flags &= ~FAULT_FLAG_CACHED;
+	if (!(ret & VM_FAULT_RETRY))
+		return ret;
+
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	err = gfs2_glock_nq(&gh);
+	if (err) {
+		ret = block_page_mkwrite_return(err);
+		goto out_uninit;
+	}
+	ret = filemap_fault(vmf);
+	gfs2_glock_dq(&gh);
+out_uninit:
+	gfs2_holder_uninit(&gh);
+	return ret;
+}
+
 static const struct vm_operations_struct gfs2_vm_ops = {
-	.fault = filemap_fault,
+	.fault = gfs2_fault,
 	.map_pages = filemap_map_pages,
 	.page_mkwrite = gfs2_page_mkwrite,
 };
@@ -778,15 +804,51 @@ static ssize_t gfs2_file_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 static ssize_t gfs2_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
+	struct gfs2_inode *ip;
+	struct gfs2_holder gh;
+	size_t written = 0;
 	ssize_t ret;
 
+	gfs2_holder_mark_uninitialized(&gh);
 	if (iocb->ki_flags & IOCB_DIRECT) {
 		ret = gfs2_file_direct_read(iocb, to);
 		if (likely(ret != -ENOTBLK))
 			return ret;
 		iocb->ki_flags &= ~IOCB_DIRECT;
 	}
-	return generic_file_read_iter(iocb, to);
+	iocb->ki_flags |= IOCB_CACHED;
+	ret = generic_file_read_iter(iocb, to);
+	iocb->ki_flags &= ~IOCB_CACHED;
+	if (ret >= 0) {
+		if (!iov_iter_count(to))
+			return ret;
+		written = ret;
+	} else {
+		switch(ret) {
+		case -EAGAIN:
+			if (iocb->ki_flags & IOCB_NOWAIT)
+				return ret;
+			break;
+		case -ECANCELED:
+			break;
+		default:
+			return ret;
+		}
+	}
+	ip = GFS2_I(iocb->ki_filp->f_mapping->host);
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	ret = gfs2_glock_nq(&gh);
+	if (ret)
+		goto out_uninit;
+	ret = generic_file_read_iter(iocb, to);
+	if (ret > 0)
+		written += ret;
+	if (gfs2_holder_initialized(&gh))
+		gfs2_glock_dq(&gh);
+out_uninit:
+	if (gfs2_holder_initialized(&gh))
+		gfs2_holder_uninit(&gh);
+	return written ? written : ret;
 }
 
 /**
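
One subtlety in gfs2_file_read_iter worth spelling out: when only part
of the range is cached, generic_file_read_iter returns the partial count
(not -EAGAIN or -ECANCELED) and advances iocb->ki_pos and the iov_iter
by the amount copied, so the second, locked pass continues exactly where
the first one stopped.  A worked example, assuming a 16 KiB read of
which only the first 8 KiB happen to be in the page cache:

	cache-only pass:  returns 8192; ki_pos advances by 8192; written = 8192
	locked pass:      reads the remaining 8192 under the glock; returns 8192
	final result:     written = 8192 + 8192 = 16384, which is returned

This is why the function accumulates results in `written` and falls back
to `ret` only when nothing was copied at all.
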