From patchwork Thu Feb 27 21:17:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11410831
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Date: Thu, 27 Feb 2020 16:17:47 -0500
Message-Id: <1582838290-17243-600-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
References: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 599/622] lustre: llite: Accept EBUSY for page unaligned read
Cc: Lustre Development List

From: Patrick Farrell

When doing unaligned strided reads, it is possible for the first and
last page of a stride to be read by another thread on the same node,
resulting in -EBUSY. The same thing can happen with sequential reads:
for example, when several MPI ranks split one large file on
page-unaligned boundaries, each rank's sequential read can race with
its neighbours on the shared boundary pages. We should not stop
readahead in these cases.

WC-bug-id: https://jira.whamcloud.com/browse/LU-12518
Lustre-commit: b9c155065d2c ("LU-12518 llite: Accept EBUSY for page unaligned read")
Signed-off-by: Patrick Farrell
Reviewed-on: https://review.whamcloud.com/35457
Reviewed-by: Wang Shilong
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/rw.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index 9509023..1b5260d 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -360,7 +360,8 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria)
 {
 	struct cl_read_ahead ra = { 0 };
 	pgoff_t page_idx;
-	int count = 0;
+	/* busy page count is per stride */
+	int count = 0, busy_page_count = 0;
 	int rc;
 
 	LASSERT(ria);
@@ -416,8 +417,21 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria)
 
 		/* If the page is inside the read-ahead window */
 		rc = ll_read_ahead_page(env, io, queue, page_idx);
-		if (rc < 0)
+		if (rc < 0 && rc != -EBUSY)
 			break;
+		if (rc == -EBUSY) {
+			busy_page_count++;
+			CDEBUG(D_READA,
+			       "skip busy page: %lu\n", page_idx);
+			/* For page unaligned readahead the first and
+			 * last pages of each region can be read by
+			 * another reader on the same node, and so
+			 * may be busy. So only stop for > 2 busy
+			 * pages.
+			 */
+			if (busy_page_count > 2)
+				break;
+		}
 
 		*ra_end = page_idx;
 		/* Only subtract from reserve & count the page if we
@@ -441,6 +455,7 @@ static bool ras_inside_ra_window(pgoff_t idx, struct ra_io_arg *ria)
 			pos += (ria->ria_length - offset);
 			if ((pos >> PAGE_SHIFT) >= page_idx + 1)
 				page_idx = (pos >> PAGE_SHIFT) - 1;
+			busy_page_count = 0;
 
 			CDEBUG(D_READA,
 			       "Stride: jump %llu pages to %lu\n",
 			       ria->ria_length - offset, page_idx);