From patchwork Wed May 24 19:36:25 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 9746795
From: Mauricio Faria de Oliveira
To: bcrl@kvack.org, viro@zeniv.linux.org.uk
Cc: jmoyer@redhat.com, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting
 against the global limit
Date: Wed, 24 May 2017 16:36:25 -0300
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1495654585-22790-1-git-send-email-mauricfo@linux.vnet.ibm.com>
References: <1495654585-22790-1-git-send-email-mauricfo@linux.vnet.ibm.com>
Message-Id: <1495654585-22790-3-git-send-email-mauricfo@linux.vnet.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List:
 linux-fsdevel@vger.kernel.org

Decouple ctx->max_reqs and ctx->nr_events; they are two sides of the
same coin: the userspace side and the kernelspace side, respectively.

Briefly, ctx->max_reqs represents what is externally visible to
userspace, and ctx->nr_events represents what is internally needed by
the percpu allocation scheme.

With the percpu scheme, the value that userspace passed in is adjusted
based on num_possible_cpus() before it is stored in ctx->max_reqs, and
the adjusted value is what gets counted against aio_max_nr. On systems
with a large num_possible_cpus(), a small nr_events request may
therefore be increased significantly.

As a result, the total nr_events that userspace applications can
request falls short of the actual value of aio_max_nr.

ctx->max_reqs
=============

The ctx->max_reqs value once again aligns with its description:

	 * This is what userspace passed to io_setup(), it's not used for
	 * anything but counting against the global max_reqs quota.

It stores the original value of nr_events that userspace passed to
io_setup() (it is not increased to make room for the requirements of
the percpu allocation scheme). It is used to increment and decrement
the 'aio_nr' value, and to check against 'aio_max_nr'.

So, regardless of how many additional nr_events are internally required
for the percpu allocation scheme (e.g. raise it to at least 4x the
number of possible CPUs, then double it), userspace can get all of the
'aio-max-nr' value that is made available/visible to it.

Another benefit is a consistent value in '/proc/sys/fs/aio-nr': it is
the sum of all values as requested by userspace, and it is once again
less than or equal to '/proc/sys/fs/aio-max-nr' (not up to 2x it).
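The accounting difference described above can be sketched in a small
userspace model. This is only an illustration under stated assumptions,
not kernel code: the names struct aio_model, ring_slots(),
count_contexts_old() and count_contexts_new() are hypothetical, the
sizing rule is taken from the "4x the number of possible CPUs, and
double it" description, and any further rounding done in
aio_setup_ring() is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the global AIO accounting. */
struct aio_model {
	unsigned long aio_nr;     /* models /proc/sys/fs/aio-nr */
	unsigned long aio_max_nr; /* models /proc/sys/fs/aio-max-nr */
};

/* Ring sizing: at least 4 slots per possible CPU, then doubled. */
static unsigned ring_slots(unsigned nr_events, unsigned possible_cpus)
{
	unsigned n = nr_events;

	if (n < possible_cpus * 4)
		n = possible_cpus * 4;
	return n * 2;
}

/* Charge 'charged' events against 'limit'; refuse on excess or wraparound. */
static bool charge(struct aio_model *m, unsigned long charged,
		   unsigned long limit)
{
	if (m->aio_nr + charged > limit || m->aio_nr + charged < m->aio_nr)
		return false;
	m->aio_nr += charged;
	return true;
}

/* Before the patch: the inflated value is charged against 2 * aio_max_nr. */
static unsigned long count_contexts_old(unsigned long aio_max_nr,
					unsigned nr_events, unsigned cpus)
{
	struct aio_model m = { 0, aio_max_nr };
	unsigned long created = 0;

	assert(nr_events > 0);
	while (charge(&m, ring_slots(nr_events, cpus), m.aio_max_nr * 2UL))
		created++;
	return created;
}

/* After the patch: the userspace value is charged against aio_max_nr. */
static unsigned long count_contexts_new(unsigned long aio_max_nr,
					unsigned nr_events)
{
	struct aio_model m = { 0, aio_max_nr };
	unsigned long created = 0;

	assert(nr_events > 0);
	while (charge(&m, nr_events, m.aio_max_nr))
		created++;
	return created;
}
```

With an assumed aio_max_nr of 65536 and 160 possible CPUs, a request
for nr_events = 1 is charged as 1280 in the "old" model, so only 102
such contexts fit under 2 * aio_max_nr; in the "new" model each context
is charged as 1 and all 65536 fit. Other limits that a real system
would hit first (memory, per-process resources) are ignored here.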
ctx->nr_events
==============

The ctx->nr_events value is the actual size of the ringbuffer, i.e. its
number of slots. It may be more than what userspace passed to
io_setup(), depending on the requested value for nr_events and/or the
calculations made in aio_setup_ring(), as required by the percpu
allocation scheme to behave correctly and quickly.

Signed-off-by: Mauricio Faria de Oliveira
---
 fs/aio.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 7c3c01f352c1..4967b0e1ef1a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -706,6 +706,12 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 	int err = -ENOMEM;
 
 	/*
+	 * Store the original value of nr_events from userspace for counting
+	 * against the global limit (aio_max_nr).
+	 */
+	unsigned max_reqs = nr_events;
+
+	/*
 	 * We keep track of the number of available ringbuffer slots, to prevent
 	 * overflow (reqs_available), and we also use percpu counters for this.
 	 *
@@ -723,14 +729,14 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 		return ERR_PTR(-EINVAL);
 	}
 
-	if (!nr_events || (unsigned long)nr_events > (aio_max_nr * 2UL))
+	if (!nr_events || (unsigned long)max_reqs > aio_max_nr)
 		return ERR_PTR(-EAGAIN);
 
 	ctx = kmem_cache_zalloc(kioctx_cachep, GFP_KERNEL);
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
-	ctx->max_reqs = nr_events;
+	ctx->max_reqs = max_reqs;
 
 	spin_lock_init(&ctx->ctx_lock);
 	spin_lock_init(&ctx->completion_lock);
@@ -763,8 +769,8 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 
 	/* limit the number of system wide aios */
 	spin_lock(&aio_nr_lock);
-	if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
-	    aio_nr + nr_events < aio_nr) {
+	if (aio_nr + ctx->max_reqs > aio_max_nr ||
+	    aio_nr + ctx->max_reqs < aio_nr) {
 		spin_unlock(&aio_nr_lock);
 		err = -EAGAIN;
 		goto err_ctx;