From patchwork Fri Jan 19 16:02:06 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13523979
From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: bvanassche@acm.org, Jens Axboe
Subject: [PATCH 1/4] block/mq-deadline: pass in queue directly to dd_insert_request()
Date: Fri, 19 Jan 2024 09:02:06 -0700
Message-ID: <20240119160338.1191281-2-axboe@kernel.dk>
In-Reply-To: <20240119160338.1191281-1-axboe@kernel.dk>
References: <20240119160338.1191281-1-axboe@kernel.dk>

The hardware queue isn't relevant; mq-deadline only operates on the
request queue itself. Pass in the queue directly rather than the
hardware queue, as that more clearly shows what is being operated on.

Signed-off-by: Jens Axboe
Reviewed-by: Bart Van Assche
---
 block/mq-deadline.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..9b7563e9d638 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -792,10 +792,9 @@ static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
 /*
  * add rq to rbtree and fifo
  */
-static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+static void dd_insert_request(struct request_queue *q, struct request *rq,
 			      blk_insert_t flags, struct list_head *free)
 {
-	struct request_queue *q = hctx->queue;
 	struct deadline_data *dd = q->elevator->elevator_data;
 	const enum dd_data_dir data_dir = rq_data_dir(rq);
 	u16 ioprio = req_get_ioprio(rq);
@@ -875,7 +874,7 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 		rq = list_first_entry(list, struct request, queuelist);
 		list_del_init(&rq->queuelist);
 
-		dd_insert_request(hctx, rq, flags, &free);
+		dd_insert_request(q, rq, flags, &free);
 	}
 	spin_unlock(&dd->lock);
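The refactor is the general pattern of narrowing a function's parameter
to what it actually uses: callers that hold the wrapper derive the inner
object once, at the boundary. A standalone sketch of that pattern (toy
types, not the block-layer ones):

#include <stdio.h>

struct queue { int id; };
struct hwctx { struct queue *queue; };	/* wrapper that carries the queue */

/* Before: takes the wrapper, but only ever touches hctx->queue. */
static void insert_old(struct hwctx *hctx, int rq)
{
	printf("queue %d: insert %d\n", hctx->queue->id, rq);
}

/* After: takes exactly what it operates on. */
static void insert_new(struct queue *q, int rq)
{
	printf("queue %d: insert %d\n", q->id, rq);
}

int main(void)
{
	struct queue q = { .id = 1 };
	struct hwctx hctx = { .queue = &q };

	insert_old(&hctx, 42);
	insert_new(hctx.queue, 43);	/* derive the queue once, in the caller */
	return 0;
}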
From patchwork Fri Jan 19 16:02:07 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13523980

From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: bvanassche@acm.org, Jens Axboe
Subject: [PATCH 2/4] block/mq-deadline: serialize request dispatching
Date: Fri, 19 Jan 2024 09:02:07 -0700
Message-ID: <20240119160338.1191281-3-axboe@kernel.dk>
In-Reply-To: <20240119160338.1191281-1-axboe@kernel.dk>
References: <20240119160338.1191281-1-axboe@kernel.dk>
If we're entering request dispatch but someone else is already
dispatching, then just skip this dispatch. We know IO is in flight, and
completion of that IO will trigger another dispatch event. This will
potentially cause a slightly lower queue depth for contended cases, but
those are slowed down anyway and this should not cause an issue.

By itself, this patch doesn't help a whole lot, as the reduction in
dispatch lock contention is simply eaten up by the same dd->lock now
seeing increased insertion contention. But it's required work to be able
to reduce the lock contention in general.

Signed-off-by: Jens Axboe
---
 block/mq-deadline.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 9b7563e9d638..b579ce282176 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -79,10 +79,20 @@ struct dd_per_prio {
 	struct io_stats_per_prio stats;
 };
 
+enum {
+	DD_DISPATCHING	= 0,
+};
+
 struct deadline_data {
 	/*
 	 * run time data
 	 */
+	struct {
+		spinlock_t lock;
+		spinlock_t zone_lock;
+	} ____cacheline_aligned_in_smp;
+
+	unsigned long run_state;
 
 	struct dd_per_prio per_prio[DD_PRIO_COUNT];
 
@@ -100,9 +110,6 @@ struct deadline_data {
 	int front_merges;
 	u32 async_depth;
 	int prio_aging_expire;
-
-	spinlock_t lock;
-	spinlock_t zone_lock;
 };
 
 /* Maps an I/O priority class to a deadline scheduler priority. */
@@ -600,6 +607,18 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	struct request *rq;
 	enum dd_prio prio;
 
+	/*
+	 * If someone else is already dispatching, skip this one. This will
+	 * defer the next dispatch event to when something completes, and could
+	 * potentially lower the queue depth for contended cases.
+	 *
+	 * See the logic in blk_mq_do_dispatch_sched(), which loops and
+	 * retries if nothing is dispatched.
+	 */
+	if (test_bit(DD_DISPATCHING, &dd->run_state) ||
+	    test_and_set_bit(DD_DISPATCHING, &dd->run_state))
+		return NULL;
+
 	spin_lock(&dd->lock);
 	rq = dd_dispatch_prio_aged_requests(dd, now);
 	if (rq)
@@ -616,6 +635,7 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	}
 
 unlock:
+	clear_bit(DD_DISPATCHING, &dd->run_state);
 	spin_unlock(&dd->lock);
 
 	return rq;
@@ -706,6 +726,9 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
 
 	eq->elevator_data = dd;
 
+	spin_lock_init(&dd->lock);
+	spin_lock_init(&dd->zone_lock);
+
 	for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
 		struct dd_per_prio *per_prio = &dd->per_prio[prio];
 
@@ -722,8 +745,6 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
 	dd->last_dir = DD_WRITE;
 	dd->fifo_batch = fifo_batch;
 	dd->prio_aging_expire = prio_aging_expire;
-	spin_lock_init(&dd->lock);
-	spin_lock_init(&dd->zone_lock);
 
 	/* We dispatch from request queue wide instead of hw queue */
 	blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
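The guard added above is effectively a try-lock built from an atomic
flag: a plain read filters out the already-busy case before any atomic
read-modify-write is attempted, so contended CPUs don't keep dirtying
the cacheline. A minimal userspace sketch of the same idea, using C11
atomics in place of the kernel's test_bit()/test_and_set_bit()
(illustrative only, not the mq-deadline code):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_ulong run_state;	/* bit 0 plays the role of DD_DISPATCHING */

/* Returns true if we became the dispatcher, false if someone already is. */
static bool try_enter_dispatch(void)
{
	/* Cheap read first: if the bit is already set, skip the RMW. */
	if (atomic_load_explicit(&run_state, memory_order_relaxed) & 1UL)
		return false;
	/* fetch_or returns the old value; if bit 0 was clear, we own it. */
	return !(atomic_fetch_or(&run_state, 1UL) & 1UL);
}

static void exit_dispatch(void)
{
	atomic_fetch_and(&run_state, ~1UL);
}

int main(void)
{
	if (try_enter_dispatch()) {
		puts("dispatching");	/* the dispatch loop would run here */
		exit_dispatch();
	}
	return 0;
}

A failed try_enter_dispatch() maps to returning NULL from
dd_dispatch_request(): the caller relies on a later IO completion to
trigger another dispatch rather than spinning for the flag.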
From patchwork Fri Jan 19 16:02:08 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13523981

From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: bvanassche@acm.org, Jens Axboe
Subject: [PATCH 3/4] block/mq-deadline: fallback to per-cpu insertion buckets under contention
Date: Fri, 19 Jan 2024 09:02:08 -0700
Message-ID: <20240119160338.1191281-4-axboe@kernel.dk>
In-Reply-To: <20240119160338.1191281-1-axboe@kernel.dk>
References: <20240119160338.1191281-1-axboe@kernel.dk>

If we attempt to insert a list of requests, but someone else is already
running an insertion, then fall back to queueing that list internally
and let the existing inserter finish the operation. The current inserter
will either see and flush this list, or, if it finishes before we're
done with our bucket insert, we'll flush it and insert ourselves.

This reduces contention on dd->lock, which protects any request
insertion or dispatch, by providing a backup insertion point that gets
flushed either immediately or by an existing inserter. The alternative
is to keep spinning on dd->lock, and it's very easy to get into a
situation where multiple processes trying to do IO all sit and spin on
this lock. Combined with the previous dispatch optimization, this
drastically reduces contention for a sample case of 32 threads doing IO
to devices.
The test case looks as follows:

fio --bs=512 --group_reporting=1 --gtod_reduce=1 --invalidate=1 \
	--ioengine=io_uring --norandommap --runtime=60 --rw=randread \
	--thread --time_based=1 --buffered=0 --fixedbufs=1 --numjobs=32 \
	--iodepth=4 --iodepth_batch_submit=4 --iodepth_batch_complete=4 \
	--name=scaletest --filename=/dev/$DEV

Before:

Device          IOPS    sys     contention      diff
====================================================
null_blk         879K   89%     93.6%
nvme0n1          901K   86%     94.5%

and after this and the previous dispatch patch:

Device          IOPS    sys     contention      diff
====================================================
null_blk        2311K   10.3%   21.1%           +257%
nvme0n1         2610K   11.0%   24.6%           +289%

Signed-off-by: Jens Axboe
---
 block/mq-deadline.c | 130 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 121 insertions(+), 9 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index b579ce282176..cc3155d50e0d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -81,8 +81,18 @@ struct dd_per_prio {
 
 enum {
 	DD_DISPATCHING	= 0,
+	DD_INSERTING	= 1,
+	DD_BUCKETS	= 2,
 };
 
+#define DD_CPU_BUCKETS		32
+#define DD_CPU_BUCKETS_MASK	(DD_CPU_BUCKETS - 1)
+
+struct dd_bucket_list {
+	struct list_head list;
+	spinlock_t lock;
+} ____cacheline_aligned_in_smp;
+
 struct deadline_data {
 	/*
 	 * run time data
@@ -94,6 +104,8 @@ struct deadline_data {
 
 	unsigned long run_state;
 
+	struct dd_bucket_list bucket_lists[DD_CPU_BUCKETS];
+
 	struct dd_per_prio per_prio[DD_PRIO_COUNT];
 
 	/* Data direction of latest dispatched request. */
@@ -714,7 +726,7 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
 	struct deadline_data *dd;
 	struct elevator_queue *eq;
 	enum dd_prio prio;
-	int ret = -ENOMEM;
+	int i, ret = -ENOMEM;
 
 	eq = elevator_alloc(q, e);
 	if (!eq)
@@ -729,6 +741,11 @@ static int dd_init_sched(struct request_queue *q, struct elevator_type *e)
 	spin_lock_init(&dd->lock);
 	spin_lock_init(&dd->zone_lock);
 
+	for (i = 0; i < DD_CPU_BUCKETS; i++) {
+		INIT_LIST_HEAD(&dd->bucket_lists[i].list);
+		spin_lock_init(&dd->bucket_lists[i].lock);
+	}
+
 	for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
 		struct dd_per_prio *per_prio = &dd->per_prio[prio];
 
@@ -878,6 +895,94 @@ static void dd_insert_request(struct request_queue *q, struct request *rq,
 	}
 }
 
+static void dd_dispatch_from_buckets(struct deadline_data *dd,
+				     struct list_head *list)
+{
+	int i;
+
+	if (!test_bit(DD_BUCKETS, &dd->run_state) ||
+	    !test_and_clear_bit(DD_BUCKETS, &dd->run_state))
+		return;
+
+	for (i = 0; i < DD_CPU_BUCKETS; i++) {
+		struct dd_bucket_list *bucket = &dd->bucket_lists[i];
+
+		if (list_empty_careful(&bucket->list))
+			continue;
+		spin_lock(&bucket->lock);
+		list_splice_init(&bucket->list, list);
+		spin_unlock(&bucket->lock);
+	}
+}
+
+/*
+ * If we can grab the dd->lock, then just return and do the insertion as per
+ * usual. If not, add to one of our internal buckets, and afterwards recheck
+ * if we should retry.
+ */
+static bool dd_insert_to_bucket(struct deadline_data *dd,
+				struct list_head *list)
+	__acquires(&dd->lock)
+{
+	struct dd_bucket_list *bucket;
+
+	/*
+	 * If we can grab the lock, proceed as per usual. If not, and insert
+	 * isn't running, force grab the lock and proceed as per usual.
+	 */
+	if (spin_trylock(&dd->lock))
+		return false;
+	if (!test_bit(DD_INSERTING, &dd->run_state)) {
+		spin_lock(&dd->lock);
+		return false;
+	}
+
+	if (!test_bit(DD_BUCKETS, &dd->run_state))
+		set_bit(DD_BUCKETS, &dd->run_state);
+
+	bucket = &dd->bucket_lists[get_cpu() & DD_CPU_BUCKETS_MASK];
+	spin_lock(&bucket->lock);
+	list_splice_init(list, &bucket->list);
+	spin_unlock(&bucket->lock);
+	put_cpu();
+
+	/*
+	 * Insertion still running, we are done.
+	 */
+	if (test_bit(DD_INSERTING, &dd->run_state))
+		return true;
+
+	/*
+	 * We may be too late, play it safe and grab the lock. This will
+	 * flush the above bucket insert as well and insert it.
+	 */
+	spin_lock(&dd->lock);
+	return false;
+}
+
+static void __dd_insert_requests(struct request_queue *q,
+				 struct deadline_data *dd,
+				 struct list_head *list, blk_insert_t flags,
+				 struct list_head *free)
+{
+	set_bit(DD_INSERTING, &dd->run_state);
+	do {
+		while (!list_empty(list)) {
+			struct request *rq;
+
+			rq = list_first_entry(list, struct request, queuelist);
+			list_del_init(&rq->queuelist);
+			dd_insert_request(q, rq, flags, free);
+		}
+
+		dd_dispatch_from_buckets(dd, list);
+		if (list_empty(list))
+			break;
+	} while (1);
+
+	clear_bit(DD_INSERTING, &dd->run_state);
+}
+
 /*
  * Called from blk_mq_insert_request() or blk_mq_dispatch_plug_list().
  */
@@ -889,16 +994,23 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	struct deadline_data *dd = q->elevator->elevator_data;
 	LIST_HEAD(free);
 
-	spin_lock(&dd->lock);
-	while (!list_empty(list)) {
-		struct request *rq;
+	/*
+	 * If dispatch is busy and we ended up adding to our internal bucket,
+	 * then we're done for now.
+	 */
+	if (dd_insert_to_bucket(dd, list))
+		return;
 
-		rq = list_first_entry(list, struct request, queuelist);
-		list_del_init(&rq->queuelist);
-		dd_insert_request(q, rq, flags, &free);
-	}
-	spin_unlock(&dd->lock);
+	do {
+		__dd_insert_requests(q, dd, list, flags, &free);
+		/*
+		 * If buckets is set after inserting was cleared, be safe and do
+		 * another loop as we could be racing with bucket insertion.
+		 */
+	} while (test_bit(DD_BUCKETS, &dd->run_state));
+
+	spin_unlock(&dd->lock);
 
 	blk_mq_free_requests(&free);
 }
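For illustration, here is a deliberately simplified userspace sketch of
the bucket fallback, with pthread spinlocks standing in for the kernel
primitives. It shows only the mechanics (trylock, park the work in a
hashed side bucket, lock owner drains the buckets) and omits the
DD_INSERTING/DD_BUCKETS handshake the patch uses to close the race
between parking work and the inserter finishing; all names are
illustrative:

#include <pthread.h>
#include <stdio.h>

#define NBUCKETS	4		/* stand-in for DD_CPU_BUCKETS */
#define BUCKET_MASK	(NBUCKETS - 1)

struct bucket {
	pthread_spinlock_t lock;
	int items[64];			/* toy payload instead of a request list */
	int count;
};

static pthread_spinlock_t main_lock;	/* stand-in for dd->lock */
static struct bucket buckets[NBUCKETS];

/* Contended path: park the work in a side bucket instead of spinning. */
static void park_in_bucket(int cpu, int item)
{
	struct bucket *b = &buckets[cpu & BUCKET_MASK];

	pthread_spin_lock(&b->lock);	/* short, per-bucket critical section */
	b->items[b->count++] = item;
	pthread_spin_unlock(&b->lock);
}

/* The main_lock owner drains every bucket before releasing the lock. */
static void drain_buckets(void)
{
	for (int i = 0; i < NBUCKETS; i++) {
		struct bucket *b = &buckets[i];

		pthread_spin_lock(&b->lock);
		while (b->count > 0)
			printf("flushed item %d\n", b->items[--b->count]);
		pthread_spin_unlock(&b->lock);
	}
}

static void insert(int cpu, int item)
{
	if (pthread_spin_trylock(&main_lock) != 0) {
		/* Someone else holds main_lock: hand the work off. */
		park_in_bucket(cpu, item);
		return;
	}
	printf("inserted item %d directly\n", item);
	drain_buckets();		/* flush anything parked meanwhile */
	pthread_spin_unlock(&main_lock);
}

int main(void)
{
	pthread_spin_init(&main_lock, PTHREAD_PROCESS_PRIVATE);
	for (int i = 0; i < NBUCKETS; i++)
		pthread_spin_init(&buckets[i].lock, PTHREAD_PROCESS_PRIVATE);

	insert(0, 1);			/* uncontended: direct insert */
	pthread_spin_lock(&main_lock);	/* simulate a busy inserter */
	insert(1, 2);			/* contended: parked in a bucket */
	drain_buckets();		/* the busy inserter flushes it */
	pthread_spin_unlock(&main_lock);
	return 0;
}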
From patchwork Fri Jan 19 16:02:09 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13523982

From: Jens Axboe
To: linux-block@vger.kernel.org
Cc: bvanassche@acm.org, Jens Axboe
Subject: [PATCH 4/4] block/mq-deadline: skip expensive merge lookups if contended
Date: Fri, 19 Jan 2024 09:02:09 -0700
Message-ID: <20240119160338.1191281-5-axboe@kernel.dk>
In-Reply-To: <20240119160338.1191281-1-axboe@kernel.dk>
References: <20240119160338.1191281-1-axboe@kernel.dk>

We do several stages of merging in the block layer, and the one most
likely to succeed is also the cheapest one: merging directly in the
per-task plug when the IO is submitted. Merges beyond that are a lot
less likely, but IO schedulers may still maintain internal data
structures to facilitate merge lookups outside of the plug.

Make mq-deadline skip the expensive merge lookup if its queue lock is
already contended. A merge here is unlikely to begin with, so skipping
the attempt in the (also unlikely) event that the queue lock is
contended should not be a problem.

Perf diff shows the difference between a random read/write workload
with 4 threads doing IO, with expensive merges turned on and off:

    25.00%   +61.94%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath

where we almost quadruple the lock contention by attempting these
expensive merges.

Signed-off-by: Jens Axboe
---
 block/mq-deadline.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index cc3155d50e0d..2de0832b1e5d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -817,7 +817,19 @@ static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
 	struct request *free = NULL;
 	bool ret;
 
-	spin_lock(&dd->lock);
+	/*
+	 * bio merging is called for every bio queued, and it's very easy
+	 * to run into contention because of that. If we fail getting
+	 * the dd lock, just skip this merge attempt. For related IO, the
+	 * plug will be the successful merging point. If we get here, we
+	 * already failed doing the obvious merge. Chances of actually
+	 * getting a merge off this path are a lot slimmer, so skipping an
+	 * occasional lookup that will most likely not succeed anyway should
+	 * not be a problem.
+	 */
+	if (!spin_trylock(&dd->lock))
+		return false;
+
 	ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
 	spin_unlock(&dd->lock);
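The same opportunistic-skip pattern in a standalone C sketch, with a
pthread spinlock standing in for dd->lock (illustrative names, not the
block-layer API): a best-effort optimization is attempted only when its
lock is free, so contention degrades it into a cheap no-op instead of a
stall:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_spinlock_t merge_lock;	/* stand-in for dd->lock */

/* Placeholder for an expensive, usually-unsuccessful merge lookup. */
static bool expensive_merge_lookup(int key)
{
	return key % 7 == 0;		/* arbitrary toy condition */
}

/*
 * Opportunistic merge: worth doing only if the lock is free. If it's
 * contended, report "no merge" instead of spinning; the caller just
 * queues the IO unmerged, which is the common outcome anyway.
 */
static bool try_merge(int key)
{
	bool ret;

	if (pthread_spin_trylock(&merge_lock) != 0)
		return false;		/* contended: skip the attempt */
	ret = expensive_merge_lookup(key);
	pthread_spin_unlock(&merge_lock);
	return ret;
}

int main(void)
{
	pthread_spin_init(&merge_lock, PTHREAD_PROCESS_PRIVATE);
	printf("merge(14): %d\n", try_merge(14));	/* lookup runs */
	pthread_spin_lock(&merge_lock);			/* simulate contention */
	printf("merge(14) contended: %d\n", try_merge(14));	/* skipped */
	pthread_spin_unlock(&merge_lock);
	return 0;
}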