From patchwork Fri Jun 9 18:31:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274283 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2D1DC7EE29 for ; Fri, 9 Jun 2023 18:31:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230230AbjFISbf (ORCPT ); Fri, 9 Jun 2023 14:31:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230283AbjFISbd (ORCPT ); Fri, 9 Jun 2023 14:31:33 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D55A3A85 for ; Fri, 9 Jun 2023 11:31:32 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id ca18e2360f4ac-777a9ca9112so20758039f.1 for ; Fri, 09 Jun 2023 11:31:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1686335491; x=1688927491; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MRcYjvsoYs3sziamhSQYD0QqxzzMy87FZ6PyQbKgmQo=; b=PPXRvFdEMz9gAaG477US1WVBni/WF+iO3rgx8ynnUAkmxSlFiazRex2lYED0U6V2VZ OAY9FRt/kRdCLgKUB9STG/5Yp6A4AtFmx1vAn1iH5iWxpsoxwzpqm6FLi86SCyajMrKl 5N1mtPsyyiOqkTPgt+oO1iKKX4+whFFWOh+KXpG1ytrd0DwdVXujekReoKXmK7SjrtZ8 m7R6q+uRxootQwZ+Bbsf0ShjVoGY7TNU2LJVEezKtPOE+cX1zjTmdF6jA/dV1li4Kcd/ 8aeNE3k8bkv9SALiJe5cWx6z+Qb5IPAWG17yJKhcbg+Mk8ZbK6ToRi+EYk6xPL7WHJpD iGCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686335491; x=1688927491; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MRcYjvsoYs3sziamhSQYD0QqxzzMy87FZ6PyQbKgmQo=; b=PdbZ9qDialOPTTs9JJxrktll/EurFzKSsX/ZMlezrdcaLKlQ6/5WUsaq73/dTB+yHA GzqyR+lDL/cszKkwqrk02UtE1BeaSHVLXhl0d8LPrh9A8CWU3DOsI3hKNXUCGb4+sURu RA77Fnqgzh6YDNYD+v9kaWkMxGQWEnA+zRnv+THnbCsQyIR+Drlf6o1D7H6P9GcNUeIX 1C5xMRlHOWcJp0mc/loyrDR+2XLCsdyKG+TXPFn+5yTzxcqMnnqVe01+/oFWBNr1VK0w +EE+Bj+umUp8zYoHpx1xw+Q5tlqJZvF3+kDyGAnWb6zLp3Lqybiq4TlN4eWhYMPG8jm+ h8hA== X-Gm-Message-State: AC+VfDyOK7ujcNltYgcNsFIpwL1UqN1dKAeOUTkzFUlMFWulssqjn2u6 2iyx6gnWZpd0VEHMnJ3pOscO0mavfE1nQDzZT34= X-Google-Smtp-Source: ACHHUZ7/hrN0SacC93P/3OrV7yYM63SNkWf8vOrKzVIQpNF43WVeUkM/v3cw1HpVWgeJxKbKdl6whw== X-Received: by 2002:a6b:b308:0:b0:777:a5a8:b6f8 with SMTP id c8-20020a6bb308000000b00777a5a8b6f8mr1799718iof.0.1686335490998; Fri, 09 Jun 2023 11:31:30 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id j4-20020a02a684000000b0040fb2ba7357sm1103124jam.4.2023.06.09.11.31.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 11:31:30 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 1/6] futex: abstract out futex_op_to_flags() helper Date: Fri, 9 Jun 2023 12:31:20 -0600 Message-Id: <20230609183125.673140-2-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Rather than needing to duplicate this for the io_uring hook of futexes, abstract out a helper. No functional changes intended in this patch. 
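For anyone who wants to poke at the helper's contract outside the kernel tree, here is a minimal userspace sketch of the same op-to-flags translation. The FUTEX_* values repeat the uapi constants so that no kernel headers are needed. FLAGS_SHARED and FLAGS_CLOCKRT are placeholder bit values picked for illustration (assumption: the real definitions live in kernel/futex/futex.h and are not part of this patch), and sketch_decode() is a hypothetical stand-in for the do_futex() call site, returning -1 where the kernel returns -ENOSYS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* uapi values, repeated here so the sketch builds without kernel headers */
#define FUTEX_WAIT             0
#define FUTEX_WAIT_BITSET      9
#define FUTEX_WAIT_REQUEUE_PI  11
#define FUTEX_LOCK_PI2         13
#define FUTEX_PRIVATE_FLAG     128
#define FUTEX_CLOCK_REALTIME   256
#define FUTEX_CMD_MASK         (~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))

/* ASSUMPTION: placeholder bits, not the kernel's internal definitions */
#define FLAGS_SHARED   0x01u
#define FLAGS_CLOCKRT  0x02u

/* Same logic as the helper this patch adds to kernel/futex/futex.h */
static bool futex_op_to_flags(int op, int cmd, unsigned int *flags)
{
	assert(flags != NULL);

	/* Futexes are shared unless the caller asked for a private one */
	if (!(op & FUTEX_PRIVATE_FLAG))
		*flags |= FLAGS_SHARED;

	/* CLOCK_REALTIME is only accepted for three commands */
	if (op & FUTEX_CLOCK_REALTIME) {
		*flags |= FLAGS_CLOCKRT;
		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI &&
		    cmd != FUTEX_LOCK_PI2)
			return false;
	}

	return true;
}

/* Hypothetical caller modelled on do_futex(): 0 on success, -1 for -ENOSYS */
static int sketch_decode(int op, unsigned int *flags)
{
	int cmd = op & FUTEX_CMD_MASK;

	*flags = 0;
	return futex_op_to_flags(op, cmd, flags) ? 0 : -1;
}
```

Note that the helper ORs into *flags even on the failure path, so a caller has to check the return value before using the flags, which is what do_futex() does after this change.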
Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 15 +++++++++++++++ kernel/futex/syscalls.c | 11 ++--------- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index b5379c0e6d6d..d2949fca37d1 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -291,4 +291,19 @@ extern int futex_unlock_pi(u32 __user *uaddr, unsigned int flags); extern int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int trylock); +static inline bool futex_op_to_flags(int op, int cmd, unsigned int *flags) +{ + if (!(op & FUTEX_PRIVATE_FLAG)) + *flags |= FLAGS_SHARED; + + if (op & FUTEX_CLOCK_REALTIME) { + *flags |= FLAGS_CLOCKRT; + if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI && + cmd != FUTEX_LOCK_PI2) + return false; + } + + return true; +} + #endif /* _FUTEX_H */ diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c index a8074079b09e..75ca8c41cc94 100644 --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -88,15 +88,8 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, int cmd = op & FUTEX_CMD_MASK; unsigned int flags = 0; - if (!(op & FUTEX_PRIVATE_FLAG)) - flags |= FLAGS_SHARED; - - if (op & FUTEX_CLOCK_REALTIME) { - flags |= FLAGS_CLOCKRT; - if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI && - cmd != FUTEX_LOCK_PI2) - return -ENOSYS; - } + if (!futex_op_to_flags(op, cmd, &flags)) + return -ENOSYS; switch (cmd) { case FUTEX_WAIT: From patchwork Fri Jun 9 18:31:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274285 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6844FC7EE25 for ; Fri, 9 Jun 2023 18:31:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S230247AbjFISbh (ORCPT ); Fri, 9 Jun 2023 14:31:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229616AbjFISbg (ORCPT ); Fri, 9 Jun 2023 14:31:36 -0400 Received: from mail-il1-x12f.google.com (mail-il1-x12f.google.com [IPv6:2607:f8b0:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C59E35B3 for ; Fri, 9 Jun 2023 11:31:33 -0700 (PDT) Received: by mail-il1-x12f.google.com with SMTP id e9e14a558f8ab-33d0c740498so911435ab.0 for ; Fri, 09 Jun 2023 11:31:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1686335492; x=1688927492; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZXQqhUliGcPBmuBlH0wkZgD4w96JtLsryqUu5jjomWg=; b=As1AkOemnueixaino3MHFDx2xUDZq2TSa/Rn0liGFK4Pm968NecFs5Sumj4AEKJN+I 87B2GnwU17n2ZFj6B028t93S8RUyqtatMYJ9FSty1zMAWlvbRpK1pNupuDnuiLtYhcxk KkYXCNjIUqaAKovQLubC9weMkHOdTPUTIaIaYCTBw4Nl2/0wynUzb+VgbWzxQri1M692 +LhraX1U/miDoldVL55S85jxU66nzdSEjbDBIEdY5vHotu11z8pxof5W3DcDEjbQce1h DV8dgOxZNMW8nkAo8LxD5Z4AUfFOHApooQoGfuDztknjTWKTwmmZgJXyS4JIn/s2qF29 gIDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686335492; x=1688927492; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZXQqhUliGcPBmuBlH0wkZgD4w96JtLsryqUu5jjomWg=; b=VNxm/8+vjZ3Dv/2IvbosI3UdJaW6YGwW9/R7SH4Aj5+mmRKw6x5ft5l4pcP56zQEBX QPd+/KhBcP3dEvyEMny0xwjSNrZhRuOIzpv4Zo3EWRKnLban86y+7RFTChU3v8qpKE/Z zLOr+gdME7/Tb/IVEjVAP3V6qzp/gbVrY8SjVhoIb4puCiNlNVHJ3vgWqXY2CNjiZSsv q/uKU8WhJamLLAZAuiJcFY4J7CotFM5mr5P5zW0dV/v9YgJTO1dLApefJUqtbh+r23wW O7+1C4F0N8nKIKlcPh4M9ngwHS5Gtlscuowl9zVBAWlBt9SOLsXgdcXwstUXhWv9ymA2 
5XLA== X-Gm-Message-State: AC+VfDwhf+hNPTSZ+1ZtyxI1+LAuqOohvWamoPQfFhQ9Kkgs8GUy0fJ9 oraanA+YeE1UIh8ShsqRphDJSOD/OUOzVyqGIPQ= X-Google-Smtp-Source: ACHHUZ495VpbwfEM78NdUpSjoTokGZaP2yeKaGfAqZQXgkXJPKHbmQ3Uts3Db2azfbmISTdph2RIRA== X-Received: by 2002:a05:6e02:1188:b0:32a:eacb:c5d4 with SMTP id y8-20020a056e02118800b0032aeacbc5d4mr1225219ili.0.1686335492430; Fri, 09 Jun 2023 11:31:32 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id j4-20020a02a684000000b0040fb2ba7357sm1103124jam.4.2023.06.09.11.31.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 11:31:31 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 2/6] futex: factor out the futex wake handling Date: Fri, 9 Jun 2023 12:31:21 -0600 Message-Id: <20230609183125.673140-3-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org In preparation for having another waker that isn't futex_wake_mark(), add a wake handler in futex_q and rename the futex_q->task field to just be wake_data. futex_wake_mark() is defined as the standard wakeup helper. 
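To make the intent a bit more concrete, the following is a rough userspace model of the dispatch this patch introduces: the waker calls through a per-waiter function pointer, and the handler decides how to interpret wake_data. All sketch_* names are hypothetical stand-ins, not kernel types, and the handlers only flip a field where the real code would mark a task for wakeup or complete a request.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_wake_q { int marked; };      /* stand-in for struct wake_q_head */
struct sketch_futex_q;
typedef void (*sketch_wake_fn)(struct sketch_wake_q *, struct sketch_futex_q *);

/* Only the two fields this patch adds to struct futex_q */
struct sketch_futex_q {
	sketch_wake_fn wake;
	void *wake_data;
};

struct sketch_task { int woken; };         /* stand-in for task_struct */
struct sketch_request { int completed; };  /* stand-in for some other waiter */

/* Default handler: wake_data is a task, as with futex_wake_mark() */
static void sketch_wake_mark(struct sketch_wake_q *wq, struct sketch_futex_q *q)
{
	struct sketch_task *p = q->wake_data;

	p->woken = 1;
	wq->marked++;
}

/* Alternative handler: wake_data is whatever the installer put there */
static void sketch_request_wake(struct sketch_wake_q *wq, struct sketch_futex_q *q)
{
	struct sketch_request *req = q->wake_data;

	(void)wq;
	req->completed = 1;
}

/* Waker loop: calls q->wake() instead of hardcoding one helper */
static int sketch_wake_all(struct sketch_wake_q *wq,
			   struct sketch_futex_q **qs, size_t n)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		assert(qs[i]->wake != NULL);
		qs[i]->wake(wq, qs[i]);
		ret++;
	}
	return ret;
}
```

The waker never looks at wake_data itself, which is why the field can become an opaque void pointer.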
Signed-off-by: Jens Axboe --- kernel/futex/core.c | 2 +- kernel/futex/futex.h | 3 ++- kernel/futex/requeue.c | 7 ++++--- kernel/futex/waitwake.c | 8 ++++---- 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 514e4582b863..6223cce3d876 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -556,7 +556,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb) plist_node_init(&q->list, prio); plist_add(&q->list, &hb->chain); - q->task = current; + q->wake_data = current; } /** diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index d2949fca37d1..1b7dd5266dd2 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -96,7 +96,6 @@ struct futex_pi_state { struct futex_q { struct plist_node list; - struct task_struct *task; spinlock_t *lock_ptr; union futex_key key; struct futex_pi_state *pi_state; @@ -107,6 +106,8 @@ struct futex_q { #ifdef CONFIG_PREEMPT_RT struct rcuwait requeue_wait; #endif + void (*wake)(struct wake_q_head *wake_q, struct futex_q *); + void *wake_data; } __randomize_layout; extern const struct futex_q futex_q_init; diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index cba8b1a6a4cc..6aee8408c341 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -61,6 +61,7 @@ const struct futex_q futex_q_init = { .key = FUTEX_KEY_INIT, .bitset = FUTEX_BITSET_MATCH_ANY, .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE), + .wake = futex_wake_mark, }; /** @@ -234,7 +235,7 @@ void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key, /* Signal locked state to the waiter */ futex_requeue_pi_complete(q, 1); - wake_up_state(q->task, TASK_NORMAL); + wake_up_state(q->wake_data, TASK_NORMAL); } /** @@ -316,7 +317,7 @@ futex_proxy_trylock_atomic(u32 __user *pifutex, struct futex_hash_bucket *hb1, * the user space lock can be acquired then PI state is attached to * the new owner (@top_waiter->task) when @set_waiters is true. 
*/ - ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->task, + ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->wake_data, exiting, set_waiters); if (ret == 1) { /* @@ -626,7 +627,7 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2, ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex, this->rt_waiter, - this->task); + this->wake_data); if (ret == 1) { /* diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index ba01b9408203..5151c83e2db8 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -114,7 +114,7 @@ */ void futex_wake_mark(struct wake_q_head *wake_q, struct futex_q *q) { - struct task_struct *p = q->task; + struct task_struct *p = q->wake_data; if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n")) return; @@ -174,7 +174,7 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset) if (!(this->bitset & bitset)) continue; - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++ret >= nr_wake) break; } @@ -289,7 +289,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2, ret = -EINVAL; goto out_unlock; } - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++ret >= nr_wake) break; } @@ -303,7 +303,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2, ret = -EINVAL; goto out_unlock; } - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++op_ret >= nr_wake2) break; } From patchwork Fri Jun 9 18:31:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274286 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA362C7EE37 for ; Fri, 9 Jun 2023 18:31:39 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230238AbjFISbj (ORCPT ); Fri, 9 Jun 2023 14:31:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44716 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230299AbjFISbi (ORCPT ); Fri, 9 Jun 2023 14:31:38 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E1F83A81 for ; Fri, 9 Jun 2023 11:31:34 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id ca18e2360f4ac-7748ca56133so19063139f.0 for ; Fri, 09 Jun 2023 11:31:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1686335493; x=1688927493; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=58rvLYLUk8/S4pLP2TEKpTK3rfvG+0uSTMDF12Jv7EE=; b=V2QKKykZPPVpCqP6Tp+YZ3SHnKkz4Y9H3rEGXl9Ktf9gXRaOWgqD44erKTwbs5ACbE YixYWeybo8NvNoFHs8gy9hfvShmazIr/epu7G4w+eANKaclmioyZWp8QZ21TEuLroud+ cIXm1o02Dpm3Ra63sgANhSn8kK3WPUmjnXUsD/ggGswAt0h4c9Upjc1sPBJqsdvogCgd f0I3Cj3uvS/PWDHWe/zr0+XQXlfmp6AysD+hf2kk5DeBXPnYEUwzQFaO2lIeVWJGtwD8 MR8vTLY1FF5eohByBIgeNqaiq/96TeHtECIsOOnbwRQxXvRBdWbhzJ1BH9BKN7XDKhls CTQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686335493; x=1688927493; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=58rvLYLUk8/S4pLP2TEKpTK3rfvG+0uSTMDF12Jv7EE=; b=M5JrCgltzUWCM+9QSJjvJI3o360ZLdpTgvoDV3Ed4ggJmH9m6kXmfv2FvNuZp3LI/H GTGxC4Yiz7qIUZiCqQwp29M9v1DLM1NM9ddk94ltAMJYcX5j3ZZPYJQxO4gQTlbkFeo3 kre/Rx/6kVmoHQi39zdAL2riIoaJJzLCaYwS+ZUH4djhnJfEXhEk6vzvDc4ZoxZPPN+L X0oMPMFmVuT5URoXHeFxDOztH7gPIwAPBIOTK05sJB7gZkii2nNgT4YJeBMGF3j6Zq1G 
iSZdMKnr1xKeANY34ys+wo1FAVOgEYstQS2LorvJ/1cc7PoKBAr9/3rGG7RF/McIBcIj y91A== X-Gm-Message-State: AC+VfDyQibWx3zd27gwlOuBXGMsGQee6voZJXhGF6UzsdmSK+Icawgc7 kcQvujTgcvkQCepcG/LX71/A/q/YJhAUuUr59RM= X-Google-Smtp-Source: ACHHUZ62X5wmEsLFwnn8/vPAqNcIZVDyUcrTecrKXLRAgcCSPCCgUKrvNqo/wmDqMJqVdiWjJgADeA== X-Received: by 2002:a05:6602:408b:b0:777:a5a4:c6cb with SMTP id bl11-20020a056602408b00b00777a5a4c6cbmr1919989iob.1.1686335493619; Fri, 09 Jun 2023 11:31:33 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id j4-20020a02a684000000b0040fb2ba7357sm1103124jam.4.2023.06.09.11.31.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 11:31:32 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 3/6] futex: assign default futex_q->wake_data at insertion time Date: Fri, 9 Jun 2023 12:31:22 -0600 Message-Id: <20230609183125.673140-4-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Rather than assign futex_q->wake_data in the lower level queueing helper, move the assignment to the upper level one. This enables use of the lower level helper with the caller setting the wake handler data prior to calling it, rather than assuming that futex_wake_mark() is the handler for this futex_q. 
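A small userspace model of the split, for illustration only: the lower level helper just links the entry, while the upper level helper supplies the default wake data first. The sketch_* names are hypothetical, and sketch_current_task merely stands in for `current`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct futex_q, reduced to what matters here */
struct sketch_q {
	void *wake_data;
	int queued;
};

static int sketch_current_task;  /* stand-in for `current` */

/* Lower level helper: enqueue only, never touches wake_data */
static void sketch_queue_lowlevel(struct sketch_q *q)
{
	assert(q != NULL);
	q->queued = 1;
}

/* Upper level helper: default wake_data, then enqueue */
static void sketch_queue(struct sketch_q *q)
{
	q->wake_data = &sketch_current_task;
	sketch_queue_lowlevel(q);
}
```

A caller that installs its own handler sets wake_data itself and calls the lower level helper directly, so its value is not overwritten.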
Signed-off-by: Jens Axboe --- kernel/futex/core.c | 1 - kernel/futex/futex.h | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 6223cce3d876..b9d8619c06fc 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -556,7 +556,6 @@ void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb) plist_node_init(&q->list, prio); plist_add(&q->list, &hb->chain); - q->wake_data = current; } /** diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 1b7dd5266dd2..8c12cef83d38 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -171,6 +171,7 @@ extern int futex_unqueue(struct futex_q *q); static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb) __releases(&hb->lock) { + q->wake_data = current; __futex_queue(q, hb); spin_unlock(&hb->lock); } From patchwork Fri Jun 9 18:31:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274287 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 353BBC7EE25 for ; Fri, 9 Jun 2023 18:31:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229616AbjFISbp (ORCPT ); Fri, 9 Jun 2023 14:31:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230283AbjFISbn (ORCPT ); Fri, 9 Jun 2023 14:31:43 -0400 Received: from mail-il1-x130.google.com (mail-il1-x130.google.com [IPv6:2607:f8b0:4864:20::130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 874323A84 for ; Fri, 9 Jun 2023 11:31:36 -0700 (PDT) Received: by mail-il1-x130.google.com with SMTP id e9e14a558f8ab-33d0c740498so911505ab.0 for ; Fri, 09 Jun 2023 11:31:36 
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1686335495; x=1688927495; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pRFuYQO+GGntnur6QkOmgAEvI9IAoRPXEikEtEaQgPY=; b=xCeg8Ros7YbjjJT626pLIvGcaxRdub/hwwYnfOoLo0mpbMOEty68oWAs1pFX0BUIn5 acIx8gPhaTut01q6OmTK1casdw9FSlxU05loZN+zDS0rTvKBo1N7XVHat+FfsH16v1XN O+1M18A9TIPKb4XA3aovBujsWskltYqYcMjD39CcTsJNHk7gSxwWurAf5gXo3R020A/N vmaWHMgUa7D6JVIfRl0e/0llXUESOha1CQBK5RgwW6FlcygREQeYKrRttDndEAcEc3Aw IL1G3Ram44rHERNryKZrF6y7544YoPhX+VBcbN49xF2n7T1kYZewD1usvhZ+IzL52Ubb MF2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686335495; x=1688927495; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pRFuYQO+GGntnur6QkOmgAEvI9IAoRPXEikEtEaQgPY=; b=PXoCTGSYpQMspgCaAVMiI/gXRue/MaXBNs062ruowOjyhYzNn9sGylkpQr6jxjtVTv u/6VntzfmBHt+HmEUa56qnTlNi1VnyBkCrmtyNCKjgHII3yY5KOYAslrtJ/xxdqf2QXO rtCO0OVwjpqvC3lWM49NuAHtxFpmex02Ab+WtzaWZuckV8AYf2wHuXkos7B/iKct4F7f 7fV08e8TJDF4RSCSaPgWSzGvjYKKPES7KSbQHV21BTv4uRwlFp5SgKe+loKqJ/gGRNDo CX5ehSRosWS2Uhpsmcpkdm394+wIw8b/k0srWnVeubtDhV9S8kS6eGs7rTjpic/W6tnt te5Q== X-Gm-Message-State: AC+VfDxNtt4UA1oQyiiOeBUVI24baYf+uJ9DLwVKkmMnGzltosivFnqW KOLdWMLP+4vp1K0QkmwmxZUpL3xRoArHzMsTxXw= X-Google-Smtp-Source: ACHHUZ6K/g6dGq+RJ+YZKMDYTxy0K6ukZh5B9fgKwuF2awPZul2oN7ecfz3jO8pgZ0bvsRXHyataug== X-Received: by 2002:a05:6e02:188a:b0:33b:583d:1273 with SMTP id o10-20020a056e02188a00b0033b583d1273mr1968012ilu.1.1686335494951; Fri, 09 Jun 2023 11:31:34 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id j4-20020a02a684000000b0040fb2ba7357sm1103124jam.4.2023.06.09.11.31.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 
11:31:34 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 4/6] futex: add futex wait variant that takes a futex_q directly Date: Fri, 9 Jun 2023 12:31:23 -0600 Message-Id: <20230609183125.673140-5-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org For async trigger of the wait, we need to be able to pass in a futex_q that is already setup. Add that helper. Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 3 +++ kernel/futex/waitwake.c | 17 +++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 8c12cef83d38..29bf78a1f475 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -156,6 +156,9 @@ extern void __futex_unqueue(struct futex_q *q); extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb); extern int futex_unqueue(struct futex_q *q); +extern int futex_queue_wait(struct futex_q *q, u32 __user *uaddr, + unsigned int flags, u32 val); + /** * futex_queue() - Enqueue the futex_q on the futex_hash_bucket * @q: The futex_q to enqueue diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index 5151c83e2db8..442dafdfa22a 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -706,3 +706,20 @@ static long futex_wait_restart(struct restart_block *restart) restart->futex.val, tp, restart->futex.bitset); } +int futex_queue_wait(struct futex_q *q, u32 __user *uaddr, unsigned int flags, + u32 val) +{ + struct futex_hash_bucket *hb; + int ret; + + if (!q->bitset) + return -EINVAL; + + ret = futex_wait_setup(uaddr, val, flags, q, &hb); + if (ret) + return ret; + + __futex_queue(q, hb); + spin_unlock(&hb->lock); + return 0; +} From patchwork Fri Jun 9 18:31:24 2023 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274288 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66198C7EE29 for ; Fri, 9 Jun 2023 18:31:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230299AbjFISbq (ORCPT ); Fri, 9 Jun 2023 14:31:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230283AbjFISbp (ORCPT ); Fri, 9 Jun 2023 14:31:45 -0400 Received: from mail-il1-x131.google.com (mail-il1-x131.google.com [IPv6:2607:f8b0:4864:20::131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E72D35B3 for ; Fri, 9 Jun 2023 11:31:37 -0700 (PDT) Received: by mail-il1-x131.google.com with SMTP id e9e14a558f8ab-33dbad61311so1754135ab.0 for ; Fri, 09 Jun 2023 11:31:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1686335496; x=1688927496; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HMTV1MvEKKYKZqTI3o6o/uh1NWf3aebrfhj1tOoeR18=; b=FFInjZKBs19vz4rK5wMPcC8oPROsFuOblpQIjNXWJNmbxNRCzJbTjCmJ5ujcpUbxpH ld7RH5H8wHiEbuGEQQt2r6XhpISFNSSGKbedbbaKBYcuceVEOlY1dqSBv470e8uwkKw0 5//l4PugnknZZdSAkaFi3I5Ah5fvVYsFgCAENMWjExmaECEgm168sVViLa+MmEvR0R9Q G6I3JW0mAJGG6eFkeMVhAzN2dlc0xOuHiwO3MkW6k/nyfaTXeGHGj7yfgaMeUhj4CF2k TmO0zK3VRYstwGPv0egUR0aQJDD3DIXL61tHOOkhzA2c3sa1G8c/hp3HHxMrNLfhx0AZ LsTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686335496; x=1688927496; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc 
:subject:date:message-id:reply-to; bh=HMTV1MvEKKYKZqTI3o6o/uh1NWf3aebrfhj1tOoeR18=; b=J7FWqbLtnpwhjGINJCfxGi1Qw9OqLVt42TekL/K21zODhUnh/6wh8kuDddEchBhy4L P6HGpRdY4Dp+/2rqm3kM71DUwP34jjZ2RdmqDkAL/dZJJwls5QcZbk66VELCyk0D/InR tOkedflslbg5JaXX/rcJumw00V8oqkDKREgW90Kjk1w65JOcc7RgTJgU1muprK1Oh6ig 3pjInbUaVZA+fpZxeLK0rwotugKtL/EWur9eVUR6q8ad/1gx6lKo22vyX9zC2FtWQulI zWXgcCa89o00uAdzM5WtPWOwuPMfRvRmP3XjVTLE8VsMPMKs5jS+Ptc6BepvZfGz0hpb dwGQ== X-Gm-Message-State: AC+VfDwSRSrV1YMbdktunQ69lya1TTdU3R2U2Gs3px09lyfCWYVCVvZT aO0QjsDOQBibAZYD5nF36fgDwxuFESYPWwusgac= X-Google-Smtp-Source: ACHHUZ5pBGNdYFXHVBGTQJhND5SuYMtKgUzAMB4CFCxdvOsIhfH6Nf9wDXVYQrXcjI590ZVgdQMEHg== X-Received: by 2002:a05:6602:1545:b0:774:93a3:f163 with SMTP id h5-20020a056602154500b0077493a3f163mr1814238iow.0.1686335495877; Fri, 09 Jun 2023 11:31:35 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id j4-20020a02a684000000b0040fb2ba7357sm1103124jam.4.2023.06.09.11.31.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 11:31:35 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 5/6] io_uring: add support for futex wake and wait Date: Fri, 9 Jun 2023 12:31:24 -0600 Message-Id: <20230609183125.673140-6-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Add support for FUTEX_WAKE/WAIT primitives. IORING_OP_FUTEX_WAKE is a mix of FUTEX_WAKE and FUTEX_WAKE_BITSET, as it does support passing in a bitset. Similarly, IORING_OP_FUTEX_WAIT is a mix of FUTEX_WAIT and FUTEX_WAIT_BITSET. FUTEX_WAKE is straightforward, as we can always just do those inline. FUTEX_WAIT will queue the futex with an appropriate callback, and that callback will in turn post a CQE when it has triggered. 
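As a rough illustration of the submission side, the sketch below fills in the SQE fields that io_futex_prep() in this patch reads: addr for the futex word, len for the value (expected value for a wait, number of waiters for a wake), file_index for the bitset, and futex_flags for the flags. struct sketch_sqe is a hypothetical mock holding only those fields, not the real struct io_uring_sqe layout, and the sketch_* helpers are not liburing API. The mapping is what this version of the patch does and may well change in later revisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* uapi values, repeated here so the sketch builds without kernel headers */
#define FUTEX_PRIVATE_FLAG      128
#define FUTEX_CLOCK_REALTIME    256
#define FUTEX_CMD_MASK          (~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))
#define FUTEX_BITSET_MATCH_ANY  0xffffffffu

/* HYPOTHETICAL mock: only the fields io_futex_prep() reads in this patch */
struct sketch_sqe {
	uint64_t addr;         /* user address of the futex word */
	uint64_t addr2;        /* optional timeout pointer, 0 for none */
	uint32_t len;          /* WAIT: expected value, WAKE: number to wake */
	uint32_t file_index;   /* bitset mask */
	uint32_t futex_flags;  /* FUTEX_PRIVATE_FLAG / FUTEX_CLOCK_REALTIME */
};

/* Fill a mock SQE the way a futex wait request would be described */
static void sketch_prep_futex_wait(struct sketch_sqe *sqe, uint32_t *uaddr,
				   uint32_t val, uint32_t mask, uint32_t flags)
{
	assert(sqe != NULL && uaddr != NULL);

	memset(sqe, 0, sizeof(*sqe));
	sqe->addr = (uint64_t)(uintptr_t)uaddr;
	sqe->len = val;
	sqe->file_index = mask;
	sqe->futex_flags = flags;
}

/* Mirrors the last check in io_futex_prep(): no command bits in the flags */
static int sketch_flags_valid(const struct sketch_sqe *sqe)
{
	return (sqe->futex_flags & (uint32_t)FUTEX_CMD_MASK) == 0;
}
```

Passing FUTEX_BITSET_MATCH_ANY as the mask gives plain FUTEX_WAIT behaviour, while a zero mask is rejected with -EINVAL by futex_queue_wait().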
Cancelations are supported, both from the application point of view and to be able to cancel pending waits if the ring exits before all events have occurred. This is just the barebones wait/wake support. Features to be added later: - We do not support the PI or requeue operations. The immediate use case doesn't need them, and it is unclear whether future support for these would be useful. - Should we support futex wait with timeout? Not clear if this is necessary, as the usual io_uring linked timeouts could fill this purpose. - Would be nice to support registered futexes, just like we do buffers. This would avoid mapping in user memory for each operation. - Probably lots more that I just didn't think of. Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 2 + include/uapi/linux/io_uring.h | 3 + io_uring/Makefile | 4 +- io_uring/cancel.c | 5 + io_uring/cancel.h | 4 + io_uring/futex.c | 194 +++++++++++++++++++++++++++++++++ io_uring/futex.h | 26 +++++ io_uring/io_uring.c | 3 + io_uring/opdef.c | 25 ++++- 9 files changed, 264 insertions(+), 2 deletions(-) create mode 100644 io_uring/futex.c create mode 100644 io_uring/futex.h diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index f04ce513fadb..d796b578c129 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -273,6 +273,8 @@ struct io_ring_ctx { struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; + struct hlist_head futex_list; + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index f222d263bc55..b1a151ab8150 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -65,6 +65,7 @@ struct io_uring_sqe { __u32 xattr_flags; __u32 msg_ring_flags; __u32 uring_cmd_flags; + __u32 futex_flags; }; __u64 user_data; /* data to be passed back at completion time */ /* pack this to 
avoid bogus arm OABI complaints */ @@ -235,6 +236,8 @@ enum io_uring_op { IORING_OP_URING_CMD, IORING_OP_SEND_ZC, IORING_OP_SENDMSG_ZC, + IORING_OP_FUTEX_WAIT, + IORING_OP_FUTEX_WAKE, /* this goes last, obviously */ IORING_OP_LAST, diff --git a/io_uring/Makefile b/io_uring/Makefile index 8cc8e5387a75..2e4779bc550c 100644 --- a/io_uring/Makefile +++ b/io_uring/Makefile @@ -7,5 +7,7 @@ obj-$(CONFIG_IO_URING) += io_uring.o xattr.o nop.o fs.o splice.o \ openclose.o uring_cmd.o epoll.o \ statx.o net.o msg_ring.o timeout.o \ sqpoll.o fdinfo.o tctx.o poll.o \ - cancel.o kbuf.o rsrc.o rw.o opdef.o notif.o + cancel.o kbuf.o rsrc.o rw.o opdef.o \ + notif.o obj-$(CONFIG_IO_WQ) += io-wq.o +obj-$(CONFIG_FUTEX) += futex.o diff --git a/io_uring/cancel.c b/io_uring/cancel.c index b4f5dfacc0c3..280fb83145d3 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -15,6 +15,7 @@ #include "tctx.h" #include "poll.h" #include "timeout.h" +#include "futex.h" #include "cancel.h" struct io_cancel { @@ -98,6 +99,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd, if (ret != -ENOENT) return ret; + ret = io_futex_cancel(ctx, cd, issue_flags); + if (ret != -ENOENT) + return ret; + spin_lock(&ctx->completion_lock); if (!(cd->flags & IORING_ASYNC_CANCEL_FD)) ret = io_timeout_cancel(ctx, cd); diff --git a/io_uring/cancel.h b/io_uring/cancel.h index 6a59ee484d0c..6a2a38df7159 100644 --- a/io_uring/cancel.h +++ b/io_uring/cancel.h @@ -1,4 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 +#ifndef IORING_CANCEL_H +#define IORING_CANCEL_H #include @@ -21,3 +23,5 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd, void init_hash_table(struct io_hash_table *table, unsigned size); int io_sync_cancel(struct io_ring_ctx *ctx, void __user *arg); + +#endif diff --git a/io_uring/futex.c b/io_uring/futex.c new file mode 100644 index 000000000000..a1d50145927a --- /dev/null +++ b/io_uring/futex.c @@ -0,0 +1,194 @@ +// SPDX-License-Identifier: GPL-2.0 +#include 
+#include +#include +#include +#include + +#include + +#include "../kernel/futex/futex.h" +#include "io_uring.h" +#include "futex.h" + +struct io_futex { + struct file *file; + u32 __user *uaddr; + int futex_op; + unsigned int futex_val; + unsigned int futex_flags; + unsigned int futex_mask; + bool has_timeout; + ktime_t timeout; +}; + +static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) +{ + struct io_ring_ctx *ctx = req->ctx; + + kfree(req->async_data); + io_tw_lock(ctx, ts); + hlist_del_init(&req->hash_node); + io_req_task_complete(req, ts); +} + +static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req) +{ + struct futex_q *q = req->async_data; + + /* futex wake already done or in progress */ + if (!futex_unqueue(q)) + return false; + + hlist_del_init(&req->hash_node); + io_req_set_res(req, -ECANCELED, 0); + req->io_task_work.func = io_futex_complete; + io_req_task_work_add(req); + return true; +} + +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + int nr = 0; + + if (cd->flags & (IORING_ASYNC_CANCEL_FD|IORING_ASYNC_CANCEL_FD_FIXED)) + return 0; + + io_ring_submit_lock(ctx, issue_flags); + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if (req->cqe.user_data != cd->data && + !(cd->flags & IORING_ASYNC_CANCEL_ANY)) + continue; + if (__io_futex_cancel(ctx, req)) + nr++; + if (!(cd->flags & IORING_ASYNC_CANCEL_ALL)) + break; + } + io_ring_submit_unlock(ctx, issue_flags); + + if (nr) + return nr; + + return -ENOENT; +} + +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + bool found = false; + + lockdep_assert_held(&ctx->uring_lock); + + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if (!io_match_task_safe(req, task, cancel_all)) + continue; + __io_futex_cancel(ctx, 
req); + found = true; + } + + return found; +} + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + struct __kernel_timespec __user *utime; + struct timespec64 t; + + iof->futex_op = READ_ONCE(sqe->fd); + iof->uaddr = u64_to_user_ptr(READ_ONCE(sqe->addr)); + iof->futex_val = READ_ONCE(sqe->len); + iof->has_timeout = false; + iof->futex_mask = READ_ONCE(sqe->file_index); + utime = u64_to_user_ptr(READ_ONCE(sqe->addr2)); + if (utime) { + if (get_timespec64(&t, utime)) + return -EFAULT; + iof->timeout = timespec64_to_ktime(t); + iof->timeout = ktime_add_safe(ktime_get(), iof->timeout); + iof->has_timeout = true; + } + iof->futex_flags = READ_ONCE(sqe->futex_flags); + if (iof->futex_flags & FUTEX_CMD_MASK) + return -EINVAL; + + return 0; +} + +static void io_futex_wake_fn(struct wake_q_head *wake_q, struct futex_q *q) +{ + struct io_kiocb *req = q->wake_data; + + __futex_unqueue(q); + smp_store_release(&q->lock_ptr, NULL); + + io_req_set_res(req, 0, 0); + req->io_task_work.func = io_futex_complete; + io_req_task_work_add(req); +} + +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + struct io_ring_ctx *ctx = req->ctx; + unsigned int flags = 0; + struct futex_q *q; + int ret; + + if (!futex_op_to_flags(FUTEX_WAIT, iof->futex_flags, &flags)) { + ret = -ENOSYS; + goto done; + } + + q = kmalloc(sizeof(*q), GFP_NOWAIT); + if (!q) { + ret = -ENOMEM; + goto done; + } + + req->async_data = q; + *q = futex_q_init; + q->bitset = iof->futex_mask; + q->wake = io_futex_wake_fn; + q->wake_data = req; + + io_ring_submit_lock(ctx, issue_flags); + hlist_add_head(&req->hash_node, &ctx->futex_list); + io_ring_submit_unlock(ctx, issue_flags); + + ret = futex_queue_wait(q, iof->uaddr, flags, iof->futex_val); + if (ret) + goto done; + + return IOU_ISSUE_SKIP_COMPLETE; +done: + if (ret < 0) + req_set_fail(req); + 
io_req_set_res(req, ret, 0); + return IOU_OK; +} + +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + unsigned int flags = 0; + int ret; + + if (!futex_op_to_flags(FUTEX_WAKE, iof->futex_flags, &flags)) { + ret = -ENOSYS; + goto done; + } + + ret = futex_wake(iof->uaddr, flags, iof->futex_val, iof->futex_mask); +done: + if (ret < 0) + req_set_fail(req); + io_req_set_res(req, ret, 0); + return IOU_OK; +} diff --git a/io_uring/futex.h b/io_uring/futex.h new file mode 100644 index 000000000000..16add2c069cc --- /dev/null +++ b/io_uring/futex.h @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "cancel.h" + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags); +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags); + +#if defined(CONFIG_FUTEX) +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags); +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all); +#else +static inline int io_futex_cancel(struct io_ring_ctx *ctx, + struct io_cancel_data *cd, + unsigned int issue_flags) +{ + return 0; +} +static inline bool io_futex_remove_all(struct io_ring_ctx *ctx, + struct task_struct *task, bool cancel_all) +{ + return false; +} +#endif diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index a467064da1af..8270f37c312d 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -92,6 +92,7 @@ #include "cancel.h" #include "net.h" #include "notif.h" +#include "futex.h" #include "timeout.h" #include "poll.h" @@ -336,6 +337,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) INIT_LIST_HEAD(&ctx->tctx_list); ctx->submit_state.free_list.next = NULL; INIT_WQ_LIST(&ctx->locked_free_list); + INIT_HLIST_HEAD(&ctx->futex_list); INIT_DELAYED_WORK(&ctx->fallback_work, 
io_fallback_req_func); INIT_WQ_LIST(&ctx->submit_state.compl_reqs); return ctx; @@ -3309,6 +3311,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx, ret |= io_cancel_defer_files(ctx, task, cancel_all); mutex_lock(&ctx->uring_lock); ret |= io_poll_remove_all(ctx, task, cancel_all); + ret |= io_futex_remove_all(ctx, task, cancel_all); mutex_unlock(&ctx->uring_lock); ret |= io_kill_timeouts(ctx, task, cancel_all); if (task) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 3b9c6489b8b6..e6b03d7f82e5 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -33,6 +33,7 @@ #include "poll.h" #include "cancel.h" #include "rw.h" +#include "futex.h" static int io_no_issue(struct io_kiocb *req, unsigned int issue_flags) { @@ -426,11 +427,26 @@ const struct io_issue_def io_issue_defs[] = { .issue = io_sendmsg_zc, #else .prep = io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAIT] = { +#if defined(CONFIG_FUTEX) + .prep = io_futex_prep, + .issue = io_futex_wait, +#else + .prep = io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAKE] = { +#if defined(CONFIG_FUTEX) + .prep = io_futex_prep, + .issue = io_futex_wake, +#else + .prep = io_eopnotsupp_prep, #endif }, }; - const struct io_cold_def io_cold_defs[] = { [IORING_OP_NOP] = { .name = "NOP", @@ -648,6 +664,13 @@ const struct io_cold_def io_cold_defs[] = { .fail = io_sendrecv_fail, #endif }, + [IORING_OP_FUTEX_WAIT] = { + .name = "FUTEX_WAIT", + }, + + [IORING_OP_FUTEX_WAKE] = { + .name = "FUTEX_WAKE", + }, }; const char *io_uring_get_opcode(u8 opcode) From patchwork Fri Jun 9 18:31:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13274289 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE819C7EE37 for ; Fri, 9 Jun 
2023 18:31:48 +0000 (UTC) From: Jens Axboe To: io-uring@vger.kernel.org Cc: andres@anarazel.de, Jens Axboe Subject: [PATCH 6/6] io_uring/futex: enable use of the allocation caches for futex_q Date: Fri, 9 Jun 2023 12:31:25 -0600 Message-Id: <20230609183125.673140-7-axboe@kernel.dk> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230609183125.673140-1-axboe@kernel.dk> References: <20230609183125.673140-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org

We're under the ctx uring_lock for the issue and completion path anyway, wire up the futex_q allocator so we can just recycle entries rather than hit the allocator every time.
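For illustration only, the recycling idea can be sketched in user space roughly as below. This is not the kernel code: the names struct payload, struct cached_obj, struct obj_cache, cache_put(), cache_get(), obj_alloc(), obj_release() and the CACHE_MAX cap are all hypothetical stand-ins, malloc()/free() take the place of kmalloc()/kfree(), and the caller is assumed to already hold whatever lock protects the cache (the role the ctx uring_lock plays in this patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_MAX 32	/* hypothetical cap on the number of cached objects */

/* Hypothetical stand-in for struct futex_q; only its storage matters here. */
struct payload {
	unsigned int bitset;
	void *wake_data;
};

/* Free-list link; it overlays the payload while the object sits in the cache. */
struct cache_entry {
	struct cache_entry *next;
};

/* Same shape as the union in the patch: live payload or cache linkage. */
struct cached_obj {
	union {
		struct payload q;
		struct cache_entry cache;
	};
};

struct obj_cache {
	struct cache_entry *head;
	unsigned int nr_cached;
};

/* Caller is assumed to hold the lock protecting the cache. */
static bool cache_put(struct obj_cache *c, struct cached_obj *obj)
{
	assert(obj);
	if (c->nr_cached >= CACHE_MAX)
		return false;	/* cache is full: the caller frees the object */
	obj->cache.next = c->head;
	c->head = &obj->cache;
	c->nr_cached++;
	return true;
}

static struct cached_obj *cache_get(struct obj_cache *c)
{
	struct cache_entry *entry = c->head;

	if (!entry)
		return NULL;
	c->head = entry->next;
	c->nr_cached--;
	/* The union members share offset 0, so this cast plays the role of container_of(). */
	return (struct cached_obj *)entry;
}

/* Try the cache first and fall back to the allocator only on a miss. */
static struct cached_obj *obj_alloc(struct obj_cache *c)
{
	struct cached_obj *obj = cache_get(c);

	return obj ? obj : malloc(sizeof(*obj));
}

/* Recycle the object if there is room, otherwise hand it back to the allocator. */
static void obj_release(struct obj_cache *c, struct cached_obj *obj)
{
	if (!cache_put(c, obj))
		free(obj);
}
```

Because the link and the payload share storage, a recycled object has to be fully re-initialised before reuse, which is what the assignment from futex_q_init does in io_futex_wait().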
Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 1 + io_uring/futex.c | 65 +++++++++++++++++++++++++++------- io_uring/futex.h | 8 +++++ io_uring/io_uring.c | 2 ++ 4 files changed, 63 insertions(+), 13 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index d796b578c129..a7f03d8d879f 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -274,6 +274,7 @@ struct io_ring_ctx { unsigned int locked_free_nr; struct hlist_head futex_list; + struct io_alloc_cache futex_cache; const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ diff --git a/io_uring/futex.c b/io_uring/futex.c index a1d50145927a..e0707723c689 100644 --- a/io_uring/futex.c +++ b/io_uring/futex.c @@ -9,6 +9,7 @@ #include "../kernel/futex/futex.h" #include "io_uring.h" +#include "rsrc.h" #include "futex.h" struct io_futex { @@ -22,22 +23,48 @@ struct io_futex { ktime_t timeout; }; +struct io_futex_data { + union { + struct futex_q q; + struct io_cache_entry cache; + }; +}; + +void io_futex_cache_init(struct io_ring_ctx *ctx) +{ + io_alloc_cache_init(&ctx->futex_cache, IO_NODE_ALLOC_CACHE_MAX, + sizeof(struct io_futex_data)); +} + +static void io_futex_cache_entry_free(struct io_cache_entry *entry) +{ + kfree(container_of(entry, struct io_futex_data, cache)); +} + +void io_futex_cache_free(struct io_ring_ctx *ctx) +{ + io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free); +} + static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) { + struct io_futex_data *ifd = req->async_data; struct io_ring_ctx *ctx = req->ctx; - kfree(req->async_data); io_tw_lock(ctx, ts); + if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache)) + kfree(ifd); + req->async_data = NULL; hlist_del_init(&req->hash_node); io_req_task_complete(req, ts); } static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req) { - struct futex_q *q = 
req->async_data; + struct io_futex_data *ifd = req->async_data; /* futex wake already done or in progress */ - if (!futex_unqueue(q)) + if (!futex_unqueue(&ifd->q)) return false; hlist_del_init(&req->hash_node); @@ -133,12 +160,23 @@ static void io_futex_wake_fn(struct wake_q_head *wake_q, struct futex_q *q) io_req_task_work_add(req); } +static struct io_futex_data *io_alloc_ifd(struct io_ring_ctx *ctx) +{ + struct io_cache_entry *entry; + + entry = io_alloc_cache_get(&ctx->futex_cache); + if (entry) + return container_of(entry, struct io_futex_data, cache); + + return kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT); +} + int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) { struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); struct io_ring_ctx *ctx = req->ctx; + struct io_futex_data *ifd; unsigned int flags = 0; - struct futex_q *q; int ret; if (!futex_op_to_flags(FUTEX_WAIT, iof->futex_flags, &flags)) { @@ -146,23 +184,24 @@ int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) goto done; } - q = kmalloc(sizeof(*q), GFP_NOWAIT); - if (!q) { + io_ring_submit_lock(ctx, issue_flags); + ifd = io_alloc_ifd(ctx); + if (!ifd) { + io_ring_submit_unlock(ctx, issue_flags); ret = -ENOMEM; goto done; } - req->async_data = q; - *q = futex_q_init; - q->bitset = iof->futex_mask; - q->wake = io_futex_wake_fn; - q->wake_data = req; + req->async_data = ifd; + ifd->q = futex_q_init; + ifd->q.bitset = iof->futex_mask; + ifd->q.wake = io_futex_wake_fn; + ifd->q.wake_data = req; - io_ring_submit_lock(ctx, issue_flags); hlist_add_head(&req->hash_node, &ctx->futex_list); io_ring_submit_unlock(ctx, issue_flags); - ret = futex_queue_wait(q, iof->uaddr, flags, iof->futex_val); + ret = futex_queue_wait(&ifd->q, iof->uaddr, flags, iof->futex_val); if (ret) goto done; diff --git a/io_uring/futex.h b/io_uring/futex.h index 16add2c069cc..e60d0abaf676 100644 --- a/io_uring/futex.h +++ b/io_uring/futex.h @@ -11,6 +11,8 @@ int io_futex_cancel(struct 
io_ring_ctx *ctx, struct io_cancel_data *cd, unsigned int issue_flags); bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, bool cancel_all); +void io_futex_cache_init(struct io_ring_ctx *ctx); +void io_futex_cache_free(struct io_ring_ctx *ctx); #else static inline int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, @@ -23,4 +25,10 @@ static inline bool io_futex_remove_all(struct io_ring_ctx *ctx, { return false; } +static inline void io_futex_cache_init(struct io_ring_ctx *ctx) +{ +} +static inline void io_futex_cache_free(struct io_ring_ctx *ctx) +{ +} #endif diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 8270f37c312d..7db2a139d110 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -318,6 +318,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) sizeof(struct async_poll)); io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_msghdr)); + io_futex_cache_init(ctx); init_completion(&ctx->ref_comp); xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1); mutex_init(&ctx->uring_lock); @@ -2917,6 +2918,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_eventfd_unregister(ctx); io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free); io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); + io_futex_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds)