From patchwork Wed Jul 10 17:58:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13729555
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, Oleg Nesterov, Andrew Morton,
 Christian Brauner, Tycho Andersen, Thomas Gleixner,
 linux-kernel@vger.kernel.org, Julian Orth, Peter Zijlstra, Tejun Heo
Subject: [PATCH v3 1/2] io_uring/io-wq: limit retrying worker initialisation
Date: Wed, 10 Jul 2024 18:58:17 +0100
Message-ID: <8280436925db88448c7c85c6656edee1a43029ea.1720634146.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.44.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

If io-wq worker creation fails, we retry it by queueing up a task_work.
task_work is needed because the retry should be done from the user
process context. The problem is that retries are not limited, and if
queueing a task_work is the reason for the failure, we might get into
an infinite loop.

It doesn't seem to happen now, but it would with the following patch,
which executes task_work in the freezer's loop. For now, arbitrarily
limit the number of attempts to create a worker.

Cc: stable@vger.kernel.org
Fixes: 3146cba99aa28 ("io-wq: make worker creation resilient against signals")
Reported-by: Julian Orth
Signed-off-by: Pavel Begunkov
---
 io_uring/io-wq.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 913c92249522..f1e7c670add8 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -23,6 +23,7 @@
 #include "io_uring.h"

 #define WORKER_IDLE_TIMEOUT	(5 * HZ)
+#define WORKER_INIT_LIMIT	3

 enum {
 	IO_WORKER_F_UP		= 0,	/* up and active */
@@ -58,6 +59,7 @@ struct io_worker {

 	unsigned long create_state;
 	struct callback_head create_work;
+	int init_retries;

 	union {
 		struct rcu_head rcu;
@@ -745,7 +747,7 @@ static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
 	return true;
 }

-static inline bool io_should_retry_thread(long err)
+static inline bool io_should_retry_thread(struct io_worker *worker, long err)
 {
 	/*
 	 * Prevent perpetual task_work retry, if the task (or its group) is
@@ -753,6 +755,8 @@ static inline bool io_should_retry_thread(long err)
 	 */
 	if (fatal_signal_pending(current))
 		return false;
+	if (worker->init_retries++ >= WORKER_INIT_LIMIT)
+		return false;

 	switch (err) {
 	case -EAGAIN:
@@ -779,7 +783,7 @@ static void create_worker_cont(struct callback_head *cb)
 		io_init_new_worker(wq, worker, tsk);
 		io_worker_release(worker);
 		return;
-	} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
 		struct io_wq_acct *acct = io_wq_get_acct(worker);

 		atomic_dec(&acct->nr_running);
@@ -846,7 +850,7 @@ static bool create_io_worker(struct io_wq *wq, int index)
 	tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
 	if (!IS_ERR(tsk)) {
 		io_init_new_worker(wq, worker, tsk);
-	} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+	} else if (!io_should_retry_thread(worker, PTR_ERR(tsk))) {
 		kfree(worker);
 		goto fail;
 	} else {

From patchwork Wed Jul 10 17:58:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13729556
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, Oleg Nesterov, Andrew Morton,
 Christian Brauner, Tycho Andersen, Thomas Gleixner,
 linux-kernel@vger.kernel.org, Julian Orth, Peter Zijlstra, Tejun Heo
Subject: [PATCH v3 2/2] kernel: rerun task_work while freezing in get_signal()
Date: Wed, 10 Jul 2024 18:58:18 +0100
Message-ID: <89ed3a52933370deaaf61a0a620a6ac91f1e754d.1720634146.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.44.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

io_uring can asynchronously add a task_work while the task is being
frozen. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
do_freezer_trap(), and since get_signal()'s relock loop doesn't retry
task_work, the task will spin there, unable to sleep until the freezing
is cancelled / the task is killed / etc.

Run task_works in the freezer path. Keep the patch small and simple so
it can be easily backported, but we might need to do some cleanup
afterwards and check whether there are other places with similar
problems.

Cc: stable@vger.kernel.org
Link: https://github.com/systemd/systemd/issues/33626
Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: Julian Orth
Acked-by: Oleg Nesterov
Acked-by: Tejun Heo
Signed-off-by: Pavel Begunkov
---
 kernel/signal.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..60c737e423a1 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2600,6 +2600,14 @@ static void do_freezer_trap(void)
 	spin_unlock_irq(&current->sighand->siglock);
 	cgroup_enter_frozen();
 	schedule();
+
+	/*
+	 * We could've been woken by task_work, run it to clear
+	 * TIF_NOTIFY_SIGNAL. The caller will retry if necessary.
+	 */
+	clear_notify_signal();
+	if (unlikely(task_work_pending(current)))
+		task_work_run();
 }

 static int ptrace_signal(int signr, kernel_siginfo_t *info, enum pid_type type)