From patchwork Mon Sep 9 10:23:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137655 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2F8BB76 for ; Mon, 9 Sep 2019 10:24:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0DC8521920 for ; Mon, 9 Sep 2019 10:24:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="eJ1w8LF2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390580AbfIIKXr (ORCPT ); Mon, 9 Sep 2019 06:23:47 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:40249 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390549AbfIIKXq (ORCPT ); Mon, 9 Sep 2019 06:23:46 -0400 Received: by mail-wm1-f66.google.com with SMTP id t9so13972408wmi.5 for ; Mon, 09 Sep 2019 03:23:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=f7Bzmc8unRj2o3NCWa0+dcZGFlGvmP3kCEEVnSXhkOA=; b=eJ1w8LF2MYSOvsRCvj7XcfJA5lhEO87R1bLBeu65GW2j/DGqZqPprE6kQ/hoghBzbS w6IdbZv33BK7k8oJg6WhQJ6q45RWIDsL9J4I4lmQeSytRwscjSdlC1vPe0up/jQedywO ozLkAMFRLRFb272kZZBTRLL+klZgN6S6NvqmWL2zm4s8aU7nI6u7NRc01CFL+L4mhfAq 13ec60iL8+uugHfNqWE9jgWn4GLDuRC4ysdNh03NZwYyadGN2ZTHGbRzb+wzh1HI8hEf RWxOlKkMMFrrMbl/ZDjywr90FbmUdRl8ZatAWFuhFYCT3A9SCDF+Qy7Us1qtVzMPfVr+ 5ZRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=f7Bzmc8unRj2o3NCWa0+dcZGFlGvmP3kCEEVnSXhkOA=; 
b=UX/tnYzU1Vnu4Z2OSFl+kbIeR9WjpYxCCM4ndBa/i8xusgqKKe7Qv1DDg/OjHDQswb fztY+gKwXHo3cRgu4TfHkNbyr5OSwhHMi8BArVx7loyf1YOcph4GXEKhVfrAu90Uxy/4 zihutm1MFp3WOFy2rGs+4bmc1PyN0gnMnWMnylFCGXbT5Mu6hpr08v9Z1xBCWFele1nh qGdO4qdEGDUU6FmEhxuH7XjSjYSeUooprD+DWmIMqm0L6QPb6Mrd0092/uSp+vHm+qq1 N/7/I2/QehWhYl1SXV5tVhDpU+S8/scHuATvdXppczz0s3DuDBKkxd8I7rcNr8vmkqEw 45GQ== X-Gm-Message-State: APjAAAXtyl01maLm6d8PP5yFddWOvqr9oDP3On2/h6+wY9Ymrq0fleUu bk6NqHn5+rFA4/NlvCteozyzLA== X-Google-Smtp-Source: APXvYqzmxzn7Fo/gtLbCgxCJksmoaFbMVNn7wj0/OSQx0ISZVsTWLDyDCK67zQsOMozuUO5BCkL7Zg== X-Received: by 2002:a1c:1acc:: with SMTP id a195mr18081713wma.106.1568024624900; Mon, 09 Sep 2019 03:23:44 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:44 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 1/9] futex: Remove unused uaddr2 in restart_block Date: Mon, 9 Sep 2019 11:23:32 +0100 Message-Id: <20190909102340.8592-2-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: <20190909102340.8592-1-dima@arista.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Not used since introduction in commit 52400ba94675 ("futex: add requeue_pi functionality"). The result union stays the same size, so nothing saved in task_struct, but still one __user pointer less to keep. 
Signed-off-by: Dmitry Safonov --- include/linux/restart_block.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h index bba2920e9c05..e5078cae5567 100644 --- a/include/linux/restart_block.h +++ b/include/linux/restart_block.h @@ -32,7 +32,6 @@ struct restart_block { u32 flags; u32 bitset; u64 time; - u32 __user *uaddr2; } futex; /* For nanosleep */ struct { From patchwork Mon Sep 9 10:23:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137639 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A3B2A76 for ; Mon, 9 Sep 2019 10:23:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 78EF921D7D for ; Mon, 9 Sep 2019 10:23:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="U0XMlxSQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390636AbfIIKXv (ORCPT ); Mon, 9 Sep 2019 06:23:51 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:54305 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390548AbfIIKXt (ORCPT ); Mon, 9 Sep 2019 06:23:49 -0400 Received: by mail-wm1-f66.google.com with SMTP id p7so1582559wmp.4 for ; Mon, 09 Sep 2019 03:23:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=IA0pHM62rBt/shfTkMZDw95CUns+TQYkKXXhJayt/9U=; b=U0XMlxSQHq3X/3eDTFp9gMAg0RdWOTmIkI/8SejEwVjgRQkVN+mKfrmUO+6NK/B19i J69SZuDgx6wSWiNJhyc+jxxJEjQODQtB3ii8qA590akliD2WOFuq8riSf4PUfQ1u9Wf7 
HS8bUT+5T+5sgYfQjU/Nu0NUXkiPluWQZmVPxBfmlLGgEQGKoDx+XP8/rpRjdOoUQAXx RUAUlmzy+KiDwBC3+RmSfExy8zDPCh320PC2BcR3iO0d+DSrkCR+Oc9s8IjP0SRHFW/b Ivu2JW+UQMoQ2lhqXigxazO1duoF/4UIYCn4Oz8AuedOqrEd4r2+vw/AQajRaKAGjjrM KOTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IA0pHM62rBt/shfTkMZDw95CUns+TQYkKXXhJayt/9U=; b=l7UAQXbHKmWCRNMfYw5C5awh1QjX4z5t7gkGJiu8f+zkYNBjqxvZNzk7Vyq+XFX35k W4JhleAupghxzWVIqPgrznTHq4+d6vJsciBAApHAjU8NDsKh7JwGygXSWeeVcyU/fuHM hZuZ0cqqn6D8iS9mYqhM0/wZyOvYbiq0J+461fh1K75CFu2ydFd36eAIB0NCnclf6Rdi DXFxG7JVMxWLf/i35UWyrLZms/aSVnMJ3UQSw2OCIpeUsa5yRbnIpvEh1zQzdsuFvABr NWndhY/rWBFMRNmPFNwpu5WK/P4UvhTUcuzrtKaCkGXnybX+2ZBp6Ak4sSPfF6Setku0 Zp1A== X-Gm-Message-State: APjAAAWCX5pAuo3t8D5cXti4wxpmMu3ls1QO7kDYRcRlCeqZcmGLZhtf qYAPn/w8jRZhqrTVsQoVlhTIcA== X-Google-Smtp-Source: APXvYqxO92spFZfDerxwHiQLtm1hbtX83culFxinsCcT06VZq/PB8u2x8luUXD3yumaETBFthgzPsg== X-Received: by 2002:a1c:a558:: with SMTP id o85mr18092823wme.30.1568024626269; Mon, 09 Sep 2019 03:23:46 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:45 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 2/9] restart_block: Prevent userspace set part of the block Date: Mon, 9 Sep 2019 11:23:33 +0100 Message-Id: <20190909102340.8592-3-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: 
    <20190909102340.8592-1-dima@arista.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

Parameters for nanosleep() can be chosen so that hrtimer_nanosleep()
fails. In that case the changes already made to restart_block leave it
in an inconsistent state. Luckily, that won't corrupt anything critical
for poll() or futex(). But it's not evident that userspace may play
tricks with the union by changing restart_block for other @fn(s), so
further changes in the code may create a potential local vulnerability.
I.e., if userspace could play the same tricks with poll() or futex(),
then corruption of @clockid or @type would trigger BUG() in the timer
code.

Set @fn every time restart_block is changed, preventing surprises.
Also, add a comment for any new restart_block user.

Signed-off-by: Dmitry Safonov
---
 include/linux/restart_block.h  | 4 ++++
 kernel/time/hrtimer.c          | 8 +++++---
 kernel/time/posix-cpu-timers.c | 6 +++---
 kernel/time/posix-stubs.c      | 8 +++++---
 kernel/time/posix-timers.c     | 8 +++++---
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h
index e5078cae5567..e66e982105f4 100644
--- a/include/linux/restart_block.h
+++ b/include/linux/restart_block.h
@@ -21,6 +21,10 @@ enum timespec_type {
 
 /*
  * System call restart block.
+ *
+ * Safety rule: if you change anything inside @restart_block,
+ * set @fn to keep the structure in consistent state and prevent
+ * userspace tricks in the union.
  */
 struct restart_block {
 	long (*fn)(struct restart_block *);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5ee77f1a8a92..4ba2b50d068f 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1762,8 +1762,9 @@ SYSCALL_DEFINE2(nanosleep, struct __kernel_timespec __user *, rqtp,
 	if (!timespec64_valid(&tu))
 		return -EINVAL;
 
-	current->restart_block.nanosleep.type = rmtp ?
TT_NATIVE : TT_NONE; - current->restart_block.nanosleep.rmtp = rmtp; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; + current->restart_block.nanosleep.rmtp = rmtp; return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC); } @@ -1782,7 +1783,8 @@ SYSCALL_DEFINE2(nanosleep_time32, struct old_timespec32 __user *, rqtp, if (!timespec64_valid(&tu)) return -EINVAL; - current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE; current->restart_block.nanosleep.compat_rmtp = rmtp; return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC); } diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c index 0a426f4e3125..b4dddf74dd15 100644 --- a/kernel/time/posix-cpu-timers.c +++ b/kernel/time/posix-cpu-timers.c @@ -1243,6 +1243,8 @@ void set_process_cpu_timer(struct task_struct *tsk, unsigned int clock_idx, tick_dep_set_signal(tsk->signal, TICK_DEP_BIT_POSIX_TIMER); } +static long posix_cpu_nsleep_restart(struct restart_block *restart_block); + static int do_cpu_nanosleep(const clockid_t which_clock, int flags, const struct timespec64 *rqtp) { @@ -1330,6 +1332,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags, * Report back to the user the time still remaining. 
*/ restart = &current->restart_block; + restart->fn = posix_cpu_nsleep_restart; restart->nanosleep.expires = expires; if (restart->nanosleep.type != TT_NONE) error = nanosleep_copyout(restart, &it.it_value); @@ -1338,8 +1341,6 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags, return error; } -static long posix_cpu_nsleep_restart(struct restart_block *restart_block); - static int posix_cpu_nsleep(const clockid_t which_clock, int flags, const struct timespec64 *rqtp) { @@ -1361,7 +1362,6 @@ static int posix_cpu_nsleep(const clockid_t which_clock, int flags, if (flags & TIMER_ABSTIME) return -ERESTARTNOHAND; - restart_block->fn = posix_cpu_nsleep_restart; restart_block->nanosleep.clockid = which_clock; } return error; diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c index 67df65f887ac..d73039a9ca8f 100644 --- a/kernel/time/posix-stubs.c +++ b/kernel/time/posix-stubs.c @@ -142,8 +142,9 @@ SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags, return -EINVAL; if (flags & TIMER_ABSTIME) rmtp = NULL; - current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; - current->restart_block.nanosleep.rmtp = rmtp; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; + current->restart_block.nanosleep.rmtp = rmtp; return hrtimer_nanosleep(&t, flags & TIMER_ABSTIME ? HRTIMER_MODE_ABS : HRTIMER_MODE_REL, which_clock); @@ -228,7 +229,8 @@ SYSCALL_DEFINE4(clock_nanosleep_time32, clockid_t, which_clock, int, flags, return -EINVAL; if (flags & TIMER_ABSTIME) rmtp = NULL; - current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE; current->restart_block.nanosleep.compat_rmtp = rmtp; return hrtimer_nanosleep(&t, flags & TIMER_ABSTIME ? 
HRTIMER_MODE_ABS : HRTIMER_MODE_REL, diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index d7f2d91acdac..0ca0bfc20aff 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -1189,8 +1189,9 @@ SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags, return -EINVAL; if (flags & TIMER_ABSTIME) rmtp = NULL; - current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; - current->restart_block.nanosleep.rmtp = rmtp; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE; + current->restart_block.nanosleep.rmtp = rmtp; return kc->nsleep(which_clock, flags, &t); } @@ -1216,7 +1217,8 @@ SYSCALL_DEFINE4(clock_nanosleep_time32, clockid_t, which_clock, int, flags, return -EINVAL; if (flags & TIMER_ABSTIME) rmtp = NULL; - current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE; + current->restart_block.fn = do_no_restart_syscall; + current->restart_block.nanosleep.type = rmtp ? 
TT_COMPAT : TT_NONE; current->restart_block.nanosleep.compat_rmtp = rmtp; return kc->nsleep(which_clock, flags, &t); From patchwork Mon Sep 9 10:23:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137653 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 976AF76 for ; Mon, 9 Sep 2019 10:24:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 752D921920 for ; Mon, 9 Sep 2019 10:24:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="XMwbu7B5" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390812AbfIIKYY (ORCPT ); Mon, 9 Sep 2019 06:24:24 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:39835 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390597AbfIIKXt (ORCPT ); Mon, 9 Sep 2019 06:23:49 -0400 Received: by mail-wm1-f68.google.com with SMTP id q12so13994093wmj.4 for ; Mon, 09 Sep 2019 03:23:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=STLctuURefaE8eq55GBboHVj7+QXpVRUnAJsIE8QlMY=; b=XMwbu7B5OAgAIWw6N2LtryeCTvQ6z3+05hNQ741oPpGzrgd7tXuAkkU5zdc3X/eqlD yiG4Gmi4sTjgvThu5RQsAALJWUCRUekt+KXZnGUx3ELzgvvgcr2QaY/r7DGaasyT2tDw 5qTGdnitBh3WFpeDyGDVP8LHsDpqHI+/GRyiiFZq0Li1sjubvjRX5RiioqxW1bJiEnXH Uy0OVpQKqUjqjQ31Me19gCjqn13KYbhrB1YPo2wHuo24N28g+cKTvYI7Zc1vvUGsAWmu QF1hzkYFCwmTBiO9omikKuQXD539nG96dOaFw4+JkHg//ePtrRAiF98KfEWUyoFnOKaG KmLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
    h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
    :references:mime-version:content-transfer-encoding;
    bh=STLctuURefaE8eq55GBboHVj7+QXpVRUnAJsIE8QlMY=;
    b=fP3JCmgwXxunDjvlubbnJ9kCbcQC1IvMWQJmePHloymGXVlQOr3QY6C3mH2zZHSTfN
    lB+6Cby8a0lXUgk8HLrKh8LJ6wzz9pDbb5QzOOQanAJzuOHDx97GyvPCI6NxqEmPCPC5
    AcSBG2/4VDfNRhgYqsvLdREjPHVUrGK3Y2vPwkYPhnKifqPjB+wdkRjlFhSDAGocoybG
    koNqHfIZ83ecl4kxzhxPdJe+EjjFYiFbTz4vW216PelH0hMa5caq0Ml5jH90SlFmx4mA
    bfPyuX00rRfOvNQP1E8rQtiz39ZwLuZNphwcJiAO6e0Awd3D2On7uwqtuEQFSYRtXKgy
    tUaw==
X-Gm-Message-State: APjAAAXohVens11n4jafNkvT0xqu5+tPCSOKrNXagq7IqOJPv9qTBOyI
    S/2cD2Dw4DVTLLeLf/F+Ing6SA==
X-Google-Smtp-Source: APXvYqwlnYxK/Ion+EpCVZ4VcHAZvyCrtmY/DDiUgd2oheMGu7g2FrmmWJ0F4QyACrXJgUyTC44ndA==
X-Received: by 2002:a7b:c651:: with SMTP id q17mr17777639wmk.13.1568024627727;
    Mon, 09 Sep 2019 03:23:47 -0700 (PDT)
Received: from Mindolluin.localdomain ([148.69.85.38])
    by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.46
    (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
    Mon, 09 Sep 2019 03:23:47 -0700 (PDT)
From: Dmitry Safonov
To: linux-kernel@vger.kernel.org
Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov, Adrian Reber,
    Alexander Viro, Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov,
    Ingo Molnar, Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner,
    containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH 3/9] select: Convert __estimate_accuracy() to ktime_t
Date: Mon, 9 Sep 2019 11:23:34 +0100
Message-Id: <20190909102340.8592-4-dima@arista.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190909102340.8592-1-dima@arista.com>
References: <20190909102340.8592-1-dima@arista.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

__estimate_accuracy() divides 64-bit integers twice, which is
suboptimal. Converting to ktime_t not only avoids that, but also
simplifies the logic to some extent.
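
For illustration only, a minimal userspace sketch of the two shapes of
the calculation. It assumes a hypothetical struct ts64 in place of
struct timespec64, a plain int64_t nanosecond count in place of ktime_t,
an ordinary division where the patch uses ktime_divns(), and divfactor
passed in as a parameter; the helper names are made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000LL
#define NSEC_PER_MSEC	1000000LL
#define MAX_SLACK	(100 * NSEC_PER_MSEC)

static_assert(MAX_SLACK == 100000000LL, "MAX_SLACK is 100 ms in nanoseconds");

/* Hypothetical userspace stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long tv_nsec;
};

/* Old shape: seconds and nanoseconds are handled separately. */
static int64_t slack_from_timespec(const struct ts64 *tv, int divfactor)
{
	int64_t slack;

	if (tv->tv_sec < 0)
		return 0;
	/* Early exit so that the multiplication below cannot get large. */
	if (tv->tv_sec > MAX_SLACK / (NSEC_PER_SEC / divfactor))
		return MAX_SLACK;

	slack = tv->tv_nsec / divfactor;
	slack += tv->tv_sec * (NSEC_PER_SEC / divfactor);
	return slack > MAX_SLACK ? MAX_SLACK : slack;
}

/* New shape: one signed 64-bit nanosecond count, one division. */
static int64_t slack_from_ns(int64_t ns, int divfactor)
{
	int64_t slack;

	if (ns < 0)
		return 0;
	slack = ns / divfactor;
	return slack > MAX_SLACK ? MAX_SLACK : slack;
}
```

For a remaining time of 2.5 s and a divfactor of 1000, both helpers
return 2500000 ns; for 200 s both clamp to MAX_SLACK.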

The long-term goal is to convert poll() to leave the timeout value in
ktime_t inside restart_block, as it's the only user that leaves it in
timespec. That prepares the ground for introducing a new ptrace()
request that will dump the timeout of an interrupted syscall.

Furthermore, do_select() and do_poll() both actually need the time in
ktime_t for poll_schedule_timeout(), so there is a hack that converts
the time on the first loop iteration. Besides being a hack, the
conversion is repeated every time the poll() syscall is restarted.
It'll be removed after the conversion.

While at it, rename the parameters to "slack" and "timeout", which
describe their purpose better.

Signed-off-by: Dmitry Safonov
---
 fs/select.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 53a0c149f528..12cdefd3be2d 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -36,7 +36,7 @@
 
 
 /*
- * Estimate expected accuracy in ns from a timeval.
+ * Estimate expected accuracy in ns.
  *
  * After quite a bit of churning around, we've settled on
  * a simple thing of taking 0.1% of the timeout as the
@@ -49,22 +49,17 @@
 
 #define MAX_SLACK (100 * NSEC_PER_MSEC)
 
-static long __estimate_accuracy(struct timespec64 *tv)
+static long __estimate_accuracy(ktime_t slack)
 {
-	long slack;
 	int divfactor = 1000;
 
-	if (tv->tv_sec < 0)
+	if (slack < 0)
 		return 0;
 
 	if (task_nice(current) > 0)
 		divfactor = divfactor / 5;
 
-	if (tv->tv_sec > MAX_SLACK / (NSEC_PER_SEC/divfactor))
-		return MAX_SLACK;
-
-	slack = tv->tv_nsec / divfactor;
-	slack += tv->tv_sec * (NSEC_PER_SEC/divfactor);
+	slack = ktime_divns(slack, divfactor);
 
 	if (slack > MAX_SLACK)
 		return MAX_SLACK;
@@ -72,27 +67,25 @@ static long __estimate_accuracy(struct timespec64 *tv)
 	return slack;
 }
 
-u64 select_estimate_accuracy(struct timespec64 *tv)
+u64 select_estimate_accuracy(struct timespec64 *timeout)
 {
-	u64 ret;
-	struct timespec64 now;
+	ktime_t now, slack;
 
 	/*
 	 * Realtime tasks get a slack of 0 for obvious reasons.
*/ - if (rt_task(current)) return 0; - ktime_get_ts64(&now); - now = timespec64_sub(*tv, now); - ret = __estimate_accuracy(&now); - if (ret < current->timer_slack_ns) - return current->timer_slack_ns; - return ret; -} + now = ktime_get(); + slack = now - timespec64_to_ktime(*timeout); + slack = __estimate_accuracy(slack); + if (slack < current->timer_slack_ns) + return current->timer_slack_ns; + return slack; +} struct poll_table_page { struct poll_table_page * next; From patchwork Mon Sep 9 10:23:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137651 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0221F76 for ; Mon, 9 Sep 2019 10:24:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D496A21D6C for ; Mon, 9 Sep 2019 10:24:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="H2AxMfh6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390810AbfIIKYX (ORCPT ); Mon, 9 Sep 2019 06:24:23 -0400 Received: from mail-wm1-f65.google.com ([209.85.128.65]:54315 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390614AbfIIKXu (ORCPT ); Mon, 9 Sep 2019 06:23:50 -0400 Received: by mail-wm1-f65.google.com with SMTP id p7so1582750wmp.4 for ; Mon, 09 Sep 2019 03:23:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=In6xjIv1qvZx6AHWSp+pYmS8Y+YcRx5T9VqtwgDDOao=; b=H2AxMfh6lZl28VuEpAig+Ks6VvS2c/2mZ96FjkKszabA6MyJP9kHbQ1KKprSKyFSmJ mN4OqbigiTeZ3SFvyoAWcicUp+ltfyiVQvI/5E+Fx8g5Lq1/q6CYklfhsHkBRI1QY/MU 
dETvL5Cke5dXWfEWU3vBj8E8Ij/JfReNOs3Yy/sxO8XIGlENDWstC5iB0Ttzggciw8CK qQ3SjJQjkpCNt+9wSAjQytMKAwbxIbfFWkxX+TPYPFag194n25iWz50I4ChcDlZEEPuv KMYnXYd3ae0OtxC/b7wnIClKjO4keCcC049HzFNRzUbrsnSwkpz05DTvgEmahBkZQunG FCRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=In6xjIv1qvZx6AHWSp+pYmS8Y+YcRx5T9VqtwgDDOao=; b=kmqrUZURFP0gD/1DTA5N2+qLiNiXpTcM3650xIdLUyF3FMm6Fki+Oy9RhQHmZAIuky Bndfql1ujQyEKayf2GP46TGJHmICeN4amoZQkPMTmbt3CFpWK2rYN0wa/unFb4mwlwnf ZSOkBuPDgdoSJGhznP3ClVcB6gr9V+nJTkQR4RScoOX1V4mksTuKBRrB+9NVIWfycZXg m2vidZ5AEV2/t/cBZMybLI5Ng5+rhB/JMQIIjjtmSNk4ruHNPydLBonJANgccJHZ/HU+ hJi+AD9q4a4+LjSa9ML5KIcRMDbXDb2UHcIU4xuRPSqM1KxfImwzrCaOas95TGeun0Dj s5Gg== X-Gm-Message-State: APjAAAXyQGT/pN2lUFJoDt53Ivgt5hBEvra9pW0kWhnez07kyQ5axELB dohIGmLQltt2LfKt9cVdA1K8pg== X-Google-Smtp-Source: APXvYqwodAb2hSbu1oBC3qnGuQjKRAYFo/bkD32ISS9HPSjmr8FMWLHhger1WxXkvavqQkINxe6skw== X-Received: by 2002:a7b:c5ca:: with SMTP id n10mr18393828wmk.138.1568024629199; Mon, 09 Sep 2019 03:23:49 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:48 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Date: Mon, 9 Sep 2019 11:23:35 +0100 Message-Id: <20190909102340.8592-5-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: 
    <20190909102340.8592-1-dima@arista.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

A shift on s64 is faster than a division, so use it instead.

The patch has a barely user-visible effect: poll(), select(), etc.
syscalls will be about 2.3% more precise than before (the slack is now
the remaining time divided by 1024 rather than by 1000), because
1000 != 1024 :)

Signed-off-by: Dmitry Safonov
---
 fs/select.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 12cdefd3be2d..2477c202631e 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -51,15 +51,14 @@
 
 static long __estimate_accuracy(ktime_t slack)
 {
-	int divfactor = 1000;
-
 	if (slack < 0)
 		return 0;
 
-	if (task_nice(current) > 0)
-		divfactor = divfactor / 5;
+	/* A bit more precise than 0.1% */
+	slack = slack >> 10;
 
-	slack = ktime_divns(slack, divfactor);
+	if (task_nice(current) > 0)
+		slack = slack * 5;
 
 	if (slack > MAX_SLACK)
 		return MAX_SLACK;

From patchwork Mon Sep 9 10:23:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitry Safonov
X-Patchwork-Id: 11137645
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
    by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 94F2076
    for ; Mon, 9 Sep 2019 10:24:11 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
    by mail.kernel.org (Postfix) with ESMTP id 72FCA21D79
    for ; Mon, 9 Sep 2019 10:24:11 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key)
    header.d=arista.com header.i=@arista.com header.b="DtU4lyMQ"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S2390676AbfIIKX4 (ORCPT ); Mon, 9 Sep 2019 06:23:56 -0400
Received: from mail-wm1-f67.google.com ([209.85.128.67]:50352 "EHLO
    mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S2390548AbfIIKXw (ORCPT
); Mon, 9 Sep 2019 06:23:52 -0400 Received: by mail-wm1-f67.google.com with SMTP id c10so13188530wmc.0 for ; Mon, 09 Sep 2019 03:23:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=b96u2DZEmN+6gC1egZA7UNHltszHG2vFMnr/ogVBH54=; b=DtU4lyMQhHwYVonGt7Yv5uvfdIscs6YoQ3ePua1nQ369s7idHb74TmeY4LLq8tTUoa YwqpJtwSXQIRTw9hCyL7LKKcMj7NxFnQzRtg6sBswQQuY1Agwsf7KTIjs2rK3amt8bVQ gXghtAAfetLzaL7jJGHR12sst2D9XvDTt0yqS4ubKamuocD8FEP2mGTmygkR466fduA4 th7EJO+7oL22XpuutN4/zN+JOixNa8VUDIiZGDiN6SshhI+9CUN+rRS44HA459UJCq4V VhP4hEb3hLlCXHexrUzbnbfBsUyERMUwQEO1XmzK3VL6+qMiXRcaONnxP9Jj9lma8Psx mQSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b96u2DZEmN+6gC1egZA7UNHltszHG2vFMnr/ogVBH54=; b=g2VrlzJ1eTTaXW7k0YfNbdKAzyQRNbvChBLyzM17ChaAXBXoAJMS+Qm+fwVRZq/JPS kjP2aX7QaJSqNaj/eIjtiCT1tIssx8YDJe1LS1u7JW+uzMY8rn38pNH894YkuORKvoWd W5UJR98ErcUO9jJ5cG0IKeporFVnTrY4TYqZuboLxu0ZRcNtQy5EJ/urvfqdVZUNv8Gn XfjtFUYs1MviJLfi55KvgLdfCc3MJV93c7dDM8kfojFJs0KOF81YGfkV6ofHOOJBmb2r 8knS8qhquOoa4dafoLzVZxF+YMLvVsV2ZInGCpl2os0UgSvdoEOYc4WxjL5zlw34nrMN Zhcw== X-Gm-Message-State: APjAAAU31Mo5eZfdVthFh0nyiG+kKPtp3duEIHl+yigdj7ej+BCFkOAH MlTmxomdzEbCOUng4C9UkOJpFA== X-Google-Smtp-Source: APXvYqwY+peEV0qNHvT12MvBJHBW1QbIn5NYjERIgrYxQ5I61p7PI7+Sb3g9dDm4YmLNnuuWbNR2gg== X-Received: by 2002:a1c:cb07:: with SMTP id b7mr19044419wmg.41.1568024630617; Mon, 09 Sep 2019 03:23:50 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:50 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry 
    Safonov, Adrian Reber, Alexander Viro, Andrei Vagin, Andy Lutomirski,
    Cyrill Gorcunov, Ingo Molnar, Oleg Nesterov, Pavel Emelyanov,
    Thomas Gleixner, containers@lists.linux-foundation.org,
    linux-fsdevel@vger.kernel.org
Subject: [PATCH 5/9] select: Convert select_estimate_accuracy() to take ktime_t
Date: Mon, 9 Sep 2019 11:23:36 +0100
Message-Id: <20190909102340.8592-6-dima@arista.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190909102340.8592-1-dima@arista.com>
References: <20190909102340.8592-1-dima@arista.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

Instead of converting the time on the first loop iteration, do it in
the existing if (end_time) block that already computes the slack. This
takes the time conversion out of the loop and simplifies it.

Also prepare the ground for converting the poll() restart_block timeout
into ktime_t: poll() is the only user that leaves it in timespec. The
conversion is needed to introduce an API for ptrace() to get a timeout
from restart_block.
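
For illustration only, a minimal userspace sketch of the slack estimate
once the timeout is a single nanosecond count. It is a model under
stated assumptions, not the kernel code: estimate_slack_ns() is a
hypothetical name; the current time, the nice value and the task's
timer slack are passed in as parameters instead of being read from
ktime_get() and current; the realtime-task shortcut is left out. The
remaining time is taken as the deadline minus the current time, the
direction the earlier timespec64_sub(*tv, now) call used.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC	1000000LL
#define MAX_SLACK	(100 * NSEC_PER_MSEC)

static_assert(MAX_SLACK == 100000000LL, "MAX_SLACK is 100 ms in nanoseconds");

/*
 * Hypothetical model: all inputs are explicit so that the arithmetic
 * can be checked outside the kernel.
 */
static uint64_t estimate_slack_ns(int64_t timeout_ns, int64_t now_ns,
				  int nice, uint64_t timer_slack_ns)
{
	/* Time left until the deadline. */
	int64_t slack = timeout_ns - now_ns;

	if (slack < 0) {
		slack = 0;
	} else {
		slack >>= 10;		/* divide by 1024, roughly 0.1% */
		if (nice > 0)
			slack *= 5;	/* niced tasks get five times the slack */
		if (slack > MAX_SLACK)
			slack = MAX_SLACK;
	}

	/* Never report less than the task's own timer slack. */
	if ((uint64_t)slack < timer_slack_ns)
		return timer_slack_ns;
	return (uint64_t)slack;
}
```

With 1024000000 ns remaining, the sketch yields 1000000 ns for a task
with nice 0 and 5000000 ns for a niced task; dividing by 1000 instead
would give 1024000 ns in the first case.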
Signed-off-by: Dmitry Safonov --- fs/eventpoll.c | 4 ++-- fs/select.c | 38 ++++++++++++-------------------------- include/linux/poll.h | 2 +- 3 files changed, 15 insertions(+), 29 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index d7f1f5011fac..d5120fc49a39 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1836,9 +1836,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, if (timeout > 0) { struct timespec64 end_time = ep_set_mstimeout(timeout); - slack = select_estimate_accuracy(&end_time); + expires = timespec64_to_ktime(end_time); to = &expires; - *to = timespec64_to_ktime(end_time); + slack = select_estimate_accuracy(expires); } else if (timeout == 0) { /* * Avoid the unnecessary trip to the wait queue loop, if the diff --git a/fs/select.c b/fs/select.c index 2477c202631e..458f2a944318 100644 --- a/fs/select.c +++ b/fs/select.c @@ -66,7 +66,7 @@ static long __estimate_accuracy(ktime_t slack) return slack; } -u64 select_estimate_accuracy(struct timespec64 *timeout) +u64 select_estimate_accuracy(ktime_t timeout) { ktime_t now, slack; @@ -77,7 +77,7 @@ u64 select_estimate_accuracy(struct timespec64 *timeout) return 0; now = ktime_get(); - slack = now - timespec64_to_ktime(*timeout); + slack = now - timeout; slack = __estimate_accuracy(slack); if (slack < current->timer_slack_ns) @@ -490,8 +490,11 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time) timed_out = 1; } - if (end_time && !timed_out) - slack = select_estimate_accuracy(end_time); + if (end_time && !timed_out) { + expire = timespec64_to_ktime(*end_time); + to = &expire; + slack = select_estimate_accuracy(expire); + } retval = 0; for (;;) { @@ -582,16 +585,6 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time) } busy_flag = 0; - /* - * If this is the first loop and we have a timeout - * given, then we convert to ktime_t and set the to - * pointer to the expiry value. 
- */ - if (end_time && !to) { - expire = timespec64_to_ktime(*end_time); - to = &expire; - } - if (!poll_schedule_timeout(&table, TASK_INTERRUPTIBLE, to, slack)) timed_out = 1; @@ -876,8 +869,11 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait, timed_out = 1; } - if (end_time && !timed_out) - slack = select_estimate_accuracy(end_time); + if (end_time && !timed_out) { + expire = timespec64_to_ktime(*end_time); + to = &expire; + slack = select_estimate_accuracy(expire); + } for (;;) { struct poll_list *walk; @@ -930,16 +926,6 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait, } busy_flag = 0; - /* - * If this is the first loop and we have a timeout - * given, then we convert to ktime_t and set the to - * pointer to the expiry value. - */ - if (end_time && !to) { - expire = timespec64_to_ktime(*end_time); - to = &expire; - } - if (!poll_schedule_timeout(wait, TASK_INTERRUPTIBLE, to, slack)) timed_out = 1; } diff --git a/include/linux/poll.h b/include/linux/poll.h index 1cdc32b1f1b0..d0f21eb19257 100644 --- a/include/linux/poll.h +++ b/include/linux/poll.h @@ -112,7 +112,7 @@ struct poll_wqueues { extern void poll_initwait(struct poll_wqueues *pwq); extern void poll_freewait(struct poll_wqueues *pwq); -extern u64 select_estimate_accuracy(struct timespec64 *tv); +extern u64 select_estimate_accuracy(ktime_t timeout); #define MAX_INT64_SECONDS (((s64)(~((u64)0)>>1)/HZ)-1) From patchwork Mon Sep 9 10:23:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137647 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5415D14DB for ; Mon, 9 Sep 2019 10:24:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 27FC821920 for ; Mon, 9 Sep 2019 10:24:16 
+0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="lJfo7jVn" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390668AbfIIKXz (ORCPT ); Mon, 9 Sep 2019 06:23:55 -0400 Received: from mail-wm1-f67.google.com ([209.85.128.67]:53247 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390651AbfIIKXx (ORCPT ); Mon, 9 Sep 2019 06:23:53 -0400 Received: by mail-wm1-f67.google.com with SMTP id t17so13162057wmi.2 for ; Mon, 09 Sep 2019 03:23:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=IfO/aFsv9csBvAqLmxPrLHozIhVbgmsUP4z+fLvYLKk=; b=lJfo7jVni8HmxsSWeAG1c3uqO1KeWoVPi/W0TZ0bzK7R9ne6ihwnEbCY4QEV66GRyz /oklInIRK/cLDGhHPPMJjgG8aUlCQcMM4I4RXFMzEtrIjz7JjHpFdU5vr4NYtNmQ5r3j RsPiEgCJIUX6Nw7HnJSqmhNg0pMNlR9drvx9g1FbFs1RROI9s/VCJjS9D+NiaqW/phwJ XoTRJy47dlX018UraTK5GAPPLY7dy4rnAGZqip2nB5hJe1qiYoarfrmZ23hKYdqNJftT X4wAzBecTiPRZglrkXmWf7kMTeYaFVzXp7RQtW6n/L3UgAmuHhxjTTeihM7H70Snb5lJ QEqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IfO/aFsv9csBvAqLmxPrLHozIhVbgmsUP4z+fLvYLKk=; b=QwDDTcizW0/iah00bKdnm8YrRu6No0Ly+yX7G1qKmyMZHLiVXWLSse8KOB/wU0UaoI 5uH1nPaeF1jYS5QjhMziEK8dFOf1tma6cAb0U4pram+3R00huskkNyIiS8Mqp+/a0DJ5 laLERUM3xHQjJvZsLCn5xyCWBi1rDHWXA0PJV9cFphex0mcvn0xomAbVKlK/Y5VoIhvd 4A7d10GfnBBJhr7n3uqXYJZELEQTy/liOJckVCh5nXuccXIEaSqA0sj4KbFcMhamiFjg 6wA6XItTwL6zki1A8UiSgYG2jVBUmyD/z2alsezP+F48pSFYVdqiVW0qOW55+50GH6hx 2H1g== X-Gm-Message-State: APjAAAUmzgpmuWqspJipI3FAjfSEv0b/phi3i4xlP/V9siTB/2H5IoLH zktu+NZB+gCpfVu3GkiVGQl6pQ== X-Google-Smtp-Source: APXvYqxhTzZ1Fh8C7DgSzqRlLzBHLQAbXCfmJHK6OASTOE6O9Taf1A+yq/zvODVupkbxdtsIx3tznQ== 
X-Received: by 2002:a1c:7513:: with SMTP id o19mr17677294wmc.126.1568024632076; Mon, 09 Sep 2019 03:23:52 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:51 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 6/9] select: Extract common code into do_sys_ppoll() Date: Mon, 9 Sep 2019 11:23:37 +0100 Message-Id: <20190909102340.8592-7-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: <20190909102340.8592-1-dima@arista.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Reduce the amount of code and shrink the .text section a bit: [linux]$ ./scripts/bloat-o-meter -t /tmp/vmlinux.o.{old,new} add/remove: 1/0 grow/shrink: 0/4 up/down: 284/-691 (-407) Function old new delta do_sys_ppoll - 284 +284 __x64_sys_ppoll 214 42 -172 __ia32_sys_ppoll 213 40 -173 __ia32_compat_sys_ppoll_time64 213 40 -173 __ia32_compat_sys_ppoll_time32 213 40 -173 Total: Before=13357557, After=13357150, chg -0.00% The downside is that the "tsp" and "sigmask" parameters become (void *), but losing static type checking seems worth it when each syscall definition shrinks to a single line. Another way would be to add compat parameters to do_sys_ppoll(), but that trashes 2 more registers.
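As a rough user-space illustration of the trade-off described above (not kernel code): one helper takes the timeout as an untyped pointer and a type tag selects how to decode it. The names ts64, ts32, ts_kind and read_timeout() are made up for this sketch; only the idea of switching on a tag mirrors what do_sys_ppoll() does with pt_type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two user-visible timespec layouts. */
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };

/* The 32-bit layout is assumed to be two packed 4-byte fields. */
static_assert(sizeof(struct ts32) == 2 * sizeof(int32_t),
	      "no padding expected in struct ts32");

/* Hypothetical type tag, playing the role of enum poll_time_type. */
enum ts_kind { TS_KIND_64, TS_KIND_32 };

/*
 * Decode *src into *dst according to kind. Because src is untyped, the
 * compiler can no longer check that the caller passed the matching
 * layout; that is the static type checking the commit message gives up.
 * Returns 0 on success, -EINVAL for a NULL pointer or an unknown tag.
 */
int read_timeout(struct ts64 *dst, const void *src, enum ts_kind kind)
{
	if (!dst || !src)
		return -EINVAL;

	switch (kind) {
	case TS_KIND_64: {
		const struct ts64 *t = src;

		*dst = *t;
		return 0;
	}
	case TS_KIND_32: {
		const struct ts32 *t = src;

		dst->tv_sec = t->tv_sec;	/* widened to 64 bits */
		dst->tv_nsec = t->tv_nsec;
		return 0;
	}
	default:
		return -EINVAL;
	}
}
```

A caller passes the address of either layout together with the matching tag; passing the wrong tag compiles cleanly, which is exactly the cost being accepted.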
Signed-off-by: Dmitry Safonov --- fs/select.c | 94 ++++++++++++++++++----------------------------------- 1 file changed, 32 insertions(+), 62 deletions(-) diff --git a/fs/select.c b/fs/select.c index 458f2a944318..262300e58370 100644 --- a/fs/select.c +++ b/fs/select.c @@ -1056,54 +1056,58 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, return ret; } -SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds, - struct __kernel_timespec __user *, tsp, const sigset_t __user *, sigmask, - size_t, sigsetsize) +static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds, + void __user *tsp, const void __user *sigmask, + size_t sigsetsize, enum poll_time_type pt_type) { struct timespec64 ts, end_time, *to = NULL; int ret; if (tsp) { - if (get_timespec64(&ts, tsp)) - return -EFAULT; + switch (pt_type) { + case PT_TIMESPEC: + if (get_timespec64(&ts, tsp)) + return -EFAULT; + break; + case PT_OLD_TIMESPEC: + if (get_old_timespec32(&ts, tsp)) + return -EFAULT; + break; + default: + WARN_ON_ONCE(1); + return -ENOSYS; + } to = &end_time; if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) return -EINVAL; } - ret = set_user_sigmask(sigmask, sigsetsize); + if (!in_compat_syscall()) + ret = set_user_sigmask(sigmask, sigsetsize); + else + ret = set_compat_user_sigmask(sigmask, sigsetsize); + if (ret) return ret; ret = do_sys_poll(ufds, nfds, to); - return poll_select_finish(&end_time, tsp, PT_TIMESPEC, ret); + return poll_select_finish(&end_time, tsp, pt_type, ret); } -#if defined(CONFIG_COMPAT_32BIT_TIME) && !defined(CONFIG_64BIT) +SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds, + struct __kernel_timespec __user *, tsp, const sigset_t __user *, sigmask, + size_t, sigsetsize) +{ + return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_TIMESPEC); +} +#if defined(CONFIG_COMPAT_32BIT_TIME) && !defined(CONFIG_64BIT) SYSCALL_DEFINE5(ppoll_time32, struct pollfd __user *, ufds, unsigned int, nfds, 
struct old_timespec32 __user *, tsp, const sigset_t __user *, sigmask, size_t, sigsetsize) { - struct timespec64 ts, end_time, *to = NULL; - int ret; - - if (tsp) { - if (get_old_timespec32(&ts, tsp)) - return -EFAULT; - - to = &end_time; - if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) - return -EINVAL; - } - - ret = set_user_sigmask(sigmask, sigsetsize); - if (ret) - return ret; - - ret = do_sys_poll(ufds, nfds, to); - return poll_select_finish(&end_time, tsp, PT_OLD_TIMESPEC, ret); + return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_OLD_TIMESPEC); } #endif @@ -1352,24 +1356,7 @@ COMPAT_SYSCALL_DEFINE5(ppoll_time32, struct pollfd __user *, ufds, unsigned int, nfds, struct old_timespec32 __user *, tsp, const compat_sigset_t __user *, sigmask, compat_size_t, sigsetsize) { - struct timespec64 ts, end_time, *to = NULL; - int ret; - - if (tsp) { - if (get_old_timespec32(&ts, tsp)) - return -EFAULT; - - to = &end_time; - if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) - return -EINVAL; - } - - ret = set_compat_user_sigmask(sigmask, sigsetsize); - if (ret) - return ret; - - ret = do_sys_poll(ufds, nfds, to); - return poll_select_finish(&end_time, tsp, PT_OLD_TIMESPEC, ret); + return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_OLD_TIMESPEC); } #endif @@ -1378,24 +1365,7 @@ COMPAT_SYSCALL_DEFINE5(ppoll_time64, struct pollfd __user *, ufds, unsigned int, nfds, struct __kernel_timespec __user *, tsp, const compat_sigset_t __user *, sigmask, compat_size_t, sigsetsize) { - struct timespec64 ts, end_time, *to = NULL; - int ret; - - if (tsp) { - if (get_timespec64(&ts, tsp)) - return -EFAULT; - - to = &end_time; - if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) - return -EINVAL; - } - - ret = set_compat_user_sigmask(sigmask, sigsetsize); - if (ret) - return ret; - - ret = do_sys_poll(ufds, nfds, to); - return poll_select_finish(&end_time, tsp, PT_TIMESPEC, ret); + return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, 
PT_TIMESPEC); } #endif From patchwork Mon Sep 9 10:23:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137649 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3F6EC14DB for ; Mon, 9 Sep 2019 10:24:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1311921D79 for ; Mon, 9 Sep 2019 10:24:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="Fb/PZC7J" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390785AbfIIKYR (ORCPT ); Mon, 9 Sep 2019 06:24:17 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:38554 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390615AbfIIKXz (ORCPT ); Mon, 9 Sep 2019 06:23:55 -0400 Received: by mail-wm1-f68.google.com with SMTP id o184so13969388wme.3 for ; Mon, 09 Sep 2019 03:23:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=yu1wCZdYIYP1jJvk74CcBZj8+AFMfC/Evvd513Cb5Yg=; b=Fb/PZC7J5LnCdiTui3OYvoq5ZKTb0KInKov6ma1riHbkF7BSEr80N1Sof6/BZkmYBd oujqoQ9ATVhk8MujF+smYlsQEdOJxS/FxRBYkkd5iRgiNfUoSZrn0ZP3NtnOmly47Ijg GfoFBCjsBnidcfeejZKwNFneFR4jfSIwCC+ndnVgzl3mz0HhrcuzKdzYpT92oFxz+JVI bQYSvhA1EYI8GsCYuh6uufVHQ2YQQzJeha2PkjOGKtPGNxAnzImagdhYVHAsfwbo9Kgp /iiVNAF/n8D5AUMHwEh5Kse3uhCpYRvpPRujBc+bxCmKdWcX4GqKQBfH+ZsVpNPp8Kt0 cAZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=yu1wCZdYIYP1jJvk74CcBZj8+AFMfC/Evvd513Cb5Yg=; b=Qq29krpHLEECJwe98xJ/HlXTCMjUZ/LuAnj3bUYJ+VLQj90g/QVmc1BP8njgxYAFJ+ /5Awoyw1ZE5vO5TsC0I+vHKdbBAJjCvME2xoAPAEIgnqm+6B1rSg5LCd0zN/8uWT7UOc xLaph6k6FCGGMKF/Je8iCF3ForEyY2bWy5dtuMJ2eh9YxlFUDG0wWyJuNw8OsK777Vvl OLe2ASPrNNBZ3tUqIbSe6+GG9/27qDSr7rGeSPwmltGzpTwnlVOxX795rd2ZsFtZGh0q Iri3BdV/94xcMto3/ll8UNaC2ch5H9751/NcUMI7oNvErf+DH4OVgslejDaPKtRxbsmU a0pQ== X-Gm-Message-State: APjAAAWK8ANedLTn0NFGQ/Y1jIMkaLhuYZSeWNIvTJ41A3I9/8qYyTnP QfCQ0WxO+epf3AawuhluDrTyPQ== X-Google-Smtp-Source: APXvYqzLItSWebfZaJEphGQJBKCLGY7oWmQJ8incowXbvPbkOUINKisuMYaDsgYg+9c37v7q+MManw== X-Received: by 2002:a1c:be12:: with SMTP id o18mr8864792wmf.128.1568024633598; Mon, 09 Sep 2019 03:23:53 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:53 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 7/9] select: Use ktime_t in do_sys_poll() and do_poll() Date: Mon, 9 Sep 2019 11:23:38 +0100 Message-Id: <20190909102340.8592-8-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: <20190909102340.8592-1-dima@arista.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The plan is to store what's left of timeout in restart block as ktime_t which will be used for futex() and nanosleep() timeouts too. That will be a value to return with a new ptrace() request API. 
Convert end_time argument of do_{sys_,}poll() functions to ktime_t as a preparation ground for storing ktime_t inside restart_block. Signed-off-by: Dmitry Safonov --- fs/select.c | 47 +++++++++++++++++++++++------------------------ 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/fs/select.c b/fs/select.c index 262300e58370..4af88feaa2fe 100644 --- a/fs/select.c +++ b/fs/select.c @@ -854,25 +854,22 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait, } static int do_poll(struct poll_list *list, struct poll_wqueues *wait, - struct timespec64 *end_time) + ktime_t end_time) { poll_table* pt = &wait->pt; - ktime_t expire, *to = NULL; + ktime_t *to = NULL; int timed_out = 0, count = 0; u64 slack = 0; __poll_t busy_flag = net_busy_loop_on() ? POLL_BUSY_LOOP : 0; unsigned long busy_start = 0; /* Optimise the no-wait case */ - if (end_time && !end_time->tv_sec && !end_time->tv_nsec) { + if (ktime_compare(ktime_get(), end_time) >= 0) { pt->_qproc = NULL; timed_out = 1; - } - - if (end_time && !timed_out) { - expire = timespec64_to_ktime(*end_time); - to = &expire; - slack = select_estimate_accuracy(expire); + } else { + to = &end_time; + slack = select_estimate_accuracy(end_time); } for (;;) { @@ -936,7 +933,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait, sizeof(struct pollfd)) static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds, - struct timespec64 *end_time) + ktime_t end_time) { struct poll_wqueues table; int err = -EFAULT, fdcount, len; @@ -1004,16 +1001,15 @@ static long do_restart_poll(struct restart_block *restart_block) { struct pollfd __user *ufds = restart_block->poll.ufds; int nfds = restart_block->poll.nfds; - struct timespec64 *to = NULL, end_time; + ktime_t timeout = 0; int ret; if (restart_block->poll.has_timeout) { - end_time.tv_sec = restart_block->poll.tv_sec; - end_time.tv_nsec = restart_block->poll.tv_nsec; - to = &end_time; + timeout = 
ktime_set(restart_block->poll.tv_sec, + restart_block->poll.tv_nsec); } - ret = do_sys_poll(ufds, nfds, to); + ret = do_sys_poll(ufds, nfds, timeout); if (ret == -ERESTARTNOHAND) { restart_block->fn = do_restart_poll; @@ -1025,16 +1021,17 @@ static long do_restart_poll(struct restart_block *restart_block) SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, int, timeout_msecs) { - struct timespec64 end_time, *to = NULL; + struct timespec64 end_time; + ktime_t timeout = 0; int ret; if (timeout_msecs >= 0) { - to = &end_time; - poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC, + poll_select_set_timeout(&end_time, timeout_msecs / MSEC_PER_SEC, NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC)); + timeout = timespec64_to_ktime(end_time); } - ret = do_sys_poll(ufds, nfds, to); + ret = do_sys_poll(ufds, nfds, timeout); if (ret == -ERESTARTNOHAND) { struct restart_block *restart_block; @@ -1060,7 +1057,8 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds, void __user *tsp, const void __user *sigmask, size_t sigsetsize, enum poll_time_type pt_type) { - struct timespec64 ts, end_time, *to = NULL; + struct timespec64 ts, *to = NULL; + ktime_t timeout = 0; int ret; if (tsp) { @@ -1078,9 +1076,10 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds, return -ENOSYS; } - to = &end_time; - if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec)) + to = &ts; + if (poll_select_set_timeout(&ts, ts.tv_sec, ts.tv_nsec)) return -EINVAL; + timeout = timespec64_to_ktime(ts); } if (!in_compat_syscall()) @@ -1091,8 +1090,8 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds, if (ret) return ret; - ret = do_sys_poll(ufds, nfds, to); - return poll_select_finish(&end_time, tsp, pt_type, ret); + ret = do_sys_poll(ufds, nfds, timeout); + return poll_select_finish(to, tsp, pt_type, ret); } SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds, From patchwork Mon Sep 9 10:23:39 2019 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137643 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3C0014DB for ; Mon, 9 Sep 2019 10:24:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D261C21D79 for ; Mon, 9 Sep 2019 10:24:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="hdrHIIyj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390759AbfIIKYJ (ORCPT ); Mon, 9 Sep 2019 06:24:09 -0400 Received: from mail-wm1-f67.google.com ([209.85.128.67]:51768 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390681AbfIIKX5 (ORCPT ); Mon, 9 Sep 2019 06:23:57 -0400 Received: by mail-wm1-f67.google.com with SMTP id 7so3500761wme.1 for ; Mon, 09 Sep 2019 03:23:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FNRbbp8G8eTiBWYxO3vdxym/3OjnixNj+aS47nL+0BI=; b=hdrHIIyjAT+k8oXsgByuk1ULYIDO+Lqvl8Os8lSPNmqm+S5M/kr39Eu+i1bqKs0oqE GbyGOXmIcEeO7lx6L7xndSmFNg19q/AHrkqXIniqCq9qxVfhEa5MWCqCMP6aCSbmGOFY JI/eilxR5hDviGih9niYE5XoqIVGnwXZsTAwI2SeDmpFAijfmaRJIlmGFM9B5VOtST/N 8EJqVLfAj+4vORTmeH6VlvR5DN9zKPAqS4ZdYzpmdRI5HelsEGQcfB0teFgEQ9eI6D4D HYGMrzj/9sxZHkiVXR2V0dR0nlWucvdBRwj/BFZ7F88N7t09MdvMgmMxKNQcdCB2lG6E v7jQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FNRbbp8G8eTiBWYxO3vdxym/3OjnixNj+aS47nL+0BI=; 
b=e/yL/6m0YEBDkwVlJRbwkY7KuWzlclBz5tt1Jac0sfEHGaC3BB5p7iMscl2LNLQson fSXFgTAlRfYbQyPBvtR7YAxE4BaCwpGj82BBbBWYxFKVavQ/QxpydV1SkUuhv6TJmziH 0RFMyNUgMUwhz+/9bzmA18nuHZtpB9EPylFNJe+Vi9LuriP7Txeq8T1+gc2RmITi90SA fqvMAEHBCmOHoGqWcYeEFLiRNZ+JrWC1j7oOl3PmxCx7xA/HeGHy5OZr6XOCzYtqenRr ng+86yUsuFZ1lLialjEoDyGd7KjWVomAGcTuFrka+ANyD/DetvUnV84T7PweLnwoC+Zy Hgdw== X-Gm-Message-State: APjAAAX8j9vjTZYvACo3QkwoNXMQRXcobmlXtCPv4+c7+Z2x3t+Lelyy fVs+lq1A0psfBe2nTNZWx9fkbQ== X-Google-Smtp-Source: APXvYqxZzvWmASuH6w4UTwekMD7soXRGZno3Iw5SFEGT0uhY8x9yZWUYfyFpI1sm8te4R9c2x47VEA== X-Received: by 2002:a7b:c651:: with SMTP id q17mr17778107wmk.13.1568024634967; Mon, 09 Sep 2019 03:23:54 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:54 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 8/9] select/restart_block: Convert poll's timeout to u64 Date: Mon, 9 Sep 2019 11:23:39 +0100 Message-Id: <20190909102340.8592-9-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: <20190909102340.8592-1-dima@arista.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org All preparations have been done - now poll() can set a u64 timeout in restart_block. This allows the next step: unifying all timeouts in restart_block and providing a ptrace() API to read them.
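For readers following the arithmetic, here is a small user-space sketch of the conversion poll() performs in this patch: a relative timeout in milliseconds becomes an absolute nanosecond expiry, and the addition clamps rather than wraps. The helpers ns_add_safe() and expiry_from_ms() and the NS_MAX constant are invented for illustration; in the kernel the corresponding work is done by ktime_add_ms() and ktime_add_safe(), whose clamp value is the kernel's own and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_MS 1000000LL
#define NS_MAX    INT64_MAX	/* hypothetical clamp value for this sketch */

/* Add two non-negative nanosecond counts, clamping instead of wrapping. */
int64_t ns_add_safe(int64_t lhs, int64_t rhs)
{
	assert(lhs >= 0 && rhs >= 0);
	if (lhs > NS_MAX - rhs)
		return NS_MAX;
	return lhs + rhs;
}

/*
 * Turn a relative timeout in milliseconds into an absolute expiry,
 * given the current time in nanoseconds. A negative timeout_ms means
 * "no timeout" and is encoded as 0, as in this patch.
 */
int64_t expiry_from_ms(int64_t now_ns, int timeout_ms)
{
	if (timeout_ms < 0)
		return 0;
	return ns_add_safe(now_ns, (int64_t)timeout_ms * NS_PER_MS);
}
```

Storing the absolute expiry as a single 64-bit value is what lets the has_timeout, tv_sec and tv_nsec fields be dropped from the poll part of restart_block.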
Signed-off-by: Dmitry Safonov --- fs/select.c | 27 +++++++-------------------- include/linux/restart_block.h | 4 +--- 2 files changed, 8 insertions(+), 23 deletions(-) diff --git a/fs/select.c b/fs/select.c index 4af88feaa2fe..ff2b9c4865cd 100644 --- a/fs/select.c +++ b/fs/select.c @@ -1001,14 +1001,9 @@ static long do_restart_poll(struct restart_block *restart_block) { struct pollfd __user *ufds = restart_block->poll.ufds; int nfds = restart_block->poll.nfds; - ktime_t timeout = 0; + ktime_t timeout = restart_block->poll.timeout; int ret; - if (restart_block->poll.has_timeout) { - timeout = ktime_set(restart_block->poll.tv_sec, - restart_block->poll.tv_nsec); - } - ret = do_sys_poll(ufds, nfds, timeout); if (ret == -ERESTARTNOHAND) { @@ -1021,14 +1016,12 @@ static long do_restart_poll(struct restart_block *restart_block) SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, int, timeout_msecs) { - struct timespec64 end_time; ktime_t timeout = 0; int ret; if (timeout_msecs >= 0) { - poll_select_set_timeout(&end_time, timeout_msecs / MSEC_PER_SEC, - NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC)); - timeout = timespec64_to_ktime(end_time); + timeout = ktime_add_ms(0, timeout_msecs); + timeout = ktime_add_safe(ktime_get(), timeout); } ret = do_sys_poll(ufds, nfds, timeout); @@ -1037,16 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, struct restart_block *restart_block; restart_block = &current->restart_block; - restart_block->fn = do_restart_poll; - restart_block->poll.ufds = ufds; - restart_block->poll.nfds = nfds; - - if (timeout_msecs >= 0) { - restart_block->poll.tv_sec = end_time.tv_sec; - restart_block->poll.tv_nsec = end_time.tv_nsec; - restart_block->poll.has_timeout = 1; - } else - restart_block->poll.has_timeout = 0; + restart_block->fn = do_restart_poll; + restart_block->poll.ufds = ufds; + restart_block->poll.nfds = nfds; + restart_block->poll.timeout = timeout; ret = -ERESTART_RESTARTBLOCK; } diff --git
a/include/linux/restart_block.h b/include/linux/restart_block.h index e66e982105f4..63d647b65395 100644 --- a/include/linux/restart_block.h +++ b/include/linux/restart_block.h @@ -49,11 +49,9 @@ struct restart_block { } nanosleep; /* For poll */ struct { + u64 timeout; struct pollfd __user *ufds; int nfds; - int has_timeout; - unsigned long tv_sec; - unsigned long tv_nsec; } poll; }; }; From patchwork Mon Sep 9 10:23:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 11137641 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 99A371709 for ; Mon, 9 Sep 2019 10:24:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6D50C21920 for ; Mon, 9 Sep 2019 10:24:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=arista.com header.i=@arista.com header.b="KWVXJKkg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390738AbfIIKYB (ORCPT ); Mon, 9 Sep 2019 06:24:01 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:55852 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390692AbfIIKX7 (ORCPT ); Mon, 9 Sep 2019 06:23:59 -0400 Received: by mail-wm1-f68.google.com with SMTP id g207so13145029wmg.5 for ; Mon, 09 Sep 2019 03:23:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=googlenew; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3oTy+lkWbkFFfNbPOS2AOFsjB8RXGgtp9fv0BAsW7sw=; b=KWVXJKkgFOQhTfLznVaJozh5R/o/MAC5kqv+Qml3IcuGyQxjB7qRq685l5cBV05Ipo sgDx5/qcqGjKuGr9VE0lmZRjo/HjTZbkkbXFm/HX027RXGgT8qgkUTjbTsrWFaU78863 wzESbRXN+09ou7ZrrCkHTB5+c4ETLx6aCyp21N8Pjc66pFQFBWAMgaiwDL3tp77qaRTC 
LYyBTNH8KtaVoQ83sVMzOlD8AG50aQKGghkExi80zGT8ETXPvABPF/W2N78ihbUogsQy WLTY1RGj/p6SaQ17it8EMedrfkaaf3HfqKRE7kznmLO5rNb9bMj7AYrYWpRKlNzPSiQM +LqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3oTy+lkWbkFFfNbPOS2AOFsjB8RXGgtp9fv0BAsW7sw=; b=VqSalPB0k2rtHnp7vgstI36Ui0WZZ7GBcWidC/sdLETKT4U8MDkeUccm1ztLUfC1ss W3iEVKnGYq6gEuB4Rt7JPA4A4i9NRQZncvcAzUrgLeMglqOyLDblEVXxL+a73MRouQ8c /LaJtGZ1DknV17edhOQAVCVO/AOHMIIjk9/KSfW6B+FJTIdNIBndAGB04lbQeJG5TCsv ea28Tz5dOVsA2XyhkrnZnIEllTIIwhD6j5we5RYq2lEsVrxgMWbMisbeeQ/3z0CdVAHF rBTLpSCoufXVQEDgbWJLbMsI1A37528lXSWUxRdnWvG6Izg6IASQ2SjFjMoSyIOkBTzS lmgQ== X-Gm-Message-State: APjAAAU9tvdv1Bvh2WlS/qbWkk3OQ2t5TE88VJsWeXdYhz6r/rFUmics zhVvVClLW1EcbdZzdKwo4JM1EQ== X-Google-Smtp-Source: APXvYqwkEm9NakHd5yjJLZAfHiHfqUsqd4ZN6ARrN70g1pNUadQK+lUNEg6XugNBQoIbW5DPuRGGdw== X-Received: by 2002:a1c:1aca:: with SMTP id a193mr19207427wma.120.1568024636323; Mon, 09 Sep 2019 03:23:56 -0700 (PDT) Received: from Mindolluin.localdomain ([148.69.85.38]) by smtp.gmail.com with ESMTPSA id d14sm1800008wrj.27.2019.09.09.03.23.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Sep 2019 03:23:55 -0700 (PDT) From: Dmitry Safonov To: linux-kernel@vger.kernel.org Cc: Dmitry Safonov <0x7f454c46@gmail.com>, Dmitry Safonov , Adrian Reber , Alexander Viro , Andrei Vagin , Andy Lutomirski , Cyrill Gorcunov , Ingo Molnar , Oleg Nesterov , Pavel Emelyanov , Thomas Gleixner , containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 9/9] restart_block: Make common timeout Date: Mon, 9 Sep 2019 11:23:40 +0100 Message-Id: <20190909102340.8592-10-dima@arista.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190909102340.8592-1-dima@arista.com> References: <20190909102340.8592-1-dima@arista.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: 
bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org In order to provide a unified API to get what is left of a timeout, the timeouts of the different users of restart_block can be joined. All preparations are done, so move the timeout out of the union and convert the users. Signed-off-by: Dmitry Safonov --- fs/select.c | 10 +++++----- include/linux/restart_block.h | 4 +--- kernel/futex.c | 14 +++++++------- kernel/time/alarmtimer.c | 6 +++--- kernel/time/hrtimer.c | 6 +++--- kernel/time/posix-cpu-timers.c | 6 +++--- 6 files changed, 22 insertions(+), 24 deletions(-) diff --git a/fs/select.c b/fs/select.c index ff2b9c4865cd..9ab6fc6fb7c5 100644 --- a/fs/select.c +++ b/fs/select.c @@ -1001,7 +1001,7 @@ static long do_restart_poll(struct restart_block *restart_block) { struct pollfd __user *ufds = restart_block->poll.ufds; int nfds = restart_block->poll.nfds; - ktime_t timeout = restart_block->poll.timeout; + ktime_t timeout = restart_block->timeout; int ret; ret = do_sys_poll(ufds, nfds, timeout); @@ -1030,10 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds, struct restart_block *restart_block; restart_block = &current->restart_block; - restart_block->fn = do_restart_poll; - restart_block->poll.ufds = ufds; - restart_block->poll.nfds = nfds; - restart_block->poll.timeout = timeout; + restart_block->fn = do_restart_poll; + restart_block->poll.ufds = ufds; + restart_block->poll.nfds = nfds; + restart_block->timeout = timeout; ret = -ERESTART_RESTARTBLOCK; } diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h index 63d647b65395..02f90ab00a2d 100644 --- a/include/linux/restart_block.h +++ b/include/linux/restart_block.h @@ -27,6 +27,7 @@ enum timespec_type { * userspace tricks in the union.
*/ struct restart_block { + s64 timeout; long (*fn)(struct restart_block *); union { /* For futex_wait and futex_wait_requeue_pi */ @@ -35,7 +36,6 @@ struct restart_block { u32 val; u32 flags; u32 bitset; - u64 time; } futex; /* For nanosleep */ struct { @@ -45,11 +45,9 @@ struct restart_block { struct __kernel_timespec __user *rmtp; struct old_timespec32 __user *compat_rmtp; }; - u64 expires; } nanosleep; /* For poll */ struct { - u64 timeout; struct pollfd __user *ufds; int nfds; } poll; diff --git a/kernel/futex.c b/kernel/futex.c index 6d50728ef2e7..0738167e4911 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -2755,12 +2755,12 @@ static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, goto out; restart = &current->restart_block; - restart->fn = futex_wait_restart; - restart->futex.uaddr = uaddr; - restart->futex.val = val; - restart->futex.time = *abs_time; - restart->futex.bitset = bitset; - restart->futex.flags = flags | FLAGS_HAS_TIMEOUT; + restart->fn = futex_wait_restart; + restart->futex.uaddr = uaddr; + restart->futex.val = val; + restart->timeout = *abs_time; + restart->futex.bitset = bitset; + restart->futex.flags = flags | FLAGS_HAS_TIMEOUT; ret = -ERESTART_RESTARTBLOCK; @@ -2779,7 +2779,7 @@ static long futex_wait_restart(struct restart_block *restart) ktime_t t, *tp = NULL; if (restart->futex.flags & FLAGS_HAS_TIMEOUT) { - t = restart->futex.time; + t = restart->timeout; tp = &t; } restart->fn = do_no_restart_syscall; diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c index 57518efc3810..148b187c371e 100644 --- a/kernel/time/alarmtimer.c +++ b/kernel/time/alarmtimer.c @@ -763,7 +763,7 @@ alarm_init_on_stack(struct alarm *alarm, enum alarmtimer_type type, static long __sched alarm_timer_nsleep_restart(struct restart_block *restart) { enum alarmtimer_type type = restart->nanosleep.clockid; - ktime_t exp = restart->nanosleep.expires; + ktime_t exp = restart->timeout; struct alarm alarm; alarm_init_on_stack(&alarm, type,
alarmtimer_nsleep_wakeup); @@ -816,9 +816,9 @@ static int alarm_timer_nsleep(const clockid_t which_clock, int flags, if (flags == TIMER_ABSTIME) return -ERESTARTNOHAND; - restart->fn = alarm_timer_nsleep_restart; + restart->fn = alarm_timer_nsleep_restart; restart->nanosleep.clockid = type; - restart->nanosleep.expires = exp; + restart->timeout = exp; return ret; } diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 4ba2b50d068f..18d4b0cc919c 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -1709,7 +1709,7 @@ static long __sched hrtimer_nanosleep_restart(struct restart_block *restart) hrtimer_init_on_stack(&t.timer, restart->nanosleep.clockid, HRTIMER_MODE_ABS); - hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires); + hrtimer_set_expires_tv64(&t.timer, restart->timeout); ret = do_nanosleep(&t, HRTIMER_MODE_ABS); destroy_hrtimer_on_stack(&t.timer); @@ -1741,9 +1741,9 @@ long hrtimer_nanosleep(const struct timespec64 *rqtp, } restart = &current->restart_block; - restart->fn = hrtimer_nanosleep_restart; + restart->fn = hrtimer_nanosleep_restart; restart->nanosleep.clockid = t.timer.base->clockid; - restart->nanosleep.expires = hrtimer_get_expires_tv64(&t.timer); + restart->timeout = hrtimer_get_expires_tv64(&t.timer); out: destroy_hrtimer_on_stack(&t.timer); return ret; diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c index b4dddf74dd15..691de00107c2 100644 --- a/kernel/time/posix-cpu-timers.c +++ b/kernel/time/posix-cpu-timers.c @@ -1332,8 +1332,8 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags, * Report back to the user the time still remaining.
*/ restart = &current->restart_block; - restart->fn = posix_cpu_nsleep_restart; - restart->nanosleep.expires = expires; + restart->fn = posix_cpu_nsleep_restart; + restart->timeout = expires; if (restart->nanosleep.type != TT_NONE) error = nanosleep_copyout(restart, &it.it_value); } @@ -1372,7 +1372,7 @@ static long posix_cpu_nsleep_restart(struct restart_block *restart_block) clockid_t which_clock = restart_block->nanosleep.clockid; struct timespec64 t; - t = ns_to_timespec64(restart_block->nanosleep.expires); + t = ns_to_timespec64(restart_block->timeout); return do_cpu_nanosleep(which_clock, TIMER_ABSTIME, &t); }
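To summarise the layout this last patch ends up with, the following user-space model hoists the timeout out of the per-syscall union, so a reader can fetch it without knowing which syscall was interrupted. Everything here (struct restart_model, leftover_ns(), the reduced field lists) is a simplified assumption for illustration and is not the kernel definition; the ptrace() request mentioned in the series is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a restart block after the series (not the kernel struct). */
struct restart_model {
	int64_t timeout;	/* absolute expiry in ns, shared by all users */
	long (*fn)(struct restart_model *);
	union {
		struct { uint32_t val, flags, bitset; } futex;
		struct { int clockid; } nanosleep;
		struct { int nfds; } poll;
	};
};

/* The common field sits in front of the union, independent of its contents. */
static_assert(offsetof(struct restart_model, timeout) == 0,
	      "timeout is expected to be the first member");

/*
 * Time left until the stored expiry, given the current time; never
 * negative. Works the same way whichever union member is in use.
 * Both values are assumed to be non-negative nanosecond counts.
 */
int64_t leftover_ns(const struct restart_model *rb, int64_t now_ns)
{
	return rb->timeout > now_ns ? rb->timeout - now_ns : 0;
}
```

With the per-user copies (futex.time, nanosleep.expires, poll.timeout) gone, a single accessor like leftover_ns() is enough for every restartable syscall in this model.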