
[v11,07/16] sched: Split the guts of sched_setaffinity() into a helper function

Message ID 20210730112443.23245-8-will@kernel.org (mailing list archive)
State New, archived
Series Add support for 32-bit tasks on asymmetric AArch32 systems

Commit Message

Will Deacon July 30, 2021, 11:24 a.m. UTC
In preparation for replaying user affinity requests using a saved mask,
split sched_setaffinity() up so that the initial task lookup and
security checks are only performed when the request is coming directly
from userspace.

Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 kernel/sched/core.c | 105 ++++++++++++++++++++++++--------------------
 1 file changed, 57 insertions(+), 48 deletions(-)
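
For context, the "replaying" caller that motivates this split arrives later in
the series. Below is a minimal sketch of how a restore path could reuse
__sched_setaffinity() directly, skipping the task lookup and security checks;
everything here other than __sched_setaffinity() itself is an illustrative
assumption, not code from the later patches (locking of ->user_cpus_ptr is
elided):

/*
 * Illustrative sketch only: replay an affinity mask that already passed
 * the security checks when userspace first requested it, so only the
 * __sched_setaffinity() half of the split is needed.
 */
static void replay_user_affinity(struct task_struct *p)
{
	struct cpumask *user_mask = p->user_cpus_ptr;

	/* Nothing saved, or cleared by an intervening sched_setaffinity(). */
	if (!user_mask)
		return;

	/* Replay the mask the user originally requested... */
	if (__sched_setaffinity(p, user_mask))
		return;

	/* ...and drop the saved copy once it has been restored. */
	p->user_cpus_ptr = NULL;
	kfree(user_mask);
}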

Comments

Peter Zijlstra Aug. 17, 2021, 3:40 p.m. UTC | #1
On Fri, Jul 30, 2021 at 12:24:34PM +0100, Will Deacon wrote:
> In preparation for replaying user affinity requests using a saved mask,
> split sched_setaffinity() up so that the initial task lookup and
> security checks are only performed when the request is coming directly
> from userspace.
> 
> Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>

Shouldn't sched_setaffinity() update user_cpus_ptr when it isn't NULL,
such that the upcoming relax_compatible_cpus_allowed_ptr() preserves the
full user mask?
Will Deacon Aug. 18, 2021, 10:50 a.m. UTC | #2
On Tue, Aug 17, 2021 at 05:40:24PM +0200, Peter Zijlstra wrote:
> [...]
> 
> Shouldn't sched_setaffinity() update user_cpus_ptr when it isn't NULL,
> such that the upcoming relax_compatible_cpus_allowed_ptr() preserves the
> full user mask?

The idea is that force_compatible_cpus_allowed_ptr() and
relax_compatible_cpus_allowed_ptr() are used as a pair, with the former
setting ->user_cpus_ptr and the latter restoring it. An intervening call
to sched_setaffinity() must _clear_ the saved mask, as we discussed
before at:

https://lore.kernel.org/r/YK53kDtczHIYumDC@hirez.programming.kicks-ass.net

Will
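
To make the pairing concrete, here is a sketch of the intended lifecycle. The
force/relax helpers are from this series, but the caller below and its details
are illustrative assumptions rather than code from the patches:

/*
 * Illustrative caller showing the intended pairing of the save/restore
 * helpers around a section where the task must run on a restricted set
 * of CPUs (e.g. the 32-bit-capable CPUs of an asymmetric AArch32 system).
 */
static void run_restricted_sketch(struct task_struct *p)
{
	/* Saves p->user_cpus_ptr and restricts affinity to compatible CPUs. */
	force_compatible_cpus_allowed_ptr(p);

	/* ... task executes with the restricted mask ... */

	/*
	 * Restores the saved mask. If sched_setaffinity() ran in between,
	 * the saved mask was cleared and this is a no-op: the explicit
	 * user request wins over the stale saved mask.
	 */
	relax_compatible_cpus_allowed_ptr(p);
}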
Peter Zijlstra Aug. 18, 2021, 10:56 a.m. UTC | #3
On Wed, Aug 18, 2021 at 11:50:30AM +0100, Will Deacon wrote:
> [...]
> 
> The idea is that force_compatible_cpus_allowed_ptr() and
> relax_compatible_cpus_allowed_ptr() are used as a pair, with the former
> setting ->user_cpus_ptr and the latter restoring it. An intervening call
> to sched_setaffinity() must _clear_ the saved mask, as we discussed
> before at:
> 
> https://lore.kernel.org/r/YK53kDtczHIYumDC@hirez.programming.kicks-ass.net

Clearly that deserves a comment somewhere, because I keep trying to make
it more consistent than it can be :/ I'll see if I can find a spot.
Will Deacon Aug. 18, 2021, 11:11 a.m. UTC | #4
On Wed, Aug 18, 2021 at 12:56:24PM +0200, Peter Zijlstra wrote:
> [...]
> 
> Clearly that deserves a comment somewhere, because I keep trying to make
> it more consistent than it can be :/ I'll see if I can find a spot.

Agreed. The relax/force functions are already commented, so maybe alongside
SCA_USER?

Will
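
For illustration, such a comment might sit next to the flag definition in
kernel/sched/sched.h; the wording below (and the flag value shown) is a
sketch, not an actual hunk from the series:

/*
 * SCA_USER: the affinity request comes directly from userspace. An
 * explicit user request supersedes any mask saved by
 * force_compatible_cpus_allowed_ptr(), so ->user_cpus_ptr must be
 * cleared rather than updated; otherwise a later
 * relax_compatible_cpus_allowed_ptr() could override the user's choice.
 */
#define SCA_USER	0x08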

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a139ed8be7e3..d4219d366103 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7578,53 +7578,22 @@  SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	return retval;
 }
 
-long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+static int
+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
 {
-	cpumask_var_t cpus_allowed, new_mask;
-	struct task_struct *p;
 	int retval;
+	cpumask_var_t cpus_allowed, new_mask;
 
-	rcu_read_lock();
-
-	p = find_process_by_pid(pid);
-	if (!p) {
-		rcu_read_unlock();
-		return -ESRCH;
-	}
-
-	/* Prevent p going away */
-	get_task_struct(p);
-	rcu_read_unlock();
+	if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
+		return -ENOMEM;
 
-	if (p->flags & PF_NO_SETAFFINITY) {
-		retval = -EINVAL;
-		goto out_put_task;
-	}
-	if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL)) {
-		retval = -ENOMEM;
-		goto out_put_task;
-	}
 	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
 		retval = -ENOMEM;
 		goto out_free_cpus_allowed;
 	}
-	retval = -EPERM;
-	if (!check_same_owner(p)) {
-		rcu_read_lock();
-		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
-			rcu_read_unlock();
-			goto out_free_new_mask;
-		}
-		rcu_read_unlock();
-	}
-
-	retval = security_task_setscheduler(p);
-	if (retval)
-		goto out_free_new_mask;
-
 
 	cpuset_cpus_allowed(p, cpus_allowed);
-	cpumask_and(new_mask, in_mask, cpus_allowed);
+	cpumask_and(new_mask, mask, cpus_allowed);
 
 	/*
 	 * Since bandwidth control happens on root_domain basis,
@@ -7645,23 +7614,63 @@  long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 #endif
 again:
 	retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK);
+	if (retval)
+		goto out_free_new_mask;
 
-	if (!retval) {
-		cpuset_cpus_allowed(p, cpus_allowed);
-		if (!cpumask_subset(new_mask, cpus_allowed)) {
-			/*
-			 * We must have raced with a concurrent cpuset
-			 * update. Just reset the cpus_allowed to the
-			 * cpuset's cpus_allowed
-			 */
-			cpumask_copy(new_mask, cpus_allowed);
-			goto again;
-		}
+	cpuset_cpus_allowed(p, cpus_allowed);
+	if (!cpumask_subset(new_mask, cpus_allowed)) {
+		/*
+		 * We must have raced with a concurrent cpuset update.
+		 * Just reset the cpumask to the cpuset's cpus_allowed.
+		 */
+		cpumask_copy(new_mask, cpus_allowed);
+		goto again;
 	}
+
 out_free_new_mask:
 	free_cpumask_var(new_mask);
 out_free_cpus_allowed:
 	free_cpumask_var(cpus_allowed);
+	return retval;
+}
+
+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+{
+	struct task_struct *p;
+	int retval;
+
+	rcu_read_lock();
+
+	p = find_process_by_pid(pid);
+	if (!p) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+
+	/* Prevent p going away */
+	get_task_struct(p);
+	rcu_read_unlock();
+
+	if (p->flags & PF_NO_SETAFFINITY) {
+		retval = -EINVAL;
+		goto out_put_task;
+	}
+
+	if (!check_same_owner(p)) {
+		rcu_read_lock();
+		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
+			rcu_read_unlock();
+			retval = -EPERM;
+			goto out_put_task;
+		}
+		rcu_read_unlock();
+	}
+
+	retval = security_task_setscheduler(p);
+	if (retval)
+		goto out_put_task;
+
+	retval = __sched_setaffinity(p, in_mask);
 out_put_task:
 	put_task_struct(p);
 	return retval;