
arch_topology: Update user supplied capacity to possible cpus in cluster

Message ID 1551354838-29902-1-git-send-email-clingutla@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Chandrasekhar L Feb. 28, 2019, 11:53 a.m. UTC
With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
topology sibling masks"), when a cpu is hotplugged out, it is removed
from the sibling masks of the other CPUs. If the user changes the
capacity of any cpu, the new capacity is applied to all online cpus in
the cluster.

If any hotplugged-out cpu in the same cluster later comes back online,
it would then have a different/stale capacity value.

Fix it by applying the user-supplied capacity to all possible cpus in
the cluster.

Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>

Comments

Sudeep Holla Feb. 28, 2019, 12:19 p.m. UTC | #1
On Thu, Feb 28, 2019 at 05:23:58PM +0530, Lingutla Chandrasekhar wrote:
> With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
> topology sibling masks"), when a cpu is hotplugged out, it is removed
> from the sibling masks of the other CPUs. If the user changes the
> capacity of any cpu, the new capacity is applied to all online cpus in
> the cluster.
>

Correct, but you are now changing it to apply the same value to all the
CPUs in the package, which is wrong.

> If any hotplugged-out cpu in the same cluster later comes back online,
> it would then have a different/stale capacity value.
>

Why not save the value?

> Fix it by applying the user-supplied capacity to all possible cpus in
> the cluster.
>

NACK for the change: it changes the capacity for all the CPUs in the
package/socket. Though DT platforms use cluster ids as package ids,
that's wrong and must be fixed. So you need to fix this issue without
depending on the package id. I have removed all the wrong users of the
package id, and this is also a wrong usage.

--
Regards,
Sudeep
Chandrasekhar L Feb. 28, 2019, 2:38 p.m. UTC | #2
Hi Sudeep,

On 2/28/2019 5:49 PM, Sudeep Holla wrote:
> On Thu, Feb 28, 2019 at 05:23:58PM +0530, Lingutla Chandrasekhar wrote:
>> With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
>> topology sibling masks"), when a cpu is hotplugged out, it is removed
>> from the sibling masks of the other CPUs. If the user changes the
>> capacity of any cpu, the new capacity is applied to all online cpus in
>> the cluster.
>>
> Correct, but you are now changing it to apply the same value to all the
> CPUs in the package, which is wrong.
>
>> If any hotplugged-out cpu in the same cluster later comes back online,
>> it would then have a different/stale capacity value.
>>
> Why not save the value?
Sorry, I didn't get you; do you mean save the user-supplied value?
>
>> Fix it by applying the user-supplied capacity to all possible cpus in
>> the cluster.
>>
> NACK for the change: it changes the capacity for all the CPUs in the
> package/socket. Though DT platforms use cluster ids as package ids,
> that's wrong and must be fixed. So you need to fix this issue without
> depending on the package id. I have removed all the wrong users of the
> package id, and this is also a wrong usage.

I presumed all cores with the same package-id have the same cpu capacity,
so I depended on it. I think we can update the capacity of a newly online
cpu by reading its core_sibling cpu capacity. Let me know your opinion on
this option.

> --
> Regards,
> Sudeep

--
Chandrasekhar L,  
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 a Linux Foundation Collaborative Project.
Sudeep Holla Feb. 28, 2019, 3:25 p.m. UTC | #3
On Thu, Feb 28, 2019 at 08:08:13PM +0530, Chandra Sekhar Lingutla wrote:
> Hi Sudeep,
>
> On 2/28/2019 5:49 PM, Sudeep Holla wrote:
> > On Thu, Feb 28, 2019 at 05:23:58PM +0530, Lingutla Chandrasekhar wrote:
> >> With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
> >> topology sibling masks"), when a cpu is hotplugged out, it is removed
> >> from the sibling masks of the other CPUs. If the user changes the
> >> capacity of any cpu, the new capacity is applied to all online cpus in
> >> the cluster.
> >>
> > Correct, but you are now changing it to apply the same value to all the
> > CPUs in the package, which is wrong.
> >
> >> If any hotplugged-out cpu in the same cluster later comes back online,
> >> it would then have a different/stale capacity value.
> >>
> > Why not save the value?
> Sorry, I didn't get you; do you mean save the user-supplied value?

I meant save the last user-set value and re-apply it when the CPU comes
online.

> >
> >> Fix it by applying the user-supplied capacity to all possible cpus in
> >> the cluster.
> >>
> > NACK for the change: it changes the capacity for all the CPUs in the
> > package/socket. Though DT platforms use cluster ids as package ids,
> > that's wrong and must be fixed. So you need to fix this issue without
> > depending on the package id. I have removed all the wrong users of the
> > package id, and this is also a wrong usage.
>
> I presumed all cores with the same package-id have the same cpu capacity,
> so I depended on it.

No:
1. A package is not a cluster; it's the physical socket, which on typical
   mobile systems will be the whole SoC. So it includes all the CPUs
   in the system.

2. What about DSU systems, where CPUs can have different capacities
   within a cluster?

> I think we can update the capacity of a newly online cpu by reading its
> core_sibling cpu capacity.

Will that survive the scenario where all the CPUs in the so-called cluster
are hot-plugged out and back in?

> Let me know your opinion on this option.
>

The only solution I see is to save the last value set by the user
somewhere, if that is not already done, and restore the same.

--
Regards,
Sudeep
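
Sudeep's save-and-restore idea could be sketched roughly as below. This is
illustrative only: user_set_capacity and capacity_restore_online are
invented names, while topology_set_cpu_scale() and the cpuhp API are real
kernel interfaces. cpu_capacity_store() would fill the per-cpu slot, so
the value would survive even a whole cluster going down and coming back.

	#include <linux/arch_topology.h>
	#include <linux/cpuhotplug.h>
	#include <linux/init.h>
	#include <linux/percpu.h>

	/* Hypothetical per-cpu slot remembering the last user-written value. */
	static DEFINE_PER_CPU(unsigned long, user_set_capacity);

	static int capacity_restore_online(unsigned int cpu)
	{
		unsigned long cap = per_cpu(user_set_capacity, cpu);

		/* Re-apply the saved user value when the CPU comes back online. */
		if (cap)
			topology_set_cpu_scale(cpu, cap);
		return 0;
	}

	static int __init capacity_restore_init(void)
	{
		int ret;

		/* Dynamic online state: the callback runs on every CPU online. */
		ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
					"base/arch_topology:capacity",
					capacity_restore_online, NULL);
		return ret < 0 ? ret : 0;
	}
	late_initcall(capacity_restore_init);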
Chandrasekhar L March 2, 2019, 1:30 p.m. UTC | #4
On 2/28/2019 8:55 PM, Sudeep Holla wrote:
> On Thu, Feb 28, 2019 at 08:08:13PM +0530, Chandra Sekhar Lingutla wrote:
>> Hi Sudeep,
>>
>> On 2/28/2019 5:49 PM, Sudeep Holla wrote:
>>> On Thu, Feb 28, 2019 at 05:23:58PM +0530, Lingutla Chandrasekhar wrote:
>>>> With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
>>>> topology sibling masks"), when a cpu is hotplugged out, it is removed
>>>> from the sibling masks of the other CPUs. If the user changes the
>>>> capacity of any cpu, the new capacity is applied to all online cpus in
>>>> the cluster.
>>>>
>>> Correct, but you are now changing it to apply the same value to all the
>>> CPUs in the package, which is wrong.
>>>
>>>> If any hotplugged-out cpu in the same cluster later comes back online,
>>>> it would then have a different/stale capacity value.
>>>>
>>> Why not save the value?
>> Sorry, I didn't get you; do you mean save the user-supplied value?
> I meant save the last user-set value and re-apply it when the CPU comes
> online.

But 'cpu_capacity' for the hotplugged cpu is not touched, so it would
retain the same value. The actual problem is that cpu_capacity_store()
tries to change the capacity for all of the cpu's 'sibling cpus'. Commit
'5bdd2b3f0f8' now keeps only online cpus in the sibling mask, so the
user-supplied cpu_capacity is applied only to the sibling cpus that are
online at the time. After that, if any cpu is hot-plugged in, it would
have a different cpu_capacity than its siblings.

>>>> Fix it by applying the user-supplied capacity to all possible cpus in
>>>> the cluster.
>>>>
>>> NACK for the change: it changes the capacity for all the CPUs in the
>>> package/socket. Though DT platforms use cluster ids as package ids,
>>> that's wrong and must be fixed. So you need to fix this issue without
>>> depending on the package id. I have removed all the wrong users of the
>>> package id, and this is also a wrong usage.
>> I presumed all cores with the same package-id have the same cpu capacity,
>> so I depended on it.
> No:
> 1. A package is not a cluster; it's the physical socket, which on typical
>    mobile systems will be the whole SoC. So it includes all the CPUs
>    in the system.
>
> 2. What about DSU systems, where CPUs can have different capacities
>    within a cluster?
So the cpus in the cpu_topology->core_sibling mask would not need to have
the same cpu_capacity? Then I think we should update the cpu_capacity for
only the requested cpu, right?

e.g., 'echo 910 > /sys/devices/system/cpu/cpu5/cpu_capacity' should be
applied only to cpu5.

>> I think we can update the capacity of a newly online cpu by reading its
>> core_sibling cpu capacity.
> Will that survive the scenario where all the CPUs in the so-called cluster
> are hot-plugged out and back in?
>
>> Let me know your opinion on this option.
>>
> The only solution I see is to save the last value set by the user
> somewhere, if that is not already done, and restore the same.
>
> --
> Regards,
> Sudeep

-- Chandrasekhar L,
Sudeep Holla March 4, 2019, 6:21 p.m. UTC | #5
On Sat, Mar 02, 2019 at 07:00:43PM +0530, Chandra Sekhar Lingutla wrote:
>
> On 2/28/2019 8:55 PM, Sudeep Holla wrote:
> > On Thu, Feb 28, 2019 at 08:08:13PM +0530, Chandra Sekhar Lingutla wrote:
> >> Hi Sudeep,
> >>
> >> On 2/28/2019 5:49 PM, Sudeep Holla wrote:
> >>> On Thu, Feb 28, 2019 at 05:23:58PM +0530, Lingutla Chandrasekhar wrote:
> >>>> With commit 5bdd2b3f0f8 ("arm64: topology: add support to remove cpu
> >>>> topology sibling masks"), when a cpu is hotplugged out, it is removed
> >>>> from the sibling masks of the other CPUs. If the user changes the
> >>>> capacity of any cpu, the new capacity is applied to all online cpus in
> >>>> the cluster.
> >>>>
> >>> Correct, but you are now changing it to apply the same value to all the
> >>> CPUs in the package, which is wrong.
> >>>
> >>>> If any hotplugged-out cpu in the same cluster later comes back online,
> >>>> it would then have a different/stale capacity value.
> >>>>
> >>> Why not save the value?
> >> Sorry, I didn't get you; do you mean save the user-supplied value?
> > I meant save the last user-set value and re-apply it when the CPU comes
> > online.
>
> But 'cpu_capacity' for the hotplugged cpu is not touched, so it would
> retain the same value. The actual problem is that cpu_capacity_store()
> tries to change the capacity for all of the cpu's 'sibling cpus'. Commit
> '5bdd2b3f0f8' now keeps only online cpus in the sibling mask, so the
> user-supplied cpu_capacity is applied only to the sibling cpus that are
> online at the time. After that, if any cpu is hot-plugged in, it would
> have a different cpu_capacity than its siblings.
>

OK, I understand the problem. What I was saying is that using the sibling
cpus to update cpu_capacity needs to be changed: that is no longer correct
in the two scenarios I mentioned, DSU and ACPI sockets/physical packages.
We need to fix the physical package id for DT systems, but that shouldn't
block this issue IMO.

> >>>> Fix it by applying the user-supplied capacity to all possible cpus in
> >>>> the cluster.
> >>>>
> >>> NACK for the change: it changes the capacity for all the CPUs in the
> >>> package/socket. Though DT platforms use cluster ids as package ids,
> >>> that's wrong and must be fixed. So you need to fix this issue without
> >>> depending on the package id. I have removed all the wrong users of the
> >>> package id, and this is also a wrong usage.
> >> I presumed all cores with the same package-id have the same cpu capacity,
> >> so I depended on it.
> > No:
> > 1. A package is not a cluster; it's the physical socket, which on typical
> >    mobile systems will be the whole SoC. So it includes all the CPUs
> >    in the system.
> >
> > 2. What about DSU systems, where CPUs can have different capacities
> >    within a cluster?
> So the cpus in the cpu_topology->core_sibling mask would not need to have
> the same cpu_capacity?
>

Right, they need not. DSU is a simple example, and even normal heterogeneous
multi-cluster single-socket systems will have all the cpus in the die
present in core_siblings.

> Then I think we should update the cpu_capacity for only the requested cpu,
> right?

That is one possible solution, and a simpler one. But I am open to any
better alternative if one exists.

>
> e.g., 'echo 910 > /sys/devices/system/cpu/cpu5/cpu_capacity' should be
> applied only to cpu5.

Yes, something similar.

>
> >> I think we can update the capacity of a newly online cpu by reading its
> >> core_sibling cpu capacity.
> > Will that survive the scenario where all the CPUs in the so-called cluster
> > are hot-plugged out and back in?
> >
> >> Let me know your opinion on this option.
> >>

I was always under the impression that this was in debugfs and would be
removed; I did mention this in one of the threads a couple of months back.
I was wrong, and I do understand the need for this on systems where the
firmware doesn't provide this capacity value.

If possible, I want to drop the write capability for this sysfs file.

--
Regards,
Sudeep
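
For reference, the "only the requested cpu" option Sudeep calls the
simpler solution would shrink the store path of the driver (shown in full
in the patch at the bottom of this page) to roughly the following sketch;
the parsing and bound check here are paraphrased, not copied:

	static ssize_t cpu_capacity_store(struct device *dev,
					  struct device_attribute *attr,
					  const char *buf, size_t count)
	{
		struct cpu *cpu = container_of(dev, struct cpu, dev);
		int this_cpu = cpu->dev.id;
		unsigned long new_capacity;
		ssize_t ret;

		if (!count)
			return 0;

		ret = kstrtoul(buf, 0, &new_capacity);
		if (ret)
			return ret;
		if (new_capacity > SCHED_CAPACITY_SCALE)
			return -EINVAL;

		/* No sibling-mask walk: touch only the CPU this node belongs to. */
		mutex_lock(&cpu_scale_mutex);
		topology_set_cpu_scale(this_cpu, new_capacity);
		mutex_unlock(&cpu_scale_mutex);

		schedule_work(&update_topology_flags_work);

		return count;
	}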
Quentin Perret March 5, 2019, 9:23 a.m. UTC | #6
On Monday 04 Mar 2019 at 18:21:38 (+0000), Sudeep Holla wrote:
> On Sat, Mar 02, 2019 at 07:00:43PM +0530, Chandra Sekhar Lingutla wrote:
> > So the cpus in the cpu_topology->core_sibling mask would not need to have
> > the same cpu_capacity?
>
> Right, they need not. DSU is a simple example, and even normal heterogeneous
> multi-cluster single-socket systems will have all the cpus in the die
> present in core_siblings.
>
> > Then I think we should update the cpu_capacity for only the requested cpu,
> > right?
>
> That is one possible solution, and a simpler one. But I am open to any
> better alternative if one exists.

How about we update the capacity for the related_cpus of the CPUFreq
policy? That is what we're interested in here, I think, and it is
orthogonal to the topology stuff. And that should map fairly well to the
core_sibling_mask for legacy platforms.

FWIW, we already mandate something similar for EAS for example
(see [1]), and I'm not sure we want to support having different uarchs
in the same freq domain here either, even though strictly speaking
DynamIQ doesn't forbid it.

[1] https://elixir.bootlin.com/linux/latest/source/kernel/power/energy_model.c#L170

[...]

> I was always under the impression that this was in debugfs and would be
> removed; I did mention this in one of the threads a couple of months back.
> I was wrong, and I do understand the need for this on systems where the
> firmware doesn't provide this capacity value.
>
> If possible, I want to drop the write capability for this sysfs file.

But yes, that is even better, if at all possible.

Thanks,
Quentin
Sudeep Holla March 5, 2019, 11:13 a.m. UTC | #7
On Tue, Mar 05, 2019 at 09:23:25AM +0000, Quentin Perret wrote:
> On Monday 04 Mar 2019 at 18:21:38 (+0000), Sudeep Holla wrote:
> > On Sat, Mar 02, 2019 at 07:00:43PM +0530, Chandra Sekhar Lingutla wrote:
> > > So the cpus in the cpu_topology->core_sibling mask would not need to
> > > have the same cpu_capacity?
> >
> > Right, they need not. DSU is a simple example, and even normal
> > heterogeneous multi-cluster single-socket systems will have all the cpus
> > in the die present in core_siblings.
> >
> > > Then I think we should update the cpu_capacity for only the requested
> > > cpu, right?
> >
> > That is one possible solution, and a simpler one. But I am open to any
> > better alternative if one exists.
>
> How about we update the capacity for the related_cpus of the CPUFreq
> policy? That is what we're interested in here, I think, and it is
> orthogonal to the topology stuff. And that should map fairly well to the
> core_sibling_mask for legacy platforms.
>

While I like the idea, I am afraid that linking this to the cpufreq policy
may not be good. How will we deal with it on systems without CPUfreq?

> FWIW, we already mandate something similar for EAS for example
> (see [1]), and I'm not sure we want to support having different uarchs
> in the same freq domain here either, even though strictly speaking
> DynamIQ doesn't forbid it.
>

Yes, but that dependency is the other way around, and topology is not
optional, so it works out well. The reverse may not be that simple.

> [1] https://elixir.bootlin.com/linux/latest/source/kernel/power/energy_model.c#L170
>
> [...]
>
> > I was always under the impression that this was in debugfs and would be
> > removed; I did mention this in one of the threads a couple of months back.
> > I was wrong, and I do understand the need for this on systems where the
> > firmware doesn't provide this capacity value.
> >
> > If possible, I want to drop the write capability for this sysfs file.
>
> But yes, that is even better, if at all possible.
>

I think if there are no valid users of this, we *must* remove it. As I
have pointed out in the past, giving the user such access requires
platform knowledge. Though it's a debatable topic, the firmware providing
this information is the only correct solution IMO.

--
Regards,
Sudeep
Quentin Perret March 5, 2019, 11:29 a.m. UTC | #8
On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
[...]
> While I like the idea, I am afraid that linking this to the cpufreq policy
> may not be good. How will we deal with it on systems without CPUfreq?

Maybe something like this ?

	policy = cpufreq_cpu_get(cpu);
	if (policy) {
		for_each_cpu(i, policy->related_cpus) {
			/* Update capacity for @i*/
		}
		cpufreq_cpu_put(policy);
	} else {
		/* Update capacity for @cpu*/
	}

I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
case where it makes sense to 'bind' the capacity of several CPUs
together is when they're part of the same perf domain, I think. If you
don't know what the perf domains are, then there's nothing sensible you
can do.

And for the dependency, a large part of the arch_topology driver is
already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
to re-scale the CPU capacities depending on the max freq of the various
policies and so on. So the dependency is already there somehow.

[...]

> I think if there are no valid users of this, we *must* remove it. As I
> have pointed out in the past, giving the user such access requires
> platform knowledge. Though it's a debatable topic, the firmware providing
> this information is the only correct solution IMO.

Yeah, if nobody is using it then maybe we can just remove it. Or at
least we can give it a go and if somebody complains then we can 'fix' it
with something like my snippet above :-)

Thanks,
Quentin
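
Fleshed out, Quentin's snippet might become a helper like the one below.
This is a sketch under assumed names: update_user_capacity() does not
exist in the driver, while cpufreq_cpu_get()/cpufreq_cpu_put(),
related_cpus and topology_set_cpu_scale() are existing kernel interfaces.

	#include <linux/arch_topology.h>
	#include <linux/cpufreq.h>

	static void update_user_capacity(unsigned int cpu,
					 unsigned long new_capacity)
	{
		struct cpufreq_policy *policy;
		int i;

		policy = cpufreq_cpu_get(cpu);	/* takes a reference */
		if (policy) {
			/*
			 * CPUs in one policy share a frequency domain, so
			 * bind their capacities together.
			 */
			for_each_cpu(i, policy->related_cpus)
				topology_set_cpu_scale(i, new_capacity);
			cpufreq_cpu_put(policy);
		} else {
			/* No CPUfreq: fall back to a per-cpu update. */
			topology_set_cpu_scale(cpu, new_capacity);
		}
	}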
Sudeep Holla March 5, 2019, 11:36 a.m. UTC | #9
On Tue, Mar 05, 2019 at 11:29:55AM +0000, Quentin Perret wrote:
> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
> [...]
> > While I like the idea, I am afraid that linking this to the cpufreq
> > policy may not be good. How will we deal with it on systems without
> > CPUfreq?
>
> Maybe something like this ?
>
> 	policy = cpufreq_cpu_get(cpu);
> 	if (policy) {
> 		for_each_cpu(i, policy->related_cpus) {
> 			/* Update capacity for @i*/
> 		}
> 		cpufreq_cpu_put(policy);
> 	} else {
> 		/* Update capacity for @cpu*/
> 	}
>
> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
> case where it makes sense to 'bind' the capacity of several CPUs
> together is when they're part of the same perf domain, I think. If you
> don't know what the perf domains are, then there's nothing sensible you
> can do.
>

Makes sense.

> And for the dependency, a large part of the arch_topology driver is
> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
> to re-scale the CPU capacities depending on the max freq of the various
> policies and so on. So the dependency is already there somehow.
>

Sorry, when I mentioned the dependency, I meant that its absence needs to
be dealt with. Your suggestion looks good.

> [...]
>
> > I think if there are no valid users of this, we *must* remove it. As I
> > have pointed out in the past, giving the user such access requires
> > platform knowledge. Though it's a debatable topic, the firmware providing
> > this information is the only correct solution IMO.
>
> Yeah, if nobody is using it then maybe we can just remove it. Or at
> least we can give it a go and if somebody complains then we can 'fix' it
> with something like my snippet above :-)
>

Happy to Ack code removing it ;). The argument that it can't be provided
by firmware is no longer valid: we already have some dependency on DVFS
data from the firmware for this to function correctly.

--
Regards,
Sudeep
Chandrasekhar L March 5, 2019, 3:53 p.m. UTC | #10
On 3/5/2019 5:06 PM, Sudeep Holla wrote:
> On Tue, Mar 05, 2019 at 11:29:55AM +0000, Quentin Perret wrote:
>> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
>> [...]
>>> While I like the idea, I am afraid that linking this to the cpufreq
>>> policy may not be good. How will we deal with it on systems without
>>> CPUfreq?
>>
>> Maybe something like this ?
>>
>> 	policy = cpufreq_cpu_get(cpu);
>> 	if (policy) {
>> 		for_each_cpu(i, policy->related_cpus) {
>> 			/* Update capacity for @i*/
>> 		}
>> 		cpufreq_cpu_put(policy);
>> 	} else {
>> 		/* Update capacity for @cpu*/
>> 	}
>>
>> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
>> case where it makes sense to 'bind' the capacity of several CPUs
>> together is when they're part of the same perf domain, I think. If you
>> don't know what the perf domains are, then there's nothing sensible you
>> can do.
>>
> 
> Makes sense.
> 
>> And for the dependency, a large part of the arch_topology driver is
>> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
>> to re-scale the CPU capacities depending on the max freq of the various
>> policies and so on. So the dependency is already there somehow.
>>
> 
> Sorry, when I mentioned the dependency, I meant that its absence needs to
> be dealt with. Your suggestion looks good.
> 
>> [...]
>>
>>> I think if there are no valid users of this, we *must* remove it. As I
>>> have pointed out in the past, giving the user such access requires
>>> platform knowledge. Though it's a debatable topic, the firmware providing
>>> this information is the only correct solution IMO.
>>
>> Yeah, if nobody is using it then maybe we can just remove it. Or at
>> least we can give it a go and if somebody complains then we can 'fix' it
>> with something like my snippet above :-)
>>
> 
> Happy to Ack code removing it ;). The argument that it can't be provided
> by firmware is no longer valid: we already have some dependency on DVFS
> data from the firmware for this to function correctly.
> 
If nobody at all uses it, would making it read-only (under debugfs) be good?
Or else, are we OK to update the patch with related_cpus?

> --
> Regards,
> Sudeep
> 

-- Chandrasekhar L,
Quentin Perret March 5, 2019, 4:12 p.m. UTC | #11
On Tuesday 05 Mar 2019 at 21:23:24 (+0530), Chandra Sekhar Lingutla wrote:
> 
> 
> On 3/5/2019 5:06 PM, Sudeep Holla wrote:
> > On Tue, Mar 05, 2019 at 11:29:55AM +0000, Quentin Perret wrote:
> >> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
> >> [...]
> >>> While I like the idea, I am afraid that linking this to the cpufreq
> >>> policy may not be good. How will we deal with it on systems without
> >>> CPUfreq?
> >>
> >> Maybe something like this ?
> >>
> >> 	policy = cpufreq_cpu_get(cpu);
> >> 	if (policy) {
> >> 		for_each_cpu(i, policy->related_cpus) {
> >> 			/* Update capacity for @i*/
> >> 		}
> >> 		cpufreq_cpu_put(policy);
> >> 	} else {
> >> 		/* Update capacity for @cpu*/
> >> 	}
> >>
> >> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
> >> case where it makes sense to 'bind' the capacity of several CPUs
> >> together is when they're part of the same perf domain, I think. If you
> >> don't know what the perf domains are, then there's nothing sensible you
> >> can do.
> >>
> > 
> > Makes sense.
> > 
> >> And for the dependency, a large part of the arch_topology driver is
> >> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
> >> to re-scale the CPU capacities depending on the max freq of the various
> >> policies and so on. So the dependency is already there somehow.
> >>
> > 
> > Sorry, when I mentioned the dependency, I meant that its absence needs
> > to be dealt with. Your suggestion looks good.
> > 
> >> [...]
> >>
> >>> I think if there are no valid users of this, we *must* remove it. As
> >>> I have pointed out in the past, giving the user such access requires
> >>> platform knowledge. Though it's a debatable topic, the firmware
> >>> providing this information is the only correct solution IMO.
> >>
> >> Yeah, if nobody is using it then maybe we can just remove it. Or at
> >> least we can give it a go and if somebody complains then we can 'fix' it
> >> with something like my snippet above :-)
> >>
> > 
> > Happy to Ack code removing it ;). The argument that it can't be provided
> > by firmware is no longer valid: we already have some dependency on DVFS
> > data from the firmware for this to function correctly.
> > 
> If nobody at all uses it, would making it read-only (under debugfs) be
> good?

I'd say keep the sysfs node, but make it RO. I'm aware of a few tools
reading from that sysfs node, so let's not break them. (None of these
tools _write_ in there, though.)

> Or else, are we OK to update the patch with related_cpus?

And that we can keep as a backup solution if the above fails.

Please also don't forget to CC Greg KH, who maintains that file.

Thanks,
Quentin
Sudeep Holla March 5, 2019, 4:54 p.m. UTC | #12
On Tue, Mar 05, 2019 at 09:23:24PM +0530, Chandra Sekhar Lingutla wrote:
> 
> 
> On 3/5/2019 5:06 PM, Sudeep Holla wrote:
> > On Tue, Mar 05, 2019 at 11:29:55AM +0000, Quentin Perret wrote:
> >> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
> >> [...]
>>> While I like the idea, I am afraid that linking this to the cpufreq
>>> policy may not be good. How will we deal with it on systems without
>>> CPUfreq?
> >>
> >> Maybe something like this ?
> >>
> >> 	policy = cpufreq_cpu_get(cpu);
> >> 	if (policy) {
> >> 		for_each_cpu(i, policy->related_cpus) {
> >> 			/* Update capacity for @i*/
> >> 		}
> >> 		cpufreq_cpu_put(policy);
> >> 	} else {
> >> 		/* Update capacity for @cpu*/
> >> 	}
> >>
> >> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
> >> case where it makes sense to 'bind' the capacity of several CPUs
> >> together is when they're part of the same perf domain, I think. If you
> >> don't know what the perf domains are, then there's nothing sensible you
> >> can do.
> >>
> > 
> > Makes sense.
> > 
> >> And for the dependency, a large part of the arch_topology driver is
> >> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
> >> to re-scale the CPU capacities depending on the max freq of the various
> >> policies and so on. So the dependency is already there somehow.
> >>
> > 
> Sorry, when I mentioned the dependency, I meant that its absence needs to
> be dealt with. Your suggestion looks good.
> > 
> >> [...]
> >>
>>> I think if there are no valid users of this, we *must* remove it. As I
>>> have pointed out in the past, giving the user such access requires
>>> platform knowledge. Though it's a debatable topic, the firmware providing
>>> this information is the only correct solution IMO.
> >>
> >> Yeah, if nobody is using it then maybe we can just remove it. Or at
> >> least we can give it a go and if somebody complains then we can 'fix' it
> >> with something like my snippet above :-)
> >>
> > 
> Happy to Ack code removing it ;). The argument that it can't be provided
> by firmware is no longer valid: we already have some dependency on DVFS
> data from the firmware for this to function correctly.
> > 
> If nobody at all uses it, would making it read-only (under debugfs) be
> good?

Yes, but under sysfs, as it is now. Just remove the write capability and
make it read-only.

> Or else, are we OK to update the patch with related_cpus?

As Quentin mentions, only if anyone has objections to the above and
provides a valid use-case, with details on how the kernel can validate the
data provided by the user, which is the most difficult part IMO.

--
Regards,
Sudeep
Dietmar Eggemann March 6, 2019, 9:48 a.m. UTC | #13
On 3/5/19 12:29 PM, Quentin Perret wrote:
> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
> [...]
>> While I like the idea, I am afraid that linking this to the cpufreq
>> policy may not be good. How will we deal with it on systems without
>> CPUfreq?
> 
> Maybe something like this ?
> 
> 	policy = cpufreq_cpu_get(cpu);
> 	if (policy) {
> 		for_each_cpu(i, policy->related_cpus) {
> 			/* Update capacity for @i*/
> 		}
> 		cpufreq_cpu_put(policy);
> 	} else {
> 		/* Update capacity for @cpu*/
> 	}
> 
> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
> case where it makes sense to 'bind' the capacity of several CPUs
> together is when they're part of the same perf domain, I think. If you
> don't know what the perf domains are, then there's nothing sensible you
> can do.
> 
> And for the dependency, a large part of the arch_topology driver is
> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
> to re-scale the CPU capacities depending on the max freq of the various
> policies and so on. So the dependency is already there somehow.
> 
> [...]
> 
>> I think if there are no valid users of this, we *must* remove it. As I
>> have pointed out in the past, giving the user such access requires
>> platform knowledge. Though it's a debatable topic, the firmware providing
>> this information is the only correct solution IMO.
> 
> Yeah, if nobody is using it then maybe we can just remove it. Or at
> least we can give it a go and if somebody complains then we can 'fix' it
> with something like my snippet above :-)

+1 for dropping the write capability for this sysfs.

I questioned the usefulness of the interface already in 2015: 
https://lkml.org/lkml/2015/12/10/324

In the meantime, though, reading the interface for all CPUs has become a
nice way of figuring out that we're on a system with asymmetric CPU
capacities (big.LITTLE or DynamIQ), especially on systems w/o an Energy
Model.
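
For example, on a 4+4 big.LITTLE system the asymmetry shows up directly in
the per-cpu nodes (the values below are purely illustrative):

	$ cat /sys/devices/system/cpu/cpu*/cpu_capacity
	462
	462
	462
	462
	1024
	1024
	1024
	1024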
Morten Rasmussen March 6, 2019, 3:22 p.m. UTC | #14
On Tue, Mar 05, 2019 at 04:54:31PM +0000, Sudeep Holla wrote:
> On Tue, Mar 05, 2019 at 09:23:24PM +0530, Chandra Sekhar Lingutla wrote:
> > 
> > 
> > On 3/5/2019 5:06 PM, Sudeep Holla wrote:
> > > On Tue, Mar 05, 2019 at 11:29:55AM +0000, Quentin Perret wrote:
> > >> On Tuesday 05 Mar 2019 at 11:13:21 (+0000), Sudeep Holla wrote:
> > >> [...]
> > >>> While I like the idea, I am afraid that linking this to the cpufreq
> > >>> policy may not be good. How will we deal with it on systems without
> > >>> CPUfreq?
> > >>
> > >> Maybe something like this ?
> > >>
> > >> 	policy = cpufreq_cpu_get(cpu);
> > >> 	if (policy) {
> > >> 		for_each_cpu(i, policy->related_cpus) {
> > >> 			/* Update capacity for @i*/
> > >> 		}
> > >> 		cpufreq_cpu_put(policy);
> > >> 	} else {
> > >> 		/* Update capacity for @cpu*/
> > >> 	}
> > >>
> > >> I think it's OK to assume per-cpu capacities w/o CPUFreq. The only
> > >> case where it makes sense to 'bind' the capacity of several CPUs
> > >> together is when they're part of the same perf domain, I think. If you
> > >> don't know what the perf domains are, then there's nothing sensible you
> > >> can do.

Moving away from using the topology masks is certainly a good thing if we
can't kill the write ability through sysfs. My only concern with the
per-cpu approach for non-cpufreq systems is that we have to be sure the
rebuild of the sched_domain hierarchy doesn't go wrong when you have
different capacities in the same group. I don't think it will, but it is
worth checking.

> > >>
> > > 
> > > Makes sense.
> > > 
> > >> And for the dependency, a large part of the arch_topology driver is
> > >> already dependent on CPUFreq -- it registers a CPUFreq notifier on boot
> > >> to re-scale the CPU capacities depending on the max freq of the various
> > >> policies and so on. So the dependency is already there somehow.
> > >>
> > > 
> > > Sorry, when I mentioned the dependency, I meant that its absence needs
> > > to be dealt with. Your suggestion looks good.
> > > 
> > >> [...]
> > >>
> > >>> I think if there are no valid users of this, we *must* remove it. As
> > >>> I have pointed out in the past, giving the user such access requires
> > >>> platform knowledge. Though it's a debatable topic, the firmware
> > >>> providing this information is the only correct solution IMO.
> > >>
> > >> Yeah, if nobody is using it then maybe we can just remove it. Or at
> > >> least we can give it a go and if somebody complains then we can 'fix' it
> > >> with something like my snippet above :-)
> > >>
> > > 
> > > Happy to Ack code removing it ;). The argument that it can't be
> > > provided by firmware is no longer valid: we already have some dependency
> > > on DVFS data from the firmware for this to function correctly.
> > > 
> > If nobody at all uses it, would making it read-only (under debugfs) be
> > good?
> 
> Yes, but under sysfs, as it is now. Just remove the write capability and
> make it read-only.
> 
> > Or else, are we OK to update the patch with related_cpus?
> 
> As Quentin mentions, only if anyone has objections to the above and
> provides a valid use-case, with details on how the kernel can validate the
> data provided by the user, which is the most difficult part IMO.

+1

Morten
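
The read-only direction the thread converges on would amount to something
like this in drivers/base/arch_topology.c. This is a sketch only; it
assumes the then-current topology_get_cpu_scale(sd, cpu) helper and simply
drops the store method from the attribute:

	static ssize_t cpu_capacity_show(struct device *dev,
					 struct device_attribute *attr,
					 char *buf)
	{
		struct cpu *cpu = container_of(dev, struct cpu, dev);

		return sprintf(buf, "%lu\n",
			       topology_get_cpu_scale(NULL, cpu->dev.id));
	}

	/*
	 * DEVICE_ATTR_RO() creates the node with mode 0444 and no ->store,
	 * so cpu_capacity_store() and its sibling-mask walk can be deleted
	 * entirely, which is what Sudeep, Quentin and Dietmar agree on above.
	 */
	static DEVICE_ATTR_RO(cpu_capacity);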

Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index edfcf8d..dadc5d8 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -61,6 +61,7 @@  static ssize_t cpu_capacity_store(struct device *dev,
 	int i;
 	unsigned long new_capacity;
 	ssize_t ret;
+	struct cpu_topology *icpu_topo, *this_topo = &cpu_topology[this_cpu];
 
 	if (!count)
 		return 0;
@@ -72,8 +73,15 @@  static ssize_t cpu_capacity_store(struct device *dev,
 		return -EINVAL;
 
 	mutex_lock(&cpu_scale_mutex);
-	for_each_cpu(i, &cpu_topology[this_cpu].core_sibling)
+
+	for_each_possible_cpu(i) {
+		icpu_topo = &cpu_topology[i];
+
+		if (icpu_topo->package_id != this_topo->package_id)
+			continue;
 		topology_set_cpu_scale(i, new_capacity);
+	}
+
 	mutex_unlock(&cpu_scale_mutex);
 
 	schedule_work(&update_topology_flags_work);