diff mbox series

[net,v3,3/3] sock: Ignore memcg pressure heuristics when raising allocated

Message ID 20231019120026.42215-3-wuyun.abel@bytedance.com (mailing list archive)
State Accepted
Commit 66e6369e312d161708786123fb44ecd53ff32d82
Delegated to: Netdev Maintainers
Headers show
Series [net,v3,1/3] sock: Code cleanup on __sk_mem_raise_allocated() | expand

Checks

Context Check Description
netdev/series_format warning Series does not have a cover letter
netdev/tree_selection success Clearly marked for net
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1368 this patch: 1368
netdev/cc_maintainers warning 1 maintainers not CCed: kuniyu@amazon.com
netdev/build_clang success Errors and warnings before: 1387 this patch: 1387
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1392 this patch: 1392
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Abel Wu Oct. 19, 2023, noon UTC
Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.

That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
So revert that behavior.

After reverting, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So except the places where buffer sizes are needed to be tuned,
allow workloads to use the memory they are provisioned.

Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
---
 net/core/sock.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Comments

Simon Horman Oct. 23, 2023, 7:49 a.m. UTC | #1
On Thu, Oct 19, 2023 at 08:00:26PM +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
> 
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> So revert that behavior.
> 
> After reverting, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
> 
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Paolo Abeni Oct. 24, 2023, 7:08 a.m. UTC | #2
On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
> 
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> So revert that behavior.
> 
> After reverting, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
> 
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Paolo Abeni <pabeni@redhat.com>

It's totally not clear to me why you changed the target tree from net-
next to net ?!? This is net-next material, I asked to strip the fixes
tag exactly for that reason.

Since there is agreement on this series and we are late in the cycle, I
would avoid a re-post (we can apply the series to net-next anyway) but
any clarification on the target tree change will be appreciated,
thanks!

Paolo
Shakeel Butt Oct. 24, 2023, 7:35 a.m. UTC | #3
On Tue, Oct 24, 2023 at 12:08 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
> > Before sockets became aware of net-memcg's memory pressure since
> > commit e1aab161e013 ("socket: initial cgroup code."), the memory
> > usage would be granted to raise if below average even when under
> > protocol's pressure. This provides fairness among the sockets of
> > same protocol.
> >
> > That commit changes this because the heuristic will also be
> > effective when only memcg is under pressure which makes no sense.
> > So revert that behavior.
> >
> > After reverting, __sk_mem_raise_allocated() no longer considers
> > memcg's pressure. As memcgs are isolated from each other w.r.t.
> > memory accounting, consuming one's budget won't affect others.
> > So except the places where buffer sizes are needed to be tuned,
> > allow workloads to use the memory they are provisioned.
> >
> > Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> > Acked-by: Shakeel Butt <shakeelb@google.com>
> > Acked-by: Paolo Abeni <pabeni@redhat.com>
>
> It's totally not clear to me why you changed the target tree from net-
> next to net ?!? This is net-next material, I asked to strip the fixes
> tag exactly for that reason.
>
> Since there is agreement on this series and we are late in the cycle, I
> would avoid a re-post (we can apply the series to net-next anyway) but
> any clarification on the target tree change will be appreciated,
> thanks!
>

I didn't even notice the change in the target tree. I would say let's
keep this for net-next as there are no urgent fixes here.
Abel Wu Oct. 24, 2023, 8:21 a.m. UTC | #4
On 10/24/23 3:08 PM, Paolo Abeni Wrote:
> On Thu, 2023-10-19 at 20:00 +0800, Abel Wu wrote:
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> So revert that behavior.
>>
>> After reverting, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
>> Acked-by: Shakeel Butt <shakeelb@google.com>
>> Acked-by: Paolo Abeni <pabeni@redhat.com>
> 
> It's totally not clear to me why you changed the target tree from net-
> next to net ?!? This is net-next material, I asked to strip the fixes
> tag exactly for that reason.

Sorry I misunderstood your suggestion..

> 
> Since there is agreement on this series and we are late in the cycle, I
> would avoid a re-post (we can apply the series to net-next anyway) but
> any clarification on the target tree change will be appreciated,
> thanks!

Please apply to net-next.

Thanks!
	Abel
diff mbox series

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 45841a5689b6..0ec3f5d70715 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3037,7 +3037,13 @@  EXPORT_SYMBOL(sk_wait_data);
  *	@amt: pages to allocate
  *	@kind: allocation type
  *
- *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
+ *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
+ *
+ *	Unlike the globally shared limits among the sockets under same protocol,
+ *	consuming the budget of a memcg won't have direct effect on other ones.
+ *	So be optimistic about memcg's tolerance, and leave the callers to decide
+ *	whether or not to raise allocated through sk_under_memory_pressure() or
+ *	its variants.
  */
 int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 {
@@ -3095,7 +3101,11 @@  int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 	if (sk_has_memory_pressure(sk)) {
 		u64 alloc;
 
-		if (!sk_under_memory_pressure(sk))
+		/* The following 'average' heuristic is within the
+		 * scope of global accounting, so it only makes
+		 * sense for global memory pressure.
+		 */
+		if (!sk_under_global_memory_pressure(sk))
 			return 1;
 
 		/* Try to be fair among all the sockets under global