diff mbox series

mm: change memcg->oom_group access with atomic operations

Message ID 20230220151638.1371-1-findns94@gmail.com (mailing list archive)
State New
Headers show
Series mm: change memcg->oom_group access with atomic operations | expand

Commit Message

Yue Zhao Feb. 20, 2023, 3:16 p.m. UTC
The knob for cgroup v2 memory controller: memory.oom.group
will be read and written simultaneously by user space
programs, thus we'd better change memcg->oom_group access
with atomic operations to avoid concurrency problems.

Signed-off-by: Yue Zhao <findns94@gmail.com>
---
 mm/memcontrol.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Roman Gushchin Feb. 20, 2023, 9:09 p.m. UTC | #1
On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> The knob for cgroup v2 memory controller: memory.oom.group
> will be read and written simultaneously by user space
> programs, thus we'd better change memcg->oom_group access
> with atomic operations to avoid concurrency problems.
> 
> Signed-off-by: Yue Zhao <findns94@gmail.com>

Hi Yue!

I'm curious, have any seen any real issues which your patch is solving?
Can you, please, provide a bit more details.

Also, WRITE/READ_ONCE() don't generally make operations atomic,
they only prevent the compiler from merging and re-fetching reads
and writes.

Thanks!
Shakeel Butt Feb. 20, 2023, 11:06 p.m. UTC | #2
On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > The knob for cgroup v2 memory controller: memory.oom.group
> > will be read and written simultaneously by user space
> > programs, thus we'd better change memcg->oom_group access
> > with atomic operations to avoid concurrency problems.
> > 
> > Signed-off-by: Yue Zhao <findns94@gmail.com>
> 
> Hi Yue!
> 
> I'm curious, have any seen any real issues which your patch is solving?
> Can you, please, provide a bit more details.
> 

IMHO such details are not needed. oom_group is being accessed
concurrently and one of them can be a write access. At least
READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't
catch this race because it does not know about the memory.oom.group
interface.
Roman Gushchin Feb. 21, 2023, 5:17 a.m. UTC | #3
> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> 
> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
>>> The knob for cgroup v2 memory controller: memory.oom.group
>>> will be read and written simultaneously by user space
>>> programs, thus we'd better change memcg->oom_group access
>>> with atomic operations to avoid concurrency problems.
>>> 
>>> Signed-off-by: Yue Zhao <findns94@gmail.com>
>> 
>> Hi Yue!
>> 
>> I'm curious, have any seen any real issues which your patch is solving?
>> Can you, please, provide a bit more details.
>> 
> 
> IMHO such details are not needed. oom_group is being accessed
> concurrently and one of them can be a write access. At least
> READ_ONCE/WRITE_ONCE is needed here.

Needed for what?

I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.

My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.

Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE().

Thanks!
Shakeel Butt Feb. 21, 2023, 6:52 a.m. UTC | #4
On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> >>> The knob for cgroup v2 memory controller: memory.oom.group
> >>> will be read and written simultaneously by user space
> >>> programs, thus we'd better change memcg->oom_group access
> >>> with atomic operations to avoid concurrency problems.
> >>>
> >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> >>
> >> Hi Yue!
> >>
> >> I'm curious, have any seen any real issues which your patch is solving?
> >> Can you, please, provide a bit more details.
> >>
> >
> > IMHO such details are not needed. oom_group is being accessed
> > concurrently and one of them can be a write access. At least
> > READ_ONCE/WRITE_ONCE is needed here.
>
> Needed for what?

For this particular case, documenting such an access. Though I don't
think there are any architectures which may tear a one byte read/write
and merging/refetching is not an issue for this.

>
> I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
>
> My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.
>
> Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE()

Yeah and those are v1 interfaces e.g. oom_kill_disable, swappiness,
soft_limit. These definitely need [READ|WRITE]_ONCE primitive.

Yue, can you update your patch and convert all accesses to these
fields through [READ|WRITE]_ONCE ?
Muchun Song Feb. 21, 2023, 7:22 a.m. UTC | #5
> On Feb 21, 2023, at 13:17, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
>> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
>> 
>> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
>>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
>>>> The knob for cgroup v2 memory controller: memory.oom.group
>>>> will be read and written simultaneously by user space
>>>> programs, thus we'd better change memcg->oom_group access
>>>> with atomic operations to avoid concurrency problems.
>>>> 
>>>> Signed-off-by: Yue Zhao <findns94@gmail.com>
>>> 
>>> Hi Yue!
>>> 
>>> I'm curious, have any seen any real issues which your patch is solving?
>>> Can you, please, provide a bit more details.
>>> 
>> 
>> IMHO such details are not needed. oom_group is being accessed
>> concurrently and one of them can be a write access. At least
>> READ_ONCE/WRITE_ONCE is needed here.
> 
> Needed for what?
> 
> I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
> 
> My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.

I think there is no difference in the assembly code between them in most
cases. The only intention that I can think of is to avoid the potential
complaint (data race) emitted by KCSAN.

> 
> Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE().

If we decide to fix, then we should fix all.

Thanks.

> 
> Thanks!
Michal Hocko Feb. 21, 2023, 8:26 a.m. UTC | #6
On Mon 20-02-23 23:06:24, Shakeel Butt wrote:
> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > The knob for cgroup v2 memory controller: memory.oom.group
> > > will be read and written simultaneously by user space
> > > programs, thus we'd better change memcg->oom_group access
> > > with atomic operations to avoid concurrency problems.
> > > 
> > > Signed-off-by: Yue Zhao <findns94@gmail.com>
> > 
> > Hi Yue!
> > 
> > I'm curious, have any seen any real issues which your patch is solving?
> > Can you, please, provide a bit more details.
> > 
> 
> IMHO such details are not needed. oom_group is being accessed
> concurrently and one of them can be a write access. At least
> READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't
> catch this race because it does not know about the memory.oom.group
> interface.

I do agree with Roman here. It is _always_ good to mention whether this
is a tool/review or actual bug triggered fix. Also {READ,WRITE}_ONCE doesn't
guarantee atomicity so it would be good to rephrase the changelog.
Something like:
The knob for cgroup v2 memory controller: memory.oom.group
is not protected by any locking so it can be modified while it is used.
This is not an actual problem because races are unlikely (the knob is
usually configured long before any workloads hits actual memcg oom)
but it is better to use READ_ONCE/WRITE_ONCE to prevent compiler from
doing anything funky.

This patch is not fixing any actual user visible bug but it is in line
of a standard practice.
Matthew Wilcox Feb. 21, 2023, 1:51 p.m. UTC | #7
On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > >>> will be read and written simultaneously by user space
> > >>> programs, thus we'd better change memcg->oom_group access
> > >>> with atomic operations to avoid concurrency problems.
> > >>>
> > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > >>
> > >> Hi Yue!
> > >>
> > >> I'm curious, have any seen any real issues which your patch is solving?
> > >> Can you, please, provide a bit more details.
> > >>
> > >
> > > IMHO such details are not needed. oom_group is being accessed
> > > concurrently and one of them can be a write access. At least
> > > READ_ONCE/WRITE_ONCE is needed here.
> >
> > Needed for what?
> 
> For this particular case, documenting such an access. Though I don't
> think there are any architectures which may tear a one byte read/write
> and merging/refetching is not an issue for this.

Wouldn't a compiler be within its rights to implement a one byte store as:

	load-word
	modify-byte-in-word
	store-word

and if this is a lockless store to a word which has an adjacent byte also
being modified by another CPU, one of those CPUs can lose its store?
And WRITE_ONCE would prevent the compiler from implementing the store
in that way.
Shakeel Butt Feb. 21, 2023, 4:56 p.m. UTC | #8
+Paul & Marco

On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > >
> > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > >>> will be read and written simultaneously by user space
> > > >>> programs, thus we'd better change memcg->oom_group access
> > > >>> with atomic operations to avoid concurrency problems.
> > > >>>
> > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > >>
> > > >> Hi Yue!
> > > >>
> > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > >> Can you, please, provide a bit more details.
> > > >>
> > > >
> > > > IMHO such details are not needed. oom_group is being accessed
> > > > concurrently and one of them can be a write access. At least
> > > > READ_ONCE/WRITE_ONCE is needed here.
> > >
> > > Needed for what?
> >
> > For this particular case, documenting such an access. Though I don't
> > think there are any architectures which may tear a one byte read/write
> > and merging/refetching is not an issue for this.
>
> Wouldn't a compiler be within its rights to implement a one byte store as:
>
>         load-word
>         modify-byte-in-word
>         store-word
>
> and if this is a lockless store to a word which has an adjacent byte also
> being modified by another CPU, one of those CPUs can lose its store?
> And WRITE_ONCE would prevent the compiler from implementing the store
> in that way.
>

Thanks Willy for pointing this out. If the compiler can really do this
then [READ|WRITE]_ONCE are required here. I always have big bad
compiler lwn article open in a tab. I couldn't map this transformation
to ones mentioned in that article. Do we have name of this one?

thanks,
Shakeel
Yue Zhao Feb. 21, 2023, 5 p.m. UTC | #9
On Tue, Feb 21, 2023 at 4:26 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Mon 20-02-23 23:06:24, Shakeel Butt wrote:
> > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > The knob for cgroup v2 memory controller: memory.oom.group
> > > > will be read and written simultaneously by user space
> > > > programs, thus we'd better change memcg->oom_group access
> > > > with atomic operations to avoid concurrency problems.
> > > >
> > > > Signed-off-by: Yue Zhao <findns94@gmail.com>
> > >
> > > Hi Yue!
> > >
> > > I'm curious, have any seen any real issues which your patch is solving?
> > > Can you, please, provide a bit more details.
> > >
> >
> > IMHO such details are not needed. oom_group is being accessed
> > concurrently and one of them can be a write access. At least
> > READ_ONCE/WRITE_ONCE is needed here. Most probably syzbot didn't
> > catch this race because it does not know about the memory.oom.group
> > interface.
>
> I do agree with Roman here. It is _always_ good to mention whether this
> is a tool/review or actual bug triggered fix. Also {READ,WRITE}_ONCE doesn't
> guarantee atomicity so it would be good to rephrase the changelog.
> Something like:
> The knob for cgroup v2 memory controller: memory.oom.group
> is not protected by any locking so it can be modified while it is used.
> This is not an actual problem because races are unlikely (the knob is
> usually configured long before any workloads hits actual memcg oom)
> but it is better to use READ_ONCE/WRITE_ONCE to prevent compiler from
> doing anything funky.

Thanks a lot, I will rephrase and update my patch later.

>
> This patch is not fixing any actual user visible bug but it is in line
> of a standard practice.
> --
> Michal Hocko
> SUSE Labs
Yue Zhao Feb. 21, 2023, 5 p.m. UTC | #10
On Tue, Feb 21, 2023 at 2:52 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >
> > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > >>> will be read and written simultaneously by user space
> > >>> programs, thus we'd better change memcg->oom_group access
> > >>> with atomic operations to avoid concurrency problems.
> > >>>
> > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > >>
> > >> Hi Yue!
> > >>
> > >> I'm curious, have any seen any real issues which your patch is solving?
> > >> Can you, please, provide a bit more details.
> > >>
> > >
> > > IMHO such details are not needed. oom_group is being accessed
> > > concurrently and one of them can be a write access. At least
> > > READ_ONCE/WRITE_ONCE is needed here.
> >
> > Needed for what?
>
> For this particular case, documenting such an access. Though I don't
> think there are any architectures which may tear a one byte read/write
> and merging/refetching is not an issue for this.
>
> >
> > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
> >
> > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.
> >
> > Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE()
>
> Yeah and those are v1 interfaces e.g. oom_kill_disable, swappiness,
> soft_limit. These definitely need [READ|WRITE]_ONCE primitive.
>
> Yue, can you update your patch and convert all accesses to these
> fields through [READ|WRITE]_ONCE ?

Sure, it will take some time to update my patch later. I think most of
the accesses use [READ|WRITE]_ONCE already, only few interfaces need to
update.
Yue Zhao Feb. 21, 2023, 5 p.m. UTC | #11
On Tue, Feb 21, 2023 at 1:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> >>> The knob for cgroup v2 memory controller: memory.oom.group
> >>> will be read and written simultaneously by user space
> >>> programs, thus we'd better change memcg->oom_group access
> >>> with atomic operations to avoid concurrency problems.
> >>>
> >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> >>
> >> Hi Yue!
> >>
> >> I'm curious, have any seen any real issues which your patch is solving?
> >> Can you, please, provide a bit more details.
> >>
> >
> > IMHO such details are not needed. oom_group is being accessed
> > concurrently and one of them can be a write access. At least
> > READ_ONCE/WRITE_ONCE is needed here.
>
> Needed for what?
>
> I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
>
> My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.

Thanks for your watching!
This topic is found in code review by coincidence, so no real issues
recorded for now. I checked other read/write callbacks about other knobs,
most of them use READ_ONCE/WRITE_ONCE on the user setting variable.
Actually I am curious as well why this interface doesn't use
READ_ONCE/WRITE_ONCE, is there any other synchronization mechanism I
didn't notice yet?

>
> Also there are other similar cgroup interfaces without READ_ONCE()/WRITE_ONCE().
>
> Thanks!
Roman Gushchin Feb. 21, 2023, 5:47 p.m. UTC | #12
On Tue, Feb 21, 2023 at 01:51:29PM +0000, Matthew Wilcox wrote:
> On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > >
> > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > >>> will be read and written simultaneously by user space
> > > >>> programs, thus we'd better change memcg->oom_group access
> > > >>> with atomic operations to avoid concurrency problems.
> > > >>>
> > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > >>
> > > >> Hi Yue!
> > > >>
> > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > >> Can you, please, provide a bit more details.
> > > >>
> > > >
> > > > IMHO such details are not needed. oom_group is being accessed
> > > > concurrently and one of them can be a write access. At least
> > > > READ_ONCE/WRITE_ONCE is needed here.
> > >
> > > Needed for what?
> > 
> > For this particular case, documenting such an access. Though I don't
> > think there are any architectures which may tear a one byte read/write
> > and merging/refetching is not an issue for this.
> 
> Wouldn't a compiler be within its rights to implement a one byte store as:
> 
> 	load-word
> 	modify-byte-in-word
> 	store-word
> 
> and if this is a lockless store to a word which has an adjacent byte also
> being modified by another CPU, one of those CPUs can lose its store?
> And WRITE_ONCE would prevent the compiler from implementing the store
> in that way.

Even then it's not an issue in this case, as we end up with either 0 or 1,
I don't see how we can screw things up here.

Thanks!
Roman Gushchin Feb. 21, 2023, 5:48 p.m. UTC | #13
On Tue, Feb 21, 2023 at 03:22:32PM +0800, Muchun Song wrote:
> 
> 
> > On Feb 21, 2023, at 13:17, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > 
> >> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> >> 
> >> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> >>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> >>>> The knob for cgroup v2 memory controller: memory.oom.group
> >>>> will be read and written simultaneously by user space
> >>>> programs, thus we'd better change memcg->oom_group access
> >>>> with atomic operations to avoid concurrency problems.
> >>>> 
> >>>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> >>> 
> >>> Hi Yue!
> >>> 
> >>> I'm curious, have any seen any real issues which your patch is solving?
> >>> Can you, please, provide a bit more details.
> >>> 
> >> 
> >> IMHO such details are not needed. oom_group is being accessed
> >> concurrently and one of them can be a write access. At least
> >> READ_ONCE/WRITE_ONCE is needed here.
> > 
> > Needed for what?
> > 
> > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
> > 
> > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.
> 
> I think there is no difference in the assembly code between them in most
> cases. The only intention that I can think of is to avoid the potential
> complaint (data race) emitted by KCSAN.

+1

And it might be a totally good reason for this change, let's just make it clear,
instead of pretending to fix non-existing concurrency problems.

Thanks!
Roman Gushchin Feb. 21, 2023, 6:02 p.m. UTC | #14
On Wed, Feb 22, 2023 at 01:00:00AM +0800, Martin Zhao wrote:
> On Tue, Feb 21, 2023 at 1:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >
> > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > >>> will be read and written simultaneously by user space
> > >>> programs, thus we'd better change memcg->oom_group access
> > >>> with atomic operations to avoid concurrency problems.
> > >>>
> > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > >>
> > >> Hi Yue!
> > >>
> > >> I'm curious, have any seen any real issues which your patch is solving?
> > >> Can you, please, provide a bit more details.
> > >>
> > >
> > > IMHO such details are not needed. oom_group is being accessed
> > > concurrently and one of them can be a write access. At least
> > > READ_ONCE/WRITE_ONCE is needed here.
> >
> > Needed for what?
> >
> > I mean it’s obviously not a big deal to put READ_ONCE()/WRITE_ONCE() here, but I struggle to imagine a scenario when it will make any difference. IMHO it’s easier to justify a proper atomic operation here, even if it’s most likely an overkill.
> >
> > My question is very simple: the commit log mentions “… to avoid concurrency problems”, so I wonder what problems are these.
> 
> Thanks for your watching!
> This topic is found in code review by coincidence, so no real issues
> recorded for now. I checked other read/write callbacks about other knobs,
> most of them use READ_ONCE/WRITE_ONCE on the user setting variable.

Sorry, which knobs are you talking about? I actually don't see any user knobs
in mm/memcontrol.c which are using WRITE_ONCE(). I see some of them using
xchg(), but it's a different thing.

> Actually I am curious as well why this interface doesn't use
> READ_ONCE/WRITE_ONCE, is there any other synchronization mechanism I
> didn't notice yet?

Because nobody saw any issues with the current code?

And again if it's something that makes any automated verifiers/tooling unhappy,
I'm totally fine for fixing it, just let make it clear (and also fix the commit
title, which is not true).

Thanks!
Shakeel Butt Feb. 21, 2023, 6:15 p.m. UTC | #15
On Tue, Feb 21, 2023 at 9:47 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> On Tue, Feb 21, 2023 at 01:51:29PM +0000, Matthew Wilcox wrote:
> > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > >
> > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > >>> will be read and written simultaneously by user space
> > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > >>> with atomic operations to avoid concurrency problems.
> > > > >>>
> > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > >>
> > > > >> Hi Yue!
> > > > >>
> > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > >> Can you, please, provide a bit more details.
> > > > >>
> > > > >
> > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > concurrently and one of them can be a write access. At least
> > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > >
> > > > Needed for what?
> > >
> > > For this particular case, documenting such an access. Though I don't
> > > think there are any architectures which may tear a one byte read/write
> > > and merging/refetching is not an issue for this.
> >
> > Wouldn't a compiler be within its rights to implement a one byte store as:
> >
> >       load-word
> >       modify-byte-in-word
> >       store-word
> >
> > and if this is a lockless store to a word which has an adjacent byte also
> > being modified by another CPU, one of those CPUs can lose its store?
> > And WRITE_ONCE would prevent the compiler from implementing the store
> > in that way.
>
> Even then it's not an issue in this case, as we end up with either 0 or 1,
> I don't see how we can screw things up here.
>

What do you mean by this is not an issue in this case? Yes, the
oom_group usage will be ok but we can not say anything about the
adjacent byte/fields.
Matthew Wilcox Feb. 21, 2023, 6:18 p.m. UTC | #16
On Tue, Feb 21, 2023 at 09:47:05AM -0800, Roman Gushchin wrote:
> > Wouldn't a compiler be within its rights to implement a one byte store as:
> > 
> > 	load-word
> > 	modify-byte-in-word
> > 	store-word
> > 
> > and if this is a lockless store to a word which has an adjacent byte also
> > being modified by another CPU, one of those CPUs can lose its store?
> > And WRITE_ONCE would prevent the compiler from implementing the store
> > in that way.
> 
> Even then it's not an issue in this case, as we end up with either 0 or 1,
> I don't see how we can screw things up here.

Thread 1:
	load word containing oom_group and oom_lock

Thread 2:
	store to oom_lock

Thread 1:
	store word containing oom_group and oom_lock

Thread 2's store has been lost.
Paul E. McKenney Feb. 21, 2023, 6:23 p.m. UTC | #17
On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> +Paul & Marco
> 
> On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > >
> > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > >>> will be read and written simultaneously by user space
> > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > >>> with atomic operations to avoid concurrency problems.
> > > > >>>
> > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > >>
> > > > >> Hi Yue!
> > > > >>
> > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > >> Can you, please, provide a bit more details.
> > > > >>
> > > > >
> > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > concurrently and one of them can be a write access. At least
> > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > >
> > > > Needed for what?
> > >
> > > For this particular case, documenting such an access. Though I don't
> > > think there are any architectures which may tear a one byte read/write
> > > and merging/refetching is not an issue for this.
> >
> > Wouldn't a compiler be within its rights to implement a one byte store as:
> >
> >         load-word
> >         modify-byte-in-word
> >         store-word
> >
> > and if this is a lockless store to a word which has an adjacent byte also
> > being modified by another CPU, one of those CPUs can lose its store?
> > And WRITE_ONCE would prevent the compiler from implementing the store
> > in that way.
> 
> Thanks Willy for pointing this out. If the compiler can really do this
> then [READ|WRITE]_ONCE are required here. I always have big bad
> compiler lwn article open in a tab. I couldn't map this transformation
> to ones mentioned in that article. Do we have name of this one?

No, recent compilers are absolutely forbidden from doing this sort of
thing except under very special circumstances.

Before C11, compilers could and in fact did do things like this.  This is
after all a great way to keep the CPU's vector unit from getting bored.
Unfortunately for those who prize optimization above all else, doing
this can introduce data races, for example:

	char a;
	char b;
	spin_lock la;
	spin_lock lb;

	void change_a(char new_a)
	{
		spin_lock(&la);
		a = new_a;
		spin_unlock(&la);
	}

	void change_b(char new_b)
	{
		spin_lock(&lb);
		b = new_b;
		spin_unlock(&lb);
	}

If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
read-modify-write sequence, it would be introducing a data race.
And since C11, the compiler is absolutely forbidden from introducing
data races.  So, again, no, the compiler cannot invent writes to
variables.

What are those very special circumstances?

1.	The other variables were going to be written to anyway, and
	none of the writes was non-volatile and there was no ordering
	directive between any of those writes.

2.	The other variables are dead, as in there are no subsequent
	reads from them anywhere in the program.  Of course in that case,
	there is no need to read the prior values of those variables.

3.	All accesses to all of the variables are visible to the compiler,
	and the compiler can prove that there are no concurrent accesses
	to any of them.  For example, all of the variables are on-stack
	variables whose addresses are never taken.

Does that help, or am I misunderstanding the question?

							Thanx, Paul
Roman Gushchin Feb. 21, 2023, 10:23 p.m. UTC | #18
On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > +Paul & Marco
> > 
> > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > >
> > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > >>> will be read and written simultaneously by user space
> > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > >>>
> > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > >>
> > > > > >> Hi Yue!
> > > > > >>
> > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > >> Can you, please, provide a bit more details.
> > > > > >>
> > > > > >
> > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > concurrently and one of them can be a write access. At least
> > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > >
> > > > > Needed for what?
> > > >
> > > > For this particular case, documenting such an access. Though I don't
> > > > think there are any architectures which may tear a one byte read/write
> > > > and merging/refetching is not an issue for this.
> > >
> > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > >
> > >         load-word
> > >         modify-byte-in-word
> > >         store-word
> > >
> > > and if this is a lockless store to a word which has an adjacent byte also
> > > being modified by another CPU, one of those CPUs can lose its store?
> > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > in that way.
> > 
> > Thanks Willy for pointing this out. If the compiler can really do this
> > then [READ|WRITE]_ONCE are required here. I always have big bad
> > compiler lwn article open in a tab. I couldn't map this transformation
> > to ones mentioned in that article. Do we have name of this one?
> 
> No, recent compilers are absolutely forbidden from doing this sort of
> thing except under very special circumstances.
> 
> Before C11, compilers could and in fact did do things like this.  This is
> after all a great way to keep the CPU's vector unit from getting bored.
> Unfortunately for those who prize optimization above all else, doing
> this can introduce data races, for example:
> 
> 	char a;
> 	char b;
> 	spin_lock la;
> 	spin_lock lb;
> 
> 	void change_a(char new_a)
> 	{
> 		spin_lock(&la);
> 		a = new_a;
> 		spin_unlock(&la);
> 	}
> 
> 	void change_b(char new_b)
> 	{
> 		spin_lock(&lb);
> 		b = new_b;
> 		spin_unlock(&lb);
> 	}
> 
> If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> read-modify-write sequence, it would be introducing a data race.
> And since C11, the compiler is absolutely forbidden from introducing
> data races.  So, again, no, the compiler cannot invent writes to
> variables.
> 
> What are those very special circumstances?
> 
> 1.	The other variables were going to be written to anyway, and
> 	none of the writes was non-volatile and there was no ordering
> 	directive between any of those writes.
> 
> 2.	The other variables are dead, as in there are no subsequent
> 	reads from them anywhere in the program.  Of course in that case,
> 	there is no need to read the prior values of those variables.
> 
> 3.	All accesses to all of the variables are visible to the compiler,
> 	and the compiler can prove that there are no concurrent accesses
> 	to any of them.  For example, all of the variables are on-stack
> 	variables whose addresses are never taken.
> 
> Does that help, or am I misunderstanding the question?

Thank you, Paul!

So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
Or I still miss something?

Thanks!
Paul E. McKenney Feb. 21, 2023, 10:38 p.m. UTC | #19
On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
> On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > > +Paul & Marco
> > > 
> > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > > >>> will be read and written simultaneously by user space
> > > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > > >>>
> > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > > >>
> > > > > > >> Hi Yue!
> > > > > > >>
> > > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > > >> Can you, please, provide a bit more details.
> > > > > > >>
> > > > > > >
> > > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > > concurrently and one of them can be a write access. At least
> > > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > > >
> > > > > > Needed for what?
> > > > >
> > > > > For this particular case, documenting such an access. Though I don't
> > > > > think there are any architectures which may tear a one byte read/write
> > > > > and merging/refetching is not an issue for this.
> > > >
> > > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > > >
> > > >         load-word
> > > >         modify-byte-in-word
> > > >         store-word
> > > >
> > > > and if this is a lockless store to a word which has an adjacent byte also
> > > > being modified by another CPU, one of those CPUs can lose its store?
> > > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > > in that way.
> > > 
> > > Thanks Willy for pointing this out. If the compiler can really do this
> > > then [READ|WRITE]_ONCE are required here. I always have big bad
> > > compiler lwn article open in a tab. I couldn't map this transformation
> > > to ones mentioned in that article. Do we have name of this one?
> > 
> > No, recent compilers are absolutely forbidden from doing this sort of
> > thing except under very special circumstances.
> > 
> > Before C11, compilers could and in fact did do things like this.  This is
> > after all a great way to keep the CPU's vector unit from getting bored.
> > Unfortunately for those who prize optimization above all else, doing
> > this can introduce data races, for example:
> > 
> > 	char a;
> > 	char b;
> > 	spin_lock la;
> > 	spin_lock lb;
> > 
> > 	void change_a(char new_a)
> > 	{
> > 		spin_lock(&la);
> > 		a = new_a;
> > 		spin_unlock(&la);
> > 	}
> > 
> > 	void change_b(char new_b)
> > 	{
> > 		spin_lock(&lb);
> > 		b = new_b;
> > 		spin_unlock(&lb);
> > 	}
> > 
> > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> > read-modify-write sequence, it would be introducing a data race.
> > And since C11, the compiler is absolutely forbidden from introducing
> > data races.  So, again, no, the compiler cannot invent writes to
> > variables.
> > 
> > What are those very special circumstances?
> > 
> > 1.	The other variables were going to be written to anyway, and
> > 	none of the writes was non-volatile and there was no ordering
> > 	directive between any of those writes.
> > 
> > 2.	The other variables are dead, as in there are no subsequent
> > 	reads from them anywhere in the program.  Of course in that case,
> > 	there is no need to read the prior values of those variables.
> > 
> > 3.	All accesses to all of the variables are visible to the compiler,
> > 	and the compiler can prove that there are no concurrent accesses
> > 	to any of them.  For example, all of the variables are on-stack
> > 	variables whose addresses are never taken.
> > 
> > Does that help, or am I misunderstanding the question?
> 
> Thank you, Paul!
> 
> So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
> Or I still miss something?

Yes, given that the compiler will already avoid inventing data-race-prone
C-language accesses to shared variables, so if that was the only reason
that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
WRITE_ONCE() won't be helping you.

Or perhaps better to put it a different way...  The fact that the compiler
is not permitted to invent data-racy reads and writes is exactly why
you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
lock-based critical sections.  Instead, you only need READ_ONCE() and
WRITE_ONCE() when you have lockless accesses to the same shared variables.

							Thanx, Paul
Shakeel Butt Feb. 21, 2023, 11:13 p.m. UTC | #20
On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
> > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > > > +Paul & Marco
> > > >
> > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > >
> > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > > > >>> will be read and written simultaneously by user space
> > > > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > > > >>>
> > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > > > >>
> > > > > > > >> Hi Yue!
> > > > > > > >>
> > > > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > > > >> Can you, please, provide a bit more details.
> > > > > > > >>
> > > > > > > >
> > > > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > > > concurrently and one of them can be a write access. At least
> > > > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > > > >
> > > > > > > Needed for what?
> > > > > >
> > > > > > For this particular case, documenting such an access. Though I don't
> > > > > > think there are any architectures which may tear a one byte read/write
> > > > > > and merging/refetching is not an issue for this.
> > > > >
> > > > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > > > >
> > > > >         load-word
> > > > >         modify-byte-in-word
> > > > >         store-word
> > > > >
> > > > > and if this is a lockless store to a word which has an adjacent byte also
> > > > > being modified by another CPU, one of those CPUs can lose its store?
> > > > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > > > in that way.
> > > >
> > > > Thanks Willy for pointing this out. If the compiler can really do this
> > > > then [READ|WRITE]_ONCE are required here. I always have big bad
> > > > compiler lwn article open in a tab. I couldn't map this transformation
> > > > to ones mentioned in that article. Do we have name of this one?
> > >
> > > No, recent compilers are absolutely forbidden from doing this sort of
> > > thing except under very special circumstances.
> > >
> > > Before C11, compilers could and in fact did do things like this.  This is
> > > after all a great way to keep the CPU's vector unit from getting bored.
> > > Unfortunately for those who prize optimization above all else, doing
> > > this can introduce data races, for example:
> > >
> > >     char a;
> > >     char b;
> > >     spin_lock la;
> > >     spin_lock lb;
> > >
> > >     void change_a(char new_a)
> > >     {
> > >             spin_lock(&la);
> > >             a = new_a;
> > >             spin_unlock(&la);
> > >     }
> > >
> > >     void change_b(char new_b)
> > >     {
> > >             spin_lock(&lb);
> > >             b = new_b;
> > >             spin_unlock(&lb);
> > >     }
> > >
> > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> > > read-modify-write sequence, it would be introducing a data race.
> > > And since C11, the compiler is absolutely forbidden from introducing
> > > data races.  So, again, no, the compiler cannot invent writes to
> > > variables.
> > >
> > > What are those very special circumstances?
> > >
> > > 1.  The other variables were going to be written to anyway, and
> > >     none of the writes was non-volatile and there was no ordering
> > >     directive between any of those writes.
> > >
> > > 2.  The other variables are dead, as in there are no subsequent
> > >     reads from them anywhere in the program.  Of course in that case,
> > >     there is no need to read the prior values of those variables.
> > >
> > > 3.  All accesses to all of the variables are visible to the compiler,
> > >     and the compiler can prove that there are no concurrent accesses
> > >     to any of them.  For example, all of the variables are on-stack
> > >     variables whose addresses are never taken.
> > >
> > > Does that help, or am I misunderstanding the question?
> >
> > Thank you, Paul!
> >
> > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
> > Or I still miss something?
>
> Yes, given that the compiler will already avoid inventing data-race-prone
> C-language accesses to shared variables, so if that was the only reason
> that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
> WRITE_ONCE() won't be helping you.
>
> Or perhaps better to put it a different way...  The fact that the compiler
> is not permitted to invent data-racy reads and writes is exactly why
> you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
> lock-based critical sections.  Instead, you only need READ_ONCE() and
> WRITE_ONCE() when you have lockless accesses to the same shared variables.
>

This is lockless access to memcg->oom_group potentially from multiple
CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right?
Paul E. McKenney Feb. 21, 2023, 11:38 p.m. UTC | #21
On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote:
> On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
> > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > > > > +Paul & Marco
> > > > >
> > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > >
> > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > > > > >>> will be read and written simultaneously by user space
> > > > > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > > > > >>>
> > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > > > > >>
> > > > > > > > >> Hi Yue!
> > > > > > > > >>
> > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > > > > >> Can you, please, provide a bit more details.
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > > > > concurrently and one of them can be a write access. At least
> > > > > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > > > > >
> > > > > > > > Needed for what?
> > > > > > >
> > > > > > > For this particular case, documenting such an access. Though I don't
> > > > > > > think there are any architectures which may tear a one byte read/write
> > > > > > > and merging/refetching is not an issue for this.
> > > > > >
> > > > > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > > > > >
> > > > > >         load-word
> > > > > >         modify-byte-in-word
> > > > > >         store-word
> > > > > >
> > > > > > and if this is a lockless store to a word which has an adjacent byte also
> > > > > > being modified by another CPU, one of those CPUs can lose its store?
> > > > > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > > > > in that way.
> > > > >
> > > > > Thanks Willy for pointing this out. If the compiler can really do this
> > > > > then [READ|WRITE]_ONCE are required here. I always have big bad
> > > > > compiler lwn article open in a tab. I couldn't map this transformation
> > > > > to ones mentioned in that article. Do we have name of this one?
> > > >
> > > > No, recent compilers are absolutely forbidden from doing this sort of
> > > > thing except under very special circumstances.
> > > >
> > > > Before C11, compilers could and in fact did do things like this.  This is
> > > > after all a great way to keep the CPU's vector unit from getting bored.
> > > > Unfortunately for those who prize optimization above all else, doing
> > > > this can introduce data races, for example:
> > > >
> > > >     char a;
> > > >     char b;
> > > >     spin_lock la;
> > > >     spin_lock lb;
> > > >
> > > >     void change_a(char new_a)
> > > >     {
> > > >             spin_lock(&la);
> > > >             a = new_a;
> > > >             spin_unlock(&la);
> > > >     }
> > > >
> > > >     void change_b(char new_b)
> > > >     {
> > > >             spin_lock(&lb);
> > > >             b = new_b;
> > > >             spin_unlock(&lb);
> > > >     }
> > > >
> > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> > > > read-modify-write sequence, it would be introducing a data race.
> > > > And since C11, the compiler is absolutely forbidden from introducing
> > > > data races.  So, again, no, the compiler cannot invent writes to
> > > > variables.
> > > >
> > > > What are those very special circumstances?
> > > >
> > > > 1.  The other variables were going to be written to anyway, and
> > > >     none of the writes was non-volatile and there was no ordering
> > > >     directive between any of those writes.
> > > >
> > > > 2.  The other variables are dead, as in there are no subsequent
> > > >     reads from them anywhere in the program.  Of course in that case,
> > > >     there is no need to read the prior values of those variables.
> > > >
> > > > 3.  All accesses to all of the variables are visible to the compiler,
> > > >     and the compiler can prove that there are no concurrent accesses
> > > >     to any of them.  For example, all of the variables are on-stack
> > > >     variables whose addresses are never taken.
> > > >
> > > > Does that help, or am I misunderstanding the question?
> > >
> > > Thank you, Paul!
> > >
> > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
> > > Or I still miss something?
> >
> > Yes, given that the compiler will already avoid inventing data-race-prone
> > C-language accesses to shared variables, so if that was the only reason
> > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
> > WRITE_ONCE() won't be helping you.
> >
> > Or perhaps better to put it a different way...  The fact that the compiler
> > is not permitted to invent data-racy reads and writes is exactly why
> > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
> > lock-based critical sections.  Instead, you only need READ_ONCE() and
> > WRITE_ONCE() when you have lockless accesses to the same shared variables.
> 
> This is lockless access to memcg->oom_group potentially from multiple
> CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right?

Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE().
And if either conflicting access is lockless, it is lockless.  ;-)

							Thanx, Paul
Roman Gushchin Feb. 21, 2023, 11:57 p.m. UTC | #22
On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote:
> > On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
> > > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > > > > > +Paul & Marco
> > > > > >
> > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > > > > > >>> will be read and written simultaneously by user space
> > > > > > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > > > > > >>>
> > > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > > > > > >>
> > > > > > > > > >> Hi Yue!
> > > > > > > > > >>
> > > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > > > > > >> Can you, please, provide a bit more details.
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > > > > > concurrently and one of them can be a write access. At least
> > > > > > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > > > > > >
> > > > > > > > > Needed for what?
> > > > > > > >
> > > > > > > > For this particular case, documenting such an access. Though I don't
> > > > > > > > think there are any architectures which may tear a one byte read/write
> > > > > > > > and merging/refetching is not an issue for this.
> > > > > > >
> > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > > > > > >
> > > > > > >         load-word
> > > > > > >         modify-byte-in-word
> > > > > > >         store-word
> > > > > > >
> > > > > > > and if this is a lockless store to a word which has an adjacent byte also
> > > > > > > being modified by another CPU, one of those CPUs can lose its store?
> > > > > > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > > > > > in that way.
> > > > > >
> > > > > > Thanks Willy for pointing this out. If the compiler can really do this
> > > > > > then [READ|WRITE]_ONCE are required here. I always have big bad
> > > > > > compiler lwn article open in a tab. I couldn't map this transformation
> > > > > > to ones mentioned in that article. Do we have name of this one?
> > > > >
> > > > > No, recent compilers are absolutely forbidden from doing this sort of
> > > > > thing except under very special circumstances.
> > > > >
> > > > > Before C11, compilers could and in fact did do things like this.  This is
> > > > > after all a great way to keep the CPU's vector unit from getting bored.
> > > > > Unfortunately for those who prize optimization above all else, doing
> > > > > this can introduce data races, for example:
> > > > >
> > > > >     char a;
> > > > >     char b;
> > > > >     spin_lock la;
> > > > >     spin_lock lb;
> > > > >
> > > > >     void change_a(char new_a)
> > > > >     {
> > > > >             spin_lock(&la);
> > > > >             a = new_a;
> > > > >             spin_unlock(&la);
> > > > >     }
> > > > >
> > > > >     void change_b(char new_b)
> > > > >     {
> > > > >             spin_lock(&lb);
> > > > >             b = new_b;
> > > > >             spin_unlock(&lb);
> > > > >     }
> > > > >
> > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> > > > > read-modify-write sequence, it would be introducing a data race.
> > > > > And since C11, the compiler is absolutely forbidden from introducing
> > > > > data races.  So, again, no, the compiler cannot invent writes to
> > > > > variables.
> > > > >
> > > > > What are those very special circumstances?
> > > > >
> > > > > 1.  The other variables were going to be written to anyway, and
> > > > >     none of the writes was non-volatile and there was no ordering
> > > > >     directive between any of those writes.
> > > > >
> > > > > 2.  The other variables are dead, as in there are no subsequent
> > > > >     reads from them anywhere in the program.  Of course in that case,
> > > > >     there is no need to read the prior values of those variables.
> > > > >
> > > > > 3.  All accesses to all of the variables are visible to the compiler,
> > > > >     and the compiler can prove that there are no concurrent accesses
> > > > >     to any of them.  For example, all of the variables are on-stack
> > > > >     variables whose addresses are never taken.
> > > > >
> > > > > Does that help, or am I misunderstanding the question?
> > > >
> > > > Thank you, Paul!
> > > >
> > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
> > > > Or I still miss something?
> > >
> > > Yes, given that the compiler will already avoid inventing data-race-prone
> > > C-language accesses to shared variables, so if that was the only reason
> > > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
> > > WRITE_ONCE() won't be helping you.
> > >
> > > Or perhaps better to put it a different way...  The fact that the compiler
> > > is not permitted to invent data-racy reads and writes is exactly why
> > > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
> > > lock-based critical sections.  Instead, you only need READ_ONCE() and
> > > WRITE_ONCE() when you have lockless accesses to the same shared variables.
> > 
> > This is lockless access to memcg->oom_group potentially from multiple
> > CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right?
> 
> Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE().
> And if either conflicting access is lockless, it is lockless.  ;-)

Now I'm confused, why we should use it here?
Writing is happening from a separate syscall (a single write from a syscall),
reading is happening from a oom context. The variable is boolean, it's either
0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here?
Thanks!
Paul E. McKenney Feb. 22, 2023, 12:37 a.m. UTC | #23
On Tue, Feb 21, 2023 at 03:57:58PM -0800, Roman Gushchin wrote:
> On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote:
> > > On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
> > > > > On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
> > > > > > On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
> > > > > > > +Paul & Marco
> > > > > > >
> > > > > > > On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > > > >
> > > > > > > > On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
> > > > > > > > > On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > > > > > > > > > On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
> > > > > > > > > > >>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
> > > > > > > > > > >>> The knob for cgroup v2 memory controller: memory.oom.group
> > > > > > > > > > >>> will be read and written simultaneously by user space
> > > > > > > > > > >>> programs, thus we'd better change memcg->oom_group access
> > > > > > > > > > >>> with atomic operations to avoid concurrency problems.
> > > > > > > > > > >>>
> > > > > > > > > > >>> Signed-off-by: Yue Zhao <findns94@gmail.com>
> > > > > > > > > > >>
> > > > > > > > > > >> Hi Yue!
> > > > > > > > > > >>
> > > > > > > > > > >> I'm curious, have any seen any real issues which your patch is solving?
> > > > > > > > > > >> Can you, please, provide a bit more details.
> > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > > > IMHO such details are not needed. oom_group is being accessed
> > > > > > > > > > > concurrently and one of them can be a write access. At least
> > > > > > > > > > > READ_ONCE/WRITE_ONCE is needed here.
> > > > > > > > > >
> > > > > > > > > > Needed for what?
> > > > > > > > >
> > > > > > > > > For this particular case, documenting such an access. Though I don't
> > > > > > > > > think there are any architectures which may tear a one byte read/write
> > > > > > > > > and merging/refetching is not an issue for this.
> > > > > > > >
> > > > > > > > Wouldn't a compiler be within its rights to implement a one byte store as:
> > > > > > > >
> > > > > > > >         load-word
> > > > > > > >         modify-byte-in-word
> > > > > > > >         store-word
> > > > > > > >
> > > > > > > > and if this is a lockless store to a word which has an adjacent byte also
> > > > > > > > being modified by another CPU, one of those CPUs can lose its store?
> > > > > > > > And WRITE_ONCE would prevent the compiler from implementing the store
> > > > > > > > in that way.
> > > > > > >
> > > > > > > Thanks Willy for pointing this out. If the compiler can really do this
> > > > > > > then [READ|WRITE]_ONCE are required here. I always have big bad
> > > > > > > compiler lwn article open in a tab. I couldn't map this transformation
> > > > > > > to ones mentioned in that article. Do we have name of this one?
> > > > > >
> > > > > > No, recent compilers are absolutely forbidden from doing this sort of
> > > > > > thing except under very special circumstances.
> > > > > >
> > > > > > Before C11, compilers could and in fact did do things like this.  This is
> > > > > > after all a great way to keep the CPU's vector unit from getting bored.
> > > > > > Unfortunately for those who prize optimization above all else, doing
> > > > > > this can introduce data races, for example:
> > > > > >
> > > > > >     char a;
> > > > > >     char b;
> > > > > >     spin_lock la;
> > > > > >     spin_lock lb;
> > > > > >
> > > > > >     void change_a(char new_a)
> > > > > >     {
> > > > > >             spin_lock(&la);
> > > > > >             a = new_a;
> > > > > >             spin_unlock(&la);
> > > > > >     }
> > > > > >
> > > > > >     void change_b(char new_b)
> > > > > >     {
> > > > > >             spin_lock(&lb);
> > > > > >             b = new_b;
> > > > > >             spin_unlock(&lb);
> > > > > >     }
> > > > > >
> > > > > > If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
> > > > > > read-modify-write sequence, it would be introducing a data race.
> > > > > > And since C11, the compiler is absolutely forbidden from introducing
> > > > > > data races.  So, again, no, the compiler cannot invent writes to
> > > > > > variables.
> > > > > >
> > > > > > What are those very special circumstances?
> > > > > >
> > > > > > 1.  The other variables were going to be written to anyway, and
> > > > > >     none of the writes was non-volatile and there was no ordering
> > > > > >     directive between any of those writes.
> > > > > >
> > > > > > 2.  The other variables are dead, as in there are no subsequent
> > > > > >     reads from them anywhere in the program.  Of course in that case,
> > > > > >     there is no need to read the prior values of those variables.
> > > > > >
> > > > > > 3.  All accesses to all of the variables are visible to the compiler,
> > > > > >     and the compiler can prove that there are no concurrent accesses
> > > > > >     to any of them.  For example, all of the variables are on-stack
> > > > > >     variables whose addresses are never taken.
> > > > > >
> > > > > > Does that help, or am I misunderstanding the question?
> > > > >
> > > > > Thank you, Paul!
> > > > >
> > > > > So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
> > > > > Or I still miss something?
> > > >
> > > > Yes, given that the compiler will already avoid inventing data-race-prone
> > > > C-language accesses to shared variables, so if that was the only reason
> > > > that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
> > > > WRITE_ONCE() won't be helping you.
> > > >
> > > > Or perhaps better to put it a different way...  The fact that the compiler
> > > > is not permitted to invent data-racy reads and writes is exactly why
> > > > you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
> > > > lock-based critical sections.  Instead, you only need READ_ONCE() and
> > > > WRITE_ONCE() when you have lockless accesses to the same shared variables.
> > > 
> > > This is lockless access to memcg->oom_group potentially from multiple
> > > CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right?
> > 
> > Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE().
> > And if either conflicting access is lockless, it is lockless.  ;-)
> 
> Now I'm confused, why we should use it here?
> Writing is happening from a separate syscall (a single write from a syscall),
> reading is happening from a oom context. The variable is boolean, it's either
> 0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here?
> Thanks!

In practice, not much difference other than documenting shared accesses.
Which can be valuable.

In theory, when you do a normal C-language store, the compiler is within
its rights to use the variable for temporary storage between the time
of the last read from that variable and the next write to that variable.
Back to practice, I have not heard of this happening for shared variables.
On the other hand, compilers really do this for on-stack variables whose
addresses are not taken, which is one of the reasons that gdb might say
that the variable is optimized out when you try to look at its value.

So the potential is there, and if it was my code, I would therefore use
READ_ONCE() and WRITE_ONCE().

							Thanx, Paul
Roman Gushchin Feb. 22, 2023, 4:28 a.m. UTC | #24
> On Feb 21, 2023, at 4:38 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Tue, Feb 21, 2023 at 03:57:58PM -0800, Roman Gushchin wrote:
>>> On Tue, Feb 21, 2023 at 03:38:24PM -0800, Paul E. McKenney wrote:
>>> On Tue, Feb 21, 2023 at 03:13:36PM -0800, Shakeel Butt wrote:
>>>> On Tue, Feb 21, 2023 at 2:38 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>>>>> 
>>>>> On Tue, Feb 21, 2023 at 02:23:31PM -0800, Roman Gushchin wrote:
>>>>>> On Tue, Feb 21, 2023 at 10:23:59AM -0800, Paul E. McKenney wrote:
>>>>>>> On Tue, Feb 21, 2023 at 08:56:59AM -0800, Shakeel Butt wrote:
>>>>>>>> +Paul & Marco
>>>>>>>> 
>>>>>>>> On Tue, Feb 21, 2023 at 5:51 AM Matthew Wilcox <willy@infradead.org> wrote:
>>>>>>>>> 
>>>>>>>>> On Mon, Feb 20, 2023 at 10:52:10PM -0800, Shakeel Butt wrote:
>>>>>>>>>> On Mon, Feb 20, 2023 at 9:17 PM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>>>>>>>>>>>> On Feb 20, 2023, at 3:06 PM, Shakeel Butt <shakeelb@google.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Feb 20, 2023 at 01:09:44PM -0800, Roman Gushchin wrote:
>>>>>>>>>>>>>> On Mon, Feb 20, 2023 at 11:16:38PM +0800, Yue Zhao wrote:
>>>>>>>>>>>>>> The knob for cgroup v2 memory controller: memory.oom.group
>>>>>>>>>>>>>> will be read and written simultaneously by user space
>>>>>>>>>>>>>> programs, thus we'd better change memcg->oom_group access
>>>>>>>>>>>>>> with atomic operations to avoid concurrency problems.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Signed-off-by: Yue Zhao <findns94@gmail.com>
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Yue!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm curious, have any seen any real issues which your patch is solving?
>>>>>>>>>>>>> Can you, please, provide a bit more details.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> IMHO such details are not needed. oom_group is being accessed
>>>>>>>>>>>> concurrently and one of them can be a write access. At least
>>>>>>>>>>>> READ_ONCE/WRITE_ONCE is needed here.
>>>>>>>>>>> 
>>>>>>>>>>> Needed for what?
>>>>>>>>>> 
>>>>>>>>>> For this particular case, documenting such an access. Though I don't
>>>>>>>>>> think there are any architectures which may tear a one byte read/write
>>>>>>>>>> and merging/refetching is not an issue for this.
>>>>>>>>> 
>>>>>>>>> Wouldn't a compiler be within its rights to implement a one byte store as:
>>>>>>>>> 
>>>>>>>>>        load-word
>>>>>>>>>        modify-byte-in-word
>>>>>>>>>        store-word
>>>>>>>>> 
>>>>>>>>> and if this is a lockless store to a word which has an adjacent byte also
>>>>>>>>> being modified by another CPU, one of those CPUs can lose its store?
>>>>>>>>> And WRITE_ONCE would prevent the compiler from implementing the store
>>>>>>>>> in that way.
>>>>>>>> 
>>>>>>>> Thanks Willy for pointing this out. If the compiler can really do this
>>>>>>>> then [READ|WRITE]_ONCE are required here. I always have big bad
>>>>>>>> compiler lwn article open in a tab. I couldn't map this transformation
>>>>>>>> to ones mentioned in that article. Do we have name of this one?
>>>>>>> 
>>>>>>> No, recent compilers are absolutely forbidden from doing this sort of
>>>>>>> thing except under very special circumstances.
>>>>>>> 
>>>>>>> Before C11, compilers could and in fact did do things like this.  This is
>>>>>>> after all a great way to keep the CPU's vector unit from getting bored.
>>>>>>> Unfortunately for those who prize optimization above all else, doing
>>>>>>> this can introduce data races, for example:
>>>>>>> 
>>>>>>>    char a;
>>>>>>>    char b;
>>>>>>>    spin_lock la;
>>>>>>>    spin_lock lb;
>>>>>>> 
>>>>>>>    void change_a(char new_a)
>>>>>>>    {
>>>>>>>            spin_lock(&la);
>>>>>>>            a = new_a;
>>>>>>>            spin_unlock(&la);
>>>>>>>    }
>>>>>>> 
>>>>>>>    void change_b(char new_b)
>>>>>>>    {
>>>>>>>            spin_lock(&lb);
>>>>>>>            b = new_b;
>>>>>>>            spin_unlock(&lb);
>>>>>>>    }
>>>>>>> 
>>>>>>> If the compiler "optimized" that "a = new_a" so as to produce a non-atomic
>>>>>>> read-modify-write sequence, it would be introducing a data race.
>>>>>>> And since C11, the compiler is absolutely forbidden from introducing
>>>>>>> data races.  So, again, no, the compiler cannot invent writes to
>>>>>>> variables.
>>>>>>> 
>>>>>>> What are those very special circumstances?
>>>>>>> 
>>>>>>> 1.  The other variables were going to be written to anyway, and
>>>>>>>    none of the writes was non-volatile and there was no ordering
>>>>>>>    directive between any of those writes.
>>>>>>> 
>>>>>>> 2.  The other variables are dead, as in there are no subsequent
>>>>>>>    reads from them anywhere in the program.  Of course in that case,
>>>>>>>    there is no need to read the prior values of those variables.
>>>>>>> 
>>>>>>> 3.  All accesses to all of the variables are visible to the compiler,
>>>>>>>    and the compiler can prove that there are no concurrent accesses
>>>>>>>    to any of them.  For example, all of the variables are on-stack
>>>>>>>    variables whose addresses are never taken.
>>>>>>> 
>>>>>>> Does that help, or am I misunderstanding the question?
>>>>>> 
>>>>>> Thank you, Paul!
>>>>>> 
>>>>>> So it seems like READ_ONCE()/WRITE_ONCE() are totally useless here.
>>>>>> Or I still miss something?
>>>>> 
>>>>> Yes, given that the compiler will already avoid inventing data-race-prone
>>>>> C-language accesses to shared variables, so if that was the only reason
>>>>> that you were using READ_ONCE() or WRITE_ONCE(), then READ_ONCE() and
>>>>> WRITE_ONCE() won't be helping you.
>>>>> 
>>>>> Or perhaps better to put it a different way...  The fact that the compiler
>>>>> is not permitted to invent data-racy reads and writes is exactly why
>>>>> you do not normally need READ_ONCE() and WRITE_ONCE() for accesses in
>>>>> lock-based critical sections.  Instead, you only need READ_ONCE() and
>>>>> WRITE_ONCE() when you have lockless accesses to the same shared variables.
>>>> 
>>>> This is lockless access to memcg->oom_group potentially from multiple
>>>> CPUs, so, READ_ONCE() and WRITE_ONCE() are needed, right?
>>> 
>>> Agreed, lockless concurrent accesses should use READ_ONCE() and WRITE_ONCE().
>>> And if either conflicting access is lockless, it is lockless.  ;-)
>> 
>> Now I'm confused, why we should use it here?
>> Writing is happening from a separate syscall (a single write from a syscall),
>> reading is happening from a oom context. The variable is boolean, it's either
>> 0 or 1. What difference READ_ONCE()/WRITE_ONCE() will make here?
>> Thanks!
> 
> In practice, not much difference other than documenting shared accesses.
> Which can be valuable.
> 
> In theory, when you do a normal C-language store, the compiler is within
> its rights to use the variable for temporary storage between the time
> of the last read from that variable and the next write to that variable.
> Back to practice, I have not heard of this happening for shared variables.
> On the other hand, compilers really do this for on-stack variables whose
> addresses are not taken, which is one of the reasons that gdb might say
> that the variable is optimized out when you try to look at its value.
> 
> So the potential is there, and if it was my code, I would therefore use
> READ_ONCE() and WRITE_ONCE().

Got it, Paul, thank you for the explanation!

It seems like the resolution is that putting READ_ONCE()/WRITE_ONCE() across knobs in mm/memcontrol.c is generally a good idea, but mostly for cosmetic reasons.

Yue, can you, please, update the patch?

Btw, what a thread! Apparently writing & reading a single boolean is not that simple… :)
Thanks for all participants!

Roman
David Laight Feb. 22, 2023, 9:01 a.m. UTC | #25
From: Matthew Wilcox
> Sent: 21 February 2023 13:51
...
> > For this particular case, documenting such an access. Though I don't
> > think there are any architectures which may tear a one byte read/write
> > and merging/refetching is not an issue for this.
> 
> Wouldn't a compiler be within its rights to implement a one byte store as:
> 
> 	load-word
> 	modify-byte-in-word
> 	store-word
> 
> and if this is a lockless store to a word which has an adjacent byte also
> being modified by another CPU, one of those CPUs can lose its store?
> And WRITE_ONCE would prevent the compiler from implementing the store
> in that way.

Some alpha cpu couldn't do byte memory accesses - so always
did 32bit read-modify-write. But Linux doesn't support those
ones any more.

On arm 16bit structure members can be accessed with 32bit
instructions because the 16bit ones have a smaller offset.

On x86 the bit operations might access the (possibly misaligned)
32bit word containing the required bit - but they are locked.

ISTR a problem where gcc was using wider instructions and
doing a RMW on an adjacent volatile field.

I really can't remember the justification for not marking
fields that have unlocked accesses 'volatile' instead of
requiring all the accesses be done as explicit volatile ones.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 73afff8062f9..e4695fb80bda 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2057,7 +2057,7 @@  struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
 	 * highest-level memory cgroup with oom.group set.
 	 */
 	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
-		if (memcg->oom_group)
+		if (READ_ONCE(memcg->oom_group))
 			oom_group = memcg;
 
 		if (memcg == oom_domain)
@@ -6569,7 +6569,7 @@  static int memory_oom_group_show(struct seq_file *m, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	seq_printf(m, "%d\n", memcg->oom_group);
+	seq_printf(m, "%d\n", READ_ONCE(memcg->oom_group));
 
 	return 0;
 }
@@ -6591,7 +6591,7 @@  static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 	if (oom_group != 0 && oom_group != 1)
 		return -EINVAL;
 
-	memcg->oom_group = oom_group;
+	WRITE_ONCE(memcg->oom_group, oom_group);
 
 	return nbytes;
 }