mbox series

[0/3] using guard/__free in networking

Message ID 20240325223905.100979-5-johannes@sipsolutions.net (mailing list archive)
Headers show
Series using guard/__free in networking | expand

Message

Johannes Berg March 25, 2024, 10:31 p.m. UTC
Hi,

So I started playing with this for wifi, and overall that
does look pretty nice, but it's a bit weird if we can do

  guard(wiphy)(&rdev->wiphy);

or so, but still have to manually handle the RTNL in the
same code.

Hence these patches. The third one is a bit more RFC than
the others, I guess. It's also not tested, I'm not sure I
even know how to hit all the BPF parts.

johannes

Comments

Jakub Kicinski March 26, 2024, 2:09 a.m. UTC | #1
On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:
> Hi,
> 
> So I started playing with this for wifi, and overall that
> does look pretty nice, but it's a bit weird if we can do
> 
>   guard(wiphy)(&rdev->wiphy);
> 
> or so, but still have to manually handle the RTNL in the
> same code.

Dunno, it locks code instead of data accesses.
Forgive the comparison but it feels too much like Java to me :)
scoped_guard is fine, the guard() not so much.

But happy for other netdev maintainers to override me..

Do you have a piece of code in wireless where the conversion
made you go "wow, this is so much cleaner"?
Johannes Berg March 26, 2024, 8:42 a.m. UTC | #2
On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:
> > Hi,
> > 
> > So I started playing with this for wifi, and overall that
> > does look pretty nice, but it's a bit weird if we can do
> > 
> >   guard(wiphy)(&rdev->wiphy);
> > 
> > or so, but still have to manually handle the RTNL in the
> > same code.
> 
> Dunno, it locks code instead of data accesses.

Well, I'm not sure that's a fair complaint. After all, without any more
compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
Clearly

	rtnl_lock();
	// something
	rtnl_unlock();

also locks the "// something" code, after all., and yeah that might be
doing data accesses, but it might also be a function call or a whole
bunch of other things?

Or if you look at something like bpf_xdp_link_attach(), I don't think
you can really say that it locks only data. That doesn't even do the
allocation outside the lock (though I did convert that one to
scoped_guard because of that.)

Or even something simple like unregister_netdev(), it just requires the
RTNL for some data accesses and consistency deep inside
unregister_netdevice(), not for any specific data accessed there.

So yeah, this is always going to be a trade-off, but all the locking is.
We even make similar trade-offs manually, e.g. look at
bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
still, for no good reason other than simplifying the cleanup path there.


Anyway, I can live with it either way (unless you tell me you won't pull
wireless code using guard), just thought doing the wireless locking with
guard and the RTNL around it without it (only in a few places do we
still use RTNL though) looked odd.


> Forgive the comparison but it feels too much like Java to me :)

Heh. Haven't used Java in 20 years or so...

> scoped_guard is fine, the guard() not so much.

I think you can't get scoped_guard() without guard(), so does that mean
you'd accept the first patch in the series?

> Do you have a piece of code in wireless where the conversion
> made you go "wow, this is so much cleaner"?

Mostly long and complex error paths. Found a double-unlock bug (in
iwlwifi) too, when converting some locking there.

Doing a more broader conversion on cfg80211/mac80211 removes around 200
lines of unlocking, mostly error handling, code.

Doing __free() too will probably clean up even more.

johannes
Jakub Kicinski March 26, 2024, 2:37 p.m. UTC | #3
On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:  
> > > Hi,
> > > 
> > > So I started playing with this for wifi, and overall that
> > > does look pretty nice, but it's a bit weird if we can do
> > > 
> > >   guard(wiphy)(&rdev->wiphy);
> > > 
> > > or so, but still have to manually handle the RTNL in the
> > > same code.  
> > 
> > Dunno, it locks code instead of data accesses.  
> 
> Well, I'm not sure that's a fair complaint. After all, without any more
> compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> Clearly
> 
> 	rtnl_lock();
> 	// something
> 	rtnl_unlock();
> 
> also locks the "// something" code, after all., and yeah that might be
> doing data accesses, but it might also be a function call or a whole
> bunch of other things?
> 
> Or if you look at something like bpf_xdp_link_attach(), I don't think
> you can really say that it locks only data. That doesn't even do the
> allocation outside the lock (though I did convert that one to
> scoped_guard because of that.)
> 
> Or even something simple like unregister_netdev(), it just requires the
> RTNL for some data accesses and consistency deep inside
> unregister_netdevice(), not for any specific data accessed there.
> 
> So yeah, this is always going to be a trade-off, but all the locking is.
> We even make similar trade-offs manually, e.g. look at
> bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> still, for no good reason other than simplifying the cleanup path there.

At least to me the mental model is different. 99% of the time the guard
is covering the entire body. So now we're moving from "I'm touching X
so I need to lock" to "This _function_ is safe to touch X".

> Anyway, I can live with it either way (unless you tell me you won't pull
> wireless code using guard), just thought doing the wireless locking with
> guard and the RTNL around it without it (only in a few places do we
> still use RTNL though) looked odd.
> 
> 
> > Forgive the comparison but it feels too much like Java to me :)  
> 
> Heh. Haven't used Java in 20 years or so...

I only did at uni, but I think they had a decorator for a method, where
you can basically say "this method should be under lock X" and runtime
will take that lock before entering and drop it after exit,
appropriately. I wonder why the sudden love for this concept :S
Is it also present in Rust or some such?

> > scoped_guard is fine, the guard() not so much.  
> 
> I think you can't get scoped_guard() without guard(), so does that mean
> you'd accept the first patch in the series?

How can we get one without the other.. do you reckon Joe P would let us
add a checkpatch check to warn people against pure guard() under net/ ?

> > Do you have a piece of code in wireless where the conversion
> > made you go "wow, this is so much cleaner"?  
> 
> Mostly long and complex error paths. Found a double-unlock bug (in
> iwlwifi) too, when converting some locking there.
> 
> Doing a more broader conversion on cfg80211/mac80211 removes around 200
> lines of unlocking, mostly error handling, code.
> 
> Doing __free() too will probably clean up even more.

Not super convinced by that one either:
https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
maybe I'm too conservative..
Willem de Bruijn March 26, 2024, 3:23 p.m. UTC | #4
Jakub Kicinski wrote:
> On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> > On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:  
> > > > Hi,
> > > > 
> > > > So I started playing with this for wifi, and overall that
> > > > does look pretty nice, but it's a bit weird if we can do
> > > > 
> > > >   guard(wiphy)(&rdev->wiphy);
> > > > 
> > > > or so, but still have to manually handle the RTNL in the
> > > > same code.  
> > > 
> > > Dunno, it locks code instead of data accesses.  
> > 
> > Well, I'm not sure that's a fair complaint. After all, without any more
> > compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> > Clearly
> > 
> > 	rtnl_lock();
> > 	// something
> > 	rtnl_unlock();
> > 
> > also locks the "// something" code, after all., and yeah that might be
> > doing data accesses, but it might also be a function call or a whole
> > bunch of other things?
> > 
> > Or if you look at something like bpf_xdp_link_attach(), I don't think
> > you can really say that it locks only data. That doesn't even do the
> > allocation outside the lock (though I did convert that one to
> > scoped_guard because of that.)
> > 
> > Or even something simple like unregister_netdev(), it just requires the
> > RTNL for some data accesses and consistency deep inside
> > unregister_netdevice(), not for any specific data accessed there.
> > 
> > So yeah, this is always going to be a trade-off, but all the locking is.
> > We even make similar trade-offs manually, e.g. look at
> > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > still, for no good reason other than simplifying the cleanup path there.
> 
> At least to me the mental model is different. 99% of the time the guard
> is covering the entire body. So now we're moving from "I'm touching X
> so I need to lock" to "This _function_ is safe to touch X".
> 
> > Anyway, I can live with it either way (unless you tell me you won't pull
> > wireless code using guard), just thought doing the wireless locking with
> > guard and the RTNL around it without it (only in a few places do we
> > still use RTNL though) looked odd.
> > 
> > 
> > > Forgive the comparison but it feels too much like Java to me :)  
> > 
> > Heh. Haven't used Java in 20 years or so...
> 
> I only did at uni, but I think they had a decorator for a method, where
> you can basically say "this method should be under lock X" and runtime
> will take that lock before entering and drop it after exit,
> appropriately. I wonder why the sudden love for this concept :S
> Is it also present in Rust or some such?
> 
> > > scoped_guard is fine, the guard() not so much.  
> > 
> > I think you can't get scoped_guard() without guard(), so does that mean
> > you'd accept the first patch in the series?
> 
> How can we get one without the other.. do you reckon Joe P would let us
> add a checkpatch check to warn people against pure guard() under net/ ?
> 
> > > Do you have a piece of code in wireless where the conversion
> > > made you go "wow, this is so much cleaner"?  
> > 
> > Mostly long and complex error paths. Found a double-unlock bug (in
> > iwlwifi) too, when converting some locking there.
> > 
> > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > lines of unlocking, mostly error handling, code.
> > 
> > Doing __free() too will probably clean up even more.
> 
> Not super convinced by that one either:
> https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> maybe I'm too conservative..

+1 on the concept (fwiw).

Even the simple examples, such as unregister_netdevice_notifier_net,
show how it avoids boilerplate and so simplifies control flow.

That benefit multiplies with the number of resources held and number
of exit paths. Or in our case, gotos and (unlock) labels.

Error paths are notorious for seeing little test coverage and leaking
resources. This is an easy class of bugs that this RAII squashes.

Sprinkling guard statements anywhere in the scope itself makes it
perhaps hard to follow. Perhaps a heuristic would be to require these
statements at the start of scope (after variable declaration)?

Function level decorators could further inform static analysis.
But that is somewhat tangential.
Andrew Lunn March 26, 2024, 3:33 p.m. UTC | #5
On Mon, Mar 25, 2024 at 07:09:57PM -0700, Jakub Kicinski wrote:
> On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:
> > Hi,
> > 
> > So I started playing with this for wifi, and overall that
> > does look pretty nice, but it's a bit weird if we can do
> > 
> >   guard(wiphy)(&rdev->wiphy);
> > 
> > or so, but still have to manually handle the RTNL in the
> > same code.
> 
> Dunno, it locks code instead of data accesses.
> Forgive the comparison but it feels too much like Java to me :)
> scoped_guard is fine, the guard() not so much.

I tend to agree. guard() does not feel like C. scoped_guard does.

	Andrew
Johannes Berg March 26, 2024, 3:33 p.m. UTC | #6
> > So yeah, this is always going to be a trade-off, but all the locking is.
> > We even make similar trade-offs manually, e.g. look at
> > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > still, for no good reason other than simplifying the cleanup path there.
> 
> At least to me the mental model is different. 99% of the time the guard
> is covering the entire body. So now we're moving from "I'm touching X
> so I need to lock" to "This _function_ is safe to touch X".

Yeah, maybe. But then we also have a lot of trivial wrappers just do to
locking, like

do_something()
{
	rtnl_lock()
	ret = __do_something()
	rtnl_unlock();
	return ret;
}

because __do_something() has a bunch of error paths and it's just messy
to maintain the locking directly :)

So I guess I don't have that much of a different model, I'd actually say
it's an advantage of sorts in many cases, and in some cases you'd still
have "rtnl_lock()" only in the middle, or scoped_guard(rtnl) {...} for a
small block. But refactoring a function because it's accessing protected
data doesn't seem completely out of the question either.

> > > Forgive the comparison but it feels too much like Java to me :)  
> > 
> > Heh. Haven't used Java in 20 years or so...
> 
> I only did at uni, but I think they had a decorator for a method, where
> you can basically say "this method should be under lock X" and runtime
> will take that lock before entering and drop it after exit,
> appropriately.

Yeah, I vaguely remember that. And yes, you can obviously just slap that
on everything and call it a day, or you could also there refactor the
things that do need the locking into a separate function, and use it
there?

> I wonder why the sudden love for this concept :S

I think it's neither sudden nor love ;-) But all the cleanup paths are
_always_ a mess to maintain, and this is the least bad thing folks came
up with so far?

> Is it also present in Rust or some such?

I have no idea. I _think_ Rust actually ties the data and the locks
together more?

> > > scoped_guard is fine, the guard() not so much.  
> > 
> > I think you can't get scoped_guard() without guard(), so does that mean
> > you'd accept the first patch in the series?
> 
> How can we get one without the other.. do you reckon Joe P would let us
> add a checkpatch check to warn people against pure guard() under net/ ?

Maybe? But I think I do want to use guard() ;-)

There are a ton of functions like say cfg80211_wext_siwrts() or
nl80211_new_interface() that really just want to hold the mutex for the
(remainder of) the function. And really _nl80211_new_interface()
wouldn't even exist if we had that, because nl80211_new_interface()
would just be

nl80211_new_interface()
{
  cfg80211_destroy_ifaces(rdev);

  guard(wiphy)(&rdev->wiphy);

  // exactly existing content of _nl80211_new_interface()
  // with all the returns etc.
}

the only reason for _nl80211_new_interface() is the locking there ...

Actually, we might push that further down into the function, to just
before rdev_add_virtual_intf(), I guess? But it all just adds to the
complexity as long as it's not cleaned up automatically.

> > > Do you have a piece of code in wireless where the conversion
> > > made you go "wow, this is so much cleaner"?  
> > 
> > Mostly long and complex error paths. Found a double-unlock bug (in
> > iwlwifi) too, when converting some locking there.
> > 
> > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > lines of unlocking, mostly error handling, code.
> > 
> > Doing __free() too will probably clean up even more.
> 
> Not super convinced by that one either:
> https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> maybe I'm too conservative..

I mean ... it's not great, and it's all new stuff to learn (especially
those caveats with the cleanup order), but error paths are the things
that are really never tested and tend to be broken, no matter that we
have smatch/sparse/coverity/etc.

So while I don't think it's perfect, and I tend to agree even that it
encourages "over-locking" (though we can refactor and use scope etc.
where it matters), I'm pretty firmly on the "we should be using this to
not worry about error paths all the time" side of the fence, I guess.

johannes
Stanislav Fomichev March 26, 2024, 5:33 p.m. UTC | #7
On 03/26, Willem de Bruijn wrote:
> Jakub Kicinski wrote:
> > On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> > > On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > > > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:  
> > > > > Hi,
> > > > > 
> > > > > So I started playing with this for wifi, and overall that
> > > > > does look pretty nice, but it's a bit weird if we can do
> > > > > 
> > > > >   guard(wiphy)(&rdev->wiphy);
> > > > > 
> > > > > or so, but still have to manually handle the RTNL in the
> > > > > same code.  
> > > > 
> > > > Dunno, it locks code instead of data accesses.  
> > > 
> > > Well, I'm not sure that's a fair complaint. After all, without any more
> > > compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> > > Clearly
> > > 
> > > 	rtnl_lock();
> > > 	// something
> > > 	rtnl_unlock();
> > > 
> > > also locks the "// something" code, after all., and yeah that might be
> > > doing data accesses, but it might also be a function call or a whole
> > > bunch of other things?
> > > 
> > > Or if you look at something like bpf_xdp_link_attach(), I don't think
> > > you can really say that it locks only data. That doesn't even do the
> > > allocation outside the lock (though I did convert that one to
> > > scoped_guard because of that.)
> > > 
> > > Or even something simple like unregister_netdev(), it just requires the
> > > RTNL for some data accesses and consistency deep inside
> > > unregister_netdevice(), not for any specific data accessed there.
> > > 
> > > So yeah, this is always going to be a trade-off, but all the locking is.
> > > We even make similar trade-offs manually, e.g. look at
> > > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > > still, for no good reason other than simplifying the cleanup path there.
> > 
> > At least to me the mental model is different. 99% of the time the guard
> > is covering the entire body. So now we're moving from "I'm touching X
> > so I need to lock" to "This _function_ is safe to touch X".
> > 
> > > Anyway, I can live with it either way (unless you tell me you won't pull
> > > wireless code using guard), just thought doing the wireless locking with
> > > guard and the RTNL around it without it (only in a few places do we
> > > still use RTNL though) looked odd.
> > > 
> > > 
> > > > Forgive the comparison but it feels too much like Java to me :)  
> > > 
> > > Heh. Haven't used Java in 20 years or so...
> > 
> > I only did at uni, but I think they had a decorator for a method, where
> > you can basically say "this method should be under lock X" and runtime
> > will take that lock before entering and drop it after exit,
> > appropriately. I wonder why the sudden love for this concept :S
> > Is it also present in Rust or some such?

It's more of a c++ thing I believe: https://en.cppreference.com/w/cpp/thread/lock_guard

Not that anybody is asking my opinion (and my mind has been a bit corrupted
by c++), but guard() syntax seems fine :-p

Rust's approach is more conventional. There is a mtx.lock() method that
returns a scoped guard that can be optionally unlock'ed IIRC.

> > > > scoped_guard is fine, the guard() not so much.  
> > > 
> > > I think you can't get scoped_guard() without guard(), so does that mean
> > > you'd accept the first patch in the series?
> > 
> > How can we get one without the other.. do you reckon Joe P would let us
> > add a checkpatch check to warn people against pure guard() under net/ ?
> > 
> > > > Do you have a piece of code in wireless where the conversion
> > > > made you go "wow, this is so much cleaner"?  
> > > 
> > > Mostly long and complex error paths. Found a double-unlock bug (in
> > > iwlwifi) too, when converting some locking there.
> > > 
> > > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > > lines of unlocking, mostly error handling, code.
> > > 
> > > Doing __free() too will probably clean up even more.
> > 
> > Not super convinced by that one either:
> > https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> > maybe I'm too conservative..
> 
> +1 on the concept (fwiw).
> 
> Even the simple examples, such as unregister_netdevice_notifier_net,
> show how it avoids boilerplate and so simplifies control flow.
> 
> That benefit multiplies with the number of resources held and number
> of exit paths. Or in our case, gotos and (unlock) labels.
> 
> Error paths are notorious for seeing little test coverage and leaking
> resources. This is an easy class of bugs that this RAII squashes.
> 
> Sprinkling guard statements anywhere in the scope itself makes it
> perhaps hard to follow. Perhaps a heuristic would be to require these
> statements at the start of scope (after variable declaration)?
> 
> Function level decorators could further inform static analysis.
> But that is somewhat tangential.
Jakub Kicinski March 27, 2024, 12:15 a.m. UTC | #8
On Tue, 26 Mar 2024 16:33:58 +0100 Johannes Berg wrote:
> > > I think you can't get scoped_guard() without guard(), so does that mean
> > > you'd accept the first patch in the series?  
> > 
> > How can we get one without the other.. do you reckon Joe P would let us
> > add a checkpatch check to warn people against pure guard() under net/ ?  
> 
> Maybe? But I think I do want to use guard() ;-)

IIUC Willem and Stan like the construct too, so I'm fine with patches 
1 and 2. But let's not convert the exiting code just yet (leave out 3)?
Przemek Kitszel March 27, 2024, 11:15 a.m. UTC | #9
On 3/26/24 15:37, Jakub Kicinski wrote:
> On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
>> On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
>>> On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:
>>>> Hi,
>>>>
>>>> So I started playing with this for wifi, and overall that
>>>> does look pretty nice, but it's a bit weird if we can do
>>>>
>>>>    guard(wiphy)(&rdev->wiphy);
>>>>
>>>> or so, but still have to manually handle the RTNL in the
>>>> same code.
>>>
>>> Dunno, it locks code instead of data accesses.
>>
>> Well, I'm not sure that's a fair complaint. After all, without any more
>> compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
>> Clearly
>>
>> 	rtnl_lock();
>> 	// something
>> 	rtnl_unlock();
>>
>> also locks the "// something" code, after all., and yeah that might be
>> doing data accesses, but it might also be a function call or a whole
>> bunch of other things?
>>
>> Or if you look at something like bpf_xdp_link_attach(), I don't think
>> you can really say that it locks only data. That doesn't even do the
>> allocation outside the lock (though I did convert that one to
>> scoped_guard because of that.)
>>
>> Or even something simple like unregister_netdev(), it just requires the
>> RTNL for some data accesses and consistency deep inside
>> unregister_netdevice(), not for any specific data accessed there.
>>
>> So yeah, this is always going to be a trade-off, but all the locking is.
>> We even make similar trade-offs manually, e.g. look at
>> bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
>> still, for no good reason other than simplifying the cleanup path there.
> 
> At least to me the mental model is different. 99% of the time the guard
> is covering the entire body. So now we're moving from "I'm touching X
> so I need to lock" to "This _function_ is safe to touch X".
> 
>> Anyway, I can live with it either way (unless you tell me you won't pull
>> wireless code using guard), just thought doing the wireless locking with
>> guard and the RTNL around it without it (only in a few places do we
>> still use RTNL though) looked odd.
>>
>>
>>> Forgive the comparison but it feels too much like Java to me :)
>>
>> Heh. Haven't used Java in 20 years or so...
> 
> I only did at uni, but I think they had a decorator for a method, where
> you can basically say "this method should be under lock X" and runtime
> will take that lock before entering and drop it after exit,
> appropriately. I wonder why the sudden love for this concept :S
> Is it also present in Rust or some such?

:D
There is indeed a lot of proposals around the topic lately :)

I believe that "The first half of the 6.8 merge window" [lwn] article
has brought attention to the in-kernel "scope-based resource management"
availability.

[lwn] https://lwn.net/Articles/957188/

More abstraction/sugar over __free() is cleanly needed to have code both
easier to follow and less buggy.

You made a good point to encourage scoping the locks to small blocks
instead of whole functions. And "less typing" to instantiate the lock
guard variable is one way to do that.

> 
>>> scoped_guard is fine, the guard() not so much.
>>
>> I think you can't get scoped_guard() without guard(), so does that mean
>> you'd accept the first patch in the series?
> 
> How can we get one without the other.. do you reckon Joe P would let us
> add a checkpatch check to warn people against pure guard() under net/ ?
> 
>>> Do you have a piece of code in wireless where the conversion
>>> made you go "wow, this is so much cleaner"?
>>
>> Mostly long and complex error paths. Found a double-unlock bug (in
>> iwlwifi) too, when converting some locking there.
>>
>> Doing a more broader conversion on cfg80211/mac80211 removes around 200
>> lines of unlocking, mostly error handling, code.
>>
>> Doing __free() too will probably clean up even more.
> 
> Not super convinced by that one either:
> https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> maybe I'm too conservative..
>
Johannes Berg March 27, 2024, 7:24 p.m. UTC | #10
On Tue, 2024-03-26 at 17:15 -0700, Jakub Kicinski wrote:
> 
> IIUC Willem and Stan like the construct too, so I'm fine with patches 
> 1 and 2. But let's not convert the exiting code just yet (leave out 3)?

Sure, fair. It was mostly an illustration. Do you want a resend then?

johannes
Jakub Kicinski March 27, 2024, 8:07 p.m. UTC | #11
On Wed, 27 Mar 2024 20:24:17 +0100 Johannes Berg wrote:
> > IIUC Willem and Stan like the construct too, so I'm fine with patches 
> > 1 and 2. But let's not convert the exiting code just yet (leave out 3)?  
> 
> Sure, fair. It was mostly an illustration. Do you want a resend then?

Yes, please!
Jakub Sitnicki March 27, 2024, 8:25 p.m. UTC | #12
On Tue, Mar 26, 2024 at 04:33 PM +01, Johannes Berg wrote:
>> Is it also present in Rust or some such?
>
> I have no idea. I _think_ Rust actually ties the data and the locks
> together more?

That is right. Nicely explained here:

https://marabos.nl/atomics/basics.html#rusts-mutex
Johannes Berg March 27, 2024, 9:28 p.m. UTC | #13
On Wed, 2024-03-27 at 21:25 +0100, Jakub Sitnicki wrote:
> On Tue, Mar 26, 2024 at 04:33 PM +01, Johannes Berg wrote:
> > > Is it also present in Rust or some such?
> > 
> > I have no idea. I _think_ Rust actually ties the data and the locks
> > together more?
> 
> That is right. Nicely explained here:
> 
> https://marabos.nl/atomics/basics.html#rusts-mutex

Right.

Thinking about that, we _could_ even add support for drop_guard()?

With the below patch to cleanup.h, you can write

void my_something(my_t *my)
{
...
	named_guard(lock, mutex)(&my->mutex);
...
	if (foo)
		return -EINVAL; // automatically unlocks
...
	// no need for lock any more
	drop_guard(lock);
...
	// do other things now unlocked
}


Is that too ugly? Maybe it's less Java-like and more Rust-like and
better for Jakub ;-)

In some sense that's nicer than scoped_guard() since it doesn't require
a new scope / indentation, but maybe Peter already thought about it and
rejected it :-)


Patch follows, though maybe that should be rolled into the 'base' CLASS
definition instead of defining another "_drop" one for named_guard()?

diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index c2d09bc4f976..31298106c28b 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -106,7 +106,27 @@ typedef _type class_##_name##_t;					\
 static inline void class_##_name##_destructor(_type *p)			\
 { _type _T = *p; _exit; }						\
 static inline _type class_##_name##_constructor(_init_args)		\
-{ _type t = _init; return t; }
+{ _type t = _init; return t; }						\
+typedef struct class_##_name##_drop##_t {				\
+	class_##_name##_t obj;						\
+	void (*destructor)(struct class_##_name##_drop##_t *);		\
+} class_##_name##_drop##_t;						\
+static inline void							\
+class_##_name##_drop##_destructor(class_##_name##_drop##_t *p)		\
+{									\
+	if (p->obj)							\
+		class_##_name##_destructor(&p->obj);			\
+	p->obj = NULL;							\
+}									\
+static inline class_##_name##_drop##_t					\
+class_##_name##_drop##_constructor(_init_args)				\
+{									\
+	class_##_name##_drop##_t t = {					\
+		.obj = _init,						\
+		.destructor = class_##_name##_drop##_destructor,	\
+	};								\
+	return t;							\
+}
 
 #define EXTEND_CLASS(_name, ext, _init, _init_args...)			\
 typedef class_##_name##_t class_##_name##ext##_t;			\
@@ -163,6 +183,11 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
 #define guard(_name) \
 	CLASS(_name, __UNIQUE_ID(guard))
 
+#define named_guard(_name, _class) \
+	CLASS(_class##_drop, _name)
+
+#define drop_guard(_name) do { _name.destructor(&_name); } while (0)
+
 #define __guard_ptr(_name) class_##_name##_lock_ptr
 
 #define scoped_guard(_name, args...)					\
Johannes Berg March 27, 2024, 9:43 p.m. UTC | #14
On Wed, 2024-03-27 at 22:28 +0100, Johannes Berg wrote:
> 
> +typedef struct class_##_name##_drop##_t {				\
> +	class_##_name##_t obj;						\
> +	void (*destructor)(struct class_##_name##_drop##_t *);		\
> +} class_##_name##_drop##_t;						\

No, I misread the compiler output, it does output a real destructor
function to push it into a stack variable with this...

So I guess it'd have to be

void my_something(my_t *my)
{
...
	named_guard(lock, mutex)(&my->mutex);
...
	if (foo)
		return -EINVAL; // automatically unlocks
...
	// no need for lock any more
	drop_guard(lock, mutex);
...
	// do other things now unlocked
}


instead, syntax-wise.


Which obviously simplifies the changes:


diff --git a/include/linux/cleanup.h b/include/linux/cleanup.h
index c2d09bc4f976..cf39a4a3f56f 100644
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@ -163,6 +163,12 @@ static inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
 #define guard(_name) \
 	CLASS(_name, __UNIQUE_ID(guard))
 
+#define named_guard(_name, _class) \
+	CLASS(_class, _name)
+
+#define drop_guard(_name, _class) \
+	do { class_##_class##_destructor(&_name); _name = NULL; } while (0)
+
 #define __guard_ptr(_name) class_##_name##_lock_ptr
 
 #define scoped_guard(_name, args...)					\
Simon Horman March 29, 2024, 10:23 a.m. UTC | #15
On Tue, Mar 26, 2024 at 11:23:19AM -0400, Willem de Bruijn wrote:
> Jakub Kicinski wrote:
> > On Tue, 26 Mar 2024 09:42:43 +0100 Johannes Berg wrote:
> > > On Mon, 2024-03-25 at 19:09 -0700, Jakub Kicinski wrote:
> > > > On Mon, 25 Mar 2024 23:31:25 +0100 Johannes Berg wrote:  
> > > > > Hi,
> > > > > 
> > > > > So I started playing with this for wifi, and overall that
> > > > > does look pretty nice, but it's a bit weird if we can do
> > > > > 
> > > > >   guard(wiphy)(&rdev->wiphy);
> > > > > 
> > > > > or so, but still have to manually handle the RTNL in the
> > > > > same code.  
> > > > 
> > > > Dunno, it locks code instead of data accesses.  
> > > 
> > > Well, I'm not sure that's a fair complaint. After all, without any more
> > > compiler help, even rtnl_lock()/rtnl_unlock() _necessarily_ locks code.
> > > Clearly
> > > 
> > > 	rtnl_lock();
> > > 	// something
> > > 	rtnl_unlock();
> > > 
> > > also locks the "// something" code, after all., and yeah that might be
> > > doing data accesses, but it might also be a function call or a whole
> > > bunch of other things?
> > > 
> > > Or if you look at something like bpf_xdp_link_attach(), I don't think
> > > you can really say that it locks only data. That doesn't even do the
> > > allocation outside the lock (though I did convert that one to
> > > scoped_guard because of that.)
> > > 
> > > Or even something simple like unregister_netdev(), it just requires the
> > > RTNL for some data accesses and consistency deep inside
> > > unregister_netdevice(), not for any specific data accessed there.
> > > 
> > > So yeah, this is always going to be a trade-off, but all the locking is.
> > > We even make similar trade-offs manually, e.g. look at
> > > bpf_xdp_link_update(), it will do the bpf_prog_put() under the RTNL
> > > still, for no good reason other than simplifying the cleanup path there.
> > 
> > At least to me the mental model is different. 99% of the time the guard
> > is covering the entire body. So now we're moving from "I'm touching X
> > so I need to lock" to "This _function_ is safe to touch X".
> > 
> > > Anyway, I can live with it either way (unless you tell me you won't pull
> > > wireless code using guard), just thought doing the wireless locking with
> > > guard and the RTNL around it without it (only in a few places do we
> > > still use RTNL though) looked odd.
> > > 
> > > 
> > > > Forgive the comparison but it feels too much like Java to me :)  
> > > 
> > > Heh. Haven't used Java in 20 years or so...
> > 
> > I only did at uni, but I think they had a decorator for a method, where
> > you can basically say "this method should be under lock X" and runtime
> > will take that lock before entering and drop it after exit,
> > appropriately. I wonder why the sudden love for this concept :S
> > Is it also present in Rust or some such?
> > 
> > > > scoped_guard is fine, the guard() not so much.  
> > > 
> > > I think you can't get scoped_guard() without guard(), so does that mean
> > > you'd accept the first patch in the series?
> > 
> > How can we get one without the other.. do you reckon Joe P would let us
> > add a checkpatch check to warn people against pure guard() under net/ ?
> > 
> > > > Do you have a piece of code in wireless where the conversion
> > > > made you go "wow, this is so much cleaner"?  
> > > 
> > > Mostly long and complex error paths. Found a double-unlock bug (in
> > > iwlwifi) too, when converting some locking there.
> > > 
> > > Doing a more broader conversion on cfg80211/mac80211 removes around 200
> > > lines of unlocking, mostly error handling, code.
> > > 
> > > Doing __free() too will probably clean up even more.
> > 
> > Not super convinced by that one either:
> > https://lore.kernel.org/all/20240321185640.6f7f4d6b@kernel.org/
> > maybe I'm too conservative..
> 
> +1 on the concept (fwiw).
> 
> Even the simple examples, such as unregister_netdevice_notifier_net,
> show how it avoids boilerplate and so simplifies control flow.
> 
> That benefit multiplies with the number of resources held and number
> of exit paths. Or in our case, gotos and (unlock) labels.
> 
> Error paths are notorious for seeing little test coverage and leaking
> resources. This is an easy class of bugs that this RAII squashes.

+1

While I'm ambivalent to the constructs that have been proposed,
I do see leaking resources in error paths as a very common pattern.
Especially in new code. Making that easier to get right seems
worthwhile to me.

> 
> Sprinkling guard statements anywhere in the scope itself makes it
> perhaps hard to follow. Perhaps a heuristic would be to require these
> statements at the start of scope (after variable declaration)?
> 
> Function level decorators could further inform static analysis.
> But that is somewhat tangential.
>