Message ID | 20201223184730.30057-1-tparkin@katalix.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] ppp: hold mutex when unbridging channels in unregister path | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | fail | Series targets non-next tree, but doesn't contain any Fixes tags |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 4 maintainers not CCed: linux-ppp@vger.kernel.org davem@davemloft.net paulus@samba.org kuba@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 9 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Wed, Dec 23, 2020 at 06:47:30PM +0000, Tom Parkin wrote: > Channels are bridged using the PPPIOCBRIDGECHAN ioctl, which executes > with the ppp_mutex held. > > Unbridging may occur in two code paths: firstly an explicit > PPPIOCUNBRIDGECHAN ioctl, and secondly on channel unregister. The > latter may occur when closing the /dev/ppp instance or on teardown of > the channel itself. > > This opens up a refcount underflow bug if ppp_bridge_channels called via. > ioctl races with ppp_unbridge_channels called via. file release. > > The race is triggered by ppp_unbridge_channels taking the error path This is actually ppp_bridge_channels. > through the 'err_unset' label. In this scenario, pch->bridge has been > set, but no reference will be taken on pch->file because the function > errors out. Therefore, if ppp_unbridge_channels is called in the window > between pch->bridge being set and pch->bridge being unset, it will > erroneously drop the reference on pch->file and cause a refcount > underflow. > > To avoid this, hold the ppp_mutex while calling ppp_unbridge_channels in > the shutdown path. This serialises the unbridge operation with any > concurrently executing ioctl. > > Signed-off-by: Tom Parkin <tparkin@katalix.com> > --- > drivers/net/ppp/ppp_generic.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c > index 09c27f7773f9..e87a05fee2db 100644 > --- a/drivers/net/ppp/ppp_generic.c > +++ b/drivers/net/ppp/ppp_generic.c > @@ -2938,7 +2938,9 @@ ppp_unregister_channel(struct ppp_channel *chan) > list_del(&pch->list); > spin_unlock_bh(&pn->all_channels_lock); > > + mutex_lock(&ppp_mutex); > ppp_unbridge_channels(pch); > + mutex_unlock(&ppp_mutex); > > pch->file.dead = 1; > wake_up_interruptible(&pch->file.rwait); > -- > 2.17.1 > The problem is that assigning ->bridge and taking a reference on that channel isn't atomic. Holding ppp_mutex looks like a workaround for this problem. I think the refcount should be taken before unlocking ->upl in ppp_bridge_channels(). Something like this (completely untested): ---- 8< ---- static int ppp_bridge_channels(struct channel *pch, struct channel *pchb) { write_lock_bh(&pch->upl); if (pch->ppp || rcu_dereference_protected(pch->bridge, lockdep_is_held(&pch->upl))) { write_unlock_bh(&pch->upl); return -EALREADY; } + + refcount_inc(&pchb->file.refcnt); rcu_assign_pointer(pch->bridge, pchb); write_unlock_bh(&pch->upl); write_lock_bh(&pchb->upl); if (pchb->ppp || rcu_dereference_protected(pchb->bridge, lockdep_is_held(&pchb->upl))) { write_unlock_bh(&pchb->upl); goto err_unset; } + + refcount_inc(&pch->file.refcnt); rcu_assign_pointer(pchb->bridge, pch); write_unlock_bh(&pchb->upl); - refcount_inc(&pch->file.refcnt); - refcount_inc(&pchb->file.refcnt); - return 0; err_unset: write_lock_bh(&pch->upl); RCU_INIT_POINTER(pch->bridge, NULL); write_unlock_bh(&pch->upl); synchronize_rcu(); + + if (refcount_dec_and_test(&pchb->file.refcnt)) + ppp_destroy_channel(pchb); + return -EALREADY; }
On Thu, Dec 24, 2020 at 11:28:18 +0100, Guillaume Nault wrote: > On Wed, Dec 23, 2020 at 06:47:30PM +0000, Tom Parkin wrote: > > Channels are bridged using the PPPIOCBRIDGECHAN ioctl, which executes > > with the ppp_mutex held. > > > > Unbridging may occur in two code paths: firstly an explicit > > PPPIOCUNBRIDGECHAN ioctl, and secondly on channel unregister. The > > latter may occur when closing the /dev/ppp instance or on teardown of > > the channel itself. > > > > This opens up a refcount underflow bug if ppp_bridge_channels called via. > > ioctl races with ppp_unbridge_channels called via. file release. > > > > The race is triggered by ppp_unbridge_channels taking the error path > > This is actually ppp_bridge_channels. > Will fix, thanks. > > through the 'err_unset' label. In this scenario, pch->bridge has been > > set, but no reference will be taken on pch->file because the function > > errors out. Therefore, if ppp_unbridge_channels is called in the window > > between pch->bridge being set and pch->bridge being unset, it will > > erroneously drop the reference on pch->file and cause a refcount > > underflow. > > > > To avoid this, hold the ppp_mutex while calling ppp_unbridge_channels in > > the shutdown path. This serialises the unbridge operation with any > > concurrently executing ioctl. > > > > Signed-off-by: Tom Parkin <tparkin@katalix.com> > > --- > > drivers/net/ppp/ppp_generic.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c > > index 09c27f7773f9..e87a05fee2db 100644 > > --- a/drivers/net/ppp/ppp_generic.c > > +++ b/drivers/net/ppp/ppp_generic.c > > @@ -2938,7 +2938,9 @@ ppp_unregister_channel(struct ppp_channel *chan) > > list_del(&pch->list); > > spin_unlock_bh(&pn->all_channels_lock); > > > > + mutex_lock(&ppp_mutex); > > ppp_unbridge_channels(pch); > > + mutex_unlock(&ppp_mutex); > > > > pch->file.dead = 1; > > wake_up_interruptible(&pch->file.rwait); > > -- > > 2.17.1 > > > > The problem is that assigning ->bridge and taking a reference on that > channel isn't atomic. Holding ppp_mutex looks like a workaround for > this problem. You're quite right -- that is the underlying issue. > I think the refcount should be taken before unlocking ->upl in > ppp_bridge_channels(). Aye, that's the other option :-) I wasn't sure whether it was better to use the same locking structure as the ioctl call, or to rework the code to make the two things effectively atomic as you suggest. I'll try this approach. Thanks for your review! > > Something like this (completely untested): > > ---- 8< ---- > static int ppp_bridge_channels(struct channel *pch, struct channel *pchb) > { > write_lock_bh(&pch->upl); > if (pch->ppp || > rcu_dereference_protected(pch->bridge, lockdep_is_held(&pch->upl))) { > write_unlock_bh(&pch->upl); > return -EALREADY; > } > + > + refcount_inc(&pchb->file.refcnt); > rcu_assign_pointer(pch->bridge, pchb); > write_unlock_bh(&pch->upl); > > write_lock_bh(&pchb->upl); > if (pchb->ppp || > rcu_dereference_protected(pchb->bridge, lockdep_is_held(&pchb->upl))) { > write_unlock_bh(&pchb->upl); > goto err_unset; > } > + > + refcount_inc(&pch->file.refcnt); > rcu_assign_pointer(pchb->bridge, pch); > write_unlock_bh(&pchb->upl); > > - refcount_inc(&pch->file.refcnt); > - refcount_inc(&pchb->file.refcnt); > - > return 0; > > err_unset: > write_lock_bh(&pch->upl); > RCU_INIT_POINTER(pch->bridge, NULL); > write_unlock_bh(&pch->upl); > synchronize_rcu(); > + > + if (refcount_dec_and_test(&pchb->file.refcnt)) > + ppp_destroy_channel(pchb); > + > return -EALREADY; > } >
On Thu, Dec 24, 2020 at 02:24:32PM +0000, Tom Parkin wrote: > On Thu, Dec 24, 2020 at 11:28:18 +0100, Guillaume Nault wrote: > > On Wed, Dec 23, 2020 at 06:47:30PM +0000, Tom Parkin wrote: > > > Channels are bridged using the PPPIOCBRIDGECHAN ioctl, which executes > > > with the ppp_mutex held. > > > > > > Unbridging may occur in two code paths: firstly an explicit > > > PPPIOCUNBRIDGECHAN ioctl, and secondly on channel unregister. The > > > latter may occur when closing the /dev/ppp instance or on teardown of > > > the channel itself. > > > > > > This opens up a refcount underflow bug if ppp_bridge_channels called via. > > > ioctl races with ppp_unbridge_channels called via. file release. > > > > > > The race is triggered by ppp_unbridge_channels taking the error path > > > > This is actually ppp_bridge_channels. > > > > Will fix, thanks. > > > > through the 'err_unset' label. In this scenario, pch->bridge has been > > > set, but no reference will be taken on pch->file because the function > > > errors out. Therefore, if ppp_unbridge_channels is called in the window > > > between pch->bridge being set and pch->bridge being unset, it will > > > erroneously drop the reference on pch->file and cause a refcount > > > underflow. > > > > > > To avoid this, hold the ppp_mutex while calling ppp_unbridge_channels in > > > the shutdown path. This serialises the unbridge operation with any > > > concurrently executing ioctl. > > > > > > Signed-off-by: Tom Parkin <tparkin@katalix.com> > > > --- > > > drivers/net/ppp/ppp_generic.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c > > > index 09c27f7773f9..e87a05fee2db 100644 > > > --- a/drivers/net/ppp/ppp_generic.c > > > +++ b/drivers/net/ppp/ppp_generic.c > > > @@ -2938,7 +2938,9 @@ ppp_unregister_channel(struct ppp_channel *chan) > > > list_del(&pch->list); > > > spin_unlock_bh(&pn->all_channels_lock); > > > > > > + mutex_lock(&ppp_mutex); > > > ppp_unbridge_channels(pch); > > > + mutex_unlock(&ppp_mutex); > > > > > > pch->file.dead = 1; > > > wake_up_interruptible(&pch->file.rwait); > > > -- > > > 2.17.1 > > > > > > > The problem is that assigning ->bridge and taking a reference on that > > channel isn't atomic. Holding ppp_mutex looks like a workaround for > > this problem. > > You're quite right -- that is the underlying issue. > > > I think the refcount should be taken before unlocking ->upl in > > ppp_bridge_channels(). > > Aye, that's the other option :-) > > I wasn't sure whether it was better to use the same locking structure > as the ioctl call, or to rework the code to make the two things > effectively atomic as you suggest. ppp_mutex was added by commit 15fd0cd9a2ad ("net: autoconvert trivial BKL users to private mutex") as a replacement for the big kernel lock. We should head towards removing it, rather than expanding its usage (locking dependencies are already complex enough in ppp_generic). Also, as a refcount marks a dependency to another object, it's important to take it before the dependency becomes visible to external entities. > I'll try this approach. > > Thanks for your review! > > > > > Something like this (completely untested): > > > > ---- 8< ---- > > static int ppp_bridge_channels(struct channel *pch, struct channel *pchb) > > { > > write_lock_bh(&pch->upl); > > if (pch->ppp || > > rcu_dereference_protected(pch->bridge, lockdep_is_held(&pch->upl))) { > > write_unlock_bh(&pch->upl); > > return -EALREADY; > > } > > + > > + refcount_inc(&pchb->file.refcnt); > > rcu_assign_pointer(pch->bridge, pchb); > > write_unlock_bh(&pch->upl); > > > > write_lock_bh(&pchb->upl); > > if (pchb->ppp || > > rcu_dereference_protected(pchb->bridge, lockdep_is_held(&pchb->upl))) { > > write_unlock_bh(&pchb->upl); > > goto err_unset; > > } > > + > > + refcount_inc(&pch->file.refcnt); > > rcu_assign_pointer(pchb->bridge, pch); > > write_unlock_bh(&pchb->upl); > > > > - refcount_inc(&pch->file.refcnt); > > - refcount_inc(&pchb->file.refcnt); > > - > > return 0; > > > > err_unset: > > write_lock_bh(&pch->upl); > > RCU_INIT_POINTER(pch->bridge, NULL); > > write_unlock_bh(&pch->upl); > > synchronize_rcu(); > > + > > + if (refcount_dec_and_test(&pchb->file.refcnt)) > > + ppp_destroy_channel(pchb); > > + > > return -EALREADY; > > } > >
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 09c27f7773f9..e87a05fee2db 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -2938,7 +2938,9 @@ ppp_unregister_channel(struct ppp_channel *chan) list_del(&pch->list); spin_unlock_bh(&pn->all_channels_lock); + mutex_lock(&ppp_mutex); ppp_unbridge_channels(pch); + mutex_unlock(&ppp_mutex); pch->file.dead = 1; wake_up_interruptible(&pch->file.rwait);
Channels are bridged using the PPPIOCBRIDGECHAN ioctl, which executes with the ppp_mutex held. Unbridging may occur in two code paths: firstly an explicit PPPIOCUNBRIDGECHAN ioctl, and secondly on channel unregister. The latter may occur when closing the /dev/ppp instance or on teardown of the channel itself. This opens up a refcount underflow bug if ppp_bridge_channels called via. ioctl races with ppp_unbridge_channels called via. file release. The race is triggered by ppp_unbridge_channels taking the error path through the 'err_unset' label. In this scenario, pch->bridge has been set, but no reference will be taken on pch->file because the function errors out. Therefore, if ppp_unbridge_channels is called in the window between pch->bridge being set and pch->bridge being unset, it will erroneously drop the reference on pch->file and cause a refcount underflow. To avoid this, hold the ppp_mutex while calling ppp_unbridge_channels in the shutdown path. This serialises the unbridge operation with any concurrently executing ioctl. Signed-off-by: Tom Parkin <tparkin@katalix.com> --- drivers/net/ppp/ppp_generic.c | 2 ++ 1 file changed, 2 insertions(+)