mbox series

[net-next,0/2] Bonding: support new xfrm state offload functions

Message ID 20240816035518.203704-1-liuhangbin@gmail.com (mailing list archive)
Headers show
Series Bonding: support new xfrm state offload functions | expand

Message

Hangbin Liu Aug. 16, 2024, 3:55 a.m. UTC
I planned to add the new XFRM state offload functions after Jianbo's
patchset [1], but it seems that may take some time. Therefore, I am
posting these two patches to net-next now, as our users are waiting for
this functionality. If Jianbo's patch is applied first, I can update these
patches accordingly.

[1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com

Hangbin Liu (2):
  bonding: Add ESN support to IPSec HW offload
  bonding: support xfrm state update

 drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

Comments

Hangbin Liu Aug. 16, 2024, 4:01 a.m. UTC | #1
On Fri, Aug 16, 2024 at 11:55:16AM +0800, Hangbin Liu wrote:
> I planned to add the new XFRM state offload functions after Jianbo's
> patchset [1], but it seems that may take some time. Therefore, I am
> posting these two patches to net-next now, as our users are waiting for
> this functionality. If Jianbo's patch is applied first, I can update these
> patches accordingly.
> 
> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com

Forgot to say, The xdo_dev_state_free will be added by Jianbo's patchset.
I will add the bonding xfrm policy offload in future.

Thanks
Hangbin
Nikolay Aleksandrov Aug. 16, 2024, 6 a.m. UTC | #2
On 16/08/2024 06:55, Hangbin Liu wrote:
> I planned to add the new XFRM state offload functions after Jianbo's
> patchset [1], but it seems that may take some time. Therefore, I am
> posting these two patches to net-next now, as our users are waiting for
> this functionality. If Jianbo's patch is applied first, I can update these
> patches accordingly.
> 
> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
> 
> Hangbin Liu (2):
>   bonding: Add ESN support to IPSec HW offload
>   bonding: support xfrm state update
> 
>  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
> 

the set looks good to me, one minor cosmetic nit is that the two
functions look very much alike only difference is the actual call
can you maybe factor out the boilerplate?
Nikolay Aleksandrov Aug. 16, 2024, 6:06 a.m. UTC | #3
On 16/08/2024 06:55, Hangbin Liu wrote:
> I planned to add the new XFRM state offload functions after Jianbo's
> patchset [1], but it seems that may take some time. Therefore, I am
> posting these two patches to net-next now, as our users are waiting for
> this functionality. If Jianbo's patch is applied first, I can update these
> patches accordingly.
> 
> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
> 
> Hangbin Liu (2):
>   bonding: Add ESN support to IPSec HW offload
>   bonding: support xfrm state update
> 
>  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
> 

(not related to this set, but to bond xfrm)
By the way looking at bond's xfrm code, what prevents bond_ipsec_offload_ok()
from dereferencing a null ptr?
I mean it does:
        curr_active = rcu_dereference(bond->curr_active_slave);
        real_dev = curr_active->dev;

If this is running only under RCU as the code suggests then
curr_active_slave can change to NULL in parallel. Should there be a
check for curr_active before deref or am I missing something?

Cheers,
 Nik
Hangbin Liu Aug. 16, 2024, 8:32 a.m. UTC | #4
On Fri, Aug 16, 2024 at 09:06:12AM +0300, Nikolay Aleksandrov wrote:
> On 16/08/2024 06:55, Hangbin Liu wrote:
> > I planned to add the new XFRM state offload functions after Jianbo's
> > patchset [1], but it seems that may take some time. Therefore, I am
> > posting these two patches to net-next now, as our users are waiting for
> > this functionality. If Jianbo's patch is applied first, I can update these
> > patches accordingly.
> > 
> > [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
> > 
> > Hangbin Liu (2):
> >   bonding: Add ESN support to IPSec HW offload
> >   bonding: support xfrm state update
> > 
> >  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
> >  1 file changed, 76 insertions(+)
> > 
> 
> (not related to this set, but to bond xfrm)
> By the way looking at bond's xfrm code, what prevents bond_ipsec_offload_ok()
> from dereferencing a null ptr?
> I mean it does:
>         curr_active = rcu_dereference(bond->curr_active_slave);
>         real_dev = curr_active->dev;
> 
> If this is running only under RCU as the code suggests then
> curr_active_slave can change to NULL in parallel. Should there be a
> check for curr_active before deref or am I missing something?

Yes, we can do like
real_dev = curr_active ? curr_active->dev : NULL;

Thanks
Hangbin
Hangbin Liu Aug. 16, 2024, 8:35 a.m. UTC | #5
On Fri, Aug 16, 2024 at 09:00:17AM +0300, Nikolay Aleksandrov wrote:
> the set looks good to me, one minor cosmetic nit is that the two
> functions look very much alike only difference is the actual call
> can you maybe factor out the boilerplate?

Thanks, I will update the patch to use a common function for the checking.

Hangbin
Nikolay Aleksandrov Aug. 16, 2024, 8:37 a.m. UTC | #6
On 16/08/2024 11:32, Hangbin Liu wrote:
> On Fri, Aug 16, 2024 at 09:06:12AM +0300, Nikolay Aleksandrov wrote:
>> On 16/08/2024 06:55, Hangbin Liu wrote:
>>> I planned to add the new XFRM state offload functions after Jianbo's
>>> patchset [1], but it seems that may take some time. Therefore, I am
>>> posting these two patches to net-next now, as our users are waiting for
>>> this functionality. If Jianbo's patch is applied first, I can update these
>>> patches accordingly.
>>>
>>> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
>>>
>>> Hangbin Liu (2):
>>>   bonding: Add ESN support to IPSec HW offload
>>>   bonding: support xfrm state update
>>>
>>>  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
>>>  1 file changed, 76 insertions(+)
>>>
>>
>> (not related to this set, but to bond xfrm)
>> By the way looking at bond's xfrm code, what prevents bond_ipsec_offload_ok()
>> from dereferencing a null ptr?
>> I mean it does:
>>         curr_active = rcu_dereference(bond->curr_active_slave);
>>         real_dev = curr_active->dev;
>>
>> If this is running only under RCU as the code suggests then
>> curr_active_slave can change to NULL in parallel. Should there be a
>> check for curr_active before deref or am I missing something?
> 
> Yes, we can do like
> real_dev = curr_active ? curr_active->dev : NULL;
> 
> Thanks
> Hangbin

Right, let me try and trigger it and I'll send a patch. :)
Nikolay Aleksandrov Aug. 16, 2024, 9:04 a.m. UTC | #7
On 16/08/2024 11:37, Nikolay Aleksandrov wrote:
> On 16/08/2024 11:32, Hangbin Liu wrote:
>> On Fri, Aug 16, 2024 at 09:06:12AM +0300, Nikolay Aleksandrov wrote:
>>> On 16/08/2024 06:55, Hangbin Liu wrote:
>>>> I planned to add the new XFRM state offload functions after Jianbo's
>>>> patchset [1], but it seems that may take some time. Therefore, I am
>>>> posting these two patches to net-next now, as our users are waiting for
>>>> this functionality. If Jianbo's patch is applied first, I can update these
>>>> patches accordingly.
>>>>
>>>> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
>>>>
>>>> Hangbin Liu (2):
>>>>   bonding: Add ESN support to IPSec HW offload
>>>>   bonding: support xfrm state update
>>>>
>>>>  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
>>>>  1 file changed, 76 insertions(+)
>>>>
>>>
>>> (not related to this set, but to bond xfrm)
>>> By the way looking at bond's xfrm code, what prevents bond_ipsec_offload_ok()
>>> from dereferencing a null ptr?
>>> I mean it does:
>>>         curr_active = rcu_dereference(bond->curr_active_slave);
>>>         real_dev = curr_active->dev;
>>>
>>> If this is running only under RCU as the code suggests then
>>> curr_active_slave can change to NULL in parallel. Should there be a
>>> check for curr_active before deref or am I missing something?
>>
>> Yes, we can do like
>> real_dev = curr_active ? curr_active->dev : NULL;
>>
>> Thanks
>> Hangbin
> 
> Right, let me try and trigger it and I'll send a patch. :)
> 

Just fyi I was able to trigger this null deref easily and one more
thing - there is a second null deref after this one because real_dev is
used directly :(

I'll send a patch to fix both in a bit.
Nikolay Aleksandrov Aug. 16, 2024, 9:58 a.m. UTC | #8
On 16/08/2024 12:04, Nikolay Aleksandrov wrote:
> On 16/08/2024 11:37, Nikolay Aleksandrov wrote:
>> On 16/08/2024 11:32, Hangbin Liu wrote:
>>> On Fri, Aug 16, 2024 at 09:06:12AM +0300, Nikolay Aleksandrov wrote:
>>>> On 16/08/2024 06:55, Hangbin Liu wrote:
>>>>> I planned to add the new XFRM state offload functions after Jianbo's
>>>>> patchset [1], but it seems that may take some time. Therefore, I am
>>>>> posting these two patches to net-next now, as our users are waiting for
>>>>> this functionality. If Jianbo's patch is applied first, I can update these
>>>>> patches accordingly.
>>>>>
>>>>> [1] https://lore.kernel.org/netdev/20240815142103.2253886-2-tariqt@nvidia.com
>>>>>
>>>>> Hangbin Liu (2):
>>>>>   bonding: Add ESN support to IPSec HW offload
>>>>>   bonding: support xfrm state update
>>>>>
>>>>>  drivers/net/bonding/bond_main.c | 76 +++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 76 insertions(+)
>>>>>
>>>>
>>>> (not related to this set, but to bond xfrm)
>>>> By the way looking at bond's xfrm code, what prevents bond_ipsec_offload_ok()
>>>> from dereferencing a null ptr?
>>>> I mean it does:
>>>>         curr_active = rcu_dereference(bond->curr_active_slave);
>>>>         real_dev = curr_active->dev;
>>>>
>>>> If this is running only under RCU as the code suggests then
>>>> curr_active_slave can change to NULL in parallel. Should there be a
>>>> check for curr_active before deref or am I missing something?
>>>
>>> Yes, we can do like
>>> real_dev = curr_active ? curr_active->dev : NULL;
>>>
>>> Thanks
>>> Hangbin
>>
>> Right, let me try and trigger it and I'll send a patch. :)
>>
> 
> Just fyi I was able to trigger this null deref easily and one more
> thing - there is a second null deref after this one because real_dev is
> used directly :(
> 
> I'll send a patch to fix both in a bit.
> 
> 

TBH I don't know how bond xfrm code is running at all. There are *many*
bugs when I took a closer look. I'll try to prepare a patch-set to address
them all. This code is seriously broken...