Message ID | 20210614141849.3587683-1-kristian.evensen@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] qmi_wwan: Clone the skb when in pass-through mode | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | fail | 1 blamed authors not CCed: kuba@kernel.org; 3 maintainers not CCed: kuba@kernel.org davem@davemloft.net linux-usb@vger.kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | fail | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 10 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
Kristian Evensen <kristian.evensen@gmail.com> writes: > The skb that we pass to the rmnet driver is owned by usbnet and is freed > soon after the rx_fixup() callback is called (in usbnet_bh()). There is > no guarantee that rmnet is done handling the skb before it is freed. We > should clone the skb before we call netif_rx() to prevent use-after-free > and misc. kernel oops. > > Fixes: 59e139cf0b32 ("net: qmi_wwan: Add pass through mode") > > Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> > --- > drivers/net/usb/qmi_wwan.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c > index db8d3a4f2678..5ac307eb0bfd 100644 > --- a/drivers/net/usb/qmi_wwan.c > +++ b/drivers/net/usb/qmi_wwan.c > @@ -620,6 +620,10 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb) > return qmimux_rx_fixup(dev, skb); > > if (info->flags & QMI_WWAN_FLAG_PASS_THROUGH) { > + skb = skb_clone(skb, GFP_ATOMIC); > + if (!skb) > + return 0; > + > skb->protocol = htons(ETH_P_MAP); > return (netif_rx(skb) == NET_RX_SUCCESS); > } Thanks for pointing this out. But it still looks strange to me. Why do we call netif_rx(skb) here instead of just returning 1 and leave that for usbnet_skb_return()? With cloning we end up doing eth_type_trans() on the duplicate - is that wise? Bjørn
Hi Bjørn, On Mon, Jun 14, 2021 at 4:46 PM Bjørn Mork <bjorn@mork.no> wrote: > Thanks for pointing this out. But it still looks strange to me. Why do > we call netif_rx(skb) here instead of just returning 1 and leave that > for usbnet_skb_return()? With cloning we end up doing eth_type_trans() > on the duplicate - is that wise? Thanks for pointing this out. You are right and looking at the code again, we should end up in usbnet_skb_return() if the call to netif_rx is succesful. I do agree that calling netif_rx() is a bit strange, considering that usbnet_skb_return() already calls this function. Perhaps the issue is that we end up calling netif_rx() on the same skb twice? rmnet also frees the skb when done with the de-aggregation. I will do some more testing tomorrow and see if I can figure out something. So far, the only thing I know is that if I try to perform a more demanding transfer (like downloading a file) using qmi_wwan + rmnet then I get a kernel oops immediatly. Kristian
Hi Bjørn, On Mon, Jun 14, 2021 at 5:49 PM Kristian Evensen <kristian.evensen@gmail.com> wrote: > I will do some more testing tomorrow and see if I can figure out > something. So far, the only thing I know is that if I try to perform a > more demanding transfer (like downloading a file) using qmi_wwan + > rmnet then I get a kernel oops immediatly. I got curios and decided to test your proposed changed today. Replacing the call to netif_rx() with return 1, and then letting usbnet_skb_return() do the netif_rx() call, seems to have resolved the issue. At least I am not longer able to trigger the oops using my earlier reproducer (running a couple of Speedtest). Unless someone beats me to it, I will prepare another patch tomorrow. Thanks for the help. Kristian
On Mon, 14 Jun 2021 16:45:55 +0200 Bjørn Mork wrote: > Kristian Evensen <kristian.evensen@gmail.com> writes: > > > The skb that we pass to the rmnet driver is owned by usbnet and is freed > > soon after the rx_fixup() callback is called (in usbnet_bh()). There is > > no guarantee that rmnet is done handling the skb before it is freed. We > > should clone the skb before we call netif_rx() to prevent use-after-free > > and misc. kernel oops. > > > > Fixes: 59e139cf0b32 ("net: qmi_wwan: Add pass through mode") > > > > Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> > > --- > > drivers/net/usb/qmi_wwan.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c > > index db8d3a4f2678..5ac307eb0bfd 100644 > > --- a/drivers/net/usb/qmi_wwan.c > > +++ b/drivers/net/usb/qmi_wwan.c > > @@ -620,6 +620,10 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb) > > return qmimux_rx_fixup(dev, skb); > > > > if (info->flags & QMI_WWAN_FLAG_PASS_THROUGH) { > > + skb = skb_clone(skb, GFP_ATOMIC); > > + if (!skb) > > + return 0; > > + > > skb->protocol = htons(ETH_P_MAP); > > return (netif_rx(skb) == NET_RX_SUCCESS); > > } > > Thanks for pointing this out. But it still looks strange to me. Why do > we call netif_rx(skb) here instead of just returning 1 and leave that > for usbnet_skb_return()? With cloning we end up doing eth_type_trans() > on the duplicate - is that wise? Agreed on the cloning being a strange solution. Kristian, were you able to reproduce the problem on upstream kernels? It does look pretty strange that qmimux_rx_fixup() copies out all packets and receives them, and then let's usbnet to process the multi-frame skb without even fulling off the qmimux_hdr. I'm probably missing something.. otherwise sth like FLAG_MULTI_PACKET may be in order?
Jakub Kicinski <kuba@kernel.org> writes: > It does look pretty strange that qmimux_rx_fixup() copies out all > packets and receives them, and then let's usbnet to process the > multi-frame skb without even fulling off the qmimux_hdr. I'm probably > missing something.. otherwise sth like FLAG_MULTI_PACKET may be in > order? Yes, maybe it is? We'd have to call usbnet_skb_return() for the plain IP frames then, but that might come out cleaner? Bjørn
Hi Jakub, On Mon, Jun 14, 2021 at 10:05 PM Jakub Kicinski <kuba@kernel.org> wrote: > Agreed on the cloning being a strange solution. Kristian, were you able > to reproduce the problem on upstream kernels? Yes, after Bjørn's comment I realized that cloning was not a good solution. I should have taken a closer look at the usbnet code, so sorry about that. The most recent kernel I have managed to work well with my boards is 5.4.123, but I see that 5.10 is available as well (OpenWrt). However, I have backported all changes made to rmnet and qmi_wwan between 5.4 and net-next as of yesterday. I was hoping that a backport of the changes to those two drivers would be enough, but perhaps there is something I have missed. I will try to get 5.10 to run and see if that helps. However, I have spent some more time looking into the code today. Bjørn is right that calling netif_rx() from qmi_wwan is strange, at least when in passthrough mode. The rx_fixup function will return 1 (assuming netif_rx() is successful), which in turn will lead to usbnet_skb_return() being called and netif_rx() being called a second time for the same skb. I have to admit that I don't know what will happen when netif_rx() is called twice, but either call seems redundant. I will submit a patch modifying the qmi_wwan rx_fixup function to return 1 when the QMI_WWAN_FLAG_PASS_THROUGH. I believe it is a nice clean-up and that is better to use as much of the existing infrastructure as possible. > It does look pretty strange that qmimux_rx_fixup() copies out all > packets and receives them, and then let's usbnet to process the > multi-frame skb without even fulling off the qmimux_hdr. I'm probably > missing something.. otherwise sth like FLAG_MULTI_PACKET may be in > order? qmimux_rx_fixup() is different from what we are discussing here. qmimux_rx_fixup() is used when the de-aggregation is performed by the qmi_wwan driver, while the passthrough flag is set when the de-aggregation is done by the rmnet driver. The logic in qmimux_rx_fixup() is very similar to how the other usbnet mini-drivers handles de-aggregation and also how de-aggregation is handled by for example rmnet. I have no opinion on if the logic makes sens or not, but at least the origin can be traced :) Kristian
Kristian Evensen <kristian.evensen@gmail.com> writes: >> It does look pretty strange that qmimux_rx_fixup() copies out all >> packets and receives them, and then let's usbnet to process the >> multi-frame skb without even fulling off the qmimux_hdr. I'm probably >> missing something.. otherwise sth like FLAG_MULTI_PACKET may be in >> order? > > qmimux_rx_fixup() is different from what we are discussing here. > qmimux_rx_fixup() is used when the de-aggregation is performed by the > qmi_wwan driver, while the passthrough flag is set when the > de-aggregation is done by the rmnet driver. The logic in > qmimux_rx_fixup() is very similar to how the other usbnet mini-drivers > handles de-aggregation and also how de-aggregation is handled by for > example rmnet. I have no opinion on if the logic makes sens or not, > but at least the origin can be traced :) Yes, FLAG_MULTI_PACKET is only applicable to the qmimux case. But I think Jakub is right that we should set it anyway. There is no way to return from rx_fixup without an error or further processing of the skb, unless we set FLAG_MULTI_PACKET. Or invent something else. But setting that flag and then add the necessary usnet_sb_return call doesn't look too bad? Bjørn
Hi Bjørn, On Tue, Jun 15, 2021 at 12:04 PM Bjørn Mork <bjorn@mork.no> wrote: > Yes, FLAG_MULTI_PACKET is only applicable to the qmimux case. But I > think Jakub is right that we should set it anyway. There is no way to > return from rx_fixup without an error or further processing of the skb, > unless we set FLAG_MULTI_PACKET. Or invent something else. But setting > that flag and then add the necessary usnet_sb_return call doesn't look > too bad? Just so that I am sure that we are on the same page. What you are suggesting is something like: * Update FLAG_MULTI_PACKET when qmimux is set/unset. * Replace the call to netif_rx() inside qmimux_rx_fixup() with a call to usbnet_skb_return. * I guess we need to keep the code that updates the qmimux interface counters. I guess we can just call this code unconditionally? I think this would be a really nice solution. The same (at least FLAG_MULTI_PACKET + usbnet_skb_return) could be applied to pass through as well, giving us consistent handling of aggregated packets. While we might not save a huge number of lines, I believe the resulting code will be easier to understand. If we agree that this is a good way forward, I can prepare a patch. I have everything set up and ready to go. Kristian
Hi, On Tue, Jun 15, 2021 at 12:51 PM Kristian Evensen <kristian.evensen@gmail.com> wrote: > I think this would be a really nice solution. The same (at least > FLAG_MULTI_PACKET + usbnet_skb_return) could be applied to pass > through as well, giving us consistent handling of aggregated packets. > While we might not save a huge number of lines, I believe the > resulting code will be easier to understand. Apologies for the noise. When I check the code again, I see that as long as FLAG_MULTI_PACKET is set, then we end up with usbnet freeing the skb (we will always jump to done in rx_process()). So for the pass-through case, I believe your initial suggestion of having rx_fixup return 1 is the way to go. Kristian
Kristian Evensen <kristian.evensen@gmail.com> writes: > On Tue, Jun 15, 2021 at 12:51 PM Kristian Evensen > <kristian.evensen@gmail.com> wrote: >> I think this would be a really nice solution. The same (at least >> FLAG_MULTI_PACKET + usbnet_skb_return) could be applied to pass >> through as well, giving us consistent handling of aggregated packets. >> While we might not save a huge number of lines, I believe the >> resulting code will be easier to understand. > > Apologies for the noise. When I check the code again, I see that as > long as FLAG_MULTI_PACKET is set, then we end up with usbnet freeing > the skb (we will always jump to done in rx_process()). So for the > pass-through case, I believe your initial suggestion of having > rx_fixup return 1 is the way to go. Yes, if we are to use FLAG_MULTI_PACKET then we must call usbnet_skb_return() for all the non-muxed cases. There is no clean way to enable FLAG_MULTI_PACKET on-demand. I am fine with either solution. Whatever Jakub wants :-) Bjørn
On Tue, 15 Jun 2021 15:39:14 +0200 Bjørn Mork wrote: > >> I think this would be a really nice solution. The same (at least > >> FLAG_MULTI_PACKET + usbnet_skb_return) could be applied to pass > >> through as well, giving us consistent handling of aggregated packets. > >> While we might not save a huge number of lines, I believe the > >> resulting code will be easier to understand. > > > > Apologies for the noise. When I check the code again, I see that as > > long as FLAG_MULTI_PACKET is set, then we end up with usbnet freeing > > the skb (we will always jump to done in rx_process()). So for the > > pass-through case, I believe your initial suggestion of having > > rx_fixup return 1 is the way to go. > > Yes, if we are to use FLAG_MULTI_PACKET then we must call > usbnet_skb_return() for all the non-muxed cases. There is no clean way > to enable FLAG_MULTI_PACKET on-demand. Tricky piece of code. Perhaps we could add another return code to the rx_fixup call? Seems that we expect 0 or 1 today, maybe we can make 2 mean "data was copied out", and use that for the qmimux case? > I am fine with either solution. Whatever Jakub wants :-) Well, turns out I was looking at the wrong netif_rx() so take what I say with a grain of salt ;)
On Tue, 15 Jun 2021 12:26:04 -0700 Jakub Kicinski wrote: > > > Apologies for the noise. When I check the code again, I see that as > > > long as FLAG_MULTI_PACKET is set, then we end up with usbnet freeing > > > the skb (we will always jump to done in rx_process()). So for the > > > pass-through case, I believe your initial suggestion of having > > > rx_fixup return 1 is the way to go. > > > > Yes, if we are to use FLAG_MULTI_PACKET then we must call > > usbnet_skb_return() for all the non-muxed cases. There is no clean way > > to enable FLAG_MULTI_PACKET on-demand. > > Tricky piece of code. Perhaps we could add another return code > to the rx_fixup call? Seems that we expect 0 or 1 today, maybe we > can make 2 mean "data was copied out", and use that for the qmimux > case? And to be clear with that still do what Bjorn suggested and return 1 instead of netif_rx() in the non-qmimux case.
Hi, On Tue, Jun 15, 2021 at 9:26 PM Jakub Kicinski <kuba@kernel.org> wrote: > Tricky piece of code. Perhaps we could add another return code > to the rx_fixup call? Seems that we expect 0 or 1 today, maybe we > can make 2 mean "data was copied out", and use that for the qmimux > case? I took another look at the qmi_wwan and usbnet code, and I think we can solve the problem at hand with the current infrastructure. I believe all we have to do, is to change qmimux_rx_fixup() to return 0 and not 1. When the rx_fixup() callback returns 0, rx_progress() will jump to "done" and queue the skb for clean-up. This is the same behavior as if FLAG_MULTI_PACKET was set. I will test and prepare a patch changing the return value, assuming it works well. Looking at the loop in qmimux_rx_fixup I also see room for some clean-up, I will try to merge everything into one. Kristian
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index db8d3a4f2678..5ac307eb0bfd 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -620,6 +620,10 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb) return qmimux_rx_fixup(dev, skb); if (info->flags & QMI_WWAN_FLAG_PASS_THROUGH) { + skb = skb_clone(skb, GFP_ATOMIC); + if (!skb) + return 0; + skb->protocol = htons(ETH_P_MAP); return (netif_rx(skb) == NET_RX_SUCCESS); }
The skb that we pass to the rmnet driver is owned by usbnet and is freed soon after the rx_fixup() callback is called (in usbnet_bh()). There is no guarantee that rmnet is done handling the skb before it is freed. We should clone the skb before we call netif_rx() to prevent use-after-free and misc. kernel oops. Fixes: 59e139cf0b32 ("net: qmi_wwan: Add pass through mode") Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> --- drivers/net/usb/qmi_wwan.c | 4 ++++ 1 file changed, 4 insertions(+)