Message ID | 20240226112816.2616719-1-quic_kriskura@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Series | [RFC] usb: gadget: ncm: Fix handling of zero block length packets |
On Mon, Feb 26, 2024 at 04:58:16PM +0530, Krishna Kurapati wrote:
> While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
> set to 65536, it has been observed that we receive short packets,
> which come at interval of 5-10 seconds sometimes and have block
> length zero but still contain 1-2 valid datagrams present.
>
> According to the NCM spec:
>
> "If wBlockLength = 0x0000, the block is terminated by a
> short packet. In this case, the USB transfer must still
> be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
> exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
> and the size is a multiple of wMaxPacketSize for the
> given pipe, then no ZLP shall be sent.
>
> wBlockLength= 0x0000 must be used with extreme care, because
> of the possibility that the host and device may get out of
> sync, and because of test issues.
>
> wBlockLength = 0x0000 allows the sender to reduce latency by
> starting to send a very large NTB, and then shortening it when
> the sender discovers that there’s not sufficient data to justify
> sending a large NTB"
>
> However, there is a potential issue with the current implementation,
> as it checks for the occurrence of multiple NTBs in a single
> giveback by verifying if the leftover bytes to be processed is zero
> or not. If the block length reads zero, we would process the same
> NTB infinitely because the leftover bytes is never zero and it leads
> to a crash. Fix this by bailing out if block length reads zero.
>
> Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
> Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
> ---
>
> PS: Although this issue was seen after CDC_NCM_NTB_DEF_SIZE_TX
> was modified to 64K on host side, I still believe this
> can come up at any time as per the spec. Also I assumed
> that the giveback where block length is zero, has only
> one NTB and not multiple ones.

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases. To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
On Mon, Feb 26, 2024 at 3:28 AM Krishna Kurapati <quic_kriskura@quicinc.com> wrote:
>
> While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
> set to 65536, it has been observed that we receive short packets,
> which come at interval of 5-10 seconds sometimes and have block
> length zero but still contain 1-2 valid datagrams present.
>
> According to the NCM spec:
>
> "If wBlockLength = 0x0000, the block is terminated by a
> short packet. In this case, the USB transfer must still
> be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
> exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
> and the size is a multiple of wMaxPacketSize for the
> given pipe, then no ZLP shall be sent.
>
> wBlockLength= 0x0000 must be used with extreme care, because
> of the possibility that the host and device may get out of
> sync, and because of test issues.
>
> wBlockLength = 0x0000 allows the sender to reduce latency by
> starting to send a very large NTB, and then shortening it when
> the sender discovers that there’s not sufficient data to justify
> sending a large NTB"
>
> However, there is a potential issue with the current implementation,
> as it checks for the occurrence of multiple NTBs in a single
> giveback by verifying if the leftover bytes to be processed is zero
> or not. If the block length reads zero, we would process the same
> NTB infinitely because the leftover bytes is never zero and it leads
> to a crash. Fix this by bailing out if block length reads zero.
>
> Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
> Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
> ---
>
> PS: Although this issue was seen after CDC_NCM_NTB_DEF_SIZE_TX
> was modified to 64K on host side, I still believe this
> can come up at any time as per the spec. Also I assumed
> that the giveback where block length is zero, has only
> one NTB and not multiple ones.
>
>  drivers/usb/gadget/function/f_ncm.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
> index e2a059cfda2c..355e370e5140 100644
> --- a/drivers/usb/gadget/function/f_ncm.c
> +++ b/drivers/usb/gadget/function/f_ncm.c
> @@ -1337,6 +1337,9 @@ static int ncm_unwrap_ntb(struct gether *port,
>  	VDBG(port->func.config->cdev,
>  	     "Parsed NTB with %d frames\n", dgram_counter);
>
> +	if (block_len == 0)
> +		goto done;
> +
>  	to_process -= block_len;
>
>  	/*
> @@ -1351,6 +1354,7 @@ static int ncm_unwrap_ntb(struct gether *port,
>  		goto parse_ntb;
>  	}
>
> +done:
>  	dev_consume_skb_any(skb);
>
>  	return 0;
> --
> 2.34.1
>

In general this is of course fine (though see Greg's auto-complaint).

I haven't thought too much about this, but I just wonder whether the
check for block_len == 0
shouldn't be just after block_len is read, ie. somewhere just after:

   block_len = get_ncm(&tmp, opts->block_length);

as it is kind of weird to be handling block_len == 0 at the point where
you are already theoretically done processing the block...

I guess, as is, this assumes the block isn't actually of length 0,
since there's a bunch of following get_ncm() calls...
Are those guaranteed to be valid?

I guess I don't actually see the infinite loop with block_len == 0,
since get_ncm() always moves us forward...

Maybe your patch *is* correct as is, and you just need a comment
explaining *why* block_len == 0 is terminal at the spot you're adding the check.

Also couldn't you fix this without goto, by changing

   } else if (to_process > 0) {
to
   } else if (to_process && block_len) {
     // See NCM spec. zero block_len means short packet.

--
Maciej Żenczykowski, Kernel Networking Developer @ Google
On 2/27/2024 3:26 AM, Maciej Żenczykowski wrote:
> On Mon, Feb 26, 2024 at 3:28 AM Krishna Kurapati
> <quic_kriskura@quicinc.com> wrote:
>>
>> While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
>> set to 65536, it has been observed that we receive short packets,
>> which come at interval of 5-10 seconds sometimes and have block
>> length zero but still contain 1-2 valid datagrams present.
>>
>> According to the NCM spec:
>>
>> "If wBlockLength = 0x0000, the block is terminated by a
>> short packet. In this case, the USB transfer must still
>> be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
>> exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
>> and the size is a multiple of wMaxPacketSize for the
>> given pipe, then no ZLP shall be sent.
>>
>> wBlockLength= 0x0000 must be used with extreme care, because
>> of the possibility that the host and device may get out of
>> sync, and because of test issues.
>>
>> wBlockLength = 0x0000 allows the sender to reduce latency by
>> starting to send a very large NTB, and then shortening it when
>> the sender discovers that there’s not sufficient data to justify
>> sending a large NTB"
>>
>> However, there is a potential issue with the current implementation,
>> as it checks for the occurrence of multiple NTBs in a single
>> giveback by verifying if the leftover bytes to be processed is zero
>> or not. If the block length reads zero, we would process the same
>> NTB infinitely because the leftover bytes is never zero and it leads
>> to a crash. Fix this by bailing out if block length reads zero.
>>
>> Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
>> Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
>> ---
>>
>> PS: Although this issue was seen after CDC_NCM_NTB_DEF_SIZE_TX
>> was modified to 64K on host side, I still believe this
>> can come up at any time as per the spec. Also I assumed
>> that the giveback where block length is zero, has only
>> one NTB and not multiple ones.
>>
>>  drivers/usb/gadget/function/f_ncm.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
>> index e2a059cfda2c..355e370e5140 100644
>> --- a/drivers/usb/gadget/function/f_ncm.c
>> +++ b/drivers/usb/gadget/function/f_ncm.c
>> @@ -1337,6 +1337,9 @@ static int ncm_unwrap_ntb(struct gether *port,
>>  	VDBG(port->func.config->cdev,
>>  	     "Parsed NTB with %d frames\n", dgram_counter);
>>
>> +	if (block_len == 0)
>> +		goto done;
>> +
>>  	to_process -= block_len;
>>
>>  	/*
>> @@ -1351,6 +1354,7 @@ static int ncm_unwrap_ntb(struct gether *port,
>>  		goto parse_ntb;
>>  	}
>>
>> +done:
>>  	dev_consume_skb_any(skb);
>>
>>  	return 0;
>> --
>> 2.34.1
>>
>
> In general this is of course fine (though see Greg's auto-complaint).
>
> I haven't thought too much about this, but I just wonder whether the
> check for block_len == 0
> shouldn't be just after block_len is read, ie. somewhere just after:
>
>    block_len = get_ncm(&tmp, opts->block_length);
>
> as it is kind of weird to be handling block_len == 0 at the point where
> you are already theoretically done processing the block...
>
> I guess, as is, this assumes the block isn't actually of length 0,
> since there's a bunch of following get_ncm() calls...
> Are those guaranteed to be valid?
>

I did get this doubt and tried it. I bailed out as soon as I found out
block len is zero without actually processing the datagrams present and
when I did that even ping doesn't work. Everything works only when the
datagrams in this zero block len NTB are parsed properly.

> I guess I don't actually see the infinite loop with block_len == 0,
> since get_ncm() always moves us forward...
>

The infinite loop occurs because we keep moving the buffer pointer
forward and keep processing the giveback until to_process variable
becomes zero or one. In case block length is zero, we never move the
buffer pointer forward and never reduce to_process variable and hence
keep infinitely processing the same NTB over and over again.

> Maybe your patch *is* correct as is, and you just need a comment
> explaining *why* block_len == 0 is terminal at the spot you're adding the check.
>
> Also couldn't you fix this without goto, by changing
>
>    } else if (to_process > 0) {
> to
>    } else if (to_process && block_len) {
>      // See NCM spec. zero block_len means short packet.
>

I will test this out once (although I know that looking at it, it would
definitely work) and send v2 with this diff.

Thanks for the review.

Regards,
Krishna,
On 2/27/2024 8:10 AM, Krishna Kurapati PSSNV wrote:
>
>>
>> In general this is of course fine (though see Greg's auto-complaint).
>>
>> I haven't thought too much about this, but I just wonder whether the
>> check for block_len == 0
>> shouldn't be just after block_len is read, ie. somewhere just after:
>>
>>    block_len = get_ncm(&tmp, opts->block_length);
>>
>> as it is kind of weird to be handling block_len == 0 at the point where
>> you are already theoretically done processing the block...
>>
>> I guess, as is, this assumes the block isn't actually of length 0,
>> since there's a bunch of following get_ncm() calls...
>> Are those guaranteed to be valid?
>>
>
> I did get this doubt and tried it. I bailed out as soon as I found out
> block len is zero without actually processing the datagrams present and
> when I did that even ping doesn't work. Everything works only when the
> datagrams in this zero block len NTB are parsed properly.
>
>> I guess I don't actually see the infinite loop with block_len == 0,
>> since get_ncm() always moves us forward...
>>
>
> The infinite loop occurs because we keep moving the buffer pointer
> forward and keep processing the giveback until to_process variable
> becomes zero or one. In case block length is zero, we never move the
> buffer pointer forward and never reduce to_process variable and hence
> keep infinitely processing the same NTB over and over again.
>
>> Maybe your patch *is* correct as is, and you just need a comment
>> explaining *why* block_len == 0 is terminal at the spot you're adding
>> the check.
>>
>> Also couldn't you fix this without goto, by changing
>>
>>    } else if (to_process > 0) {
>> to
>>    } else if (to_process && block_len) {
>>      // See NCM spec. zero block_len means short packet.
>>
>
> I will test this out once (although I know that looking at it, it would
> definitely work) and send v2 with this diff.
>
> Thanks for the review.
>

Hi Maciej, Greg,

Thanks for approving v2. Not sure if this is the right forum to ask this
question, but had one query.

In the NCM driver, the register_netdev is called during bind but the
cleanup for that is called during free_inst. Meaning if usb0 interface
is created for ncm on bind or a composition switch into NCM (first comp
switch after bootup), then it is removed only after removing the entire
g1/functions/ncm.0 folder.

Shouldn't we cleanup and remove the usb0 interface in unbind as a
counter operation of bind? By extension this question also applies to
f_eem/ f_ecm/ f_rndis where it was done in similar manner. So was
wondering if anyone could help me with info on why it was designed that
way.

Regards,
Krishna,
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index e2a059cfda2c..355e370e5140 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -1337,6 +1337,9 @@ static int ncm_unwrap_ntb(struct gether *port,
 	VDBG(port->func.config->cdev,
 	     "Parsed NTB with %d frames\n", dgram_counter);
 
+	if (block_len == 0)
+		goto done;
+
 	to_process -= block_len;
 
 	/*
@@ -1351,6 +1354,7 @@ static int ncm_unwrap_ntb(struct gether *port,
 		goto parse_ntb;
 	}
 
+done:
 	dev_consume_skb_any(skb);
 
 	return 0;
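
To make the control flow around this hunk easier to follow, here is a minimal user-space sketch. It is not the f_ncm.c code. It assumes a simplified NTB whose first two bytes hold the block length, and the names walk_ntbs, read_le16, MAX_ITER and the guard flag are hypothetical, introduced only for illustration. With guard set, the loop condition mirrors the "to_process && block_len" form suggested in review.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITER 1000	/* hypothetical cap so the sketch always returns */

/* Read a little-endian 16-bit value. */
static unsigned int read_le16(const unsigned char *p)
{
	return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/*
 * Walk simplified NTBs packed into one transfer of 'len' bytes.
 * Assumption: each NTB starts with its 16-bit block length.
 * Returns the number of NTBs parsed, or -1 if MAX_ITER was reached,
 * which stands in for the endless re-parse of the same NTB.
 */
static int walk_ntbs(const unsigned char *buf, int len, int guard)
{
	const unsigned char *ntb_ptr = buf;
	int to_process = len;
	int parsed = 0;

	assert(buf != NULL);

	while (parsed < MAX_ITER) {
		unsigned int block_len;

		if (to_process < 2)
			return parsed;	/* no room left for a length field */

		block_len = read_le16(ntb_ptr);
		if (block_len > (unsigned int)to_process)
			return parsed;	/* malformed length: stop */

		parsed++;
		to_process -= (int)block_len;

		/* guard != 0 mirrors "to_process && block_len" */
		if (to_process > 0 && (!guard || block_len != 0)) {
			ntb_ptr += block_len;	/* no progress when block_len == 0 */
			continue;
		}
		return parsed;
	}
	return -1;
}
```

For a 16-byte transfer whose first NTB reports a block length of zero, walk_ntbs(buf, 16, 0) hits the cap and returns -1, while walk_ntbs(buf, 16, 1) parses that NTB once and returns 1. The one-byte padding case handled by the real driver is left out of this sketch.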
While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
set to 65536, it has been observed that we receive short packets,
which come at interval of 5-10 seconds sometimes and have block
length zero but still contain 1-2 valid datagrams present.

According to the NCM spec:

"If wBlockLength = 0x0000, the block is terminated by a
short packet. In this case, the USB transfer must still
be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
and the size is a multiple of wMaxPacketSize for the
given pipe, then no ZLP shall be sent.

wBlockLength= 0x0000 must be used with extreme care, because
of the possibility that the host and device may get out of
sync, and because of test issues.

wBlockLength = 0x0000 allows the sender to reduce latency by
starting to send a very large NTB, and then shortening it when
the sender discovers that there’s not sufficient data to justify
sending a large NTB"

However, there is a potential issue with the current implementation,
as it checks for the occurrence of multiple NTBs in a single
giveback by verifying if the leftover bytes to be processed is zero
or not. If the block length reads zero, we would process the same
NTB infinitely because the leftover bytes is never zero and it leads
to a crash. Fix this by bailing out if block length reads zero.

Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
---

PS: Although this issue was seen after CDC_NCM_NTB_DEF_SIZE_TX
was modified to 64K on host side, I still believe this
can come up at any time as per the spec. Also I assumed
that the giveback where block length is zero, has only
one NTB and not multiple ones.

 drivers/usb/gadget/function/f_ncm.c | 4 ++++
 1 file changed, 4 insertions(+)
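
As an illustration of the spec text quoted above, the following sketch reads wBlockLength from a 16-bit NTB header (NTH16) and treats zero as "the block runs to the end of the USB transfer". It is a standalone example under stated assumptions, not the driver's parser. The helper names (get_le16, get_le32, nth16_effective_len) and the convention of returning 0 for a malformed header are hypothetical. The field offsets follow the NTH16 layout as I understand it from the NCM specification: dwSignature at 0, wHeaderLength at 4, wSequence at 6, wBlockLength at 8, wNdpIndex at 10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NTH16_SIGN 0x484D434EU	/* "NCMH" read as a little-endian u32 */
#define NTH16_LEN  12U		/* size of the NTH16 header in bytes */

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the number of bytes the NTB at 'buf' occupies inside a USB
 * transfer of 'transfer_len' bytes, or 0 if the header looks malformed
 * (a convention chosen for this sketch only).
 */
static size_t nth16_effective_len(const uint8_t *buf, size_t transfer_len)
{
	uint16_t block_len;

	assert(buf != NULL);

	if (transfer_len < NTH16_LEN)
		return 0;
	if (get_le32(buf) != NTH16_SIGN)
		return 0;
	if (get_le16(buf + 4) != NTH16_LEN)
		return 0;

	block_len = get_le16(buf + 8);
	if (block_len == 0)
		return transfer_len;	/* terminated by the short packet */
	if (block_len < NTH16_LEN || block_len > transfer_len)
		return 0;
	return block_len;
}
```

With a zero wBlockLength and a 40-byte transfer the function returns 40, so a caller has nothing left to process afterwards. This matches the assumption stated in the PS that such a giveback carries a single NTB.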