Message ID | 1a9566c0a41ad0d940487a9d3f0008993c075ef2.1560461404.git.lorenzo@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Felix Fietkau |
Series | mt76: usb: fix A-MSDU support |
On Thu, Jun 13, 2019 at 11:43:13PM +0200, Lorenzo Bianconi wrote:
> Set usb buffer size taking into account skb_shared_info in order to
> not always copy the first part of received frames if A-MSDU is enabled
> for SG capable devices. Moreover align usb buffer size to max_ep
> boundaries and set buf_size to PAGE_SIZE even for sg case

I think this should not be applied to wireless-drivers, only first patch
that fixes the bug and optimizations should be done in -next.

> +	int i, data_size;
>
> +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
>  	for (i = 0; i < nsgs; i++) {
>  		struct page *page;
>  		void *data;
> @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
>
>  		page = virt_to_head_page(data);
>  		offset = data - page_address(page);
> -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> +		sg_set_page(&urb->sg[i], page, data_size, offset);
<snip>
> -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
>  	q->ndesc = MT_NUM_RX_ENTRIES;
> +	q->buf_size = PAGE_SIZE;
> +

This should be associated with decrease of MT_SG_MAX_SIZE to value that
is actually needed and currently this is 2 for 4k AMSDU.

However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
copy is worth doing. For me best approach would be allocate 1 page for
4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
change to avoid 32B copying).

Stanislaw
> On Thu, Jun 13, 2019 at 11:43:13PM +0200, Lorenzo Bianconi wrote:
> > Set usb buffer size taking into account skb_shared_info in order to
> > not always copy the first part of received frames if A-MSDU is enabled
> > for SG capable devices. Moreover align usb buffer size to max_ep
> > boundaries and set buf_size to PAGE_SIZE even for sg case
>
> I think this should not be applied to wireless-drivers, only first patch
> that fixes the bug and optimizations should be done in -next.

ack, right. I think patch 2/3 and 3/3 can go directly in Felix's tree

>
> > +	int i, data_size;
> >
> > +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> > +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
> >  	for (i = 0; i < nsgs; i++) {
> >  		struct page *page;
> >  		void *data;
> > @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
> >
> >  		page = virt_to_head_page(data);
> >  		offset = data - page_address(page);
> > -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> > +		sg_set_page(&urb->sg[i], page, data_size, offset);
> <snip>
> > -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
> >  	q->ndesc = MT_NUM_RX_ENTRIES;
> > +	q->buf_size = PAGE_SIZE;
> > +
>
> This should be associated with decrease of MT_SG_MAX_SIZE to value that
> is actually needed and currently this is 2 for 4k AMSDU.

MT_SG_MAX_SIZE is used even on tx side and I do not think we will end up with a
huge difference here

>
> However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
> copy is worth doing. For me best approach would be allocate 1 page for
> 4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
> change to avoid 32B copying).

From my point of view it is better to avoid copying if it is possible. Are you
sure there is no difference?

Regards,
Lorenzo

>
> Stanislaw
On Fri, Jun 14, 2019 at 12:22:48PM +0200, Lorenzo Bianconi wrote:
> > On Thu, Jun 13, 2019 at 11:43:13PM +0200, Lorenzo Bianconi wrote:
> > > Set usb buffer size taking into account skb_shared_info in order to
> > > not always copy the first part of received frames if A-MSDU is enabled
> > > for SG capable devices. Moreover align usb buffer size to max_ep
> > > boundaries and set buf_size to PAGE_SIZE even for sg case
> >
> > I think this should not be applied to wireless-drivers, only first patch
> > that fixes the bug and optimizations should be done in -next.
>
> ack, right. I think patch 2/3 and 3/3 can go directly in Felix's tree
>
> >
> > > +	int i, data_size;
> > >
> > > +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> > > +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
> > >  	for (i = 0; i < nsgs; i++) {
> > >  		struct page *page;
> > >  		void *data;
> > > @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
> > >
> > >  		page = virt_to_head_page(data);
> > >  		offset = data - page_address(page);
> > > -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> > > +		sg_set_page(&urb->sg[i], page, data_size, offset);
> > <snip>
> > > -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
> > >  	q->ndesc = MT_NUM_RX_ENTRIES;
> > > +	q->buf_size = PAGE_SIZE;
> > > +
> >
> > This should be associated with decrease of MT_SG_MAX_SIZE to value that
> > is actually needed and currently this is 2 for 4k AMSDU.
>
> MT_SG_MAX_SIZE is used even on tx side and I do not think we will end up with a
> huge difference here

So use different value as argument for mt76u_fill_rx_sg() in
mt76u_rx_urb_alloc(). After changing buf_size to PAGE_SIZE we will
allocate 8 pages per rx queue entry, but only 2 pages will be used
(with data_size change, 1 without data_size change). Or am I wrong?

> > However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
> > copy is worth doing. For me best approach would be allocate 1 page for
> > 4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
> > change to avoid 32B copying).
>
> From my point of view it is better to avoid copying if it is possible. Are you
> sure there is no difference?

I do not understand what you mean by difference here.

Stanislaw
> On Fri, Jun 14, 2019 at 12:22:48PM +0200, Lorenzo Bianconi wrote:
> > > On Thu, Jun 13, 2019 at 11:43:13PM +0200, Lorenzo Bianconi wrote:
> > > > Set usb buffer size taking into account skb_shared_info in order to
> > > > not always copy the first part of received frames if A-MSDU is enabled
> > > > for SG capable devices. Moreover align usb buffer size to max_ep
> > > > boundaries and set buf_size to PAGE_SIZE even for sg case
> > >
> > > I think this should not be applied to wireless-drivers, only first patch
> > > that fixes the bug and optimizations should be done in -next.
> >
> > ack, right. I think patch 2/3 and 3/3 can go directly in Felix's tree
> >
> > >
> > > > +	int i, data_size;
> > > >
> > > > +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> > > > +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
> > > >  	for (i = 0; i < nsgs; i++) {
> > > >  		struct page *page;
> > > >  		void *data;
> > > > @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
> > > >
> > > >  		page = virt_to_head_page(data);
> > > >  		offset = data - page_address(page);
> > > > -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> > > > +		sg_set_page(&urb->sg[i], page, data_size, offset);
> > > <snip>
> > > > -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
> > > >  	q->ndesc = MT_NUM_RX_ENTRIES;
> > > > +	q->buf_size = PAGE_SIZE;
> > > > +
> > >
> > > This should be associated with decrease of MT_SG_MAX_SIZE to value that
> > > is actually needed and currently this is 2 for 4k AMSDU.
> >
> > MT_SG_MAX_SIZE is used even on tx side and I do not think we will end up with a
> > huge difference here
>
> So use different value as argument for mt76u_fill_rx_sg() in
> mt76u_rx_urb_alloc(). After changing buf_size to PAGE_SIZE we will
> allocate 8 pages per rx queue entry, but only 2 pages will be used
> (with data_size change, 1 without data_size change). Or am I wrong?

yes, it is right (we will use two pages with data_size change). Maybe better to
use 4 pages for each rx queue entry? (otherwise we will probably change it in
the future)

>
> > > However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
> > > copy is worth doing. For me best approach would be allocate 1 page for
> > > 4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
> > > change to avoid 32B copying).
> >
> > From my point of view it is better to avoid copying if it is possible. Are you
> > sure there is no difference?
>
> I do not understand what you mean by difference here.

tpt differences, not sure if there are any

Regards,
Lorenzo

>
> Stanislaw
On Fri, Jun 14, 2019 at 02:46:36PM +0200, Lorenzo Bianconi wrote:
> > >
> > > ack, right. I think patch 2/3 and 3/3 can go directly in Felix's tree
> > >
> > > >
> > > > > +	int i, data_size;
> > > > >
> > > > > +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> > > > > +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
> > > > >  	for (i = 0; i < nsgs; i++) {
> > > > >  		struct page *page;
> > > > >  		void *data;
> > > > > @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
> > > > >
> > > > >  		page = virt_to_head_page(data);
> > > > >  		offset = data - page_address(page);
> > > > > -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> > > > > +		sg_set_page(&urb->sg[i], page, data_size, offset);
> > > > <snip>
> > > > > -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
> > > > >  	q->ndesc = MT_NUM_RX_ENTRIES;
> > > > > +	q->buf_size = PAGE_SIZE;
> > > > > +
> > > >
> > > > This should be associated with decrease of MT_SG_MAX_SIZE to value that
> > > > is actually needed and currently this is 2 for 4k AMSDU.
> > >
> > > MT_SG_MAX_SIZE is used even on tx side and I do not think we will end up with a
> > > huge difference here
> >
> > So use different value as argument for mt76u_fill_rx_sg() in
> > mt76u_rx_urb_alloc(). After changing buf_size to PAGE_SIZE we will
> > allocate 8 pages per rx queue entry, but only 2 pages will be used
> > (with data_size change, 1 without data_size change). Or am I wrong?
>
> yes, it is right (we will use two pages with data_size change). Maybe better to
> use 4 pages for each rx queue entry? (otherwise we will probably change it in
> the future)

We should not allocate more than is required. If support for bigger
rx AMSDUs will be added and announced in vht/ht capabilities to remote
stations, then increase of number of segments will be needed.

> > > > However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
> > > > copy is worth doing. For me best approach would be allocate 1 page for
> > > > 4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
> > > > change to avoid 32B copying).
> > >
> > > From my point of view it is better to avoid copying if it is possible. Are you
> > > sure there is no difference?
> >
> > I do not understand what you mean by difference here.
>
> tpt differences, not sure if there are any

I would not expect any measurable difference in tpt nor in cpu usage
either way.

But I think, if some AMSDU subframe will be split into two fragments,
data most likely will need to be linearised/copied, at some point before
passed to application, which will outweigh any benefit of avoiding copying
802.11 header. Though, I don't think this somehow will be visible in
benchmarking.

Stanislaw
> On Fri, Jun 14, 2019 at 02:46:36PM +0200, Lorenzo Bianconi wrote:
> > > >
> > > > ack, right. I think patch 2/3 and 3/3 can go directly in Felix's tree
> > > >
> > > > >
> > > > > > +	int i, data_size;
> > > > > >
> > > > > > +	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
> > > > > > +			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
> > > > > >  	for (i = 0; i < nsgs; i++) {
> > > > > >  		struct page *page;
> > > > > >  		void *data;
> > > > > > @@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
> > > > > >
> > > > > >  		page = virt_to_head_page(data);
> > > > > >  		offset = data - page_address(page);
> > > > > > -		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
> > > > > > +		sg_set_page(&urb->sg[i], page, data_size, offset);
> > > > > <snip>
> > > > > > -	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
> > > > > >  	q->ndesc = MT_NUM_RX_ENTRIES;
> > > > > > +	q->buf_size = PAGE_SIZE;
> > > > > > +
> > > > >
> > > > > This should be associated with decrease of MT_SG_MAX_SIZE to value that
> > > > > is actually needed and currently this is 2 for 4k AMSDU.
> > > >
> > > > MT_SG_MAX_SIZE is used even on tx side and I do not think we will end up with a
> > > > huge difference here
> > >
> > > So use different value as argument for mt76u_fill_rx_sg() in
> > > mt76u_rx_urb_alloc(). After changing buf_size to PAGE_SIZE we will
> > > allocate 8 pages per rx queue entry, but only 2 pages will be used
> > > (with data_size change, 1 without data_size change). Or am I wrong?
> >
> > yes, it is right (we will use two pages with data_size change). Maybe better to
> > use 4 pages for each rx queue entry? (otherwise we will probably change it in
> > the future)
>
> We should not allocate more than is required. If support for bigger
> rx AMSDUs will be added and announced in vht/ht capabilities to remote
> stations, then increase of number of segments will be needed.

>
> > > > > However I don't think allocating 2 pages to avoid ieee80211 header and SNAP
> > > > > copy is worth doing. For me best approach would be allocate 1 page for
> > > > > 4k AMSDU, 2 for 8k and 3 for 12k (still using sg, but without data_size
> > > > > change to avoid 32B copying).
> > > >
> > > > From my point of view it is better to avoid copying if it is possible. Are you
> > > > sure there is no difference?
> > >
> > > I do not understand what you mean by difference here.
> >
> > tpt differences, not sure if there are any
>
> I would not expect any measurable difference in tpt nor in cpu usage
> either way.
>
> But I think, if some AMSDU subframe will be split into two fragments,
> data most likely will need to be linearised/copied, at some point before
> passed to application, which will outweigh any benefit of avoiding copying
> 802.11 header. Though, I don't think this somehow will be visible in
> benchmarking.

Sorry for the late reply. I think so. I will post a v4 soon.

Regards,
Lorenzo

>
> Stanislaw
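The page counts argued over in the thread (1, 2 or 3 pages per A-MSDU size, and 2 pages for a 4k A-MSDU once data_size shrinks) can be checked with a small userspace sketch. Everything below is illustrative and not driver code: rx_segs_needed() is a hypothetical helper, the frame lengths 3839, 7935 and 11454 are the usual HT/VHT maximum A-MSDU sizes taken as assumptions, per-frame device headers are ignored, and 3584 is an assumed per-segment size for a 4096-byte page after reserving skb_shared_info and rounding down to a 512-byte endpoint packet.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of mt76: number of scatter-gather
 * segments needed to hold frame_len bytes when every segment carries
 * seg_size bytes. Plain ceiling division.
 */
static size_t rx_segs_needed(size_t frame_len, size_t seg_size)
{
	assert(seg_size > 0);
	return (frame_len + seg_size - 1) / seg_size;
}
```

Under these assumptions, rx_segs_needed(3839, 4096) gives 1 segment and rx_segs_needed(3839, 3584) gives 2, which matches the "2 pages with data_size change, 1 without" remark; 7935 and 11454 bytes with full 4096-byte segments give 2 and 3.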
diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c
index 1ee54a9b302e..2ee3f8fa1483 100644
--- a/drivers/net/wireless/mediatek/mt76/usb.c
+++ b/drivers/net/wireless/mediatek/mt76/usb.c
@@ -289,8 +289,10 @@ static int
 mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
 		 int nsgs, gfp_t gfp)
 {
-	int i;
+	int i, data_size;
 
+	data_size = rounddown(SKB_WITH_OVERHEAD(q->buf_size),
+			      dev->usb.in_ep[MT_EP_IN_PKT_RX].max_packet);
 	for (i = 0; i < nsgs; i++) {
 		struct page *page;
 		void *data;
@@ -302,7 +304,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
 
 		page = virt_to_head_page(data);
 		offset = data - page_address(page);
-		sg_set_page(&urb->sg[i], page, q->buf_size, offset);
+		sg_set_page(&urb->sg[i], page, data_size, offset);
 	}
 
 	if (i < nsgs) {
@@ -314,7 +316,7 @@ mt76u_fill_rx_sg(struct mt76_dev *dev, struct mt76_queue *q, struct urb *urb,
 	}
 
 	urb->num_sgs = max_t(int, i, urb->num_sgs);
-	urb->transfer_buffer_length = urb->num_sgs * q->buf_size,
+	urb->transfer_buffer_length = urb->num_sgs * data_size;
 	sg_init_marker(urb->sg, urb->num_sgs);
 
 	return i ? : -ENOMEM;
@@ -611,8 +613,9 @@ static int mt76u_alloc_rx(struct mt76_dev *dev)
 	if (!q->entry)
 		return -ENOMEM;
 
-	q->buf_size = dev->usb.sg_en ? MT_RX_BUF_SIZE : PAGE_SIZE;
 	q->ndesc = MT_NUM_RX_ENTRIES;
+	q->buf_size = PAGE_SIZE;
+
 	for (i = 0; i < q->ndesc; i++) {
 		err = mt76u_rx_urb_alloc(dev, &q->entry[i]);
 		if (err < 0)
Set usb buffer size taking into account skb_shared_info in order to
not always copy the first part of received frames if A-MSDU is enabled
for SG capable devices. Moreover align usb buffer size to max_ep
boundaries and set buf_size to PAGE_SIZE even for sg case

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/wireless/mediatek/mt76/usb.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
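To make the buffer-size arithmetic in the commit message concrete, here is a userspace model of the data_size computation. It is only a sketch under stated assumptions: the ex_* names are hypothetical, EX_SHINFO_SIZE (320) and EX_CACHE_BYTES (64) stand in for sizeof(struct skb_shared_info) and SMP_CACHE_BYTES and vary by kernel configuration and architecture, and the kernel's SKB_WITH_OVERHEAD() and rounddown() macros are approximated by plain functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the real ones depend on kernel config and arch. */
#define EX_CACHE_BYTES 64u   /* stand-in for SMP_CACHE_BYTES */
#define EX_SHINFO_SIZE 320u  /* stand-in for sizeof(struct skb_shared_info) */

/* Round x up to the next multiple of a (models SKB_DATA_ALIGN). */
static size_t ex_align_up(size_t x, size_t a)
{
	return (x + a - 1) / a * a;
}

/* Round x down to a multiple of y (models the kernel's rounddown()). */
static size_t ex_rounddown(size_t x, size_t y)
{
	return x - x % y;
}

/*
 * Models rounddown(SKB_WITH_OVERHEAD(buf_size), max_packet): reserve
 * room for the shared info at the end of the buffer, then trim the
 * remainder to a whole number of USB endpoint packets.
 */
static size_t ex_rx_data_size(size_t buf_size, size_t max_packet)
{
	size_t overhead = ex_align_up(EX_SHINFO_SIZE, EX_CACHE_BYTES);

	assert(max_packet > 0 && buf_size > overhead);
	return ex_rounddown(buf_size - overhead, max_packet);
}
```

With a 4096-byte page and a 512-byte bulk endpoint packet, ex_rx_data_size(4096, 512) yields 3584 under these assumptions, so each scatter-gather entry carries less than a full page while leaving tail room for skb_shared_info.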