Message ID | 20181002165602.21033-1-ben.dooks@codethink.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | usbnet: smsc95xx: simplify tx_fixup code |
From: Ben Dooks
> Sent: 02 October 2018 17:56
>
> The smsc95xx_tx_fixup is doing multiple calls to skb_push() to
> put an 8-byte command header onto the packet. It would be easier
> to do one skb_push() and then copy the data in once the push is
> done.
>
> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
> ---
>  drivers/net/usb/smsc95xx.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index cb19aea139d3..813ab93ee2c3 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -2006,6 +2006,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>  	bool csum = skb->ip_summed == CHECKSUM_PARTIAL;
>  	int overhead = csum ? SMSC95XX_TX_OVERHEAD_CSUM : SMSC95XX_TX_OVERHEAD;
>  	u32 tx_cmd_a, tx_cmd_b;
> +	void *ptr;

It might be useful to define a structure for the header.
You might need to find the 'store unaligned 32bit word' macro though.
(Actually that will probably be better than the memcpy() which might
end up doing memory-memory copies rather than storing the register.)
Although if/when you add the tx alignment that won't be needed because
the header will be aligned.

>  	/* We do not advertise SG, so skbs should be already linearized */
>  	BUG_ON(skb_shinfo(skb)->nr_frags);
> @@ -2019,6 +2020,9 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>  		return NULL;
>  	}
>
> +	tx_cmd_b = (u32)skb->len;
> +	tx_cmd_a = tx_cmd_b | TX_CMD_A_FIRST_SEG_ | TX_CMD_A_LAST_SEG_;
> +
>  	if (csum) {
>  		if (skb->len <= 45) {
>  			/* workaround - hardware tx checksum does not work
> @@ -2035,21 +2039,18 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>  			skb_push(skb, 4);
>  			cpu_to_le32s(&csum_preamble);

Not related, but csum_preamble = cpu_to_le32(csum_preamble) is likely to
generate better code (at least for some architectures).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On 2018-10-03 14:36, David Laight wrote:
> From: Ben Dooks
>> Sent: 02 October 2018 17:56
>>
>> The smsc95xx_tx_fixup is doing multiple calls to skb_push() to
>> put an 8-byte command header onto the packet. It would be easier
>> to do one skb_push() and then copy the data in once the push is
>> done.
>>
>> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
>> ---
>>  drivers/net/usb/smsc95xx.c | 25 +++++++++++++------------
>>  1 file changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
>> index cb19aea139d3..813ab93ee2c3 100644
>> --- a/drivers/net/usb/smsc95xx.c
>> +++ b/drivers/net/usb/smsc95xx.c
>> @@ -2006,6 +2006,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>>  	bool csum = skb->ip_summed == CHECKSUM_PARTIAL;
>>  	int overhead = csum ? SMSC95XX_TX_OVERHEAD_CSUM : SMSC95XX_TX_OVERHEAD;
>>  	u32 tx_cmd_a, tx_cmd_b;
>> +	void *ptr;
>
> It might be useful to define a structure for the header.
> You might need to find the 'store unaligned 32bit word' macro though.
> (Actually that will probably be better than the memcpy() which might
> end up doing memory-memory copies rather than storing the register.)
> Although if/when you add the tx alignment that won't be needed because
> the header will be aligned.

Ok, might be worth doing. I did try to do a "u32 tx_cmd[2]" but the
code generated ended up storing stuff onto the stack before copying
into the packet.

I agree that possibly going to the "put_unaligned" function might be
nicer too.

If we did enable tx-align all the time then we'd not have to care
about the alignment, but I didn't want to do that if possible as that
would end up sending up to 3 bytes extra per packet.

I am trying not to do too many changes at one time to allow roll back.

>>  	/* We do not advertise SG, so skbs should be already linearized */
>>  	BUG_ON(skb_shinfo(skb)->nr_frags);
>> @@ -2019,6 +2020,9 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>>  		return NULL;
>>  	}
>>
>> +	tx_cmd_b = (u32)skb->len;
>> +	tx_cmd_a = tx_cmd_b | TX_CMD_A_FIRST_SEG_ | TX_CMD_A_LAST_SEG_;
>> +
>>  	if (csum) {
>>  		if (skb->len <= 45) {
>>  			/* workaround - hardware tx checksum does not work
>> @@ -2035,21 +2039,18 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
>>  			skb_push(skb, 4);
>>  			cpu_to_le32s(&csum_preamble);
>
> Not related, but csum_preamble = cpu_to_le32(csum_preamble) is likely to
> generate better code (at least for some architectures).
>
> 	David
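The "up to 3 bytes extra per packet" figure follows from rounding an address up to a 4-byte boundary. The sketch below only illustrates that arithmetic; `pad_to_align()` is a hypothetical helper, and the thread does not say where the driver would actually place any padding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of pad bytes needed to round addr up to the
 * next multiple of align. align must be a non-zero power of two. */
static unsigned int pad_to_align(uintptr_t addr, unsigned int align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	/* Unsigned negation gives the distance to the next boundary modulo
	 * align, so for align == 4 the result is always in 0..3. */
	return (unsigned int)(-addr & (uintptr_t)(align - 1));
}
```

For a 4-byte boundary the result is 0 when the address is already aligned and at most 3 otherwise, which is the per-packet cost being weighed here.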
From: Ben Dooks <ben.dooks@codethink.co.uk>
Date: Tue, 2 Oct 2018 17:56:02 +0100

> -	memcpy(skb->data, &tx_cmd_a, 4);
> +	ptr = skb_push(skb, 8);
> +	tx_cmd_a = cpu_to_le32(tx_cmd_a);
> +	tx_cmd_b = cpu_to_le32(tx_cmd_b);
> +	memcpy(ptr, &tx_cmd_a, 4);
> +	memcpy(ptr+4, &tx_cmd_b, 4);

Even a memcpy() through a void pointer does not guarantee that gcc will
not emit word sized loads and stores.

You must use the get_unaligned()/put_unaligned() facilities to do this
properly.

I also agree that making a proper type and structure instead of using
a void pointer would be better.
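To make the put_unaligned() suggestion concrete, here is a userspace sketch of what an unaligned little-endian 32-bit store amounts to. `store_le32()` is a stand-in written for this example and plays the role of the kernel's put_unaligned_le32(); `write_tx_header()` is a hypothetical helper name, not something from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's put_unaligned_le32(): store a 32-bit
 * value in little-endian byte order one byte at a time, so the destination
 * may have any alignment and the host may have any endianness. */
static void store_le32(uint32_t val, void *dst)
{
	uint8_t *p = dst;

	p[0] = (uint8_t)(val);
	p[1] = (uint8_t)(val >> 8);
	p[2] = (uint8_t)(val >> 16);
	p[3] = (uint8_t)(val >> 24);
}

/* Hypothetical helper: write the 8-byte TX command header at hdr, which
 * would be the pointer returned by a single skb_push(skb, 8). */
static void write_tx_header(void *hdr, uint32_t tx_cmd_a, uint32_t tx_cmd_b)
{
	assert(hdr != NULL);

	store_le32(tx_cmd_a, hdr);
	store_le32(tx_cmd_b, (uint8_t *)hdr + 4);
}
```

In the driver itself the two stores would presumably become put_unaligned_le32(tx_cmd_a, ptr) and put_unaligned_le32(tx_cmd_b, ptr + 4), which also removes the separate cpu_to_le32() step.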
On 2018-10-05 22:24, David Miller wrote:
> From: Ben Dooks <ben.dooks@codethink.co.uk>
> Date: Tue, 2 Oct 2018 17:56:02 +0100
>
>> -	memcpy(skb->data, &tx_cmd_a, 4);
>> +	ptr = skb_push(skb, 8);
>> +	tx_cmd_a = cpu_to_le32(tx_cmd_a);
>> +	tx_cmd_b = cpu_to_le32(tx_cmd_b);
>> +	memcpy(ptr, &tx_cmd_a, 4);
>> +	memcpy(ptr+4, &tx_cmd_b, 4);
>
> Even a memcpy() through a void pointer does not guarantee that gcc will
> not emit word sized loads and stores.
>
> You must use the get_unaligned()/put_unaligned() facilities to do this
> properly.

Thanks, got a new version of the series just being tested with this.
Should it go into the original, or as a separate change?

>
> I also agree that making a proper type and structure instead of using
> a void pointer would be better.
From: Ben Dooks <ben.dooks@codethink.co.uk>
Date: Sat, 06 Oct 2018 12:27:27 +0100

> Thanks, got a new version of the series just being tested with this.
> Should it go into the original, or as a separate change?

Into the original.
From: David Miller
> Sent: 05 October 2018 22:24
>
> From: Ben Dooks <ben.dooks@codethink.co.uk>
> Date: Tue, 2 Oct 2018 17:56:02 +0100
>
> > -	memcpy(skb->data, &tx_cmd_a, 4);
> > +	ptr = skb_push(skb, 8);
> > +	tx_cmd_a = cpu_to_le32(tx_cmd_a);
> > +	tx_cmd_b = cpu_to_le32(tx_cmd_b);
> > +	memcpy(ptr, &tx_cmd_a, 4);
> > +	memcpy(ptr+4, &tx_cmd_b, 4);
>
> Even a memcpy() through a void pointer does not guarantee that gcc will
> not emit word sized loads and stores.

True, but only if gcc can 'see' something that would require the pointer
be aligned.
In this case the void pointer comes from an external function so is fine.

> You must use the get_unaligned()/put_unaligned() facilities to do this
> properly.
>
> I also agree that making a proper type and structure instead of using
> a void pointer would be better.

The structure would need to be marked 'packed' - since its alignment
isn't guaranteed.
Then you don't need to use put_unaligned().
If it wasn't 'packed' then gcc would implement
memcpy(&hdr->tx_cmd_a, &tx_cmd_a, 4) using an aligned write.

	David
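A rough userspace sketch of the packed-structure idea follows. The structure name, its field types and the helpers `to_le32()` and `fill_tx_hdr()` are assumptions made for illustration and are not taken from the driver; the code relies on GCC/Clang extensions (`__attribute__((packed))`, `__BYTE_ORDER__`, `__builtin_bswap32`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for the 8-byte TX command header. Marking it packed
 * drops the required alignment to 1, so the compiler must emit stores that
 * are safe at any address. */
struct tx_cmd_hdr {
	uint32_t tx_cmd_a;	/* stored little-endian */
	uint32_t tx_cmd_b;	/* stored little-endian */
} __attribute__((packed));

/* Stand-in for cpu_to_le32(): byte-swap only on big-endian hosts. */
static uint32_t to_le32(uint32_t v)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return __builtin_bswap32(v);
#else
	return v;
#endif
}

/* Fill the header in place; hdr may sit at any byte offset. */
static void fill_tx_hdr(struct tx_cmd_hdr *hdr, uint32_t a, uint32_t b)
{
	assert(hdr != NULL);

	hdr->tx_cmd_a = to_le32(a);
	hdr->tx_cmd_b = to_le32(b);
}
```

Without the packed attribute the structure would have 4-byte alignment, and the compiler would be entitled to assume any `struct tx_cmd_hdr *` is aligned, which is the aligned-write case described above.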
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index cb19aea139d3..813ab93ee2c3 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -2006,6 +2006,7 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
 	bool csum = skb->ip_summed == CHECKSUM_PARTIAL;
 	int overhead = csum ? SMSC95XX_TX_OVERHEAD_CSUM : SMSC95XX_TX_OVERHEAD;
 	u32 tx_cmd_a, tx_cmd_b;
+	void *ptr;
 
 	/* We do not advertise SG, so skbs should be already linearized */
 	BUG_ON(skb_shinfo(skb)->nr_frags);
@@ -2019,6 +2020,9 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
 		return NULL;
 	}
 
+	tx_cmd_b = (u32)skb->len;
+	tx_cmd_a = tx_cmd_b | TX_CMD_A_FIRST_SEG_ | TX_CMD_A_LAST_SEG_;
+
 	if (csum) {
 		if (skb->len <= 45) {
 			/* workaround - hardware tx checksum does not work
@@ -2035,21 +2039,18 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev,
 			skb_push(skb, 4);
 			cpu_to_le32s(&csum_preamble);
 			memcpy(skb->data, &csum_preamble, 4);
+
+			tx_cmd_a += 4;
+			tx_cmd_b += 4;
+			tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;
 		}
 	}
 
-	skb_push(skb, 4);
-	tx_cmd_b = (u32)(skb->len - 4);
-	if (csum)
-		tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;
-	cpu_to_le32s(&tx_cmd_b);
-	memcpy(skb->data, &tx_cmd_b, 4);
-
-	skb_push(skb, 4);
-	tx_cmd_a = (u32)(skb->len - 8) | TX_CMD_A_FIRST_SEG_ |
-		TX_CMD_A_LAST_SEG_;
-	cpu_to_le32s(&tx_cmd_a);
-	memcpy(skb->data, &tx_cmd_a, 4);
+	ptr = skb_push(skb, 8);
+	tx_cmd_a = cpu_to_le32(tx_cmd_a);
+	tx_cmd_b = cpu_to_le32(tx_cmd_b);
+	memcpy(ptr, &tx_cmd_a, 4);
+	memcpy(ptr+4, &tx_cmd_b, 4);
 
 	return skb;
 }
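The patch moves the command-word arithmetic ahead of the pushes: both words start from the payload length and gain 4 when the checksum preamble is added. The userspace sketch below compares that with the old scheme, which derived the values from the running skb->len after each push. The function names and `struct tx_cmds` are hypothetical, and the flag values and 11-bit length field are reproduced from memory of smsc95xx.h, so treat them as illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; check smsc95xx.h before relying on them. */
#define TX_CMD_A_FIRST_SEG_	0x00002000u
#define TX_CMD_A_LAST_SEG_	0x00001000u
#define TX_CMD_B_CSUM_ENABLE	0x00004000u
#define TX_LEN_FIELD_MAX	0x000007FFu	/* assumed 11-bit length field */

struct tx_cmds {
	uint32_t a;
	uint32_t b;
};

/* New scheme: compute both words from the payload length up front. */
static struct tx_cmds cmds_new(uint32_t payload_len, bool csum_preamble)
{
	struct tx_cmds c;

	/* "+= 4" on a word that already carries flags is only safe while
	 * the length cannot carry into the flag bits. */
	assert(payload_len + 4 <= TX_LEN_FIELD_MAX);

	c.b = payload_len;
	c.a = c.b | TX_CMD_A_FIRST_SEG_ | TX_CMD_A_LAST_SEG_;
	if (csum_preamble) {
		c.a += 4;
		c.b += 4;
		c.b |= TX_CMD_B_CSUM_ENABLE;
	}
	return c;
}

/* Old scheme: track skb->len across the successive skb_push() calls. */
static struct tx_cmds cmds_old(uint32_t payload_len, bool csum_preamble)
{
	uint32_t len = payload_len;
	struct tx_cmds c;

	if (csum_preamble)
		len += 4;	/* skb_push(skb, 4) for the preamble */

	len += 4;		/* skb_push(skb, 4) for tx_cmd_b */
	c.b = len - 4;
	if (csum_preamble)
		c.b |= TX_CMD_B_CSUM_ENABLE;

	len += 4;		/* skb_push(skb, 4) for tx_cmd_a */
	c.a = (len - 8) | TX_CMD_A_FIRST_SEG_ | TX_CMD_A_LAST_SEG_;
	return c;
}
```

Here `csum_preamble` corresponds to the else branch of the csum check, that is, checksum offload on a packet longer than 45 bytes. Under these assumptions both functions return the same pair of words for every length that fits the field.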
The smsc95xx_tx_fixup is doing multiple calls to skb_push() to
put an 8-byte command header onto the packet. It would be easier
to do one skb_push() and then copy the data in once the push is
done.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
---
 drivers/net/usb/smsc95xx.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)