[v4,0/5] virtio-crypto: Improve performance

Message ID 20220424104140.44841-1-pizhenwei@bytedance.com (mailing list archive)

zhenwei pi April 24, 2022, 10:41 a.m. UTC
Hi, Lei
I'd like to move the helper and callback functions (e.g.,
virtcrypto_clear_request and virtcrypto_ctrlq_callback) from xx_core.c
to xx_common.c, so that xx_core.c supports:
  - probe/remove/irq affinity setting for a virtio device
  - basic virtio related operations

and xx_common.c supports:
  - common helpers/functions for algos

Do you have any suggestions about this?

v3 -> v4:
 - Don't create a new file virtio_common.c; the new functions are added
   to virtio_crypto_core.c instead
 - Split the first patch into two parts:
     1, change code style,
     2, use private buffer instead of shared buffer
 - Remove relevant change.
 - Other minor changes.

v2 -> v3:
 - Jason suggested splitting the first patch into two parts:
     1, using private buffer
     2, remove the busy polling
   Reworked as Jason suggested; this makes each change smaller and
   clearer.

v1 -> v2:
 - Use kfree instead of kfree_sensitive for insensitive buffer.
 - Several coding style fixes.
 - Use memory from the current node, instead of memory close to the device.
 - Add more detail to the commit messages, and explain why the
   per-device request buffer is removed.
 - Add the necessary comment in the code to explain why kzalloc is used
   to allocate struct virtio_crypto_ctrl_request.

v1:
The main point of this series is to improve the performance of
virtio crypto:
- Use a wait mechanism instead of busy polling for the ctrl queue. This
  reduces CPU usage and lock contention, and makes it possible to
  create/destroy sessions in parallel; QPS increases from ~40K/s to
  ~200K/s.
- Enable retry on the crypto engine to improve performance for the data
  queue; this allows a queue depth larger than 1.
- Fix dst data length in akcipher service.
- Other style fixes.
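
As a rough illustration of the first point (waiting instead of busy
polling), here is a userspace analogy in C with POSIX threads. It is not
the driver code: the names ctrl_request, fake_device and submit_and_wait
are hypothetical, and a condition variable stands in for whichever wait
primitive the kernel patch actually uses. Each request carries its own
completion state, so several submitters could wait at once without
spinning on a shared lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request state: every request owns its completion. */
struct ctrl_request {
    pthread_mutex_t lock;
    pthread_cond_t  done_cv;
    bool            done;
    int             status;   /* filled in by the completing side */
};

static void ctrl_request_init(struct ctrl_request *req)
{
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->done_cv, NULL);
    req->done = false;
    req->status = -1;
}

/* Completion side: a stand-in for a queue callback. */
static void ctrl_request_complete(struct ctrl_request *req, int status)
{
    pthread_mutex_lock(&req->lock);
    req->status = status;
    req->done = true;
    pthread_cond_signal(&req->done_cv);
    pthread_mutex_unlock(&req->lock);
}

/* Submitter side: sleep until the request completes, no spinning. */
static int ctrl_request_wait(struct ctrl_request *req)
{
    int status;

    pthread_mutex_lock(&req->lock);
    while (!req->done)   /* loop guards against spurious wakeups */
        pthread_cond_wait(&req->done_cv, &req->lock);
    status = req->status;
    pthread_mutex_unlock(&req->lock);
    return status;
}

/* Fake "device" thread: completes the request with status 0. */
static void *fake_device(void *arg)
{
    ctrl_request_complete(arg, 0);
    return NULL;
}

/* Submit one request and block until the fake device completes it. */
int submit_and_wait(void)
{
    struct ctrl_request req;
    pthread_t dev;
    int status;

    ctrl_request_init(&req);
    if (pthread_create(&dev, NULL, fake_device, &req) != 0)
        return -1;
    status = ctrl_request_wait(&req);
    pthread_join(dev, NULL);
    pthread_cond_destroy(&req.done_cv);
    pthread_mutex_destroy(&req.lock);
    return status;
}
```

Calling submit_and_wait() from a small main() and building with
`cc -pthread` is enough to try it; the waiting thread sleeps in
pthread_cond_wait rather than consuming CPU in a polling loop.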

lei he (2):
  virtio-crypto: adjust dst_len at ops callback
  virtio-crypto: enable retry for virtio-crypto-dev

zhenwei pi (3):
  virtio-crypto: change code style
  virtio-crypto: use private buffer for control request
  virtio-crypto: wait ctrl queue instead of busy polling

 .../virtio/virtio_crypto_akcipher_algs.c      |  83 ++++++-----
 drivers/crypto/virtio/virtio_crypto_common.h  |  21 ++-
 drivers/crypto/virtio/virtio_crypto_core.c    |  55 ++++++-
 .../virtio/virtio_crypto_skcipher_algs.c      | 140 ++++++++----------
 4 files changed, 180 insertions(+), 119 deletions(-)

Comments

zhenwei pi May 5, 2022, 2:35 a.m. UTC | #1
Hi, Lei

Jason replied in another patch:
Still hundreds of lines of changes, I'd leave this change to other
maintainers to decide.

Quite frankly, the virtio crypto driver has changed very little in the
past, and the performance of the control queue is not good enough. I
suspect that this driver is not widely used. So I'd like to rework a
lot of it, and it would be best to complete this work in the 5.18 window.

My view differs from Jason's on this point. I would appreciate any
hints you could give me.

Gonglei (Arei) May 5, 2022, 3:14 a.m. UTC | #2
> -----Original Message-----
> From: zhenwei pi [mailto:pizhenwei@bytedance.com]
> Sent: Thursday, May 5, 2022 10:35 AM
> To: Gonglei (Arei) <arei.gonglei@huawei.com>; mst@redhat.com;
> jasowang@redhat.com
> Cc: herbert@gondor.apana.org.au; linux-kernel@vger.kernel.org;
> virtualization@lists.linux-foundation.org; linux-crypto@vger.kernel.org;
> helei.sig11@bytedance.com; davem@davemloft.net
> Subject: PING: [PATCH v4 0/5] virtio-crypto: Improve performance
> 
> Hi, Lei
> 
> Jason replied in another patch:
> Still hundreds of lines of changes, I'd leave this change to other maintainers to
> decide.
> 
> Quite frankly, the virtio crypto driver changed only a few in the past, and the
> performance of control queue is not good enough. I am in doubt about that this
> driver is not used widely. So I'd like to rework a lot, it would be best to complete
> this work in 5.18 window.
> 
> This gets different point with Jason. I would appreciate it if you could give me
> any hint.
> 

This is already in my todo list.

Regards,
-Gonglei

Michael S. Tsirkin May 5, 2022, 4:57 a.m. UTC | #3
It's been out for a month though, which is not really acceptable latency
for review. So I would apply this for next, but you need to address Dan
Carpenter's comment, and look for similar patterns elsewhere in your patch.


zhenwei pi May 5, 2022, 9:29 a.m. UTC | #4
On 5/5/22 12:57, Michael S. Tsirkin wrote:
> It's been out for a month though, which is not really acceptable latency
> for review. So I would apply this for next, but you need to address Dan
> Carpenter's comment, and look for similar patterns elsewhere in your patch.
> 

I fixed this in the v5 series. Thanks!