
[net-next,v2,0/6] net/smc: Introduce virtually contiguous buffers for SMC-R

Message ID 1657791845-1060-1-git-send-email-guwen@linux.alibaba.com (mailing list archive)

Message

Wen Gu July 14, 2022, 9:43 a.m. UTC
On long-running enterprise production servers, high-order contiguous
memory pages are usually very rare and in most cases we can only get
fragmented pages.

When replacing TCP with SMC-R in such production scenarios, attempting
to allocate high-order physically contiguous sndbufs and RMBs may result
in frequent memory compaction, which will cause unexpected hangs and
further stability risks.
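
To make the notion of a "high-order" allocation concrete, here is a minimal sketch of how a buffer size maps to a page-allocation order. It assumes 4 KiB base pages; the helper names are hypothetical and only mirror the usual rounding to a power-of-two number of pages, not any code from this series.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def pages_needed(size, page_size=PAGE_SIZE):
    """Number of base pages required to hold `size` bytes (rounded up)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return -(-size // page_size)  # ceiling division


def allocation_order(size, page_size=PAGE_SIZE):
    """Smallest order n such that 2**n contiguous pages hold `size` bytes."""
    return (pages_needed(size, page_size) - 1).bit_length()
```

Under these assumptions, a hypothetical 1 MiB buffer needs a single order-8 block (256 physically contiguous pages) when it must be physically contiguous, but only 256 independent single pages when virtual contiguity is enough.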

So this patch set aims to allow SMC-R link groups to use virtually
contiguous sndbufs and RMBs to avoid the potential issues mentioned
above. Whether to use physically or virtually contiguous buffers can
be set by the sysctl smcr_buf_type.
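
As a rough illustration of how the sysctl might be read and set from user space, here is a minimal sketch. The procfs path and the meaning of the values 0, 1 and 2 are assumptions that should be checked against smc-sysctl.rst on the target kernel; the helper names are hypothetical.

```python
from pathlib import Path

# Assumed procfs location; the cover letter only names the sysctl itself.
SMCR_BUF_TYPE_PATH = Path("/proc/sys/net/smc/smcr_buf_type")

# Assumed value meanings; verify against smc-sysctl.rst.
BUF_TYPES = {
    0: "physically contiguous",
    1: "virtually contiguous",
    2: "mixed (try physical first, then virtual)",
}


def read_buf_type(path=SMCR_BUF_TYPE_PATH):
    """Return the integer currently stored in the sysctl file."""
    return int(Path(path).read_text().strip())


def write_buf_type(value, path=SMCR_BUF_TYPE_PATH):
    """Write a buffer type, rejecting values outside the assumed set."""
    if value not in BUF_TYPES:
        raise ValueError(f"unsupported smcr_buf_type: {value!r}")
    Path(path).write_text(f"{value}\n")
```

Writing to the real path requires root. As described in the patch set overview, the setting applies to newly created link groups.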

Note that using virtually contiguous buffers brings an acceptable
performance regression, which can be divided into two main parts:

1) Regression in the data path, caused by the additional address
   translation of the sndbuf by the RNIC on Tx. In general, however,
   address translation through the MTT is fast. According to qperf
   tests, this part of the regression is generally less than 10% in
   latency and bandwidth (see patch 5/6 for details).

2) Regression in the buffer initialization and destruction path,
   caused by the additional MR operations on sndbufs. Thanks to the
   link group buffer reuse mechanism, the impact of this regression
   decreases as the number of buffer reuses increases.
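
The amortization argument in point 2 can be sketched as simple arithmetic. The function name and the cost figure used below are purely hypothetical and are not measurements from this series.

```python
def amortized_setup_cost(setup_cost, uses):
    """Average share of a one-time buffer setup cost per use of the buffer.

    `setup_cost` is the one-time cost (for example of the extra MR
    operations) in any unit; `uses` counts the first use plus every reuse.
    """
    if uses < 1:
        raise ValueError("a buffer must be used at least once")
    return setup_cost / uses
```

With a hypothetical one-time cost of 100 units, a buffer used once carries the full 100 units, while a buffer used four times carries 25 units per use.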

Patch set overview:
- Patches 1/6 and 2/6 simplify and optimize the DMA sync operations,
  which reduces overhead on the data path, especially when using
  virtually contiguous buffers;
- Patches 3/6 and 4/6 introduce a sysctl smcr_buf_type to set the type
  of buffers in newly created link groups;
- Patch 5/6 allows SMC-R to use virtually contiguous sndbufs and RMBs,
  including buffer creation, destruction, MR operations and access;
- Patch 6/6 extends the SMC-R link group netlink attributes with the
  buffer type.

v1->v2:
- Patch 5/6 fixes a build issue on 32-bit;
- Patch 3/6 adds a description of the new sysctl to smc-sysctl.rst.

Guangguan Wang (2):
  net/smc: remove redundant dma sync ops
  net/smc: optimize for smc_sndbuf_sync_sg_for_device and
    smc_rmb_sync_sg_for_cpu

Wen Gu (4):
  net/smc: Introduce a sysctl for setting SMC-R buffer type
  net/smc: Use sysctl-specified types of buffers in new link group
  net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R
  net/smc: Extend SMC-R link group netlink attribute

 Documentation/networking/smc-sysctl.rst |  13 ++
 include/net/netns/smc.h                 |   1 +
 include/uapi/linux/smc.h                |   1 +
 net/smc/af_smc.c                        |  68 +++++++--
 net/smc/smc_clc.c                       |   8 +-
 net/smc/smc_clc.h                       |   2 +-
 net/smc/smc_core.c                      | 246 +++++++++++++++++++++-----------
 net/smc/smc_core.h                      |  20 ++-
 net/smc/smc_ib.c                        |  44 +++++-
 net/smc/smc_ib.h                        |   2 +
 net/smc/smc_llc.c                       |  33 +++--
 net/smc/smc_rx.c                        |  92 +++++++++---
 net/smc/smc_sysctl.c                    |  11 ++
 net/smc/smc_tx.c                        |  10 +-
 14 files changed, 404 insertions(+), 147 deletions(-)

Comments

Wenjia Zhang July 14, 2022, 3:16 p.m. UTC | #1
On 14.07.22 11:43, Wen Gu wrote:
> [...]
This idea is very cool! Thank you for your effort! But we still need to
verify whether this solution runs well on our system. I'll get back to
you soon.
patchwork-bot+netdevbpf@kernel.org July 18, 2022, 10:40 a.m. UTC | #2
Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 14 Jul 2022 17:43:59 +0800 you wrote:
> On long-running enterprise production servers, high-order contiguous
> memory pages are usually very rare and in most cases we can only get
> fragmented pages.
> 
> When replacing TCP with SMC-R in such production scenarios, attempting
> to allocate high-order physically contiguous sndbufs and RMBs may result
> in frequent memory compaction, which will cause unexpected hung issue
> and further stability risks.
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/6] net/smc: remove redundant dma sync ops
    https://git.kernel.org/netdev/net-next/c/6d52e2de6415
  - [net-next,v2,2/6] net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu
    https://git.kernel.org/netdev/net-next/c/0ef69e788411
  - [net-next,v2,3/6] net/smc: Introduce a sysctl for setting SMC-R buffer type
    https://git.kernel.org/netdev/net-next/c/4bc5008e4387
  - [net-next,v2,4/6] net/smc: Use sysctl-specified types of buffers in new link group
    https://git.kernel.org/netdev/net-next/c/b984f370ed51
  - [net-next,v2,5/6] net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R
    https://git.kernel.org/netdev/net-next/c/b8d199451c99
  - [net-next,v2,6/6] net/smc: Extend SMC-R link group netlink attribute
    https://git.kernel.org/netdev/net-next/c/ddefb2d20553

You are awesome, thank you!
Tony Lu July 18, 2022, 12:45 p.m. UTC | #3
On Thu, Jul 14, 2022 at 05:16:47PM +0200, Wenjia Zhang wrote:
> 
> 
> On 14.07.22 11:43, Wen Gu wrote:
> > [...]
> This idea is very cool! Thank you for your effort! But we still need to
> verify if this solution can run well on our system. I'll come to you soon.

Hi Wenjia,

We have noticed that the SMC community has become more active recently.
More and more companies have shown interest in SMC, and the number of
patches is increasing accordingly. We (Alibaba) are trying to deploy
SMC in cloud production environments, extending its capabilities and
improving its performance. We have also contributed some work to the
community recently, so we are more than happy to help review SMC
patches. If you need it, we would be glad to act as reviewers and
share the review work.

Hope to hear from you, thank you.

Best wishes,
Tony Lu
Wenjia Zhang July 19, 2022, 12:55 p.m. UTC | #4
On 18.07.22 14:45, Tony Lu wrote:
> On Thu, Jul 14, 2022 at 05:16:47PM +0200, Wenjia Zhang wrote:
>>
>>
>> On 14.07.22 11:43, Wen Gu wrote:
>>> [...]
>> This idea is very cool! Thank you for your effort! But we still need to
>> verify if this solution can run well on our system. I'll come to you soon.
> 
> Hi Wenjia,
> 
> We have noticed that SMC community is becoming more active recently.
> More and more companies have shown their interests in SMC.
> Correspondingly, patches are also increasing. We (Alibaba) are trying to
> apply SMC into cloud production environment, extending its abilities and
> enhancing the performance. We also contributed some work to community in
> the past period of time. So we are more than happy to help review SMC
> patches together. If you need, we are very glad to be reviewers to share
> the review work.
> 
> Hope to hear from you, thank you.
> 
> Best wishes,
> Tony Lu

Hi Tony,

It is very nice to hear that from you. It would be great for us. If
you like, feel free to add your sign after the review.
Thank you!

Best regards
Wenjia Zhang
Wenjia Zhang July 20, 2022, 5:38 p.m. UTC | #5
On 14.07.22 11:43, Wen Gu wrote:
> [...]
It looks good to us. Thank you!
Acked-by: Wenjia Zhang <wenjia@linux.ibm.com>