From patchwork Wed Mar 30 09:08:22 2016
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 8694051
From: Wengang Wang
To: linux-rdma@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH] RDS: sync congestion map updating
Date: Wed, 30 Mar 2016 17:08:22 +0800
Message-Id: <1459328902-31968-1-git-send-email-wen.gang.wang@oracle.com>
X-Mailer: git-send-email 2.1.0
X-Mailing-List: linux-rdma@vger.kernel.org

Some of many parallel RDS communications were found to hang. In my test,
about ten of 33 communications hung. The send requests failed with
-ENOBUFS, which means the peer socket (port) is congested, yet the peer
socket (port) was not actually congested.

The congestion map is updated in two paths: the rds_recvmsg path and the
path that receives packets from the hardware. The two paths update the
map without any synchronization, and both use non-atomic bit operations.
So a bit clear in the rds_recvmsg path can be lost when it races with a
bit set in the hardware packet receiving path, leaving the port marked
congested.

Fix this by adding a spin lock per congestion map to serialize updates
to it.

No performance drop was observed while testing the fix.

Signed-off-by: Wengang Wang
---
 net/rds/cong.c | 7 +++++++
 net/rds/rds.h  | 1 +
 2 files changed, 8 insertions(+)

diff --git a/net/rds/cong.c b/net/rds/cong.c
index e6144b8..7afc1bf 100644
--- a/net/rds/cong.c
+++ b/net/rds/cong.c
@@ -144,6 +144,7 @@ static struct rds_cong_map *rds_cong_from_addr(__be32 addr)
 	if (!map)
 		return NULL;
 
+	spin_lock_init(&map->m_lock);
 	map->m_addr = addr;
 	init_waitqueue_head(&map->m_waitq);
 	INIT_LIST_HEAD(&map->m_conn_list);
@@ -292,6 +293,7 @@ void rds_cong_set_bit(struct rds_cong_map *map, __be16 port)
 {
 	unsigned long i;
 	unsigned long off;
+	unsigned long flags;
 
 	rdsdebug("setting congestion for %pI4:%u in map %p\n",
 	  &map->m_addr, ntohs(port), map);
@@ -299,13 +301,16 @@ void rds_cong_set_bit(struct rds_cong_map *map, __be16 port)
 	i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS;
 	off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS;
 
+	spin_lock_irqsave(&map->m_lock, flags);
 	__set_bit_le(off, (void *)map->m_page_addrs[i]);
+	spin_unlock_irqrestore(&map->m_lock, flags);
 }
 
 void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port)
 {
 	unsigned long i;
 	unsigned long off;
+	unsigned long flags;
 
 	rdsdebug("clearing congestion for %pI4:%u in map %p\n",
 	  &map->m_addr, ntohs(port), map);
@@ -313,7 +318,9 @@ void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port)
 	i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS;
 	off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS;
 
+	spin_lock_irqsave(&map->m_lock, flags);
 	__clear_bit_le(off, (void *)map->m_page_addrs[i]);
+	spin_unlock_irqrestore(&map->m_lock, flags);
 }
 
 static int rds_cong_test_bit(struct rds_cong_map *map, __be16 port)
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 80256b0..f359cf8 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -59,6 +59,7 @@ struct rds_cong_map {
 	__be32			m_addr;
 	wait_queue_head_t	m_waitq;
 	struct list_head	m_conn_list;
+	spinlock_t		m_lock;
 	unsigned long		m_page_addrs[RDS_CONG_MAP_PAGES];
 };
 
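
For readers who want to see the race outside the kernel, here is a minimal
userspace sketch. It is not RDS code. The names cong_map, cong_set_bit,
cong_clear_bit, cong_test_bit, replay_lost_update and run_locked_updates are
hypothetical, and a pthread mutex stands in for the kernel spinlock_t.
replay_lost_update() replays one interleaving of two unlocked
read-modify-write updates by hand, so the lost clear is deterministic.
run_locked_updates() runs a setting thread and a clearing thread against the
same word with the lock held around each update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAP_BITS  65536u   /* one bit per 16-bit port */
#define WORD_BITS 64u
#define ROUNDS    20000

/* Hypothetical stand-in for struct rds_cong_map; lock plays the role of m_lock. */
struct cong_map {
	pthread_mutex_t lock;
	uint64_t words[MAP_BITS / WORD_BITS];
};

struct cong_map map = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Plain (non-atomic) read-modify-write; safe only because the lock is held. */
void cong_set_bit(struct cong_map *m, unsigned port)
{
	assert(port < MAP_BITS);
	pthread_mutex_lock(&m->lock);
	m->words[port / WORD_BITS] |= UINT64_C(1) << (port % WORD_BITS);
	pthread_mutex_unlock(&m->lock);
}

void cong_clear_bit(struct cong_map *m, unsigned port)
{
	assert(port < MAP_BITS);
	pthread_mutex_lock(&m->lock);
	m->words[port / WORD_BITS] &= ~(UINT64_C(1) << (port % WORD_BITS));
	pthread_mutex_unlock(&m->lock);
}

int cong_test_bit(struct cong_map *m, unsigned port)
{
	assert(port < MAP_BITS);
	pthread_mutex_lock(&m->lock);
	int set = (int)((m->words[port / WORD_BITS] >> (port % WORD_BITS)) & 1);
	pthread_mutex_unlock(&m->lock);
	return set;
}

/* Hand-replayed unlocked interleaving on one word: the set path loads the
 * word, the clear path runs to completion, then the set path stores its
 * stale copy. The clear of bit 3 is lost. */
uint64_t replay_lost_update(void)
{
	uint64_t word = UINT64_C(1) << 3;       /* port 3 is congested */
	uint64_t stale = word;                  /* set path: load */
	word &= ~(UINT64_C(1) << 3);            /* clear path: port 3 drained */
	word = stale | (UINT64_C(1) << 5);      /* set path: store stale copy */
	return word;                            /* bit 3 is set again */
}

/* Models the packet receive path: marks odd ports congested. */
static void *setter(void *arg)
{
	(void)arg;
	for (int round = 0; round < ROUNDS; round++)
		for (unsigned port = 1; port < WORD_BITS; port += 2)
			cong_set_bit(&map, port);
	return NULL;
}

/* Models the receive-message path: even ports end up uncongested. */
static void *clearer(void *arg)
{
	(void)arg;
	for (int round = 0; round < ROUNDS; round++)
		for (unsigned port = 0; port < WORD_BITS; port += 2) {
			cong_set_bit(&map, port);
			cong_clear_bit(&map, port);
		}
	return NULL;
}

/* Runs both threads on word 0 and returns its final value. */
uint64_t run_locked_updates(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, setter, NULL);
	pthread_create(&b, NULL, clearer, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return map.words[0];
}
```

Build with -pthread. With the lock held around every update, the final word
has every odd bit set and every even bit clear regardless of scheduling.
Dropping the lock calls from cong_set_bit() and cong_clear_bit() is expected
to make stale even bits show up intermittently, which mirrors the falsely
congested ports described in the commit message. The kernel patch uses
spin_lock_irqsave() rather than a sleeping lock because one of the two paths
runs while receiving packets from the hardware.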