From patchwork Fri Apr 1 19:47:24 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Santosh Shilimkar X-Patchwork-Id: 8727331 Return-Path: X-Original-To: patchwork-linux-rdma@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 2D8BDC0553 for ; Fri, 1 Apr 2016 19:47:41 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 45C2520120 for ; Fri, 1 Apr 2016 19:47:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 36042200DB for ; Fri, 1 Apr 2016 19:47:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753172AbcDATri (ORCPT ); Fri, 1 Apr 2016 15:47:38 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:25987 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752722AbcDATrh (ORCPT ); Fri, 1 Apr 2016 15:47:37 -0400 Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u31JlWkE014523 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 1 Apr 2016 19:47:32 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserv0022.oracle.com (8.13.8/8.13.8) with ESMTP id u31JlWb8008077 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 1 Apr 2016 19:47:32 GMT Received: from abhmp0002.oracle.com (abhmp0002.oracle.com [141.146.116.8]) by aserv0122.oracle.com (8.13.8/8.13.8) with ESMTP id u31JlSkI004659; Fri, 1 Apr 2016 19:47:30 GMT Received: from [10.211.55.66] (/10.211.55.66) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Fri, 01 Apr 2016 12:47:28 -0700 Subject: Re: [PATCH] RDS: sync congestion map updating To: linux-rdma@vger.kernel.org References: <1459328902-31968-1-git-send-email-wen.gang.wang@oracle.com> <20160330161952.GA2670@leon.nu> <56FC09D6.7090602@oracle.com> <56FC82B7.3070504@oracle.com> <56FC927E.9090404@oracle.com> Cc: Wengang Wang , leon@leon.nu, netdev@vger.kernel.org From: santosh shilimkar Organization: Oracle Corporation Message-ID: <56FED04C.2060806@oracle.com> Date: Fri, 1 Apr 2016 12:47:24 -0700 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <56FC927E.9090404@oracle.com> X-Source-IP: aserv0022.oracle.com [141.146.126.234] Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP (cc-ing netdev) On 3/30/2016 7:59 PM, Wengang Wang wrote: > > > ? 2016?03?31? 09:51, Wengang Wang ??: >> >> >> ? 2016?03?31? 01:16, santosh shilimkar ??: >>> Hi Wengang, >>> >>> On 3/30/2016 9:19 AM, Leon Romanovsky wrote: >>>> On Wed, Mar 30, 2016 at 05:08:22PM +0800, Wengang Wang wrote: >>>>> Problem is found that some among a lot of parallel RDS >>>>> communications hang. >>>>> In my test ten or so among 33 communications hang. The send >>>>> requests got >>>>> -ENOBUF error meaning the peer socket (port) is congested. But >>>>> meanwhile, >>>>> peer socket (port) is not congested. >>>>> >>>>> The congestion map updating can happen in two paths: one is in >>>>> rds_recvmsg path >>>>> and the other is when it receives packets from the hardware. There >>>>> is no >>>>> synchronization when updating the congestion map. So a bit >>>>> operation (clearing) >>>>> in the rds_recvmsg path can be skipped by another bit operation >>>>> (setting) in >>>>> hardware packet receving path. >>>>> > > To be more detailed. Here, the two paths (user calls recvmsg and > hardware receives data) are for different rds socks. thus the > rds_sock->rs_recv_lock is not helpful to sync the updating on congestion > map. > For archive purpose, let me try to conclude the thread. I synced with Wengang offlist and came up with below fix. I was under impression that __set_bit_le() was atmoic version. After fixing it like patch(end of the email), the bug gets addressed. I will probably send this as fix for stable as well. From 5614b61f6fdcd6ae0c04e50b97efd13201762294 Mon Sep 17 00:00:00 2001 From: Santosh Shilimkar Date: Wed, 30 Mar 2016 23:26:47 -0700 Subject: [PATCH] RDS: Fix the atomicity for congestion map update Two different threads with different rds sockets may be in rds_recv_rcvbuf_delta() via receive path. If their ports both map to the same word in the congestion map, then using non-atomic ops to update it could cause the map to be incorrect. Lets use atomics to avoid such an issue. Full credit to Wengang for finding the issue, analysing it and also pointing out to offending code with spin lock based fix. Signed-off-by: Wengang Wang Signed-off-by: Santosh Shilimkar Reviewed-by: Leon Romanovsky --- net/rds/cong.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port) @@ -313,7 +313,7 @@ void rds_cong_clear_bit(struct rds_cong_map *map, __be16 port) i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS; off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS; - __clear_bit_le(off, (void *)map->m_page_addrs[i]); + clear_bit_le(off, (void *)map->m_page_addrs[i]); } static int rds_cong_test_bit(struct rds_cong_map *map, __be16 port) diff --git a/net/rds/cong.c b/net/rds/cong.c index e6144b8..6641bcf 100644 --- a/net/rds/cong.c +++ b/net/rds/cong.c @@ -299,7 +299,7 @@ void rds_cong_set_bit(struct rds_cong_map *map, __be16 port) i = be16_to_cpu(port) / RDS_CONG_MAP_PAGE_BITS; off = be16_to_cpu(port) % RDS_CONG_MAP_PAGE_BITS; - __set_bit_le(off, (void *)map->m_page_addrs[i]); + set_bit_le(off, (void *)map->m_page_addrs[i]); }