From patchwork Wed Mar 21 23:03:07 2018
X-Patchwork-Submitter: Casey Leedom
X-Patchwork-Id: 10300525
From: Casey Leedom
To: Sinan Kaya, "netdev@vger.kernel.org", "timur@codeaurora.org", "sulrich@codeaurora.org"
Cc: "linux-arm-msm@vger.kernel.org", SWise OGC, "linux-kernel@vger.kernel.org", Ganesh GR, Michael Werner, "linux-arm-kernel@lists.infradead.org"
Subject: Re: [PATCH v4 12/17] net: cxgb4/cxgb4vf: Eliminate duplicate barriers on weakly-ordered archs
Date: Wed, 21 Mar 2018 23:03:07 +0000
References: <1521513753-7325-1-git-send-email-okaya@codeaurora.org>, <1521513753-7325-13-git-send-email-okaya@codeaurora.org>
In-Reply-To: <1521513753-7325-13-git-send-email-okaya@codeaurora.org>

[[ Apologies for the DUPLICATE email. I forgot to tell my Mail Agent to
use Plain Text. -- Casey ]]

I feel very uncomfortable with these proposed changes. Our team is right
in the middle of trying to tease our way through the various platform
implementations of writel(), writel_relaxed(), __raw_writel(), etc. in
order to support x86, PowerPC, ARM, etc. with a single code base. This is
complicated by the somewhat ... "fuzzily defined" semantics and varying
platform implementations of all of these APIs. (And note that I'm just
picking writel() as an example.) Additionally, many of the changes aren't
even in fast paths and are thus unneeded for performance.

Please don't make these changes. We're trying to get this all sussed out.

Casey

From: Sinan Kaya
Sent: Monday, March 19, 2018 7:42:27 PM
To: netdev@vger.kernel.org; timur@codeaurora.org; sulrich@codeaurora.org
Cc: linux-arm-msm@vger.kernel.org; linux-arm-kernel@lists.infradead.org; Sinan Kaya; Ganesh GR; Casey Leedom; linux-kernel@vger.kernel.org
Subject: [PATCH v4 12/17] net: cxgb4/cxgb4vf: Eliminate duplicate barriers on weakly-ordered archs

Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().

Signed-off-by: Sinan Kaya
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h      |  6 ++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 13 +++++++------
 drivers/net/ethernet/chelsio/cxgb4/sge.c        | 12 ++++++------
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c      |  2 +-
 drivers/net/ethernet/chelsio/cxgb4vf/adapter.h  | 14 ++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4vf/sge.c      | 18 ++++++++++--------
 6 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 9040e13..6bde0b9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1202,6 +1202,12 @@ static inline void t4_write_reg(struct adapter *adap, u32 reg_addr, u32 val)
         writel(val, adap->regs + reg_addr);
 }
 
+static inline void t4_write_reg_relaxed(struct adapter *adap, u32 reg_addr,
+                                       u32 val)
+{
+       writel_relaxed(val, adap->regs + reg_addr);
+}
+
 #ifndef readq
 static inline u64 readq(const volatile void __iomem *addr)
 {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 7b452e8..276472d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -1723,8 +1723,8 @@ int cxgb4_sync_txq_pidx(struct net_device *dev, u16 qid, u16 pidx,
                 else
                         val = PIDX_T5_V(delta);
                 wmb();
-               t4_write_reg(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
-                            QID_V(qid) | val);
+               t4_write_reg_relaxed(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
+                                    QID_V(qid) | val);
         }
 out:
         return ret;
@@ -1902,8 +1902,9 @@ static void enable_txq_db(struct adapter *adap, struct sge_txq *q)
                  * are committed before we tell HW about them.
                  */
                 wmb();
-               t4_write_reg(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
-                            QID_V(q->cntxt_id) | PIDX_V(q->db_pidx_inc));
+               t4_write_reg_relaxed(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
+                                    QID_V(q->cntxt_id) |
+                                               PIDX_V(q->db_pidx_inc));
                 q->db_pidx_inc = 0;
         }
         q->db_disabled = 0;
@@ -2003,8 +2004,8 @@ static void sync_txq_pidx(struct adapter *adap, struct sge_txq *q)
                 else
                         val = PIDX_T5_V(delta);
                 wmb();
-               t4_write_reg(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
-                            QID_V(q->cntxt_id) | val);
+               t4_write_reg_relaxed(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
+                                    QID_V(q->cntxt_id) | val);
         }
 out:
         q->db_disabled = 0;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 6e310a0..7388aac 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -530,11 +530,11 @@ static inline void ring_fl_db(struct adapter *adap, struct sge_fl *q)
                  * mechanism.
                  */
                 if (unlikely(q->bar2_addr == NULL)) {
-                       t4_write_reg(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
-                                    val | QID_V(q->cntxt_id));
+                       t4_write_reg_relaxed(adap, MYPF_REG(SGE_PF_KDOORBELL_A),
+                                            val | QID_V(q->cntxt_id));
                 } else {
-                       writel(val | QID_V(q->bar2_qid),
-                              q->bar2_addr + SGE_UDB_KDOORBELL);
+                       writel_relaxed(val | QID_V(q->bar2_qid),
+                                      q->bar2_addr + SGE_UDB_KDOORBELL);
 
                         /* This Write memory Barrier will force the write to
                          * the User Doorbell area to be flushed.
@@ -986,8 +986,8 @@ inline void cxgb4_ring_tx_db(struct adapter *adap, struct sge_txq *q, int n)
                                       (q->bar2_addr + SGE_UDB_WCDOORBELL),
                                       wr);
                 } else {
-                       writel(val | QID_V(q->bar2_qid),
-                              q->bar2_addr + SGE_UDB_KDOORBELL);
+                       writel_relaxed(val | QID_V(q->bar2_qid),
+                                      q->bar2_addr + SGE_UDB_KDOORBELL);
                 }
 
                 /* This Write Memory Barrier will force the write to the User
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 920bccd..8b723a0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -139,7 +139,7 @@ void t4_write_indirect(struct adapter *adap, unsigned int addr_reg,
 {
         while (nregs--) {
                 t4_write_reg(adap, addr_reg, start_idx++);
-               t4_write_reg(adap, data_reg, *vals++);
+               t4_write_reg_relaxed(adap, data_reg, *vals++);
         }
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h b/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
index 5883f09..00247be4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/adapter.h
@@ -442,6 +442,20 @@ static inline void t4_write_reg(struct adapter *adapter, u32 reg_addr, u32 val)
         writel(val, adapter->regs + reg_addr);
 }
 
+/**
+ * t4_write_reg_relaxed - write a HW register without ordering guarantees
+ * @adapter: the adapter
+ * @reg_addr: the register address
+ * @val: the value to write
+ *
+ * Write a 32-bit value into the given HW register.
+ */
+static inline void t4_write_reg_relaxed(struct adapter *adapter, u32 reg_addr,
+                                       u32 val)
+{
+       writel_relaxed(val, adapter->regs + reg_addr);
+}
+
 #ifndef readq
 static inline u64 readq(const volatile void __iomem *addr)
 {
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
index dfce5df..a3a420b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/sge.c
@@ -546,12 +546,13 @@ static inline void ring_fl_db(struct adapter *adapter, struct sge_fl *fl)
                  * mechanism.
                  */
                 if (unlikely(fl->bar2_addr == NULL)) {
-                       t4_write_reg(adapter,
-                                    T4VF_SGE_BASE_ADDR + SGE_VF_KDOORBELL,
-                                    QID_V(fl->cntxt_id) | val);
+                       t4_write_reg_relaxed(adapter,
+                                            T4VF_SGE_BASE_ADDR +
+                                                       SGE_VF_KDOORBELL,
+                                            QID_V(fl->cntxt_id) | val);
                 } else {
-                       writel(val | QID_V(fl->bar2_qid),
-                              fl->bar2_addr + SGE_UDB_KDOORBELL);
+                       writel_relaxed(val | QID_V(fl->bar2_qid),
+                                      fl->bar2_addr + SGE_UDB_KDOORBELL);
 
                         /* This Write memory Barrier will force the write to
                          * the User Doorbell area to be flushed.
@@ -980,8 +981,9 @@ static inline void ring_tx_db(struct adapter *adapter, struct sge_txq *tq,
         if (unlikely(tq->bar2_addr == NULL)) {
                 u32 val = PIDX_V(n);
 
-               t4_write_reg(adapter, T4VF_SGE_BASE_ADDR + SGE_VF_KDOORBELL,
-                            QID_V(tq->cntxt_id) | val);
+               t4_write_reg_relaxed(adapter,
+                                    T4VF_SGE_BASE_ADDR + SGE_VF_KDOORBELL,
+                                    QID_V(tq->cntxt_id) | val);
         } else {
                 u32 val = PIDX_T5_V(n);
 
@@ -1026,7 +1028,7 @@ static inline void ring_tx_db(struct adapter *adapter, struct sge_txq *tq,
                                 count--;
                         }
                 } else
-                       writel(val | QID_V(tq->bar2_qid),
+                       writel_relaxed(val | QID_V(tq->bar2_qid),
                                tq->bar2_addr + SGE_UDB_KDOORBELL);
 
                 /* This Write Memory Barrier will force the write to the User