From patchwork Fri Jun 28 09:07:53 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 2797771 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id AD6739F3C3 for ; Fri, 28 Jun 2013 09:13:03 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 3FE0620170 for ; Fri, 28 Jun 2013 09:13:02 +0000 (UTC) Received: from casper.infradead.org (casper.infradead.org [85.118.1.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 992CA2016A for ; Fri, 28 Jun 2013 09:12:59 +0000 (UTC) Received: from merlin.infradead.org ([2001:4978:20e::2]) by casper.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1UsUgw-0007uO-LR; Fri, 28 Jun 2013 09:10:08 +0000 Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1UsUgL-0007gv-2i; Fri, 28 Jun 2013 09:09:29 +0000 Received: from mail-pd0-f170.google.com ([209.85.192.170]) by merlin.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1UsUg4-0007e9-CW for linux-arm-kernel@lists.infradead.org; Fri, 28 Jun 2013 09:09:16 +0000 Received: by mail-pd0-f170.google.com with SMTP id x11so918667pdj.15 for ; Fri, 28 Jun 2013 02:08:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:date:message-id:x-mailer:in-reply-to:references; bh=uK4P+zGnBLsHXIDNDLI0kMHFMbje4x1NimuzWXPrtjU=; b=Wmo7pReG7QlHHfM35bDiBSZwC2vPtBJN8Dxiwe4blGvkcXFgVnFB8wVp2nwIsLlkmh jPAa89usac1GFRsqN+GMwn3onBfM/FDDeOg9T/I9w5dDbEFwAG6I9zCXFhyycNrRScxb IRjebi6WLLoe2HWlIEjdeIYFKlkyIKc0wapaf+A+E1ENh1W8gaSrUFWc/UQY2Trg2Lpb D/Gv1ntwgZcYEjZJReisM9vhTkt5iKED0MmvKyywt7q/h6TFxwxUV4b4/yfPGPNL7Ou6 y+w4LBw2YT6k3V+NcrDTcFm/y9ol/avH8Q2Dr2E31AExfy7rwlcgOi1w4+rfyt39WmxC wT8A== X-Received: by 10.66.183.196 with SMTP id eo4mr10383308pac.156.1372410530808; Fri, 28 Jun 2013 02:08:50 -0700 (PDT) Received: from localhost ([183.37.203.121]) by mx.google.com with ESMTPSA id ri8sm7254549pbc.3.2013.06.28.02.08.45 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Fri, 28 Jun 2013 02:08:50 -0700 (PDT) From: Ming Lei To: Greg Kroah-Hartman Subject: [PATCH v3 4/4] USB: EHCI: support running URB giveback in tasklet context Date: Fri, 28 Jun 2013 17:07:53 +0800 Message-Id: <1372410473-30920-5-git-send-email-ming.lei@canonical.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1372410473-30920-1-git-send-email-ming.lei@canonical.com> References: <1372410473-30920-1-git-send-email-ming.lei@canonical.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130628_050912_635762_068EFB8A X-CRM114-Status: GOOD ( 15.66 ) X-Spam-Score: -1.9 (-) Cc: Oliver Neukum , Ming Lei , linux-usb@vger.kernel.org, Steven Rostedt , Alan Stern , Thomas Gleixner , linux-arm-kernel@lists.infradead.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-5.5 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP All 4 transfer types can work well on EHCI HCD after switching to run URB giveback in tasklet context, so mark all HCD drivers to support it. Also we don't need to release ehci->lock during URB giveback any more. From below test results on 3 machines(2 ARM and one x86), time consumed by EHCI interrupt handler droped much without performance loss. 1 test description 1.1 mass storage performance test: - run below command 10 times and compute the average performance dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1 - two usb mass storage device: A: sandisk extreme USB 3.0 16G(used in test case 1 & case 2) B: kingston DataTraveler G2 4GB(only used in test case 2) 1.2 uvc function test: - run one simple capture program in the below link http://kernel.ubuntu.com/~ming/up/capture.c - capture format 640*480 and results in High Bandwidth mode on the uvc device: Z-Star 0x0ac8/0x3450 - on T410(x86) laptop, also use guvcview to watch video capture/playback 1.3 about test2 and test4 - both two devices involved are tested concurrently by above test items 1.4 how to compute irq time(the time consumed by ehci_irq) - use trace points of irq:irq_handler_entry and irq:irq_handler_exit 1.5 kernel 3.10.0-rc3-next-20130528 1.6 test machines Pandaboard A1: ARM CortexA9 dural core Arndale board: ARM CortexA15 dural core T410: i5 CPU 2.67GHz quad core 2 test result 2.1 test case1: single mass storage device performance test -------------------------------------------------------------------- upstream | patched perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us) -------------------------------------------------------------------- Pandaboard A1: 25.280(avg:145,max:772) | 25.540(avg:14, max:75) Arndale board: 29.700(avg:33, max:129) | 29.700(avg:10, max:50) T410: 34.430(avg:17, max:154*)| 34.660(avg:12, max:155) --------------------------------------------------------------------- 2.2 test case2: two mass storage devices' performance test -------------------------------------------------------------------- upstream | patched perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us) -------------------------------------------------------------------- Pandaboard A1: 15.840/15.580(avg:158,max:1216) | 16.500/16.160(avg:15,max:139) Arndale board: 17.370/16.220(avg:33 max:234) | 17.480/16.200(avg:11, max:91) T410: 21.180/19.820(avg:18 max:160) | 21.220/19.880(avg:11, max:149) --------------------------------------------------------------------- 2.3 test case3: one uvc streaming test - uvc device works well(on x86, luvcview can be used too and has same result with uvc capture) -------------------------------------------------------------------- upstream | patched irq time(us) | irq time(us) -------------------------------------------------------------------- Pandaboard A1: (avg:445, max:873) | (avg:33, max:44) Arndale board: (avg:316, max:630) | (avg:20, max:27) T410: (avg:39, max:107) | (avg:10, max:65) --------------------------------------------------------------------- 2.4 test case4: one uvc streaming plus one mass storage device test -------------------------------------------------------------------- upstream | patched perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us) -------------------------------------------------------------------- Pandaboard A1: 20.340(avg:259,max:1704)| 20.390(avg:24, max:101) Arndale board: 23.460(avg:124,max:726) | 23.370(avg:15, max:52) T410: 28.520(avg:27, max:169) | 28.630(avg:13, max:160) --------------------------------------------------------------------- 2.5 test case5: read single mass storage device with small transfer - run below command 10 times and compute the average speed dd if=/dev/sdN iflag=direct of=/dev/null bs=4K count=4000 1), test device A: -------------------------------------------------------------------- upstream | patched perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us) -------------------------------------------------------------------- Pandaboard A1: 6.5(avg:21, max:64) | 6.5(avg:10, max:24) Arndale board: 8.13(avg:12, max:23) | 8.06(avg:7, max:17) T410: 6.66(avg:13, max:131) | 6.84(avg:11, max:149) --------------------------------------------------------------------- 2), test device B: -------------------------------------------------------------------- upstream | patched perf(MB/s)+irq time(us) | perf(MB/s)+irq time(us) -------------------------------------------------------------------- Pandaboard A1: 5.5(avg:21,max:43) | 5.49(avg:10, max:24) Arndale board: 5.9(avg:12, max:22) | 5.9(avg:7, max:17) T410: 5.48(avg:13, max:155) | 5.48(avg:7, max:140) --------------------------------------------------------------------- * On T410, sometimes read ehci status register in ehci_irq takes more than 100us, and the problem has been reported on the link: http://marc.info/?t=137065867300001&r=1&w=2 Cc: Alan Stern Signed-off-by: Ming Lei --- drivers/usb/host/ehci-fsl.c | 2 +- drivers/usb/host/ehci-grlib.c | 2 +- drivers/usb/host/ehci-hcd.c | 2 +- drivers/usb/host/ehci-mv.c | 2 +- drivers/usb/host/ehci-octeon.c | 2 +- drivers/usb/host/ehci-pmcmsp.c | 2 +- drivers/usb/host/ehci-ppc-of.c | 2 +- drivers/usb/host/ehci-ps3.c | 2 +- drivers/usb/host/ehci-q.c | 5 ----- drivers/usb/host/ehci-sead3.c | 2 +- drivers/usb/host/ehci-sh.c | 2 +- drivers/usb/host/ehci-tilegx.c | 2 +- drivers/usb/host/ehci-w90x900.c | 2 +- drivers/usb/host/ehci-xilinx-of.c | 2 +- 14 files changed, 13 insertions(+), 18 deletions(-) diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c index bd831ec..330274a 100644 --- a/drivers/usb/host/ehci-fsl.c +++ b/drivers/usb/host/ehci-fsl.c @@ -669,7 +669,7 @@ static const struct hc_driver ehci_fsl_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_USB2 | HCD_MEMORY, + .flags = HCD_USB2 | HCD_MEMORY | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-grlib.c b/drivers/usb/host/ehci-grlib.c index 5d75de9..c6048ee 100644 --- a/drivers/usb/host/ehci-grlib.c +++ b/drivers/usb/host/ehci-grlib.c @@ -43,7 +43,7 @@ static const struct hc_driver ehci_grlib_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index e2fccc0..a0c27fd 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -1167,7 +1167,7 @@ static const struct hc_driver ehci_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-mv.c b/drivers/usb/host/ehci-mv.c index 915c2db..ce18a36 100644 --- a/drivers/usb/host/ehci-mv.c +++ b/drivers/usb/host/ehci-mv.c @@ -96,7 +96,7 @@ static const struct hc_driver mv_ehci_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-octeon.c b/drivers/usb/host/ehci-octeon.c index 45cc001..ab0397e 100644 --- a/drivers/usb/host/ehci-octeon.c +++ b/drivers/usb/host/ehci-octeon.c @@ -51,7 +51,7 @@ static const struct hc_driver ehci_octeon_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-pmcmsp.c b/drivers/usb/host/ehci-pmcmsp.c index 363890e..3ab32a2 100644 --- a/drivers/usb/host/ehci-pmcmsp.c +++ b/drivers/usb/host/ehci-pmcmsp.c @@ -286,7 +286,7 @@ static const struct hc_driver ehci_msp_hc_driver = { #else .irq = ehci_irq, #endif - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-ppc-of.c b/drivers/usb/host/ehci-ppc-of.c index 56dc732..da95a31 100644 --- a/drivers/usb/host/ehci-ppc-of.c +++ b/drivers/usb/host/ehci-ppc-of.c @@ -28,7 +28,7 @@ static const struct hc_driver ehci_ppc_of_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-ps3.c b/drivers/usb/host/ehci-ps3.c index fd98377..8188542 100644 --- a/drivers/usb/host/ehci-ps3.c +++ b/drivers/usb/host/ehci-ps3.c @@ -71,7 +71,7 @@ static const struct hc_driver ps3_ehci_hc_driver = { .product_desc = "PS3 EHCI Host Controller", .hcd_priv_size = sizeof(struct ehci_hcd), .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, .reset = ps3_ehci_hc_reset, .start = ehci_run, .stop = ehci_stop, diff --git a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c index d34b399..b637a65 100644 --- a/drivers/usb/host/ehci-q.c +++ b/drivers/usb/host/ehci-q.c @@ -254,8 +254,6 @@ static int qtd_copy_status ( static void ehci_urb_done(struct ehci_hcd *ehci, struct urb *urb, int status) -__releases(ehci->lock) -__acquires(ehci->lock) { if (usb_pipetype(urb->pipe) == PIPE_INTERRUPT) { /* ... update hc-wide periodic stats */ @@ -281,11 +279,8 @@ __acquires(ehci->lock) urb->actual_length, urb->transfer_buffer_length); #endif - /* complete() can reenter this HCD */ usb_hcd_unlink_urb_from_ep(ehci_to_hcd(ehci), urb); - spin_unlock (&ehci->lock); usb_hcd_giveback_urb(ehci_to_hcd(ehci), urb, status); - spin_lock (&ehci->lock); } static int qh_schedule (struct ehci_hcd *ehci, struct ehci_qh *qh); diff --git a/drivers/usb/host/ehci-sead3.c b/drivers/usb/host/ehci-sead3.c index b2de52d..8a73449 100644 --- a/drivers/usb/host/ehci-sead3.c +++ b/drivers/usb/host/ehci-sead3.c @@ -55,7 +55,7 @@ const struct hc_driver ehci_sead3_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-sh.c b/drivers/usb/host/ehci-sh.c index c4c0ee9..1691f8e 100644 --- a/drivers/usb/host/ehci-sh.c +++ b/drivers/usb/host/ehci-sh.c @@ -36,7 +36,7 @@ static const struct hc_driver ehci_sh_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_USB2 | HCD_MEMORY, + .flags = HCD_USB2 | HCD_MEMORY | HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-tilegx.c b/drivers/usb/host/ehci-tilegx.c index d72b292..204d3b6 100644 --- a/drivers/usb/host/ehci-tilegx.c +++ b/drivers/usb/host/ehci-tilegx.c @@ -61,7 +61,7 @@ static const struct hc_driver ehci_tilegx_hc_driver = { * Generic hardware linkage. */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * Basic lifecycle operations. diff --git a/drivers/usb/host/ehci-w90x900.c b/drivers/usb/host/ehci-w90x900.c index 59e0e24..1c370df 100644 --- a/drivers/usb/host/ehci-w90x900.c +++ b/drivers/usb/host/ehci-w90x900.c @@ -108,7 +108,7 @@ static const struct hc_driver ehci_w90x900_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_USB2|HCD_MEMORY, + .flags = HCD_USB2|HCD_MEMORY|HCD_BH, /* * basic lifecycle operations diff --git a/drivers/usb/host/ehci-xilinx-of.c b/drivers/usb/host/ehci-xilinx-of.c index d845e3b..bb96a68 100644 --- a/drivers/usb/host/ehci-xilinx-of.c +++ b/drivers/usb/host/ehci-xilinx-of.c @@ -79,7 +79,7 @@ static const struct hc_driver ehci_xilinx_of_hc_driver = { * generic hardware linkage */ .irq = ehci_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, /* * basic lifecycle operations