[RFC,0/4] spi: spi-fsl-spi: try to make cpu-mode transfers faster

Message ID 20190327143040.16013-1-rasmus.villemoes@prevas.dk

Message

Rasmus Villemoes March 27, 2019, 2:30 p.m. UTC
I doubt patches 3 and 4 are acceptable, but I'd still like to get
comments and/or alternative suggestions for making large transfers
faster.

The patches have been tested on an MPC8309 with a Cypress S25FL032P
spi-nor slave, and make various operations between 50% and 73%
faster.

We have not observed any problems, but to completely rule out the
possibility of "glitches on SPI CLK" mentioned in patch 3 would of
course require testing on a much wider set of hardware combinations.

Rasmus Villemoes (4):
  spi: spi-fsl-spi: remove always-true conditional in fsl_spi_do_one_msg
  spi: spi-fsl-spi: relax message sanity checking a little
  spi: spi-fsl-spi: allow changing bits_per_word while CS is still
    active
  spi: spi-fsl-spi: automatically adapt bits-per-word in cpu mode

 drivers/spi/spi-fsl-spi.c | 41 +++++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 12 deletions(-)
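
For context, the core idea behind patches 3 and 4 is roughly the
following. This is an illustrative sketch only, not code from the
series: the helper name and the alignment checks are made up, and it
just assumes that an 8-bit byte stream can be widened to 32-bit words
once changing bits_per_word while CS is active (patch 3) is allowed.

#include <linux/types.h>
#include <linux/spi/spi.h>

/*
 * In cpu mode the controller interrupts once per word, so moving 32
 * bits per word instead of 8 cuts the interrupt count to a quarter.
 */
static u32 fsl_spi_pick_bits_per_word(const struct spi_transfer *t)
{
	/* Only widen plain byte streams that are 32-bit sized and aligned. */
	if (t->bits_per_word == 8 && !(t->len % 4) &&
	    !((unsigned long)t->tx_buf % 4) && !((unsigned long)t->rx_buf % 4))
		return 32;

	return t->bits_per_word;
}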

Comments

Mark Brown April 1, 2019, 7:34 a.m. UTC | #1
On Wed, Mar 27, 2019 at 02:30:48PM +0000, Rasmus Villemoes wrote:
> I doubt patches 3 and 4 are acceptable, but I'd still like to get
> comments and/or alternative suggestions for making large transfers
> faster.

I see no problem with this from a framework point of view, FWIW; it's
going to be a question of whether there are any glitches, as you say.  I'm not
sure how we can get wider testing/review unless the patches actually get
merged though...  I'll leave them for a bit longer but unless someone
sees a problem I'll probably go ahead and apply them.
Rasmus Villemoes April 2, 2019, 8:43 a.m. UTC | #2
On 01/04/2019 09.34, Mark Brown wrote:
> On Wed, Mar 27, 2019 at 02:30:48PM +0000, Rasmus Villemoes wrote:
>> I doubt patches 3 and 4 are acceptable, but I'd still like to get
>> comments and/or alternative suggestions for making large transfers
>> faster.
> 
> I see no problem with this from a framework point of view, FWIW; it's
> going to be a question of whether there are any glitches, as you say.  I'm not
> sure how we can get wider testing/review unless the patches actually get
> merged though...  I'll leave them for a bit longer but unless someone
> sees a problem I'll probably go ahead and apply them.
> 

Thanks! There's one other option I can think of: don't do the interrupts
at all, but just busy-wait for the completion of each word transfer (in
a cpu_relax() loop). That could be guarded by something like
1000000*bits_per_word < hz (roughly, the word transfer takes less than 1
us). At least on -rt, having the interrupt thread scheduled in and out
again easily takes more than 1us of cpu time, and AFAIU we'd still be
preemptible throughout - and/or one can throw in a cond_resched() every
nnn words. But this might be a bit -rt specific, and the 1us threshold
is rather arbitrary.
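
A rough sketch of that guard and busy-wait is below. poll_word_done()
is a placeholder for polling the controller's event register, and the
mpc8xxx_spi/completion plumbing is only assumed to match what the
driver already has:

#include <linux/types.h>
#include <linux/completion.h>
#include <linux/processor.h>	/* cpu_relax() */
#include <linux/spi/spi.h>
#include "spi-fsl-lib.h"	/* struct mpc8xxx_spi (assumed to provide ->done) */

/* Placeholder: poll the controller's "word done" event bit. */
static bool poll_word_done(struct mpc8xxx_spi *mspi);

/* One word should finish in under ~1 us at this word size and clock. */
static bool fsl_spi_can_busy_wait(const struct spi_transfer *t)
{
	return (u64)1000000 * t->bits_per_word < t->speed_hz;
}

static void fsl_spi_wait_word(struct mpc8xxx_spi *mspi,
			      const struct spi_transfer *t)
{
	if (fsl_spi_can_busy_wait(t)) {
		while (!poll_word_done(mspi))
			cpu_relax();
	} else {
		/* Existing interrupt-driven path. */
		wait_for_completion(&mspi->done);
	}
}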

Rasmus
Mark Brown April 2, 2019, 9:10 a.m. UTC | #3
On Tue, Apr 02, 2019 at 08:43:51AM +0000, Rasmus Villemoes wrote:

> Thanks! There's one other option I can think of: don't do the interrupts
> at all, but just busy-wait for the completion of each word transfer (in
> a cpu_relax() loop). That could be guarded by something like
> 1000000*bits_per_word < hz (roughly, the word transfer takes less than 1
> us). At least on -rt, having the interrupt thread scheduled in and out
> again easily takes more than 1us of cpu time, and AFAIU we'd still be
> preemptible throughout - and/or one can throw in a cond_resched() every
> nnn words. But this might be a bit -rt specific, and the 1us threshold
> is rather arbitrary.

Yeah, that's definitely worth exploring as a mitigation but obviously
with things like flash I/O that gets a bit rude.  Hopefully what's there
at the minute turns out to be robust enough.