From: Dave Chinner
To: linux-xfs@vger.kernel.org
Subject: [RFCRAP PATCH 0/4 v2] mkfs.xfs IO scalability
Date: Wed, 5 Sep 2018 18:19:28 +1000
Message-Id: <20180905081932.27478-1-david@fromorbit.com>

Hi folks,

More on getting mkfs to be usable for testing unrealistically large
filesystems. The first two patches of this series are unchanged from
yesterday - the second two are new and build on them.

The second two patches hack a delayed write buffer submission list
into mkfs and libxfs. It's a bit nasty, because I've chosen to ignore
the fact that libxfs has no concept of async IO or background writes
and instead hacked around it. You can see the result in passing a
buffer list to xfs_trans_commit() to get it to add buffers to the
delwri list rather than write them synchronously. Fast, loose and
stupidly dangerous, all in one. Yeehaw! Better yet, it doesn't even
make any difference to performance - it's just an enabling patch.

The last patch is the performance improvement - it hacks a grotty,
non-re-entrant AIO submission/completion ring to turn the single
threaded sync write batching into a single threaded concurrent IO
loop using AIO. This can drive really deep IO queues as long as it's
got enough queued IO to work with, so mkfs is hacked to only submit
IO every few hundred AGs it initialises. This sustains queue depths
of around 100 IOs and SSD utilisation at around 80% using about half
a CPU, and so the time to make an 8EB filesystem drops to around 15
minutes.

This is most definitely not production code. This is a load of crap
hacked together in a few hours as a proof of concept.
But it's a successful proof of concept, so what we now need is
someone who is looking around for a substantial project to volunteer
to rewrite the libxfs buffer cache around an AIO
submission/completion core and implement all this in a "proper"
fashion. If you're interested, let me know...

Cheers,

Dave.
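
For anyone who hasn't met a delayed write list before, here is a toy
sketch of the idea: buffers are queued on a list instead of being
written synchronously, and the whole batch is issued later in
ascending block order. This is not the code from the patches and not
the libxfs API. Every name in it (toy_buf, toy_delwri,
toy_delwri_queue, toy_delwri_submit, toy_delwri_demo) is hypothetical,
and it uses plain pwrite() rather than AIO to keep it self-contained.

```c
/* Toy sketch only: hypothetical names, not the libxfs delwri code. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BLOCK_SIZE 512

/* Hypothetical stand-in for a libxfs buffer. */
struct toy_buf {
	off_t		blkno;			/* block number on "disk" */
	char		data[TOY_BLOCK_SIZE];
	struct toy_buf	*next;			/* link on the delwri list */
};

/* Hypothetical delayed write list: queued buffers, no IO issued yet. */
struct toy_delwri {
	struct toy_buf	*head;
	size_t		count;
};

/* Queue a buffer for later writeback instead of writing it now. */
void toy_delwri_queue(struct toy_delwri *dl, struct toy_buf *bp)
{
	bp->next = dl->head;
	dl->head = bp;
	dl->count++;
}

static int toy_buf_cmp(const void *a, const void *b)
{
	const struct toy_buf *x = *(struct toy_buf *const *)a;
	const struct toy_buf *y = *(struct toy_buf *const *)b;

	return (x->blkno > y->blkno) - (x->blkno < y->blkno);
}

/*
 * Write all queued buffers in ascending block order and empty the list.
 * Returns the number of buffers written, or -1 on error (a real
 * implementation would need proper error handling for partial batches).
 */
long toy_delwri_submit(struct toy_delwri *dl, int fd)
{
	size_t n = dl->count, i = 0;
	struct toy_buf **v, *bp;
	long written = 0;

	if (n == 0)
		return 0;
	v = malloc(n * sizeof(*v));
	if (!v)
		return -1;
	for (bp = dl->head; bp; bp = bp->next)
		v[i++] = bp;
	assert(i == n);
	qsort(v, n, sizeof(*v), toy_buf_cmp);
	for (i = 0; i < n; i++) {
		off_t off = v[i]->blkno * TOY_BLOCK_SIZE;

		if (pwrite(fd, v[i]->data, TOY_BLOCK_SIZE, off) != TOY_BLOCK_SIZE) {
			written = -1;
			break;
		}
		written++;
	}
	free(v);
	dl->head = NULL;
	dl->count = 0;
	return written;
}

/* Queue three buffers out of order, submit once, read them back. */
int toy_delwri_demo(void)
{
	char path[] = "/tmp/toy_delwri_XXXXXX";
	struct toy_delwri dl = { NULL, 0 };
	struct toy_buf bufs[3];
	off_t blknos[3] = { 7, 2, 5 };
	char check[TOY_BLOCK_SIZE];
	int fd, i, ok = 1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	for (i = 0; i < 3; i++) {
		bufs[i].blkno = blknos[i];
		memset(bufs[i].data, 'A' + i, TOY_BLOCK_SIZE);
		toy_delwri_queue(&dl, &bufs[i]);	/* no IO happens here */
	}
	if (toy_delwri_submit(&dl, fd) != 3)
		ok = 0;
	for (i = 0; ok && i < 3; i++) {
		if (pread(fd, check, TOY_BLOCK_SIZE,
			  blknos[i] * TOY_BLOCK_SIZE) != TOY_BLOCK_SIZE ||
		    memcmp(check, bufs[i].data, TOY_BLOCK_SIZE) != 0)
			ok = 0;
	}
	close(fd);
	return ok ? 0 : -1;
}
```

toy_delwri_demo() returns 0 when all three blocks read back intact.
The point of the structure is only that queueing is decoupled from
submission, so a caller can batch up a lot of buffers before any IO
is issued. Swapping the pwrite() loop for an AIO submission loop is
the part the last patch is about, and is not shown here.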