From patchwork Sun Jul 31 19:19:00 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 9253435 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9B02A601C0 for ; Sun, 31 Jul 2016 19:19:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 71C7A28460 for ; Sun, 31 Jul 2016 19:19:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5038E2848A; Sun, 31 Jul 2016 19:19:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6D4AD28460 for ; Sun, 31 Jul 2016 19:19:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750962AbcGaTTE (ORCPT ); Sun, 31 Jul 2016 15:19:04 -0400 Received: from verein.lst.de ([213.95.11.211]:50387 "EHLO newverein.lst.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750835AbcGaTTD (ORCPT ); Sun, 31 Jul 2016 15:19:03 -0400 Received: by newverein.lst.de (Postfix, from userid 2407) id 6893D68D24; Sun, 31 Jul 2016 21:19:00 +0200 (CEST) Date: Sun, 31 Jul 2016 21:19:00 +0200 From: Christoph Hellwig To: Dave Chinner Cc: Christoph Hellwig , rpeterso@redhat.com, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: iomap infrastructure and multipage writes V5 Message-ID: <20160731191900.GA29301@lst.de> References: <1464792297-13185-1-git-send-email-hch@lst.de> <20160628002649.GI12670@dastard> <20160630172239.GA23082@lst.de> <20160718111400.GC16044@dastard> <20160718111851.GD16044@dastard> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20160718111851.GD16044@dastard> User-Agent: Mutt/1.5.17 (2007-11-01) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Another quiet weekend trying to debug this, and only minor progress. The biggest different in traces of the old vs new code is that we manage to allocate much bigger delalloc reservations at a time in xfs_bmapi_delay -> xfs_bmapi_reserve_delalloc. The old code always went for a single FSB, which also meant allocating an indlen of 7 FSBs. With the iomap code we always allocate at least 4FSB (aka a page), and sometimes 8 or 12. All of these still need 7 FSBs for the worst case indirect blocks. So what happens here is that in an ENOSPC case we manage to allocate more actual delalloc blocks before hitting ENOSPC - notwithstanding that the old case would immediately release them a little later in xfs_bmap_add_extent_hole_delay after merging the delalloc extents. On the writeback side I don't see to many changes either. We'll eventually run out of blocks when allocating the transaction in xfs_iomap_write_allocate because the reserved pool is too small. The only real difference to before is that under the ENOSPC / out of memory case we have allocated between 4 to 12 times more blocks, so we have to clean up 4 to 12 times as much while write_cache_pages continues iterating over these dirty delalloc blocks. For me this happens ~6 times as much as before, but I still don't manage to hit an endless loop. Now after spending this much time I've started wondering why we even reserve blocks in xfs_iomap_write_allocate - after all we've reserved space for the actual data blocks and the indlen worst case in xfs_bmapi_reserve_delalloc. And in fact a little hack to drop that reservation seems to solve both the root cause (depleted reserved pool) and the cleanup mess. I just haven't spend enought time to convince myself that it's actually safe, and in fact looking at the allocator makes me thing it only works by accident currently despite generally postive test results. Here is the quick patch if anyone wants to chime in: --- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 620fc91..67c317f 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -717,7 +717,7 @@ xfs_iomap_write_allocate( nimaps = 0; while (nimaps == 0) { - nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); + nres = 0; // XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, nres, 0, XFS_TRANS_RESERVE, &tp);