From patchwork Wed May  9 17:33:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 10390413
Date: Wed, 9 May 2018 10:33:19 -0700
From: "Darrick J. Wong"
To: hch@infradead.org
Cc: xfs, Jan Kara, linux-fsdevel, linux-mm@kvack.org, cyberax@amazon.com,
 osandov@osandov.com, Eryu Guan
Subject: [PATCH v4] iomap: add a swapfile activation function
Message-ID: <20180509173319.GE9510@magnolia>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Darrick J. Wong

Add a new iomap_swapfile_activate function so that filesystems can activate
swap files without having to use the obsolete and slow bmap function.
This enables XFS to support fallocate'd swap files and swap files on realtime
devices.

Signed-off-by: Darrick J. Wong
Reviewed-by: Jan Kara
---
 fs/iomap.c            |  162 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_aops.c     |   12 ++++
 include/linux/iomap.h |   11 +++
 3 files changed, 185 insertions(+)

diff --git a/fs/iomap.c b/fs/iomap.c
index afd163586aa0..99e7f1aa2779 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 
 #include "internal.h"
 
@@ -1089,3 +1090,164 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iomap_dio_rw);
+
+/* Swapfile activation */
+
+#ifdef CONFIG_SWAP
+struct iomap_swapfile_info {
+	struct iomap iomap;		/* accumulated iomap */
+	struct swap_info_struct *sis;
+	uint64_t lowest_ppage;		/* lowest physical addr seen (pages) */
+	uint64_t highest_ppage;		/* highest physical addr seen (pages) */
+	unsigned long nr_pages;		/* number of pages collected */
+	int nr_extents;			/* extent count */
+};
+
+/*
+ * Collect physical extents for this swap file.  Physical extents reported to
+ * the swap code must be trimmed to align to a page boundary.  The logical
+ * offset within the file is irrelevant since the swapfile code maps logical
+ * page numbers of the swap device to the physical page-aligned extents.
+ */
+static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi)
+{
+	struct iomap *iomap = &isi->iomap;
+	unsigned long nr_pages;
+	uint64_t first_ppage;
+	uint64_t first_ppage_reported;
+	uint64_t next_ppage;
+	int error;
+
+	/*
+	 * Round the start up and the end down so that the physical
+	 * extent aligns to a page boundary.
+	 */
+	first_ppage = ALIGN(iomap->addr, PAGE_SIZE) >> PAGE_SHIFT;
+	next_ppage = ALIGN_DOWN(iomap->addr + iomap->length, PAGE_SIZE) >>
+			PAGE_SHIFT;
+
+	/* Skip too-short physical extents. */
+	if (first_ppage >= next_ppage)
+		return 0;
+	nr_pages = next_ppage - first_ppage;
+
+	/*
+	 * Calculate how much swap space we're adding; the first page contains
+	 * the swap header and doesn't count.  The mm still wants that first
+	 * page fed to add_swap_extent, however.
+	 */
+	first_ppage_reported = first_ppage;
+	if (iomap->offset == 0)
+		first_ppage_reported++;
+	if (isi->lowest_ppage > first_ppage_reported)
+		isi->lowest_ppage = first_ppage_reported;
+	if (isi->highest_ppage < (next_ppage - 1))
+		isi->highest_ppage = next_ppage - 1;
+
+	/* Add extent, set up for the next call. */
+	error = add_swap_extent(isi->sis, isi->nr_pages, nr_pages, first_ppage);
+	if (error < 0)
+		return error;
+	isi->nr_extents += error;
+	isi->nr_pages += nr_pages;
+	return 0;
+}
+
+/*
+ * Accumulate iomaps for this swap file.  We have to accumulate iomaps because
+ * swap only cares about contiguous page-aligned physical extents and makes no
+ * distinction between written and unwritten extents.
+ */
+static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos,
+		loff_t count, void *data, struct iomap *iomap)
+{
+	struct iomap_swapfile_info *isi = data;
+	int error;
+
+	/* Skip holes. */
+	if (iomap->type == IOMAP_HOLE)
+		goto out;
+
+	/* Only one bdev per swap file. */
+	if (iomap->bdev != isi->sis->bdev)
+		goto err;
+
+	/* Only real or unwritten extents. */
+	if (iomap->type != IOMAP_MAPPED && iomap->type != IOMAP_UNWRITTEN)
+		goto err;
+
+	/* No uncommitted metadata or shared blocks or inline data. */
+	if (iomap->flags & (IOMAP_F_DIRTY | IOMAP_F_SHARED |
+			    IOMAP_F_DATA_INLINE))
+		goto err;
+
+	/* No null physical addresses. */
+	if (iomap->addr == IOMAP_NULL_ADDR)
+		goto err;
+
+	if (isi->iomap.length == 0) {
+		/* No accumulated extent, so just store it. */
+		memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
+	} else if (isi->iomap.addr + isi->iomap.length == iomap->addr) {
+		/* Append this to the accumulated extent. */
+		isi->iomap.length += iomap->length;
+	} else {
+		/* Otherwise, add the retained iomap and store this one. */
+		error = iomap_swapfile_add_extent(isi);
+		if (error)
+			return error;
+		memcpy(&isi->iomap, iomap, sizeof(isi->iomap));
+	}
+out:
+	return count;
+err:
+	pr_err("swapon: file cannot be used for swap\n");
+	return -EINVAL;
+}
+
+/*
+ * Iterate a swap file's iomaps to construct physical extents that can be
+ * passed to the swapfile subsystem.
+ */
+int iomap_swapfile_activate(struct swap_info_struct *sis,
+		struct file *swap_file, sector_t *pagespan,
+		const struct iomap_ops *ops)
+{
+	struct iomap_swapfile_info isi = {
+		.sis = sis,
+		.lowest_ppage = (sector_t)-1ULL,
+	};
+	struct address_space *mapping = swap_file->f_mapping;
+	struct inode *inode = mapping->host;
+	loff_t pos = 0;
+	loff_t len = ALIGN_DOWN(i_size_read(inode), PAGE_SIZE);
+	loff_t ret;
+
+	ret = filemap_write_and_wait(inode->i_mapping);
+	if (ret)
+		return ret;
+
+	while (len > 0) {
+		ret = iomap_apply(inode, pos, len, IOMAP_REPORT,
+				ops, &isi, iomap_swapfile_activate_actor);
+		if (ret <= 0)
+			return ret;
+
+		pos += ret;
+		len -= ret;
+	}
+
+	if (isi.iomap.length) {
+		ret = iomap_swapfile_add_extent(&isi);
+		if (ret)
+			return ret;
+	}
+
+	*pagespan = 1 + isi.highest_ppage - isi.lowest_ppage;
+	sis->max = isi.nr_pages;
+	sis->pages = isi.nr_pages - 1;
+	sis->highest_bit = isi.nr_pages - 1;
+	return isi.nr_extents;
+}
+EXPORT_SYMBOL_GPL(iomap_swapfile_activate);
+#endif /* CONFIG_SWAP */
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 0ab824f574ed..80de476cecf8 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1475,6 +1475,16 @@ xfs_vm_set_page_dirty(
 	return newly_dirty;
 }
 
+static int
+xfs_iomap_swapfile_activate(
+	struct swap_info_struct		*sis,
+	struct file			*swap_file,
+	sector_t			*span)
+{
+	sis->bdev = xfs_find_bdev_for_inode(file_inode(swap_file));
+	return iomap_swapfile_activate(sis, swap_file, span, &xfs_iomap_ops);
+}
+
 const struct address_space_operations xfs_address_space_operations = {
 	.readpage		= xfs_vm_readpage,
 	.readpages		= xfs_vm_readpages,
@@ -1488,6 +1498,7 @@ const struct address_space_operations xfs_address_space_operations = {
 	.migratepage		= buffer_migrate_page,
 	.is_partially_uptodate  = block_is_partially_uptodate,
 	.error_remove_page	= generic_error_remove_page,
+	.swap_activate		= xfs_iomap_swapfile_activate,
 };
 
 const struct address_space_operations xfs_dax_aops = {
@@ -1495,4 +1506,5 @@ const struct address_space_operations xfs_dax_aops = {
 	.direct_IO		= noop_direct_IO,
 	.set_page_dirty		= noop_set_page_dirty,
 	.invalidatepage		= noop_invalidatepage,
+	.swap_activate		= xfs_iomap_swapfile_activate,
 };
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 19a07de28212..4bd87294219a 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -106,4 +106,15 @@ typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
 
+#ifdef CONFIG_SWAP
+struct file;
+struct swap_info_struct;
+
+int iomap_swapfile_activate(struct swap_info_struct *sis,
+		struct file *swap_file, sector_t *pagespan,
+		const struct iomap_ops *ops);
+#else
+# define iomap_swapfile_activate(sis, swapfile, pagespan, ops)	(-EIO)
+#endif /* CONFIG_SWAP */
+
 #endif /* LINUX_IOMAP_H */
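
For anyone following the page trimming in iomap_swapfile_add_extent(), the
rounding can be tried out in userspace. What follows is only a rough sketch
and not part of the patch: the demo_* helpers and the fixed 4096-byte page
size are assumptions made for illustration, standing in for the kernel's
ALIGN(), ALIGN_DOWN(), PAGE_SIZE and PAGE_SHIFT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT and PAGE_SIZE. */
#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t demo_align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of two. */
static uint64_t demo_align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Trim the byte range [addr, addr + length) to whole pages.  Stores the
 * first physical page number in *first_ppage and returns the number of
 * whole pages, or 0 when the range is too short to hold a whole page.
 */
static uint64_t demo_trim_extent(uint64_t addr, uint64_t length,
		uint64_t *first_ppage)
{
	uint64_t first = demo_align_up(addr, DEMO_PAGE_SIZE) >> DEMO_PAGE_SHIFT;
	uint64_t next = demo_align_down(addr + length, DEMO_PAGE_SIZE) >>
			DEMO_PAGE_SHIFT;

	*first_ppage = first;
	if (first >= next)
		return 0;	/* too short: no whole page inside */
	return next - first;
}
```

With 4096-byte pages, an extent at byte 6144 of length 16384 trims to three
whole pages starting at physical page 2, while a 3000-byte extent at byte
5000 contains no whole page and would be skipped.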