From patchwork Mon Aug 29 17:10:13 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Butsykin
X-Patchwork-Id: 9304517
From: Pavel Butsykin
Date: Mon, 29 Aug 2016 20:10:13 +0300
Message-ID: <20160829171021.4902-15-pbutsykin@virtuozzo.com>
X-Mailer: git-send-email 2.8.3
In-Reply-To: <20160829171021.4902-1-pbutsykin@virtuozzo.com>
References: <20160829171021.4902-1-pbutsykin@virtuozzo.com>
Subject: [Qemu-devel] [PATCH RFC v2 14/22] block/pcache: add support for rescheduling requests
Cc: kwolf@redhat.com, famz@redhat.com, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org, jsnow@redhat.com

We can no longer drop nodes before the aio write request has completed:
in the interval between the start of the request and its completion, an
overlapping chunk of blocks could be cached, leaving stale data in the
cache.

It is also now possible for an aio write to overlap a PCNode whose status
is NODE_WAIT_STATUS. Because the nodes are dropped in the aio callback,
such nodes can be skipped: by the time the aio read for the pending node
is processed, the data on disk is guaranteed to be up to date.

Signed-off-by: Pavel Butsykin
---
 block/pcache.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 112 insertions(+), 24 deletions(-)

diff --git a/block/pcache.c b/block/pcache.c
index 1ff4c6a..cb5f884 100644
--- a/block/pcache.c
+++ b/block/pcache.c
@@ -43,6 +43,11 @@ typedef struct RbNodeKey {
     uint32_t size;
 } RbNodeKey;
 
+typedef struct ACBEntryLink {
+    QTAILQ_ENTRY(ACBEntryLink) entry;
+    struct PrefCacheAIOCB *acb;
+} ACBEntryLink;
+
 typedef struct BlockNode {
     struct RbNode rb_node;
     union {
@@ -58,6 +63,10 @@ typedef struct BlockNode {
 
 typedef struct PCNode {
     BlockNode cm;
+    struct {
+        QTAILQ_HEAD(acb_head, ACBEntryLink) list;
+        uint32_t cnt;
+    } wait;
     uint32_t status;
     uint32_t ref;
     uint8_t *data;
@@ -181,7 +190,6 @@ static inline PCNode *pcache_node_ref(PCNode *node)
 {
     assert(node->status == NODE_SUCCESS_STATUS ||
            node->status == NODE_WAIT_STATUS);
-    assert(atomic_read(&node->ref) == 0);/* XXX: only for sequential requests */
     atomic_inc(&node->ref);
 
     return node;
@@ -277,6 +285,8 @@ static inline void *pcache_node_alloc(RbNodeKey* key)
     node->status = NODE_WAIT_STATUS;
     qemu_co_mutex_init(&node->lock);
     node->data = g_malloc(node->cm.nb_sectors << BDRV_SECTOR_BITS);
+    node->wait.cnt = 0;
+    QTAILQ_INIT(&node->wait.list);
 
     return node;
 }
@@ -308,15 +318,33 @@ static void pcache_node_drop(BDRVPCacheState *s, PCNode *node)
     pcache_node_unref(s, node);
 }
 
+static inline PCNode *pcache_get_most_unused_node(BDRVPCacheState *s)
+{
+    PCNode *node;
+    assert(!QTAILQ_EMPTY(&s->pcache.lru.list));
+
+    qemu_co_mutex_lock(&s->pcache.lru.lock);
+    node = PCNODE(QTAILQ_LAST(&s->pcache.lru.list, lru_head));
+    pcache_node_ref(node);
+    qemu_co_mutex_unlock(&s->pcache.lru.lock);
+
+    return node;
+}
+
 static void pcache_try_shrink(BDRVPCacheState *s)
 {
     while (s->pcache.curr_size > s->cfg_cache_size) {
-        qemu_co_mutex_lock(&s->pcache.lru.lock);
-        assert(!QTAILQ_EMPTY(&s->pcache.lru.list));
-        PCNode *rmv_node = PCNODE(QTAILQ_LAST(&s->pcache.lru.list, lru_head));
-        qemu_co_mutex_unlock(&s->pcache.lru.lock);
+        PCNode *rmv_node;
+        /* it can happen if all nodes are waiting */
+        if (QTAILQ_EMPTY(&s->pcache.lru.list)) {
+            DPRINTF("lru list is empty, but curr_size: %d\n",
+                    s->pcache.curr_size);
+            break;
+        }
+        rmv_node = pcache_get_most_unused_node(s);
 
         pcache_node_drop(s, rmv_node);
+        pcache_node_unref(s, rmv_node);
 #ifdef PCACHE_DEBUG
         atomic_inc(&s->shrink_cnt_node);
 #endif
@@ -392,7 +420,7 @@ static uint64_t ranges_overlap_size(uint64_t node1, uint32_t size1,
     return MIN(node1 + size1, node2 + size2) - MAX(node1, node2);
 }
 
-static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
+static inline void pcache_node_read_buf(PrefCacheAIOCB *acb, PCNode* node)
 {
     uint64_t qiov_offs = 0, node_offs = 0;
     uint32_t size;
@@ -407,15 +435,41 @@ static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
                                node->cm.sector_num, node->cm.nb_sectors)
            << BDRV_SECTOR_BITS;
 
+    qemu_co_mutex_lock(&node->lock); /* XXX: use rw lock */
+    copy = \
+        qemu_iovec_from_buf(acb->qiov, qiov_offs, node->data + node_offs, size);
+    qemu_co_mutex_unlock(&node->lock);
+    assert(copy == size);
+}
+
+static inline void pcache_node_read_wait(PrefCacheAIOCB *acb, PCNode *node)
+{
+    ACBEntryLink *link = g_slice_alloc(sizeof(*link));
+    link->acb = acb;
+
+    atomic_inc(&node->wait.cnt);
+    QTAILQ_INSERT_HEAD(&node->wait.list, link, entry);
+    acb->ref++;
+}
+
+static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
+{
     assert(node->status == NODE_SUCCESS_STATUS ||
+           node->status == NODE_WAIT_STATUS ||
            node->status == NODE_REMOVE_STATUS);
     assert(node->data != NULL);
 
     qemu_co_mutex_lock(&node->lock);
-    copy = \
-        qemu_iovec_from_buf(acb->qiov, qiov_offs, node->data + node_offs, size);
-    assert(copy == size);
+    if (node->status == NODE_WAIT_STATUS) {
+        pcache_node_read_wait(acb, node);
+        qemu_co_mutex_unlock(&node->lock);
+
+        return;
+    }
     qemu_co_mutex_unlock(&node->lock);
+
+    pcache_node_read_buf(acb, node);
+    pcache_node_unref(acb->s, node);
 }
 
 static inline void prefetch_init_key(PrefCacheAIOCB *acb, RbNodeKey* key)
@@ -446,10 +500,11 @@ static void pcache_pickup_parts_of_cache(PrefCacheAIOCB *acb, PCNode *node,
             size -= up_size;
             num += up_size;
         }
-        pcache_node_read(acb, node);
         up_size = MIN(node->cm.sector_num + node->cm.nb_sectors - num, size);
-
-        pcache_node_unref(acb->s, node);
+        pcache_node_read(acb, node); /* don't use node after pcache_node_read,
+                                      * node maybe free.
+                                      */
+        node = NULL;
 
         size -= up_size;
         num += up_size;
@@ -488,7 +543,6 @@ static int32_t pcache_prefetch(PrefCacheAIOCB *acb)
         acb->nb_sectors)
     {
         pcache_node_read(acb, node);
-        pcache_node_unref(acb->s, node);
         return PREFETCH_FULL_UP;
     }
     pcache_pickup_parts_of_cache(acb, node, key.num, key.size);
@@ -513,6 +567,31 @@ static void complete_aio_request(PrefCacheAIOCB *acb)
     }
 }
 
+static void pcache_complete_acb_wait_queue(BDRVPCacheState *s, PCNode *node)
+{
+    ACBEntryLink *link, *next;
+
+    if (atomic_read(&node->wait.cnt) == 0) {
+        return;
+    }
+
+    QTAILQ_FOREACH_SAFE(link, &node->wait.list, entry, next) {
+        PrefCacheAIOCB *wait_acb = link->acb;
+
+        QTAILQ_REMOVE(&node->wait.list, link, entry);
+        g_slice_free1(sizeof(*link), link);
+
+        pcache_node_read_buf(wait_acb, node);
+
+        assert(node->ref != 0);
+        pcache_node_unref(s, node);
+
+        complete_aio_request(wait_acb);
+        atomic_dec(&node->wait.cnt);
+    }
+    assert(atomic_read(&node->wait.cnt) == 0);
+}
+
 static void pcache_node_submit(PrefCachePartReq *req)
 {
     PCNode *node = req->node;
@@ -539,14 +618,17 @@ static void pcache_merge_requests(PrefCacheAIOCB *acb)
 
     qemu_co_mutex_lock(&acb->requests.lock);
     QTAILQ_FOREACH_SAFE(req, &acb->requests.list, entry, next) {
+        PCNode *node = req->node;
         QTAILQ_REMOVE(&acb->requests.list, req, entry);
 
         assert(req != NULL);
-        assert(req->node->status == NODE_WAIT_STATUS);
+        assert(node->status == NODE_WAIT_STATUS);
 
         pcache_node_submit(req);
 
-        pcache_node_read(acb, req->node);
+        pcache_node_read_buf(acb, node);
+
+        pcache_complete_acb_wait_queue(acb->s, node);
 
         pcache_node_unref(acb->s, req->node);
 
@@ -559,22 +641,27 @@ static void pcache_try_node_drop(PrefCacheAIOCB *acb)
 {
     BDRVPCacheState *s = acb->s;
     RbNodeKey key;
+    PCNode *node;
+    uint64_t end_offs = acb->sector_num + acb->nb_sectors;
 
-    prefetch_init_key(acb, &key);
-
+    key.num = acb->sector_num;
     do {
-        PCNode *node;
-        qemu_co_mutex_lock(&s->pcache.tree.lock);
+        key.size = end_offs - key.num;
+
+        qemu_co_mutex_lock(&s->pcache.tree.lock); /* XXX: use get_next_node */
         node = pcache_node_search(&s->pcache.tree.root, &key);
         qemu_co_mutex_unlock(&s->pcache.tree.lock);
         if (node == NULL) {
-            break;
+            return;
         }
-
-        pcache_node_drop(s, node);
+        if (node->status != NODE_WAIT_STATUS) {
+            assert(node->status == NODE_SUCCESS_STATUS);
+            pcache_node_drop(s, node);
+        }
+        key.num = node->cm.sector_num + node->cm.nb_sectors;
 
         pcache_node_unref(s, node);
-    } while (true);
+    } while (end_offs > key.num);
 }
 
 static void pcache_aio_cb(void *opaque, int ret)
@@ -586,6 +673,8 @@ static void pcache_aio_cb(void *opaque, int ret)
             return;
         }
         pcache_merge_requests(acb);
+    } else { /* QEMU_AIO_WRITE */
+        pcache_try_node_drop(acb); /* XXX: use write through */
     }
 
     complete_aio_request(acb);
@@ -649,7 +738,6 @@ static BlockAIOCB *pcache_aio_writev(BlockDriverState *bs,
 {
     PrefCacheAIOCB *acb = pcache_aio_get(bs, sector_num, qiov, nb_sectors, cb,
                                          opaque, QEMU_AIO_WRITE);
-    pcache_try_node_drop(acb); /* XXX: use write through */
     bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
                     pcache_aio_cb, acb);
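
Note for reviewers: the rescheduling logic in the diff above (pcache_node_read,
pcache_node_read_wait and pcache_complete_acb_wait_queue) can be illustrated
with a minimal standalone sketch. It is not QEMU code. Every name in it
(struct node, struct reader, node_read, node_complete) is a hypothetical
stand-in, a plain singly linked list replaces QTAILQ, and locking and atomics
are left out. It only shows the idea: a read that hits a node still being
filled is parked on that node and finished once the data arrives, and each
reader's node reference is released at that point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's types; not QEMU code. */
enum node_status { NODE_WAIT, NODE_SUCCESS };

struct reader {
    char buf[16];               /* destination of the read */
    int done;                   /* set once the read has completed */
};

struct wait_link {              /* plays the role of ACBEntryLink */
    struct wait_link *next;
    struct reader *rd;
};

struct node {                   /* plays the role of PCNode */
    enum node_status status;
    unsigned ref;               /* references held by readers */
    unsigned wait_cnt;          /* number of parked readers */
    struct wait_link *waiters;  /* readers parked on this node */
    char data[16];
};

/* Copy the cached data to the reader and mark the read complete. */
static void node_read_buf(struct reader *rd, const struct node *n)
{
    memcpy(rd->buf, n->data, sizeof(rd->buf));
    rd->done = 1;
}

/*
 * Read from a node on which the caller already holds one reference.
 * Returns 1 if the read was parked, 0 if it was served immediately,
 * -1 on allocation failure.
 */
static int node_read(struct reader *rd, struct node *n)
{
    if (n->status == NODE_WAIT) {
        struct wait_link *l = malloc(sizeof(*l));
        if (l == NULL) {
            return -1;
        }
        l->rd = rd;
        l->next = n->waiters;   /* insert at head; reference is kept */
        n->waiters = l;
        n->wait_cnt++;
        return 1;
    }
    node_read_buf(rd, n);
    n->ref--;                   /* served now, drop the reference */
    return 0;
}

/* The node's data has arrived: publish it and finish parked readers. */
static void node_complete(struct node *n, const char *src)
{
    strncpy(n->data, src, sizeof(n->data) - 1);
    n->data[sizeof(n->data) - 1] = '\0';
    n->status = NODE_SUCCESS;

    while (n->waiters != NULL) {
        struct wait_link *l = n->waiters;
        n->waiters = l->next;
        node_read_buf(l->rd, n);
        assert(n->ref != 0);
        n->ref--;               /* drop the parked reader's reference */
        n->wait_cnt--;
        free(l);
    }
    assert(n->wait_cnt == 0);
}
```

In this sketch a caller takes a reference (n.ref++) before each node_read()
call, mirroring pcache_node_ref() in the patch, and calls node_complete() from
whatever path delivers the data.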