| Message ID | 20250412154207.2152667-1-vitaly.wool@konsulko.se |
|---|---|
| State | New |
| Series | [v4] mm: add zblock allocator |
On 2025-04-12 19:42, Vitaly Wool wrote:

zblock is a special purpose allocator for storing compressed pages.
It stores an integer number of same-size objects per block. These
blocks consist of several physical pages (2**n, i.e. 1/2/4/8).

With zblock, it is possible to densely arrange objects of various sizes
resulting in low internal fragmentation. This allocator also tries to
fill incomplete blocks instead of adding new ones, in many cases
providing a compression ratio comparable to zsmalloc's.

zblock is also in most cases superior to zsmalloc with regard to
average performance and worst execution times, thus allowing for better
response time and real-time characteristics of the whole system.

High memory and page migration are currently not supported by zblock.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.se>
Signed-off-by: Igor Belousov <igor.b@beldev.am>
Tested-by: Igor Belousov <igor.b@beldev.am>
---
v3: https://patchwork.kernel.org/project/linux-mm/patch/20250408125211.1611879-1-vitaly.wool@konsulko.se/

Changes since v3:
- rebased and tested against the latest -mm tree
- the block descriptors table was updated for a better compression ratio
- fixed the bug with a wrong SLOT_BITS value
- slot search moved to find_and_claim_block()

Test results (zstd compressor, 8-core Ryzen 9 VM, make bzImage):

- zblock:
    real      6m52.621s
    user      33m41.771s
    sys       6m28.825s
    Zswap:    162328 kB
    Zswapped: 754468 kB
    zswpin    93851
    zswpout   542481
    zswpwb    935

- zsmalloc:
    real      7m4.355s
    user      34m37.538s
    sys       6m22.086s
    zswpin    101243
    zswpout   448217
    zswpwb    640
    Zswap:    175704 kB
    Zswapped: 778692 kB

 Documentation/mm/zblock.rst |  24 ++
 MAINTAINERS                 |   7 +
 mm/Kconfig                  |  12 +
 mm/Makefile                 |   1 +
 mm/zblock.c                 | 443 ++++++++++++++++++++++++++++++++++++
 mm/zblock.h                 | 176 ++++++++++++++
 6 files changed, 663 insertions(+)
 create mode 100644 Documentation/mm/zblock.rst
 create mode 100644 mm/zblock.c
 create mode 100644 mm/zblock.h
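A note for readers before the diff: the handle that zblock returns to its
caller packs the block address, the block type and the slot index into a
single unsigned long, which works because blocks are page-aligned
allocations. The following is a minimal, standalone userspace sketch of
that arithmetic — not part of the patch — assuming 4 KiB pages (hence
SLOT_BITS = 7, as in mm/zblock.h below); it mirrors metadata_to_handle()
and handle_to_metadata() from mm/zblock.c:

#include <assert.h>
#include <stdio.h>

#define PAGE_SZ   0x1000UL                 /* assumption: 4 KiB pages */
#define SLOT_BITS 7                        /* per mm/zblock.h for this page size */
#define SLOT_MASK ((1UL << SLOT_BITS) - 1)

/* Pack a page-aligned block address, block type and slot into one handle. */
static unsigned long encode(unsigned long block, unsigned int type,
			    unsigned int slot)
{
	return block | (type << SLOT_BITS) | slot;
}

/* Unpack: the low SLOT_BITS bits are the slot, the remaining in-page
 * offset bits are the block type, the page-aligned part is the block.
 */
static unsigned long decode(unsigned long handle, unsigned int *type,
			    unsigned int *slot)
{
	*type = (handle & (PAGE_SZ - 1)) >> SLOT_BITS;
	*slot = handle & SLOT_MASK;
	return handle & ~(PAGE_SZ - 1);
}

int main(void)
{
	unsigned long block = 0x7f1234560000UL;	/* page-aligned, made up */
	unsigned int type, slot;
	unsigned long h = encode(block, 9, 42);

	assert(decode(h, &type, &slot) == block);
	assert(type == 9 && slot == 42);
	printf("handle %#lx -> block %#lx, type %u, slot %u\n",
	       h, block, type, slot);
	return 0;
}

One implication visible in the sketch: with 4 KiB pages, 12 - 7 = 5 bits
remain for the block type, capping the descriptor table at 32 entries —
which matches the "max table size 32" comment in mm/zblock.h.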
diff --git a/Documentation/mm/zblock.rst b/Documentation/mm/zblock.rst
new file mode 100644
index 000000000000..9751434d0b76
--- /dev/null
+++ b/Documentation/mm/zblock.rst
@@ -0,0 +1,24 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======
+zblock
+======
+
+zblock is a special purpose allocator for storing compressed pages.
+It stores an integer number of compressed objects per block. These
+blocks consist of several physical pages (2**n, i.e. 1/2/4/8).
+
+With zblock, it is possible to densely arrange objects of various sizes
+resulting in low internal fragmentation. This allocator also tries to
+fill incomplete blocks instead of adding new ones, in many cases
+providing a compression ratio substantially higher than z3fold and zbud
+(though lower than zsmalloc's).
+
+zblock does not require MMU to operate and also is superior to zsmalloc
+with regard to average performance and worst execution times, thus
+allowing for better response time and real-time characteristics of the
+whole system.
+
+E.g., on a series of stress-ng tests run on a Raspberry Pi 5, zblock
+scores 5-10% higher bogo ops/s than zsmalloc.
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 96b827049501..46465c986005 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26640,6 +26640,13 @@
 F:	Documentation/networking/device_drivers/hamradio/z8530drv.rst
 F:	drivers/net/hamradio/*scc.c
 F:	drivers/net/hamradio/z8530.h
 
+ZBLOCK COMPRESSED SLAB MEMORY ALLOCATOR
+M:	Vitaly Wool <vitaly.wool@konsulko.se>
+L:	linux-mm@kvack.org
+S:	Maintained
+F:	Documentation/mm/zblock.rst
+F:	mm/zblock.[ch]
+
 ZD1211RW WIRELESS DRIVER
 L:	linux-wireless@vger.kernel.org
 S:	Orphan
diff --git a/mm/Kconfig b/mm/Kconfig
index e113f713b493..5aa1479151ec 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -152,6 +152,18 @@ config ZSWAP_ZPOOL_DEFAULT
 	default "zsmalloc" if ZSWAP_ZPOOL_DEFAULT_ZSMALLOC
 	default ""
 
+config ZBLOCK
+	tristate "Fast compression allocator with high density"
+	depends on ZPOOL
+	help
+	  A special purpose allocator for storing compressed pages.
+	  It stores an integer number of same-size compressed objects
+	  per block. These blocks consist of several physical pages
+	  (2**n, i.e. 1/2/4/8).
+
+	  With zblock, it is possible to densely arrange objects of
+	  various sizes resulting in low internal fragmentation.
+
 config ZSMALLOC
 	tristate
 	prompt "N:1 compression allocator (zsmalloc)" if (ZSWAP || ZRAM)
diff --git a/mm/Makefile b/mm/Makefile
index e7f6bbf8ae5f..9d7e5b5bb694 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -116,6 +116,7 @@ obj-$(CONFIG_DEBUG_VM_PGTABLE) += debug_vm_pgtable.o
 obj-$(CONFIG_PAGE_OWNER) += page_owner.o
 obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
 obj-$(CONFIG_ZPOOL) += zpool.o
+obj-$(CONFIG_ZBLOCK) += zblock.o
 obj-$(CONFIG_ZSMALLOC) += zsmalloc.o
 obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o
 obj-$(CONFIG_CMA) += cma.o
diff --git a/mm/zblock.c b/mm/zblock.c
new file mode 100644
index 000000000000..ecc7aeb611af
--- /dev/null
+++ b/mm/zblock.c
@@ -0,0 +1,443 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * zblock.c
+ *
+ * Author: Vitaly Wool <vitaly.wool@konsulko.se>
+ * Based on the work from Ananda Badmaev <a.badmaev@clicknet.pro>
+ * Copyright (C) 2022-2025, Konsulko AB.
+ *
+ * Zblock is a small object allocator with the intention to serve as a
+ * zpool backend. It operates on page blocks which consist of a
+ * power-of-2 number of physical pages and store an integer number of
+ * compressed pages per block, which results in determinism and simplicity.
+ *
+ * zblock doesn't export any API and is meant to be used via zpool API.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/atomic.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/preempt.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/zpool.h>
+#include "zblock.h"
+
+static struct rb_root block_desc_tree = RB_ROOT;
+
+/* Encode handle of a particular slot in the pool using metadata */
+static inline unsigned long metadata_to_handle(struct zblock_block *block,
+				unsigned int block_type, unsigned int slot)
+{
+	return (unsigned long)(block) | (block_type << SLOT_BITS) | slot;
+}
+
+/* Return block, block type and slot in the pool corresponding to handle */
+static inline struct zblock_block *handle_to_metadata(unsigned long handle,
+				unsigned int *block_type, unsigned int *slot)
+{
+	*block_type = (handle & (PAGE_SIZE - 1)) >> SLOT_BITS;
+	*slot = handle & SLOT_MASK;
+	return (struct zblock_block *)(handle & PAGE_MASK);
+}
+
+/*
+ * Find a block with at least one free slot and claim it.
+ * We make sure that the first block, if it exists, will always work.
+ */
+static inline struct zblock_block *find_and_claim_block(struct block_list *b,
+		int block_type, unsigned long *handle)
+{
+	struct list_head *l = &b->active_list;
+	unsigned int slot;
+
+	if (!list_empty(l)) {
+		struct zblock_block *z = list_first_entry(l, typeof(*z), link);
+
+		if (--z->free_slots == 0)
+			list_move(&z->link, &b->full_list);
+		/*
+		 * There is a slot in the block and we just made sure it would
+		 * remain.
+		 * Find that slot and set the busy bit.
+		 */
+		for (slot = find_first_zero_bit(z->slot_info,
+				block_desc[block_type].slots_per_block);
+		     slot < block_desc[block_type].slots_per_block;
+		     slot = find_next_zero_bit(z->slot_info,
+				block_desc[block_type].slots_per_block,
+				slot)) {
+			if (!test_and_set_bit(slot, z->slot_info))
+				break;
+			barrier();
+		}
+
+		WARN_ON(slot >= block_desc[block_type].slots_per_block);
+		*handle = metadata_to_handle(z, block_type, slot);
+		return z;
+	}
+	return NULL;
+}
+
+/*
+ * allocate new block and add it to corresponding block list
+ */
+static struct zblock_block *alloc_block(struct zblock_pool *pool,
+					int block_type, gfp_t gfp,
+					unsigned long *handle)
+{
+	struct zblock_block *block;
+	struct block_list *block_list;
+
+	block = (void *)__get_free_pages(gfp, block_desc[block_type].order);
+	if (!block)
+		return NULL;
+
+	block_list = &pool->block_lists[block_type];
+
+	/* init block data */
+	block->free_slots = block_desc[block_type].slots_per_block - 1;
+	memset(&block->slot_info, 0, sizeof(block->slot_info));
+	set_bit(0, block->slot_info);
+	*handle = metadata_to_handle(block, block_type, 0);
+
+	spin_lock(&block_list->lock);
+	list_add(&block->link, &block_list->active_list);
+	block_list->block_count++;
+	spin_unlock(&block_list->lock);
+	return block;
+}
+
+/*****************
+ * API Functions
+ *****************/
+/**
+ * zblock_create_pool() - create a new zblock pool
+ * @gfp:	gfp flags when allocating the zblock pool structure
+ * @ops:	user-defined operations for the zblock pool
+ *
+ * Return: pointer to the new zblock pool or NULL if the metadata allocation
+ * failed.
+ */
+static struct zblock_pool *zblock_create_pool(gfp_t gfp)
+{
+	struct zblock_pool *pool;
+	struct block_list *block_list;
+	int i;
+
+	pool = kmalloc(sizeof(struct zblock_pool), gfp);
+	if (!pool)
+		return NULL;
+
+	/* init each block list */
+	for (i = 0; i < ARRAY_SIZE(block_desc); i++) {
+		block_list = &pool->block_lists[i];
+		spin_lock_init(&block_list->lock);
+		INIT_LIST_HEAD(&block_list->full_list);
+		INIT_LIST_HEAD(&block_list->active_list);
+		block_list->block_count = 0;
+	}
+	return pool;
+}
+
+/**
+ * zblock_destroy_pool() - destroys an existing zblock pool
+ * @pool:	the zblock pool to be destroyed
+ *
+ */
+static void zblock_destroy_pool(struct zblock_pool *pool)
+{
+	kfree(pool);
+}
+
+
+/**
+ * zblock_alloc() - allocates a slot of appropriate size
+ * @pool:	zblock pool from which to allocate
+ * @size:	size in bytes of the desired allocation
+ * @gfp:	gfp flags used if the pool needs to grow
+ * @handle:	handle of the new allocation
+ *
+ * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
+ * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
+ * a new slot.
+ */
+static int zblock_alloc(struct zblock_pool *pool, size_t size, gfp_t gfp,
+			unsigned long *handle)
+{
+	int block_type = -1;
+	struct zblock_block *block;
+	struct block_list *block_list;
+
+	if (!size)
+		return -EINVAL;
+
+	if (size > PAGE_SIZE)
+		return -ENOSPC;
+
+	/* find basic block type with suitable slot size */
+	if (size < block_desc[0].slot_size)
+		block_type = 0;
+	else {
+		struct block_desc_node *block_node;
+		struct rb_node *node = block_desc_tree.rb_node;
+
+		while (node) {
+			block_node = container_of(node, typeof(*block_node), node);
+			if (size < block_node->this_slot_size)
+				node = node->rb_left;
+			else if (size >= block_node->next_slot_size)
+				node = node->rb_right;
+			else {
+				block_type = block_node->block_idx + 1;
+				break;
+			}
+		}
+	}
+	if (WARN_ON(block_type < 0))
+		return -EINVAL;
+	if (block_type >= ARRAY_SIZE(block_desc))
+		return -ENOSPC;
+
+	block_list = &pool->block_lists[block_type];
+
+	spin_lock(&block_list->lock);
+	block = find_and_claim_block(block_list, block_type, handle);
+	spin_unlock(&block_list->lock);
+	if (block)
+		return 0;
+
+	/* no block with free slots was found, try to allocate a new empty block */
+	block = alloc_block(pool, block_type, gfp & ~(__GFP_MOVABLE | __GFP_HIGHMEM), handle);
+	return block ? 0 : -ENOMEM;
+}
+
+/**
+ * zblock_free() - frees the allocation associated with the given handle
+ * @pool:	pool in which the allocation resided
+ * @handle:	handle associated with the allocation returned by zblock_alloc()
+ *
+ */
+static void zblock_free(struct zblock_pool *pool, unsigned long handle)
+{
+	unsigned int slot, block_type;
+	struct zblock_block *block;
+	struct block_list *block_list;
+
+	block = handle_to_metadata(handle, &block_type, &slot);
+	block_list = &pool->block_lists[block_type];
+
+	spin_lock(&block_list->lock);
+	/* if all slots in block are empty delete whole block */
+	if (++block->free_slots == block_desc[block_type].slots_per_block) {
+		block_list->block_count--;
+		list_del(&block->link);
+		spin_unlock(&block_list->lock);
+		free_pages((unsigned long)block, block_desc[block_type].order);
+		return;
+	} else if (block->free_slots == 1)
+		list_move_tail(&block->link, &block_list->active_list);
+	clear_bit(slot, block->slot_info);
+	spin_unlock(&block_list->lock);
+}
+
+/**
+ * zblock_map() - maps the allocation associated with the given handle
+ * @pool:	pool in which the allocation resides
+ * @handle:	handle associated with the allocation to be mapped
+ *
+ *
+ * Returns: a pointer to the mapped allocation
+ */
+static void *zblock_map(struct zblock_pool *pool, unsigned long handle)
+{
+	unsigned int block_type, slot;
+	struct zblock_block *block;
+	unsigned long offs;
+	void *p;
+
+	block = handle_to_metadata(handle, &block_type, &slot);
+	offs = ZBLOCK_HEADER_SIZE + slot * block_desc[block_type].slot_size;
+	p = (void *)block + offs;
+	return p;
+}
+
+/**
+ * zblock_unmap() - unmaps the allocation associated with the given handle
+ * @pool:	pool in which the allocation resides
+ * @handle:	handle associated with the allocation to be unmapped
+ */
+static void zblock_unmap(struct zblock_pool *pool, unsigned long handle)
+{
+}
+
+/**
+ * zblock_write() - write to the memory area defined by handle
+ * @pool:	pool in which the allocation resides
+ * @handle:	handle associated with the allocation
+ * @handle_mem: pointer to source memory block
+ * @mem_len:	length of the memory block to write
+ */
+static void zblock_write(struct zblock_pool *pool, unsigned long handle,
+			 void *handle_mem, size_t mem_len)
+{
+	unsigned int block_type, slot;
+	struct zblock_block *block;
+	unsigned long offs;
+	void *p;
+
+	block = handle_to_metadata(handle, &block_type, &slot);
+	offs = ZBLOCK_HEADER_SIZE + slot * block_desc[block_type].slot_size;
+	p = (void *)block + offs;
+	memcpy(p, handle_mem, mem_len);
+}
+
+/**
+ * zblock_get_total_pages() - gets the zblock pool size in pages
+ * @pool:	pool being queried
+ *
+ * Returns: size in pages of the given pool.
+ */
+static u64 zblock_get_total_pages(struct zblock_pool *pool)
+{
+	u64 total_size;
+	int i;
+
+	total_size = 0;
+	for (i = 0; i < ARRAY_SIZE(block_desc); i++)
+		total_size += pool->block_lists[i].block_count << block_desc[i].order;
+
+	return total_size;
+}
+
+/*****************
+ * zpool
+ ****************/
+
+static void *zblock_zpool_create(const char *name, gfp_t gfp)
+{
+	return zblock_create_pool(gfp);
+}
+
+static void zblock_zpool_destroy(void *pool)
+{
+	zblock_destroy_pool(pool);
+}
+
+static int zblock_zpool_malloc(void *pool, size_t size, gfp_t gfp,
+			unsigned long *handle)
+{
+	return zblock_alloc(pool, size, gfp, handle);
+}
+
+static void zblock_zpool_free(void *pool, unsigned long handle)
+{
+	zblock_free(pool, handle);
+}
+
+static void *zblock_zpool_read_begin(void *pool, unsigned long handle,
+				void *local_copy)
+{
+	return zblock_map(pool, handle);
+}
+
+static void zblock_zpool_obj_write(void *pool, unsigned long handle,
+				void *handle_mem, size_t mem_len)
+{
+	zblock_write(pool, handle, handle_mem, mem_len);
+}
+
+static void zblock_zpool_read_end(void *pool, unsigned long handle,
+				void *handle_mem)
+{
+	zblock_unmap(pool, handle);
+}
+
+static u64 zblock_zpool_total_pages(void *pool)
+{
+	return zblock_get_total_pages(pool);
+}
+
+static struct zpool_driver zblock_zpool_driver = {
+	.type = "zblock",
+	.owner = THIS_MODULE,
+	.create = zblock_zpool_create,
+	.destroy = zblock_zpool_destroy,
+	.malloc = zblock_zpool_malloc,
+	.free = zblock_zpool_free,
+	.obj_read_begin = zblock_zpool_read_begin,
+	.obj_read_end = zblock_zpool_read_end,
+	.obj_write = zblock_zpool_obj_write,
+	.total_pages = zblock_zpool_total_pages,
+};
+
+MODULE_ALIAS("zpool-zblock");
+
+static void delete_rbtree(void)
+{
+	while (!RB_EMPTY_ROOT(&block_desc_tree))
+		rb_erase(block_desc_tree.rb_node, &block_desc_tree);
+}
+
+static int __init create_rbtree(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(block_desc); i++) {
+		struct block_desc_node *block_node = kmalloc(sizeof(*block_node),
+							GFP_KERNEL);
+		struct rb_node **new = &block_desc_tree.rb_node, *parent = NULL;
+
+		if (!block_node) {
+			delete_rbtree();
+			return -ENOMEM;
+		}
+		if (i > 0 && block_desc[i].slot_size <= block_desc[i-1].slot_size) {
+			pr_err("%s: block descriptors not in ascending order\n",
+				__func__);
+			delete_rbtree();
+			return -EINVAL;
+		}
+		block_node->this_slot_size = block_desc[i].slot_size;
+		block_node->block_idx = i;
+		if (i == ARRAY_SIZE(block_desc) - 1)
+			block_node->next_slot_size = PAGE_SIZE;
+		else
+			block_node->next_slot_size = block_desc[i+1].slot_size;
+		while (*new) {
+			parent = *new;
+			/* the array is sorted so we will always go to the right */
+			new = &((*new)->rb_right);
+		}
+		rb_link_node(&block_node->node, parent, new);
+		rb_insert_color(&block_node->node, &block_desc_tree);
+	}
+	return 0;
+}
+
+static int __init init_zblock(void)
+{
+	int ret = create_rbtree();
+
+	if (ret)
+		return ret;
+
+	zpool_register_driver(&zblock_zpool_driver);
+	return 0;
+}
+
+static void __exit exit_zblock(void)
+{
+	zpool_unregister_driver(&zblock_zpool_driver);
+	delete_rbtree();
+}
+
+module_init(init_zblock);
+module_exit(exit_zblock);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Vitaly Wool <vitaly.wool@konsulko.se>");
+MODULE_DESCRIPTION("Block allocator for compressed pages");
diff --git a/mm/zblock.h b/mm/zblock.h
new file mode 100644
index 000000000000..fd72961c077a
--- /dev/null
+++ b/mm/zblock.h
@@ -0,0 +1,176 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: Vitaly Wool <vitaly.wool@konsulko.com>
+ * Copyright (C) 2025, Konsulko AB.
+ */
+#ifndef __ZBLOCK_H__
+#define __ZBLOCK_H__
+
+#include <linux/mm.h>
+#include <linux/rbtree.h>
+#include <linux/types.h>
+
+#define SLOT_FREE 0
+#define BIT_SLOT_OCCUPIED 0
+#define BIT_SLOT_MAPPED 1
+
+#if PAGE_SIZE == 0x1000
+/* max 128 slots per block, max table size 32 */
+#define SLOT_BITS 7
+#elif PAGE_SIZE == 0x4000
+/* max 256 slots per block, max table size 64 */
+#define SLOT_BITS 8
+#else
+#error Unsupported PAGE_SIZE
+#endif
+
+#define MAX_SLOTS (1 << SLOT_BITS)
+#define SLOT_MASK ((0x1UL << SLOT_BITS) - 1)
+
+#define ZBLOCK_HEADER_SIZE	round_up(sizeof(struct zblock_block), sizeof(long))
+#define BLOCK_DATA_SIZE(order) ((PAGE_SIZE << order) - ZBLOCK_HEADER_SIZE)
+#define SLOT_SIZE(nslots, order) (round_down((BLOCK_DATA_SIZE(order) / nslots), sizeof(long)))
+
+/**
+ * struct zblock_block - block metadata
+ * Block consists of several (1/2/4/8) pages and contains a fixed
+ * integer number of slots for allocating compressed pages.
+ *
+ * free_slots:	number of free slots in the block
+ * slot_info:	contains data about free/occupied slots
+ */
+struct zblock_block {
+	struct list_head link;
+	DECLARE_BITMAP(slot_info, 1 << SLOT_BITS);
+	u32 free_slots;
+};
+
+/**
+ * struct block_desc - general metadata for block lists
+ * Each block list stores only blocks of corresponding type which means
+ * that all blocks in it have the same number and size of slots.
+ * All slots are aligned to size of long.
+ *
+ * slot_size:		size of slot for this list
+ * slots_per_block:	number of slots per block for this list
+ * order:		order for __get_free_pages
+ */
+struct block_desc {
+	unsigned int slot_size;
+	unsigned short slots_per_block;
+	unsigned short order;
+};
+
+struct block_desc_node {
+	struct rb_node node;
+	unsigned int this_slot_size;
+	unsigned int next_slot_size;
+	unsigned int block_idx;
+};
+
+static const struct block_desc block_desc[] = {
+#if PAGE_SIZE == 0x1000
+	{ SLOT_SIZE(63, 0), 63, 0 },
+	{ SLOT_SIZE(32, 0), 32, 0 },
+	{ SLOT_SIZE(21, 0), 21, 0 },
+	{ SLOT_SIZE(15, 0), 15, 0 },
+	{ SLOT_SIZE(12, 0), 12, 0 },
+	{ SLOT_SIZE(10, 0), 10, 0 },
+	{ SLOT_SIZE(9, 0), 9, 0 },
+	{ SLOT_SIZE(8, 0), 8, 0 },
+	{ SLOT_SIZE(29, 2), 29, 2 },
+	{ SLOT_SIZE(13, 1), 13, 1 },
+	{ SLOT_SIZE(6, 0), 6, 0 },
+	{ SLOT_SIZE(11, 1), 11, 1 },
+	{ SLOT_SIZE(5, 0), 5, 0 },
+	{ SLOT_SIZE(9, 1), 9, 1 },
+	{ SLOT_SIZE(8, 1), 8, 1 },
+	{ SLOT_SIZE(29, 3), 29, 3 },
+	{ SLOT_SIZE(13, 2), 13, 2 },
+	{ SLOT_SIZE(12, 2), 12, 2 },
+	{ SLOT_SIZE(11, 2), 11, 2 },
+	{ SLOT_SIZE(10, 2), 10, 2 },
+	{ SLOT_SIZE(9, 2), 9, 2 },
+	{ SLOT_SIZE(17, 3), 17, 3 },
+	{ SLOT_SIZE(8, 2), 8, 2 },
+	{ SLOT_SIZE(15, 3), 15, 3 },
+	{ SLOT_SIZE(14, 3), 14, 3 },
+	{ SLOT_SIZE(13, 3), 13, 3 },
+	{ SLOT_SIZE(6, 2), 6, 2 },
+	{ SLOT_SIZE(11, 3), 11, 3 },
+	{ SLOT_SIZE(10, 3), 10, 3 },
+	{ SLOT_SIZE(9, 3), 9, 3 },
+	{ SLOT_SIZE(4, 2), 4, 2 },
+#elif PAGE_SIZE == 0x4000
+	{ SLOT_SIZE(255, 0), 255, 0 },
+	{ SLOT_SIZE(185, 0), 185, 0 },
+	{ SLOT_SIZE(145, 0), 145, 0 },
+	{ SLOT_SIZE(113, 0), 113, 0 },
+	{ SLOT_SIZE(92, 0), 92, 0 },
+	{ SLOT_SIZE(75, 0), 75, 0 },
+	{ SLOT_SIZE(60, 0), 60, 0 },
+	{ SLOT_SIZE(51, 0), 51, 0 },
+	{ SLOT_SIZE(43, 0), 43, 0 },
+	{ SLOT_SIZE(37, 0), 37, 0 },
+	{ SLOT_SIZE(32, 0), 32, 0 },
+	{ SLOT_SIZE(27, 0), 27, 0 },
+	{ SLOT_SIZE(23, 0), 23, 0 },
+	{ SLOT_SIZE(19, 0), 19, 0 },
+	{ SLOT_SIZE(17, 0), 17, 0 },
+	{ SLOT_SIZE(15, 0), 15, 0 },
+	{ SLOT_SIZE(13, 0), 13, 0 },
+	{ SLOT_SIZE(11, 0), 11, 0 },
+	{ SLOT_SIZE(10, 0), 10, 0 },
+	{ SLOT_SIZE(9, 0), 9, 0 },
+	{ SLOT_SIZE(8, 0), 8, 0 },
+	{ SLOT_SIZE(15, 1), 15, 1 },
+	{ SLOT_SIZE(14, 1), 14, 1 },
+	{ SLOT_SIZE(13, 1), 13, 1 },
+	{ SLOT_SIZE(12, 1), 12, 1 },
+	{ SLOT_SIZE(11, 1), 11, 1 },
+	{ SLOT_SIZE(10, 1), 10, 1 },
+	{ SLOT_SIZE(9, 1), 9, 1 },
+	{ SLOT_SIZE(8, 1), 8, 1 },
+	{ SLOT_SIZE(15, 2), 15, 2 },
+	{ SLOT_SIZE(14, 2), 14, 2 },
+	{ SLOT_SIZE(13, 2), 13, 2 },
+	{ SLOT_SIZE(12, 2), 12, 2 },
+	{ SLOT_SIZE(11, 2), 11, 2 },
+	{ SLOT_SIZE(10, 2), 10, 2 },
+	{ SLOT_SIZE(9, 2), 9, 2 },
+	{ SLOT_SIZE(8, 2), 8, 2 },
+	{ SLOT_SIZE(7, 2), 7, 2 },
+	{ SLOT_SIZE(6, 2), 6, 2 },
+	{ SLOT_SIZE(5, 2), 5, 2 },
+#endif /* PAGE_SIZE */
+};
+
+/**
+ * struct block_list - stores metadata of particular list
+ * lock:		protects the list of blocks
+ * active_list:	linked list of active (non-full) blocks
+ * full_list:	linked list of full blocks
+ * block_count:	total number of blocks in the list
+ */
+struct block_list {
+	spinlock_t lock;
+	struct list_head active_list;
+	struct list_head full_list;
+	unsigned long block_count;
+};
+
+/**
+ * struct zblock_pool - stores metadata for each zblock pool
+ * @block_lists:	array of block lists
+ * @zpool:		zpool driver
+ *
+ * This structure is allocated at pool creation time and maintains metadata
+ * for a particular zblock pool.
+ */
+struct zblock_pool {
+	struct block_list block_lists[ARRAY_SIZE(block_desc)];
+	struct zpool *zpool;
+};
+
+
+#endif
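To make the block_desc table above concrete: slot sizes are derived, not
hand-picked. Below is a small userspace sketch of the SLOT_SIZE() and
BLOCK_DATA_SIZE() arithmetic from mm/zblock.h — illustrative only, and it
assumes a 64-bit build where struct zblock_block occupies 40 bytes (16 for
the list head, 16 for the 128-bit bitmap, 4 for the u32 counter, padded up
to a multiple of 8):

#include <stdio.h>

#define PAGE_SZ 0x1000UL
#define HDR     40UL	/* assumed sizeof(struct zblock_block), rounded up */

/* Mirrors SLOT_SIZE(nslots, order): usable bytes per block divided by
 * the slot count, rounded down to a multiple of sizeof(long). */
static unsigned long slot_size(unsigned long nslots, unsigned int order)
{
	unsigned long data = (PAGE_SZ << order) - HDR;

	return data / nslots / sizeof(long) * sizeof(long);
}

int main(void)
{
	/* First 4K entry: (4096 - 40) / 63 = 64.3..., rounded down to 64. */
	printf("63 slots, order 0: %lu bytes/slot\n", slot_size(63, 0));
	/* Last 4K entry: 4 slots across 4 pages, close to PAGE_SIZE each. */
	printf(" 4 slots, order 2: %lu bytes/slot\n", slot_size(4, 2));
	return 0;
}

As for trying the allocator out: once the patch is applied and
CONFIG_ZBLOCK is enabled, zblock registers a zpool driver named "zblock",
so it should be selectable as the zswap backend through the usual zpool
mechanism, e.g. zswap.zpool=zblock on the kernel command line or by
writing zblock to /sys/module/zswap/parameters/zpool at runtime (assuming
zswap itself is enabled).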