diff mbox series

[vfio,2/6] vfio/mlx5: Allow loading of larger images than 512 MB

Message ID 20221229100734.224388-3-yishaih@nvidia.com (mailing list archive)
State New, archived
Headers show
Series Move to use cgroups for userspace persistent allocations | expand

Commit Message

Yishai Hadas Dec. 29, 2022, 10:07 a.m. UTC
Allow loading of larger images than 512 MB by dropping the arbitrary
hard-coded value that we have today and move to use the max device
loading value which is for now 4GB.

As part of that we move to use the GFP_KERNEL_ACCOUNT option upon
allocating the persistent data of mlx5 and rely on the cgroup to provide
the memory limit for the given user.

The GFP_KERNEL_ACCOUNT option lets the memory allocator know that this
is untrusted allocation triggered from userspace and should be a subject
of kmem accountingis, and as such it is controlled by the cgroup
mechanism.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/mlx5/cmd.c  | 11 ++++++-----
 drivers/vfio/pci/mlx5/main.c | 19 ++++++++++---------
 2 files changed, 16 insertions(+), 14 deletions(-)

Comments

Yishai Hadas Jan. 2, 2023, 5:10 p.m. UTC | #1
On 01/01/2023 1:39, kernel test robot wrote:
> Hi Yishai,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on v6.2-rc1 next-20221226]
> [cannot apply to awilliam-vfio/next awilliam-vfio/for-linus]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Yishai-Hadas/vfio-mlx5-Fix-UBSAN-note/20221229-181908
> patch link:    https://lore.kernel.org/r/20221229100734.224388-3-yishaih%40nvidia.com
> patch subject: [PATCH vfio 2/6] vfio/mlx5: Allow loading of larger images than 512 MB
> config: csky-allyesconfig
> compiler: csky-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>          chmod +x ~/bin/make.cross
>          # https://github.com/intel-lab-lkp/linux/commit/88b79e8face73c38db6ca0d13fd1c22a0fb4a818
>          git remote add linux-review https://github.com/intel-lab-lkp/linux
>          git fetch --no-tags linux-review Yishai-Hadas/vfio-mlx5-Fix-UBSAN-note/20221229-181908
>          git checkout 88b79e8face73c38db6ca0d13fd1c22a0fb4a818
>          # save the config file
>          mkdir build_dir && cp config build_dir/.config
>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky olddefconfig
>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=csky SHELL=/bin/bash drivers/block/rnbd/ drivers/vfio/pci/mlx5/
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>     In file included from include/linux/bits.h:6,
>                      from include/linux/ratelimit_types.h:5,
>                      from include/linux/ratelimit.h:5,
>                      from include/linux/dev_printk.h:16,
>                      from include/linux/device.h:15,
>                      from drivers/vfio/pci/mlx5/main.c:6:
>     drivers/vfio/pci/mlx5/main.c: In function 'mlx5vf_resume_read_image_no_header':
>>> include/vdso/bits.h:7:40: warning: left shift count >= width of type [-Wshift-count-overflow]
>         7 | #define BIT(nr)                 (UL(1) << (nr))
>           |                                        ^~
>     drivers/vfio/pci/mlx5/main.c:25:24: note: in expansion of macro 'BIT'
>        25 | #define MAX_LOAD_SIZE (BIT(__mlx5_bit_sz(load_vhca_state_in, size)) - 1)
>           |                        ^~~
>     drivers/vfio/pci/mlx5/main.c:568:32: note: in expansion of macro 'MAX_LOAD_SIZE'
>       568 |         if (requested_length > MAX_LOAD_SIZE)
>           |                                ^~~~~~~~~~~~~
>     drivers/vfio/pci/mlx5/main.c: In function 'mlx5vf_resume_read_header':
>>> include/vdso/bits.h:7:40: warning: left shift count >= width of type [-Wshift-count-overflow]
>         7 | #define BIT(nr)                 (UL(1) << (nr))
>           |                                        ^~
>     drivers/vfio/pci/mlx5/main.c:25:24: note: in expansion of macro 'BIT'
>        25 | #define MAX_LOAD_SIZE (BIT(__mlx5_bit_sz(load_vhca_state_in, size)) - 1)
>           |                        ^~~
>     drivers/vfio/pci/mlx5/main.c:652:51: note: in expansion of macro 'MAX_LOAD_SIZE'
>       652 |                 if (vhca_buf->header_image_size > MAX_LOAD_SIZE) {
>           |                                                   ^~~~~~~~~~~~~
>
>
> vim +7 include/vdso/bits.h
>
> 3945ff37d2f48d Vincenzo Frascino 2020-03-20  6
> 3945ff37d2f48d Vincenzo Frascino 2020-03-20 @7  #define BIT(nr)			(UL(1) << (nr))
> 3945ff37d2f48d Vincenzo Frascino 2020-03-20  8
>
I'll fix as part of V1 to use BIT_ULL instead of BIT.

In the meantime, I run some extra testing over 6.2 RC2 but I get wired 
OOPs once the GFP_KERNEL_ACCOUNT is used over alloc_pages_bulk_array() 
as part of this patch.

Maybe some issue that came from the merge window in the SG area, still 
not sure about the root cause.

Alex,

This wasn't the case over your vfio/next which is yet over 6.1.

So, let's hold on here, once I'll have a working series over 6.2 may 
re-post V1.

Thanks,
Yishai
diff mbox series

Patch

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index c5dcddbc4126..0586f09c69af 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -373,7 +373,7 @@  mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf,
 	struct mlx5_vhca_data_buffer *buf;
 	int ret;
 
-	buf = kzalloc(sizeof(*buf), GFP_KERNEL);
+	buf = kzalloc(sizeof(*buf), GFP_KERNEL_ACCOUNT);
 	if (!buf)
 		return ERR_PTR(-ENOMEM);
 
@@ -1032,7 +1032,7 @@  mlx5vf_create_rc_qp(struct mlx5_core_dev *mdev,
 	void *in;
 	int err;
 
-	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL_ACCOUNT);
 	if (!qp)
 		return ERR_PTR(-ENOMEM);
 
@@ -1213,12 +1213,13 @@  static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf,
 	int i;
 
 	recv_buf->page_list = kvcalloc(npages, sizeof(*recv_buf->page_list),
-				       GFP_KERNEL);
+				       GFP_KERNEL_ACCOUNT);
 	if (!recv_buf->page_list)
 		return -ENOMEM;
 
 	for (;;) {
-		filled = alloc_pages_bulk_array(GFP_KERNEL, npages - done,
+		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT,
+						npages - done,
 						recv_buf->page_list + done);
 		if (!filled)
 			goto err;
@@ -1248,7 +1249,7 @@  static int register_dma_recv_pages(struct mlx5_core_dev *mdev,
 
 	recv_buf->dma_addrs = kvcalloc(recv_buf->npages,
 				       sizeof(*recv_buf->dma_addrs),
-				       GFP_KERNEL);
+				       GFP_KERNEL_ACCOUNT);
 	if (!recv_buf->dma_addrs)
 		return -ENOMEM;
 
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index 9feb89c6d939..79de38931d24 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -21,8 +21,8 @@ 
 
 #include "cmd.h"
 
-/* Arbitrary to prevent userspace from consuming endless memory */
-#define MAX_MIGRATION_SIZE (512*1024*1024)
+/* Device specification max LOAD size */
+#define MAX_LOAD_SIZE (BIT(__mlx5_bit_sz(load_vhca_state_in, size)) - 1)
 
 static struct mlx5vf_pci_core_device *mlx5vf_drvdata(struct pci_dev *pdev)
 {
@@ -73,12 +73,13 @@  int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf,
 	int ret;
 
 	to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*page_list));
-	page_list = kvzalloc(to_fill * sizeof(*page_list), GFP_KERNEL);
+	page_list = kvzalloc(to_fill * sizeof(*page_list), GFP_KERNEL_ACCOUNT);
 	if (!page_list)
 		return -ENOMEM;
 
 	do {
-		filled = alloc_pages_bulk_array(GFP_KERNEL, to_fill, page_list);
+		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_fill,
+						page_list);
 		if (!filled) {
 			ret = -ENOMEM;
 			goto err;
@@ -87,7 +88,7 @@  int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf,
 		ret = sg_alloc_append_table_from_pages(
 			&buf->table, page_list, filled, 0,
 			filled << PAGE_SHIFT, UINT_MAX, SG_MAX_SINGLE_ALLOC,
-			GFP_KERNEL);
+			GFP_KERNEL_ACCOUNT);
 
 		if (ret)
 			goto err;
@@ -467,7 +468,7 @@  mlx5vf_pci_save_device_data(struct mlx5vf_pci_core_device *mvdev, bool track)
 	size_t length;
 	int ret;
 
-	migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+	migf = kzalloc(sizeof(*migf), GFP_KERNEL_ACCOUNT);
 	if (!migf)
 		return ERR_PTR(-ENOMEM);
 
@@ -564,7 +565,7 @@  mlx5vf_resume_read_image_no_header(struct mlx5_vhca_data_buffer *vhca_buf,
 {
 	int ret;
 
-	if (requested_length > MAX_MIGRATION_SIZE)
+	if (requested_length > MAX_LOAD_SIZE)
 		return -ENOMEM;
 
 	if (vhca_buf->allocated_length < requested_length) {
@@ -648,7 +649,7 @@  mlx5vf_resume_read_header(struct mlx5_vf_migration_file *migf,
 		u64 flags;
 
 		vhca_buf->header_image_size = le64_to_cpup((__le64 *)to_buff);
-		if (vhca_buf->header_image_size > MAX_MIGRATION_SIZE) {
+		if (vhca_buf->header_image_size > MAX_LOAD_SIZE) {
 			ret = -ENOMEM;
 			goto end;
 		}
@@ -781,7 +782,7 @@  mlx5vf_pci_resume_device_data(struct mlx5vf_pci_core_device *mvdev)
 	struct mlx5_vhca_data_buffer *buf;
 	int ret;
 
-	migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+	migf = kzalloc(sizeof(*migf), GFP_KERNEL_ACCOUNT);
 	if (!migf)
 		return ERR_PTR(-ENOMEM);