[v2,0/6] rds: rdma: Add ability to force GFP_NOIO

Message ID 20240515125342.1069999-1-haakon.bugge@oracle.com (mailing list archive)

Message

Haakon Bugge May 15, 2024, 12:53 p.m. UTC
This series enables RDS and the RDMA stack to be used as a block I/O
device. This is to support a filesystem on top of a raw block device
which uses RDS and the RDMA stack as the network transport layer.

Under intense memory pressure, we get memory reclaims. Assume the
filesystem reclaims memory, goes to the raw block device, which calls
into RDS, which calls the RDMA stack. Now, if regular GFP_KERNEL
allocations in RDS or the RDMA stack require reclaims to be fulfilled,
we end up in a circular dependency.

We break this circular dependency by:

1. Forcing all allocations in RDS and the relevant RDMA stack to use
   GFP_NOIO, by bracketing all relevant entry points with
   memalloc_noio_{save,restore}.

2. Making sure work-queues inherit current->flags
   wrt. PF_MEMALLOC_{NOIO,NOFS}, such that work executed on the
   work-queue runs with the same flag(s).
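
The two steps above can be illustrated with a small userspace model.
This is only a sketch and not code from the series: every model_* name
and every flag value is a hypothetical stand-in, and task_flags plays
the role of current->flags. It assumes that an active NOIO scope strips
the I/O and FS bits from each allocation mask, and that a work item
records the NOIO bit when queued and re-applies it while it runs.

```c
#include <assert.h>

/* Userspace model only. Names and values are illustrative stand-ins,
 * not the kernel's definitions. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO    (MODEL_GFP_RECLAIM)
#define MODEL_PF_NOIO     0x1u

static unsigned int task_flags;          /* stands in for current->flags */

/* Enter a NOIO scope; return the previous state of the NOIO bit. */
static unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOIO;

	task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Leave a NOIO scope, putting the NOIO bit back to its saved state. */
static void model_noio_restore(unsigned int old)
{
	assert((old & ~MODEL_PF_NOIO) == 0);
	task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* Mask an allocation would effectively use in the current scope. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Step 1: a hypothetical entry point bracketed by save/restore. */
static unsigned int model_entry_point(void)
{
	unsigned int noio = model_noio_save();
	unsigned int gfp = model_effective_gfp(MODEL_GFP_KERNEL);

	model_noio_restore(noio);
	return gfp;
}

/* Step 2: a work item that remembers the submitter's NOIO bit. */
struct model_work {
	unsigned int inherited;
	unsigned int (*fn)(void);
};

static void model_queue_work(struct model_work *w, unsigned int (*fn)(void))
{
	w->fn = fn;
	w->inherited = task_flags & MODEL_PF_NOIO;
}

/* The "worker" applies the inherited bit only while the work runs. */
static unsigned int model_run_work(struct model_work *w)
{
	unsigned int saved = task_flags;
	unsigned int ret;

	task_flags |= w->inherited;
	ret = w->fn();
	task_flags = saved;
	return ret;
}

/* Work function that "allocates" with the model's GFP_KERNEL mask. */
static unsigned int model_alloc_work(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}
```

In this model, a GFP_KERNEL-style request made inside the bracketed
entry point, or from work queued inside a NOIO scope, is reduced to the
NOIO mask, while the same request made outside any scope is unchanged.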

Håkon Bugge (6):
  workqueue: Inherit NOIO and NOFS alloc flags
  rds: Brute force GFP_NOIO
  RDMA/cma: Brute force GFP_NOIO
  RDMA/cm: Brute force GFP_NOIO
  RDMA/mlx5: Brute force GFP_NOIO
  net/mlx5: Brute force GFP_NOIO

 drivers/infiniband/core/cm.c                  | 15 ++++-
 drivers/infiniband/core/cma.c                 | 20 ++++++-
 drivers/infiniband/hw/mlx5/main.c             | 22 +++++--
 .../net/ethernet/mellanox/mlx5/core/main.c    | 14 ++++-
 include/linux/workqueue.h                     |  2 +
 kernel/workqueue.c                            | 21 +++++++
 net/rds/af_rds.c                              | 59 ++++++++++++++++++-
 7 files changed, 141 insertions(+), 12 deletions(-)

--
2.45.0

Comments

Christoph Hellwig May 21, 2024, 2:24 p.m. UTC | #1
On Wed, May 15, 2024 at 02:53:36PM +0200, Håkon Bugge wrote:
> This series enables RDS and the RDMA stack to be used as a block I/O
> device. This to support a filesystem on top of a raw block device
> which uses RDS and the RDMA stack as the network transport layer.
> 
> Under intense memory pressure, we get memory reclaims. Assume the
> filesystem reclaims memory, goes to the raw block device, which calls
> into RDS, which calls the RDMA stack. Now, if regular GFP_KERNEL
> allocations in RDS or the RDMA stack require reclaims to be fulfilled,
> we end up in a circular dependency.

Use of network block devices or file systems from the local system
simply isn't supported in the Linux reclaim hierarchy.  Trying to
hack in through module options for code you haven't even submitted
is a complete nogo.

NAK.