[v3,0/1] xen/blkback: Squeeze page pools if a memory pressure

Message ID 20191209085839.21215-1-sjpark@amazon.com (mailing list archive)
Series xen/blkback: Squeeze page pools if a memory pressure

Message

SeongJae Park Dec. 9, 2019, 8:58 a.m. UTC
Each `blkif` has a free pages pool for grant mappings.  The size of the
pool starts from zero and is increased on demand while processing I/O
requests.  When the current I/O requests have been handled, or 100
milliseconds have passed since the last I/O requests were handled, the
pool is checked and shrunk so that it does not exceed the size limit,
`max_buffer_pages`.
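
For reference, below is a simplified sketch of that idle-time shrink.  It
is illustrative only: function and field names loosely follow
drivers/block/xen-blkback/blkback.c, but the real code batches the frees
(via gnttab_free_pages()) and protects the list with a spinlock.

static void shrink_free_pagepool(struct xen_blkif_ring *ring, int num)
{
	/* Free pages until at most 'num' remain in the pool. */
	while (ring->free_pages_num > num) {
		struct page *page = list_first_entry(&ring->free_pages,
						     struct page, lru);

		list_del(&page->lru);
		ring->free_pages_num--;
		put_page(page);	/* hand the page back to the system */
	}
}

/* Illustrative wrapper: invoked when request handling finishes, or 100
 * milliseconds after the last request was handled. */
static void purge_idle_pool(struct xen_blkif_ring *ring)
{
	shrink_free_pagepool(ring, xen_blkif_max_buffer_pages);
}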

Therefore, guests running `blkfront` can cause memory pressure in the
guest running `blkback` by attaching a large number of block devices
and inducing I/O.  System administrators can avoid such problematic
situations by limiting the maximum number of devices each guest can
attach.  However, finding the optimal limit is not easy.  An improper
limit can result in either memory pressure or resource
underutilization.  This commit avoids such problematic situations by
squeezing the pools (returning every free page in the pool to the
system) for a while (users can set this duration via a module
parameter) whenever memory pressure is detected.
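
Concretely, the idea is roughly as follows (a sketch of the approach,
not the patch itself; the identifiers below are illustrative): a
memory-pressure notification records a deadline, and the existing
idle-time shrink empties the pool completely until that deadline has
passed.

/* Illustrative sketch of the squeezing logic; names are not taken
 * from the actual patch. */
static unsigned int buffer_squeeze_duration_ms = 10;
module_param_named(buffer_squeeze_duration_ms, buffer_squeeze_duration_ms,
		   uint, 0644);
MODULE_PARM_DESC(buffer_squeeze_duration_ms,
		 "Duration in milliseconds to squeeze pools after detecting memory pressure");

static unsigned long buffer_squeeze_end;	/* deadline, in jiffies */

/* Called when memory pressure is detected (e.g. from a shrinker). */
static void blkback_squeeze_pools(void)
{
	buffer_squeeze_end = jiffies +
			msecs_to_jiffies(buffer_squeeze_duration_ms);
}

/* Limit used by the idle-time shrink: keep nothing while squeezing. */
static int blkback_pool_limit(void)
{
	if (time_before(jiffies, buffer_squeeze_end))
		return 0;
	return xen_blkif_max_buffer_pages;
}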


Base Version
------------

This patch is based on v5.4.  A complete tree is also available at my
public git repo:
https://github.com/sjp38/linux/tree/blkback_aggressive_shrinking_v3


Patch History
-------------

Changes from v2 (https://lore.kernel.org/linux-block/af195033-23d5-38ed-b73b-f6e2e3b34541@amazon.com)
 - Rename the module parameter and variables for brevity (aggressive
   shrinking -> squeezing)

Changes from v1 (https://lore.kernel.org/xen-devel/20191204113419.2298-1-sjpark@amazon.com/)
 - Adjust the description to not use the term `arbitrarily` (suggested
   by Paul Durrant)
 - Specify the time unit of the duration in the parameter description
   (suggested by Maximilian Heyne)
 - Change default aggressive shrinking duration from 1ms to 10ms
 - Merge two patches into one single patch

SeongJae Park (1):
  xen/blkback: Squeeze page pools if a memory pressure is detected

 drivers/block/xen-blkback/blkback.c | 35 +++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

Comments

Jürgen Groß Dec. 9, 2019, 9:39 a.m. UTC | #1
On 09.12.19 09:58, SeongJae Park wrote:
> Each `blkif` has a free pages pool for grant mappings.  The size of the
> pool starts from zero and is increased on demand while processing I/O
> requests.  When the current I/O requests have been handled, or 100
> milliseconds have passed since the last I/O requests were handled, the
> pool is checked and shrunk so that it does not exceed the size limit,
> `max_buffer_pages`.
> 
> Therefore, guests running `blkfront` can cause memory pressure in the
> guest running `blkback` by attaching a large number of block devices
> and inducing I/O.

I'm having problems understanding how a guest can attach a large number
of block devices without those having been configured by the host admin
beforehand.

If those devices have been configured, dom0 should be ready for that
number of devices, e.g. by having enough spare memory area for ballooned
pages.

So either I'm missing something here, or your reasoning for the need for
the patch is wrong.


Juergen
Paul Durrant Dec. 9, 2019, 9:46 a.m. UTC | #2
> -----Original Message-----
> From: Jürgen Groß <jgross@suse.com>
> Sent: 09 December 2019 09:39
> To: Park, Seongjae <sjpark@amazon.com>; axboe@kernel.dk;
> konrad.wilk@oracle.com; roger.pau@citrix.com
> Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Durrant,
> Paul <pdurrant@amazon.com>; sj38.park@gmail.com; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH v3 0/1] xen/blkback: Squeeze page pools if a memory
> pressure
> 
> On 09.12.19 09:58, SeongJae Park wrote:
> > Each `blkif` has a free pages pool for grant mappings.  The size of the
> > pool starts from zero and is increased on demand while processing I/O
> > requests.  When the current I/O requests have been handled, or 100
> > milliseconds have passed since the last I/O requests were handled, the
> > pool is checked and shrunk so that it does not exceed the size limit,
> > `max_buffer_pages`.
> >
> > Therefore, guests running `blkfront` can cause memory pressure in the
> > guest running `blkback` by attaching a large number of block devices
> > and inducing I/O.
> 
> I'm having problems understanding how a guest can attach a large number
> of block devices without those having been configured by the host admin
> beforehand.
> 
> If those devices have been configured, dom0 should be ready for that
> number of devices, e.g. by having enough spare memory area for ballooned
> pages.
> 
> So either I'm missing something here, or your reasoning for the need for
> the patch is wrong.
> 

I think the underlying issue is that persistent grant support is hogging memory in the backends, thereby compromising scalability. IIUC this patch is essentially a band-aid to get back to the scalability that was possible before persistent grant support was added. Ultimately the right answer should be to get rid of persistent grant support and use grant copy, but such a change is clearly more invasive and would need far more testing.

  Paul
Jürgen Groß Dec. 9, 2019, 10:15 a.m. UTC | #3
On 09.12.19 10:46, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Jürgen Groß <jgross@suse.com>
>> Sent: 09 December 2019 09:39
>> To: Park, Seongjae <sjpark@amazon.com>; axboe@kernel.dk;
>> konrad.wilk@oracle.com; roger.pau@citrix.com
>> Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; Durrant,
>> Paul <pdurrant@amazon.com>; sj38.park@gmail.com; xen-
>> devel@lists.xenproject.org
>> Subject: Re: [PATCH v3 0/1] xen/blkback: Squeeze page pools if a memory
>> pressure
>>
>> On 09.12.19 09:58, SeongJae Park wrote:
>>> Each `blkif` has a free pages pool for grant mappings.  The size of the
>>> pool starts from zero and is increased on demand while processing I/O
>>> requests.  When the current I/O requests have been handled, or 100
>>> milliseconds have passed since the last I/O requests were handled, the
>>> pool is checked and shrunk so that it does not exceed the size limit,
>>> `max_buffer_pages`.
>>>
>>> Therefore, guests running `blkfront` can cause memory pressure in the
>>> guest running `blkback` by attaching a large number of block devices
>>> and inducing I/O.
>>
>> I'm having problems understanding how a guest can attach a large number
>> of block devices without those having been configured by the host admin
>> beforehand.
>>
>> If those devices have been configured, dom0 should be ready for that
>> number of devices, e.g. by having enough spare memory area for ballooned
>> pages.
>>
>> So either I'm missing something here, or your reasoning for the need for
>> the patch is wrong.
>>
> 
> I think the underlying issue is that persistent grant support is hogging memory in the backends, thereby compromising scalability. IIUC this patch is essentially a band-aid to get back to the scalability that was possible before persistent grant support was added. Ultimately the right answer should be to get rid of persistent grant support and use grant copy, but such a change is clearly more invasive and would need far more testing.

Persistent grants hog ballooned pages, which amounts to hogging memory
only when the backend domain's memory is equal, or close, to its
maximum memory size.

So configuring the backend domain with enough spare area for ballooned
pages should make this problem much less serious.

Another problem in this area is the number of maptrack frames configured
for a driver domain, which limits the number of concurrent foreign
mappings of that domain.

So instead of having a blkback-specific solution, I'd rather have a
common callback for backends to release foreign mappings, in order to
enable global resource management.
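
A purely hypothetical sketch of what such a common interface might look
like (no such API exists today; it only illustrates the idea of a shared
reclaim callback that each backend could register):

/* Hypothetical only -- not an existing kernel or Xen interface. */
struct xenbk_reclaim_ops {
	struct list_head list;
	/*
	 * Ask the backend to release up to 'nr' foreign mappings;
	 * returns the number actually released.
	 */
	unsigned int (*release_mappings)(unsigned int nr, void *data);
	void *data;
};

int xenbk_reclaim_register(struct xenbk_reclaim_ops *ops);
void xenbk_reclaim_unregister(struct xenbk_reclaim_ops *ops);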


Juergen
SeongJae Park Dec. 9, 2019, 10:23 a.m. UTC | #4
On Mon, 9 Dec 2019 10:39:02 +0100, Juergen <jgross@suse.com> wrote:

>On 09.12.19 09:58, SeongJae Park wrote:
>> Each `blkif` has a free pages pool for grant mappings.  The size of the
>> pool starts from zero and is increased on demand while processing I/O
>> requests.  When the current I/O requests have been handled, or 100
>> milliseconds have passed since the last I/O requests were handled, the
>> pool is checked and shrunk so that it does not exceed the size limit,
>> `max_buffer_pages`.
>>
>> Therefore, guests running `blkfront` can cause memory pressure in the
>> guest running `blkback` by attaching a large number of block devices
>> and inducing I/O.
>
>I'm having problems understanding how a guest can attach a large number
>of block devices without those having been configured by the host admin
>beforehand.
>
>If those devices have been configured, dom0 should be ready for that
>number of devices, e.g. by having enough spare memory area for ballooned
>pages.

As mentioned in the original message, quoted below, administrators _can_ avoid
this problem, but finding the optimal configuration is hard, especially when
the number of guests is large.

	System administrators can avoid such problematic situations by limiting
	the maximum number of devices each guest can attach.  However, finding
	the optimal limit is not easy.  An improper limit can result in either
	memory pressure or resource underutilization.


Thanks,
SeongJae Park

>
>So either I'm missing something here, or your reasoning for the need for
>the patch is wrong.
>
>
>Juergen
>
Jürgen Groß Dec. 9, 2019, 10:29 a.m. UTC | #5
On 09.12.19 11:23, SeongJae Park wrote:
> On Mon, 9 Dec 2019 10:39:02 +0100, Juergen <jgross@suse.com> wrote:
> 
>> On 09.12.19 09:58, SeongJae Park wrote:
>>> Each `blkif` has a free pages pool for grant mappings.  The size of the
>>> pool starts from zero and is increased on demand while processing I/O
>>> requests.  When the current I/O requests have been handled, or 100
>>> milliseconds have passed since the last I/O requests were handled, the
>>> pool is checked and shrunk so that it does not exceed the size limit,
>>> `max_buffer_pages`.
>>>
>>> Therefore, guests running `blkfront` can cause memory pressure in the
>>> guest running `blkback` by attaching a large number of block devices
>>> and inducing I/O.
>>
>> I'm having problems understanding how a guest can attach a large number
>> of block devices without those having been configured by the host admin
>> beforehand.
>>
>> If those devices have been configured, dom0 should be ready for that
>> number of devices, e.g. by having enough spare memory area for ballooned
>> pages.
> 
> As mentioned in the original message, quoted below, administrators _can_ avoid
> this problem, but finding the optimal configuration is hard, especially when
> the number of guests is large.
> 
> 	System administrators can avoid such problematic situations by limiting
> 	the maximum number of devices each guest can attach.  However, finding
> 	the optimal limit is not easy.  An improper limit can result in either
> 	memory pressure or resource underutilization.

This sounds as if the admin would set a device limit. But it is the
other way round: the admin needs to configure each possible device
with all its parameters (e.g. the backing dom0 resource) before the
frontend can use it.


Juergen