diff mbox series

[2/2] dm-verity: prefetch all hash blocks in verity_ctr

Message ID 20250402070934.2387587-2-weilongping@oppo.com (mailing list archive)
State New
Headers show
Series [1/2] dm-bufio: improve the performance of __dm_bufio_prefetch | expand

Commit Message

LongPing Wei April 2, 2025, 7:09 a.m. UTC
At this time, all bios for hash blocks should eventually
be merged into a single large bio.

Signed-off-by: LongPing Wei <weilongping@oppo.com>
---
 drivers/md/dm-verity-target.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Mikulas Patocka April 2, 2025, 9:56 a.m. UTC | #1
On Wed, 2 Apr 2025, LongPing Wei wrote:

> At this time, all bios for hash blocks should eventually
> be merged into a single large bio.
> 
> Signed-off-by: LongPing Wei <weilongping@oppo.com>
> ---
>  drivers/md/dm-verity-target.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index 3c427f18a04b..813d5cfc7ffa 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -1683,6 +1683,10 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>  
>  	verity_verify_sig_opts_cleanup(&verify_args);
>  
> +	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
> +		v->hash_blocks - v->hash_start,
> +		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
> +
>  	dm_audit_log_ctr(DM_MSG_PREFIX, ti, 1);
>  
>  	return 0;
> -- 
> 2.34.1

Hi

I would move it into the "resume" callback, so that if the user 
reconfigures the device stack between "ctr" and "resume", it won't read 
the data too early.

Don't use IOPRIO_CLASS_RT, this is not real-time requirement, 
IOPRIO_CLASS_RT would slow down concurrent I/O.

Another problem with this approach is that when the verity device is big 
and system memory is small, it just causes I/O churn - new bufio blocks 
will be displacing old blocks - and it will degrade performance, not 
improve it.

Please, describe some scenario, where this prefetch actually helps. What 
is the size of the metadata that you are prefetching? What is the total 
memory size? Is there any benchmark that shows the advantage of this 
patch?

Mikulas
LongPing Wei April 2, 2025, 10:35 a.m. UTC | #2
On 2025/4/2 17:56, Mikulas Patocka wrote:
> 
> 
> On Wed, 2 Apr 2025, LongPing Wei wrote:
> 
>> At this time, all bios for hash blocks should eventually
>> be merged into a single large bio.
>>
>> Signed-off-by: LongPing Wei <weilongping@oppo.com>
>> ---
>>   drivers/md/dm-verity-target.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
>> index 3c427f18a04b..813d5cfc7ffa 100644
>> --- a/drivers/md/dm-verity-target.c
>> +++ b/drivers/md/dm-verity-target.c
>> @@ -1683,6 +1683,10 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>>   
>>   	verity_verify_sig_opts_cleanup(&verify_args);
>>   
>> +	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
>> +		v->hash_blocks - v->hash_start,
>> +		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
>> +
>>   	dm_audit_log_ctr(DM_MSG_PREFIX, ti, 1);
>>   
>>   	return 0;
>> -- 
>> 2.34.1
> 
> Hi
> 
> I would move it into the "resume" callback, so that if the user
> reconfigures the device stack between "ctr" and "resume", it won't read
> the data too early.
> 
> Don't use IOPRIO_CLASS_RT, this is not real-time requirement,
> IOPRIO_CLASS_RT would slow down concurrent I/O.
> 
If the prefetch io is submitted with non-rt at first, the later dm io
need the same hash block will wait the non-rt bio.
> Another problem with this approach is that when the verity device is big
> and system memory is small, it just causes I/O churn - new bufio blocks
> will be displacing old blocks - and it will degrade performance, not
> improve it.
> 
Do we need a solution to check if the memory is enough to the prefetch?
For Android devices, the verity device should be created on the boot 
procedure.
> Please, describe some scenario, where this prefetch actually helps. What
> is the size of the metadata that you are prefetching? What is the total
> memory size? Is there any benchmark that shows the advantage of this
> patch?
The size of hash blocks for the ROM of our low-end devices is about
71MiB. I want to enhance probability of cache hit when
try_verify_in_tasklet is enabled. How about only doing the prefetch when
try_verify_in_tasklet is enabled?
For example:
	if (v->use_bh_wq)
		dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
			v->hash_blocks - v->hash_start,
			IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));

> 
> Mikulas
> 

LongPing
Mikulas Patocka April 2, 2025, 6:44 p.m. UTC | #3
On Wed, 2 Apr 2025, LongPing Wei wrote:

> On 2025/4/2 17:56, Mikulas Patocka wrote:
> > 
> > 
> > On Wed, 2 Apr 2025, LongPing Wei wrote:
> > 
> > > At this time, all bios for hash blocks should eventually
> > > be merged into a single large bio.
> > > 
> > > Signed-off-by: LongPing Wei <weilongping@oppo.com>
> > > ---
> > >   drivers/md/dm-verity-target.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> > > index 3c427f18a04b..813d5cfc7ffa 100644
> > > --- a/drivers/md/dm-verity-target.c
> > > +++ b/drivers/md/dm-verity-target.c
> > > @@ -1683,6 +1683,10 @@ static int verity_ctr(struct dm_target *ti,
> > > unsigned int argc, char **argv)
> > >     	verity_verify_sig_opts_cleanup(&verify_args);
> > >   +	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
> > > +		v->hash_blocks - v->hash_start,
> > > +		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
> > > +
> > >   	dm_audit_log_ctr(DM_MSG_PREFIX, ti, 1);
> > >     	return 0;
> > > -- 
> > > 2.34.1
> > 
> > Hi
> > 
> > I would move it into the "resume" callback, so that if the user
> > reconfigures the device stack between "ctr" and "resume", it won't read
> > the data too early.
> > 
> > Don't use IOPRIO_CLASS_RT, this is not real-time requirement,
> > IOPRIO_CLASS_RT would slow down concurrent I/O.
> > 
> If the prefetch io is submitted with non-rt at first, the later dm io
> need the same hash block will wait the non-rt bio.

Submitting large I/O with IOPRIO_CLASS_RT will block every other task that 
does some I/O, so I can't do that. It needs to be changed to 
IOPRIO_CLASS_NONE or IOPRIO_CLASS_IDLE.

> > Another problem with this approach is that when the verity device is big
> > and system memory is small, it just causes I/O churn - new bufio blocks
> > will be displacing old blocks - and it will degrade performance, not
> > improve it.
> > 
> Do we need a solution to check if the memory is enough to the prefetch?

Yes.

> For Android devices, the verity device should be created on the boot
> procedure.

> > Please, describe some scenario, where this prefetch actually helps. What
> > is the size of the metadata that you are prefetching? What is the total
> > memory size? Is there any benchmark that shows the advantage of this
> > patch?
> 
> The size of hash blocks for the ROM of our low-end devices is about
> 71MiB. I want to enhance probability of cache hit when
> try_verify_in_tasklet is enabled. How about only doing the prefetch when
> try_verify_in_tasklet is enabled?
> For example:
> 	if (v->use_bh_wq)
> 		dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
> 			v->hash_blocks - v->hash_start,
> 			IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));

I wouldn't overload the "use_bh_wq" option for that. Perhaps we could add 
a new option (that would be off by default), so that the patch won't cause 
problems to existing users.

How much does this patch improve Android boot time? So that we can decide 
whether the improvement is worth the complexity.

Mikulas
LongPing Wei April 2, 2025, 11:58 p.m. UTC | #4
On 2025/4/3 2:44, Mikulas Patocka wrote:
> 
> 
> On Wed, 2 Apr 2025, LongPing Wei wrote:
> 
>> On 2025/4/2 17:56, Mikulas Patocka wrote:
>>>
>>>
>>> On Wed, 2 Apr 2025, LongPing Wei wrote:
>>>
>>>> At this time, all bios for hash blocks should eventually
>>>> be merged into a single large bio.
>>>>
>>>> Signed-off-by: LongPing Wei <weilongping@oppo.com>
>>>> ---
>>>>    drivers/md/dm-verity-target.c | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
>>>> index 3c427f18a04b..813d5cfc7ffa 100644
>>>> --- a/drivers/md/dm-verity-target.c
>>>> +++ b/drivers/md/dm-verity-target.c
>>>> @@ -1683,6 +1683,10 @@ static int verity_ctr(struct dm_target *ti,
>>>> unsigned int argc, char **argv)
>>>>      	verity_verify_sig_opts_cleanup(&verify_args);
>>>>    +	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
>>>> +		v->hash_blocks - v->hash_start,
>>>> +		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
>>>> +
>>>>    	dm_audit_log_ctr(DM_MSG_PREFIX, ti, 1);
>>>>      	return 0;
>>>> -- 
>>>> 2.34.1
>>>
>>> Hi
>>>
>>> I would move it into the "resume" callback, so that if the user
>>> reconfigures the device stack between "ctr" and "resume", it won't read
>>> the data too early.
>>>
>>> Don't use IOPRIO_CLASS_RT, this is not real-time requirement,
>>> IOPRIO_CLASS_RT would slow down concurrent I/O.
>>>
>> If the prefetch io is submitted with non-rt at first, the later dm io
>> need the same hash block will wait the non-rt bio.
> 
> Submitting large I/O with IOPRIO_CLASS_RT will block every other task that
> does some I/O, so I can't do that. It needs to be changed to
> IOPRIO_CLASS_NONE or IOPRIO_CLASS_IDLE.
> 
>>> Another problem with this approach is that when the verity device is big
>>> and system memory is small, it just causes I/O churn - new bufio blocks
>>> will be displacing old blocks - and it will degrade performance, not
>>> improve it.
>>>
>> Do we need a solution to check if the memory is enough to the prefetch?
> 
> Yes.
> 
>> For Android devices, the verity device should be created on the boot
>> procedure.
> 
>>> Please, describe some scenario, where this prefetch actually helps. What
>>> is the size of the metadata that you are prefetching? What is the total
>>> memory size? Is there any benchmark that shows the advantage of this
>>> patch?
>>
>> The size of hash blocks for the ROM of our low-end devices is about
>> 71MiB. I want to enhance probability of cache hit when
>> try_verify_in_tasklet is enabled. How about only doing the prefetch when
>> try_verify_in_tasklet is enabled?
>> For example:
>> 	if (v->use_bh_wq)
>> 		dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
>> 			v->hash_blocks - v->hash_start,
>> 			IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
> 
> I wouldn't overload the "use_bh_wq" option for that. Perhaps we could add
> a new option (that would be off by default), so that the patch won't cause
> problems to existing users.
> 
> How much does this patch improve Android boot time? So that we can decide
> whether the improvement is worth the complexity.
> 
> Mikulas
> 

Hi, Mikulas

Can we add a parameter that is used only once in the process of ctr or
resume?

For normal boot, the improvement is not significant. But for upgrading
boot, it could be valuable to do this as dm-verity works on dm-user and
snapshotd in userspace has a large number of writes in the background.
We have met an issue of system running slowly in this situation.

If a parameter that is used only once in the process of ctr or
resume is acceptable, I want to use it only in upgrading boot mode and
the shrinker of dm-bufio should be skipped until the system is upgraded.

The goal is to reduce the number of read requests to snapuserd.

LongPing
Christoph Hellwig April 10, 2025, 8:26 a.m. UTC | #5
On Wed, Apr 02, 2025 at 03:09:36PM +0800, LongPing Wei wrote:
>  
> +	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
> +		v->hash_blocks - v->hash_start,
> +		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));

Err, no.  Random code should not prefetch with a RT priority.
Christoph Hellwig April 10, 2025, 8:28 a.m. UTC | #6
On Wed, Apr 02, 2025 at 08:44:39PM +0200, Mikulas Patocka wrote:
> Submitting large I/O with IOPRIO_CLASS_RT will block every other task that 
> does some I/O, so I can't do that. It needs to be changed to 
> IOPRIO_CLASS_NONE or IOPRIO_CLASS_IDLE.

Also random kernel code should not simply upgrade the I/O priority.
Your magic snowflake is not going to be as magic for everyone else.
diff mbox series

Patch

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 3c427f18a04b..813d5cfc7ffa 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1683,6 +1683,10 @@  static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 
 	verity_verify_sig_opts_cleanup(&verify_args);
 
+	dm_bufio_prefetch_with_ioprio(v->bufio, v->hash_start,
+		v->hash_blocks - v->hash_start,
+		IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0));
+
 	dm_audit_log_ctr(DM_MSG_PREFIX, ti, 1);
 
 	return 0;