diff mbox series

[3/9] btrfs-progs: image: Fix a access-beyond-boundary bug when there are 32 online CPUs

Message ID 20190606110611.27176-4-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs-progs: image: Data dump support, restore optimization and small fixes | expand

Commit Message

Qu Wenruo June 6, 2019, 11:06 a.m. UTC
[BUG]
When there are over 32 (in my example, 35) online CPUs, btrfs-image -c9
will just hang.

[CAUSE]
Btrfs-image has a hard coded limit (32) on how many threads we can use.
For the "-t" option we do the up limit check.

But when we don't specify "-t" option and speicified "-c" option, then
btrfs-image will try to auto detect the number of online CPUs, and use
it without checking if it's over the up limit.

And for num_threads larger than the up limit, we will over write the
adjust members of metadump_struct/mdrestore_struct, corrupting
pthread_mutex_t and pthread_cond_t, causing synchronising problem.

Nowadays, with SMT/HT and higher cpu core counts, it's not hard to go
beyond 32 threads, and hit the bug.

[FIX]
Just do extra num_threads check before using the number from sysconf().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 image/main.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Su Yue June 10, 2019, 1:23 a.m. UTC | #1
On 2019/6/6 7:06 PM, Qu Wenruo wrote:
> [BUG]
> When there are over 32 (in my example, 35) online CPUs, btrfs-image -c9
> will just hang.
>
> [CAUSE]
> Btrfs-image has a hard coded limit (32) on how many threads we can use.
> For the "-t" option we do the up limit check.
>
> But when we don't specify "-t" option and speicified "-c" option, then
> btrfs-image will try to auto detect the number of online CPUs, and use
> it without checking if it's over the up limit.
>
> And for num_threads larger than the up limit, we will over write the
> adjust members of metadump_struct/mdrestore_struct, corrupting
> pthread_mutex_t and pthread_cond_t, causing synchronising problem.
>
> Nowadays, with SMT/HT and higher cpu core counts, it's not hard to go
> beyond 32 threads, and hit the bug.
>
> [FIX]
> Just do extra num_threads check before using the number from sysconf().
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

This does fix an issue.
And as the commit says, why limit the max threads to 32?
Does it still make sense in nowadays multiple cores CPU?
Can we increase the limit?
However, this is another story.

For this patch:
Reviewed-by: Su Yue <Damenly_Su@gmx.com>

> ---
>   image/main.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/image/main.c b/image/main.c
> index fb9fc48c..80f09c21 100644
> --- a/image/main.c
> +++ b/image/main.c
> @@ -2758,6 +2758,7 @@ int main(int argc, char *argv[])
>
>   			if (tmp <= 0)
>   				tmp = 1;
> +			tmp = min_t(long, tmp, MAX_WORKER_THREADS);
>   			num_threads = tmp;
>   		}
>   	} else {
>
Qu Wenruo June 10, 2019, 1:28 a.m. UTC | #2
On 2019/6/10 上午9:23, Su Yue wrote:
> 
> 
> On 2019/6/6 7:06 PM, Qu Wenruo wrote:
>> [BUG]
>> When there are over 32 (in my example, 35) online CPUs, btrfs-image -c9
>> will just hang.
>>
>> [CAUSE]
>> Btrfs-image has a hard coded limit (32) on how many threads we can use.
>> For the "-t" option we do the up limit check.
>>
>> But when we don't specify "-t" option and speicified "-c" option, then
>> btrfs-image will try to auto detect the number of online CPUs, and use
>> it without checking if it's over the up limit.
>>
>> And for num_threads larger than the up limit, we will over write the
>> adjust members of metadump_struct/mdrestore_struct, corrupting
>> pthread_mutex_t and pthread_cond_t, causing synchronising problem.
>>
>> Nowadays, with SMT/HT and higher cpu core counts, it's not hard to go
>> beyond 32 threads, and hit the bug.
>>
>> [FIX]
>> Just do extra num_threads check before using the number from sysconf().
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> 
> This does fix an issue.
> And as the commit says, why limit the max threads to 32?

That's completely due to the hard coded metadump_struct.
We can switch to dynamically allocated pthread_t. Shouldn't be that hard
to convert.

> Does it still make sense in nowadays multiple cores CPU?

Well, thanks to the slow improvement caused by monopoly (cough, cough,
Intel), after one decade we're finally getting mainstream
16core/32threads CPUs in this year.

Personally speaking 32 threads should be already good enough for such a
less-common used tools.

So I'd prefer to keep the hard-coded limit for a while.

Thanks,
Qu

> Can we increase the limit?
> However, this is another story.
> 
> For this patch:
> Reviewed-by: Su Yue <Damenly_Su@gmx.com>
> 
>> ---
>>   image/main.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/image/main.c b/image/main.c
>> index fb9fc48c..80f09c21 100644
>> --- a/image/main.c
>> +++ b/image/main.c
>> @@ -2758,6 +2758,7 @@ int main(int argc, char *argv[])
>>
>>               if (tmp <= 0)
>>                   tmp = 1;
>> +            tmp = min_t(long, tmp, MAX_WORKER_THREADS);
>>               num_threads = tmp;
>>           }
>>       } else {
>>
> 
>
diff mbox series

Patch

diff --git a/image/main.c b/image/main.c
index fb9fc48c..80f09c21 100644
--- a/image/main.c
+++ b/image/main.c
@@ -2758,6 +2758,7 @@  int main(int argc, char *argv[])
 
 			if (tmp <= 0)
 				tmp = 1;
+			tmp = min_t(long, tmp, MAX_WORKER_THREADS);
 			num_threads = tmp;
 		}
 	} else {