[i-g-t,5/8] tests/i915/gem_exec_capture: Check for memory allocation failure

Message ID 20211021234044.3071069-6-John.C.Harrison@Intel.com (mailing list archive)
State New, archived
Series Fixes for gem_exec_capture

Commit Message

John Harrison Oct. 21, 2021, 11:40 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

The sysfs file read helper does not actually report any errors if a
realloc fails. It just silently returns a 'valid' but truncated
buffer. This then leads to the decode of the buffer failing in random
ways. So, add a check for ENOMEM being generated during the read.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 tests/i915/gem_exec_capture.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Matthew Brost Oct. 29, 2021, 2:20 a.m. UTC | #1
On Thu, Oct 21, 2021 at 04:40:41PM -0700, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The sysfs file read helper does not actually report any errors if a
> realloc fails. It just silently returns a 'valid' but truncated
> buffer. This then leads to the decode of the buffer failing in random
> ways. So, add a check for ENOMEM being generated during the read.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  tests/i915/gem_exec_capture.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tests/i915/gem_exec_capture.c b/tests/i915/gem_exec_capture.c
> index e373d24ed..8997125ee 100644
> --- a/tests/i915/gem_exec_capture.c
> +++ b/tests/i915/gem_exec_capture.c
> @@ -131,9 +131,11 @@ static int check_error_state(int dir, struct offset *obj_offsets, int obj_count,
>  	char *error, *str;
>  	int blobs = 0;
>  
> +	errno = 0;
>  	error = igt_sysfs_get(dir, "error");
>  	igt_sysfs_set(dir, "error", "Begone!");
>  	igt_assert(error);
> +	igt_assert(errno != ENOMEM);
>  	igt_debug("%s\n", error);
>  
>  	/* render ring --- user = 0x00000000 ffffd000 */
> -- 
> 2.25.1
>
Tvrtko Ursulin Nov. 3, 2021, 2 p.m. UTC | #2
On 22/10/2021 00:40, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The sysfs file read helper does not actually report any errors if a
> realloc fails. It just silently returns a 'valid' but truncated
> buffer. This then leads to the decode of the buffer failing in random
> ways. So, add a check for ENOMEM being generated during the read.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   tests/i915/gem_exec_capture.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/tests/i915/gem_exec_capture.c b/tests/i915/gem_exec_capture.c
> index e373d24ed..8997125ee 100644
> --- a/tests/i915/gem_exec_capture.c
> +++ b/tests/i915/gem_exec_capture.c
> @@ -131,9 +131,11 @@ static int check_error_state(int dir, struct offset *obj_offsets, int obj_count,
>   	char *error, *str;
>   	int blobs = 0;
>   
> +	errno = 0;
>   	error = igt_sysfs_get(dir, "error");
>   	igt_sysfs_set(dir, "error", "Begone!");
>   	igt_assert(error);
> +	igt_assert(errno != ENOMEM);

igt_sysfs_get:

	len = 64;
...
                 newbuf = realloc(buf, 2*len);

Maybe the problem is that the doubling gets out of hand. How big are 
your buffers? Perhaps you could instead improve the library function to 
grow less aggressively.

And at the same time perhaps the bug is this:

                 if (igt_debug_on(!newbuf))
                         break;
...
         return buf;

So failures to grow the buffer are ignored, while failures to allocate 
the initial one are not. Perhaps both should return NULL, and then 
callers would not be surprised.

Or do you think someone relies on this current odd behaviour?

Regards,

Tvrtko

>   	igt_debug("%s\n", error);
>   
>   	/* render ring --- user = 0x00000000 ffffd000 */
>
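
The suggestion above (return NULL when the buffer cannot be grown, just as when the initial allocation fails) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the IGT implementation: read_all_or_null() is a hypothetical name, it uses plain read() rather than the library's own read wrapper, and it keeps the 64-byte start and the doubling shown in the excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the IGT implementation: read everything from fd
 * into a NUL-terminated heap buffer. Any failure, including a failed
 * realloc() while growing, frees the partial buffer and returns NULL.
 */
static char *read_all_or_null(int fd)
{
	size_t len = 64, used = 0;
	char *buf;

	assert(fd >= 0);

	buf = malloc(len);
	if (!buf)
		return NULL;

	for (;;) {
		/* Always leave one byte free for the terminating NUL. */
		ssize_t ret = read(fd, buf + used, len - used - 1);

		if (ret < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			return NULL;
		}
		if (ret == 0)
			break;	/* end of file */

		used += ret;
		if (used == len - 1) {
			/* Buffer full: double it, and fail hard if that fails. */
			char *newbuf = realloc(buf, 2 * len);

			if (!newbuf) {
				free(buf);
				return NULL;
			}
			buf = newbuf;
			len *= 2;
		}
	}

	buf[used] = '\0';
	return buf;
}
```

With this shape a caller only ever sees a complete buffer or NULL, so an existing NULL check would be enough to catch the out-of-memory case. Whether any current caller depends on receiving a truncated buffer is not something this sketch answers.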
John Harrison Nov. 3, 2021, 6:36 p.m. UTC | #3
On 11/3/2021 07:00, Tvrtko Ursulin wrote:
> On 22/10/2021 00:40, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The sysfs file read helper does not actually report any errors if a
>> realloc fails. It just silently returns a 'valid' but truncated
>> buffer. This then leads to the decode of the buffer failing in random
>> ways. So, add a check for ENOMEM being generated during the read.
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   tests/i915/gem_exec_capture.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/tests/i915/gem_exec_capture.c b/tests/i915/gem_exec_capture.c
>> index e373d24ed..8997125ee 100644
>> --- a/tests/i915/gem_exec_capture.c
>> +++ b/tests/i915/gem_exec_capture.c
>> @@ -131,9 +131,11 @@ static int check_error_state(int dir, struct offset *obj_offsets, int obj_count,
>>  	char *error, *str;
>>  	int blobs = 0;
>>  
>> +	errno = 0;
>>  	error = igt_sysfs_get(dir, "error");
>>  	igt_sysfs_set(dir, "error", "Begone!");
>>  	igt_assert(error);
>> +	igt_assert(errno != ENOMEM);
>
> igt_sysfs_get:
>
>     len = 64;
> ...
>                 newbuf = realloc(buf, 2*len);
>
> Maybe the problem is that the doubling gets out of hand. How big are 
> your buffers? Perhaps you could instead improve the library function 
> to grow less aggressively.
The buffers generally end up at 2GB in size, with the capture being 
about 1.8GB (on the particular system I happen to be testing on).

I considered various options such as doubling until a given size and 
then just incrementing by fixed amounts. But where do you draw the line? 
1MB, 128MB, 1GB, 128GB? If the final result needs to be 128GB (which you 
cannot know until you have finished reading and resizing) and you are 
allocating in 1MB chunks then it is going to take a very long time to 
get there. I ended up leaving it as a straight double on the grounds 
that it is the best compromise between overallocation and taking 
ridiculous numbers of steps.
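
To put rough numbers on that trade-off, here is a small sketch that counts how many reallocations each strategy needs. The function names are made up for illustration, the 64-byte starting size is taken from the igt_sysfs_get excerpt quoted above, and "1.8GB" is approximated as 1800 MiB.

```c
#include <assert.h>
#include <stdint.h>

/* Number of reallocations needed to grow from `start` bytes to at least
 * `target` bytes when the size doubles each time. */
static unsigned int steps_doubling(uint64_t start, uint64_t target)
{
	unsigned int steps = 0;

	assert(start > 0);
	while (start < target) {
		start *= 2;
		steps++;
	}
	return steps;
}

/* Same, but growing by a fixed `chunk` bytes each time. */
static uint64_t steps_fixed(uint64_t start, uint64_t target, uint64_t chunk)
{
	assert(chunk > 0);
	if (start >= target)
		return 0;
	/* Round up: a partial chunk still costs one reallocation. */
	return (target - start + chunk - 1) / chunk;
}
```

Under these assumptions, steps_doubling(64, 1800ULL << 20) is 25 and ends at a 2 GiB buffer, while steps_fixed(64, 1800ULL << 20, 1ULL << 20) is 1800. For a 128 GiB result the counts are 31 versus 131072. The price of doubling is that the final buffer can be almost twice the size of the data.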



>
> And at the same time perhaps the bug is this:
>
>                 if (igt_debug_on(!newbuf))
>                         break;
> ...
>         return buf;
>
> So failures to grow the buffer are ignored, while failures to allocate 
> the initial one are not. Perhaps both should return NULL, and then 
> callers would not be surprised.
>
> Or do you think someone relies on this current odd behaviour?
>
As per the commit description, this is exactly the problem. However, I 
do not know for certain that this is not intentional behaviour that 
something somewhere relies on, and I really do not have the time to 
audit it. The vast majority of uses read teeny tiny files and don't 
care, but who knows what some particular test/config/platform/etc. 
might depend on. The fact that it explicitly says 'igt_debug_on' means 
that someone must have made a conscious decision not to assert. It's 
not like they just forgot to check for NULL being returned, which 
implies it is intentional and required.

John.


> Regards,
>
> Tvrtko
>
>>  	igt_debug("%s\n", error);
>>  
>>  	/* render ring --- user = 0x00000000 ffffd000 */
>>
Patch

diff --git a/tests/i915/gem_exec_capture.c b/tests/i915/gem_exec_capture.c
index e373d24ed..8997125ee 100644
--- a/tests/i915/gem_exec_capture.c
+++ b/tests/i915/gem_exec_capture.c
@@ -131,9 +131,11 @@  static int check_error_state(int dir, struct offset *obj_offsets, int obj_count,
 	char *error, *str;
 	int blobs = 0;
 
+	errno = 0;
 	error = igt_sysfs_get(dir, "error");
 	igt_sysfs_set(dir, "error", "Begone!");
 	igt_assert(error);
+	igt_assert(errno != ENOMEM);
 	igt_debug("%s\n", error);
 
 	/* render ring --- user = 0x00000000 ffffd000 */
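
The `errno = 0` line in the patch matters because library calls set errno only on failure and never clear it on success, so a stale ENOMEM from earlier code could otherwise trip the new assert. A minimal standalone sketch of the pattern follows. Both functions are hypothetical stand-ins: get_maybe_truncated() only simulates a helper that hides a failed realloc() and is not igt_sysfs_get itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a helper that hides allocation failures: it
 * returns a non-NULL buffer even if growing it failed, leaving
 * errno == ENOMEM as the only trace. `fail_grow` simulates that failure.
 */
static char *get_maybe_truncated(bool fail_grow)
{
	char *buf = calloc(1, 64);

	if (buf && fail_grow)
		errno = ENOMEM;	/* what a failed realloc() would leave behind */
	return buf;
}

/* True only if a buffer came back and no ENOMEM was raised by the call. */
static bool read_was_complete(bool fail_grow)
{
	char *buf;
	bool ok;

	errno = 0;	/* discard any stale value from earlier calls */
	buf = get_maybe_truncated(fail_grow);
	ok = buf && errno != ENOMEM;	/* evaluate before free() can touch errno */
	free(buf);
	return ok;
}
```

Without the reset, read_was_complete(false) could report a failure that happened long before the read.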