
[v2,1/3] slab: make check_object() more consistent

Message ID 20240605-b4-slab-debug-v2-1-c535b9cd361c@linux.dev (mailing list archive)
Series slab: fix and cleanup of slub_debug

Commit Message

Chengming Zhou June 5, 2024, 7:13 a.m. UTC
Now check_object() calls check_bytes_and_report() multiple times to
check every section of the object it cares about: the left and right
redzones, the object poison, the padding poison and the free pointer.
It aborts the checking process and returns 0 as soon as it finds an
error.

There are two inconsistencies in check_object(): the alignment padding
check and the object padding check only print their error messages,
but do not return 0 to tell the callers that something is wrong and
needs to be handled. Please see alloc_debug_processing() and
free_debug_processing() for details.

If the above inconsistencies are not intentional, we should fix them.
And since we want to run all the checks without skipping any, use a
local variable "ret" to record the result of each check, and change
check_bytes_and_report() to only report the specific finding. Then, at
the end of check_object(), print the trailer once if any check found
an error.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
---
 mm/slub.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)
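
In short, the patch restructures check_object() along the following lines.
This is only an illustrative, self-contained sketch with made-up stand-in
checks, not the actual mm/slub.c code:

	#include <stdbool.h>
	#include <stdio.h>

	/* Stand-ins for the real redzone/poison/padding checks. */
	static bool check_left_redzone(void)  { return true; }
	static bool check_right_redzone(void) { return true; }
	static bool check_padding(void)       { return false; }	/* pretend this one fails */

	static int check_object_sketch(void)
	{
		int ret = 1;			/* assume the object is fine */

		/* Run every check; record failures instead of returning early. */
		if (!check_left_redzone())
			ret = 0;
		if (!check_right_redzone())
			ret = 0;
		if (!check_padding())
			ret = 0;

		/* Report once, after all checks have run. */
		if (!ret)
			printf("object corrupt, printing trailer once\n");

		return ret;		/* 0 tells the caller to handle the corruption */
	}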

Comments

Vlastimil Babka June 6, 2024, 8:28 a.m. UTC | #1
On 6/5/24 9:13 AM, Chengming Zhou wrote:
> Now check_object() calls check_bytes_and_report() multiple times to
> check every section of the object it cares about, like left and right
> redzones, object poison, paddings poison and freepointer. It will
> abort the checking process and return 0 once it finds an error.
> 
> There are two inconsistencies in check_object(), which are alignment
> padding checking and object padding checking. We only print the error
> messages but don't return 0 to tell callers that something is wrong
> and needs to be handled. Please see alloc_debug_processing() and
> free_debug_processing() for details.
> 
> If the above inconsistencies are not intentional, we should fix it.

It doesn't seem intentional; I don't see why the padding checks specifically
would be different from the other tests here.

<snip>

> -	if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE)
> -		/*
> -		 * Object and freepointer overlap. Cannot check
> -		 * freepointer while object is allocated.
> -		 */
> -		return 1;
> -
> -	/* Check free pointer validity */
> -	if (!check_valid_pointer(s, slab, get_freepointer(s, p))) {
> +	/*
> +	 * Cannot check freepointer while object is allocated if
> +	 * object and freepointer overlap.
> +	 */
> +	if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE &&

Seems this condition should have been logically flipped?
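
I.e., presumably the freepointer should only be validated when it does not
overlap a currently allocated object, something like:

	if ((freeptr_outside_object(s) || val != SLUB_RED_ACTIVE) &&
	    !check_valid_pointer(s, slab, get_freepointer(s, p))) {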

> +	    !check_valid_pointer(s, slab, get_freepointer(s, p))) {
>  		object_err(s, slab, p, "Freepointer corrupt");
>  		/*
>  		 * No choice but to zap it and thus lose the remainder
> @@ -1370,9 +1368,14 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
>  		 * another error because the object count is now wrong.
>  		 */
>  		set_freepointer(s, p, NULL);
> -		return 0;

Should set ret = 0 here?
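
Something like the following, so the caller still learns the object was
corrupt even though the freepointer has been zapped:

		set_freepointer(s, p, NULL);
		ret = 0;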

>  	}
> -	return 1;
> +
> +	if (!ret && !slab_add_kunit_errors()) {

Also, 5/6 of the slub_kunit tests now fail, because we increased the number of
recorded errors vs. what the tests expect. Either the slab_add_kunit_errors()
test above should get a variant (a parameter?) that only detects that we are
in a slab kunit test (to suppress the printing and taint) but doesn't increase
slab_errors (we already increased it for the individual issues), or simply
raise the expectations of the tests so they match the new implementation.

Thanks,
Vlastimil

> +		print_trailer(s, slab, object);
> +		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> +	}
> +
> +	return ret;
>  }
>  
>  static int check_slab(struct kmem_cache *s, struct slab *slab)
>
Chengming Zhou June 7, 2024, 7:26 a.m. UTC | #2
On 2024/6/6 16:28, Vlastimil Babka wrote:
> On 6/5/24 9:13 AM, Chengming Zhou wrote:
>> Now check_object() calls check_bytes_and_report() multiple times to
>> check every section of the object it cares about, like left and right
>> redzones, object poison, paddings poison and freepointer. It will
>> abort the checking process and return 0 once it finds an error.
>>
[...]
>> -	/* Check free pointer validity */
>> -	if (!check_valid_pointer(s, slab, get_freepointer(s, p))) {
>> +	/*
>> +	 * Cannot check freepointer while object is allocated if
>> +	 * object and freepointer overlap.
>> +	 */
>> +	if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE &&
> 
> Seems this condition should have been logically flipped?

Ah, right, will fix.

> 
>> +	    !check_valid_pointer(s, slab, get_freepointer(s, p))) {
>>  		object_err(s, slab, p, "Freepointer corrupt");
>>  		/*
>>  		 * No choice but to zap it and thus lose the remainder
>> @@ -1370,9 +1368,14 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
>>  		 * another error because the object count is now wrong.
>>  		 */
>>  		set_freepointer(s, p, NULL);
>> -		return 0;
> 
> Should set ret = 0 here?

Yes.

> 
>>  	}
>> -	return 1;
>> +
>> +	if (!ret && !slab_add_kunit_errors()) {
> 
> Also 5/6 of slub_kunit tests now fail as we increased the number of recorded

My bad, I didn't test with slub_kunit; I will test it later.

> errors vs expected. Either the slab_add_kunit_errors() test above should
> have a variant (parameter?) so it will only detect we are in slab-kunit test
> (to suppress the printing and taint) but doesn't increase slab_errors (we

I think this way is simpler for me: only suppress the printing, but don't
increase slab_errors. I will take this approach and test again.
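
Maybe something along these lines, next to slab_add_kunit_errors() under the
same kunit #ifdefs. Just a rough sketch for now; the helper name and details
are not final:

	/*
	 * Only detect that we are running inside the slub_kunit test, so the
	 * trailer and taint can be suppressed; do not bump slab_errors here,
	 * since the individual check_bytes_and_report() calls have already
	 * counted the failures.
	 */
	static bool slab_in_kunit_test(void)
	{
		struct kunit_resource *resource;

		if (!kunit_get_current_test())
			return false;

		resource = kunit_find_named_resource(current->kunit_test,
						     "slab_errors");
		if (!resource)
			return false;

		kunit_put_resource(resource);
		return true;
	}

and then at the end of check_object():

	if (!ret && !slab_in_kunit_test()) {
		print_trailer(s, slab, object);
		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
	}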

Thanks!

> increased them for the individual issues already), or simply raise the
> expectations of the tests so it matches the new implementation.
>

Patch

diff --git a/mm/slub.c b/mm/slub.c
index 0809760cf789..7fbd5ce4320a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1192,8 +1192,6 @@  static int check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
 	pr_err("0x%p-0x%p @offset=%tu. First byte 0x%x instead of 0x%x\n",
 					fault, end - 1, fault - addr,
 					fault[0], value);
-	print_trailer(s, slab, object);
-	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 
 skip_bug_print:
 	restore_bytes(s, what, value, fault, end);
@@ -1302,15 +1300,16 @@  static int check_object(struct kmem_cache *s, struct slab *slab,
 	u8 *p = object;
 	u8 *endobject = object + s->object_size;
 	unsigned int orig_size, kasan_meta_size;
+	int ret = 1;
 
 	if (s->flags & SLAB_RED_ZONE) {
 		if (!check_bytes_and_report(s, slab, object, "Left Redzone",
 			object - s->red_left_pad, val, s->red_left_pad))
-			return 0;
+			ret = 0;
 
 		if (!check_bytes_and_report(s, slab, object, "Right Redzone",
 			endobject, val, s->inuse - s->object_size))
-			return 0;
+			ret = 0;
 
 		if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
 			orig_size = get_orig_size(s, object);
@@ -1319,14 +1318,15 @@  static int check_object(struct kmem_cache *s, struct slab *slab,
 				!check_bytes_and_report(s, slab, object,
 					"kmalloc Redzone", p + orig_size,
 					val, s->object_size - orig_size)) {
-				return 0;
+				ret = 0;
 			}
 		}
 	} else {
 		if ((s->flags & SLAB_POISON) && s->object_size < s->inuse) {
-			check_bytes_and_report(s, slab, p, "Alignment padding",
+			if (!check_bytes_and_report(s, slab, p, "Alignment padding",
 				endobject, POISON_INUSE,
-				s->inuse - s->object_size);
+				s->inuse - s->object_size))
+				ret = 0;
 		}
 	}
 
@@ -1342,27 +1342,25 @@  static int check_object(struct kmem_cache *s, struct slab *slab,
 			    !check_bytes_and_report(s, slab, p, "Poison",
 					p + kasan_meta_size, POISON_FREE,
 					s->object_size - kasan_meta_size - 1))
-				return 0;
+				ret = 0;
 			if (kasan_meta_size < s->object_size &&
 			    !check_bytes_and_report(s, slab, p, "End Poison",
 					p + s->object_size - 1, POISON_END, 1))
-				return 0;
+				ret = 0;
 		}
 		/*
 		 * check_pad_bytes cleans up on its own.
 		 */
-		check_pad_bytes(s, slab, p);
+		if (!check_pad_bytes(s, slab, p))
+			ret = 0;
 	}
 
-	if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE)
-		/*
-		 * Object and freepointer overlap. Cannot check
-		 * freepointer while object is allocated.
-		 */
-		return 1;
-
-	/* Check free pointer validity */
-	if (!check_valid_pointer(s, slab, get_freepointer(s, p))) {
+	/*
+	 * Cannot check freepointer while object is allocated if
+	 * object and freepointer overlap.
+	 */
+	if (!freeptr_outside_object(s) && val == SLUB_RED_ACTIVE &&
+	    !check_valid_pointer(s, slab, get_freepointer(s, p))) {
 		object_err(s, slab, p, "Freepointer corrupt");
 		/*
 		 * No choice but to zap it and thus lose the remainder
@@ -1370,9 +1368,14 @@  static int check_object(struct kmem_cache *s, struct slab *slab,
 		 * another error because the object count is now wrong.
 		 */
 		set_freepointer(s, p, NULL);
-		return 0;
 	}
-	return 1;
+
+	if (!ret && !slab_add_kunit_errors()) {
+		print_trailer(s, slab, object);
+		add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+	}
+
+	return ret;
 }
 
 static int check_slab(struct kmem_cache *s, struct slab *slab)