diff mbox series

[01/10] mm: page_ref_add_unless() does not trace 'u' argument

Message ID 20211208203544.2297121-2-pasha.tatashin@soleen.com (mailing list archive)
State New
Headers show
Series Hardening page _refcount | expand

Commit Message

Pasha Tatashin Dec. 8, 2021, 8:35 p.m. UTC
In other page_ref_* functions all arguments and returns are traced, but
in page_ref_add_unless the 'u' argument which stands for unless boolean
is not traced. However, what is more confusing is that in the tracing
routine:
	__page_ref_mod_unless(struct page *page, int v, int u);

The 'u' argument present, but instead a return value is passed into
this argument.

Add a new template specific for page_ref_add_unless(), and trace all
arguments and the return value.

Fixes: 95813b8faa0c ("mm/page_ref: add tracepoint to track down page reference manipulation")

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 include/linux/page_ref.h        | 10 ++++----
 include/trace/events/page_ref.h | 43 ++++++++++++++++++++++++++++++---
 mm/debug_page_ref.c             |  8 +++---
 3 files changed, 49 insertions(+), 12 deletions(-)

Comments

Matthew Wilcox Dec. 8, 2021, 8:55 p.m. UTC | #1
On Wed, Dec 08, 2021 at 08:35:35PM +0000, Pasha Tatashin wrote:
> In other page_ref_* functions all arguments and returns are traced, but
> in page_ref_add_unless the 'u' argument which stands for unless boolean
> is not traced. However, what is more confusing is that in the tracing
> routine:
> 	__page_ref_mod_unless(struct page *page, int v, int u);
> 
> The 'u' argument present, but instead a return value is passed into
> this argument.
> 
> Add a new template specific for page_ref_add_unless(), and trace all
> arguments and the return value.

The special casing of '1' for device pages is going away, so NAK
to this user-visible change.
Pasha Tatashin Dec. 9, 2021, 1:25 a.m. UTC | #2
On Wed, Dec 8, 2021 at 3:55 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Dec 08, 2021 at 08:35:35PM +0000, Pasha Tatashin wrote:
> > In other page_ref_* functions all arguments and returns are traced, but
> > in page_ref_add_unless the 'u' argument which stands for unless boolean
> > is not traced. However, what is more confusing is that in the tracing
> > routine:
> >       __page_ref_mod_unless(struct page *page, int v, int u);
> >
> > The 'u' argument present, but instead a return value is passed into
> > this argument.
> >
> > Add a new template specific for page_ref_add_unless(), and trace all
> > arguments and the return value.
>
> The special casing of '1' for device pages is going away, so NAK
> to this user-visible change.

I can drop this patch, as it really intended to fix existing oddities
and missing info. However, I do not really understand your NAK reason.
Can you please explain about the special casing of "1" for device
pages?

Pasha
Matthew Wilcox Dec. 9, 2021, 2:17 a.m. UTC | #3
On Wed, Dec 08, 2021 at 08:25:22PM -0500, Pasha Tatashin wrote:
> On Wed, Dec 8, 2021 at 3:55 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Wed, Dec 08, 2021 at 08:35:35PM +0000, Pasha Tatashin wrote:
> > > In other page_ref_* functions all arguments and returns are traced, but
> > > in page_ref_add_unless the 'u' argument which stands for unless boolean
> > > is not traced. However, what is more confusing is that in the tracing
> > > routine:
> > >       __page_ref_mod_unless(struct page *page, int v, int u);
> > >
> > > The 'u' argument present, but instead a return value is passed into
> > > this argument.
> > >
> > > Add a new template specific for page_ref_add_unless(), and trace all
> > > arguments and the return value.
> >
> > The special casing of '1' for device pages is going away, so NAK
> > to this user-visible change.
> 
> I can drop this patch, as it really intended to fix existing oddities
> and missing info. However, I do not really understand your NAK reason.
> Can you please explain about the special casing of "1" for device
> pages?

$ git grep page_ref_add_unless
include/linux/mm.h:     return page_ref_add_unless(page, 1, 0);
include/linux/page_ref.h:static inline bool page_ref_add_unless(struct page *page, int nr, int u)
include/linux/page_ref.h:       return page_ref_add_unless(&folio->page, nr, u);
mm/memcontrol.c:                if (!page_ref_add_unless(page, 1, 1))

'u' is always 0, except for the caller in mm/memcontrol.c:

        if (is_device_private_entry(ent)) {
                page = pfn_swap_entry_to_page(ent);
                /*
                 * MEMORY_DEVICE_PRIVATE means ZONE_DEVICE page and which have
                 * a refcount of 1 when free (unlike normal page)
                 */
                if (!page_ref_add_unless(page, 1, 1))
                        return NULL;
                return page;
        }

That special casing of ZONE_DEVICE pages is being fixed, so 'u' will
soon always be 0, and I'm sure we'll delete it as an argument.  So
there's no point in tracing what it 'used to be'.
Pasha Tatashin Dec. 9, 2021, 3:15 p.m. UTC | #4
On Wed, Dec 8, 2021 at 9:17 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Dec 08, 2021 at 08:25:22PM -0500, Pasha Tatashin wrote:
> > On Wed, Dec 8, 2021 at 3:55 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Wed, Dec 08, 2021 at 08:35:35PM +0000, Pasha Tatashin wrote:
> > > > In other page_ref_* functions all arguments and returns are traced, but
> > > > in page_ref_add_unless the 'u' argument which stands for unless boolean
> > > > is not traced. However, what is more confusing is that in the tracing
> > > > routine:
> > > >       __page_ref_mod_unless(struct page *page, int v, int u);
> > > >
> > > > The 'u' argument present, but instead a return value is passed into
> > > > this argument.
> > > >
> > > > Add a new template specific for page_ref_add_unless(), and trace all
> > > > arguments and the return value.
> > >
> > > The special casing of '1' for device pages is going away, so NAK
> > > to this user-visible change.
> >
> > I can drop this patch, as it really intended to fix existing oddities
> > and missing info. However, I do not really understand your NAK reason.
> > Can you please explain about the special casing of "1" for device
> > pages?
>
> $ git grep page_ref_add_unless
> include/linux/mm.h:     return page_ref_add_unless(page, 1, 0);
> include/linux/page_ref.h:static inline bool page_ref_add_unless(struct page *page, int nr, int u)
> include/linux/page_ref.h:       return page_ref_add_unless(&folio->page, nr, u);
> mm/memcontrol.c:                if (!page_ref_add_unless(page, 1, 1))
>
> 'u' is always 0, except for the caller in mm/memcontrol.c:
>
>         if (is_device_private_entry(ent)) {
>                 page = pfn_swap_entry_to_page(ent);
>                 /*
>                  * MEMORY_DEVICE_PRIVATE means ZONE_DEVICE page and which have
>                  * a refcount of 1 when free (unlike normal page)
>                  */
>                 if (!page_ref_add_unless(page, 1, 1))
>                         return NULL;
>                 return page;
>         }
>
> That special casing of ZONE_DEVICE pages is being fixed, so 'u' will
> soon always be 0, and I'm sure we'll delete it as an argument.  So
> there's no point in tracing what it 'used to be'.

Sounds good, at the time when 'u' is deleted we can also address
inconsistencies fixed in this patch. So, I will drop it from my series
in the next version.

Thanks,
Pasha
diff mbox series

Patch

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 2e677e6ad09f..1903af5fb087 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -11,7 +11,7 @@  DECLARE_TRACEPOINT(page_ref_set);
 DECLARE_TRACEPOINT(page_ref_mod);
 DECLARE_TRACEPOINT(page_ref_mod_and_test);
 DECLARE_TRACEPOINT(page_ref_mod_and_return);
-DECLARE_TRACEPOINT(page_ref_mod_unless);
+DECLARE_TRACEPOINT(page_ref_add_unless);
 DECLARE_TRACEPOINT(page_ref_freeze);
 DECLARE_TRACEPOINT(page_ref_unfreeze);
 
@@ -30,7 +30,7 @@  extern void __page_ref_set(struct page *page, int v);
 extern void __page_ref_mod(struct page *page, int v);
 extern void __page_ref_mod_and_test(struct page *page, int v, int ret);
 extern void __page_ref_mod_and_return(struct page *page, int v, int ret);
-extern void __page_ref_mod_unless(struct page *page, int v, int u);
+extern void __page_ref_add_unless(struct page *page, int v, int u, int ret);
 extern void __page_ref_freeze(struct page *page, int v, int ret);
 extern void __page_ref_unfreeze(struct page *page, int v);
 
@@ -50,7 +50,7 @@  static inline void __page_ref_mod_and_test(struct page *page, int v, int ret)
 static inline void __page_ref_mod_and_return(struct page *page, int v, int ret)
 {
 }
-static inline void __page_ref_mod_unless(struct page *page, int v, int u)
+static inline void __page_ref_add_unless(struct page *page, int v, int u, int ret)
 {
 }
 static inline void __page_ref_freeze(struct page *page, int v, int ret)
@@ -237,8 +237,8 @@  static inline bool page_ref_add_unless(struct page *page, int nr, int u)
 {
 	bool ret = atomic_add_unless(&page->_refcount, nr, u);
 
-	if (page_ref_tracepoint_active(page_ref_mod_unless))
-		__page_ref_mod_unless(page, nr, ret);
+	if (page_ref_tracepoint_active(page_ref_add_unless))
+		__page_ref_add_unless(page, nr, u, ret);
 	return ret;
 }
 
diff --git a/include/trace/events/page_ref.h b/include/trace/events/page_ref.h
index 8a99c1cd417b..c32d6d161cdb 100644
--- a/include/trace/events/page_ref.h
+++ b/include/trace/events/page_ref.h
@@ -94,6 +94,43 @@  DECLARE_EVENT_CLASS(page_ref_mod_and_test_template,
 		__entry->val, __entry->ret)
 );
 
+DECLARE_EVENT_CLASS(page_ref_add_unless_template,
+
+	TP_PROTO(struct page *page, int v, int u, int ret),
+
+	TP_ARGS(page, v, u, ret),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, pfn)
+		__field(unsigned long, flags)
+		__field(int, count)
+		__field(int, mapcount)
+		__field(void *, mapping)
+		__field(int, mt)
+		__field(int, val)
+		__field(int, unless)
+		__field(int, ret)
+	),
+
+	TP_fast_assign(
+		__entry->pfn = page_to_pfn(page);
+		__entry->flags = page->flags;
+		__entry->count = page_ref_count(page);
+		__entry->mapcount = page_mapcount(page);
+		__entry->mapping = page->mapping;
+		__entry->mt = get_pageblock_migratetype(page);
+		__entry->val = v;
+		__entry->ret = ret;
+	),
+
+	TP_printk("pfn=0x%lx flags=%s count=%d mapcount=%d mapping=%p mt=%d val=%d unless=%d ret=%d",
+		__entry->pfn,
+		show_page_flags(__entry->flags & PAGEFLAGS_MASK),
+		__entry->count,
+		__entry->mapcount, __entry->mapping, __entry->mt,
+		__entry->val, __entry->unless, __entry->ret)
+);
+
 DEFINE_EVENT(page_ref_mod_and_test_template, page_ref_mod_and_test,
 
 	TP_PROTO(struct page *page, int v, int ret),
@@ -108,11 +145,11 @@  DEFINE_EVENT(page_ref_mod_and_test_template, page_ref_mod_and_return,
 	TP_ARGS(page, v, ret)
 );
 
-DEFINE_EVENT(page_ref_mod_and_test_template, page_ref_mod_unless,
+DEFINE_EVENT(page_ref_add_unless_template, page_ref_add_unless,
 
-	TP_PROTO(struct page *page, int v, int ret),
+	TP_PROTO(struct page *page, int v, int u, int ret),
 
-	TP_ARGS(page, v, ret)
+	TP_ARGS(page, v, u, ret)
 );
 
 DEFINE_EVENT(page_ref_mod_and_test_template, page_ref_freeze,
diff --git a/mm/debug_page_ref.c b/mm/debug_page_ref.c
index f3b2c9d3ece2..1426d6887b01 100644
--- a/mm/debug_page_ref.c
+++ b/mm/debug_page_ref.c
@@ -33,12 +33,12 @@  void __page_ref_mod_and_return(struct page *page, int v, int ret)
 EXPORT_SYMBOL(__page_ref_mod_and_return);
 EXPORT_TRACEPOINT_SYMBOL(page_ref_mod_and_return);
 
-void __page_ref_mod_unless(struct page *page, int v, int u)
+void __page_ref_add_unless(struct page *page, int v, int u, int ret)
 {
-	trace_page_ref_mod_unless(page, v, u);
+	trace_page_ref_add_unless(page, v, u, ret);
 }
-EXPORT_SYMBOL(__page_ref_mod_unless);
-EXPORT_TRACEPOINT_SYMBOL(page_ref_mod_unless);
+EXPORT_SYMBOL(__page_ref_add_unless);
+EXPORT_TRACEPOINT_SYMBOL(page_ref_add_unless);
 
 void __page_ref_freeze(struct page *page, int v, int ret)
 {