diff mbox series

[2/7] mm/init: remove the unnecessary special treatment for memory-less node

Message ID 20240326061134.1055295-3-bhe@redhat.com (mailing list archive)
State New
Series mm/init: minor clean up and improvement

Commit Message

Baoquan He March 26, 2024, 6:11 a.m. UTC
Because a memory-less node's ->node_present_pages and its zones'
->present_pages are all 0, the check made before calling
node_set_state() to set N_MEMORY, N_HIGH_MEMORY and N_NORMAL_MEMORY for
a node is enough to skip a memory-less node. The 'continue;' statement
inside the for_each_node() loop of free_area_init() is redundant.

Remove the special handling so that a memory-less node shares the same
code flow as a normal node. The code comment above the 'continue'
statement is not needed either.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/mm_init.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)
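
The equivalence argued in the commit message can be sketched with a small
userspace model. This is only an illustration, not kernel code: struct
toy_node, old_flow(), new_flow() and flows_agree() are hypothetical names,
and the model keeps only the fields the loop actually tests. It assumes, as
the patch does, that a node which is offline at this point has no present
pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NODES 8	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for pg_data_t plus the node state bits. */
struct toy_node {
	bool online;			/* models node_online(nid) */
	unsigned long present_pages;	/* models ->node_present_pages */
	bool has_memory_state;		/* models the N_MEMORY bit */
	int init_calls;			/* counts free_area_init_node() */
};

/* Old flow: an offline node is initialized, then skipped via 'continue'. */
static void old_flow(struct toy_node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_node *node = &nodes[i];

		if (!node->online) {
			node->init_calls++;
			continue;
		}
		node->init_calls++;
		if (node->present_pages)
			node->has_memory_state = true;
	}
}

/* New flow: every node takes the same path through the loop body. */
static void new_flow(struct toy_node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_node *node = &nodes[i];

		node->init_calls++;
		if (node->present_pages)
			node->has_memory_state = true;
	}
}

/* Run both flows on copies of 'input' and compare the resulting state. */
static bool flows_agree(const struct toy_node *input, size_t n)
{
	struct toy_node a[TOY_MAX_NODES], b[TOY_MAX_NODES];

	assert(n <= TOY_MAX_NODES);
	memcpy(a, input, n * sizeof(*input));
	memcpy(b, input, n * sizeof(*input));
	old_flow(a, n);
	new_flow(b, n);

	for (size_t i = 0; i < n; i++) {
		if (a[i].has_memory_state != b[i].has_memory_state ||
		    a[i].init_calls != b[i].init_calls)
			return false;
	}
	return true;
}
```

In this model the two flows only diverge for an offline node that reports
present pages, which is exactly the case the commit message rules out.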

Comments

Mike Rapoport April 2, 2024, 8:32 a.m. UTC | #1
On Tue, Mar 26, 2024 at 02:11:28PM +0800, Baoquan He wrote:
> Because memory-less node's ->node_present_pages and its
> zone's ->present_pages are all 0, the judgement before calling
> node_set_state() to set N_MEMORY, N_HIGH_MEMORY, N_NORMAL_MEMORY for
> node is enough to skip memory-less node. The 'continue;' statement
> inside for_each_node() loop of free_area_init() is gilding the lily.
> 
> Here, remove the special handling to make memory-less node share the
> same code flow as normal node. And the code comment above the 'continue'
> statement is not needed either.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/mm_init.c | 18 +++---------------
>  1 file changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 089dc60159e9..99681ffb9091 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1834,28 +1834,16 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>  				panic("Cannot allocate %zuB for node %d.\n",
>  				       sizeof(*pgdat), nid);
>  			arch_refresh_nodedata(nid, pgdat);
> -			free_area_init_node(nid);
> -
> -			/*
> -			 * We do not want to confuse userspace by sysfs
> -			 * files/directories for node without any memory
> -			 * attached to it, so this node is not marked as
> -			 * N_MEMORY and not marked online so that no sysfs
> -			 * hierarchy will be created via register_one_node for
> -			 * it. The pgdat will get fully initialized by
> -			 * hotadd_init_pgdat() when memory is hotplugged into
> -			 * this node.
> -			 */

I think this comment is still valuable. Maybe rephrase it a bit and move it
before 'if (pgdat->node_present_pages)'?

> -			continue;
>  		}
>  
>  		pgdat = NODE_DATA(nid);
>  		free_area_init_node(nid);
>  
>  		/* Any memory on that node */
> -		if (pgdat->node_present_pages)
> +		if (pgdat->node_present_pages) {
>  			node_set_state(nid, N_MEMORY);
> -		check_for_memory(pgdat);
> +			check_for_memory(pgdat);
> +		}
>  	}
>  
>  	calc_nr_kernel_pages();
> -- 
> 2.41.0
>
Baoquan He April 4, 2024, 3:23 a.m. UTC | #2
On 04/02/24 at 11:32am, Mike Rapoport wrote:
> On Tue, Mar 26, 2024 at 02:11:28PM +0800, Baoquan He wrote:
> > Because memory-less node's ->node_present_pages and its
> > zone's ->present_pages are all 0, the judgement before calling
> > node_set_state() to set N_MEMORY, N_HIGH_MEMORY, N_NORMAL_MEMORY for
> > node is enough to skip memory-less node. The 'continue;' statement
> > inside for_each_node() loop of free_area_init() is gilding the lily.
> > 
> > Here, remove the special handling to make memory-less node share the
> > same code flow as normal node. And the code comment above the 'continue'
> > statement is not needed either.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/mm_init.c | 18 +++---------------
> >  1 file changed, 3 insertions(+), 15 deletions(-)
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 089dc60159e9..99681ffb9091 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1834,28 +1834,16 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> >  				panic("Cannot allocate %zuB for node %d.\n",
> >  				       sizeof(*pgdat), nid);
> >  			arch_refresh_nodedata(nid, pgdat);
> > -			free_area_init_node(nid);
> > -
> > -			/*
> > -			 * We do not want to confuse userspace by sysfs
> > -			 * files/directories for node without any memory
> > -			 * attached to it, so this node is not marked as
> > -			 * N_MEMORY and not marked online so that no sysfs
> > -			 * hierarchy will be created via register_one_node for
> > -			 * it. The pgdat will get fully initialized by
> > -			 * hotadd_init_pgdat() when memory is hotplugged into
> > -			 * this node.
> > -			 */
> 
> I think this comment is still valuable. Maybe rephrase it a bit and move it
> before 'if (pgdat->node_present_pages)'?

Fair enough.

Do you think the paragraph below is OK? Please help polish or
rephrase it.

diff --git a/mm/mm_init.c b/mm/mm_init.c
index dd875f943cbb..3ce0f29637ba 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1839,7 +1839,14 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 		pgdat = NODE_DATA(nid);
 		free_area_init_node(nid);
 
-		/* Any memory on that node */
+		/*
+		 * No sysfs hierarchy will be created via register_one_node()
+		 * for memory-less node because here it's not marked as N_MEMORY
+		 * and won't be set online later. The benefit is userspace
+		 * program won't be confused by sysfs files/directories of
+		 * memory-less node. The pgdat will get fully initialized by
+		 * hotadd_init_pgdat() when memory is hotplugged into this node.
+		 */
 		if (pgdat->node_present_pages) {
 			node_set_state(nid, N_MEMORY);
 			check_for_memory(pgdat);
Mike Rapoport April 9, 2024, 3:40 p.m. UTC | #3
On Thu, Apr 04, 2024 at 11:23:51AM +0800, Baoquan He wrote:
> On 04/02/24 at 11:32am, Mike Rapoport wrote:
> > On Tue, Mar 26, 2024 at 02:11:28PM +0800, Baoquan He wrote:
> > > Because memory-less node's ->node_present_pages and its
> > > zone's ->present_pages are all 0, the judgement before calling
> > > node_set_state() to set N_MEMORY, N_HIGH_MEMORY, N_NORMAL_MEMORY for
> > > node is enough to skip memory-less node. The 'continue;' statement
> > > inside for_each_node() loop of free_area_init() is gilding the lily.
> > > 
> > > Here, remove the special handling to make memory-less node share the
> > > same code flow as normal node. And the code comment above the 'continue'
> > > statement is not needed either.
> > > 
> > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > ---
> > >  mm/mm_init.c | 18 +++---------------
> > >  1 file changed, 3 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > index 089dc60159e9..99681ffb9091 100644
> > > --- a/mm/mm_init.c
> > > +++ b/mm/mm_init.c
> > > @@ -1834,28 +1834,16 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> > >  				panic("Cannot allocate %zuB for node %d.\n",
> > >  				       sizeof(*pgdat), nid);
> > >  			arch_refresh_nodedata(nid, pgdat);
> > > -			free_area_init_node(nid);
> > > -
> > > -			/*
> > > -			 * We do not want to confuse userspace by sysfs
> > > -			 * files/directories for node without any memory
> > > -			 * attached to it, so this node is not marked as
> > > -			 * N_MEMORY and not marked online so that no sysfs
> > > -			 * hierarchy will be created via register_one_node for
> > > -			 * it. The pgdat will get fully initialized by
> > > -			 * hotadd_init_pgdat() when memory is hotplugged into
> > > -			 * this node.
> > > -			 */
> > 
> > I think this comment is still valuable. Maybe rephrase it a bit and move it
> > before 'if (pgdat->node_present_pages)'?
> 
> Fair enough.
> 
> Do you think below paragraph is OK to you? Please help polish or
> rephrase it.
> 
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index dd875f943cbb..3ce0f29637ba 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1839,7 +1839,14 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>  		pgdat = NODE_DATA(nid);
>  		free_area_init_node(nid);
>  
> -		/* Any memory on that node */
> +		/*
> +		 * No sysfs hierarcy will be created via register_one_node()
> +		 *for memory-less node because here it's not marked as N_MEMORY
> +		 *and won't be set online later. The benefit is userspace
> +		 *program won't be confused by sysfs files/directories of
> +		 *memory-less node. The pgdat will get fully initialized by
> +		 *hotadd_init_pgdat() when memory is hotplugged into this node.
> +		 */

Ack

>  		if (pgdat->node_present_pages) {
>  			node_set_state(nid, N_MEMORY);
>  			check_for_memory(pgdat);
>
Baoquan He April 10, 2024, 3:35 a.m. UTC | #4
Because a memory-less node's ->node_present_pages and its zones'
->present_pages are all 0, the check made before calling
node_set_state() to set N_MEMORY, N_HIGH_MEMORY and N_NORMAL_MEMORY for
a node is enough to skip a memory-less node. The 'continue;' statement
inside the for_each_node() loop of free_area_init() is redundant.

Remove the special handling so that a memory-less node shares the same
code flow as a normal node.

Also rephrase the code comment above the 'continue' statement
and move it above the line 'if (pgdat->node_present_pages)'.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
v1->v2:
- As Mike suggested, the old code comments above the 'continue'
  statement are still useful for understanding the code and system
  behaviour. So rephrase them and move them above the line 'if
  (pgdat->node_present_pages)'. Thanks to Mike.

 mm/mm_init.c | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2016ca8031e9..32ede966e609 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1834,28 +1834,23 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 				panic("Cannot allocate %zuB for node %d.\n",
 				       sizeof(*pgdat), nid);
 			arch_refresh_nodedata(nid, pgdat);
-			free_area_init_node(nid);
-
-			/*
-			 * We do not want to confuse userspace by sysfs
-			 * files/directories for node without any memory
-			 * attached to it, so this node is not marked as
-			 * N_MEMORY and not marked online so that no sysfs
-			 * hierarchy will be created via register_one_node for
-			 * it. The pgdat will get fully initialized by
-			 * hotadd_init_pgdat() when memory is hotplugged into
-			 * this node.
-			 */
-			continue;
 		}
 
 		pgdat = NODE_DATA(nid);
 		free_area_init_node(nid);
 
-		/* Any memory on that node */
-		if (pgdat->node_present_pages)
+		/*
+		 * No sysfs hierarchy will be created via register_one_node()
+		 * for memory-less node because here it's not marked as N_MEMORY
+		 * and won't be set online later. The benefit is userspace
+		 * program won't be confused by sysfs files/directories of
+		 * memory-less node. The pgdat will get fully initialized by
+		 * hotadd_init_pgdat() when memory is hotplugged into this node.
+		 */
+		if (pgdat->node_present_pages) {
 			node_set_state(nid, N_MEMORY);
-		check_for_memory(pgdat);
+			check_for_memory(pgdat);
+		}
 	}
 
 	calc_nr_kernel_pages();
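
The second half of the change, moving check_for_memory() under the
'if (pgdat->node_present_pages)' test, relies on the zone check having
nothing to do when every zone is empty. A rough userspace sketch of that
reasoning follows. It is not the kernel implementation: struct toy_pgdat,
TOY_NR_ZONES and the toy_/mark_ functions are hypothetical, and a single
n_normal_memory flag stands in for the N_HIGH_MEMORY/N_NORMAL_MEMORY
states.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_ZONES 4	/* hypothetical zone count for this sketch */

/* Hypothetical stand-in for pg_data_t and the node state bits. */
struct toy_pgdat {
	unsigned long node_present_pages;
	unsigned long zone_present_pages[TOY_NR_ZONES];
	bool n_memory;		/* models N_MEMORY */
	bool n_normal_memory;	/* models the zone-derived states */
};

/* Toy zone check: flag the node once any zone is populated. */
static void toy_check_for_memory(struct toy_pgdat *pgdat)
{
	assert(pgdat);
	for (int z = 0; z < TOY_NR_ZONES; z++) {
		if (pgdat->zone_present_pages[z]) {
			pgdat->n_normal_memory = true;
			break;
		}
	}
}

/* Pre-patch ordering: the zone check runs for every node. */
static void mark_node_unconditional(struct toy_pgdat *pgdat)
{
	assert(pgdat);
	if (pgdat->node_present_pages)
		pgdat->n_memory = true;
	toy_check_for_memory(pgdat);
}

/* Post-patch ordering: the zone check runs only for nodes with memory. */
static void mark_node_guarded(struct toy_pgdat *pgdat)
{
	assert(pgdat);
	if (pgdat->node_present_pages) {
		pgdat->n_memory = true;
		toy_check_for_memory(pgdat);
	}
}
```

For a node whose node_present_pages and per-zone counts are all 0, both
orderings leave every flag clear, so in this model the guard only saves a
pointless walk over the zones.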
Baoquan He April 10, 2024, 3:38 a.m. UTC | #5
On 04/09/24 at 06:40pm, Mike Rapoport wrote:
> On Thu, Apr 04, 2024 at 11:23:51AM +0800, Baoquan He wrote:
> > On 04/02/24 at 11:32am, Mike Rapoport wrote:
> > > On Tue, Mar 26, 2024 at 02:11:28PM +0800, Baoquan He wrote:
> > > > Because memory-less node's ->node_present_pages and its
> > > > zone's ->present_pages are all 0, the judgement before calling
> > > > node_set_state() to set N_MEMORY, N_HIGH_MEMORY, N_NORMAL_MEMORY for
> > > > node is enough to skip memory-less node. The 'continue;' statement
> > > > inside for_each_node() loop of free_area_init() is gilding the lily.
> > > > 
> > > > Here, remove the special handling to make memory-less node share the
> > > > same code flow as normal node. And the code comment above the 'continue'
> > > > statement is not needed either.
> > > > 
> > > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > > ---
> > > >  mm/mm_init.c | 18 +++---------------
> > > >  1 file changed, 3 insertions(+), 15 deletions(-)
> > > > 
> > > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > > index 089dc60159e9..99681ffb9091 100644
> > > > --- a/mm/mm_init.c
> > > > +++ b/mm/mm_init.c
> > > > @@ -1834,28 +1834,16 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> > > >  				panic("Cannot allocate %zuB for node %d.\n",
> > > >  				       sizeof(*pgdat), nid);
> > > >  			arch_refresh_nodedata(nid, pgdat);
> > > > -			free_area_init_node(nid);
> > > > -
> > > > -			/*
> > > > -			 * We do not want to confuse userspace by sysfs
> > > > -			 * files/directories for node without any memory
> > > > -			 * attached to it, so this node is not marked as
> > > > -			 * N_MEMORY and not marked online so that no sysfs
> > > > -			 * hierarchy will be created via register_one_node for
> > > > -			 * it. The pgdat will get fully initialized by
> > > > -			 * hotadd_init_pgdat() when memory is hotplugged into
> > > > -			 * this node.
> > > > -			 */
> > > 
> > > I think this comment is still valuable. Maybe rephrase it a bit and move it
> > > before 'if (pgdat->node_present_pages)'?
> > 
> > Fair enough.
> > 
> > Do you think below paragraph is OK to you? Please help polish or
> > rephrase it.
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index dd875f943cbb..3ce0f29637ba 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1839,7 +1839,14 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> >  		pgdat = NODE_DATA(nid);
> >  		free_area_init_node(nid);
> >  
> > -		/* Any memory on that node */
> > +		/*
> > +		 * No sysfs hierarcy will be created via register_one_node()
> > +		 *for memory-less node because here it's not marked as N_MEMORY
> > +		 *and won't be set online later. The benefit is userspace
> > +		 *program won't be confused by sysfs files/directories of
> > +		 *memory-less node. The pgdat will get fully initialized by
> > +		 *hotadd_init_pgdat() when memory is hotplugged into this node.
> > +		 */
> 
> Ack

Thanks a lot for reviewing and confirming, Mike.

Hi Andrew,

I have sent out v2 to include the code comment change above. Please feel
free to pick up the draft patch and append it, or take the newly posted v2.
Thanks.

> 
> >  		if (pgdat->node_present_pages) {
> >  			node_set_state(nid, N_MEMORY);
> >  			check_for_memory(pgdat);
> > 
> 
> -- 
> Sincerely yours,
> Mike.
>

Patch

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 089dc60159e9..99681ffb9091 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1834,28 +1834,16 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 				panic("Cannot allocate %zuB for node %d.\n",
 				       sizeof(*pgdat), nid);
 			arch_refresh_nodedata(nid, pgdat);
-			free_area_init_node(nid);
-
-			/*
-			 * We do not want to confuse userspace by sysfs
-			 * files/directories for node without any memory
-			 * attached to it, so this node is not marked as
-			 * N_MEMORY and not marked online so that no sysfs
-			 * hierarchy will be created via register_one_node for
-			 * it. The pgdat will get fully initialized by
-			 * hotadd_init_pgdat() when memory is hotplugged into
-			 * this node.
-			 */
-			continue;
 		}
 
 		pgdat = NODE_DATA(nid);
 		free_area_init_node(nid);
 
 		/* Any memory on that node */
-		if (pgdat->node_present_pages)
+		if (pgdat->node_present_pages) {
 			node_set_state(nid, N_MEMORY);
-		check_for_memory(pgdat);
+			check_for_memory(pgdat);
+		}
 	}
 
 	calc_nr_kernel_pages();