diff mbox series

net/smc: Reduce size of smc_wr_tx_tasklet_fn

Message ID 20250315062516.788528-1-richard120310@gmail.com (mailing list archive)
State New
Headers show
Series net/smc: Reduce size of smc_wr_tx_tasklet_fn | expand

Commit Message

I Hsin Cheng March 15, 2025, 6:25 a.m. UTC
The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
whether the loop has been executed for the first time. Refactor the type
of "polled" from "int" to "bool" can reduce the size of generated code
size by 12 bytes shown with the test below

$ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
Function                                     old     new   delta
smc_wr_tx_tasklet_fn                        1076    1064     -12
Total: Before=24795091, After=24795079, chg -0.00%

In some configuration, the compiler will complain this function for
exceeding 1024 bytes for function stack, this change can at least reduce
the size by 12 bytes within manner.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
---
 net/smc/smc_wr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Dust Li March 17, 2025, 1:34 a.m. UTC | #1
On 2025-03-15 14:25:16, I Hsin Cheng wrote:
>The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
>whether the loop has been executed for the first time. Refactor the type
>of "polled" from "int" to "bool" can reduce the size of generated code
>size by 12 bytes shown with the test below
>
>$ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
>add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
>Function                                     old     new   delta
>smc_wr_tx_tasklet_fn                        1076    1064     -12
>Total: Before=24795091, After=24795079, chg -0.00%
>
>In some configuration, the compiler will complain this function for
>exceeding 1024 bytes for function stack, this change can at least reduce
>the size by 12 bytes within manner.
>
>Signed-off-by: I Hsin Cheng <richard120310@gmail.com>

Reviewed-by: Dust Li <dust.li@linux.alibaba.com>

Best regards,
Dust

>---
> net/smc/smc_wr.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
>index b04a21b8c511..3cc435ed7fde 100644
>--- a/net/smc/smc_wr.c
>+++ b/net/smc/smc_wr.c
>@@ -138,14 +138,14 @@ static void smc_wr_tx_tasklet_fn(struct tasklet_struct *t)
> 	struct smc_ib_device *dev = from_tasklet(dev, t, send_tasklet);
> 	struct ib_wc wc[SMC_WR_MAX_POLL_CQE];
> 	int i = 0, rc;
>-	int polled = 0;
>+	bool polled = false;
> 
> again:
>-	polled++;
>+	polled = !polled;
> 	do {
> 		memset(&wc, 0, sizeof(wc));
> 		rc = ib_poll_cq(dev->roce_cq_send, SMC_WR_MAX_POLL_CQE, wc);
>-		if (polled == 1) {
>+		if (polled) {
> 			ib_req_notify_cq(dev->roce_cq_send,
> 					 IB_CQ_NEXT_COMP |
> 					 IB_CQ_REPORT_MISSED_EVENTS);
>@@ -155,7 +155,7 @@ static void smc_wr_tx_tasklet_fn(struct tasklet_struct *t)
> 		for (i = 0; i < rc; i++)
> 			smc_wr_tx_process_cqe(&wc[i]);
> 	} while (rc > 0);
>-	if (polled == 1)
>+	if (polled)
> 		goto again;
> }
> 
>-- 
>2.43.0
>
Wenjia Zhang March 17, 2025, 11:22 a.m. UTC | #2
On 15.03.25 07:25, I Hsin Cheng wrote:
> The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
> whether the loop has been executed for the first time. Refactor the type
> of "polled" from "int" to "bool" can reduce the size of generated code
> size by 12 bytes shown with the test below
> 
> $ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
> Function                                     old     new   delta
> smc_wr_tx_tasklet_fn                        1076    1064     -12
> Total: Before=24795091, After=24795079, chg -0.00%
> 
> In some configuration, the compiler will complain this function for
> exceeding 1024 bytes for function stack, this change can at least reduce
> the size by 12 bytes within manner.
> 
The code itself looks good. However, I’m curious about the specific 
situation where the compiler complained. Also, compared to exceeding the 
function stack limit by 1024 bytes, I don’t see how saving 12 bytes 
would bring any significant benefit.

Thanks,
Wenjia
Heiko Carstens March 17, 2025, 1:56 p.m. UTC | #3
On Mon, Mar 17, 2025 at 12:22:46PM +0100, Wenjia Zhang wrote:
> 
> 
> On 15.03.25 07:25, I Hsin Cheng wrote:
> > The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
> > whether the loop has been executed for the first time. Refactor the type
> > of "polled" from "int" to "bool" can reduce the size of generated code
> > size by 12 bytes shown with the test below
> > 
> > $ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
> > add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
> > Function                                     old     new   delta
> > smc_wr_tx_tasklet_fn                        1076    1064     -12
> > Total: Before=24795091, After=24795079, chg -0.00%
> > 
> > In some configuration, the compiler will complain this function for
> > exceeding 1024 bytes for function stack, this change can at least reduce
> > the size by 12 bytes within manner.
> > 
> The code itself looks good. However, I’m curious about the specific
> situation where the compiler complained. Also, compared to exceeding the
> function stack limit by 1024 bytes, I don’t see how saving 12 bytes would
> bring any significant benefit.

The patch description doesn't make sense: bloat-a-meter prints the _text
size_ difference of two kernels, which really has nothing to do with
potential stack size savings.

If there are any changes in stack size with this patch is unknown; at least
if you rely only on the patch description.

You may want to have a look at scripts/stackusage and scripts/stackdelta.
Wenjia Zhang March 18, 2025, 8:43 a.m. UTC | #4
On 17.03.25 14:56, Heiko Carstens wrote:
> On Mon, Mar 17, 2025 at 12:22:46PM +0100, Wenjia Zhang wrote:
>>
>>
>> On 15.03.25 07:25, I Hsin Cheng wrote:
>>> The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
>>> whether the loop has been executed for the first time. Refactor the type
>>> of "polled" from "int" to "bool" can reduce the size of generated code
>>> size by 12 bytes shown with the test below
>>>
>>> $ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
>>> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
>>> Function                                     old     new   delta
>>> smc_wr_tx_tasklet_fn                        1076    1064     -12
>>> Total: Before=24795091, After=24795079, chg -0.00%
>>>
>>> In some configuration, the compiler will complain this function for
>>> exceeding 1024 bytes for function stack, this change can at least reduce
>>> the size by 12 bytes within manner.
>>>
>> The code itself looks good. However, I’m curious about the specific
>> situation where the compiler complained. Also, compared to exceeding the
>> function stack limit by 1024 bytes, I don’t see how saving 12 bytes would
>> bring any significant benefit.
> 
> The patch description doesn't make sense: bloat-a-meter prints the _text
> size_ difference of two kernels, which really has nothing to do with
> potential stack size savings.
> 
> If there are any changes in stack size with this patch is unknown; at least
> if you rely only on the patch description.
> 
> You may want to have a look at scripts/stackusage and scripts/stackdelta.

@Heiko, thank you for pointing it out!

Even if the potential stack size saving of 12 bytes were true, I still 
don’t see how it would benefit our code, let alone justify the incorrect 
argument.
I Hsin Cheng March 19, 2025, 2:42 a.m. UTC | #5
On Tue, Mar 18, 2025 at 09:43:07AM +0100, Wenjia Zhang wrote:
> 
> 
> On 17.03.25 14:56, Heiko Carstens wrote:
> > On Mon, Mar 17, 2025 at 12:22:46PM +0100, Wenjia Zhang wrote:
> > > 
> > > 
> > > On 15.03.25 07:25, I Hsin Cheng wrote:
> > > > The variable "polled" in smc_wr_tx_tasklet_fn is a counter to determine
> > > > whether the loop has been executed for the first time. Refactor the type
> > > > of "polled" from "int" to "bool" can reduce the size of generated code
> > > > size by 12 bytes shown with the test below
> > > > 
> > > > $ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
> > > > add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-12 (-12)
> > > > Function                                     old     new   delta
> > > > smc_wr_tx_tasklet_fn                        1076    1064     -12
> > > > Total: Before=24795091, After=24795079, chg -0.00%
> > > > 
> > > > In some configuration, the compiler will complain this function for
> > > > exceeding 1024 bytes for function stack, this change can at least reduce
> > > > the size by 12 bytes within manner.
> > > > 
> > > The code itself looks good. However, I’m curious about the specific
> > > situation where the compiler complained. Also, compared to exceeding the
> > > function stack limit by 1024 bytes, I don’t see how saving 12 bytes would
> > > bring any significant benefit.
> > 
> > The patch description doesn't make sense: bloat-a-meter prints the _text
> > size_ difference of two kernels, which really has nothing to do with
> > potential stack size savings.
> > 
> > If there are any changes in stack size with this patch is unknown; at least
> > if you rely only on the patch description.
> > 
> > You may want to have a look at scripts/stackusage and scripts/stackdelta.
> 
> @Heiko, thank you for pointing it out!
> 
> Even if the potential stack size saving of 12 bytes were true, I still don’t
> see how it would benefit our code, let alone justify the incorrect argument.
>

Hi Heiko, Wenjia,

Thanks for your kindly review!

> > If there are any changes in stack size with this patch is unknown; at least
> > if you rely only on the patch description.
> >
> > You may want to have a look at scripts/stackusage and scripts/stackdelta.

Thanks for this, really appreciate! I'll try it out and see is there
anything different.

> Even if the potential stack size saving of 12 bytes were true, I still don’t
> see how it would benefit our code, let alone justify the incorrect argument.

Hmm I suppose smaller memory footprint can benefit the performace,
though I agree it won't be significant. I'm not sure how to do
performance test on this function, would you be so kind to suggest some
ideas for me to test out.

And I want to ask why's the argument incorrect? since I only change the
type of "polled", maybe you mean "polled" itself should be an integer
type?

Best regards,
I Hsin Cheng
diff mbox series

Patch

diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
index b04a21b8c511..3cc435ed7fde 100644
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@ -138,14 +138,14 @@  static void smc_wr_tx_tasklet_fn(struct tasklet_struct *t)
 	struct smc_ib_device *dev = from_tasklet(dev, t, send_tasklet);
 	struct ib_wc wc[SMC_WR_MAX_POLL_CQE];
 	int i = 0, rc;
-	int polled = 0;
+	bool polled = false;
 
 again:
-	polled++;
+	polled = !polled;
 	do {
 		memset(&wc, 0, sizeof(wc));
 		rc = ib_poll_cq(dev->roce_cq_send, SMC_WR_MAX_POLL_CQE, wc);
-		if (polled == 1) {
+		if (polled) {
 			ib_req_notify_cq(dev->roce_cq_send,
 					 IB_CQ_NEXT_COMP |
 					 IB_CQ_REPORT_MISSED_EVENTS);
@@ -155,7 +155,7 @@  static void smc_wr_tx_tasklet_fn(struct tasklet_struct *t)
 		for (i = 0; i < rc; i++)
 			smc_wr_tx_process_cqe(&wc[i]);
 	} while (rc > 0);
-	if (polled == 1)
+	if (polled)
 		goto again;
 }