diff mbox series

[RFC,v1,13/50] Avoid some useless msecs/jiffies conversions

Message ID 202003281643.02SGhBrh000992@sdf.org (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

George Spelvin Aug. 22, 2019, 12:25 a.m. UTC
Rather than generating a random number of milliseconds in a
constant range and then converting to jiffies, convert the range
to jiffies (evaluated at compile time) and choose a random number
of jiffies in that range.

Likewise, "msecs_to_jiffies(seconds * 1000)" is more
conveniently written "seconds * HZ".

Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: linux-scsi@vger.kernel.org
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <a@unstable.cc>
Cc: Sven Eckelmann <sven@narfation.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-wireless@vger.kernel.org
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: alsa-devel@alsa-project.org
---
 drivers/scsi/fcoe/fcoe_ctlr.c | 10 +++++-----
 net/batman-adv/bat_iv_ogm.c   |  2 +-
 net/batman-adv/bat_v_ogm.c    |  8 ++++----
 net/bluetooth/hci_request.c   |  2 +-
 net/wireless/scan.c           |  2 +-
 sound/core/pcm_lib.c          |  2 +-
 sound/core/pcm_native.c       |  2 +-
 7 files changed, 14 insertions(+), 14 deletions(-)

Comments

George Spelvin March 29, 2020, 12:11 p.m. UTC | #1
On Sun, Mar 29, 2020 at 09:52:23AM +0200, Takashi Iwai wrote:
> On Thu, 22 Aug 2019 02:25:10 +0200, George Spelvin wrote:
>> Likewise, "msecs_to_jiffies(seconds * 1000)" is more
>> conveniently written "seconds * HZ".
> 
> I thought the compiler already optimizes to the constant calculation
> for the above case?

It optimizes that if the entire argument, including "seconds", is
a compile-time constant.

However, given "msecs_to_jiffies(hdev->rpa_timeout * 1000);",
the computatin is non-trivial.
Takashi Iwai March 29, 2020, 5:13 p.m. UTC | #2
On Sun, 29 Mar 2020 14:11:29 +0200,
George Spelvin wrote:
> 
> On Sun, Mar 29, 2020 at 09:52:23AM +0200, Takashi Iwai wrote:
> > On Thu, 22 Aug 2019 02:25:10 +0200, George Spelvin wrote:
> >> Likewise, "msecs_to_jiffies(seconds * 1000)" is more
> >> conveniently written "seconds * HZ".
> > 
> > I thought the compiler already optimizes to the constant calculation
> > for the above case?
> 
> It optimizes that if the entire argument, including "seconds", is
> a compile-time constant.
> 
> However, given "msecs_to_jiffies(hdev->rpa_timeout * 1000);",
> the computatin is non-trivial.

Fair enough.  But it's still a question whether an open code X * HZ is
good at all...


thanks,

Takashi
George Spelvin March 29, 2020, 5:50 p.m. UTC | #3
On Sun, Mar 29, 2020 at 07:13:33PM +0200, Takashi Iwai wrote:
> On Sun, 29 Mar 2020 14:11:29 +0200, George Spelvin wrote:
>> On Sun, Mar 29, 2020 at 09:52:23AM +0200, Takashi Iwai wrote:
>>> I thought the compiler already optimizes to the constant calculation
>>> for the above case?
>> 
>> It optimizes that if the entire argument, including "seconds", is
>> a compile-time constant.
>> 
>> However, given "msecs_to_jiffies(hdev->rpa_timeout * 1000);",
>> the computatin is non-trivial.
> 
> Fair enough.  But it's still a question whether an open code X * HZ is
> good at all...

I'm sorry, I don't understand what you mean by "good at all" here.
The value computed is exactly the same.

msecs_to_jiffies(x) is basically (x * HZ + 999) / 1000, so
  msecs_to_jiffies(s * 1000)
= (s * 1000 * HZ + 999) / 1000
= s * HZ + 999/1000
= s * HZ.
James Bottomley March 29, 2020, 6:16 p.m. UTC | #4
On Sun, 2020-03-29 at 17:50 +0000, George Spelvin wrote:
> On Sun, Mar 29, 2020 at 07:13:33PM +0200, Takashi Iwai wrote:
> > On Sun, 29 Mar 2020 14:11:29 +0200, George Spelvin wrote:
> > > On Sun, Mar 29, 2020 at 09:52:23AM +0200, Takashi Iwai wrote:
> > > > I thought the compiler already optimizes to the constant
> > > > calculation
> > > > for the above case?
> > > 
> > > It optimizes that if the entire argument, including "seconds", is
> > > a compile-time constant.
> > > 
> > > However, given "msecs_to_jiffies(hdev->rpa_timeout * 1000);",
> > > the computatin is non-trivial.
> > 
> > Fair enough.  But it's still a question whether an open code X * HZ
> > is
> > good at all...
> 
> I'm sorry, I don't understand what you mean by "good at all" here.
> The value computed is exactly the same.

I think he means what the compiler does with it.

We all assume that msecs_to_jiffies is properly optimized so there
should be no need to open code it like you're proposing.  So firstly
can you produce the assembly that shows the worse output from
msecs_to_jiffies?  If there is a problem, then we should be fixing it
in msecs_to_jiffies, not adding open coding.

James
George Spelvin March 29, 2020, 9:18 p.m. UTC | #5
On Sun, Mar 29, 2020 at 11:16:47AM -0700, James Bottomley wrote:
> On Sun, 2020-03-29 at 17:50 +0000, George Spelvin wrote:
>> On Sun, Mar 29, 2020 at 07:13:33PM +0200, Takashi Iwai wrote:
>>> Fair enough.  But it's still a question whether an open code X * HZ
>>> is good at all...
>> 
>> I'm sorry, I don't understand what you mean by "good at all" here.
>> The value computed is exactly the same.
> 
> I think he means what the compiler does with it.
> 
> We all assume that msecs_to_jiffies is properly optimized so there
> should be no need to open code it like you're proposing.  So firstly
> can you produce the assembly that shows the worse output from
> msecs_to_jiffies?  If there is a problem, then we should be fixing it
> in msecs_to_jiffies, not adding open coding.

Fair enough.  For the two alternative functions:

#include <linux/jiffies.h>

unsigned long timeout1(unsigned seconds)
{
	return jiffies + msecs_to_jiffies(seconds * 1000);
}

unsigned long timeout2(unsigned seconds)
{
	return jiffies + seconds * HZ;
}

compiled with Kbuild, the object code produced is:

Disassembly of section .text:

0000000000000000 <timeout1>:
   0:	69 ff e8 03 00 00       imul   $0x3e8,%edi,%edi
   6:	e8 00 00 00 00          callq  b <timeout1+0xb>
                        7: R_X86_64_PLT32       __msecs_to_jiffies-0x4
   b:	49 89 c0                mov    %rax,%r8
   e:	48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 15 
<timeout1+0x15>
                        11: R_X86_64_PC32       jiffies-0x4
  15:	4c 01 c0                add    %r8,%rax
  18:	c3                      retq   
  19:	0f 1f 80 00 00 00 00    nopl   0x0(%rax)

0000000000000020 <timeout2>:
  20:	48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 27 
<timeout2+0x7>
                        23: R_X86_64_PC32       jiffies-0x4
  27:	69 c7 2c 01 00 00       imul   $0x12c,%edi,%eax
  2d:	48 01 d0                add    %rdx,%rax
  30:	c3                      retq   

This is the type of code I replaced: code where the number of seconds
is not known at compile time.  Notice how the first multiplies by 1000
and then calls __msecs_to_jiffies.  The second multiplies by 300 and
makes no function call.

__msecs_to_jiffies (from kernel/time/time.o) is:

0000000000000100 <__msecs_to_jiffies>:
     100:	48 b8 fe ff ff ff ff    movabs $0x3ffffffffffffffe,%rax
     107:	ff ff 3f 
     10a:	85 ff                   test   %edi,%edi
     10c:	78 1c                   js     12a 
<__msecs_to_jiffies+0x2a>
     10e:	b8 9a 99 99 99          mov    $0x9999999a,%eax
     113:	89 ff                   mov    %edi,%edi
     115:	48 0f af f8             imul   %rax,%rdi
     119:	48 b8 cc cc cc cc 01    movabs $0x1cccccccc,%rax
     120:	00 00 00 
     123:	48 01 f8                add    %rdi,%rax
     126:	48 c1 e8 21             shr    $0x21,%rax
     12a:	c3                      retq   
     12b:	0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

I didn't try to replace code that uses compile-time constant arguments
such as include/linux/ceph/libceph.h.
Takashi Iwai March 30, 2020, 6:27 a.m. UTC | #6
On Sun, 29 Mar 2020 20:16:47 +0200,
James Bottomley wrote:
> 
> On Sun, 2020-03-29 at 17:50 +0000, George Spelvin wrote:
> > On Sun, Mar 29, 2020 at 07:13:33PM +0200, Takashi Iwai wrote:
> > > On Sun, 29 Mar 2020 14:11:29 +0200, George Spelvin wrote:
> > > > On Sun, Mar 29, 2020 at 09:52:23AM +0200, Takashi Iwai wrote:
> > > > > I thought the compiler already optimizes to the constant
> > > > > calculation
> > > > > for the above case?
> > > > 
> > > > It optimizes that if the entire argument, including "seconds", is
> > > > a compile-time constant.
> > > > 
> > > > However, given "msecs_to_jiffies(hdev->rpa_timeout * 1000);",
> > > > the computatin is non-trivial.
> > > 
> > > Fair enough.  But it's still a question whether an open code X * HZ
> > > is
> > > good at all...
> > 
> > I'm sorry, I don't understand what you mean by "good at all" here.
> > The value computed is exactly the same.
> 
> I think he means what the compiler does with it.
> 
> We all assume that msecs_to_jiffies is properly optimized so there
> should be no need to open code it like you're proposing.

Yes, it'd be best if the compiler can handle it properly.

But also I meant to keep using the macro for consistency reason.
IIRC, we wanted to eliminate the explicit use of HZ in the past, and
it's how many lines have been converted with *_to_jiffies() calls.
I don't know whether the eliminate of HZ is still wished, but
reverting to the open code is a step backward for that.


thanks,

Takashi
George Spelvin March 30, 2020, 6:51 a.m. UTC | #7
On Mon, Mar 30, 2020 at 08:27:01AM +0200, Takashi Iwai wrote:
> On Sun, 29 Mar 2020 20:16:47 +0200, James Bottomley wrote:
>> We all assume that msecs_to_jiffies is properly optimized so there
>> should be no need to open code it like you're proposing.
> 
> Yes, it'd be best if the compiler can handle it properly.

I've tried, and can't figure out how to get the compiler to detect this 
special case and not invoke the general code.  In particular, for a 
variable x, __builtin_constant_p(x * 1000 % 1000) is false.  Even if x is 
signed and ANSI lets the compiler assume that overflow doesn't happen.

If you can do it, I'm most curious how!

> But also I meant to keep using the macro for consistency reason.
> IIRC, we wanted to eliminate the explicit use of HZ in the past, and
> it's how many lines have been converted with *_to_jiffies() calls.
> I don't know whether the eliminate of HZ is still wished, but
> reverting to the open code is a step backward for that.

Well, you could always add a secs_to_jiffies(x) wrapper.  But given
that it expands to basically x * HZ, some people might wonder why
you're bothering.

I assumed that open-coding x * HZ was the preferred style, so that's
what I did.
Takashi Iwai March 30, 2020, 7:29 a.m. UTC | #8
On Mon, 30 Mar 2020 08:51:05 +0200,
George Spelvin wrote:
> 
> On Mon, Mar 30, 2020 at 08:27:01AM +0200, Takashi Iwai wrote:
> > On Sun, 29 Mar 2020 20:16:47 +0200, James Bottomley wrote:
> >> We all assume that msecs_to_jiffies is properly optimized so there
> >> should be no need to open code it like you're proposing.
> > 
> > Yes, it'd be best if the compiler can handle it properly.
> 
> I've tried, and can't figure out how to get the compiler to detect this 
> special case and not invoke the general code.  In particular, for a 
> variable x, __builtin_constant_p(x * 1000 % 1000) is false.  Even if x is 
> signed and ANSI lets the compiler assume that overflow doesn't happen.
> 
> If you can do it, I'm most curious how!

Actually in the very early version of msecs_to_jiffies() was all
inlined, so the compiler could optimize such a case, I guess.  Now it
was factored out to an external function in commit ca42aaf0c861, so it
became difficult.

> > But also I meant to keep using the macro for consistency reason.
> > IIRC, we wanted to eliminate the explicit use of HZ in the past, and
> > it's how many lines have been converted with *_to_jiffies() calls.
> > I don't know whether the eliminate of HZ is still wished, but
> > reverting to the open code is a step backward for that.
> 
> Well, you could always add a secs_to_jiffies(x) wrapper.  But given
> that it expands to basically x * HZ, some people might wonder why
> you're bothering.

Well, comparing with the expanded result doesn't make always sense.
With such a logic, you can argue why BIT(x) macro is needed, too.
After all, it's a matter of semantics.

> I assumed that open-coding x * HZ was the preferred style, so that's
> what I did.

That's my question, too -- whether the open code is preferred for this
particular purpose.


thanks,

Takashi
diff mbox series

Patch

diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c
index 1791a393795da..9c530f8827518 100644
--- a/drivers/scsi/fcoe/fcoe_ctlr.c
+++ b/drivers/scsi/fcoe/fcoe_ctlr.c
@@ -2238,10 +2238,10 @@  static void fcoe_ctlr_vn_restart(struct fcoe_ctlr *fip)
 
 	if (fip->probe_tries < FIP_VN_RLIM_COUNT) {
 		fip->probe_tries++;
-		wait = prandom_u32() % FIP_VN_PROBE_WAIT;
+		wait = prandom_u32_max(msecs_to_jiffies(FIP_VN_PROBE_WAIT));
 	} else
-		wait = FIP_VN_RLIM_INT;
-	mod_timer(&fip->timer, jiffies + msecs_to_jiffies(wait));
+		wait = msecs_to_jiffies(FIP_VN_RLIM_INT);
+	mod_timer(&fip->timer, jiffies + wait);
 	fcoe_ctlr_set_state(fip, FIP_ST_VNMP_START);
 }
 
@@ -3132,8 +3132,8 @@  static void fcoe_ctlr_vn_timeout(struct fcoe_ctlr *fip)
 			fcoe_ctlr_vn_send(fip, FIP_SC_VN_BEACON,
 					  fcoe_all_vn2vn, 0);
 			fip->port_ka_time = jiffies +
-				 msecs_to_jiffies(FIP_VN_BEACON_INT +
-					(prandom_u32() % FIP_VN_BEACON_FUZZ));
+			  msecs_to_jiffies(FIP_VN_BEACON_INT) +
+			  prandom_u32_max(msecs_to_jiffies(FIP_VN_BEACON_FUZZ));
 		}
 		if (time_before(fip->port_ka_time, next_time))
 			next_time = fip->port_ka_time;
diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 46c5cd9f8019e..9efd96e91d53e 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -288,7 +288,7 @@  batadv_iv_ogm_emit_send_time(const struct batadv_priv *bat_priv)
 /* when do we schedule a ogm packet to be sent */
 static unsigned long batadv_iv_ogm_fwd_send_time(void)
 {
-	return jiffies + msecs_to_jiffies(prandom_u32() % (BATADV_JITTER / 2));
+	return jiffies + prandom_u32_max(msecs_to_jiffies(BATADV_JITTER / 2));
 }
 
 /* apply hop penalty for a normal link */
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 411ce5fc6492f..61fa742ff5501 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -85,12 +85,12 @@  struct batadv_orig_node *batadv_v_ogm_orig_get(struct batadv_priv *bat_priv,
  */
 static void batadv_v_ogm_start_queue_timer(struct batadv_hard_iface *hard_iface)
 {
-	unsigned int msecs = BATADV_MAX_AGGREGATION_MS * 1000;
+	unsigned int delay = msecs_to_jiffies(BATADV_MAX_AGGREGATION_MS);
 
-	/* msecs * [0.9, 1.1] */
-	msecs += prandom_u32() % (msecs / 5) - (msecs / 10);
+	/* delay * [0.9, 1.1] */
+	delay += prandom_u32_max(delay / 5) - (delay / 10);
 	queue_delayed_work(batadv_event_workqueue, &hard_iface->bat_v.aggr_wq,
-			   msecs_to_jiffies(msecs / 1000));
+			   delay);
 }
 
 /**
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index 2a1b64dbf76e4..8b46e23b4abe7 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -1505,7 +1505,7 @@  int hci_get_random_address(struct hci_dev *hdev, bool require_privacy,
 
 		bacpy(rand_addr, &hdev->rpa);
 
-		to = msecs_to_jiffies(hdev->rpa_timeout * 1000);
+		to = hdev->rpa_timeout * HZ;
 		if (adv_instance)
 			queue_delayed_work(hdev->workqueue,
 					   &adv_instance->rpa_expired_cb, to);
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index aef240fdf8df6..b6856cbb68d3b 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -700,7 +700,7 @@  void cfg80211_bss_age(struct cfg80211_registered_device *rdev,
                       unsigned long age_secs)
 {
 	struct cfg80211_internal_bss *bss;
-	unsigned long age_jiffies = msecs_to_jiffies(age_secs * MSEC_PER_SEC);
+	unsigned long age_jiffies = age_secs * HZ;
 
 	spin_lock_bh(&rdev->bss_lock);
 	list_for_each_entry(bss, &rdev->bss_list, list)
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 2236b5e0c1f25..8a2bf333200c1 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -1839,7 +1839,7 @@  static int wait_for_avail(struct snd_pcm_substream *substream,
 					 runtime->rate;
 				wait_time = max(t, wait_time);
 			}
-			wait_time = msecs_to_jiffies(wait_time * 1000);
+			wait_time *= HZ;
 		}
 	}
 
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index df40d38f6e290..1ea763f9f956d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1937,7 +1937,7 @@  static int snd_pcm_drain(struct snd_pcm_substream *substream,
 				long t = runtime->period_size * 2 / runtime->rate;
 				tout = max(t, tout);
 			}
-			tout = msecs_to_jiffies(tout * 1000);
+			tout *= HZ;
 		}
 		tout = schedule_timeout(tout);