iommu/arm-smmu: add a shortcut when the @dev_node is NULL

Message ID 1452564905-2662-1-git-send-email-shijie.huang@arm.com (mailing list archive)
State New, archived

Commit Message

Huang Shijie Jan. 12, 2016, 2:15 a.m. UTC
When a device is added to a bus, it triggers the notifier chain hooks,
such as iommu_bus_notifier; find_smmu_for_device() can be called from there.

In some cases, the @dev_node can be NULL. For example, when the device
is one of the following:
    alarmtime, serial, rtc or snd_soc.

This patch adds a shortcut that returns early when the @dev_node is NULL.
On my juno-r1 board, this makes boot about 0.004014s faster.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Joerg Roedel Jan. 20, 2016, 12:02 p.m. UTC | #1
On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> This patch adds a shortcut for the code when the @device_node is NULL.
> In my juno-r1 board, the boot time can be faster by 0.004014s.

How have you made sure this number is reliable and not just noise in the
boot process?


	Joerg
Huang Shijie Jan. 20, 2016, 1:34 p.m. UTC | #2
On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
> On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> > This patch adds a shortcut for the code when the @device_node is NULL.
> > In my juno-r1 board, the boot time can be faster by 0.004014s.
> 
> How have you made sure this number is reliable and not just noise in the
> boot process?
In the boot process, there are 5 or more modules whose @dev_node is
NULL. Without the patch, the kernel wastes some cycles on meaningless
calculations for all of these modules. In theory, it is not noise.
If you are interested, I can send you the kernel boot logs. :)

Of course, the 0.004014s may not be accurate enough; it is just an
approximate number.

Thanks
Huang Shijie
Robin Murphy Jan. 20, 2016, 2:46 p.m. UTC | #3
On 20/01/16 13:34, Huang Shijie wrote:
> On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
>> On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
>>> This patch adds a shortcut for the code when the @device_node is NULL.
>>> In my juno-r1 board, the boot time can be faster by 0.004014s.
>>
>> How have you made sure this number is reliable and not just noise in the
>> boot process?
> In the boot process, there are 5 or more modules whose @dev_node are
> NULL. Without the patch, the kernel will waste some cycles to do the
> meaningless calculations for all these modules.

With a quick counting hack, booting 4.4 on my r1 indeed shows 5 calls 
where dev_node is null. Plus 68 calls in which we waste cycles doing 
meaningless calculations when dev_node is non-null. The fundamental 
issue at hand is that the "platform bus" is a rubbish abstraction.
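The counting hack itself is not shown in this thread. A minimal userspace sketch of what such a counter could look like follows; `struct lookup_stats` and `count_lookup()` are hypothetical names, and in the driver the call would presumably sit at the top of find_smmu_for_device().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally of lookups, split by whether dev_node was NULL. */
struct lookup_stats {
	unsigned long null_node;
	unsigned long non_null_node;
};

/* Record one lookup; dev_node is only tested, never dereferenced. */
static void count_lookup(struct lookup_stats *stats, const void *dev_node)
{
	assert(stats != NULL);
	if (dev_node)
		stats->non_null_node++;
	else
		stats->null_node++;
}
```

Printing both counters at the end of boot would give the two figures quoted above for a particular board and kernel.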

> In theory, it is not noise.
> If you have interest, I can send you the kernel boot logs. :)
>
> Of course, the 0.004014s maybe not accurate enough, it is just an
> approximate number.

A mean and standard deviation of at least, say, 5 runs each with and 
without the patch would be considerably more meaningful (even if still 
far from statistically significant).

Robin.
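
As a rough illustration of the comparison suggested above, the helpers below compute the mean and sample variance of a set of boot times. This is only a sketch: the function names are made up for the example, the boot times would have to come from real measurements, and the standard deviation is the square root of the returned variance (for instance via sqrt() from <math.h>).

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples; n must be at least 1. */
static double sample_mean(const double *x, size_t n)
{
	double sum = 0.0;

	assert(x != NULL && n >= 1);
	for (size_t i = 0; i < n; i++)
		sum += x[i];
	return sum / (double)n;
}

/*
 * Sample variance with n - 1 in the denominator; n must be at least 2.
 * The standard deviation is the square root of this value.
 */
static double sample_variance(const double *x, size_t n)
{
	double mean, sq = 0.0;

	assert(x != NULL && n >= 2);
	mean = sample_mean(x, n);
	for (size_t i = 0; i < n; i++)
		sq += (x[i] - mean) * (x[i] - mean);
	return sq / (double)(n - 1);
}
```

Running both helpers over the boot times with the patch and again over those without it shows whether the difference between the two means is large compared with the spread inside each group.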

Jon Medhurst (Tixy) Jan. 20, 2016, 4 p.m. UTC | #4
On Wed, 2016-01-20 at 14:46 +0000, Robin Murphy wrote:
> > Of course, the 0.004014s maybe not accurate enough, it is just an
> > approximate number.
> 
> A mean and standard deviation of at least, say, 5 runs each with and
> without the patch would be considerably more meaningful (even if still
> far from statistically significant).

It wouldn't surprise me if replacing the proposed change with an 'asm
volatile("nop")' or two also gave a boot time delta of several
milliseconds (due to a change in the cache line alignment of functions). I
don't believe you can reliably measure such minor changes.

It doesn't mean that the proposed change isn't a good addition though,
it obviously results in less code getting executed at the cost of one
or two instructions for a compare and branch.
Huang Shijie Jan. 21, 2016, 2:04 a.m. UTC | #5
On Wed, Jan 20, 2016 at 02:46:34PM +0000, Robin Murphy wrote:
> On 20/01/16 13:34, Huang Shijie wrote:
> >On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
> >>On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> >>>This patch adds a shortcut for the code when the @device_node is NULL.
> >>>In my juno-r1 board, the boot time can be faster by 0.004014s.
> >>
> >>How have you made sure this number is reliable and not just noise in the
> >>boot process?
> >In the boot process, there are 5 or more modules whose @dev_node are
> >NULL. Without the patch, the kernel will waste some cycles to do the
> >meaningless calculations for all these modules.
>
> With a quick counting hack, booting 4.4 on my r1 indeed shows 5 calls where
> dev_node is null. Plus 68 calls in which we waste cycles doing meaningless
> calculations when dev_node is non-null. The fundamental issue at hand is

The 68 calls are just compare instructions, such as "b.eq".
But if the code runs into the following branch when the @dev_node is NULL,
the instructions executed will outnumber those for the 68 calls (even
including some memory reads when trying to grab the locks).

> that the "platform bus" is a rubbish abstraction.
This is just a small, trivial patch; please ignore it if we think it is
of no use. :)

thanks
Huang Shijie

Patch

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 59ee4b8..6724a46 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -506,6 +506,9 @@  static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 	struct arm_smmu_master *master = NULL;
 	struct device_node *dev_node = dev_get_dev_node(dev);
 
+	if (!dev_node)
+		return NULL;
+
 	spin_lock(&arm_smmu_devices_lock);
 	list_for_each_entry(smmu, &arm_smmu_devices, list) {
 		master = find_smmu_master(smmu, dev_node);
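
The effect of the hunk above can be modelled in userspace. This is a sketch under loud assumptions: `struct node_model`, `struct smmu_model`, `lock_count` and `find_smmu_model()` are hypothetical stand-ins, a flat array replaces the driver's list and master lookup, and a counter replaces the spinlock. It only shows that the NULL case returns before any locking or searching happens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the driver's types. */
struct node_model {
	int id;
};

struct smmu_model {
	const struct node_model *masters[4];
	size_t nr_masters;
};

/* Counts how often the modelled lock would have been taken. */
static unsigned long lock_count;

static const struct smmu_model *
find_smmu_model(const struct smmu_model *smmus, size_t nr_smmus,
		const struct node_model *dev_node)
{
	assert(smmus != NULL || nr_smmus == 0);

	/* The shortcut from the patch: a NULL node cannot match anything. */
	if (!dev_node)
		return NULL;

	lock_count++;	/* stands in for spin_lock(&arm_smmu_devices_lock) */
	for (size_t i = 0; i < nr_smmus; i++)
		for (size_t j = 0; j < smmus[i].nr_masters; j++)
			if (smmus[i].masters[j] == dev_node)
				return &smmus[i];
	return NULL;
}
```

With the check in place, a NULL node costs one compare and branch, while a non-NULL node pays that same compare before the usual search.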