Message ID | 1452564905-2662-1-git-send-email-shijie.huang@arm.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> This patch adds a shortcut for the code when the @device_node is NULL.
> In my juno-r1 board, the boot time can be faster by 0.004014s.

How have you made sure this number is reliable and not just noise in the
boot process?

	Joerg
On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
> On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> > This patch adds a shortcut for the code when the @device_node is NULL.
> > In my juno-r1 board, the boot time can be faster by 0.004014s.
>
> How have you made sure this number is reliable and not just noise in the
> boot process?

In the boot process, there are 5 or more modules whose @dev_node are NULL.
Without the patch, the kernel will waste some cycles to do the meaningless
calculations for all these modules.

In theory, it is not noise.
If you have interest, I can send you the kernel boot logs. :)

Of course, the 0.004014s maybe not accurate enough, it is just an
approximate number.

Thanks
Huang Shijie
On 20/01/16 13:34, Huang Shijie wrote:
> On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
>> On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
>>> This patch adds a shortcut for the code when the @device_node is NULL.
>>> In my juno-r1 board, the boot time can be faster by 0.004014s.
>>
>> How have you made sure this number is reliable and not just noise in the
>> boot process?
> In the boot process, there are 5 or more modules whose @dev_node are
> NULL. Without the patch, the kernel will waste some cycles to do the
> meaningless calculations for all these modules.

With a quick counting hack, booting 4.4 on my r1 indeed shows 5 calls
where dev_node is null. Plus 68 calls in which we waste cycles doing
meaningless calculations when dev_node is non-null. The fundamental
issue at hand is that the "platform bus" is a rubbish abstraction.

> In theory, it is not noise.
> If you have interest, I can send you the kernel boot logs. :)
>
> Of course, the 0.004014s maybe not accurate enough, it is just an
> approximate number.

A mean and standard deviation of at least, say, 5 runs each with and
without the patch would be considerably more meaningful (even if still
far from statistically significant).

Robin.

>
> Thanks
> Huang Shijie
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
On Wed, 2016-01-20 at 14:46 +0000, Robin Murphy wrote:
> > Of course, the 0.004014s maybe not accurate enough, it is just an
> > approximate number.
>
> A mean and standard deviation of at least, say, 5 runs each with and
> without the patch would be considerably more meaningful (even if still
> far from statistically significant).

It wouldn't surprise me if replacing the proposed change with
an 'asm volatile("nop")' or two also gave a boot time delta of
several milliseconds (due to change in cache line alignment
of functions).

I don't believe you can reliably measure such minor changes.

It doesn't mean that the proposed change isn't a good addition
though, it obviously results in less code getting executed for
the cost of one or two instructions for a compare and branch.
On Wed, Jan 20, 2016 at 02:46:34PM +0000, Robin Murphy wrote:
> On 20/01/16 13:34, Huang Shijie wrote:
> >On Wed, Jan 20, 2016 at 01:02:25PM +0100, Joerg Roedel wrote:
> >>On Tue, Jan 12, 2016 at 10:15:05AM +0800, Huang Shijie wrote:
> >>>This patch adds a shortcut for the code when the @device_node is NULL.
> >>>In my juno-r1 board, the boot time can be faster by 0.004014s.
> >>
> >>How have you made sure this number is reliable and not just noise in the
> >>boot process?
> >In the boot process, there are 5 or more modules whose @dev_node are
> >NULL. Without the patch, the kernel will waste some cycles to do the
> >meaningless calculations for all these modules.
>
> With a quick counting hack, booting 4.4 on my r1 indeed shows 5 calls where
> dev_node is null. Plus 68 calls in which we waste cycles doing meaningless
> calculations when dev_node is non-null. The fundamental issue at hand is

The 68 calls are just compare instructions, such as "b.eq".
But if the code runs to the following branch when the @dev_node is NULL,
the instructions executed will be more than the 68 calls (even including
some memory read instructions when trying to grab the locks).

> that the "platform bus" is a rubbish abstraction.

This is just a small trivial patch, please ignore it if we think it is
no use. :)

thanks
Huang Shijie
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 59ee4b8..6724a46 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -506,6 +506,9 @@ static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
 	struct arm_smmu_master *master = NULL;
 	struct device_node *dev_node = dev_get_dev_node(dev);
 
+	if (!dev_node)
+		return NULL;
+
 	spin_lock(&arm_smmu_devices_lock);
 	list_for_each_entry(smmu, &arm_smmu_devices, list) {
 		master = find_smmu_master(smmu, dev_node);
When a device is added to a bus, it will trigger the chain notifier hook,
such as iommu_bus_notifier. The find_smmu_for_device() can be called here.

In some cases, the @device_node can be NULL. For example, when the device
is one of the following: alarmtime, serial, rtc or snd_soc.

This patch adds a shortcut for the code when the @device_node is NULL.
In my juno-r1 board, the boot time can be faster by 0.004014s.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)
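The effect of the shortcut can be illustrated with a small userspace model. This is a sketch under assumptions, not the driver code: `struct smmu`, `find_smmu()` and the `lock_acquisitions` counter are hypothetical stand-ins for the kernel's SMMU list, find_smmu_for_device() and spin_lock(&arm_smmu_devices_lock), and the local `struct device_node` is a dummy type rather than the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Dummy stand-in for the kernel's struct device_node. */
struct device_node {
	int id;
};

/* Hypothetical model of one registered SMMU with a single master. */
struct smmu {
	const struct device_node *master;
	struct smmu *next;
};

/* Counts how often the modelled lock would have been taken. */
static unsigned long lock_acquisitions;

/* Model of the lookup with the early return from the patch. */
static struct smmu *find_smmu(struct smmu *head, const struct device_node *np)
{
	struct smmu *s;

	if (!np)
		return NULL;	/* shortcut: no lock, no list walk */

	lock_acquisitions++;	/* stands in for taking the spinlock */
	for (s = head; s; s = s->next)
		if (s->master == np)
			return s;
	return NULL;
}
```

In this model a NULL node returns before the lock counter is touched, which is the work the patch skips for devices such as alarmtime, serial, rtc or snd_soc. A non-NULL node still takes the lock and walks the list as before.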