Message ID | 20160919193029.8119-4-marex@denx.de (mailing list archive) |
---|---|
State | Accepted |
Commit | bfcf928d7604692668a7c33b783d05b8e3459d09 |
Headers | show |
Hi Marek, On 19.09.2016 21:30, Marek Vasut wrote: > Same as commit ce492b3b8f99cf9d2f807ec22d8805c996a09503 > Subject: drm/fsl-dcu: use flat regmap cache > > Using flat regmap cache instead of RB-tree to avoid the following > lockdep warning on driver load: > WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2871 lockdep_trace_alloc+0x104/0x128 > DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) > > The RB-tree regmap cache needs to allocate new space on first > writes. That's why SSI driver had .num_reg_defaults_raw set in regmap config. With this set and .reg_defaults_raw unset regcache_hw_init() will read existing register content which then will be used to populate cache (and so allocate RB tree nodes) at regcache initialization time. This means that allocation of new cache RB tree nodes should never happen during actual write. If you get a warning that it happened then maybe there is a write to some register not covered by .num_reg_defaults_raw - this should be fixed. The problem with flat cache is that it is zero-initialized, that is, all registers are assumed to contain zeros by default. This is generally not true in case of SSI so reads from and bit updates in non-volatile registers will corrupt their value. You can also refer to discussion when .num_reg_defaults_raw was first added here: http://www.gossamer-threads.com/lists/linux/kernel/2330573 Maciej
On 09/21/2016 06:28 PM, Maciej S. Szmigiero wrote: > Hi Marek, Hi! > On 19.09.2016 21:30, Marek Vasut wrote: >> Same as commit ce492b3b8f99cf9d2f807ec22d8805c996a09503 >> Subject: drm/fsl-dcu: use flat regmap cache >> >> Using flat regmap cache instead of RB-tree to avoid the following >> lockdep warning on driver load: >> WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2871 lockdep_trace_alloc+0x104/0x128 >> DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) >> >> The RB-tree regmap cache needs to allocate new space on first >> writes. > > That's why SSI driver had .num_reg_defaults_raw set in regmap config. > With this set and .reg_defaults_raw unset regcache_hw_init() will read > existing register content which then will be used to populate cache > (and so allocate RB tree nodes) at regcache initialization time. OK > This means that allocation of new cache RB tree nodes should never happen > during actual write. > If you get a warning that it happened then maybe there is a write to > some register not covered by .num_reg_defaults_raw - this should be > fixed. I might be blind, but I really don't see that. I just discussed it with Mark and he thinks the same. I will revisit this one more time. > The problem with flat cache is that it is zero-initialized, that is, > all registers are assumed to contain zeros by default. > This is generally not true in case of SSI so reads from and bit updates > in non-volatile registers will corrupt their value. This will let the driver init the hardware into a well-defined state. Are there any bits in the SSI config registers which we must preserve? > You can also refer to discussion when .num_reg_defaults_raw was first > added here: > http://www.gossamer-threads.com/lists/linux/kernel/2330573 Thanks! > Maciej >
On 21.09.2016 21:32, Marek Vasut wrote: > On 09/21/2016 06:28 PM, Maciej S. Szmigiero wrote: >> Hi Marek, > > Hi! > >> On 19.09.2016 21:30, Marek Vasut wrote: (..) >>> The RB-tree regmap cache needs to allocate new space on first >>> writes. >> >> That's why SSI driver had .num_reg_defaults_raw set in regmap config. >> With this set and .reg_defaults_raw unset regcache_hw_init() will read >> existing register content which then will be used to populate cache >> (and so allocate RB tree nodes) at regcache initialization time. > > OK > >> This means that allocation of new cache RB tree nodes should never happen >> during actual write. >> If you get a warning that it happened then maybe there is a write to >> some register not covered by .num_reg_defaults_raw - this should be >> fixed. > > I might be blind, but I really don't see that. I just discussed it with > Mark and he thinks the same. I will revisit this one more time. Maybe it would be possible to get a backtrace when that warning was generated so we know which register write or update triggers new cache allocation? >> The problem with flat cache is that it is zero-initialized, that is, >> all registers are assumed to contain zeros by default. >> This is generally not true in case of SSI so reads from and bit updates >> in non-volatile registers will corrupt their value. > > This will let the driver init the hardware into a well-defined state. > Are there any bits in the SSI config registers which we must preserve? Hard to tell, this would need to be checked, but when datasheet values were used as cache defaults the SoC locked up when the fsl_ssi module was reloaded, at least in AC'97 mode. I guess with zeros as defaults similar problems could occur. Maciej
On 09/22/2016 12:18 AM, Maciej S. Szmigiero wrote: > On 21.09.2016 21:32, Marek Vasut wrote: >> On 09/21/2016 06:28 PM, Maciej S. Szmigiero wrote: >>> Hi Marek, >> >> Hi! >> >>> On 19.09.2016 21:30, Marek Vasut wrote: > (..) >>>> The RB-tree regmap cache needs to allocate new space on first >>>> writes. >>> >>> That's why SSI driver had .num_reg_defaults_raw set in regmap config. >>> With this set and .reg_defaults_raw unset regcache_hw_init() will read >>> existing register content which then will be used to populate cache >>> (and so allocate RB tree nodes) at regcache initialization time. >> >> OK >> >>> This means that allocation of new cache RB tree nodes should never happen >>> during actual write. >>> If you get a warning that it happened then maybe there is a write to >>> some register not covered by .num_reg_defaults_raw - this should be >>> fixed. >> >> I might be blind, but I really don't see that. I just discussed it with >> Mark and he thinks the same. I will revisit this one more time. > > Maybe it would be possible to get a backtrace when that warning was > generated so we know which register write or update triggers new cache > allocation? See patch 5/5 . >>> The problem with flat cache is that it is zero-initialized, that is, >>> all registers are assumed to contain zeros by default. >>> This is generally not true in case of SSI so reads from and bit updates >>> in non-volatile registers will corrupt their value. >> >> This will let the driver init the hardware into a well-defined state. >> Are there any bits in the SSI config registers which we must preserve? > > Hard to tell, this would need to be checked, but when datasheet values > were used as cache defaults the SoC locked up when the fsl_ssi module > was reloaded, at least in AC'97 mode. > > I guess with zeros as defaults similar problems could occur. I don't observe any problem either way on MX6SX , which should cover also most of the MX6es. > Maciej >
On 22.09.2016 00:45, Marek Vasut wrote: > On 09/22/2016 12:18 AM, Maciej S. Szmigiero wrote: >> On 21.09.2016 21:32, Marek Vasut wrote: >>> On 09/21/2016 06:28 PM, Maciej S. Szmigiero wrote: >>>> Hi Marek, >>> >>> Hi! >>> >>>> On 19.09.2016 21:30, Marek Vasut wrote: >> (..) >>>>> The RB-tree regmap cache needs to allocate new space on first >>>>> writes. >>>> >>>> That's why SSI driver had .num_reg_defaults_raw set in regmap config. >>>> With this set and .reg_defaults_raw unset regcache_hw_init() will read >>>> existing register content which then will be used to populate cache >>>> (and so allocate RB tree nodes) at regcache initialization time. >>> >>> OK >>> >>>> This means that allocation of new cache RB tree nodes should never happen >>>> during actual write. >>>> If you get a warning that it happened then maybe there is a write to >>>> some register not covered by .num_reg_defaults_raw - this should be >>>> fixed. >>> >>> I might be blind, but I really don't see that. I just discussed it with >>> Mark and he thinks the same. I will revisit this one more time. >> >> Maybe it would be possible to get a backtrace when that warning was >> generated so we know which register write or update triggers new cache >> allocation? > > See patch 5/5 . In you patch 5/5 there is backtrace with flat cache, but I meant a backtrace with original RB tree cache code when warning about allocation being done in atomic context was generated. Maciej
On Thu, Sep 22, 2016 at 01:33:47AM +0200, Maciej S. Szmigiero wrote: > In you patch 5/5 there is backtrace with flat cache, but I meant a > backtrace with original RB tree cache code when warning about allocation > being done in atomic context was generated. I think the conclusion of this and some IRC discussion yesterday is that removing patch 5 fixes things - is that right?
On 22.09.2016 12:45, Mark Brown wrote: > On Thu, Sep 22, 2016 at 01:33:47AM +0200, Maciej S. Szmigiero wrote: > >> In you patch 5/5 there is backtrace with flat cache, but I meant a >> backtrace with original RB tree cache code when warning about allocation >> being done in atomic context was generated. > > I think the conclusion of this and some IRC discussion yesterday is that > removing patch 5 fixes things - is that right? There are three possible solutions: 1) Go back to RB tree cache (revert patch 4) and debug the lockdep warning, 2) Keep the flat cache introduced by patch 4 but debug the oops at cache init time (while keeping defaults read from hardware), 3) Keep both patches and risk the same issues that were previously caused by hardcoded register defaults in driver. However, Marek says the device works fine for him with the patches. It looks to me 1) is an easier and safer solution, especially that the driver previously worked fine with RB tree cache. I would test this myself but unfortunately I don't have access to hardware this week. Maciej
On 09/22/2016 01:33 AM, Maciej S. Szmigiero wrote: > On 22.09.2016 00:45, Marek Vasut wrote: >> On 09/22/2016 12:18 AM, Maciej S. Szmigiero wrote: >>> On 21.09.2016 21:32, Marek Vasut wrote: >>>> On 09/21/2016 06:28 PM, Maciej S. Szmigiero wrote: >>>>> Hi Marek, >>>> >>>> Hi! >>>> >>>>> On 19.09.2016 21:30, Marek Vasut wrote: >>> (..) >>>>>> The RB-tree regmap cache needs to allocate new space on first >>>>>> writes. >>>>> >>>>> That's why SSI driver had .num_reg_defaults_raw set in regmap config. >>>>> With this set and .reg_defaults_raw unset regcache_hw_init() will read >>>>> existing register content which then will be used to populate cache >>>>> (and so allocate RB tree nodes) at regcache initialization time. >>>> >>>> OK >>>> >>>>> This means that allocation of new cache RB tree nodes should never happen >>>>> during actual write. >>>>> If you get a warning that it happened then maybe there is a write to >>>>> some register not covered by .num_reg_defaults_raw - this should be >>>>> fixed. >>>> >>>> I might be blind, but I really don't see that. I just discussed it with >>>> Mark and he thinks the same. I will revisit this one more time. >>> >>> Maybe it would be possible to get a backtrace when that warning was >>> generated so we know which register write or update triggers new cache >>> allocation? >> >> See patch 5/5 . > > In you patch 5/5 there is backtrace with flat cache, but I meant a > backtrace with original RB tree cache code when warning about allocation > being done in atomic context was generated. Here you go: > fsl-ssi-dai 202c000.ssi: No cache defaults, reading back from HW > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2871 lockdep_trace_alloc+0x104/0x128 > DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.4-00041-g0260041 #2 > Hardware name: Freescale i.MX6 SoloX (Device Tree) > Backtrace: > [<c010c188>] (dump_backtrace) from [<c010c380>] (show_stack+0x18/0x1c) > r7:00000000 r6:60000093 r5:00000000 r4:c0f1e77c > [<c010c368>] (show_stack) from [<c03f8184>] (dump_stack+0xb4/0xe8) > [<c03f80d0>] (dump_stack) from [<c0123638>] (__warn+0xd8/0x104) > r9:c016f054 r8:00000b37 r7:00000009 r6:c0c59f68 r5:00000000 r4:db453b10 > [<c0123560>] (__warn) from [<c01236a0>] (warn_slowpath_fmt+0x3c/0x44) > r9:00000012 r8:024000c0 r7:024000c0 r6:00000040 r5:024000c0 r4:c0c570f0 > [<c0123668>] (warn_slowpath_fmt) from [<c016f054>] (lockdep_trace_alloc+0x104/0x128) > r3:c0c5b07c r2:c0c570f0 > r4:60000093 > [<c016ef50>] (lockdep_trace_alloc) from [<c0215e24>] (__kmalloc_track_caller+0x44/0x230) > r5:024000c0 r4:db401e80 > [<c0215de0>] (__kmalloc_track_caller) from [<c01f01d0>] (krealloc+0x54/0xc0) > r10:00000058 r9:00000012 r8:c050d4f0 r7:024000c0 r6:00000040 r5:0000004c > r4:dacd7d80 > [<c01f017c>] (krealloc) from [<c050d4f0>] (regcache_rbtree_write+0x364/0x608) > r9:00000012 r8:00000004 r7:00000010 r6:dacd9a00 r5:dacd7d50 r4:00000010 > [<c050d18c>] (regcache_rbtree_write) from [<c050c230>] (regcache_write+0x5c/0x64) > r10:dacd9818 r9:dacd9964 r8:00000000 r7:dacd9a00 r6:000000ff r5:00000058 > r4:dacd9a00 > [<c050c1d4>] (regcache_write) from [<c050a3ac>] (_regmap_write+0x74/0x9c) > r7:dacd9a00 r6:000000ff r5:00000058 r4:dacd9a00 > [<c050a338>] (_regmap_write) from [<c050b5bc>] (regmap_write+0x44/0x64) > r7:00000006 r6:000000ff r5:00000058 r4:dacd9a00 > [<c050b578>] (regmap_write) from [<c0731090>] (_fsl_ssi_set_dai_fmt+0x31c/0x4fc) > r7:00000006 r6:00000000 r5:dacd9a00 r4:dacd9810 > [<c0730d74>] (_fsl_ssi_set_dai_fmt) from [<c0731f94>] (fsl_ssi_probe+0x58c/0x7b4) > r9:dacd9964 r8:00000000 r7:dbbbf164 r6:db4f0000 r5:db4f0010 r4:dacd9810 > [<c0731a08>] (fsl_ssi_probe) from [<c0501994>] (platform_drv_probe+0x3c/0x74) > r10:c0e00618 r9:00000000 r8:c0f608b4 r7:00000000 r6:c1753a70 r5:c0f608b4 > r4:db4f0010 > [<c0501958>] (platform_drv_probe) from [<c0500134>] (really_probe+0x1b8/0x264) > r5:c1753a68 r4:db4f0010 > [<c04fff7c>] (really_probe) from [<c05002b4>] (__driver_attach+0xd4/0xd8) > r9:c0e58858 r8:00000000 r7:00000000 r6:db4f0044 r5:c0f608b4 r4:db4f0010 > [<c05001e0>] (__driver_attach) from [<c04fe2c4>] (bus_for_each_dev+0x74/0xa8) > r7:00000000 r6:c05001e0 r5:c0f608b4 r4:00000000 > [<c04fe250>] (bus_for_each_dev) from [<c04ffaa4>] (driver_attach+0x20/0x28) > r6:c0f2ef08 r5:dacd6480 r4:c0f608b4 > [<c04ffa84>] (driver_attach) from [<c04ff5f4>] (bus_add_driver+0x104/0x214) > [<c04ff4f0>] (bus_add_driver) from [<c05008e0>] (driver_register+0x80/0xfc) > r7:c0d498e0 r6:c0e58850 r5:ffffe000 r4:c0f608b4 > [<c0500860>] (driver_register) from [<c0501944>] (__platform_driver_register+0x38/0x4c) > r5:ffffe000 r4:c0e4cdf8 > [<c050190c>] (__platform_driver_register) from [<c0e4ce10>] (fsl_ssi_driver_init+0x18/0x20) > [<c0e4cdf8>] (fsl_ssi_driver_init) from [<c0101924>] (do_one_initcall+0x44/0x178) > [<c01018e0>] (do_one_initcall) from [<c0e00e74>] (kernel_init_freeable+0x124/0x1e8) > r8:000000ed r7:c0d498e0 r6:c0e58850 r5:c0f6e000 r4:c0e6b590 > [<c0e00d50>] (kernel_init_freeable) from [<c0958080>] (kernel_init+0x10/0x11c) > r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0958070 > r4:00000000 > [<c0958070>] (kernel_init) from [<c0107c70>] (ret_from_fork+0x14/0x24) > r5:c0958070 r4:00000000 > ---[ end trace 4907f763bc504c04 ]--- > imx-wm9712 sound: wm9712-hifi <-> 202c000.ssi mapping ok > wm9712-codec wm9712-codec: ASoC: mux Capture Phone Mux has no paths > wm9712-codec wm9712-codec: ASoC: mux Differential Source has no paths
On 09/22/2016 04:37 PM, Maciej S. Szmigiero wrote: > On 22.09.2016 12:45, Mark Brown wrote: >> On Thu, Sep 22, 2016 at 01:33:47AM +0200, Maciej S. Szmigiero wrote: >> >>> In you patch 5/5 there is backtrace with flat cache, but I meant a >>> backtrace with original RB tree cache code when warning about allocation >>> being done in atomic context was generated. >> >> I think the conclusion of this and some IRC discussion yesterday is that >> removing patch 5 fixes things - is that right? > > There are three possible solutions: > 1) Go back to RB tree cache (revert patch 4) and debug the lockdep warning, > 2) Keep the flat cache introduced by patch 4 but debug the oops at > cache init time (while keeping defaults read from hardware), > 3) Keep both patches and risk the same issues that were previously > caused by hardcoded register defaults in driver. > However, Marek says the device works fine for him with the patches. > > It looks to me 1) is an easier and safer solution, especially that > the driver previously worked fine with RB tree cache. > I would test this myself but unfortunately I don't have access to > hardware this week. I believe switching to flat cache is harmless and it should be a better mode for devices with small mmio address spaces. After some discussion on IRC, I believe [1] should resolve the RBTREE crash (?). [1] http://git.kernel.org/cgit/linux/kernel/git/broonie/regmap.git/commit/?h=for-next&id=b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e > Maciej >
On 22.09.2016 17:27, Marek Vasut wrote: > On 09/22/2016 04:37 PM, Maciej S. Szmigiero wrote: >> On 22.09.2016 12:45, Mark Brown wrote: >>> On Thu, Sep 22, 2016 at 01:33:47AM +0200, Maciej S. Szmigiero wrote: >>> (..) >> It looks to me 1) is an easier and safer solution, especially that >> the driver previously worked fine with RB tree cache. >> I would test this myself but unfortunately I don't have access to >> hardware this week. > > I believe switching to flat cache is harmless and it should be a better > mode for devices with small mmio address spaces. After some discussion > on IRC, I believe [1] should resolve the RBTREE crash (?). > > [1] > http://git.kernel.org/cgit/linux/kernel/git/broonie/regmap.git/commit/?h=for-next&id=b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e I assume you meant "flat cache crash" not "RBTREE crash". I will test flat cache version with this patch as soon as I get my UDOO board back (probably next week) - maybe everything will be fine. Maciej
On Thu, Sep 22, 2016 at 05:27:42PM +0200, Marek Vasut wrote: > On 09/22/2016 04:37 PM, Maciej S. Szmigiero wrote: > > There are three possible solutions: > > 1) Go back to RB tree cache (revert patch 4) and debug the lockdep warning, > > 2) Keep the flat cache introduced by patch 4 but debug the oops at > > cache init time (while keeping defaults read from hardware), > > 3) Keep both patches and risk the same issues that were previously > > caused by hardcoded register defaults in driver. > > However, Marek says the device works fine for him with the patches. > I believe switching to flat cache is harmless and it should be a better > mode for devices with small mmio address spaces. After some discussion > on IRC, I believe [1] should resolve the RBTREE crash (?). So, can we get a clear decision on this after testing? The discussion here and the discussion on IRC are a bit disjointed and I'd like to make sure we end up with something that definitely works.
On 09/26/2016 09:13 PM, Mark Brown wrote: > On Thu, Sep 22, 2016 at 05:27:42PM +0200, Marek Vasut wrote: >> On 09/22/2016 04:37 PM, Maciej S. Szmigiero wrote: > >>> There are three possible solutions: >>> 1) Go back to RB tree cache (revert patch 4) and debug the lockdep warning, >>> 2) Keep the flat cache introduced by patch 4 but debug the oops at >>> cache init time (while keeping defaults read from hardware), >>> 3) Keep both patches and risk the same issues that were previously >>> caused by hardcoded register defaults in driver. >>> However, Marek says the device works fine for him with the patches. > >> I believe switching to flat cache is harmless and it should be a better >> mode for devices with small mmio address spaces. After some discussion >> on IRC, I believe [1] should resolve the RBTREE crash (?). > > So, can we get a clear decision on this after testing? The discussion > here and the discussion on IRC are a bit disjointed and I'd like to make > sure we end up with something that definitely works. > I think 1,2,3,4 are certainly harmless. I think even 5 is harmless (at least according to my testing on MX6Q and MX6SX, which is not the greatest sample), but can be safely reverted if there are concerns. I also believe the following two patches should be backported for stable 4.7 (they were part of tag 'regmap-fix-v4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap): commit b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e Author: Maarten ter Huurne <maarten@treewalker.org> Date: Fri Jul 29 23:42:12 2016 +0200 regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 Author: Lars-Peter Clausen <lars@metafoo.de> Date: Thu Aug 4 17:22:16 2016 +0200 regmap: rbtree: Avoid overlapping nodes
On Mon, Sep 26, 2016 at 09:25:51PM +0200, Marek Vasut wrote: > commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 > Author: Lars-Peter Clausen <lars@metafoo.de> > Date: Thu Aug 4 17:22:16 2016 +0200 > regmap: rbtree: Avoid overlapping nodes This one is aimed for stable but we want to let it cook for a longer than normal time (I was thinking just after the merge window) as it feels unusually risky given that we keep uncovering unintended consequences in this area.
On 26.09.2016 21:25, Marek Vasut wrote: > On 09/26/2016 09:13 PM, Mark Brown wrote: (..) >> So, can we get a clear decision on this after testing? The discussion >> here and the discussion on IRC are a bit disjointed and I'd like to make >> sure we end up with something that definitely works. >> > I think 1,2,3,4 are certainly harmless. I think even 5 is harmless (at > least according to my testing on MX6Q and MX6SX, which is not the > greatest sample), but can be safely reverted if there are concerns. I've tested current sound/for-next on UDOO board and was able to reproduce a lockup on reloading fsl sound modules. The exact test was playing 30 second sound file and reloading modules in a loop. The lockup would happen sometimes almost immediately and sometimes after ~30 minutes of such testing. When I reverted commit 7de2763d9b325ee5e7e24ac513c93394406cfefa (that is, number 5) it seems that it no longer locks up. > I also believe the following two patches should be backported for stable > 4.7 (they were part of tag 'regmap-fix-v4.8-rc5' of > git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap): > > commit b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e > Author: Maarten ter Huurne <maarten@treewalker.org> > Date: Fri Jul 29 23:42:12 2016 +0200 > > regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw > > commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 > Author: Lars-Peter Clausen <lars@metafoo.de> > Date: Thu Aug 4 17:22:16 2016 +0200 > > regmap: rbtree: Avoid overlapping nodes > I can see that "regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw" is present in the tree that I have tested. Maciej
On 09/28/2016 01:23 AM, Maciej S. Szmigiero wrote: > On 26.09.2016 21:25, Marek Vasut wrote: >> On 09/26/2016 09:13 PM, Mark Brown wrote: > (..) >>> So, can we get a clear decision on this after testing? The discussion >>> here and the discussion on IRC are a bit disjointed and I'd like to make >>> sure we end up with something that definitely works. >>> >> I think 1,2,3,4 are certainly harmless. I think even 5 is harmless (at >> least according to my testing on MX6Q and MX6SX, which is not the >> greatest sample), but can be safely reverted if there are concerns. > > I've tested current sound/for-next on UDOO board and was able to reproduce > a lockup on reloading fsl sound modules. > The exact test was playing 30 second sound file and reloading modules in > a loop. > The lockup would happen sometimes almost immediately and sometimes after > ~30 minutes of such testing. > > When I reverted commit 7de2763d9b325ee5e7e24ac513c93394406cfefa > (that is, number 5) it seems that it no longer locks up. Do you have an explanation how would #5 cause this problem if it was supposed to fix it ? ;-) Can you share the backtrace ? >> I also believe the following two patches should be backported for stable >> 4.7 (they were part of tag 'regmap-fix-v4.8-rc5' of >> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap): >> >> commit b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e >> Author: Maarten ter Huurne <maarten@treewalker.org> >> Date: Fri Jul 29 23:42:12 2016 +0200 >> >> regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw >> >> commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 >> Author: Lars-Peter Clausen <lars@metafoo.de> >> Date: Thu Aug 4 17:22:16 2016 +0200 >> >> regmap: rbtree: Avoid overlapping nodes >> > > I can see that > "regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw" is > present in the tree that I have tested. And the rbtree patch is also present ? > Maciej >
On 28.09.2016 01:29, Marek Vasut wrote: > On 09/28/2016 01:23 AM, Maciej S. Szmigiero wrote: >> On 26.09.2016 21:25, Marek Vasut wrote: >>> On 09/26/2016 09:13 PM, Mark Brown wrote: >> (..) >>>> So, can we get a clear decision on this after testing? The discussion >>>> here and the discussion on IRC are a bit disjointed and I'd like to make >>>> sure we end up with something that definitely works. >>>> >>> I think 1,2,3,4 are certainly harmless. I think even 5 is harmless (at >>> least according to my testing on MX6Q and MX6SX, which is not the >>> greatest sample), but can be safely reverted if there are concerns. >> >> I've tested current sound/for-next on UDOO board and was able to reproduce >> a lockup on reloading fsl sound modules. >> The exact test was playing 30 second sound file and reloading modules in >> a loop. >> The lockup would happen sometimes almost immediately and sometimes after >> ~30 minutes of such testing. >> >> When I reverted commit 7de2763d9b325ee5e7e24ac513c93394406cfefa >> (that is, number 5) it seems that it no longer locks up. > > Do you have an explanation how would #5 cause this problem if it was > supposed to fix it ? ;-) Can you share the backtrace ? I think the reason why it locks up is similar to previous case that it locked up in similar circumstances with hardcoded register defaults - there is some inconsistency between register value in cache and actual value in hardware which then gets (partially?) rewritten with cached value. But I don't know the exact problematic register. It looks like the SoC just locks up after this error: ac97-codec ac97-codec.0: ASoC: codec DAI prepare error: -22 >>> I also believe the following two patches should be backported for >>> 4.7 (they were part of tag 'regmap-fix-v4.8-rc5' of >>> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap): >>> >>> commit b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e >>> Author: Maarten ter Huurne <maarten@treewalker.org> >>> Date: Fri Jul 29 23:42:12 2016 +0200 >>> >>> regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw >>> >>> commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 >>> Author: Lars-Peter Clausen <lars@metafoo.de> >>> Date: Thu Aug 4 17:22:16 2016 +0200 >>> >>> regmap: rbtree: Avoid overlapping nodes >>> >> >> I can see that >> "regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw" is >> present in the tree that I have tested. > > And the rbtree patch is also present ? Yes - I commented only on the first one since you said it would probably fix the rbtree crash (which I understand is the one which patch 5 was also trying to fix). Maciej
On 09/28/2016 02:04 AM, Maciej S. Szmigiero wrote: > On 28.09.2016 01:29, Marek Vasut wrote: >> On 09/28/2016 01:23 AM, Maciej S. Szmigiero wrote: >>> On 26.09.2016 21:25, Marek Vasut wrote: >>>> On 09/26/2016 09:13 PM, Mark Brown wrote: >>> (..) >>>>> So, can we get a clear decision on this after testing? The discussion >>>>> here and the discussion on IRC are a bit disjointed and I'd like to make >>>>> sure we end up with something that definitely works. >>>>> >>>> I think 1,2,3,4 are certainly harmless. I think even 5 is harmless (at >>>> least according to my testing on MX6Q and MX6SX, which is not the >>>> greatest sample), but can be safely reverted if there are concerns. >>> >>> I've tested current sound/for-next on UDOO board and was able to reproduce >>> a lockup on reloading fsl sound modules. >>> The exact test was playing 30 second sound file and reloading modules in >>> a loop. >>> The lockup would happen sometimes almost immediately and sometimes after >>> ~30 minutes of such testing. >>> >>> When I reverted commit 7de2763d9b325ee5e7e24ac513c93394406cfefa >>> (that is, number 5) it seems that it no longer locks up. >> >> Do you have an explanation how would #5 cause this problem if it was >> supposed to fix it ? ;-) Can you share the backtrace ? > > I think the reason why it locks up is similar to previous case that it > locked up in similar circumstances with hardcoded register defaults - > there is some inconsistency between register value in cache and actual > value in hardware which then gets (partially?) rewritten with cached > value. > But I don't know the exact problematic register. > > It looks like the SoC just locks up after this error: > ac97-codec ac97-codec.0: ASoC: codec DAI prepare error: -22 This looks like a different error, can you poke into this please ? >>>> I also believe the following two patches should be backported for >>>> 4.7 (they were part of tag 'regmap-fix-v4.8-rc5' of >>>> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap): >>>> >>>> commit b2c7f5d9c939a37c1ce7f86a642de70e3033ee9e >>>> Author: Maarten ter Huurne <maarten@treewalker.org> >>>> Date: Fri Jul 29 23:42:12 2016 +0200 >>>> >>>> regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw >>>> >>>> commit 1bc8da4e143c0fd8807e061a66d91d5972601ab1 >>>> Author: Lars-Peter Clausen <lars@metafoo.de> >>>> Date: Thu Aug 4 17:22:16 2016 +0200 >>>> >>>> regmap: rbtree: Avoid overlapping nodes >>>> >>> >>> I can see that >>> "regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw" is >>> present in the tree that I have tested. >> >> And the rbtree patch is also present ? > > Yes - I commented only on the first one since you said it would probably > fix the rbtree crash (which I understand is the one which > patch 5 was also trying to fix). Seems like we have more problems then, sigh. btw I also had to revert 5881826ded79cf3c3314ee2d84c3bfa94e111b42 on latest next as it made my AC97 misbehave. > Maciej >
diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c index bedec4a..5034943 100644 --- a/sound/soc/fsl/fsl_ssi.c +++ b/sound/soc/fsl/fsl_ssi.c @@ -182,7 +182,7 @@ static const struct regmap_config fsl_ssi_regconfig = { .volatile_reg = fsl_ssi_volatile_reg, .precious_reg = fsl_ssi_precious_reg, .writeable_reg = fsl_ssi_writeable_reg, - .cache_type = REGCACHE_RBTREE, + .cache_type = REGCACHE_FLAT, }; struct fsl_ssi_soc_data {
Same as commit ce492b3b8f99cf9d2f807ec22d8805c996a09503 Subject: drm/fsl-dcu: use flat regmap cache Using flat regmap cache instead of RB-tree to avoid the following lockdep warning on driver load: WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:2871 lockdep_trace_alloc+0x104/0x128 DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) The RB-tree regmap cache needs to allocate new space on first writes. However, allocations in an atomic context (e.g. when a spinlock is held) are not allowed. The function regmap_write calls map->lock, which acquires a spinlock in the fast_io case. Since the driver uses MMIO, the regmap bus of type regmap_mmio is being used which has fast_io set to true. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Fabio Estevam <fabio.estevam@nxp.com> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Nicolin Chen <nicoleotsuka@gmail.com> Cc: Timur Tabi <timur@tabi.org> Cc: Xiubo Li <Xiubo.Lee@gmail.com> --- sound/soc/fsl/fsl_ssi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)