Message ID | 20200508105304.14065-6-Sergey.Semin@baikalelectronics.ru (mailing list archive) |
---|---|
State | Superseded |
Series | dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account |
On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> The IP core of the DW DMA controller may be synthesized with a different
> max burst length of the transfers for each channel. According to Synopsys,
> having a fixed maximum burst transaction length may provide some
> performance gain. At the same time, setting up a source and destination
> multi size exceeding the max burst length limitation may cause serious
> problems. In our case the system just hangs up. In order to fix this,
> let's introduce the max burst length platform config of the DW DMA
> controller device and not let the DMA channel configuration code
> exceed the burst length hardware limitation. Depending on the IP core
> configuration the maximum value can vary from channel to channel.
> It can be detected either at runtime from the DWC parameter registers
> or from the dedicated dts property.

I'm wondering what the scenario can be in which your peripheral asks for something that is not supported by the DMA controller?

The peripheral needs to supply a lot of configuration parameters specific to the DMA controller in use (that's why we have struct dw_dma_slave). So it seems to me the feasible approach is to supply correct data in the first place.

If you have specific channels to acquire then you probably need to provide custom xlate / filter functions, because the above seems a bit of a hackish workaround of the dynamic channel allocation mechanism.

But let's see what we can do better. Since the maximum is defined on the slave device side, it probably needs to define a minimum as well, otherwise it's possible that some hardware can't cope with underrun bursts.

Vinod, what do you think?
On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> I'm wondering what the scenario can be in which your peripheral asks for
> something that is not supported by the DMA controller?

I may have misunderstood your statement, because given your activity around my patchsets, including the SPI patchset, and your sometimes very helpful comments, the answer to this question seems too obvious for you to be asking it.

No need to go far for an example. See the DW APB SSI driver. Its DMA module specifies the burst length to be 16, while not all of our channels support it. Yes, originally it was developed for the Intel Midfield SPI, but since I converted the driver into generic code we can't use a fixed value. For instance, in our hardware only two DMA channels of the total 16 are capable of bursting up to 16 bytes (data items) at a time; the rest of them are limited to a burst length of up to 4 bytes. Meanwhile there are two SPI interfaces, each of which needs two DMA channels for communication. So I need to allocate four channels in total to provide the DMA capability for all interfaces. In order to set the SPI controller up with valid optimized parameters the max-burst-length is required. Otherwise we can end up with buffer overruns/underruns.

> The peripheral needs to supply a lot of configuration parameters specific to
> the DMA controller in use (that's why we have struct dw_dma_slave).
> So it seems to me the feasible approach is to supply correct data in the
> first place.

How can valid data be supplied if clients don't know the DMA controller limitations in general?

> If you have specific channels to acquire then you probably need to provide
> custom xlate / filter functions, because the above seems a bit of a hackish
> workaround of the dynamic channel allocation mechanism.

No, I don't have a specific channel to acquire, and in general you may use any channel returned from the DMA subsystem (though some platforms may need dedicated channels, in which case xlate / filter is required). In our SoC any DW DMAC channel can be used for any DMA-capable peripheral like SPI, I2C, UART. But their DMA settings must be properly and optimally configured. That can only be done if you know the DMA controller parameters like max burst length, max block-size, etc.

So no. The change proposed by this patch isn't a workaround but a useful feature, moreover one expected to be supported by the generic DMA subsystem.

> But let's see what we can do better. Since the maximum is defined on the
> slave device side, it probably needs to define a minimum as well, otherwise
> it's possible that some hardware can't cope with underrun bursts.

There is no need to define a minimum if no such limit exists except a natural 1. Moreover, it doesn't exist for all DMA controllers, seeing that no one has added such a capability to the generic DMA subsystem so far.

-Sergey
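The clamping idea argued for above can be illustrated with a small userspace sketch. It is not kernel code and not the patch itself: `struct chan_caps` and `pick_burst()` are hypothetical names invented here, standing in for whatever per-channel limit the DMA provider would report to a client such as the SPI driver.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a per-channel capability
 * reported by a DMA provider. Not the kernel's real structure.
 */
struct chan_caps {
	uint32_t max_burst;	/* largest burst in data items; 0 = not reported */
};

/*
 * Return the burst length a client should program: the value it
 * would like to use, limited by what the channel can actually do.
 */
static uint32_t pick_burst(const struct chan_caps *caps, uint32_t wanted)
{
	if (caps->max_burst == 0)	/* no limit reported, keep the request */
		return wanted;

	return wanted < caps->max_burst ? wanted : caps->max_burst;
}
```

With the numbers from the message above, a client asking for 16 on a channel limited to 4 would end up programming 4 instead of exceeding the hardware limit.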
On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> No need to go far for an example. See the DW APB SSI driver. Its DMA module
> specifies the burst length to be 16, while not all of our channels support it.
> Yes, originally it was developed for the Intel Midfield SPI, but since I
> converted the driver into generic code we can't use a fixed value. For instance,
> in our hardware only two DMA channels of the total 16 are capable of bursting
> up to 16 bytes (data items) at a time; the rest of them are limited to a burst
> length of up to 4 bytes. Meanwhile there are two SPI interfaces, each of which
> needs two DMA channels for communication. So I need to allocate four channels
> in total to provide the DMA capability for all interfaces. In order to set the
> SPI controller up with valid optimized parameters the max-burst-length is
> required. Otherwise we can end up with buffer overruns/underruns.

Right, and we come to the question of which channels are better used by SPI and which by the rest of the devices. Without a specific filter function you can easily get into a case of inverted optimization, where SPI gets channels with burst = 4 while it needs 16, and other hardware the other way around. Performance-wise it's the worst scenario, which we may avoid in the first place, right?

> How can valid data be supplied if clients don't know the DMA controller
> limitations in general?

This is a good question. DMA controllers are quite different, and having a unified capabilities structure for all of them is an almost impossible task to fulfil. That's why custom filter function(s) can help here. Based on the compatible string you can implement whatever customized quirks you need, for example two functions that try a burst size of 16 first and fall back to 4 if no channel was found.

> No, I don't have a specific channel to acquire, and in general you may use any
> channel returned from the DMA subsystem (though some platforms may need
> dedicated channels, in which case xlate / filter is required). In our SoC any
> DW DMAC channel can be used for any DMA-capable peripheral like SPI, I2C, UART.
> But their DMA settings must be properly and optimally configured. That can only
> be done if you know the DMA controller parameters like max burst length, max
> block-size, etc.
>
> So no. The change proposed by this patch isn't a workaround but a useful
> feature, moreover one expected to be supported by the generic DMA subsystem.

See above.

> There is no need to define a minimum if no such limit exists except a
> natural 1. Moreover, it doesn't exist for all DMA controllers, seeing that
> no one has added such a capability to the generic DMA subsystem so far.

There is a contract between provider and consumer about the DMA resource. That's why both sides should participate in fulfilling it. Theoretically there may be hardware that, for some reason, doesn't support the minimum burst available in the DMA. For such hardware we would need the minimum to be provided as well.
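The "try 16 first, fall back to 4" quirk suggested above could look roughly like the following userspace model. It is only a sketch under stated assumptions: `struct chan`, `find_chan()` and `request_chan()` are hypothetical names, and a real implementation would live in a dmaengine filter or xlate callback rather than in a plain array scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA channel (not a kernel structure). */
struct chan {
	uint32_t max_burst;	/* largest supported burst, in data items */
	bool busy;		/* already handed out to a client */
};

/*
 * Claim the first free channel whose max_burst is at least min_burst.
 * Returns the channel index, or -1 if none qualifies.
 */
static int find_chan(struct chan *chans, size_t n, uint32_t min_burst)
{
	for (size_t i = 0; i < n; i++) {
		if (!chans[i].busy && chans[i].max_burst >= min_burst) {
			chans[i].busy = true;
			return (int)i;
		}
	}
	return -1;
}

/* Two-pass request: prefer a 16-burst channel, fall back to a 4-burst one. */
static int request_chan(struct chan *chans, size_t n)
{
	int idx = find_chan(chans, n, 16);

	if (idx < 0)
		idx = find_chan(chans, n, 4);
	return idx;
}
```

In this model the first two requests take the two 16-burst channels and every later request gets a 4-burst channel, so the outcome depends purely on request order. A client served by the fallback pass still has to learn that its channel is limited to 4.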
On Tue, May 12, 2020 at 10:12:08PM +0300, Andy Shevchenko wrote:
> Right, and we come to the question of which channels are better used by SPI
> and which by the rest of the devices. Without a specific filter function you
> can easily get into a case of inverted optimization, where SPI gets channels
> with burst = 4 while it needs 16, and other hardware the other way around.
> Performance-wise it's the worst scenario, which we may avoid in the first
> place, right?

If we start thinking the way you describe, we'll get stuck on the problem of which interfaces should get the faster DMA channels and which should be left with the slowest. In general this task can't be solved, because without any application-specific requirement they are all equally valuable and deserve to have the best resources allocated. So we shouldn't assume that some interface is better or more valuable than another; therefore in generic DMA client code any filtering is redundant.

> This is a good question. DMA controllers are quite different, and having a
> unified capabilities structure for all of them is an almost impossible task
> to fulfil. That's why custom filter function(s) can help here. Based on the
> compatible string you can implement whatever customized quirks you need, for
> example two functions that try a burst size of 16 first and fall back to 4
> if no channel was found.

Right. As I said in the previous email, it's up to the corresponding platforms to decide the filtering criteria, including the max-burst length value. Even though the DW DMA channel resources aren't uniform on the Baikal-T1 SoC, I also won't do filter-based channel allocation, because I can't predict the SoC's application. Some SoCs may be used on a platform with active SPI interface utilization, some with specific requirements for UARTs, and so on.

> There is a contract between provider and consumer about the DMA resource.
> That's why both sides should participate in fulfilling it. Theoretically
> there may be hardware that, for some reason, doesn't support the minimum
> burst available in the DMA. For such hardware we would need the minimum to
> be provided as well.

I don't think a 'theoretical' consideration counts when implementing something in the kernel. That 'theoretical' case may never happen, but you'll end up supporting dummy functionality. Practicality is what kernel developers normally place before anything else.

-Sergey
On 12-05-20, 22:12, Andy Shevchenko wrote:
> Right, and we come to the question of which channels are better used by SPI
> and which by the rest of the devices. Without a specific filter function you
> can easily get into a case of inverted optimization, where SPI gets channels
> with burst = 4 while it needs 16, and other hardware the other way around.
> Performance-wise it's the worst scenario, which we may avoid in the first
> place, right?

If one has channels which are different and described as such in DT, then I think it does make sense to specify in your board DT the specific channels you would require...

> There is a contract between provider and consumer about the DMA resource.
> That's why both sides should participate in fulfilling it. Theoretically
> there may be hardware that, for some reason, doesn't support the minimum
> burst available in the DMA. For such hardware we would need the minimum to
> be provided as well.

Agreed, and if required, caps should be extended to tell the consumer the minimum values supported.
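The "contract between provider and consumer" with both a minimum and a maximum can be pictured as intersecting two ranges. The sketch below is a userspace illustration only: `struct burst_range` and `negotiate_burst()` are hypothetical, and nothing in the thread says the capabilities would be exposed in exactly this shape.

```c
#include <stdint.h>

/* Hypothetical burst-length range, in data items, for one side of the link. */
struct burst_range {
	uint32_t min_burst;
	uint32_t max_burst;
};

/*
 * Intersect what the DMA channel supports with what the peripheral
 * tolerates. On success store the largest common burst in *burst and
 * return 0; return -1 if the two ranges do not overlap.
 */
static int negotiate_burst(const struct burst_range *provider,
			   const struct burst_range *consumer,
			   uint32_t *burst)
{
	uint32_t lo = provider->min_burst > consumer->min_burst ?
		      provider->min_burst : consumer->min_burst;
	uint32_t hi = provider->max_burst < consumer->max_burst ?
		      provider->max_burst : consumer->max_burst;

	if (lo > hi)
		return -1;

	*burst = hi;
	return 0;
}
```

When the minimum is the natural 1 on both sides, this reduces to taking the smaller of the two maxima. A minimum only matters for a peripheral that cannot accept short bursts, which is the theoretical case under discussion.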
On Tue, May 12, 2020 at 10:47:34PM +0300, Serge Semin wrote:
> If we start thinking the way you describe, we'll get stuck on the problem of
> which interfaces should get the faster DMA channels and which should be left
> with the slowest. In general this task can't be solved, because without any
> application-specific requirement they are all equally valuable and deserve
> to have the best resources allocated. So we shouldn't assume that some
> interface is better or more valuable than another; therefore in generic DMA
> client code any filtering is redundant.

True, that's why I called it platform-dependent quirks. You may do whatever you want / need to perform on your hardware as best you can. If it's okay for your hardware to have this inverse optimization, then fine, the generic DMA client should really not care about it.

> Right. As I said in the previous email, it's up to the corresponding
> platforms to decide the filtering criteria, including the max-burst length
> value.

Correct!

> Even though the DW DMA channel resources aren't uniform on the Baikal-T1 SoC,
> I also won't do filter-based channel allocation, because I can't predict the
> SoC's application. Some SoCs may be used on a platform with active SPI
> interface utilization, some with specific requirements for UARTs, and so on.

It's your choice as platform maintainer.

> I don't think a 'theoretical' consideration counts when implementing
> something in the kernel. That 'theoretical' case may never happen, but you'll
> end up supporting dummy functionality. Practicality is what kernel developers
> normally place before anything else.

The point here is to avoid half-baked solutions. I'm not against max-burst logic on top of the existing interface, but it would be better if we allowed a range; in that case it would work for any DMA controller (as part of the DMA engine family).

I guess we need to summarize this very long discussion and settle the next steps. (If you can provide it in a short form that anybody can read in 1 minute, it would be nice. I have already forgotten tons of the paragraphs you sent here, especially taking into account the tons of paragraphs in the other Baikal-related series.)
On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> On 12-05-20, 22:12, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > IP core of the DW DMA controller may be synthesized with different
> > > > > max burst length of the transfers per each channel. According to Synopsis
> > > > > having the fixed maximum burst transactions length may provide some
> > > > > performance gain. At the same time setting up the source and destination
> > > > > multi size exceeding the max burst length limitation may cause a serious
> > > > > problems. In our case the system just hangs up. In order to fix this
> > > > > lets introduce the max burst length platform config of the DW DMA
> > > > > controller device and don't let the DMA channels configuration code
> > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > configuration the maximum value can vary from channel to channel.
> > > > > It can be detected either in runtime from the DWC parameter registers
> > > > > or from the dedicated dts property.
> > > >
> > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > which is not supported by DMA controller?
> > >
> > > I may misunderstood your statement, because seeing your activity around my
> > > patchsets including the SPI patchset and sometimes very helpful comments,
> > > this question answer seems too obvious to see you asking it.
> > >
> > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > specifies the burst length to be 16, while not all of ours channels supports it.
> > > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > > converted the driver into a generic code we can't use a fixed value. For instance
> > > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > > burst length. While there are two SPI interfaces, each of which need to have two
> > > DMA channels for communications. So I need four channels in total to allocate to
> > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > > can end up with buffers overrun/underrun.
> >
> > Right, and we come to the question which channel better to be used by SPI and
> > the rest devices. Without specific filter function you can easily get into a
> > case of inverted optimizations, when SPI got channels with burst = 4, while
> > it's needed 16, and other hardware otherwise. Performance wise it's worse
> > scenario which we may avoid in the first place, right?
>
> If one has channels which are different and described as such in DT,
> then I think it does make sense to specify in your board-dt about the
> specific channels you would require...

Well, we do have such hardware. Our DW DMA controller has got different max
burst lengths assigned to first two and the rest of the channels. But creating
a functionality of the individual channels assignment is a matter of different
patchset. Sorry. It's not one of my task at the moment.

My primary task is to integrate the Baikal-T1 SoC support into the kernel. I've
refactored a lot of code found in the Baikal-T1 SDK and currently under a pressure
of a lot of review. Alas there is no time to create new functionality as you
suggest. In future I may provide such, but not in the framework of this patchset.

> >
> > > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > So, seems to me the feasible approach is supply correct data in the first place.
> > >
> > > How to supply a valid data if clients don't know the DMA controller limitations
> > > in general?
> >
> > This is a good question. DMA controllers are quite different and having unified
> > capabilities structure for all is almost impossible task to fulfil. That's why
> > custom filter function(s) can help here. Based on compatible string you can
> > implement whatever customized quirks like two functions, for example, to try 16
> > burst size first and fallback to 4 if none was previously found.
> >
> > > > If you have specific channels to acquire then you probably need to provide a
> > > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > > of dynamic channel allocation mechanism.
> > >
> > > No, I don't have a specific channel to acquire and in general you may use any
> > > returned from the DMA subsystem (though some platforms may need a dedicated
> > > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > > their DMA settings must properly and optimally configured. It can be only done
> > > if you know the DMA controller parameters like max burst length, max block-size,
> > > etc.
> > >
> > > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > > moreover expected to be supported by the generic DMA subsystem.
> >
> > See above.
> >
> > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > that some hardware can't cope underrun bursts.
> > >
> > > There is no need to define minimum if such limit doesn't exists except a
> > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > added such capability into the generic DMA subsystem so far.
> >
> > There is a contract between provider and consumer about DMA resource. That's
> > why both sides should participate in fulfilling it. Theoretically it may be a
> > hardware that doesn't support minimum burst available in DMA by a reason. For
> > such we would need minimum to be provided as well.
>
> Agreed and if required caps should be extended to tell consumer the
> minimum values supported.

Sorry, it's not required by our hardware. Is there any, which actually has such
limitation? (minimum burst length)

-Sergey

>
> --
> ~Vinod
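For readers following the filter discussion quoted above, the "try 16 burst size first and fallback to 4" policy can be modelled in a few lines of plain C. This is only a standalone sketch under stated assumptions, not kernel code: struct model_chan, claim_channel() and pick_channel() are hypothetical names invented for the illustration, and a real driver would express the same policy through a dmaengine filter or xlate callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a DMA channel; not a kernel structure. */
struct model_chan {
	unsigned int max_burst;	/* largest burst the channel supports */
	bool busy;		/* already handed out to a client */
};

/*
 * Claim the first free channel able to burst at least 'want' items.
 * Returns the channel index, or -1 if no such channel is free.
 */
static int claim_channel(struct model_chan *chans, size_t n, unsigned int want)
{
	for (size_t i = 0; i < n; i++) {
		if (!chans[i].busy && chans[i].max_burst >= want) {
			chans[i].busy = true;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Try the preferred burst first and fall back to the smaller one.
 * On success *burst holds the burst length the client must configure.
 */
static int pick_channel(struct model_chan *chans, size_t n,
			unsigned int preferred, unsigned int fallback,
			unsigned int *burst)
{
	int idx = claim_channel(chans, n, preferred);

	if (idx >= 0) {
		*burst = preferred;
		return idx;
	}

	idx = claim_channel(chans, n, fallback);
	if (idx >= 0)
		*burst = fallback;
	return idx;
}
```

With two channels limited to 16 and the rest limited to 4, the first two requests get the wide channels and later requests fall back to burst 4, so whichever client asks first wins. That ordering dependence is the "inverted optimizations" risk raised in the thread.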
On 17-05-20, 22:38, Serge Semin wrote:
> On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> > On 12-05-20, 22:12, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > > IP core of the DW DMA controller may be synthesized with different
> > > > > > max burst length of the transfers per each channel. According to Synopsis
> > > > > > having the fixed maximum burst transactions length may provide some
> > > > > > performance gain. At the same time setting up the source and destination
> > > > > > multi size exceeding the max burst length limitation may cause a serious
> > > > > > problems. In our case the system just hangs up. In order to fix this
> > > > > > lets introduce the max burst length platform config of the DW DMA
> > > > > > controller device and don't let the DMA channels configuration code
> > > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > > configuration the maximum value can vary from channel to channel.
> > > > > > It can be detected either in runtime from the DWC parameter registers
> > > > > > or from the dedicated dts property.
> > > > >
> > > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > > which is not supported by DMA controller?
> > > >
> > > > I may misunderstood your statement, because seeing your activity around my
> > > > patchsets including the SPI patchset and sometimes very helpful comments,
> > > > this question answer seems too obvious to see you asking it.
> > > >
> > > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > > specifies the burst length to be 16, while not all of ours channels supports it.
> > > > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > > > converted the driver into a generic code we can't use a fixed value. For instance
> > > > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > > > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > > > burst length. While there are two SPI interfaces, each of which need to have two
> > > > DMA channels for communications. So I need four channels in total to allocate to
> > > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > > > can end up with buffers overrun/underrun.
> > >
> > > Right, and we come to the question which channel better to be used by SPI and
> > > the rest devices. Without specific filter function you can easily get into a
> > > case of inverted optimizations, when SPI got channels with burst = 4, while
> > > it's needed 16, and other hardware otherwise. Performance wise it's worse
> > > scenario which we may avoid in the first place, right?
> >
> > If one has channels which are different and described as such in DT,
> > then I think it does make sense to specify in your board-dt about the
> > specific channels you would require...
>
> Well, we do have such hardware. Our DW DMA controller has got different max
> burst lengths assigned to first two and the rest of the channels. But creating
> a functionality of the individual channels assignment is a matter of different
> patchset. Sorry. It's not one of my task at the moment.
>
> My primary task is to integrate the Baikal-T1 SoC support into the kernel. I've
> refactored a lot of code found in the Baikal-T1 SDK and currently under a pressure
> of a lot of review. Alas there is no time to create new functionality as you
> suggest. In future I may provide such, but not in the framework of this patchset.

Well you need to tell your folks that upstreaming does not work under
pressure and we can't put a timeline for upstreaming. It needs to do
what is deemed the right way. Reviews can take time, that needs to be
comprehended as well!

> > > > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > > So, seems to me the feasible approach is supply correct data in the first place.
> > > >
> > > > How to supply a valid data if clients don't know the DMA controller limitations
> > > > in general?
> > >
> > > This is a good question. DMA controllers are quite different and having unified
> > > capabilities structure for all is almost impossible task to fulfil. That's why
> > > custom filter function(s) can help here. Based on compatible string you can
> > > implement whatever customized quirks like two functions, for example, to try 16
> > > burst size first and fallback to 4 if none was previously found.
> > >
> > > > > If you have specific channels to acquire then you probably need to provide a
> > > > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > > > of dynamic channel allocation mechanism.
> > > >
> > > > No, I don't have a specific channel to acquire and in general you may use any
> > > > returned from the DMA subsystem (though some platforms may need a dedicated
> > > > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > > > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > > > their DMA settings must properly and optimally configured. It can be only done
> > > > if you know the DMA controller parameters like max burst length, max block-size,
> > > > etc.
> > > >
> > > > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > > > moreover expected to be supported by the generic DMA subsystem.
> > >
> > > See above.
> > >
> > > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > > that some hardware can't cope underrun bursts.
> > > >
> > > > There is no need to define minimum if such limit doesn't exists except a
> > > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > > added such capability into the generic DMA subsystem so far.
> > >
> > > There is a contract between provider and consumer about DMA resource. That's
> > > why both sides should participate in fulfilling it. Theoretically it may be a
> > > hardware that doesn't support minimum burst available in DMA by a reason. For
> > > such we would need minimum to be provided as well.
> >
> > Agreed and if required caps should be extended to tell consumer the
> > minimum values supported.
>
> Sorry, it's not required by our hardware. Is there any, which actually has such
> limitation? (minimum burst length)

IIUC the idea is that you will tell maximum and minimum values supported
and client can pick the best value. Esp in case of slave transfers
things like burst, msize are governed by client capability and usage. So
exposing the set to pick from would make sense
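To make the "tell maximum and minimum, let the client pick" idea concrete, here is a small standalone sketch of how a client might choose a burst length. It is an illustration only: pick_burst(), its parameters and the dw_bursts table are hypothetical, and the table of encodable values (1, 4, 8, ... 256) is an assumption extrapolated from the "1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3" comment in the driver and the DW_DMA_MAX_BURST value of 256 used by the patch.

```c
#include <stddef.h>

/* Assumed set of burst lengths (in data items) the MSIZE field can encode. */
static const unsigned int dw_bursts[] = { 1, 4, 8, 16, 32, 64, 128, 256 };

/*
 * Pick the largest encodable burst that lies within the provider's
 * [min_burst, max_burst] range and does not exceed the client's own
 * limit (for example half of its FIFO depth). Returns 0 if no value fits.
 */
static unsigned int pick_burst(unsigned int min_burst, unsigned int max_burst,
			       unsigned int client_limit)
{
	unsigned int best = 0;

	for (size_t i = 0; i < sizeof(dw_bursts) / sizeof(dw_bursts[0]); i++) {
		unsigned int b = dw_bursts[i];

		if (b >= min_burst && b <= max_burst && b <= client_limit)
			best = b;
	}
	return best;
}
```

A zero return models the case the minimum is meant to catch: the smallest burst the provider can issue is already more than the client can absorb, so the client should not use DMA on that channel.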
On Tue, May 19, 2020 at 10:37:14PM +0530, Vinod Koul wrote:
> On 17-05-20, 22:38, Serge Semin wrote:
> > On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> > > On 12-05-20, 22:12, Andy Shevchenko wrote:
> > > > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:

[nip]

> > > > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > > > that some hardware can't cope underrun bursts.
> > > > >
> > > > > There is no need to define minimum if such limit doesn't exists except a
> > > > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > > > added such capability into the generic DMA subsystem so far.
> > > >
> > > > There is a contract between provider and consumer about DMA resource. That's
> > > > why both sides should participate in fulfilling it. Theoretically it may be a
> > > > hardware that doesn't support minimum burst available in DMA by a reason. For
> > > > such we would need minimum to be provided as well.
> > >
> > > Agreed and if required caps should be extended to tell consumer the
> > > minimum values supported.
> >
> > Sorry, it's not required by our hardware. Is there any, which actually has such
> > limitation? (minimum burst length)
>
> IIUC the idea is that you will tell maximum and minimum values supported
> and client can pick the best value. Esp in case of slave transfers
> things like burst, msize are governed by client capability and usage. So
> exposing the set to pick from would make sense

Agreed. I'll add min_burst capability.

-Sergey

>
> --
> ~Vinod
diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index e4749c296fca..5b76ccc857fd 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1053,6 +1053,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 {
 	struct dw_dma *dw = chip->dw;
 	struct dw_dma_platform_data *pdata;
+	u32 max_burst = DW_DMA_MAX_BURST;
 	bool autocfg = false;
 	unsigned int block_size = 0;
 	unsigned int dw_params;
@@ -1181,9 +1182,12 @@ int do_dma_probe(struct dw_dma_chip *chip)
 				(4 << ((pdata->block_size >> 4 * i) & 0xf)) - 1;
 			dwc->nollp =
 				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0;
+			dwc->max_burst =
+				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
 		} else {
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
+			dwc->max_burst = pdata->max_burst[i] ?: DW_DMA_MAX_BURST;
 		}
 
 		/*
@@ -1198,6 +1202,15 @@ int do_dma_probe(struct dw_dma_chip *chip)
 		if (dwc->block_size > block_size)
 			block_size = dwc->block_size;
 
+		/*
+		 * Find minimum of maximum burst lengths to be set in the
+		 * DMA device descriptor. This will at least leave us on a safe
+		 * side of using the DMA device, so the DMA clients can have it
+		 * to properly set buffer thresholds up.
+		 */
+		if (dwc->max_burst < max_burst)
+			max_burst = dwc->max_burst;
+
 		/*
 		 * It might crucial for some devices to have the hardware
 		 * accelerated multi-block transfers supported. Especially it
@@ -1244,6 +1257,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 	/* DMA capabilities */
 	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
 	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;
+	dw->dma.max_burst = max_burst;
 	dw->dma.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
 			     BIT(DMA_MEM_TO_MEM);
 	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index 7a085b3c1854..4d6b1ecabda4 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -86,6 +86,7 @@ static void dw_dma_encode_maxburst(struct dw_dma_chan *dwc, u32 *maxburst)
 	 * Fix burst size according to dw_dmac. We need to convert them as:
 	 * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
 	 */
+	*maxburst = clamp(*maxburst, 0U, dwc->max_burst);
 	*maxburst = *maxburst > 1 ? fls(*maxburst) - 2 : 0;
 }
 
diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
index 9e27831dee32..d7323aad7cb5 100644
--- a/drivers/dma/dw/of.c
+++ b/drivers/dma/dw/of.c
@@ -98,6 +98,15 @@ struct dw_dma_platform_data *dw_dma_parse_dt(struct platform_device *pdev)
 			pdata->multi_block[tmp] = 1;
 	}
 
+	if (!of_property_read_u32_array(np, "snps,max-burst-len", mb,
+					nr_channels)) {
+		for (tmp = 0; tmp < nr_channels; tmp++)
+			pdata->max_burst[tmp] = mb[tmp];
+	} else {
+		for (tmp = 0; tmp < nr_channels; tmp++)
+			pdata->max_burst[tmp] = DW_DMA_MAX_BURST;
+	}
+
 	if (!of_property_read_u32(np, "snps,dma-protection-control", &tmp)) {
 		if (tmp > CHAN_PROTCTL_MASK)
 			return NULL;
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 20037d64f961..f581d4809b71 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -125,6 +125,7 @@ struct dw_dma_regs {
 #define DW_PARAMS_EN		28		/* encoded parameters */
 
 /* Bitfields in DWC_PARAMS */
+#define DWC_PARAMS_MSIZE	16		/* max group transaction size */
 #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
 
 /* bursts size */
@@ -284,6 +285,7 @@ struct dw_dma_chan {
 	/* hardware configuration */
 	unsigned int block_size;
 	bool nollp;
+	u32 max_burst;
 
 	/* custom slave configuration */
 	struct dw_dma_slave dws;
diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
index f3eaf9ec00a1..13e679afc0e0 100644
--- a/include/linux/platform_data/dma-dw.h
+++ b/include/linux/platform_data/dma-dw.h
@@ -12,6 +12,7 @@
 
 #define DW_DMA_MAX_NR_MASTERS	4
 #define DW_DMA_MAX_NR_CHANNELS	8
+#define DW_DMA_MAX_BURST	256
 
 /**
  * struct dw_dma_slave - Controller-specific information about a slave
@@ -42,6 +43,8 @@ struct dw_dma_slave {
  * @data_width: Maximum data width supported by hardware per AHB master
  *		(in bytes, power of 2)
  * @multi_block: Multi block transfers supported by hardware per channel.
+ * @max_burst: Maximum value of burst transaction size supported by hardware
+ *	       per channel (in units of CTL.SRC_TR_WIDTH/CTL.DST_TR_WIDTH).
  * @protctl: Protection control signals setting per channel.
  */
 struct dw_dma_platform_data {
@@ -56,6 +59,7 @@ struct dw_dma_platform_data {
 	unsigned char nr_masters;
 	unsigned char data_width[DW_DMA_MAX_NR_MASTERS];
 	unsigned char multi_block[DW_DMA_MAX_NR_CHANNELS];
+	unsigned int max_burst[DW_DMA_MAX_NR_CHANNELS];
 #define CHAN_PROTCTL_PRIVILEGED		BIT(0)
 #define CHAN_PROTCTL_BUFFERABLE		BIT(1)
 #define CHAN_PROTCTL_CACHEABLE		BIT(2)
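The effect of the clamp added to dw_dma_encode_maxburst() can be checked outside the kernel with a small userspace model. This is a sketch under stated assumptions, not the driver code: model_fls() is a hypothetical stand-in for the kernel's fls(), and encode_maxburst() takes the channel limit as a plain argument instead of reading dwc->max_burst. Since the value is unsigned, the lower bound of 0 in the patch's clamp() never triggers, so the model only enforces the upper bound.

```c
/* Userspace stand-in for the kernel's fls(): 1-based index of the top set bit. */
static int model_fls(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Clamp the requested burst to the channel limit, then convert it to the
 * MSIZE encoding: 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
 */
static unsigned int encode_maxburst(unsigned int maxburst, unsigned int chan_max)
{
	if (maxburst > chan_max)
		maxburst = chan_max;

	return maxburst > 1 ? (unsigned int)model_fls(maxburst) - 2 : 0;
}
```

For example, a client asking for a burst of 16 on a channel limited to 4 ends up with the encoding for 4 rather than the one for 16, which is the situation the patch description says hangs the system without the clamp.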
IP core of the DW DMA controller may be synthesized with different
max burst length of the transfers per each channel. According to Synopsys
having the fixed maximum burst transactions length may provide some
performance gain. At the same time setting up the source and destination
multi size exceeding the max burst length limitation may cause serious
problems. In our case the system just hangs up. In order to fix this
let's introduce the max burst length platform config of the DW DMA
controller device and don't let the DMA channels configuration code
exceed the burst length hardware limitation. Depending on the IP core
configuration the maximum value can vary from channel to channel.
It can be detected either at runtime from the DWC parameter registers
or from the dedicated dts property.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Discard dwc_get_maxburst() accessor. It's enough to have a clamping
  guard against exceeding the hardware max burst limitation.
---
 drivers/dma/dw/core.c                | 14 ++++++++++++++
 drivers/dma/dw/dw.c                  |  1 +
 drivers/dma/dw/of.c                  |  9 +++++++++
 drivers/dma/dw/regs.h                |  2 ++
 include/linux/platform_data/dma-dw.h |  4 ++++
 5 files changed, 30 insertions(+)
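As an illustration of the runtime detection path described in the commit message, the following standalone sketch decodes the per-channel limit from a DWC_PARAMS value and then takes the minimum over all channels, mirroring what the patch does in do_dma_probe(). The bit offset (16), the 3-bit field width and the 0x4 base are taken from the patch; the function names and the idea of passing the register values as an array are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define DWC_PARAMS_MSIZE	16	/* bit offset, as in the patch */
#define DW_DMA_MAX_BURST	256	/* default, as in the patch */

/* Decode a channel's max burst from its DWC_PARAMS register value. */
static uint32_t dwc_params_to_max_burst(uint32_t dwc_params)
{
	return UINT32_C(0x4) << ((dwc_params >> DWC_PARAMS_MSIZE) & 0x7);
}

/* Smallest per-channel limit: the value that is safe on every channel. */
static uint32_t min_of_max_bursts(const uint32_t *params, size_t nr_channels)
{
	uint32_t max_burst = DW_DMA_MAX_BURST;

	for (size_t i = 0; i < nr_channels; i++) {
		uint32_t chan_max = dwc_params_to_max_burst(params[i]);

		if (chan_max < max_burst)
			max_burst = chan_max;
	}
	return max_burst;
}
```

On a controller like the one described in the thread, where two channels report a field value of 2 (burst 16) and the others report 0 (burst 4), the device-wide value comes out as 4. That is safe for every channel but conservative for the two wide ones.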