Message ID | 1375719297-12871-10-git-send-email-joelf@ti.com (mailing list archive) |
---|---|
State | New, archived |
On Monday 05 August 2013 04:14 PM, Joel Fernandes wrote:
> Here we implement splitting up of the total MAX number of slots
> available for a channel into 2 cyclically linked sets. Transfer
> completion Interrupts are enabled on both linked sets and respective
> handler recycles them on completion to process the next linked set.
> Both linked sets are cyclically linked to each other to ensure
> continuity of DMA operations. Interrupt handlers execute asynchronously
> to the EDMA events and recycles the linked sets at the right time,
> as a result EDMA is not blocked or dependent on interrupts and DMA
> continues till the end of the SG-lists without any interruption.
>
> Suggested-by: Sekhar Nori <nsekhar@ti.com>
> Signed-off-by: Joel Fernandes <joelf@ti.com>
> ---
>  drivers/dma/edma.c | 157 +++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 118 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
> index df50a04..70923a2 100644
> --- a/drivers/dma/edma.c
> +++ b/drivers/dma/edma.c
> @@ -48,6 +48,7 @@
>
>  /* Max of 16 segments per channel to conserve PaRAM slots */
>  #define MAX_NR_SG		16
> +#define MAX_NR_LS		(MAX_NR_SG >> 1)
>  #define EDMA_MAX_SLOTS		(MAX_NR_SG+1)
>  #define EDMA_DESCRIPTORS	16
>
> @@ -57,6 +58,7 @@ struct edma_desc {
>  	int				absync;
>  	int				pset_nr;
>  	int				total_processed;
> +	int				next_setup_linkset;
>  	struct edmacc_param		pset[0];
>  };
>
> @@ -140,7 +142,9 @@ static void edma_execute(struct edma_chan *echan)
>  	struct edma_desc *edesc;
>  	struct device *dev = echan->vchan.chan.device->dev;
>
> -	int i, j, total_left, total_process;
> +	int i, total_left, total_link_set;
> +	int ls_cur_off, ls_next_off, slot_off;
> +	struct edmacc_param tmp_param;
>
>  	/* If either we processed all psets or we're still not started */
>  	if (!echan->edesc ||
> @@ -159,48 +163,121 @@ static void edma_execute(struct edma_chan *echan)
>
>  	/* Find out how many left */
>  	total_left = edesc->pset_nr - edesc->total_processed;
> -	total_process = total_left > MAX_NR_SG ? MAX_NR_SG : total_left;
> -
> -
> -	/* Write descriptor PaRAM set(s) */
> -	for (i = 0; i < total_process; i++) {
> -		j = i + edesc->total_processed;
> -		edma_write_slot(echan->slot[i], &edesc->pset[j]);
> -		dev_dbg(echan->vchan.chan.device->dev,
> -			"\n pset[%d]:\n"
> -			" chnum\t%d\n"
> -			" slot\t%d\n"
> -			" opt\t%08x\n"
> -			" src\t%08x\n"
> -			" dst\t%08x\n"
> -			" abcnt\t%08x\n"
> -			" ccnt\t%08x\n"
> -			" bidx\t%08x\n"
> -			" cidx\t%08x\n"
> -			" lkrld\t%08x\n",
> -			j, echan->ch_num, echan->slot[i],
> -			edesc->pset[j].opt,
> -			edesc->pset[j].src,
> -			edesc->pset[j].dst,
> -			edesc->pset[j].a_b_cnt,
> -			edesc->pset[j].ccnt,
> -			edesc->pset[j].src_dst_bidx,
> -			edesc->pset[j].src_dst_cidx,
> -			edesc->pset[j].link_bcntrld);
> -		/* Link to the previous slot if not the last set */
> -		if (i != (total_process - 1))
> +	total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left;

The name you gave here sounds like this is defining total number of
linked PaRAM sets. Rather this is actually tracking the number of PaRAM
sets (slots) in current linked set, correct? Then maybe just call it
'nslots' or even 'num_slots'? There are just too many variables with
"total" prefix to keep track of in this function!

> +
> +	/* First time, setup 2 cyclically linked sets, each containing half
> +	   the slots allocated for this channel */
> +	if (edesc->total_processed == 0) {

We don't need to check for this case for every DMA_COMPLETE interrupt.
Maybe move the initial setup to another function called from
edma_issue_pending()?

> +		for (i = 0; i < total_link_set; i++) {
> +			edma_write_slot(echan->slot[i+1], &edesc->pset[i]);
> +
> +			if (i != total_link_set - 1) {
> +				edma_link(echan->slot[i+1], echan->slot[i+2]);
> +				dump_pset(echan, echan->slot[i+1],
> +					  edesc->pset, i);
> +			}
> +		}
> +
> +		edesc->total_processed += total_link_set;
> +
> +		total_left = edesc->pset_nr - edesc->total_processed;
> +
> +		total_link_set = total_left > MAX_NR_LS ?
> +				 MAX_NR_LS : total_left;
> +
> +		if (total_link_set) {
> +			/* Don't setup interrupt for first linked set for cases
> +			   where total pset_nr is strictly within MAX_NR size */

See Documentation/CodingStyle for multi-line commenting style.

> +			if (total_left > total_link_set)
> +				edma_enable_interrupt(echan->slot[i]);
> +
> +			/* Setup link between linked set 0 to set 1 */
>  			edma_link(echan->slot[i], echan->slot[i+1]);
> -		/* Final pset links to the dummy pset */
> -		else
> +
> +			dump_pset(echan, echan->slot[i], edesc->pset, i-1);
> +
> +			/* Write out linked set 1 */
> +			for (; i < total_link_set + MAX_NR_LS; i++) {
> +				edma_write_slot(echan->slot[i+1],
> +						&edesc->pset[i]);
> +
> +				if (i != total_link_set + MAX_NR_LS - 1) {
> +					edma_link(echan->slot[i+1],
> +						  echan->slot[i+2]);
> +					dump_pset(echan, echan->slot[i+1],
> +						  edesc->pset, i);
> +				}
> +			}
> +
> +			edesc->total_processed += total_link_set;
> +			total_left = edesc->pset_nr - edesc->total_processed;

There is way too much duplication of code here mainly because you
decided not to loop twice in the course of setting up the two linked
sets. Can you use a loop instead?

> +
> +			if (total_left)
> +				/* Setup a link from linked set 1 to set 0 */
> +				edma_link(echan->slot[i], echan->slot[1]);

If you have more SGs to service at the end of setting up the two linked
sets, you should stop right there and wait for CPU to recycle the linked
sets. Right now you are set up for re-DMAing old data. You won't hit this
issue in testing because you have set up an interrupt for LS0 and that
will most likely be serviced before LS1 completes, but we cannot rely on
that timing.

Just link to dummy at end of LS1 to stall the DMA and wait for the
completion handler to come in and restart the DMA after recycling LS0.

I haven't reviewed rest of the patch. Let's make sure we have a common
understanding here.

Thanks,
Sekhar
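The review above asks whether the two linked sets could be written by one loop that runs twice, ending in a link to the dummy slot. A minimal sketch of that shape follows, as a toy model rather than driver code. `struct toy_slot`, `toy_setup_linked_sets()` and `TOY_DUMMY_SLOT` are hypothetical names invented for illustration. Plain array indices stand in for PaRAM slots, and index 0 is left for the channel's own slot as in the patch. Whether stalling on the dummy slot is workable on real hardware is disputed in the reply that follows, so the final link here only illustrates the suggestion.

```c
#include <assert.h>

#define MAX_NR_SG	16
#define MAX_NR_LS	(MAX_NR_SG >> 1)
#define EDMA_MAX_SLOTS	(MAX_NR_SG + 1)
#define TOY_DUMMY_SLOT	(-1)	/* hypothetical stand-in for the dummy slot */

/* Toy stand-in for a PaRAM slot: which SG entry it holds, where it links. */
struct toy_slot {
	int pset;
	int link;
};

/*
 * Write up to two linked sets with a single loop and link the last written
 * slot to the dummy slot. Returns the number of SG entries queued.
 * slot[] must have EDMA_MAX_SLOTS entries; slot[0] is left untouched.
 */
int toy_setup_linked_sets(struct toy_slot *slot, int pset_nr)
{
	int processed = 0;
	int last = 1;
	int ls, i;

	assert(pset_nr > 0);

	for (ls = 0; ls < 2 && processed < pset_nr; ls++) {
		int base = 1 + ls * MAX_NR_LS;	/* first slot of this set */
		int left = pset_nr - processed;
		int nslots = left > MAX_NR_LS ? MAX_NR_LS : left;

		for (i = 0; i < nslots; i++) {
			slot[base + i].pset = processed + i;
			/* chain to the next slot; set 0 runs straight into set 1 */
			slot[base + i].link = base + i + 1;
		}

		processed += nslots;
		last = base + nslots - 1;
	}

	/* Whatever was written last terminates the chain. */
	slot[last].link = TOY_DUMMY_SLOT;
	return processed;
}
```

With 20 SG entries this queues the first 16 (slots 1 to 16) and leaves 4 for a later recycle. With 5 entries only slots 1 to 5 are written and slot 5 links to the dummy.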
On 08/12/2013 01:56 PM, Sekhar Nori wrote:
> On Monday 05 August 2013 04:14 PM, Joel Fernandes wrote:
>> Here we implement splitting up of the total MAX number of slots
>> available for a channel into 2 cyclically linked sets. Transfer
>> completion Interrupts are enabled on both linked sets and respective
>> handler recycles them on completion to process the next linked set.
>> Both linked sets are cyclically linked to each other to ensure
>> continuity of DMA operations. Interrupt handlers execute asynchronously
>> to the EDMA events and recycles the linked sets at the right time,
>> as a result EDMA is not blocked or dependent on interrupts and DMA
>> continues till the end of the SG-lists without any interruption.
>>
>> Suggested-by: Sekhar Nori <nsekhar@ti.com>
>> Signed-off-by: Joel Fernandes <joelf@ti.com>
>> ---
>>  drivers/dma/edma.c | 157 +++++++++++++++++++++++++++++++++++++++-------------
>>  1 file changed, 118 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
>> index df50a04..70923a2 100644
>> --- a/drivers/dma/edma.c
>> +++ b/drivers/dma/edma.c
>> @@ -48,6 +48,7 @@
>>
>>  /* Max of 16 segments per channel to conserve PaRAM slots */
>>  #define MAX_NR_SG		16
>> +#define MAX_NR_LS		(MAX_NR_SG >> 1)
>>  #define EDMA_MAX_SLOTS		(MAX_NR_SG+1)
>>  #define EDMA_DESCRIPTORS	16
>>
>> @@ -57,6 +58,7 @@ struct edma_desc {
>>  	int				absync;
>>  	int				pset_nr;
>>  	int				total_processed;
>> +	int				next_setup_linkset;
>>  	struct edmacc_param		pset[0];
>>  };
>>
>> @@ -140,7 +142,9 @@ static void edma_execute(struct edma_chan *echan)
>>  	struct edma_desc *edesc;
>>  	struct device *dev = echan->vchan.chan.device->dev;
>>
>> -	int i, j, total_left, total_process;
>> +	int i, total_left, total_link_set;
>> +	int ls_cur_off, ls_next_off, slot_off;
>> +	struct edmacc_param tmp_param;
>>
>>  	/* If either we processed all psets or we're still not started */
>>  	if (!echan->edesc ||
>> @@ -159,48 +163,121 @@ static void edma_execute(struct edma_chan *echan)
>>
>>  	/* Find out how many left */
>>  	total_left = edesc->pset_nr - edesc->total_processed;
>> -	total_process = total_left > MAX_NR_SG ? MAX_NR_SG : total_left;
>> -
>> -
>> -	/* Write descriptor PaRAM set(s) */
>> -	for (i = 0; i < total_process; i++) {
>> -		j = i + edesc->total_processed;
>> -		edma_write_slot(echan->slot[i], &edesc->pset[j]);
>> -		dev_dbg(echan->vchan.chan.device->dev,
>> -			"\n pset[%d]:\n"
>> -			" chnum\t%d\n"
>> -			" slot\t%d\n"
>> -			" opt\t%08x\n"
>> -			" src\t%08x\n"
>> -			" dst\t%08x\n"
>> -			" abcnt\t%08x\n"
>> -			" ccnt\t%08x\n"
>> -			" bidx\t%08x\n"
>> -			" cidx\t%08x\n"
>> -			" lkrld\t%08x\n",
>> -			j, echan->ch_num, echan->slot[i],
>> -			edesc->pset[j].opt,
>> -			edesc->pset[j].src,
>> -			edesc->pset[j].dst,
>> -			edesc->pset[j].a_b_cnt,
>> -			edesc->pset[j].ccnt,
>> -			edesc->pset[j].src_dst_bidx,
>> -			edesc->pset[j].src_dst_cidx,
>> -			edesc->pset[j].link_bcntrld);
>> -		/* Link to the previous slot if not the last set */
>> -		if (i != (total_process - 1))
>
>> +	total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left;
>
> The name you gave here sounds like this is defining total number of
> linked PaRAM sets. Rather this is actually tracking the number of PaRAM
> sets (slots) in current linked set, correct? Then maybe just call it
> 'nslots' or even 'num_slots'? There are just too many variables with
> "total" prefix to keep track of in this function!

I would rather just leave this naming alone. The code is quite
self-documenting: total_link_set means "Calculate what's the total size
of a Linkset, or total no. of slots in a linkset we need". This naming
is fine in my opinion and doesn't hurt line size at all, instead
improving code readability. I could drop the _ between link and set to
make it total_linkset, if that makes it any easier.

I agree there are too many variables in this function, but they each
serve a different purpose and are required to implement the algorithm,
which is precisely why I made the naming a bit more descriptive.

>
>> +
>> +	/* First time, setup 2 cyclically linked sets, each containing half
>> +	   the slots allocated for this channel */
>> +	if (edesc->total_processed == 0) {
>
> We don't need to check for this case for every DMA_COMPLETE interrupt.
> Maybe move the initial setup to another function called from
> edma_issue_pending()?

But how? That would only change the code to (?):

	if (edesc->total_processed == 0) {
		issue_pending();
	}

Further, it may appear that this case is uncommon, but it is a very
common case. Most SG transfers are within the SG limit, though at times
the else case can execute a lot too.

>> +		for (i = 0; i < total_link_set; i++) {
>> +			edma_write_slot(echan->slot[i+1], &edesc->pset[i]);
>> +
>> +			if (i != total_link_set - 1) {
>> +				edma_link(echan->slot[i+1], echan->slot[i+2]);
>> +				dump_pset(echan, echan->slot[i+1],
>> +					  edesc->pset, i);
>> +			}
>> +		}
>> +
>> +		edesc->total_processed += total_link_set;
>> +
>> +		total_left = edesc->pset_nr - edesc->total_processed;
>> +
>> +		total_link_set = total_left > MAX_NR_LS ?
>> +				 MAX_NR_LS : total_left;
>> +
>> +		if (total_link_set) {
>> +			/* Don't setup interrupt for first linked set for cases
>> +			   where total pset_nr is strictly within MAX_NR size */
>
> See Documentation/CodingStyle for multi-line commenting style.

Ok thanks, changed accordingly.

>> +			if (total_left > total_link_set)
>> +				edma_enable_interrupt(echan->slot[i]);
>> +
>> +			/* Setup link between linked set 0 to set 1 */
>>  			edma_link(echan->slot[i], echan->slot[i+1]);
>> -		/* Final pset links to the dummy pset */
>> -		else
>> +
>> +			dump_pset(echan, echan->slot[i], edesc->pset, i-1);
>> +
>> +			/* Write out linked set 1 */
>> +			for (; i < total_link_set + MAX_NR_LS; i++) {
>> +				edma_write_slot(echan->slot[i+1],
>> +						&edesc->pset[i]);
>> +
>> +				if (i != total_link_set + MAX_NR_LS - 1) {
>> +					edma_link(echan->slot[i+1],
>> +						  echan->slot[i+2]);
>> +					dump_pset(echan, echan->slot[i+1],
>> +						  edesc->pset, i);
>> +				}
>> +			}
>> +
>> +			edesc->total_processed += total_link_set;
>> +			total_left = edesc->pset_nr - edesc->total_processed;
>
> There is way too much duplication of code here mainly because you
> decided not to loop twice in the course of setting up the two linked
> sets. Can you use a loop instead?

I tried to do this in a loop; it's not possible without making the code
more unreadable and introducing more variables. Further, the following
3 conditions have to be incorporated into the loop somehow, which kind
of makes it messy. Right now it is linearly determined which case to
execute.

	/* Setup a link from linked set 1 to set 0 */
	/* Setup a link between linked set 1 to dummy */
	/* First linked set was enough, simply link to dummy */

Since it is just a couple of lines more, I am more in favor of keeping
the code readable than saving a few lines (for a loop of only 2
iterations) by introducing more variables and making it look hackish.
There is a good chance that, if it were implemented that way, I would
have to spend quite a bit of time deciphering it in the future.

>> +
>> +			if (total_left)
>> +				/* Setup a link from linked set 1 to set 0 */
>> +				edma_link(echan->slot[i], echan->slot[1]);
>
> If you have more SGs to service at the end of setting up the two linked
> sets, you should stop right there and wait for CPU to recycle the linked
> sets. Right now you are set up for re-DMAing old data.

The above linking you're quoting is done in advance _but_, before the
link is traversed, it is _guaranteed_ that the linkset being traversed
into will be recycled. This is the basis of the whole algorithm and
making sure that we never stall.

There never ever will be a case where we re-DMA old data because of the
guarantee that the recycling will take place before the traversal.
Further, FWIW, the interrupt takes a few hundred microseconds to
execute, whereas DMA is seen to take milliseconds from one SG entry to
another in my testing.

> You won't hit this issue in testing because you have set up an interrupt
> for LS0 and that will most likely be serviced before LS1 completes, but
> we cannot rely on that timing.

This goes back to my first patch series where we stall. That doesn't
make any sense. In this patch series, we don't want DMA to stall at any
cost.

> Just link to dummy at end of LS1 to stall the DMA and wait for the
> completion handler to come in and restart the DMA after recycling LS0.

Nope! Linking to dummy will absorb the events and the events will never
get triggered again. Trust me, I have already done what you are saying
and it doesn't work.

> I haven't reviewed rest of the patch. Let's make sure we have a common
> understanding here.

Sure, thanks.

-Joel
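The disagreement above comes down to whether the completion handler always recycles a linked set before the DMA wraps back into it. The toy simulation below makes that timing dependence concrete. It is a sketch under stated assumptions, not a model of EDMA hardware: every `toy_*` name is hypothetical, array indices stand in for PaRAM slots, every set raises an interrupt on its last slot (the patch skips this for set 0 on short lists), and handler latency is counted in "number of slot transfers that complete before the handler runs", which is an invented unit.

```c
#include <assert.h>

#define MAX_NR_LS	8
#define TOY_NR_SLOTS	(2 * MAX_NR_LS)
#define TOY_DUMMY	(-1)

struct toy_slot {
	int pset;	/* SG entry this slot would transfer */
	int link;	/* next slot index, or TOY_DUMMY to stop */
	int irq;	/* raise a completion interrupt after this slot */
};

/* Load SG entries starting at `first` into linked set `set` (0 or 1). */
int toy_fill_set(struct toy_slot *slots, int set, int first, int pset_nr)
{
	int base = set * MAX_NR_LS;
	int other = (1 - set) * MAX_NR_LS;
	int left = pset_nr - first;
	int n = left > MAX_NR_LS ? MAX_NR_LS : left;
	int i;

	for (i = 0; i < n; i++) {
		slots[base + i].pset = first + i;
		slots[base + i].link = base + i + 1;
		slots[base + i].irq = (i == n - 1);
	}
	/* Last slot: continue into the other set if entries remain, else stop. */
	slots[base + n - 1].link = (first + n < pset_nr) ? other : TOY_DUMMY;
	return n;
}

/* Run the toy DMA; record each transferred SG index in out[]. */
int toy_simulate(int pset_nr, int latency, int *out, int cap)
{
	struct toy_slot slots[TOY_NR_SLOTS] = { { 0, 0, 0 } };
	int pending[2] = { -1, -1 };	/* -1: no handler outstanding */
	int queued, cur = 0, n_out = 0, s;

	assert(pset_nr > 0 && latency >= 0);

	queued = toy_fill_set(slots, 0, 0, pset_nr);
	if (queued < pset_nr)
		queued += toy_fill_set(slots, 1, queued, pset_nr);

	while (cur != TOY_DUMMY && n_out < cap) {
		/* Run handlers whose latency has elapsed; age the others. */
		for (s = 0; s < 2; s++) {
			if (pending[s] > 0) {
				pending[s]--;
			} else if (pending[s] == 0) {
				pending[s] = -1;
				if (queued < pset_nr)
					queued += toy_fill_set(slots, s, queued,
							       pset_nr);
			}
		}
		out[n_out++] = slots[cur].pset;
		if (slots[cur].irq)
			pending[cur / MAX_NR_LS] = latency;
		cur = slots[cur].link;
	}
	return n_out;
}

/* 1 if out[] is exactly 0, 1, ..., pset_nr - 1. */
int toy_in_order(const int *out, int n, int pset_nr)
{
	int i;

	if (n != pset_nr)
		return 0;
	for (i = 0; i < n; i++)
		if (out[i] != i)
			return 0;
	return 1;
}
```

In this model, 20 SG entries come out in order as long as the handler latency is at most 8 transfers (one full linked set). At a latency of 9 the DMA re-enters set 0 before it has been recycled and transfers SG entry 0 a second time. The sketch says nothing about which latency is realistic on hardware; it only shows that the cyclic link is correct exactly when the recycling wins the race.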
diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index df50a04..70923a2 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -48,6 +48,7 @@
 
 /* Max of 16 segments per channel to conserve PaRAM slots */
 #define MAX_NR_SG		16
+#define MAX_NR_LS		(MAX_NR_SG >> 1)
 #define EDMA_MAX_SLOTS		(MAX_NR_SG+1)
 #define EDMA_DESCRIPTORS	16
 
@@ -57,6 +58,7 @@ struct edma_desc {
 	int				absync;
 	int				pset_nr;
 	int				total_processed;
+	int				next_setup_linkset;
 	struct edmacc_param		pset[0];
 };
 
@@ -140,7 +142,9 @@ static void edma_execute(struct edma_chan *echan)
 	struct edma_desc *edesc;
 	struct device *dev = echan->vchan.chan.device->dev;
 
-	int i, j, total_left, total_process;
+	int i, total_left, total_link_set;
+	int ls_cur_off, ls_next_off, slot_off;
+	struct edmacc_param tmp_param;
 
 	/* If either we processed all psets or we're still not started */
 	if (!echan->edesc ||
@@ -159,48 +163,121 @@ static void edma_execute(struct edma_chan *echan)
 
 	/* Find out how many left */
 	total_left = edesc->pset_nr - edesc->total_processed;
-	total_process = total_left > MAX_NR_SG ? MAX_NR_SG : total_left;
-
-
-	/* Write descriptor PaRAM set(s) */
-	for (i = 0; i < total_process; i++) {
-		j = i + edesc->total_processed;
-		edma_write_slot(echan->slot[i], &edesc->pset[j]);
-		dev_dbg(echan->vchan.chan.device->dev,
-			"\n pset[%d]:\n"
-			" chnum\t%d\n"
-			" slot\t%d\n"
-			" opt\t%08x\n"
-			" src\t%08x\n"
-			" dst\t%08x\n"
-			" abcnt\t%08x\n"
-			" ccnt\t%08x\n"
-			" bidx\t%08x\n"
-			" cidx\t%08x\n"
-			" lkrld\t%08x\n",
-			j, echan->ch_num, echan->slot[i],
-			edesc->pset[j].opt,
-			edesc->pset[j].src,
-			edesc->pset[j].dst,
-			edesc->pset[j].a_b_cnt,
-			edesc->pset[j].ccnt,
-			edesc->pset[j].src_dst_bidx,
-			edesc->pset[j].src_dst_cidx,
-			edesc->pset[j].link_bcntrld);
-		/* Link to the previous slot if not the last set */
-		if (i != (total_process - 1))
+	total_link_set = total_left > MAX_NR_LS ? MAX_NR_LS : total_left;
+
+	/* First time, setup 2 cyclically linked sets, each containing half
+	   the slots allocated for this channel */
+	if (edesc->total_processed == 0) {
+		for (i = 0; i < total_link_set; i++) {
+			edma_write_slot(echan->slot[i+1], &edesc->pset[i]);
+
+			if (i != total_link_set - 1) {
+				edma_link(echan->slot[i+1], echan->slot[i+2]);
+				dump_pset(echan, echan->slot[i+1],
+					  edesc->pset, i);
+			}
+		}
+
+		edesc->total_processed += total_link_set;
+
+		total_left = edesc->pset_nr - edesc->total_processed;
+
+		total_link_set = total_left > MAX_NR_LS ?
+				 MAX_NR_LS : total_left;
+
+		if (total_link_set) {
+			/* Don't setup interrupt for first linked set for cases
+			   where total pset_nr is strictly within MAX_NR size */
+			if (total_left > total_link_set)
+				edma_enable_interrupt(echan->slot[i]);
+
+			/* Setup link between linked set 0 to set 1 */
 			edma_link(echan->slot[i], echan->slot[i+1]);
-		/* Final pset links to the dummy pset */
-		else
+
+			dump_pset(echan, echan->slot[i], edesc->pset, i-1);
+
+			/* Write out linked set 1 */
+			for (; i < total_link_set + MAX_NR_LS; i++) {
+				edma_write_slot(echan->slot[i+1],
+						&edesc->pset[i]);
+
+				if (i != total_link_set + MAX_NR_LS - 1) {
+					edma_link(echan->slot[i+1],
+						  echan->slot[i+2]);
+					dump_pset(echan, echan->slot[i+1],
+						  edesc->pset, i);
+				}
+			}
+
+			edesc->total_processed += total_link_set;
+			total_left = edesc->pset_nr - edesc->total_processed;
+
+			if (total_left)
+				/* Setup a link from linked set 1 to set 0 */
+				edma_link(echan->slot[i], echan->slot[1]);
+			else
+				/* Setup a link between linked set 1 to dummy */
+				edma_link(echan->slot[i], echan->ecc->dummy_slot);
+		} else {
+			/* First linked set was enough, simply link to dummy */
 			edma_link(echan->slot[i], echan->ecc->dummy_slot);
-	}
+		}
+
+		edma_enable_interrupt(echan->slot[i]);
+		dump_pset(echan, echan->slot[i], edesc->pset, i-1);
 
-	edesc->total_processed += total_process;
+		edesc->next_setup_linkset = 0;
 
-	if (edesc->total_processed <= MAX_NR_SG) {
+		/* Start the ball rolling... */
 		dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
+
+		edma_read_slot(echan->slot[1], &tmp_param);
+		edma_write_slot(echan->slot[0], &tmp_param);
 		edma_start(echan->ch_num);
+
+		return;
+	}
+
+	/* We got called in the middle of an SG-list transaction as one of the
+	   linked sets completed */
+
+	/* Setup offsets into echan_slot, +1 is as slot 0 is for chan */
+	if (edesc->next_setup_linkset == 1) {
+		edesc->next_setup_linkset = 0;
+		ls_cur_off = MAX_NR_LS + 1;
+		ls_next_off = 1;
+	} else {
+		edesc->next_setup_linkset = 1;
+		ls_cur_off = 1;
+		ls_next_off = MAX_NR_LS + 1;
+	}
+
+	for (i = 0; i < total_link_set; i++) {
+		edma_write_slot(echan->slot[i + ls_cur_off],
+				&edesc->pset[i + edesc->total_processed]);
+
+		if (i != total_link_set - 1) {
+			edma_link(echan->slot[i + ls_cur_off],
+				  echan->slot[i + ls_cur_off + 1]);
+
+			dump_pset(echan, echan->slot[i + ls_cur_off],
+				  edesc->pset, i + edesc->total_processed);
+		}
 	}
+
+	edesc->total_processed += total_link_set;
+
+	slot_off = total_link_set + ls_cur_off - 1;
+
+	if (edesc->total_processed == edesc->pset_nr)
+		edma_link(echan->slot[slot_off], echan->ecc->dummy_slot);
+	else
+		edma_link(echan->slot[slot_off], echan->slot[ls_next_off]);
+
+	edma_enable_interrupt(echan->slot[slot_off]);
+
+	dump_pset(echan, echan->slot[slot_off],
+		  edesc->pset, edesc->total_processed-1);
 }
 
 static int edma_terminate_all(struct edma_chan *echan)
@@ -417,15 +494,17 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
 		spin_lock_irqsave(&echan->vchan.lock, flags);
 
 		edesc = echan->edesc;
+
 		if (edesc) {
 			if (edesc->total_processed == edesc->pset_nr) {
-				dev_dbg(dev, "transfer complete." \
+				dev_dbg(dev, "Transfer complete,"
 					" stopping channel %d\n", ch_num);
 				edma_stop(echan->ch_num);
 				vchan_cookie_complete(&edesc->vdesc);
 			} else {
-				dev_dbg(dev, "Intermediate transfer complete" \
-					" on channel %d\n", ch_num);
+				dev_dbg(dev, "Intermediate transfer "
+					"complete, setup next linked set on "
+					"%d\n ", ch_num);
 			}
 
 			edma_execute(echan);
Here we implement splitting up of the total MAX number of slots
available for a channel into 2 cyclically linked sets. Transfer
completion Interrupts are enabled on both linked sets and respective
handler recycles them on completion to process the next linked set.
Both linked sets are cyclically linked to each other to ensure
continuity of DMA operations. Interrupt handlers execute asynchronously
to the EDMA events and recycles the linked sets at the right time,
as a result EDMA is not blocked or dependent on interrupts and DMA
continues till the end of the SG-lists without any interruption.

Suggested-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Joel Fernandes <joelf@ti.com>
---
 drivers/dma/edma.c | 157 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 118 insertions(+), 39 deletions(-)
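
The split described in the commit message implies a fixed mapping from an SG entry's index to the linked set and slot that will carry it: entries are dealt out in chunks of MAX_NR_LS, alternating between the two sets. A small arithmetic sketch of that mapping follows. The helper names `toy_pset_to_set()` and `toy_pset_to_slot()` are hypothetical and do not appear in the patch. The mapping assumes the offsets used in the posted diff (slot[0] reserved for the channel, set 0 at slot[1], set 1 at slot[MAX_NR_LS + 1]) and that completions are handled in order.

```c
#include <assert.h>

#define MAX_NR_SG	16
#define MAX_NR_LS	(MAX_NR_SG >> 1)

/* Which linked set (0 or 1) carries SG entry `pset`. */
int toy_pset_to_set(int pset)
{
	assert(pset >= 0);
	return (pset / MAX_NR_LS) & 1;
}

/*
 * Index into the channel's slot array for SG entry `pset`.
 * The +1 skips slot[0], which the patch keeps for the channel itself.
 */
int toy_pset_to_slot(int pset)
{
	return 1 + toy_pset_to_set(pset) * MAX_NR_LS + pset % MAX_NR_LS;
}
```

For example, entries 0 to 7 land in slots 1 to 8 (set 0), entries 8 to 15 in slots 9 to 16 (set 1), and entry 16 reuses slot 1 once set 0 has been recycled.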