Message ID | YfGxafKxQdw8GhMc@mx3210.localdomain (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | parisc: Fix data TLB miss in sba_unmap_sg | expand |
On 1/26/22 21:39, John David Anglin wrote: > Rolf Eike Beer reported the following bug: > > [1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0000004140000018 > [1274934.746891] CPU: 3 PID: 5549 Comm: cmake Not tainted 5.15.4-gentoo-parisc64 #4 > [1274934.746891] Hardware name: 9000/785/C8000 > [1274934.746891] > [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted > [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 0000004140000000 > [1274934.746891] r04-07 0000000040b693c0 0000004140000000 000000004a2b08b0 0000000000000001 > [1274934.746891] r08-11 0000000041f98810 0000000000000000 000000004a0a7000 0000000000000001 > [1274934.746891] r12-15 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 > [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 0000000000000007 > [1274934.746891] r20-23 0000000000000006 000000004a368950 0000000000000000 0000000000000001 > [1274934.746891] r24-27 0000000000001fff 000000000800000e 000000004a1710f0 0000000040b693c0 > [1274934.746891] r28-31 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 > [1274934.746891] sr00-03 00000000066e5800 0000000000000000 0000000000000000 00000000066e5800 > [1274934.746891] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [1274934.746891] > [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000406760e8 00000000406760ec > [1274934.746891] IIR: 48780030 ISR: 0000000000000000 IOR: 0000004140000018 > [1274934.746891] CPU: 3 CR30: 00000040e3a9c000 CR31: ffffffffffffffff > [1274934.746891] ORIG_R28: 0000000040acdd58 > [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 > [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 > [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 > [1274934.746891] Backtrace: > [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 > [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 > [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 > [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 > [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 > [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 > [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 > [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 > [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 > [1274934.746891] > [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer deref?) > > The bug is caused by overrunning the sglist and incorrectly testing > sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, > but in this case sglist crossed a page boundary. This occurs in the > following code: > > while (sg_dma_len(sglist) && nents--) { Will you check the same loop in ccio-dma.c:1006 ? Helge > > The fix is simply to test nents first and move the decrement of nents > into the loop. > > Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> > Signed-off-by: John David Anglin <dave.anglin@bell.net> > --- > > diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c > index e60690d38d67..374b9199878d 100644 > --- a/drivers/parisc/sba_iommu.c > +++ b/drivers/parisc/sba_iommu.c > @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist *sglist, int nents, > spin_unlock_irqrestore(&ioc->res_lock, flags); > #endif > > - while (sg_dma_len(sglist) && nents--) { > + while (nents && sg_dma_len(sglist)) { > > sba_unmap_page(dev, sg_dma_address(sglist), sg_dma_len(sglist), > direction, 0); > @@ -1056,6 +1056,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist *sglist, int nents, > ioc->usingle_calls--; /* kluge since call is unmap_sg() */ > #endif > ++sglist; > + nents--; > } > > DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents); >
On 2022-01-26 4:25 p.m., Helge Deller wrote: > Will you check the same loop in > ccio-dma.c:1006 ? Looks like it has same problem. Dave
Am Mittwoch, 26. Januar 2022, 21:39:05 CET schrieb John David Anglin: > Rolf Eike Beer reported the following bug: > > [1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss > fault) at addr 0000004140000018 [1274934.746891] CPU: 3 PID: 5549 Comm: > cmake Not tainted 5.15.4-gentoo-parisc64 #4 [1274934.746891] Hardware name: > 9000/785/C8000 > [1274934.746891] > [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted > [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 > 0000004140000000 [1274934.746891] r04-07 0000000040b693c0 0000004140000000 > 000000004a2b08b0 0000000000000001 [1274934.746891] r08-11 0000000041f98810 > 0000000000000000 000000004a0a7000 0000000000000001 [1274934.746891] r12-15 > 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 > [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 > 0000000000000007 [1274934.746891] r20-23 0000000000000006 000000004a368950 > 0000000000000000 0000000000000001 [1274934.746891] r24-27 0000000000001fff > 000000000800000e 000000004a1710f0 0000000040b693c0 [1274934.746891] r28-31 > 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 > [1274934.746891] sr00-03 00000000066e5800 0000000000000000 > 0000000000000000 00000000066e5800 [1274934.746891] sr04-07 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [1274934.746891] > [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: > 00000000406760e8 00000000406760ec [1274934.746891] IIR: 48780030 ISR: > 0000000000000000 IOR: 0000004140000018 [1274934.746891] CPU: 3 > CR30: 00000040e3a9c000 CR31: ffffffffffffffff [1274934.746891] ORIG_R28: > 0000000040acdd58 > [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 > [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 > [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 > [1274934.746891] Backtrace: > [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 > [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 > [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 > [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 > [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 > [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 > [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 > [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 > [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 > [1274934.746891] > [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer > deref?) > > The bug is caused by overrunning the sglist and incorrectly testing > sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, > but in this case sglist crossed a page boundary. This occurs in the > following code: > > while (sg_dma_len(sglist) && nents--) { > > The fix is simply to test nents first and move the decrement of nents > into the loop. > > Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> > Signed-off-by: John David Anglin <dave.anglin@bell.net> > --- > > diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c > index e60690d38d67..374b9199878d 100644 > --- a/drivers/parisc/sba_iommu.c > +++ b/drivers/parisc/sba_iommu.c > @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist > *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); > #endif > > - while (sg_dma_len(sglist) && nents--) { > + while (nents && sg_dma_len(sglist)) { > What about: for (; nents && sg_dma_len(sglist); nents--) { And when you touch that area anyway, please remove the following newline as well. > sba_unmap_page(dev, sg_dma_address(sglist), sg_dma_len(sglist), > direction, 0); > @@ -1056,6 +1056,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist > *sglist, int nents, ioc->usingle_calls--; /* kluge since call is unmap_sg() > */ > #endif > ++sglist; > + nents--; > } > > DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents); Eike
Am Mittwoch, 26. Januar 2022, 21:39:05 CET schrieb John David Anglin: > Rolf Eike Beer reported the following bug: > > [1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss > fault) at addr 0000004140000018 [1274934.746891] CPU: 3 PID: 5549 Comm: > cmake Not tainted 5.15.4-gentoo-parisc64 #4 [1274934.746891] Hardware name: > 9000/785/C8000 > [1274934.746891] > [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted > [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 > 0000004140000000 [1274934.746891] r04-07 0000000040b693c0 0000004140000000 > 000000004a2b08b0 0000000000000001 [1274934.746891] r08-11 0000000041f98810 > 0000000000000000 000000004a0a7000 0000000000000001 [1274934.746891] r12-15 > 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 > [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 > 0000000000000007 [1274934.746891] r20-23 0000000000000006 000000004a368950 > 0000000000000000 0000000000000001 [1274934.746891] r24-27 0000000000001fff > 000000000800000e 000000004a1710f0 0000000040b693c0 [1274934.746891] r28-31 > 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 > [1274934.746891] sr00-03 00000000066e5800 0000000000000000 > 0000000000000000 00000000066e5800 [1274934.746891] sr04-07 > 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [1274934.746891] > [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: > 00000000406760e8 00000000406760ec [1274934.746891] IIR: 48780030 ISR: > 0000000000000000 IOR: 0000004140000018 [1274934.746891] CPU: 3 > CR30: 00000040e3a9c000 CR31: ffffffffffffffff [1274934.746891] ORIG_R28: > 0000000040acdd58 > [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 > [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 > [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 > [1274934.746891] Backtrace: > [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 > [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 > [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 > [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 > [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 > [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 > [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 > [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 > [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 > [1274934.746891] > [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer > deref?) > > The bug is caused by overrunning the sglist and incorrectly testing > sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, > but in this case sglist crossed a page boundary. This occurs in the > following code: > > while (sg_dma_len(sglist) && nents--) { > > The fix is simply to test nents first and move the decrement of nents > into the loop. > > Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> > Signed-off-by: John David Anglin <dave.anglin@bell.net> This needs a "CC:stable" as well, no?
On 1/27/22 07:22, Rolf Eike Beer wrote: > Am Mittwoch, 26. Januar 2022, 21:39:05 CET schrieb John David Anglin: >> Rolf Eike Beer reported the following bug: >> >> [1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss >> fault) at addr 0000004140000018 [1274934.746891] CPU: 3 PID: 5549 Comm: >> cmake Not tainted 5.15.4-gentoo-parisc64 #4 [1274934.746891] Hardware name: >> 9000/785/C8000 >> [1274934.746891] >> [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >> [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted >> [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 >> 0000004140000000 [1274934.746891] r04-07 0000000040b693c0 0000004140000000 >> 000000004a2b08b0 0000000000000001 [1274934.746891] r08-11 0000000041f98810 >> 0000000000000000 000000004a0a7000 0000000000000001 [1274934.746891] r12-15 >> 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 >> [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 >> 0000000000000007 [1274934.746891] r20-23 0000000000000006 000000004a368950 >> 0000000000000000 0000000000000001 [1274934.746891] r24-27 0000000000001fff >> 000000000800000e 000000004a1710f0 0000000040b693c0 [1274934.746891] r28-31 >> 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 >> [1274934.746891] sr00-03 00000000066e5800 0000000000000000 >> 0000000000000000 00000000066e5800 [1274934.746891] sr04-07 >> 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [1274934.746891] >> [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: >> 00000000406760e8 00000000406760ec [1274934.746891] IIR: 48780030 ISR: >> 0000000000000000 IOR: 0000004140000018 [1274934.746891] CPU: 3 >> CR30: 00000040e3a9c000 CR31: ffffffffffffffff [1274934.746891] ORIG_R28: >> 0000000040acdd58 >> [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 >> [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 >> [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 >> [1274934.746891] Backtrace: >> [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 >> [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 >> [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 >> [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 >> [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 >> [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 >> [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 >> [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 >> [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 >> [1274934.746891] >> [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer >> deref?) >> >> The bug is caused by overrunning the sglist and incorrectly testing >> sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, >> but in this case sglist crossed a page boundary. This occurs in the >> following code: >> >> while (sg_dma_len(sglist) && nents--) { >> >> The fix is simply to test nents first and move the decrement of nents >> into the loop. >> >> Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> >> Signed-off-by: John David Anglin <dave.anglin@bell.net> >> --- >> >> diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c >> index e60690d38d67..374b9199878d 100644 >> --- a/drivers/parisc/sba_iommu.c >> +++ b/drivers/parisc/sba_iommu.c >> @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist >> *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); >> #endif >> >> - while (sg_dma_len(sglist) && nents--) { >> + while (nents && sg_dma_len(sglist)) { >> > > What about: > > for (; nents && sg_dma_len(sglist); nents--) { The way how Dave wrote it is more clean, IMHO. By the way, since you ran into this issue, did you tested it, if it really solves the problem you see? If so, do you want to add a Tested-by: tag ? Helge > And when you touch that area anyway, please remove the following newline as > well. > >> sba_unmap_page(dev, sg_dma_address(sglist), > sg_dma_len(sglist), >> direction, 0); >> @@ -1056,6 +1056,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist >> *sglist, int nents, ioc->usingle_calls--; /* kluge since call is unmap_sg() >> */ >> #endif >> ++sglist; >> + nents--; >> } >> >> DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents); > > Eike >
Am Donnerstag, 27. Januar 2022, 07:58:19 CET schrieb Helge Deller: > On 1/27/22 07:22, Rolf Eike Beer wrote: > > Am Mittwoch, 26. Januar 2022, 21:39:05 CET schrieb John David Anglin: > >> The bug is caused by overrunning the sglist and incorrectly testing > >> sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, > >> but in this case sglist crossed a page boundary. This occurs in the > >> > >> following code: > >> while (sg_dma_len(sglist) && nents--) { > >> > >> The fix is simply to test nents first and move the decrement of nents > >> into the loop. > >> > >> Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> > >> Signed-off-by: John David Anglin <dave.anglin@bell.net> > >> --- > >> > >> diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c > >> index e60690d38d67..374b9199878d 100644 > >> --- a/drivers/parisc/sba_iommu.c > >> +++ b/drivers/parisc/sba_iommu.c > >> @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist > >> *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); > >> > >> #endif > >> > >> - while (sg_dma_len(sglist) && nents--) { > >> + while (nents && sg_dma_len(sglist)) { > > > > What about: > > for (; nents && sg_dma_len(sglist); nents--) { > > The way how Dave wrote it is more clean, IMHO. YMMV :P > By the way, since you ran into this issue, did you tested it, > if it really solves the problem you see? > If so, do you want to add a Tested-by: tag ? No, I'm glad the machine is up and only crashing userspace processes atm. I can't remember seeing this before, so I guess it was pure luck. Eike
On 2022-01-27 1:58 a.m., Helge Deller wrote: >>> diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c >>> index e60690d38d67..374b9199878d 100644 >>> --- a/drivers/parisc/sba_iommu.c >>> +++ b/drivers/parisc/sba_iommu.c >>> @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist >>> *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); >>> #endif >>> >>> - while (sg_dma_len(sglist) && nents--) { >>> + while (nents && sg_dma_len(sglist)) { >>> >> What about: >> >> for (; nents && sg_dma_len(sglist); nents--) { > The way how Dave wrote it is more clean, IMHO. I'm going to leave the change to sba_iommu.c as proposed. While i'm sure the suggested for statement would be fine, I looked at how gcc handled the while loop. It is quite subtle. Except for an initial test and decrement, the iteration of nents is replaced by a calculation of the the final value for sglist. Regarding the newline, the file has several places where newlines precede #ifdef statements. I think the current style is okay and checkpatch.pl doesn't object to that format. I think whitespace changes should usually be in separate patches. Dave
Am Donnerstag, 27. Januar 2022, 17:16:35 CET schrieb John David Anglin: > On 2022-01-27 1:58 a.m., Helge Deller wrote: > >>> diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c > >>> index e60690d38d67..374b9199878d 100644 > >>> --- a/drivers/parisc/sba_iommu.c > >>> +++ b/drivers/parisc/sba_iommu.c > >>> @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct > >>> scatterlist > >>> *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); > >>> > >>> #endif > >>> > >>> - while (sg_dma_len(sglist) && nents--) { > >>> + while (nents && sg_dma_len(sglist)) { > >> > >> What about: > >> for (; nents && sg_dma_len(sglist); nents--) { > > > > The way how Dave wrote it is more clean, IMHO. > > I'm going to leave the change to sba_iommu.c as proposed. While i'm sure > the suggested for statement would be fine, I looked at how gcc handled the > while loop. It is quite subtle. Except for an initial test and decrement, > the iteration of nents is replaced by a calculation of the the final value > for sglist. > > Regarding the newline, the file has several places where newlines precede > #ifdef statements. I think the current style is okay and checkpatch.pl > doesn't object to that format. It was not about the #ifdef, it was the line immediately following the opening brace of the loop. Eike
diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c index e60690d38d67..374b9199878d 100644 --- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -1047,7 +1047,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist *sglist, int nents, spin_unlock_irqrestore(&ioc->res_lock, flags); #endif - while (sg_dma_len(sglist) && nents--) { + while (nents && sg_dma_len(sglist)) { sba_unmap_page(dev, sg_dma_address(sglist), sg_dma_len(sglist), direction, 0); @@ -1056,6 +1056,7 @@ sba_unmap_sg(struct device *dev, struct scatterlist *sglist, int nents, ioc->usingle_calls--; /* kluge since call is unmap_sg() */ #endif ++sglist; + nents--; } DBG_RUN_SG("%s() DONE (nents %d)\n", __func__, nents);
Rolf Eike Beer reported the following bug: [1274934.746891] Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0000004140000018 [1274934.746891] CPU: 3 PID: 5549 Comm: cmake Not tainted 5.15.4-gentoo-parisc64 #4 [1274934.746891] Hardware name: 9000/785/C8000 [1274934.746891] [1274934.746891] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI [1274934.746891] PSW: 00001000000001001111111000001110 Not tainted [1274934.746891] r00-03 000000ff0804fe0e 0000000040bc9bc0 00000000406760e4 0000004140000000 [1274934.746891] r04-07 0000000040b693c0 0000004140000000 000000004a2b08b0 0000000000000001 [1274934.746891] r08-11 0000000041f98810 0000000000000000 000000004a0a7000 0000000000000001 [1274934.746891] r12-15 0000000040bddbc0 0000000040c0cbc0 0000000040bddbc0 0000000040bddbc0 [1274934.746891] r16-19 0000000040bde3c0 0000000040bddbc0 0000000040bde3c0 0000000000000007 [1274934.746891] r20-23 0000000000000006 000000004a368950 0000000000000000 0000000000000001 [1274934.746891] r24-27 0000000000001fff 000000000800000e 000000004a1710f0 0000000040b693c0 [1274934.746891] r28-31 0000000000000001 0000000041f988b0 0000000041f98840 000000004a171118 [1274934.746891] sr00-03 00000000066e5800 0000000000000000 0000000000000000 00000000066e5800 [1274934.746891] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [1274934.746891] [1274934.746891] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000406760e8 00000000406760ec [1274934.746891] IIR: 48780030 ISR: 0000000000000000 IOR: 0000004140000018 [1274934.746891] CPU: 3 CR30: 00000040e3a9c000 CR31: ffffffffffffffff [1274934.746891] ORIG_R28: 0000000040acdd58 [1274934.746891] IAOQ[0]: sba_unmap_sg+0xb0/0x118 [1274934.746891] IAOQ[1]: sba_unmap_sg+0xb4/0x118 [1274934.746891] RP(r2): sba_unmap_sg+0xac/0x118 [1274934.746891] Backtrace: [1274934.746891] [<00000000402740cc>] dma_unmap_sg_attrs+0x6c/0x70 [1274934.746891] [<000000004074d6bc>] scsi_dma_unmap+0x54/0x60 [1274934.746891] [<00000000407a3488>] mptscsih_io_done+0x150/0xd70 [1274934.746891] [<0000000040798600>] mpt_interrupt+0x168/0xa68 [1274934.746891] [<0000000040255a48>] __handle_irq_event_percpu+0xc8/0x278 [1274934.746891] [<0000000040255c34>] handle_irq_event_percpu+0x3c/0xd8 [1274934.746891] [<000000004025ecb4>] handle_percpu_irq+0xb4/0xf0 [1274934.746891] [<00000000402548e0>] generic_handle_irq+0x50/0x70 [1274934.746891] [<000000004019a254>] call_on_stack+0x18/0x24 [1274934.746891] [1274934.746891] Kernel panic - not syncing: Bad Address (null pointer deref?) The bug is caused by overrunning the sglist and incorrectly testing sg_dma_len(sglist) before nents. Normally this doesn't cause a crash, but in this case sglist crossed a page boundary. This occurs in the following code: while (sg_dma_len(sglist) && nents--) { The fix is simply to test nents first and move the decrement of nents into the loop. Reported-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: John David Anglin <dave.anglin@bell.net> ---