Message ID | 1371858502-10083-7-git-send-email-santosh.shilimkar@ti.com (mailing list archive)
State      | New, archived
On Fri, 21 Jun 2013, Santosh Shilimkar wrote:

> From: Sricharan R <r.sricharan@ti.com>
>
> The current phys_to_virt patching mechanism does not work for 64 bit
> physical addresses. Note that the constant used in add/sub
> instructions is encoded into the last 8 bits of the opcode, so the
> __pv_offset constant is shifted by 24 to get it into the correct
> place.
>
> The v2p patching mechanism patches the higher 32 bits of the physical
> address with a constant. While this is correct, on those platforms
> where the lowmem addressable physical memory spans across the 4GB
> boundary, a carry bit can be produced as a result of the addition of
> the lower 32 bits. This has to be taken into account and added into
> the upper word. The patched __pv_offset and va are added in the lower
> 32 bits, where __pv_offset can be in two's complement form when
> PA_START < VA_START, and that can result in a false carry bit.
>
> e.g. PA = 0x80000000, VA = 0xC0000000
> __pv_offset = PA - VA = 0xC0000000 (2's complement)
>
> Adding __pv_offset + VA should then never result in a true overflow.
> So, in order to differentiate a true carry, an extra flag
> __pv_sign_flag is introduced.

I'm still wondering if this is worth bothering about.

If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry
to propagate to the high word of the physical address, as the VA space
cannot be larger than 0x40000000.

So is there really a case where:

1) physical memory is crossing the 4GB mark, and ...

2) the physical memory start address is higher than the virtual memory
   start address, needing a carry due to the 32-bit add overflow?

It is easy to create (2) just by having a different user:kernel address
space split. However, I wonder if (1) is likely. Sure, you need a
memory alias in physical space to be able to boot, but you shouldn't
need to address that memory alias via virtual addresses for any
significant amount of time. In fact, as soon as the MMU is turned on,
there shouldn't be any issue simply using the final physical memory
addresses right away.

What am I missing?

Nicolas
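As an aside, the "false carry" under discussion is easy to reproduce in
user-space C (a stand-alone sketch, not code from this thread):

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t va_start = 0xc0000000, pa_start = 0x80000000;
		/* PA_START < VA_START: offset is the 2's complement of -0x40000000 */
		uint32_t pv_offset = pa_start - va_start;	/* 0xc0000000 */
		uint32_t va;

		for (va = va_start; va < va_start + 0x30000000; va += 0x10000000) {
			uint64_t sum = (uint64_t)va + pv_offset;
			printf("va=%08x -> pa=%08x carry=%d\n",
			       va, (uint32_t)sum, (int)(sum >> 32));
		}
		return 0;
	}

Every iteration reports carry=1, yet the low word is already the
correct physical address: the carry is an artifact of the 2's
complement encoding, never a real overflow into the high word.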
On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
> On Fri, 21 Jun 2013, Santosh Shilimkar wrote:
>
>> From: Sricharan R <r.sricharan@ti.com>
>>
>> The current phys_to_virt patching mechanism does not work for 64 bit
>> physical addresses. [...]

First of all, thanks for the review.

> I'm still wondering if this is worth bothering about.
>
> If PA = 0x80000000 and VA = 0xC0000000 there will never be a real carry
> to propagate to the high word of the physical address, as the VA space
> cannot be larger than 0x40000000.
>
Agreed.

> So is there really a case where:
>
> 1) physical memory is crossing the 4GB mark, and ...
>
> 2) the physical memory start address is higher than the virtual memory
>    start address, needing a carry due to the 32-bit add overflow?
>
Consider the two cases of memory layout below, apart from the one
mentioned above where, as you rightly said, the carry bit is
irrelevant.

1) PA = 0x8_0000_0000, VA = 0xC000_0000, absolute pv_offset = 0x7_4000_0000
2) PA = 0x2_8000_0000, VA = 0xC000_0000, absolute pv_offset = 0x1_C000_0000

In both of these cases there is a true carry which needs to be
considered.

> It is easy to create (2) just by having a different user:kernel address
> space split. However, I wonder if (1) is likely. [...] In fact, as soon
> as the MMU is turned on, there shouldn't be any issue simply using the
> final physical memory addresses right away.
>
> What am I missing?
>
I thought about switching to the final address space along with the MMU
enable at startup, but based on the earlier discussion (RMK suggested
it), to have such patching support in the least disruptive manner we
patch once at boot and then re-patch at the switch-over. This also
gives us the flexibility to patch code post machine init. Hopefully I
haven't missed your point here.

regards,
Santosh
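The arithmetic for these two layouts can be checked with a small sketch
(an editorial illustration, not code from this thread):

	#include <stdint.h>
	#include <stdio.h>

	static void show(uint64_t pa_start, uint32_t va_start, uint32_t va)
	{
		uint64_t pv_offset = pa_start - va_start;
		uint64_t lo = (uint64_t)va + (uint32_t)pv_offset;	/* 32-bit add */
		uint64_t hi = (pv_offset >> 32) + (lo >> 32);		/* high word + carry */

		printf("pv_offset=%#llx va=%#x -> pa=%#llx\n",
		       (unsigned long long)pv_offset, va,
		       (unsigned long long)((hi << 32) | (uint32_t)lo));
	}

	int main(void)
	{
		show(0x800000000ULL, 0xc0000000, 0xc0000000);	/* case 1 -> 0x8_0000_0000 */
		show(0x280000000ULL, 0xc0000000, 0xc0000000);	/* case 2 -> 0x2_8000_0000 */
		return 0;
	}

Note that, for a given layout, the carry term here is the same for
every lowmem va — a point taken up in the reply that follows.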
On Tue, 23 Jul 2013, Santosh Shilimkar wrote:

> On Tuesday 23 July 2013 09:10 PM, Nicolas Pitre wrote:
> > [...]
> Consider the two cases of memory layout below, apart from the one
> mentioned above where, as you rightly said, the carry bit is
> irrelevant.
>
> 1) PA = 0x8_0000_0000, VA = 0xC000_0000, absolute pv_offset = 0x7_4000_0000

This can be patched as:

	mov	phys_hi, #0x8
	add	phys_lo, virt, #0x40000000	@ carry ignored

> 2) PA = 0x2_8000_0000, VA = 0xC000_0000, absolute pv_offset = 0x1_C000_0000

	mov	phys_hi, #0x2
	add	phys_lo, virt, #0xc0000000	@ carry ignored

> In both of these cases there is a true carry which needs to be
> considered.

Well, not really. However, if you have:

3) PA = 0x2_8000_0000, VA = 0x4000_0000, pv_offset = 0x2_4000_0000

... then you need:

	mov	phys_hi, #0x2
	adds	phys_lo, virt, #0x40000000
	adc	phys_hi, phys_hi, #0

My question is: how likely is this?

What is your actual physical memory start address?

If we really need to cope with the carry, then the __pv_sign_flag
should instead be represented in pv_offset directly.

Taking example #2 above, that would be:

	mov	phys_hi, #0x1
	adds	phys_lo, virt, #0xc0000000
	adc	phys_hi, phys_hi, #0

If PA = 0x8000_0000 and VA = 0xc000_0000 then pv_offset is
0xffff_ffff_c000_0000, meaning:

	mvn	phys_hi, #0
	adds	phys_lo, virt, #0xc0000000
	adc	phys_hi, phys_hi, #0

So that would require a special case in the patching code where a mvn
with 0 is used if the high part of pv_offset is 0xffffffff.

Nicolas
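The sign-extension trick can be cross-checked numerically. The sketch
below (an editorial model, not code from the thread) mimics the
mov/adds/adc sequence with __pv_offset held as a 64-bit two's
complement value, covering both the PA < VA and PA > VA layouts:

	#include <stdint.h>
	#include <stdio.h>

	static uint64_t v2p(uint32_t va, uint64_t pv_offset)
	{
		uint32_t hi = pv_offset >> 32;			/* patched mov/mvn */
		uint64_t lo = (uint64_t)va + (uint32_t)pv_offset;	/* adds */
		hi += (uint32_t)(lo >> 32);			/* adc */
		return ((uint64_t)hi << 32) | (uint32_t)lo;
	}

	int main(void)
	{
		/* PA=0x80000000, VA=0xc0000000: offset 0xffffffff_c0000000 */
		printf("%#llx\n", (unsigned long long)v2p(0xc0000000, 0xffffffffc0000000ULL));
		/* PA=0x2_80000000, VA=0xc0000000: offset 0x1_c0000000 */
		printf("%#llx\n", (unsigned long long)v2p(0xc0000000, 0x1c0000000ULL));
		return 0;
	}

In the first case the false carry wraps the 0xffffffff high word back
to 0, so no separate sign flag is needed; in the second, the true carry
bumps the high word from 1 to 2.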
On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
> On Tue, 23 Jul 2013, Santosh Shilimkar wrote:
> [...]
>> 1) PA = 0x8_0000_0000, VA = 0xC000_0000, absolute pv_offset = 0x7_4000_0000
>
> This can be patched as:
>
> 	mov	phys_hi, #0x8
> 	add	phys_lo, virt, #0x40000000	@ carry ignored
>
>> 2) PA = 0x2_8000_0000, VA = 0xC000_0000, absolute pv_offset = 0x1_C000_0000
>
> 	mov	phys_hi, #0x2
> 	add	phys_lo, virt, #0xc0000000	@ carry ignored
>
> [...]
>
> My question is: how likely is this?
>
> What is your actual physical memory start address?

Agreed. In our case the physical address does not cross 4GB, so
ignoring the carry would have been OK. But we are also addressing the
other case, where it really does cross over.

> If we really need to cope with the carry, then the __pv_sign_flag
> should instead be represented in pv_offset directly.
>
> [...]
>
> So that would require a special case in the patching code where a mvn
> with 0 is used if the high part of pv_offset is 0xffffffff.

Extending pv_offset to 64bit is a really neat way. When PA > VA,
pv_offset is going to be the actual value and not a 2's complement;
that is fine. When running from the higher physical address space we
will always fall into this case.

The second case, where pv_offset is 0xffffffff... (PA < VA), is a
problem only when we run from the lower physical address. So we can
safely assume that the higher 32 bits of PA are '0' and stub it
initially. In this way we can avoid the special case.

Regards,
Sricharan
On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
> [...]
>> So that would require a special case in the patching code where a mvn
>> with 0 is used if the high part of pv_offset is 0xffffffff.
>
> Extending pv_offset to 64bit is a really neat way. When PA > VA,
> pv_offset is going to be the actual value and not a 2's complement;
> that is fine. When running from the higher physical address space we
> will always fall into this case.
>
> The second case, where pv_offset is 0xffffffff... (PA < VA), is a
> problem only when we run from the lower physical address. So we can
> safely assume that the higher 32 bits of PA are '0' and stub it
> initially. In this way we can avoid the special case.

Sorry, I missed one more point here. In the second case, we should
patch it with 0x0 when (PA > VA) and with 0xffffffff when (PA < VA).

Regards,
Sricharan
On Wednesday 24 July 2013 08:07 AM, Sricharan R wrote:
> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
>> On Wednesday 24 July 2013 08:19 AM, Nicolas Pitre wrote:
>>> [...]
>>> My question is: how likely is this?
>>>
>>> What is your actual physical memory start address?
>> Agreed. In our case the physical address does not cross 4GB, so
>> ignoring the carry would have been OK. But we are also addressing the
>> other case, where it really does cross over.
>>
Yes, we don't need to worry about that case. We could get there with a
different kernel:user split, but nobody uses such a split, so we can
safely ignore it.

>>> [...]
>>> So that would require a special case in the patching code where a mvn
>>> with 0 is used if the high part of pv_offset is 0xffffffff.
>>
>> [...]
> Sorry, I missed one more point here. In the second case, we should
> patch it with 0x0 when (PA > VA) and with 0xffffffff when (PA < VA).
>
As Sricharan said, we agree with your suggestion for the special-case
patching. It will be either 0x0 or 0xffffffff, so it is easy to take
care of. We will try out the suggested changes. Thanks a lot again.

Regards,
Santosh
On Wed, 24 Jul 2013, Sricharan R wrote:

> On Wednesday 24 July 2013 05:20 PM, Sricharan R wrote:
> > [...]
> > The second case, where pv_offset is 0xffffffff... (PA < VA), is a
> > problem only when we run from the lower physical address. So we can
> > safely assume that the higher 32 bits of PA are '0' and stub it
> > initially. In this way we can avoid the special case.
> Sorry, I missed one more point here. In the second case, we should
> patch it with 0x0 when (PA > VA) and with 0xffffffff when (PA < VA).

I don't think I follow you here.

Let's assume:

	phys_addr_t __pv_offset = PHYS_START - VIRT_START;

If PA = 0x0_8000_0000 and VA = 0xc000_0000 then
__pv_offset = 0xffff_ffff_c000_0000.

If PA = 0x2_8000_0000 and VA = 0xc000_0000 then
__pv_offset = 0x1_c000_0000.

So the __virt_to_phys() assembly stub could look like:

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	phys_addr_t t;

	if (sizeof(phys_addr_t) == 4) {
		__pv_stub(x, t, "add", __PV_BITS_31_24);
	} else {
		__pv_movhi_stub(t);
		__pv_add_carry_stub(x, t);
	}

	return t;
}

And...

#define __pv_movhi_stub(y) \
	__asm__("@ __pv_movhi_stub\n" \
	"1:	mov	%R0, %1\n" \
	"	.pushsection .pv_table,\"a\"\n" \
	"	.long	1b\n" \
	"	.popsection\n" \
	: "=r" (y) \
	: "I" (__PV_BITS_8_0))

#define __pv_add_carry_stub(x, y) \
	__asm__("@ __pv_add_carry_stub\n" \
	"1:	adds	%Q0, %1, %2\n" \
	"	adc	%R0, %R0, #0\n" \
	"	.pushsection .pv_table,\"a\"\n" \
	"	.long	1b\n" \
	"	.popsection\n" \
	: "+r" (y) \
	: "r" (x), "I" (__PV_BITS_31_24) \
	: "cc")

The stub bits such as __PV_BITS_8_0 can be augmented with more bits in
the middle to determine the type of fixup needed. The fixup code would
determine the shift needed on the value, and whether the low or the
high word of __pv_offset should be used, according to those bits.

Then, in the case where a mov is patched, you need to check if the high
word of __pv_offset is 0xffffffff and, if so, turn the mov into a
"mvn rn, #0".

And there you are with all possible cases handled.

Nicolas
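The fixup-side decision described here can be summarised in a few lines
of C (an editorial sketch with made-up helper names; the real patcher
rewrites ARM opcodes in place rather than printing):

	#include <stdint.h>
	#include <stdio.h>

	static uint64_t __pv_offset;

	static void patch_one(int uses_high_word)
	{
		if (uses_high_word) {
			uint32_t hi = __pv_offset >> 32;
			if (hi == 0xffffffff)
				printf("patch: mvn rn, #0\n");	/* PA < VA */
			else
				printf("patch: mov rn, #%#x\n", hi);
		} else {
			/* low word is 16MB aligned, so it fits a rotated imm8 */
			printf("patch: adds ..., #%#x\n",
			       (uint32_t)__pv_offset & 0xff000000);
		}
	}

	int main(void)
	{
		__pv_offset = 0xffffffffc0000000ULL;	/* PA=0x80000000, VA=0xc0000000 */
		patch_one(1); patch_one(0);
		__pv_offset = 0x1c0000000ULL;		/* PA=0x2_80000000, VA=0xc0000000 */
		patch_one(1); patch_one(0);
		return 0;
	}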
Hi Nicolas,

On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:
> On Wed, 24 Jul 2013, Sricharan R wrote:
> [...]
> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in
> the middle to determine the type of fixup needed. The fixup code would
> determine the shift needed on the value, and whether the low or the
> high word of __pv_offset should be used, according to those bits.
>
> Then, in the case where a mov is patched, you need to check if the high
> word of __pv_offset is 0xffffffff and, if so, turn the mov into a
> "mvn rn, #0".
>
> And there you are with all possible cases handled.

Thanks, you have given the full details here.

Sorry if I was not clear in my previous response.

1) When I said the special case can be avoided, I meant that we need
   not differentiate the 0xffffffff case inside the __virt_to_phys
   macro, but can handle it at the time of patching. Your code above
   makes that clear.

2) I would have ended up creating separate tables for the 'mov' and
   'add' cases. But again, thanks to your idea of augmenting the
   __PV_BITS, we can work that out at run time, and 'mvn' is needed
   only for moving '0xffffffff'. Now I can get rid of the separate
   section that I created for 'mov' in my previous version.

I will make the above changes and come back.

Regards,
Sricharan
On Wednesday 24 July 2013 11:49 PM, Sricharan R wrote:
> Hi Nicolas,
>
> On Thursday 25 July 2013 01:51 AM, Nicolas Pitre wrote:
[..]
>> The stub bits such as __PV_BITS_8_0 can be augmented with more bits in
>> the middle to determine the type of fixup needed. [...]
>>
>> Then, in the case where a mov is patched, you need to check if the high
>> word of __pv_offset is 0xffffffff and, if so, turn the mov into a
>> "mvn rn, #0".
>>
>> And there you are with all possible cases handled.
>>
Brilliant!! We knew you would have some tricks and a better way. We
were not happy with the extra stub for 'mov' but didn't have an idea
for avoiding it either.

> Thanks, you have given the full details here.
> [...]
> I will make the above changes and come back.
>
With this we also get rid of the separate patching calls for modules as
well as the late patching. Overall the patch-set becomes smaller and
simpler. Thanks for the help.

Regards,
Santosh
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index d8a3ea6..e16468d 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -174,9 +174,17 @@
 #define __PV_BITS_31_24	0x81000000
 #define __PV_BITS_7_0	0x81
 
+/*
+ * PV patch constants.
+ * Lower 32bits are 16MB aligned.
+ */
+#define PV_LOW_SHIFT	24
+#define PV_HIGH_SHIFT	32
+
 extern phys_addr_t (*arch_virt_to_idmap) (unsigned long x);
-extern unsigned long __pv_phys_offset;
+extern phys_addr_t __pv_phys_offset;
 extern unsigned long __pv_offset;
+extern unsigned long __pv_sign_flag;
 
 #define PHYS_OFFSET __pv_phys_offset
 
@@ -187,7 +195,8 @@ extern unsigned long __pv_offset;
 	"	.long	1b\n" \
 	"	.popsection\n" \
 	: "=r" (to) \
-	: "r" (from), "I" (type))
+	: "r" (from), "I" (type) \
+	: "cc")
 
 #define __pv_stub_mov(to, instr, type) \
 	__asm__ volatile("@ __pv_stub_mov\n" \
@@ -200,8 +209,17 @@ extern unsigned long __pv_offset;
 
 static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
+#ifdef CONFIG_ARM_LPAE
+	register phys_addr_t t asm("r4") = 0;
+
+	__pv_stub_mov(t, "mov", __PV_BITS_7_0);
+	__pv_stub(x, t, "adds", __PV_BITS_31_24);
+	__asm__ volatile("adc %R0, %R0, %1" : "+r" (t) : "I" (0x0));
+	__asm__ volatile("sub %R0, %R0, %1" : "+r" (t) : "r" (__pv_sign_flag));
+#else
 	unsigned long t;
 	__pv_stub(x, t, "add", __PV_BITS_31_24);
+#endif
 	return t;
 }
 
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 60d3b73..f0c51ed 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -155,4 +155,6 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
 
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_offset);
+EXPORT_SYMBOL(__pv_sign_flag);
 #endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index b1bdeb5..25c9d5f 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -546,24 +546,42 @@ ENDPROC(fixup_smp)
 	__HEAD
 __fixup_pv_table:
 	adr	r0, 1f
-	ldmia	r0, {r3-r5, r7}
+	ldmia	r0, {r3-r7}
+	cmp	r0, r3
+	mov	ip, #1
 	sub	r3, r0, r3	@ PHYS_OFFSET - PAGE_OFFSET
 	add	r4, r4, r3	@ adjust table start address
 	add	r5, r5, r3	@ adjust table end address
 	add	r7, r7, r3	@ adjust __pv_phys_offset address
+	add	r6, r6, r3	@ adjust __pv_sign_flag
+	strcc	ip, [r6]	@ save __pv_sign_flag
 	str	r8, [r7]	@ save computed PHYS_OFFSET to __pv_phys_offset
 	mov	r6, r3, lsr #24	@ constant for add/sub instructions
 	teq	r3, r6, lsl #24 @ must be 16MiB aligned
 THUMB(	it	ne		@ cross section branch )
 	bne	__error
+#ifndef CONFIG_ARM_LPAE
 	str	r6, [r7, #4]	@ save to __pv_offset
 	b	__fixup_a_pv_table
+#else
+	str	r6, [r7, #8]	@ save to __pv_offset
+	mov	r0, r14		@ save lr
+	bl	__fixup_a_pv_table
+	adr	r6, 3f
+	ldmia	r6, {r4-r5}
+	add	r4, r4, r3	@ adjust __pv_high_table start address
+	add	r5, r5, r3	@ adjust __pv_high_table end address
+	mov	r6, #0		@ higher 32 bits of PHYS_OFFSET to start with
+	bl	__fixup_a_pv_table
+	mov	pc, r0
+#endif
 ENDPROC(__fixup_pv_table)
 
 	.align
 1:	.long	.
 	.long	__pv_table_begin
 	.long	__pv_table_end
+	.long	__pv_sign_flag
 2:	.long	__pv_phys_offset
+3:	.long	__pv_high_table_begin
+	.long	__pv_high_table_end
@@ -621,10 +639,22 @@ ENDPROC(fixup_pv_table)
 	.globl	__pv_phys_offset
 	.type	__pv_phys_offset, %object
 __pv_phys_offset:
+#ifdef CONFIG_ARM_LPAE
+	.quad	0
+#else
 	.long	0
-	.size	__pv_phys_offset, . - __pv_phys_offset
+#endif
+	.data
+	.globl	__pv_offset
+	.type	__pv_offset, %object
 __pv_offset:
 	.long	0
+
+	.data
+	.globl	__pv_sign_flag
+	.type	__pv_sign_flag, %object
+__pv_sign_flag:
+	.long	0
 #endif
 
 #include "head-common.S"
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 1ac071b..024c06d 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -320,6 +320,13 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
 	s = find_mod_section(hdr, sechdrs, ".pv_table");
 	if (s)
 		fixup_pv_table((void *)s->sh_addr, s->sh_size, __pv_offset);
+
+#ifdef CONFIG_ARM_LPAE
+	s = find_mod_section(hdr, sechdrs, ".pv_high_table");
+	if (s)
+		fixup_pv_table((void *)s->sh_addr, s->sh_size,
+				__pv_phys_offset >> PV_HIGH_SHIFT);
+#endif
 #endif
 	s = find_mod_section(hdr, sechdrs, ".alt.smp.init");
 	if (s && !is_smp())
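For reference, the v1 arithmetic implemented by the hunks above can be
modelled in plain C (an editorial sketch of the patched instruction
sequence; it assumes the boot-time comparison sets __pv_sign_flag
whenever the low word of PHYS_OFFSET is below PAGE_OFFSET):

	#include <stdint.h>
	#include <stdio.h>

	/* mov (hi), adds (lo), adc, then sub of __pv_sign_flag to cancel
	 * the false carry produced by a 2's complement low offset. */
	static uint64_t v1_virt_to_phys(uint32_t va, uint64_t phys_start,
					uint32_t va_start)
	{
		uint32_t pv_offset_lo = (uint32_t)phys_start - va_start;
		uint32_t hi = phys_start >> 32;			/* patched mov */
		uint32_t sign_flag = (uint32_t)phys_start < va_start;
		uint64_t lo = (uint64_t)va + pv_offset_lo;	/* adds */

		hi += (uint32_t)(lo >> 32);			/* adc */
		hi -= sign_flag;				/* sub __pv_sign_flag */
		return ((uint64_t)hi << 32) | (uint32_t)lo;
	}

	int main(void)
	{
		/* PA=0x80000000: false carry cancelled by the sign flag */
		printf("%#llx\n", (unsigned long long)
		       v1_virt_to_phys(0xc0000000, 0x80000000ULL, 0xc0000000));
		/* PA=0x2_80000000: true carry survives the correction */
		printf("%#llx\n", (unsigned long long)
		       v1_virt_to_phys(0xc0000000, 0x280000000ULL, 0xc0000000));
		return 0;
	}

The thread above converges on dropping __pv_sign_flag entirely in
favour of a sign-extended 64-bit __pv_offset, which makes this
correction step unnecessary.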