diff mbox

[V4,1/2] ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'

Message ID e28c4b4deaf766910c366ab87b64325da59c8ad6.1443198783.git.viresh.kumar@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Viresh Kumar Sept. 25, 2015, 4:41 p.m. UTC
global_lock is defined as an unsigned long and accessing only its lower
32 bits from sysfs is incorrect, as we need to consider other 32 bits
for big endian 64 bit systems. There are no such platforms yet, but the
code needs to be robust for such a case.

Fix that by passing a local variable to debugfs_create_bool() and
assigning its value to global_lock later.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V3->V4:
- Create a local variable instead of changing type of global_lock
  (Rafael)
- Drop the stable tag
- BCC'd a lot of people (rather than cc'ing them) to make sure
  - the series reaches them
  - mailing lists do not block the patchset due to long cc list
  - and we don't spam the BCC'd people for every reply
---
 drivers/acpi/ec_sys.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Johannes Berg Sept. 25, 2015, 5:42 p.m. UTC | #1
On Fri, 2015-09-25 at 09:41 -0700, Viresh Kumar wrote:

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3->V4:
> - Create a local variable instead of changing type of global_lock
>   (Rafael)

Err, surely that wasn't what Rafael meant, since it's clearly
impossible to use a pointer to the stack, assign to it once, and the
expect anything to wkr at all ...

johannes
Viresh Kumar Sept. 25, 2015, 6:47 p.m. UTC | #2
On 25-09-15, 19:42, Johannes Berg wrote:
> On Fri, 2015-09-25 at 09:41 -0700, Viresh Kumar wrote:
> 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> > V3->V4:
> > - Create a local variable instead of changing type of global_lock
> >   (Rafael)
> 
> Err, surely that wasn't what Rafael meant, since it's clearly
> impossible to use a pointer to the stack, assign to it once, and the
> expect anything to wkr at all ...

Sorry, I am not sure on what wouldn't work with this patch but this is
what Rafael said, and I don't think I wrote it differently :)

Rafael wrote:
> Actually, what about adding a local u32 variable, say val, here and doing
> 
> >  	if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 *)&first_ec->gpe))
> >  		goto error;
> >  	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
> > -				 (u32 *)&first_ec->global_lock))
> > +				 &first_ec->global_lock))
> 
> 	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir, &val))
> 
> >  		goto error;
> 
> 	first_ec->global_lock = val;
> 
> And then you can turn val into bool just fine without changing the structure
> definition.
Johannes Berg Sept. 25, 2015, 6:49 p.m. UTC | #3
> Rafael wrote:
> > Actually, what about adding a local u32 variable, say val, here and 
> > doing
> > 
> > >  	if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 
> > > *)&first_ec->gpe))
> > >  		goto error;
> > >  	if (!debugfs_create_bool("use_global_lock", 0444, 
> > > dev_dir,
> > > -				 (u32 *)&first_ec->global_lock))
> > > +				 &first_ec->global_lock))
> > 
> > 	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir, 
> > &val))
> > 
> > >  		goto error;
> > 
> > 	first_ec->global_lock = val;
> > 
> > And then you can turn val into bool just fine without changing the 
> > structure
> > definition.

Ok, then, but that means Rafael is completely wrong ...
debugfs_create_bool() takes a *pointer* and it needs to be long-lived,
it can't be on the stack. You also don't get a call when it changes.

If you cannot change the struct definition then you must implement a
debugfs file with its own read/write handlers.

johannes
Viresh Kumar Sept. 25, 2015, 6:52 p.m. UTC | #4
On 25-09-15, 20:49, Johannes Berg wrote:
> Ok, then, but that means Rafael is completely wrong ...
> debugfs_create_bool() takes a *pointer* and it needs to be long-lived,
> it can't be on the stack. You also don't get a call when it changes.

Ahh, ofcourse. My bad as well...

I think we can change structure definition but will wait for Rafael's
comment before that.
Rafael J. Wysocki Sept. 25, 2015, 8:18 p.m. UTC | #5
On Friday, September 25, 2015 09:41:37 AM Viresh Kumar wrote:
> global_lock is defined as an unsigned long and accessing only its lower
> 32 bits from sysfs is incorrect, as we need to consider other 32 bits
> for big endian 64 bit systems. There are no such platforms yet, but the
> code needs to be robust for such a case.
> 
> Fix that by passing a local variable to debugfs_create_bool() and
> assigning its value to global_lock later.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Greg, please take this one if the [2/2] looks good to you.

> ---
> V3->V4:
> - Create a local variable instead of changing type of global_lock
>   (Rafael)
> - Drop the stable tag
> - BCC'd a lot of people (rather than cc'ing them) to make sure
>   - the series reaches them
>   - mailing lists do not block the patchset due to long cc list
>   - and we don't spam the BCC'd people for every reply
> ---
>  drivers/acpi/ec_sys.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
> index b4c216bab22b..b44b91331a56 100644
> --- a/drivers/acpi/ec_sys.c
> +++ b/drivers/acpi/ec_sys.c
> @@ -110,6 +110,7 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec, unsigned int ec_device_count)
>  	struct dentry *dev_dir;
>  	char name[64];
>  	umode_t mode = 0400;
> +	u32 val;
>  
>  	if (ec_device_count == 0) {
>  		acpi_ec_debugfs_dir = debugfs_create_dir("ec", NULL);
> @@ -127,10 +128,11 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec, unsigned int ec_device_count)
>  
>  	if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 *)&first_ec->gpe))
>  		goto error;
> -	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
> -				 (u32 *)&first_ec->global_lock))
> +	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir, &val))
>  		goto error;
>  
> +	first_ec->global_lock = val;
> +
>  	if (write_support)
>  		mode = 0600;
>  	if (!debugfs_create_file("io", mode, dev_dir, ec, &acpi_ec_io_ops))
> 

Thanks,
Rafael
Rafael J. Wysocki Sept. 25, 2015, 8:22 p.m. UTC | #6
On Friday, September 25, 2015 10:18:13 PM Rafael J. Wysocki wrote:
> On Friday, September 25, 2015 09:41:37 AM Viresh Kumar wrote:
> > global_lock is defined as an unsigned long and accessing only its lower
> > 32 bits from sysfs is incorrect, as we need to consider other 32 bits
> > for big endian 64 bit systems. There are no such platforms yet, but the
> > code needs to be robust for such a case.
> > 
> > Fix that by passing a local variable to debugfs_create_bool() and
> > assigning its value to global_lock later.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Greg, please take this one if the [2/2] looks good to you.

Ouch, turns out it was a bad idea.  Please scratch that.

Thanks,
Rafael
Viresh Kumar Sept. 25, 2015, 8:25 p.m. UTC | #7
On 25 September 2015 at 13:33, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> You're going to change that into bool in the next patch, right?

Yeah.

> So what if bool is a byte and the field is not word-aligned

Its between two 'unsigned long' variables today, and the struct isn't packed.
So, it will be aligned, isn't it?

> and changing
> that byte requires a read-modify-write.  How do we ensure that things remain
> consistent in that case?

I didn't understood why a read-modify-write is special here? That's
what will happen
to most of the non-word-sized fields anyway?

Probably I didn't understood what you meant..

--
viresh
Rafael J. Wysocki Sept. 25, 2015, 8:26 p.m. UTC | #8
On Friday, September 25, 2015 11:52:56 AM Viresh Kumar wrote:
> On 25-09-15, 20:49, Johannes Berg wrote:
> > Ok, then, but that means Rafael is completely wrong ...
> > debugfs_create_bool() takes a *pointer* and it needs to be long-lived,
> > it can't be on the stack. You also don't get a call when it changes.
> 
> Ahh, ofcourse. My bad as well...

Well, sorry about the wrong suggestion.

> I think we can change structure definition but will wait for Rafael's
> comment before that.

OK, change the structure then.

Thanks,
Rafael
Rafael J. Wysocki Sept. 25, 2015, 8:33 p.m. UTC | #9
On Friday, September 25, 2015 10:26:22 PM Rafael J. Wysocki wrote:
> On Friday, September 25, 2015 11:52:56 AM Viresh Kumar wrote:
> > On 25-09-15, 20:49, Johannes Berg wrote:
> > > Ok, then, but that means Rafael is completely wrong ...
> > > debugfs_create_bool() takes a *pointer* and it needs to be long-lived,
> > > it can't be on the stack. You also don't get a call when it changes.
> > 
> > Ahh, ofcourse. My bad as well...
> 
> Well, sorry about the wrong suggestion.
> 
> > I think we can change structure definition but will wait for Rafael's
> > comment before that.
> 
> OK, change the structure then.

But here's a question.

You're going to change that into bool in the next patch, right?

So what if bool is a byte and the field is not word-aligned and changing
that byte requires a read-modify-write.  How do we ensure that things remain
consistent in that case?

Thanks,
Rafael
Rafael J. Wysocki Sept. 25, 2015, 8:58 p.m. UTC | #10
On Friday, September 25, 2015 01:25:49 PM Viresh Kumar wrote:
> On 25 September 2015 at 13:33, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > You're going to change that into bool in the next patch, right?
> 
> Yeah.
> 
> > So what if bool is a byte and the field is not word-aligned
> 
> Its between two 'unsigned long' variables today, and the struct isn't packed.
> So, it will be aligned, isn't it?
> 
> > and changing
> > that byte requires a read-modify-write.  How do we ensure that things remain
> > consistent in that case?
> 
> I didn't understood why a read-modify-write is special here? That's
> what will happen
> to most of the non-word-sized fields anyway?
> 
> Probably I didn't understood what you meant..

Say you have three adjacent fields in a structure, x, y, z, each one byte long.
Initially, all of them are equal to 0.

CPU A writes 1 to x and CPU B writes 2 to y at the same time.

What's the result?

Thanks,
Rafael
Viresh Kumar Sept. 25, 2015, 9:44 p.m. UTC | #11
On 25-09-15, 22:58, Rafael J. Wysocki wrote:
> Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> Initially, all of them are equal to 0.
> 
> CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> 
> What's the result?

But then two CPUs can update the same variable as well, and we must
have proper locking in place to fix that.

Anyway, that problem isn't here for sure as its between two
unsigned-longs. So, should I just move it to bool and resend ?
Rafael J. Wysocki Sept. 25, 2015, 10:19 p.m. UTC | #12
On Fri, Sep 25, 2015 at 11:44 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 25-09-15, 22:58, Rafael J. Wysocki wrote:
>> Say you have three adjacent fields in a structure, x, y, z, each one byte long.
>> Initially, all of them are equal to 0.
>>
>> CPU A writes 1 to x and CPU B writes 2 to y at the same time.
>>
>> What's the result?
>
> But then two CPUs can update the same variable as well, and we must
> have proper locking in place to fix that.

Right.

So if you allow something like debugfs to update your structure, how
do you make sure there is the proper locking?

Is that not a problem in all of the places modified by the [2/2]?

How can the future users of the API know what to do to avoid potential
problems with it?

>
> Anyway, that problem isn't here for sure as its between two
> unsigned-longs. So, should I just move it to bool and resend ?

I guess it might be more convenient to fold this into the other patch,
because we seem to be splitting hairs here.

Thanks,
Rafael
Viresh Kumar Sept. 26, 2015, 6:40 p.m. UTC | #13
On 25 September 2015 at 15:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> So if you allow something like debugfs to update your structure, how
> do you make sure there is the proper locking?

Not really sure at all.. Isn't there some debugfs locking that will
jump in, to avoid updation of fields to the same device?

> Is that not a problem in all of the places modified by the [2/2]?

Right, but its not new AFAICT.

We already have u32 fields in those structs and on 64 bit machines
we have the same read-update-write problems for two consecutive
u32's. Right?

> How can the future users of the API know what to do to avoid potential
> problems with it?

No idea really :)

>> Anyway, that problem isn't here for sure as its between two
>> unsigned-longs. So, should I just move it to bool and resend ?
>
> I guess it might be more convenient to fold this into the other patch,
> because we seem to be splitting hairs here.

I can and that's what I did. But then Arnd asked me to separate it
out. I can fold it back if that's what you want.

--
viresh
Arnd Bergmann Sept. 26, 2015, 7:33 p.m. UTC | #14
On Saturday 26 September 2015 11:40:00 Viresh Kumar wrote:
> On 25 September 2015 at 15:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > So if you allow something like debugfs to update your structure, how
> > do you make sure there is the proper locking?
> 
> Not really sure at all.. Isn't there some debugfs locking that will
> jump in, to avoid updation of fields to the same device?

No, if you need any locking to access variable, you cannot use the
simple debugfs helpers but have to provide your own functions.

> >> Anyway, that problem isn't here for sure as its between two
> >> unsigned-longs. So, should I just move it to bool and resend ?
> >
> > I guess it might be more convenient to fold this into the other patch,
> > because we seem to be splitting hairs here.
> 
> I can and that's what I did. But then Arnd asked me to separate it
> out. I can fold it back if that's what you want.

It still makes sense to keep it separate I think, the patch is clearly
different from the other parts.

	Arnd
James Bottomley Sept. 26, 2015, 7:52 p.m. UTC | #15
On Fri, 2015-09-25 at 22:58 +0200, Rafael J. Wysocki wrote:
> On Friday, September 25, 2015 01:25:49 PM Viresh Kumar wrote:
> > On 25 September 2015 at 13:33, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > You're going to change that into bool in the next patch, right?
> > 
> > Yeah.
> > 
> > > So what if bool is a byte and the field is not word-aligned
> > 
> > Its between two 'unsigned long' variables today, and the struct isn't packed.
> > So, it will be aligned, isn't it?
> > 
> > > and changing
> > > that byte requires a read-modify-write.  How do we ensure that things remain
> > > consistent in that case?
> > 
> > I didn't understood why a read-modify-write is special here? That's
> > what will happen
> > to most of the non-word-sized fields anyway?
> > 
> > Probably I didn't understood what you meant..
> 
> Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> Initially, all of them are equal to 0.
> 
> CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> 
> What's the result?

I think every CPU's  cache architecure guarantees adjacent store
integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
thinking of old alpha SMP system where the lowest store width is 32 bits
and thus you have to do RMW to update a byte, this was usually fixed by
padding (assuming the structure is not packed).  However, it was such a
problem that even the later alpha chips had byte extensions.

James
Jiri Slaby Sept. 27, 2015, 5:31 a.m. UTC | #16
Dne 25. 9. 2015 18:42 napsal uživatel "Viresh Kumar" <
viresh.kumar@linaro.org>:
>
> global_lock is defined as an unsigned long and accessing only its lower
> 32 bits from sysfs is incorrect, as we need to consider other 32 bits
> for big endian 64 bit systems. There are no such platforms yet, but the
> code needs to be robust for such a case.
>
> Fix that by passing a local variable to debugfs_create_bool() and
> assigning its value to global_lock later.

But this has to crash whenever the file is read as val's storage is gone at
that moment already, right?

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V3->V4:
> - Create a local variable instead of changing type of global_lock
>   (Rafael)
> - Drop the stable tag
> - BCC'd a lot of people (rather than cc'ing them) to make sure
>   - the series reaches them
>   - mailing lists do not block the patchset due to long cc list
>   - and we don't spam the BCC'd people for every reply
> ---
>  drivers/acpi/ec_sys.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
> index b4c216bab22b..b44b91331a56 100644
> --- a/drivers/acpi/ec_sys.c
> +++ b/drivers/acpi/ec_sys.c
> @@ -110,6 +110,7 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec,
unsigned int ec_device_count)
>         struct dentry *dev_dir;
>         char name[64];
>         umode_t mode = 0400;
> +       u32 val;
>
>         if (ec_device_count == 0) {
>                 acpi_ec_debugfs_dir = debugfs_create_dir("ec", NULL);
> @@ -127,10 +128,11 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec,
unsigned int ec_device_count)
>
>         if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32
*)&first_ec->gpe))
>                 goto error;
> -       if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
> -                                (u32 *)&first_ec->global_lock))
> +       if (!debugfs_create_bool("use_global_lock", 0444, dev_dir, &val))
>                 goto error;
>
> +       first_ec->global_lock = val;
> +
>         if (write_support)
>                 mode = 0600;
>         if (!debugfs_create_file("io", mode, dev_dir, ec,
&acpi_ec_io_ops))
> --
> 2.4.0
>
Rafael J. Wysocki Sept. 27, 2015, 2:09 p.m. UTC | #17
On Saturday, September 26, 2015 12:52:08 PM James Bottomley wrote:
> On Fri, 2015-09-25 at 22:58 +0200, Rafael J. Wysocki wrote:
> > On Friday, September 25, 2015 01:25:49 PM Viresh Kumar wrote:
> > > On 25 September 2015 at 13:33, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > > You're going to change that into bool in the next patch, right?
> > > 
> > > Yeah.
> > > 
> > > > So what if bool is a byte and the field is not word-aligned
> > > 
> > > Its between two 'unsigned long' variables today, and the struct isn't packed.
> > > So, it will be aligned, isn't it?
> > > 
> > > > and changing
> > > > that byte requires a read-modify-write.  How do we ensure that things remain
> > > > consistent in that case?
> > > 
> > > I didn't understood why a read-modify-write is special here? That's
> > > what will happen
> > > to most of the non-word-sized fields anyway?
> > > 
> > > Probably I didn't understood what you meant..
> > 
> > Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> > Initially, all of them are equal to 0.
> > 
> > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > 
> > What's the result?
> 
> I think every CPU's  cache architecure guarantees adjacent store
> integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> thinking of old alpha SMP system where the lowest store width is 32 bits
> and thus you have to do RMW to update a byte, this was usually fixed by
> padding (assuming the structure is not packed).  However, it was such a
> problem that even the later alpha chips had byte extensions.

OK, thanks!

Rafael
Rafael J. Wysocki Sept. 27, 2015, 2:10 p.m. UTC | #18
On Saturday, September 26, 2015 09:33:56 PM Arnd Bergmann wrote:
> On Saturday 26 September 2015 11:40:00 Viresh Kumar wrote:
> > On 25 September 2015 at 15:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > So if you allow something like debugfs to update your structure, how
> > > do you make sure there is the proper locking?
> > 
> > Not really sure at all.. Isn't there some debugfs locking that will
> > jump in, to avoid updation of fields to the same device?
> 
> No, if you need any locking to access variable, you cannot use the
> simple debugfs helpers but have to provide your own functions.
> 
> > >> Anyway, that problem isn't here for sure as its between two
> > >> unsigned-longs. So, should I just move it to bool and resend ?
> > >
> > > I guess it might be more convenient to fold this into the other patch,
> > > because we seem to be splitting hairs here.
> > 
> > I can and that's what I did. But then Arnd asked me to separate it
> > out. I can fold it back if that's what you want.
> 
> It still makes sense to keep it separate I think, the patch is clearly
> different from the other parts.

I just don't see much point in going from unsigned long to u32 and then
from 32 to bool if we can go directly to bool in one go.

Thanks,
Rafael
Viresh Kumar Sept. 27, 2015, 2:35 p.m. UTC | #19
On 26 September 2015 at 22:31, Jiri Slaby <jirislaby@gmail.com> wrote:
> But this has to crash whenever the file is read as val's storage is gone at
> that moment already, right?

Yeah, its fixed now in the new version. This was a *really* bad idea :(
Arnd Bergmann Sept. 28, 2015, 8:24 a.m. UTC | #20
On Sunday 27 September 2015 16:10:48 Rafael J. Wysocki wrote:
> On Saturday, September 26, 2015 09:33:56 PM Arnd Bergmann wrote:
> > On Saturday 26 September 2015 11:40:00 Viresh Kumar wrote:
> > > On 25 September 2015 at 15:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > So if you allow something like debugfs to update your structure, how
> > > > do you make sure there is the proper locking?
> > > 
> > > Not really sure at all.. Isn't there some debugfs locking that will
> > > jump in, to avoid updation of fields to the same device?
> > 
> > No, if you need any locking to access variable, you cannot use the
> > simple debugfs helpers but have to provide your own functions.
> > 
> > > >> Anyway, that problem isn't here for sure as its between two
> > > >> unsigned-longs. So, should I just move it to bool and resend ?
> > > >
> > > > I guess it might be more convenient to fold this into the other patch,
> > > > because we seem to be splitting hairs here.
> > > 
> > > I can and that's what I did. But then Arnd asked me to separate it
> > > out. I can fold it back if that's what you want.
> > 
> > It still makes sense to keep it separate I think, the patch is clearly
> > different from the other parts.
> 
> I just don't see much point in going from unsigned long to u32 and then
> from 32 to bool if we can go directly to bool in one go.

It's only important to keep the 34-file multi-subsystem trivial cleanup
that doesn't change any functionality separate from the bugfix. If you
like to avoid patching one of the files twice, the alternative would
be to first change the API for all other instances from u32 to bool
and leave ACPI alone, and then do the second patch that changes ACPI
from long to bool.

	Arnd
David Laight Sept. 28, 2015, 8:58 a.m. UTC | #21
From: Rafael J. Wysocki
> Sent: 27 September 2015 15:09
...
> > > Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> > > Initially, all of them are equal to 0.
> > >
> > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > >
> > > What's the result?
> >
> > I think every CPU's  cache architecure guarantees adjacent store
> > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > thinking of old alpha SMP system where the lowest store width is 32 bits
> > and thus you have to do RMW to update a byte, this was usually fixed by
> > padding (assuming the structure is not packed).  However, it was such a
> > problem that even the later alpha chips had byte extensions.

Does linux still support those old Alphas?

The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.

> OK, thanks!

You still have to ensure the compiler doesn't do wider rmw cycles.
I believe the recent versions of gcc won't do wider accesses for volatile data.

	David
Rafael J. Wysocki Sept. 28, 2015, 1:07 p.m. UTC | #22
On Monday, September 28, 2015 10:24:58 AM Arnd Bergmann wrote:
> On Sunday 27 September 2015 16:10:48 Rafael J. Wysocki wrote:
> > On Saturday, September 26, 2015 09:33:56 PM Arnd Bergmann wrote:
> > > On Saturday 26 September 2015 11:40:00 Viresh Kumar wrote:
> > > > On 25 September 2015 at 15:19, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > > So if you allow something like debugfs to update your structure, how
> > > > > do you make sure there is the proper locking?
> > > > 
> > > > Not really sure at all.. Isn't there some debugfs locking that will
> > > > jump in, to avoid updation of fields to the same device?
> > > 
> > > No, if you need any locking to access variable, you cannot use the
> > > simple debugfs helpers but have to provide your own functions.
> > > 
> > > > >> Anyway, that problem isn't here for sure as its between two
> > > > >> unsigned-longs. So, should I just move it to bool and resend ?
> > > > >
> > > > > I guess it might be more convenient to fold this into the other patch,
> > > > > because we seem to be splitting hairs here.
> > > > 
> > > > I can and that's what I did. But then Arnd asked me to separate it
> > > > out. I can fold it back if that's what you want.
> > > 
> > > It still makes sense to keep it separate I think, the patch is clearly
> > > different from the other parts.
> > 
> > I just don't see much point in going from unsigned long to u32 and then
> > from 32 to bool if we can go directly to bool in one go.
> 
> It's only important to keep the 34-file multi-subsystem trivial cleanup
> that doesn't change any functionality separate from the bugfix.

Which isn't a bugfix really, because the EC code is not run on any big-endian
systems to my knowledge.  And it won't matter after the [2/2] anyway.

And the changelog of it doesn't really makes sense, because it talks about
future systems, but after the [2/2] no future systems will run that code in
the first place.

> If you like to avoid patching one of the files twice, the alternative would
> be to first change the API for all other instances from u32 to bool
> and leave ACPI alone, and then do the second patch that changes ACPI
> from long to bool.

My point is that this patch doesn't matter.  It doesn't really fix anything
and the result of it goes away after the second patch.

The only marginal value of having it as a separate commit is in case if
(a) we need to revert the [2/2] for some reason and (b) ACPI-based ARM systems
(the big-endian ones) become full-hardware at one point.  You know what the
chances of that are, though. :-)

That said I've ACKed the patch, because I don't care that much.  I'm not exactly
sure why you care either.

Thanks,
Rafael
James Bottomley Sept. 28, 2015, 2:26 p.m. UTC | #23
On Mon, 2015-09-28 at 08:58 +0000, David Laight wrote:
> From: Rafael J. Wysocki
> > Sent: 27 September 2015 15:09
> ...
> > > > Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> > > > Initially, all of them are equal to 0.
> > > >
> > > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > > >
> > > > What's the result?
> > >
> > > I think every CPU's  cache architecure guarantees adjacent store
> > > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > > thinking of old alpha SMP system where the lowest store width is 32 bits
> > > and thus you have to do RMW to update a byte, this was usually fixed by
> > > padding (assuming the structure is not packed).  However, it was such a
> > > problem that even the later alpha chips had byte extensions.
> 
> Does linux still support those old Alphas?
> 
> The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.

That's different: it's an atomic RMW operation.  The problem with the
alpha was that the operation wasn't atomic (meaning that it can't be
interrupted and no intermediate output states are visible).

> > OK, thanks!
> 
> You still have to ensure the compiler doesn't do wider rmw cycles.
> I believe the recent versions of gcc won't do wider accesses for volatile data.

I don't understand this comment.  You seem to be implying gcc would do a
64 bit RMW for a 32 bit store ... that would be daft when a single
instruction exists to perform the operation on all architectures.

James
David Laight Sept. 28, 2015, 2:50 p.m. UTC | #24
From: James Bottomley [mailto:James.Bottomley@HansenPartnership.com]
> Sent: 28 September 2015 15:27
> On Mon, 2015-09-28 at 08:58 +0000, David Laight wrote:
> > From: Rafael J. Wysocki
> > > Sent: 27 September 2015 15:09
> > ...
> > > > > Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> > > > > Initially, all of them are equal to 0.
> > > > >
> > > > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > > > >
> > > > > What's the result?
> > > >
> > > > I think every CPU's  cache architecure guarantees adjacent store
> > > > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > > > thinking of old alpha SMP system where the lowest store width is 32 bits
> > > > and thus you have to do RMW to update a byte, this was usually fixed by
> > > > padding (assuming the structure is not packed).  However, it was such a
> > > > problem that even the later alpha chips had byte extensions.
> >
> > Does linux still support those old Alphas?
> >
> > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> 
> That's different: it's an atomic RMW operation.  The problem with the
> alpha was that the operation wasn't atomic (meaning that it can't be
> interrupted and no intermediate output states are visible).

It is only atomic if prefixed by the 'lock' prefix.
Normally the read and write are separate bus cycles.
 
> > You still have to ensure the compiler doesn't do wider rmw cycles.
> > I believe the recent versions of gcc won't do wider accesses for volatile data.
> 
> I don't understand this comment.  You seem to be implying gcc would do a
> 64 bit RMW for a 32 bit store ... that would be daft when a single
> instruction exists to perform the operation on all architectures.

Read the object code and weep...
It is most likely to happen for operations that are rmw (eg bit set).
For instance the arm cpu has limited offsets for 16bit accesses, for
normal structures the compiler is likely to use a 32bit rmw sequence
for a 16bit field that has a large offset.
The C language allows the compiler to do it for any access (IIRC including
volatiles).

	David
James Bottomley Sept. 28, 2015, 3:11 p.m. UTC | #25
On Mon, 2015-09-28 at 14:50 +0000, David Laight wrote:
> From: James Bottomley [mailto:James.Bottomley@HansenPartnership.com]
> > Sent: 28 September 2015 15:27
> > On Mon, 2015-09-28 at 08:58 +0000, David Laight wrote:
> > > From: Rafael J. Wysocki
> > > > Sent: 27 September 2015 15:09
> > > ...
> > > > > > Say you have three adjacent fields in a structure, x, y, z, each one byte long.
> > > > > > Initially, all of them are equal to 0.
> > > > > >
> > > > > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > > > > >
> > > > > > What's the result?
> > > > >
> > > > > I think every CPU's  cache architecure guarantees adjacent store
> > > > > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > > > > thinking of old alpha SMP system where the lowest store width is 32 bits
> > > > > and thus you have to do RMW to update a byte, this was usually fixed by
> > > > > padding (assuming the structure is not packed).  However, it was such a
> > > > > problem that even the later alpha chips had byte extensions.
> > >
> > > Does linux still support those old Alphas?
> > >
> > > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> > 
> > That's different: it's an atomic RMW operation.  The problem with the
> > alpha was that the operation wasn't atomic (meaning that it can't be
> > interrupted and no intermediate output states are visible).
> 
> It is only atomic if prefixed by the 'lock' prefix.
> Normally the read and write are separate bus cycles.

The essential point is that x86 has atomic bit ops and byte writes.
Early alpha did not.

> > > You still have to ensure the compiler doesn't do wider rmw cycles.
> > > I believe the recent versions of gcc won't do wider accesses for volatile data.
> > 
> > I don't understand this comment.  You seem to be implying gcc would do a
> > 64 bit RMW for a 32 bit store ... that would be daft when a single
> > instruction exists to perform the operation on all architectures.
> 
> Read the object code and weep...
> It is most likely to happen for operations that are rmw (eg bit set).
> For instance the arm cpu has limited offsets for 16bit accesses, for
> normal structures the compiler is likely to use a 32bit rmw sequence
> for a 16bit field that has a large offset.
> The C language allows the compiler to do it for any access (IIRC including
> volatiles).

I think you might be confusing different things.  Most RISC CPUs can't
do 32 bit store immediates because there aren't enough bits in their
arsenal, so they tend to split 32 bit loads into a left and right part
(first the top then the offset).  This (and other things) are mostly
what you see in code.  However, 32 bit register stores are still atomic,
which is all we require.  It's not really the compiler's fault, it's
mostly an architectural limitation.

James
David Laight Sept. 28, 2015, 3:31 p.m. UTC | #26
From: James Bottomley 
> Sent: 28 September 2015 16:12
> > > > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> > >
> > > That's different: it's an atomic RMW operation.  The problem with the
> > > alpha was that the operation wasn't atomic (meaning that it can't be
> > > interrupted and no intermediate output states are visible).
> >
> > It is only atomic if prefixed by the 'lock' prefix.
> > Normally the read and write are separate bus cycles.
> 
> The essential point is that x86 has atomic bit ops and byte writes.
> Early alpha did not.

Early alpha didn't have any byte accesses.

On x86 if you have the following:
	struct {
		char  a;
		volatile char b;
	} *foo;
	foo->a |= 4;

The compiler is likely to generate a 'bis #4, 0(rbx)' (or similar)
and the cpu will do two 32bit memory cycles that read and write
the 'volatile' field 'b'.
(gcc definitely used to do this...)

A lot of fields were made 32bit (and probably not bitfields) in the linux
kernel tree a year or two ago to avoid this very problem.

> > > > You still have to ensure the compiler doesn't do wider rmw cycles.
> > > > I believe the recent versions of gcc won't do wider accesses for volatile data.
> > >
> > > I don't understand this comment.  You seem to be implying gcc would do a
> > > 64 bit RMW for a 32 bit store ... that would be daft when a single
> > > instruction exists to perform the operation on all architectures.
> >
> > Read the object code and weep...
> > It is most likely to happen for operations that are rmw (eg bit set).
> > For instance the arm cpu has limited offsets for 16bit accesses, for
> > normal structures the compiler is likely to use a 32bit rmw sequence
> > for a 16bit field that has a large offset.
> > The C language allows the compiler to do it for any access (IIRC including
> > volatiles).
> 
> I think you might be confusing different things.  Most RISC CPUs can't
> do 32 bit store immediates because there aren't enough bits in their
> arsenal, so they tend to split 32 bit loads into a left and right part
> (first the top then the offset).  This (and other things) are mostly
> what you see in code.  However, 32 bit register stores are still atomic,
> which is all we require.  It's not really the compiler's fault, it's
> mostly an architectural limitation.

No, I'm not talking about how 32bit constants are generated.
I'm talking about structure offsets.

	David
diff mbox

Patch

diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
index b4c216bab22b..b44b91331a56 100644
--- a/drivers/acpi/ec_sys.c
+++ b/drivers/acpi/ec_sys.c
@@ -110,6 +110,7 @@  static int acpi_ec_add_debugfs(struct acpi_ec *ec, unsigned int ec_device_count)
 	struct dentry *dev_dir;
 	char name[64];
 	umode_t mode = 0400;
+	u32 val;
 
 	if (ec_device_count == 0) {
 		acpi_ec_debugfs_dir = debugfs_create_dir("ec", NULL);
@@ -127,10 +128,11 @@  static int acpi_ec_add_debugfs(struct acpi_ec *ec, unsigned int ec_device_count)
 
 	if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 *)&first_ec->gpe))
 		goto error;
-	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
-				 (u32 *)&first_ec->global_lock))
+	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir, &val))
 		goto error;
 
+	first_ec->global_lock = val;
+
 	if (write_support)
 		mode = 0600;
 	if (!debugfs_create_file("io", mode, dev_dir, ec, &acpi_ec_io_ops))