[4/4] btrfs: sysfs, add devid/dev_state kobject and attribute
diff mbox series

Message ID 20191205112706.8125-5-anand.jain@oracle.com
State New
Headers show
Series
  • btrfs, sysfs cleanup and add dev_state
Related show

Commit Message

Anand Jain Dec. 5, 2019, 11:27 a.m. UTC
A new sysfs RW attribute
  UUID/devinfo/<devid>/dev_state
is added here. The dev_state here reflects the state of the device from
the kernel parameter btrfs_device::dev_state and this attribute is born
after the device is mounted and goes along with the dynamic nature of the
device add and delete. This attribute gets deleted at unmount.

For example:
pwd
 /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
cat 1/dev_state
 IN_FS_METADATA MISSING
cat 2/dev_state
 WRITABLE IN_FS_METADATA

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c   | 101 ++++++++++++++++++++++++++++++++++-----------
 fs/btrfs/volumes.h |   2 +
 2 files changed, 80 insertions(+), 23 deletions(-)

Comments

David Sterba Dec. 5, 2019, 2:10 p.m. UTC | #1
On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
> +static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
> +{
> +	int state;
> +	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
> +					   "MISSING", "REPLACE_TGT", "FLUSHING"
> +					 };
> +
> +	n = n - strlen(dev_state_str) - 1;
> +
> +	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
> +		if (test_bit(state, &device->dev_state)) {
> +			if (strlen(dev_state_str))
> +				strncat(dev_state_str, " ", n);
> +			strncat(dev_state_str, btrfs_dev_states[state], n);
> +		}
> +	}
> +	strncat(dev_state_str, "\n", n);

Please use the snprintf way as in supported_checksums_show and not the
str* functions.
David Sterba Dec. 5, 2019, 2:21 p.m. UTC | #2
On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
> A new sysfs RW attribute

Why RW? Most attributes/files in sysfs are supposed to be read-only, but
you actually define the attribute as read-only with the macro
BTRFS_ATTR.

>   UUID/devinfo/<devid>/dev_state
> is added here. The dev_state here reflects the state of the device from
> the kernel parameter btrfs_device::dev_state and this attribute is born
> after the device is mounted and goes along with the dynamic nature of the
> device add and delete. This attribute gets deleted at unmount.
> 
> For example:
> pwd
>  /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
> cat 1/dev_state
>  IN_FS_METADATA MISSING
> cat 2/dev_state
>  WRITABLE IN_FS_METADATA

So the values copy the device state macros, that's probably ok.

> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> +					  struct kobj_attribute *a, char *buf)
> +{
> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
> +						   devid_kobj);
> +
> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);

The device access is unprotected, you need at least RCU but that still
does not prevent from the device being freed by deletion. The
device_list_mutex is quite heavy and allowing a DoS by repeatedly
reading the file contents is not something we want to allow.
Anand Jain Dec. 5, 2019, 2:38 p.m. UTC | #3
On 12/5/19 10:21 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
>> A new sysfs RW attribute
> 
> Why RW? Most attributes/files in sysfs are supposed to be read-only, but
> you actually define the attribute as read-only with the macro
> BTRFS_ATTR.
> 

  Oh. Change log has to be fixed here.

  But I am also working on a new patch to make this RW, to set the
  new read_preferred state on the device.

>>    UUID/devinfo/<devid>/dev_state
>> is added here. The dev_state here reflects the state of the device from
>> the kernel parameter btrfs_device::dev_state and this attribute is born
>> after the device is mounted and goes along with the dynamic nature of the
>> device add and delete. This attribute gets deleted at unmount.
>>
>> For example:
>> pwd
>>   /sys/fs/btrfs/6e1961f1-5918-4ecc-a22f-948897b409f7/devinfo
>> cat 1/dev_state
>>   IN_FS_METADATA MISSING
>> cat 2/dev_state
>>   WRITABLE IN_FS_METADATA
> 
> So the values copy the device state macros, that's probably ok.
> 
  Yep.

>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>> +					  struct kobj_attribute *a, char *buf)
>> +{
>> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
>> +						   devid_kobj);
>> +
>> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> 
> The device access is unprotected, you need at least RCU but that still
> does not prevent from the device being freed by deletion.

  We need RCU let me fix. Device being deleted is fine, there
  is nothing to lose, another directory lookup will show that
  UUID/devinfo/<devid> is gone is the device is deleted.

> The
> device_list_mutex is quite heavy and allowing a DoS by repeatedly
> reading the file contents is not something we want to allow.
> 

   Yes we don't have to use device_list_mutex here, as its read,
   a refresh/re-read will refresh the dev_state.

Thanks, Anand
Anand Jain Dec. 5, 2019, 2:38 p.m. UTC | #4
On 12/5/19 10:10 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 07:27:06PM +0800, Anand Jain wrote:
>> +static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
>> +{
>> +	int state;
>> +	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
>> +					   "MISSING", "REPLACE_TGT", "FLUSHING"
>> +					 };
>> +
>> +	n = n - strlen(dev_state_str) - 1;
>> +
>> +	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
>> +		if (test_bit(state, &device->dev_state)) {
>> +			if (strlen(dev_state_str))
>> +				strncat(dev_state_str, " ", n);
>> +			strncat(dev_state_str, btrfs_dev_states[state], n);
>> +		}
>> +	}
>> +	strncat(dev_state_str, "\n", n);
> 
> Please use the snprintf way as in supported_checksums_show and not the
> str* functions.
> 

Ok. Will fix.

Thanks, Anand
David Sterba Dec. 5, 2019, 3:14 p.m. UTC | #5
On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
> > So the values copy the device state macros, that's probably ok.
>   Yep.

Although, sysfs files should print one value per file, which makes sense
in many cases, so eg. missing should exist separately too for quick
checks for the most common device states. The dev_state reflects the
internal state and is likely useful only for debugging.

> >> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> >> +					  struct kobj_attribute *a, char *buf)
> >> +{
> >> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
> >> +						   devid_kobj);
> >> +
> >> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> > 
> > The device access is unprotected, you need at least RCU but that still
> > does not prevent from the device being freed by deletion.
> 
>   We need RCU let me fix. Device being deleted is fine, there
>   is nothing to lose, another directory lookup will show that
>   UUID/devinfo/<devid> is gone is the device is deleted.

The device can be gone from the list but the sysfs files can still
exist.

    CPU1                                   CPU2

btrfs_rm_device
                                        open file
  btrfs_sysfs_rm_device_link
    btrfs_free_device
      kfree(device)
                                        call read, access freed device

> > The
> > device_list_mutex is quite heavy and allowing a DoS by repeatedly
> > reading the file contents is not something we want to allow.
> > 
> 
>    Yes we don't have to use device_list_mutex here, as its read,
>    a refresh/re-read will refresh the dev_state.

The point is not to synchronize the device state values but the device
object itself.
Anand Jain Dec. 6, 2019, 1:49 p.m. UTC | #6
On 5/12/19 11:14 PM, David Sterba wrote:
> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
>>> So the values copy the device state macros, that's probably ok.
>>    Yep.
> 
> Although, sysfs files should print one value per file, which makes sense
> in many cases, so eg. missing should exist separately too for quick
> checks for the most common device states. The dev_state reflects the
> internal state and is likely useful only for debugging.
> 

  I agree. Its better to create an individual attribute for each of the
  device states. For instance..

    under the 'UUID/devinfo/<devid>' kobject
    attributes will be:
      missing
      in_fs_metadata
      replace_target

   cat missing
    0
   cat in_fs_metadata
    1

   ..etc

  which seems to be more or less standard for block devices.

  Will fix it in v2.

>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>>>> +					  struct kobj_attribute *a, char *buf)
>>>> +{
>>>> +	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
>>>> +						   devid_kobj);
>>>> +
>>>> +	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
>>>
>>> The device access is unprotected, you need at least RCU but that still
>>> does not prevent from the device being freed by deletion.
>>
>>    We need RCU let me fix. Device being deleted is fine, there
>>    is nothing to lose, another directory lookup will show that
>>    UUID/devinfo/<devid> is gone is the device is deleted.
> 
> The device can be gone from the list but the sysfs files can still
> exist.
> 
>      CPU1                                   CPU2
> 
> btrfs_rm_device
>                                          open file
>    btrfs_sysfs_rm_device_link
>      btrfs_free_device
>        kfree(device)
>                                          call read, access freed device
> 

   I completely missed the sysfs synchronization with device delete.
   As in the other email discussion, I think a new rwlock shall suffice.
   And as its lock is only between device delete and sysfs so it shall
   be light weight without affecting the other device_list_mutex holders.

Thanks,
Anand

>>> The
>>> device_list_mutex is quite heavy and allowing a DoS by repeatedly
>>> reading the file contents is not something we want to allow.
>>>
>>
>>     Yes we don't have to use device_list_mutex here, as its read,
>>     a refresh/re-read will refresh the dev_state.
> 
> The point is not to synchronize the device state values but the device
> object itself.
Anand Jain Dec. 9, 2019, 2:05 p.m. UTC | #7
On 12/6/19 9:49 PM, Anand Jain wrote:
> 
> 
> On 5/12/19 11:14 PM, David Sterba wrote:
>> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
>>>> So the values copy the device state macros, that's probably ok.
>>>    Yep.
>>
>> Although, sysfs files should print one value per file, which makes sense
>> in many cases, so eg. missing should exist separately too for quick
>> checks for the most common device states. The dev_state reflects the
>> internal state and is likely useful only for debugging.
>>
> 
>   I agree. Its better to create an individual attribute for each of the
>   device states. For instance..
> 
>     under the 'UUID/devinfo/<devid>' kobject
>     attributes will be:
>       missing
>       in_fs_metadata
>       replace_target
> 
>    cat missing
>     0
>    cat in_fs_metadata
>     1
> 
>    ..etc
> 
>   which seems to be more or less standard for block devices.
> 
>   Will fix it in v2.

This is fixed in v2.


> 
>>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
>>>>> +                      struct kobj_attribute *a, char *buf)
>>>>> +{
>>>>> +    struct btrfs_device *device = container_of(kobj, struct 
>>>>> btrfs_device,
>>>>> +                           devid_kobj);
>>>>> +
>>>>> +    btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
>>>>
>>>> The device access is unprotected, you need at least RCU but that still
>>>> does not prevent from the device being freed by deletion.
>>>
>>>    We need RCU let me fix. Device being deleted is fine, there
>>>    is nothing to lose, another directory lookup will show that
>>>    UUID/devinfo/<devid> is gone is the device is deleted.
>>
>> The device can be gone from the list but the sysfs files can still
>> exist.
>>
>>      CPU1                                   CPU2
>>
>> btrfs_rm_device
>>                                          open file
>>    btrfs_sysfs_rm_device_link
>>      btrfs_free_device
>>        kfree(device)
>>                                          call read, access freed device
>>
> 
>    I completely missed the sysfs synchronization with device delete.
>    As in the other email discussion, I think a new rwlock shall suffice.
>    And as its lock is only between device delete and sysfs so it shall
>    be light weight without affecting the other device_list_mutex holders.

  Looked into this further, actually we don't need any lock here
  the device delete thread which calls kobject_put() makes sure
  sysfs read is closed. So an existing sysfs read thread will have
  to complete before device free.


       CPU1                                   CPU2

  btrfs_rm_device
                                           open file
     btrfs_sysfs_rm_device_link
                                           call read, access freed device
       sysfs waits for the open file
       to close.
       btrfs_free_device
         kfree(device)


Thanks
Anand


> Thanks,
> Anand
> 
>>>> The
>>>> device_list_mutex is quite heavy and allowing a DoS by repeatedly
>>>> reading the file contents is not something we want to allow.
>>>>
>>>
>>>     Yes we don't have to use device_list_mutex here, as its read,
>>>     a refresh/re-read will refresh the dev_state.
>>
>> The point is not to synchronize the device state values but the device
>> object itself.
> 
> 
>
David Sterba Dec. 9, 2019, 6:31 p.m. UTC | #8
On Mon, Dec 09, 2019 at 10:05:39PM +0800, Anand Jain wrote:
>   Looked into this further, actually we don't need any lock here
>   the device delete thread which calls kobject_put() makes sure
>   sysfs read is closed. So an existing sysfs read thread will have
>   to complete before device free.
> 
> 
>        CPU1                                   CPU2
> 
>   btrfs_rm_device
>                                            open file
>      btrfs_sysfs_rm_device_link
>                                            call read, access freed device
>        sysfs waits for the open file
>        to close.
>        btrfs_free_device
>          kfree(device)

Ok, as long as this holds and sysfs file is removed before the device is
freed, the locking is not necessary.
David Sterba Dec. 13, 2019, 4:43 p.m. UTC | #9
On Mon, Dec 09, 2019 at 10:05:39PM +0800, Anand Jain wrote:
> On 12/6/19 9:49 PM, Anand Jain wrote:
> > 
> > 
> > On 5/12/19 11:14 PM, David Sterba wrote:
> >> On Thu, Dec 05, 2019 at 10:38:15PM +0800, Anand Jain wrote:
> >>>> So the values copy the device state macros, that's probably ok.
> >>>    Yep.
> >>
> >> Although, sysfs files should print one value per file, which makes sense
> >> in many cases, so eg. missing should exist separately too for quick
> >> checks for the most common device states. The dev_state reflects the
> >> internal state and is likely useful only for debugging.
> >>
> > 
> >   I agree. Its better to create an individual attribute for each of the
> >   device states. For instance..
> > 
> >     under the 'UUID/devinfo/<devid>' kobject
> >     attributes will be:
> >       missing
> >       in_fs_metadata
> >       replace_target
> > 
> >    cat missing
> >     0
> >    cat in_fs_metadata
> >     1
> > 
> >    ..etc
> > 
> >   which seems to be more or less standard for block devices.
> > 
> >   Will fix it in v2.
> 
> This is fixed in v2.
> 
> 
> > 
> >>>>> +static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
> >>>>> +                      struct kobj_attribute *a, char *buf)
> >>>>> +{
> >>>>> +    struct btrfs_device *device = container_of(kobj, struct 
> >>>>> btrfs_device,
> >>>>> +                           devid_kobj);
> >>>>> +
> >>>>> +    btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
> >>>>
> >>>> The device access is unprotected, you need at least RCU but that still
> >>>> does not prevent from the device being freed by deletion.
> >>>
> >>>    We need RCU let me fix. Device being deleted is fine, there
> >>>    is nothing to lose, another directory lookup will show that
> >>>    UUID/devinfo/<devid> is gone is the device is deleted.
> >>
> >> The device can be gone from the list but the sysfs files can still
> >> exist.
> >>
> >>      CPU1                                   CPU2
> >>
> >> btrfs_rm_device
> >>                                          open file
> >>    btrfs_sysfs_rm_device_link
> >>      btrfs_free_device
> >>        kfree(device)
> >>                                          call read, access freed device
> >>
> > 
> >    I completely missed the sysfs synchronization with device delete.
> >    As in the other email discussion, I think a new rwlock shall suffice.
> >    And as its lock is only between device delete and sysfs so it shall
> >    be light weight without affecting the other device_list_mutex holders.
> 
>   Looked into this further, actually we don't need any lock here
>   the device delete thread which calls kobject_put() makes sure
>   sysfs read is closed. So an existing sysfs read thread will have
>   to complete before device free.
> 
> 
>        CPU1                                   CPU2
> 
>   btrfs_rm_device
>                                            open file
>      btrfs_sysfs_rm_device_link
>                                            call read, access freed device
>        sysfs waits for the open file
>        to close.

How exactly does sysfs wait for the device? Is it eg wait_event checking
number of references? If the file stays open by an evil process is it
going to block the device removal indefinitelly?
David Sterba Dec. 13, 2019, 5:02 p.m. UTC | #10
On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
> >   Looked into this further, actually we don't need any lock here
> >   the device delete thread which calls kobject_put() makes sure
> >   sysfs read is closed. So an existing sysfs read thread will have
> >   to complete before device free.
> > 
> > 
> >        CPU1                                   CPU2
> > 
> >   btrfs_rm_device
> >                                            open file
> >      btrfs_sysfs_rm_device_link
> >                                            call read, access freed device
> >        sysfs waits for the open file
> >        to close.
> 
> How exactly does sysfs wait for the device? Is it eg wait_event checking
> number of references? If the file stays open by an evil process is it
> going to block the device removal indefinitelly?

Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
that way too.
Anand Jain Dec. 14, 2019, 12:26 a.m. UTC | #11
On 14/12/19 1:02 AM, David Sterba wrote:
> On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
>>>    Looked into this further, actually we don't need any lock here
>>>    the device delete thread which calls kobject_put() makes sure
>>>    sysfs read is closed. So an existing sysfs read thread will have
>>>    to complete before device free.
>>>
>>>
>>>         CPU1                                   CPU2
>>>
>>>    btrfs_rm_device
>>>                                             open file
>>>       btrfs_sysfs_rm_device_link
>>>                                             call read, access freed device
>>>         sysfs waits for the open file
>>>         to close.
>>
>> How exactly does sysfs wait for the device? Is it eg wait_event checking
>> number of references? If the file stays open by an evil process is it
>> going to block the device removal indefinitelly?
> 
> Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
> that way too.
> 

  And similar to umount, I don't think we should return EBUSY
  for btrfs_rm_device if the device sysfs attribute is opened,
  as sysfs show attributes are non blocking and would be completed
  in the timely manner.

regards, Anand
David Sterba Jan. 6, 2020, 4 p.m. UTC | #12
On Sat, Dec 14, 2019 at 08:26:24AM +0800, Anand Jain wrote:
> 
> 
> On 14/12/19 1:02 AM, David Sterba wrote:
> > On Fri, Dec 13, 2019 at 05:43:32PM +0100, David Sterba wrote:
> >>>    Looked into this further, actually we don't need any lock here
> >>>    the device delete thread which calls kobject_put() makes sure
> >>>    sysfs read is closed. So an existing sysfs read thread will have
> >>>    to complete before device free.
> >>>
> >>>
> >>>         CPU1                                   CPU2
> >>>
> >>>    btrfs_rm_device
> >>>                                             open file
> >>>       btrfs_sysfs_rm_device_link
> >>>                                             call read, access freed device
> >>>         sysfs waits for the open file
> >>>         to close.
> >>
> >> How exactly does sysfs wait for the device? Is it eg wait_event checking
> >> number of references? If the file stays open by an evil process is it
> >> going to block the device removal indefinitelly?
> > 
> > Yeah, sysfs waits until the file is closed. Eg. umount can be stalled
> > that way too.
> 
>   And similar to umount, I don't think we should return EBUSY
>   for btrfs_rm_device if the device sysfs attribute is opened,
>   as sysfs show attributes are non blocking and would be completed
>   in the timely manner.

While I don't think the blocking behaviour is totally OK, returning
EBUSY could be confusing without any other explanation. Also leaving
sysfs files open but not read is kind of strange on itself.

Patch
diff mbox series

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 834f712ed60c..99c7269e0524 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -978,29 +978,76 @@  int btrfs_sysfs_remove_devices_attr(struct btrfs_fs_devices *fs_devices,
 	if (!fs_devices->devices_kobj)
 		return -EINVAL;
 
-	if (one_device && one_device->bdev) {
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	if (one_device) {
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
-	}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 
-	if (one_device)
 		return 0;
+	}
 
-	list_for_each_entry(one_device,
-			&fs_devices->devices, dev_list) {
-		if (!one_device->bdev)
-			continue;
-		disk = one_device->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+	list_for_each_entry(one_device, &fs_devices->devices, dev_list) {
 
-		sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		if (one_device->bdev) {
+			disk = one_device->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
+			sysfs_remove_link(fs_devices->devices_kobj, disk_kobj->name);
+		}
+		kobject_del(&one_device->devid_kobj);
+		kobject_put(&one_device->devid_kobj);
 	}
 
 	return 0;
 }
 
+static void btrfs_dev_state_to_str(struct btrfs_device *device, char *dev_state_str, size_t n)
+{
+	int state;
+	const char *btrfs_dev_states[] = { "WRITEABLE", "IN_FS_METADATA",
+					   "MISSING", "REPLACE_TGT", "FLUSHING"
+					 };
+
+	n = n - strlen(dev_state_str) - 1;
+
+	for (state = 0; state < ARRAY_SIZE(btrfs_dev_states); state++) {
+		if (test_bit(state, &device->dev_state)) {
+			if (strlen(dev_state_str))
+				strncat(dev_state_str, " ", n);
+			strncat(dev_state_str, btrfs_dev_states[state], n);
+		}
+	}
+	strncat(dev_state_str, "\n", n);
+}
+
+static ssize_t btrfs_sysfs_dev_state_show(struct kobject *kobj,
+					  struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
+						   devid_kobj);
+
+	btrfs_dev_state_to_str(device, buf, PAGE_SIZE);
+	return strlen(buf);
+}
+
+BTRFS_ATTR(, dev_state, btrfs_sysfs_dev_state_show);
+
+static struct attribute *devinfo_attrs[] = {
+	BTRFS_ATTR_PTR(, dev_state),
+	NULL
+};
+
+ATTRIBUTE_GROUPS(devinfo);
+
+static struct kobj_type devinfo_ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = devinfo_groups,
+};
+
 int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 				 struct btrfs_device *one_device)
 {
@@ -1008,22 +1055,30 @@  int btrfs_sysfs_add_devices_attr(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *dev;
 
 	list_for_each_entry(dev, &fs_devices->devices, dev_list) {
-		struct hd_struct *disk;
-		struct kobject *disk_kobj;
-
-		if (!dev->bdev)
-			continue;
 
 		if (one_device && one_device != dev)
 			continue;
 
-		disk = dev->bdev->bd_part;
-		disk_kobj = &part_to_dev(disk)->kobj;
+		if (dev->bdev) {
+			struct hd_struct *disk;
+			struct kobject *disk_kobj;
+
+			disk = dev->bdev->bd_part;
+			disk_kobj = &part_to_dev(disk)->kobj;
 
-		error = sysfs_create_link(fs_devices->devices_kobj,
-					  disk_kobj, disk_kobj->name);
-		if (error)
+			error = sysfs_create_link(fs_devices->devices_kobj,
+						  disk_kobj, disk_kobj->name);
+			if (error)
+				break;
+		}
+
+		error = kobject_init_and_add(&dev->devid_kobj, &devinfo_ktype,
+					     fs_devices->devinfo_kobj, "%llu",
+					     dev->devid);
+		if (error) {
+			kobject_put(&dev->devid_kobj);
 			break;
+		}
 	}
 
 	return error;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 38f2e8437b68..68021d1ee216 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,8 @@  struct btrfs_device {
 	atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
 
 	struct extent_io_tree alloc_state;
+
+	struct kobject devid_kobj;
 };
 
 /*