[1/3] btrfs: add readmirror type framework

Message ID 1576818365-20286-2-git-send-email-anand.jain@oracle.com (mailing list archive)
State New, archived
Series readmirror feature (sysfs and in-memory only approach)

Commit Message

Anand Jain Dec. 20, 2019, 5:06 a.m. UTC
As of now we use the %pid method to read striped mirrored data, so the
application's process id determines the stripe id to be read. This type
of routing typically helps in a system with many small independent
applications trying to read random data. On the other hand, the %pid
based read IO distribution policy is inefficient if there is a single
application trying to read large data, as the overall disk bandwidth
remains underutilized.

So this patch introduces a framework where we could add more readmirror
policies, such as routing the IO based on the device's waitqueue, a manual
policy for when we have a read-preferred device, or a policy based on the
target storage caching.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c | 16 +++++++++++++++-
 fs/btrfs/volumes.h |  8 ++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)
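
For illustration only, here is a minimal userspace sketch of the %pid
selection described in the commit message. It is not btrfs code: the
helper names pick_mirror_by_pid() and tally_reads() are hypothetical,
and only the expression "first + pid % num_stripes" is taken from the
patch. It shows why a single process always lands on the same mirror.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the by-pid choice in find_live_mirror():
 * the mirror index depends only on the reader's pid.
 */
static int pick_mirror_by_pid(int first, int pid, int num_stripes)
{
	assert(num_stripes > 0);
	return first + pid % num_stripes;
}

/*
 * Count how many reads each mirror receives for a list of reader pids.
 * hits[] must have num_stripes zero-initialized entries.
 */
static void tally_reads(const int *pids, int nr_reads, int num_stripes,
			unsigned int *hits)
{
	for (int i = 0; i < nr_reads; i++)
		hits[pick_mirror_by_pid(0, pids[i], num_stripes)]++;
}
```

With two mirrors, four reads issued by the same pid (say 4242) all go
to mirror 0 and mirror 1 stays idle, whereas readers with consecutive
pids alternate between the two mirrors.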

Comments

Josef Bacik Dec. 20, 2019, 2:44 p.m. UTC | #1
On 12/20/19 12:06 AM, Anand Jain wrote:
> As of now we use the %pid method to read striped mirrored data, so the
> application's process id determines the stripe id to be read. This type
> of routing typically helps in a system with many small independent
> applications trying to read random data. On the other hand, the %pid
> based read IO distribution policy is inefficient if there is a single
> application trying to read large data, as the overall disk bandwidth
> remains underutilized.
> 
> So this patch introduces a framework where we could add more readmirror
> policies, such as routing the IO based on the device's waitqueue, a manual
> policy for when we have a read-preferred device, or a policy based on the
> target storage caching.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>   fs/btrfs/volumes.c | 16 +++++++++++++++-
>   fs/btrfs/volumes.h |  8 ++++++++
>   2 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index c95e47aa84f8..0c6caae29248 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1162,6 +1162,8 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
>   	fs_devices->opened = 1;
>   	fs_devices->latest_bdev = latest_dev->bdev;
>   	fs_devices->total_rw_bytes = 0;
> +	/* Set the default readmirror policy */
> +	atomic_set(&fs_devices->readmirror, BTRFS_READMIRROR_DEFAULT);
There's no reason for this to be atomic; it's just a behavior change. If you 
really want to be super safe, use READ_ONCE/WRITE_ONCE and have readmirror be 
your enum.  Thanks,

Josef
Anand Jain Jan. 2, 2020, 10:12 a.m. UTC | #2
On 12/20/19 10:44 PM, Josef Bacik wrote:
> On 12/20/19 12:06 AM, Anand Jain wrote:
>> As of now we use the %pid method to read striped mirrored data, so the
>> application's process id determines the stripe id to be read. This type
>> of routing typically helps in a system with many small independent
>> applications trying to read random data. On the other hand, the %pid
>> based read IO distribution policy is inefficient if there is a single
>> application trying to read large data, as the overall disk bandwidth
>> remains underutilized.
>>
>> So this patch introduces a framework where we could add more readmirror
>> policies, such as routing the IO based on the device's waitqueue, a manual
>> policy for when we have a read-preferred device, or a policy based on the
>> target storage caching.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/volumes.c | 16 +++++++++++++++-
>>   fs/btrfs/volumes.h |  8 ++++++++
>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index c95e47aa84f8..0c6caae29248 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1162,6 +1162,8 @@ static int open_fs_devices(struct 
>> btrfs_fs_devices *fs_devices,
>>       fs_devices->opened = 1;
>>       fs_devices->latest_bdev = latest_dev->bdev;
>>       fs_devices->total_rw_bytes = 0;
>> +    /* Set the default readmirror policy */
>> +    atomic_set(&fs_devices->readmirror, BTRFS_READMIRROR_DEFAULT);
> There's no reason for this to be atomic; it's just a behavior change. If 
> you really want to be super safe, use READ_ONCE/WRITE_ONCE and have 
> readmirror be your enum.  Thanks,

  Agreed, fs_devices::readmirror doesn't have to be atomic_t. Fixed this
  to declare it as u8 in v2.

Thanks, Anand


> Josef
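
As a rough illustration of the review suggestion above (a plain field
accessed with READ_ONCE/WRITE_ONCE instead of atomic_t), here is a
userspace sketch. It is not the v2 patch. The READ_ONCE/WRITE_ONCE
macros below are simplified stand-ins for the kernel ones and rely on
the GCC/Clang __typeof__ extension; struct fs_devices_sketch,
set_readmirror() and find_preferred_mirror() are hypothetical names.
Only the enum, the default define and the by-pid expression come from
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* readmirror_policy types, as in the patch */
#define BTRFS_READMIRROR_DEFAULT	BTRFS_READMIRROR_BY_PID
enum btrfs_readmirror_policy_type {
	BTRFS_READMIRROR_BY_PID,
};

/* Hypothetical cut-down stand-in for struct btrfs_fs_devices. */
struct fs_devices_sketch {
	uint8_t readmirror;	/* plain field, no atomic_t */
};

static void set_readmirror(struct fs_devices_sketch *fsd,
			   enum btrfs_readmirror_policy_type type)
{
	WRITE_ONCE(fsd->readmirror, (uint8_t)type);
}

static int find_preferred_mirror(struct fs_devices_sketch *fsd, int first,
				 int pid, int num_stripes)
{
	assert(num_stripes > 0);

	switch (READ_ONCE(fsd->readmirror)) {
	case BTRFS_READMIRROR_BY_PID:
	default:
		/* Unknown values fall back to by-pid; the patch also warns. */
		return first + pid % num_stripes;
	}
}
```

The policy is a single small value that is read once per lookup and
never modified with read-modify-write operations, which is why a plain
field with READ_ONCE/WRITE_ONCE is enough in this sketch.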
Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c95e47aa84f8..0c6caae29248 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1162,6 +1162,8 @@  static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->opened = 1;
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
+	/* Set the default readmirror policy */
+	atomic_set(&fs_devices->readmirror, BTRFS_READMIRROR_DEFAULT);
 out:
 	return ret;
 }
@@ -5300,7 +5302,19 @@  static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	else
 		num_stripes = map->num_stripes;
 
-	preferred_mirror = first + current->pid % num_stripes;
+	switch (atomic_read(&fs_info->fs_devices->readmirror)) {
+	case BTRFS_READMIRROR_BY_PID:
+		preferred_mirror = first + current->pid % num_stripes;
+		break;
+	default:
+		/*
+		 * Shouldn't happen, just warn and use default instead of failing.
+		 */
+		btrfs_warn_rl(fs_info,
+			      "unknown readmirror type %u, fallback to by_pid",
+			      atomic_read(&fs_info->fs_devices->readmirror));
+		preferred_mirror = first + current->pid % num_stripes;
+	}
 
 	if (dev_replace_is_ongoing &&
 	    fs_info->dev_replace.cont_reading_from_srcdev_mode ==
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 68021d1ee216..d9c4c4e1dbc2 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -209,6 +209,12 @@  struct btrfs_device {
 BTRFS_DEVICE_GETSET_FUNCS(disk_total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(bytes_used);
 
+/* readmirror_policy types */
+#define BTRFS_READMIRROR_DEFAULT	BTRFS_READMIRROR_BY_PID
+enum btrfs_readmirror_policy_type {
+	BTRFS_READMIRROR_BY_PID,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -260,6 +266,8 @@  struct btrfs_fs_devices {
 	struct kobject *devices_kobj;
 	struct kobject *devinfo_kobj;
 	struct completion kobj_unregister;
+
+	atomic_t readmirror;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64