Message ID | 20230324181741.13908-2-michael.christie@oracle.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Use block pr_ops in LIO | expand |
On 3/24/23 11:17, Mike Christie wrote: > Add callouts for reading keys and reservations. This allows LIO to support > the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath > to optimize it's error handling so it can check if it's getting an error > because there's an existing reservation or if we need to retry different > paths. > > Note: This only initially adds the struct definitions in the kernel as I'm > not sure if we wanted to export the interface to userspace yet. read_keys > and read_reservation are exactly what dm-multipath and LIO need, but for a > userspace interface we may want something like SCSI's READ_FULL_STATUS and > NVMe's report reservation commands. Those are overkill for dm/LIO and > READ_FULL_STATUS is sometimes broken for SCSI devices. Reviewed-by: Bart Van Assche <bvanassche@acm.org>
On Fri, Mar 24 2023 at 2:17P -0400, Mike Christie <michael.christie@oracle.com> wrote: > Add callouts for reading keys and reservations. This allows LIO to support > the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath > to optimize it's error handling so it can check if it's getting an error > because there's an existing reservation or if we need to retry different > paths. Not seeing anything in DM's path selectors that adds these optimizations. Is there accompanying multipath-tools callouts to get at this data to optimize path selection (via DM table reloads)? Mike
On 3/28/23 11:36 AM, Mike Snitzer wrote: > On Fri, Mar 24 2023 at 2:17P -0400, > Mike Christie <michael.christie@oracle.com> wrote: > >> Add callouts for reading keys and reservations. This allows LIO to support >> the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath >> to optimize it's error handling so it can check if it's getting an error >> because there's an existing reservation or if we need to retry different >> paths. > > Not seeing anything in DM's path selectors that adds these > optimizations. Is there accompanying multipath-tools callouts to get > at this data to optimize path selection (via DM table reloads)? > You can ignore that comment. The comment was meant for the dm pr callouts and not for normal IO/path handling, and now I think I can fix in a different way. I originally added the comment about better dm error handling for something like __dm_pr_register getting a failure when it did: ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags); Right now, we fail the entire operation if just one call on one path fails. With the pr_read_keys/reservation callouts we could check if we got a failure because there was an existing reservation vs a retryable/ignorable error. I forgot I wrote that comment about dm in the mail and we actually don't need the pr_read_* callouts for that type of thing now, because I ended up changing the existing callouts to return common error codes last kernel. So I have another patchset that I'm working on, but am still debating about some issues like: ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags); switch (ret) { case PR_STS_RESERVATION_CONFLICT: pr->ret = ret; return -1; case PR_STS_RETRY_PATH_FAILURE: /* * We probably want to retry like how we do for the pg_init. */ .... case PR_STS_PATH_FAILED: case PR_STS_PATH_FAST_FAILED: /* * I'm still not sure what to do here, because if this is the last * host then we might want to try and register the rest of the paths * and limp on. It probably needs a user config option. */ ....
diff --git a/include/linux/pr.h b/include/linux/pr.h index 94ceec713afe..3003daec28a5 100644 --- a/include/linux/pr.h +++ b/include/linux/pr.h @@ -4,6 +4,18 @@ #include <uapi/linux/pr.h> +struct pr_keys { + u32 generation; + u32 num_keys; + u64 keys[]; +}; + +struct pr_held_reservation { + u64 key; + u32 generation; + enum pr_type type; +}; + struct pr_ops { int (*pr_register)(struct block_device *bdev, u64 old_key, u64 new_key, u32 flags); @@ -14,6 +26,19 @@ struct pr_ops { int (*pr_preempt)(struct block_device *bdev, u64 old_key, u64 new_key, enum pr_type type, bool abort); int (*pr_clear)(struct block_device *bdev, u64 key); + /* + * pr_read_keys - Read the registered keys and return them in the + * pr_keys->keys array. The keys array will have been allocated at the + * end of the pr_keys struct, and pr_keys->num_keys must be set to the + * number of keys the array can hold. If there are more than can fit + * in the array, success will still be returned and pr_keys->num_keys + * will reflect the total number of keys the device contains, so the + * caller can retry with a larger array. + */ + int (*pr_read_keys)(struct block_device *bdev, + struct pr_keys *keys_info); + int (*pr_read_reservation)(struct block_device *bdev, + struct pr_held_reservation *rsv); }; #endif /* LINUX_PR_H */