diff mbox

[1/6] block: add support for carrying a stream ID in a bio

Message ID 1427210823-5283-2-git-send-email-axboe@fb.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jens Axboe March 24, 2015, 3:26 p.m. UTC
The top bits of bio->bi_flags are reserved for keeping the
allocation pool, set aside the next four bits for carrying
a stream ID. That leaves us with support for 15 streams,
0 is reserved as a "stream not set" value.

Add helpers for setting/getting stream ID of a bio.

Signed-off-by: Jens Axboe <axboe@fb.com>
---
 block/bio.c               |  2 ++
 block/blk-core.c          |  3 +++
 include/linux/blk_types.h | 28 +++++++++++++++++++++++++++-
 3 files changed, 32 insertions(+), 1 deletion(-)

Comments

Matias Bjørling March 24, 2015, 5:11 p.m. UTC | #1
On 03/24/2015 04:26 PM, Jens Axboe wrote:
> The top bits of bio->bi_flags are reserved for keeping the
> allocation pool, set aside the next four bits for carrying
> a stream ID. That leaves us with support for 15 streams,
> 0 is reserved as a "stream not set" value.

15 streams seem very limited. Can this be extended? e.g. 16 bits.

15 streams is enough for 1-4 applications. More, and applications starts 
to fight over the same stream id's, leading them to place different age 
data in same flash blocks and push us back to square one.

I understand that Samsung multi-stream SSD supports a limited amount of 
streams, more advance implementations should provide higher limits.

Thanks,
Matias
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe March 24, 2015, 5:26 p.m. UTC | #2
On 03/24/2015 11:11 AM, Matias Bjørling wrote:
> On 03/24/2015 04:26 PM, Jens Axboe wrote:
>> The top bits of bio->bi_flags are reserved for keeping the
>> allocation pool, set aside the next four bits for carrying
>> a stream ID. That leaves us with support for 15 streams,
>> 0 is reserved as a "stream not set" value.
>
> 15 streams seem very limited. Can this be extended? e.g. 16 bits.
>
> 15 streams is enough for 1-4 applications. More, and applications starts
> to fight over the same stream id's, leading them to place different age
> data in same flash blocks and push us back to square one.
>
> I understand that Samsung multi-stream SSD supports a limited amount of
> streams, more advance implementations should provide higher limits.

Pushing it higher is not a big deal as far as the implementation goes, 
though 16 bits might be stealing a bit too much space for this. On 
32-bit archs, we have 18 bits currently free that we can abuse. The 
Samsung device supports 16 streams. That's honestly a lot more than I 
would expect most devices to support in hardware, 16 is a lot of open 
erase blocks and write append points. Obviously the open channel effort 
would make that more feasible, though.
Ming Lin-SSI March 24, 2015, 10:07 p.m. UTC | #3
> -----Original Message-----
> From: Jens Axboe [mailto:axboe@kernel.dk]
> Sent: Tuesday, March 24, 2015 10:27 AM
> To: Matias Bjørling; Jens Axboe; linux-kernel@vger.kernel.org; linux-
> fsdevel@vger.kernel.org
> Cc: Ming Lin-SSI
> Subject: Re: [PATCH 1/6] block: add support for carrying a stream ID in a bio
> 
> On 03/24/2015 11:11 AM, Matias Bjørling wrote:
> > On 03/24/2015 04:26 PM, Jens Axboe wrote:
> >> The top bits of bio->bi_flags are reserved for keeping the allocation
> >> pool, set aside the next four bits for carrying a stream ID. That
> >> leaves us with support for 15 streams,
> >> 0 is reserved as a "stream not set" value.
> >
> > 15 streams seem very limited. Can this be extended? e.g. 16 bits.
> >
> > 15 streams is enough for 1-4 applications. More, and applications
> > starts to fight over the same stream id's, leading them to place
> > different age data in same flash blocks and push us back to square one.
> >
> > I understand that Samsung multi-stream SSD supports a limited amount
> > of streams, more advance implementations should provide higher limits.
> 
> Pushing it higher is not a big deal as far as the implementation goes, though
> 16 bits might be stealing a bit too much space for this. On 32-bit archs, we
> have 18 bits currently free that we can abuse. The Samsung device supports
> 16 streams. That's honestly a lot more than I would expect most devices to
> support in hardware, 16 is a lot of open erase blocks and write append points.
> Obviously the open channel effort would make that more feasible, though.

Can we use 8 bits at least? I'll test performance with 16 streams.

> 
> --
> Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe March 25, 2015, 1:42 a.m. UTC | #4
On 03/24/2015 04:07 PM, Ming Lin-SSI wrote:
>> -----Original Message-----
>> From: Jens Axboe [mailto:axboe@kernel.dk]
>> Sent: Tuesday, March 24, 2015 10:27 AM
>> To: Matias Bjørling; Jens Axboe; linux-kernel@vger.kernel.org; linux-
>> fsdevel@vger.kernel.org
>> Cc: Ming Lin-SSI
>> Subject: Re: [PATCH 1/6] block: add support for carrying a stream ID in a bio
>>
>> On 03/24/2015 11:11 AM, Matias Bjørling wrote:
>>> On 03/24/2015 04:26 PM, Jens Axboe wrote:
>>>> The top bits of bio->bi_flags are reserved for keeping the allocation
>>>> pool, set aside the next four bits for carrying a stream ID. That
>>>> leaves us with support for 15 streams,
>>>> 0 is reserved as a "stream not set" value.
>>>
>>> 15 streams seem very limited. Can this be extended? e.g. 16 bits.
>>>
>>> 15 streams is enough for 1-4 applications. More, and applications
>>> starts to fight over the same stream id's, leading them to place
>>> different age data in same flash blocks and push us back to square one.
>>>
>>> I understand that Samsung multi-stream SSD supports a limited amount
>>> of streams, more advance implementations should provide higher limits.
>>
>> Pushing it higher is not a big deal as far as the implementation goes, though
>> 16 bits might be stealing a bit too much space for this. On 32-bit archs, we
>> have 18 bits currently free that we can abuse. The Samsung device supports
>> 16 streams. That's honestly a lot more than I would expect most devices to
>> support in hardware, 16 is a lot of open erase blocks and write append points.
>> Obviously the open channel effort would make that more feasible, though.
>
> Can we use 8 bits at least? I'll test performance with 16 streams.

We could, but I still question whether that's really useful. I'd rather 
start smaller and go bigger if there's a real use case for it. It wont 
change the user space ABI if we later make it larger.
Dave Chinner March 25, 2015, 2:30 a.m. UTC | #5
On Tue, Mar 24, 2015 at 09:26:58AM -0600, Jens Axboe wrote:
> The top bits of bio->bi_flags are reserved for keeping the
> allocation pool, set aside the next four bits for carrying
> a stream ID. That leaves us with support for 15 streams,
> 0 is reserved as a "stream not set" value.
> 
> Add helpers for setting/getting stream ID of a bio.
....
> +/*
> + * after the pool bits, next 4 bits are for the stream id
> + */
> +#define BIO_STREAM_BITS		(4)
> +#define BIO_STREAM_OFFSET	(BITS_PER_LONG - 8)
> +#define BIO_STREAM_MASK		((1 << BIO_STREAM_BITS) - 1)
> +
> +static inline unsigned long streamid_to_flags(unsigned int id)
> +{
> +	return (unsigned long) (id & BIO_STREAM_MASK) << BIO_STREAM_OFFSET;
> +}
> +
> +static inline void bio_set_streamid(struct bio *bio, unsigned int id)
> +{
> +	bio->bi_flags |= streamid_to_flags(id);
> +}
> +
> +static inline unsigned int bio_get_streamid(struct bio *bio)
> +{
> +	return (bio->bi_flags >> BIO_STREAM_OFFSET) & BIO_STREAM_MASK;
> +}
> +
> +static inline bool bio_streamid_valid(struct bio *bio)
> +{
> +	return bio_get_streamid(bio) != 0;
> +}

Need to reserve at least one stream for filesystem private use (e.g.
metadata writeback). Potentially 2 streams - one for the journal
which is frequently overwritten, the other for all other long lived
persistent metadata.

Cheers,

Dave.
Matias Bjørling March 25, 2015, 8:11 a.m. UTC | #6
>> Pushing it higher is not a big deal as far as the implementation goes, though
>> 16 bits might be stealing a bit too much space for this. On 32-bit archs, we
>> have 18 bits currently free that we can abuse. The Samsung device supports
>> 16 streams. That's honestly a lot more than I would expect most devices to
>> support in hardware, 16 is a lot of open erase blocks and write append points.
>> Obviously the open channel effort would make that more feasible, though.
>
> Can we use 8 bits at least? I'll test performance with 16 streams.
>

Ming, can you provide an example of how streams will be managed for 
multiple applications? I can see how it would be efficient for a single 
application, but how will it be managed for multiple applications?

-Matias
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ming Lin-SSI March 25, 2015, 6:36 p.m. UTC | #7
> -----Original Message-----
> From: Matias Bjørling [mailto:m@bjorling.me]
> Sent: Wednesday, March 25, 2015 1:11 AM
> To: Ming Lin-SSI; Jens Axboe; linux-kernel@vger.kernel.org; linux-
> fsdevel@vger.kernel.org
> Subject: Re: [PATCH 1/6] block: add support for carrying a stream ID in a bio
> 
> >> Pushing it higher is not a big deal as far as the implementation
> >> goes, though
> >> 16 bits might be stealing a bit too much space for this. On 32-bit
> >> archs, we have 18 bits currently free that we can abuse. The Samsung
> >> device supports
> >> 16 streams. That's honestly a lot more than I would expect most
> >> devices to support in hardware, 16 is a lot of open erase blocks and write
> append points.
> >> Obviously the open channel effort would make that more feasible,
> though.
> >
> > Can we use 8 bits at least? I'll test performance with 16 streams.
> >
> 
> Ming, can you provide an example of how streams will be managed for
> multiple applications? I can see how it would be efficient for a single
> application, but how will it be managed for multiple applications?

Multiple applications will get different stream-id.
For example,

Application 1:
stream_id = open_stream(NVME_DEVICE_HANDLE,  ....); //maybe stream-id 1
fadvise(fd, stream_id, 0, POSIX_FADV_STREAMID); 
write(fd, buf, count);
close_stream(NVME_DEVICE_HANDLE, stream_id);

Application 2:
stream_id = open_stream(NVME_DEVICE_HANDLE,  ....); //maybe stream-id 2
fadvise(fd, stream_id, 0, POSIX_FADV_STREAMID); 
write(fd, buf, count);
close_stream(NVME_DEVICE_HANDLE, stream_id);

Thanks,
Ming

> 
> -Matias
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Monakhov April 12, 2015, 10:42 a.m. UTC | #8
Dave Chinner <david@fromorbit.com> writes:

> On Tue, Mar 24, 2015 at 09:26:58AM -0600, Jens Axboe wrote:
>> The top bits of bio->bi_flags are reserved for keeping the
>> allocation pool, set aside the next four bits for carrying
>> a stream ID. That leaves us with support for 15 streams,
>> 0 is reserved as a "stream not set" value.
>> 
>> Add helpers for setting/getting stream ID of a bio.
> ....
>> +/*
>> + * after the pool bits, next 4 bits are for the stream id
>> + */
>> +#define BIO_STREAM_BITS		(4)
>> +#define BIO_STREAM_OFFSET	(BITS_PER_LONG - 8)
>> +#define BIO_STREAM_MASK		((1 << BIO_STREAM_BITS) - 1)
>> +
>> +static inline unsigned long streamid_to_flags(unsigned int id)
>> +{
>> +	return (unsigned long) (id & BIO_STREAM_MASK) << BIO_STREAM_OFFSET;
>> +}
>> +
>> +static inline void bio_set_streamid(struct bio *bio, unsigned int id)
>> +{
>> +	bio->bi_flags |= streamid_to_flags(id);
>> +}
>> +
>> +static inline unsigned int bio_get_streamid(struct bio *bio)
>> +{
>> +	return (bio->bi_flags >> BIO_STREAM_OFFSET) & BIO_STREAM_MASK;
>> +}
>> +
>> +static inline bool bio_streamid_valid(struct bio *bio)
>> +{
>> +	return bio_get_streamid(bio) != 0;
>> +}
>
> Need to reserve at least one stream for filesystem private use (e.g.
> metadata writeback). Potentially 2 streams - one for the journal
> which is frequently overwritten, the other for all other long lived
> persistent metadata.
Definitely. User may set it only if it has CAP_RESOURCES.
This is fun, but we act as a soviet nomenclature who try to monopolize
food distribution system :)
>
> Cheers,
>
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/block/bio.c b/block/bio.c
index f66a4eae16ee..1cd3d745047c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -567,6 +567,7 @@  void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
 	bio->bi_rw = bio_src->bi_rw;
 	bio->bi_iter = bio_src->bi_iter;
 	bio->bi_io_vec = bio_src->bi_io_vec;
+	bio_set_streamid(bio, bio_get_streamid(bio_src));
 }
 EXPORT_SYMBOL(__bio_clone_fast);
 
@@ -672,6 +673,7 @@  integrity_clone:
 		}
 	}
 
+	bio_set_streamid(bio, bio_get_streamid(bio_src));
 	return bio;
 }
 EXPORT_SYMBOL(bio_clone_bioset);
diff --git a/block/blk-core.c b/block/blk-core.c
index 794c3e7f01cf..6b7b8c95c4c3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1928,6 +1928,9 @@  void generic_make_request(struct bio *bio)
 	do {
 		struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
+		if (bio_streamid_valid(bio))
+			blk_add_trace_msg(q, "StreamID=%u", bio_get_streamid(bio));
+
 		q->make_request_fn(q, bio);
 
 		bio = bio_list_pop(current->bio_list);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index c294e3e25e37..510d1e49ba7d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -138,9 +138,35 @@  struct bio {
 #define BIO_POOL_BITS		(4)
 #define BIO_POOL_NONE		((1UL << BIO_POOL_BITS) - 1)
 #define BIO_POOL_OFFSET		(BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK		(1UL << BIO_POOL_OFFSET)
 #define BIO_POOL_IDX(bio)	((bio)->bi_flags >> BIO_POOL_OFFSET)
 
+/*
+ * after the pool bits, next 4 bits are for the stream id
+ */
+#define BIO_STREAM_BITS		(4)
+#define BIO_STREAM_OFFSET	(BITS_PER_LONG - 8)
+#define BIO_STREAM_MASK		((1 << BIO_STREAM_BITS) - 1)
+
+static inline unsigned long streamid_to_flags(unsigned int id)
+{
+	return (unsigned long) (id & BIO_STREAM_MASK) << BIO_STREAM_OFFSET;
+}
+
+static inline void bio_set_streamid(struct bio *bio, unsigned int id)
+{
+	bio->bi_flags |= streamid_to_flags(id);
+}
+
+static inline unsigned int bio_get_streamid(struct bio *bio)
+{
+	return (bio->bi_flags >> BIO_STREAM_OFFSET) & BIO_STREAM_MASK;
+}
+
+static inline bool bio_streamid_valid(struct bio *bio)
+{
+	return bio_get_streamid(bio) != 0;
+}
+
 #endif /* CONFIG_BLOCK */
 
 /*