diff mbox series

[RFC,v5,1/9] xfs: set t_task at wait time instead of alloc time

Message ID 20200227134321.7238-2-bfoster@redhat.com (mailing list archive)
State Deferred, archived
Headers show
Series xfs: automatic relogging experiment | expand

Commit Message

Brian Foster Feb. 27, 2020, 1:43 p.m. UTC
The xlog_ticket structure contains a task reference to support
blocking for available log reservation. This reference is assigned
at ticket allocation time, which assumes that the transaction
allocator will acquire reservation in the same context. This is
normally true, but will not always be the case with automatic
relogging.

There is otherwise no fundamental reason log space cannot be
reserved for a ticket from a context different from the allocating
context. Move the task assignment to the log reservation blocking
code where it is used.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Allison Henderson Feb. 27, 2020, 8:48 p.m. UTC | #1
On 2/27/20 6:43 AM, Brian Foster wrote:
> The xlog_ticket structure contains a task reference to support
> blocking for available log reservation. This reference is assigned
> at ticket allocation time, which assumes that the transaction
> allocator will acquire reservation in the same context. This is
> normally true, but will not always be the case with automatic
> relogging.
> 
> There is otherwise no fundamental reason log space cannot be
> reserved for a ticket from a context different from the allocating
> context. Move the task assignment to the log reservation blocking
> code where it is used.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Ok, looks ok to me
Reviewed-by: Allison Collins <allison.henderson@oracle.com>

> ---
>   fs/xfs/xfs_log.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f6006d94a581..df60942a9804 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -262,6 +262,7 @@ xlog_grant_head_wait(
>   	int			need_bytes) __releases(&head->lock)
>   					    __acquires(&head->lock)
>   {
> +	tic->t_task = current;
>   	list_add_tail(&tic->t_queue, &head->waiters);
>   
>   	do {
> @@ -3601,7 +3602,6 @@ xlog_ticket_alloc(
>   	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);
>   
>   	atomic_set(&tic->t_ref, 1);
> -	tic->t_task		= current;
>   	INIT_LIST_HEAD(&tic->t_queue);
>   	tic->t_unit_res		= unit_res;
>   	tic->t_curr_res		= unit_res;
>
Darrick J. Wong Feb. 27, 2020, 11:28 p.m. UTC | #2
On Thu, Feb 27, 2020 at 08:43:13AM -0500, Brian Foster wrote:
> The xlog_ticket structure contains a task reference to support
> blocking for available log reservation. This reference is assigned
> at ticket allocation time, which assumes that the transaction
> allocator will acquire reservation in the same context. This is
> normally true, but will not always be the case with automatic
> relogging.
> 
> There is otherwise no fundamental reason log space cannot be
> reserved for a ticket from a context different from the allocating
> context. Move the task assignment to the log reservation blocking
> code where it is used.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_log.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f6006d94a581..df60942a9804 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -262,6 +262,7 @@ xlog_grant_head_wait(
>  	int			need_bytes) __releases(&head->lock)
>  					    __acquires(&head->lock)
>  {
> +	tic->t_task = current;
>  	list_add_tail(&tic->t_queue, &head->waiters);
>  
>  	do {
> @@ -3601,7 +3602,6 @@ xlog_ticket_alloc(
>  	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);
>  
>  	atomic_set(&tic->t_ref, 1);
> -	tic->t_task		= current;

Hm.  So this leaves t_task set to NULL in the ticket constructor in
favor of setting it in xlog_grant_head_wait.  I guess this implies that
some future piece will be able to transfer a ticket to another process
as part of a regrant or something?

I've been wondering lately if you could transfer a dirty permanent
transaction to a different task so that the front end could return to
userspace as soon as the first transaction (with the intent items)
commits, and then you could reduce the latency of front-end system
calls.  That's probably a huge fantasy since you'd also have to transfer
a whole ton of state to that worker and whatever you locked to do the
operation remains locked...

--D

>  	INIT_LIST_HEAD(&tic->t_queue);
>  	tic->t_unit_res		= unit_res;
>  	tic->t_curr_res		= unit_res;
> -- 
> 2.21.1
>
Dave Chinner Feb. 28, 2020, 12:10 a.m. UTC | #3
On Thu, Feb 27, 2020 at 03:28:53PM -0800, Darrick J. Wong wrote:
> On Thu, Feb 27, 2020 at 08:43:13AM -0500, Brian Foster wrote:
> > The xlog_ticket structure contains a task reference to support
> > blocking for available log reservation. This reference is assigned
> > at ticket allocation time, which assumes that the transaction
> > allocator will acquire reservation in the same context. This is
> > normally true, but will not always be the case with automatic
> > relogging.
> > 
> > There is otherwise no fundamental reason log space cannot be
> > reserved for a ticket from a context different from the allocating
> > context. Move the task assignment to the log reservation blocking
> > code where it is used.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_log.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index f6006d94a581..df60942a9804 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -262,6 +262,7 @@ xlog_grant_head_wait(
> >  	int			need_bytes) __releases(&head->lock)
> >  					    __acquires(&head->lock)
> >  {
> > +	tic->t_task = current;
> >  	list_add_tail(&tic->t_queue, &head->waiters);
> >  
> >  	do {
> > @@ -3601,7 +3602,6 @@ xlog_ticket_alloc(
> >  	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);
> >  
> >  	atomic_set(&tic->t_ref, 1);
> > -	tic->t_task		= current;
> 
> Hm.  So this leaves t_task set to NULL in the ticket constructor in
> favor of setting it in xlog_grant_head_wait.  I guess this implies that
> some future piece will be able to transfer a ticket to another process
> as part of a regrant or something?
> 
> I've been wondering lately if you could transfer a dirty permanent
> transaction to a different task so that the front end could return to
> userspace as soon as the first transaction (with the intent items)
> commits, and then you could reduce the latency of front-end system
> calls.  That's probably a huge fantasy since you'd also have to transfer
> a whole ton of state to that worker and whatever you locked to do the
> operation remains locked...

Yup, that's basically the idea I've raised in the past for "async
XFS" where the front end is completely detached from the back end
that does the internal work. i.e deferred ops are the basis for
turning XFS into a huge async processing machine.

This isn't a new idea - tux3 was based around this "async back end"
concept, too.

Cheers,

Dave.
Brian Foster Feb. 28, 2020, 1:46 p.m. UTC | #4
On Fri, Feb 28, 2020 at 11:10:00AM +1100, Dave Chinner wrote:
> On Thu, Feb 27, 2020 at 03:28:53PM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 27, 2020 at 08:43:13AM -0500, Brian Foster wrote:
> > > The xlog_ticket structure contains a task reference to support
> > > blocking for available log reservation. This reference is assigned
> > > at ticket allocation time, which assumes that the transaction
> > > allocator will acquire reservation in the same context. This is
> > > normally true, but will not always be the case with automatic
> > > relogging.
> > > 
> > > There is otherwise no fundamental reason log space cannot be
> > > reserved for a ticket from a context different from the allocating
> > > context. Move the task assignment to the log reservation blocking
> > > code where it is used.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index f6006d94a581..df60942a9804 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -262,6 +262,7 @@ xlog_grant_head_wait(
> > >  	int			need_bytes) __releases(&head->lock)
> > >  					    __acquires(&head->lock)
> > >  {
> > > +	tic->t_task = current;
> > >  	list_add_tail(&tic->t_queue, &head->waiters);
> > >  
> > >  	do {
> > > @@ -3601,7 +3602,6 @@ xlog_ticket_alloc(
> > >  	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);
> > >  
> > >  	atomic_set(&tic->t_ref, 1);
> > > -	tic->t_task		= current;
> > 
> > Hm.  So this leaves t_task set to NULL in the ticket constructor in
> > favor of setting it in xlog_grant_head_wait.  I guess this implies that
> > some future piece will be able to transfer a ticket to another process
> > as part of a regrant or something?
> > 

Pretty much.. it's mostly just breaking the assumption that the task
that allocates a log ticket is necessarily the one that acquires log
reservation (or regrants it). The purpose of this change is so that any
particular task could allocate (and reserve) a relog ticket and donate
it to the relog mechanism (a separate task) for use (i.e. to roll it).

> > I've been wondering lately if you could transfer a dirty permanent
> > transaction to a different task so that the front end could return to
> > userspace as soon as the first transaction (with the intent items)
> > commits, and then you could reduce the latency of front-end system
> > calls.  That's probably a huge fantasy since you'd also have to transfer
> > a whole ton of state to that worker and whatever you locked to do the
> > operation remains locked...
> 
> Yup, that's basically the idea I've raised in the past for "async
> XFS" where the front end is completely detached from the back end
> that does the internal work. i.e deferred ops are the basis for
> turning XFS into a huge async processing machine.
> 

I think we've discussed this in the past, though I'm not clear on
whether it rely on this sort of change. Either way, there's a big
difference in scope between the tweak made by this patch and the design
of a generic async XFS front-end. :)

Brian

> This isn't a new idea - tux3 was based around this "async back end"
> concept, too.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index f6006d94a581..df60942a9804 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -262,6 +262,7 @@  xlog_grant_head_wait(
 	int			need_bytes) __releases(&head->lock)
 					    __acquires(&head->lock)
 {
+	tic->t_task = current;
 	list_add_tail(&tic->t_queue, &head->waiters);
 
 	do {
@@ -3601,7 +3602,6 @@  xlog_ticket_alloc(
 	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);
 
 	atomic_set(&tic->t_ref, 1);
-	tic->t_task		= current;
 	INIT_LIST_HEAD(&tic->t_queue);
 	tic->t_unit_res		= unit_res;
 	tic->t_curr_res		= unit_res;