diff mbox series

[v2,2/2] git-gui: revert untracked files by deleting them

Message ID 9469beb59937f87647190cf7f56544b8c27e20b6.1573110335.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series git-gui: revert untracked files by deleting them | expand

Commit Message

John Passaro via GitGitGadget Nov. 7, 2019, 7:05 a.m. UTC
From: Jonathan Gilbert <JonathanG@iQmetrix.com>

Update the revert_helper procedure to also detect untracked files. If
files are present, the user is asked if they want them deleted. Perform
the deletion in batches, using new proc delete_files with helper
delete_helper, to allow the UI to remain responsive. Coordinate the
completion of multiple overlapping asynchronous operations using a new
construct called a "chord". Migrate unlocking of the index out of
_close_updateindex to a responsibility of the caller, to permit paths
that don't directly unlock the index.

Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
---
 lib/chord.tcl | 137 ++++++++++++++++++++++++
 lib/index.tcl | 288 +++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 385 insertions(+), 40 deletions(-)
 create mode 100644 lib/chord.tcl

Comments

Pratyush Yadav Nov. 11, 2019, 7:25 p.m. UTC | #1
Hi Jonathan,

Thanks for the re-roll. Some comments below. Apart from those comments, 
this looks close to good enough for merging :)

On 07/11/19 07:05AM, Jonathan Gilbert via GitGitGadget wrote:
> From: Jonathan Gilbert <JonathanG@iQmetrix.com>
> 
> Update the revert_helper procedure to also detect untracked files. If
> files are present, the user is asked if they want them deleted. Perform
> the deletion in batches, using new proc delete_files with helper
> delete_helper, to allow the UI to remain responsive. Coordinate the
> completion of multiple overlapping asynchronous operations using a new
> construct called a "chord". Migrate unlocking of the index out of
> _close_updateindex to a responsibility of the caller, to permit paths
> that don't directly unlock the index.
> 
> Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
> ---
>  lib/chord.tcl | 137 ++++++++++++++++++++++++
>  lib/index.tcl | 288 +++++++++++++++++++++++++++++++++++++++++++-------
>  2 files changed, 385 insertions(+), 40 deletions(-)
>  create mode 100644 lib/chord.tcl
> 
> diff --git a/lib/chord.tcl b/lib/chord.tcl
> new file mode 100644
> index 0000000000..2d13af14fc
> --- /dev/null
> +++ b/lib/chord.tcl
> @@ -0,0 +1,137 @@

The 'class' documentation [0] suggests adding a "package require TclOO". 
But TclOO ships by default with Tcl 8.6 and above. So, I'm not really 
sure if we need this.

> +# SimpleChord class:
> +#   Represents a procedure that conceptually has multiple entrypoints that must
> +#   all be called before the procedure executes. Each entrypoint is called a
> +#   "note". The chord is only "completed" when all the notes are "activated".
> +#
> +#   Constructor:
> +#     set chord [SimpleChord new {body}]
> +#       Creates a new chord object with the specified body script. The body
> +#       script is evaluated at most once, when a note is activated and the
> +#       chord has no other non-activated notes.
> +#
> +#   Methods:
> +#     $chord eval {script}
> +#       Runs the specified script in the same context (namespace) in which the
> +#       chord body will be evaluated. This can be used to set variable values
> +#       for the chord body to use.
> +#
> +#     set note [$chord add_note]
> +#       Adds a new note to the chord, an instance of ChordNote. Raises an
> +#       error if the chord is already completed, otherwise the chord is updated
> +#       so that the new note must also be activated before the body is
> +#       evaluated.
> +#
> +#     $chord notify_note_activation
> +#       For internal use only.
> +#
> +# ChordNote class:
> +#   Represents a note within a chord, providing a way to activate it. When the
> +#   final note of the chord is activated (this can be any note in the chord,
> +#   with all other notes already previously activated in any order), the chord's
> +#   body is evaluated.
> +#
> +#   Constructor:
> +#     Instances of ChordNote are created internally by calling add_note on
> +#     SimpleChord objects.
> +#
> +#   Methods:
> +#     [$note is_activated]
> +#       Returns true if this note has already been activated.
> +#
> +#     $note
> +#       Activates the note, if it has not already been activated, and completes
> +#       the chord if there are no other notes awaiting activation. Subsequent
> +#       calls will have no further effect.

Nice to see some good documentation!

One nitpick: would it make more sense to have the documentation for a 
method/constructor just above that method/constructor? This way, when 
someone updates the code some time later, they'll also hopefully 
remember to update the documentation. It is much more likely to be stale 
if all of it just stays on the top.

> +#
> +# Example:
> +#
> +#   # Turn off the UI while running a couple of async operations.
> +#   lock_ui
> +#
> +#   set chord [SimpleChord new {
> +#     unlock_ui
> +#     # Note: $notice here is not referenced in the calling scope
> +#     if {$notice} { info_popup $notice }
> +#   }
> +#
> +#   # Configure a note to keep the chord from completing until
> +#   # all operations have been initiated.
> +#   set common_note [$chord add_note]
> +#
> +#   # Pass notes as 'after' callbacks to other operations
> +#   async_operation $args [$chord add_note]
> +#   other_async_operation $args [$chord add_note]
> +#
> +#   # Communicate with the chord body
> +#   if {$condition} {
> +#     # This sets $notice in the same context that the chord body runs in.
> +#     $chord eval { set notice "Something interesting" }
> +#   }
> +#
> +#   # Activate the common note, making the chord eligible to complete
> +#   $common_note
> +#
> +# At this point, the chord will complete at some unknown point in the future.
> +# The common note might have been the first note activated, or the async
> +# operations might have completed synchronously and the common note is the
> +# last one, completing the chord before this code finishes, or anything in
> +# between. The purpose of the chord is to not have to worry about the order.
> +
> +oo::class create SimpleChord {

This comes from the TclOO package, right?

git-gui has its own object-oriented system (lib/class.tcl). It was 
written circa 2007. I suspect something like TclOO did not exist back 
then.

Why not use that? Does it have some limitations that TclOO does not 
have? I do not mind using the "official" OO system. I just want to know 
why exactly you made the choice.

We would end up mixing the two implementations/flavors in the same 
codebase, but as long as they don't interfere with each other and are 
cross compatible (which I think they are, but I haven't tested), I don't 
mind some "modernization" of the codebase. 

More importantly, TclOO ships as part of the core distribution with Tcl 
8.6, but as of now the minimum version required for git-gui is 8.4. So, 
I think we should bump the minimum version (8.6 released circa 2012, so 
most people should have caught up by now I hope).

> +	variable Notes
> +	variable Body
> +	variable IsCompleted

Nitpick: Please use snake_case, here and in other places.

> +
> +	constructor {body} {
> +		set Notes [list]
> +		set Body $body
> +		set IsCompleted 0
> +	}
> +
> +	method eval {script} {
> +		namespace eval [self] $script
> +	}
> +
> +	method add_note {} {
> +		if {$IsCompleted} { error "Cannot add a note to a completed chord" }
> +
> +		set note [ChordNote new [self]]
> +
> +		lappend Notes $note
> +
> +		return $note
> +	}
> +
> +	method notify_note_activation {} {

Since this method is for internal use only, can it be made "private"? 
Does the OO library support something like this?

> +		if {!$IsCompleted} {
> +			foreach note $Notes {
> +				if {![$note is_activated]} { return }
> +			}
> +
> +			set IsCompleted 1
> +
> +			namespace eval [self] $Body
> +			namespace delete [self]
> +		}
> +	}
> +}
> +
> +oo::class create ChordNote {
> +	variable Chord IsActivated
> +
> +	constructor {chord} {
> +		set Chord $chord
> +		set IsActivated 0
> +	}
> +
> +	method is_activated {} {
> +		return $IsActivated
> +	}
> +
> +	method unknown {} {

I'm a bit lost here. This method is named 'unknown', but searching for 
'unknown' in this patch just gives me two results: this line here, and 
then one in a comment at the start of the file.

From what I understand looking at the code, it some sort of a "default" 
method, and is called when you run just `$chord_note`. How exactly is 
this method designated to be the default?

Also, "unknown" makes little sense in this context. Can you rename it to 
something more meaningful? Maybe something like "activate_note"?

> +		if {!$IsActivated} {
> +			set IsActivated 1
> +			$Chord notify_note_activation
> +		}
> +	}
> +}

From what I understand, the "Note" object is effectively used as a 
count. There is no other state associated with it. When I first heard of 
your description of this abstraction, I assumed that a Note would also 
store a script to execute with it. So, when you "activate" a note, it 
would first execute the script, and then mark itself as "activated", and 
notify the chord. Would that abstraction make more sense?

I don't really mind keeping it this way, but I wonder if that design 
would make the abstraction easier to wrap your head around.

> diff --git a/lib/index.tcl b/lib/index.tcl
> index 28d4d2a54e..64046d6833 100644
> --- a/lib/index.tcl
> +++ b/lib/index.tcl
> @@ -7,7 +7,7 @@ proc _delete_indexlock {} {
>  	}
>  }
>  
> -proc _close_updateindex {fd after} {
> +proc _close_updateindex {fd} {
>  	global use_ttk NS
>  	fconfigure $fd -blocking 1
>  	if {[catch {close $fd} err]} {
> @@ -52,8 +52,6 @@ proc _close_updateindex {fd after} {
>  	}
>  
>  	$::main_status stop
> -	unlock_index
> -	uplevel #0 $after

There is a call to unlock_index in the body of the if statement above 
too. Do we want to remove that too, or should it be left alone?

But immediately after the unlocking of the index there, a call to 
`rescan` is made. `rescan` acquired the lock, so it would fail if we do 
not unlock the index there. Note that `rescan` itself is asynchronous. 

Since every call to `_close_updateindex` is followed by an index unlock, 
it would mean the index would be unlocked for the rescan while it is in 
progress (for all calls other than the one from `write_checkout_index`). 
What a mess!

That codepath seems to be taken when a major error happens, and we just 
resign to our fate and get a fresh start by doing a rescan and syncing 
the repo state. So it is quite likely whatever operation we were doing 
failed spectacularly.

Maybe the answer is to swallow the bitter pill and introduce a 
switch/boolean in `_close_updateindex` that controls whether the index 
is unlocked or not. We unlock it when the if statement is not taken, and 
keep the current codepath when it is. I call it a "bitter pill" because 
I'm usually not a huge fan of adding knobs like that in functions. Makes 
the function harder to reason about and makes it more bug prone.

If you can think of a better/cleaner way of working around this, 
suggestions are welcome!

>  }
>  
>  proc update_indexinfo {msg path_list after} {
> @@ -90,7 +88,9 @@ proc write_update_indexinfo {fd path_list total_cnt batch after} {
>  	global file_states current_diff_path
>  
>  	if {$update_index_cp >= $total_cnt} {
> -		_close_updateindex $fd $after
> +		_close_updateindex $fd
> +		unlock_index
> +		uplevel #0 $after
>  		return
>  	}
>  
> @@ -156,7 +156,9 @@ proc write_update_index {fd path_list total_cnt batch after} {
>  	global file_states current_diff_path
>  
>  	if {$update_index_cp >= $total_cnt} {
> -		_close_updateindex $fd $after
> +		_close_updateindex $fd
> +		unlock_index
> +		uplevel #0 $after
>  		return
>  	}
>  
> @@ -233,7 +235,8 @@ proc write_checkout_index {fd path_list total_cnt batch after} {
>  	global file_states current_diff_path
>  
>  	if {$update_index_cp >= $total_cnt} {
> -		_close_updateindex $fd $after
> +		_close_updateindex $fd $do_unlock_index $after
> +		uplevel #0 $after

_close_updateindex takes only one argument, and you pass it 3. 
$do_unlock_index does not seem to be defined anywhere. $after is 
evaluated just after this line, and _close_updateindex doesn't accept 
the argument anyway. I suspect this is a leftover from a different 
approach you tried before this one.

Also, unlike all the other places where _close_updateindex is used, this 
one does not make a call to unlock_index. Is that intended? IIUC, it 
should be intended, since this is the part which uses the "chord", but a 
confirmation would be nice.

>  		return
>  	}
>  
> @@ -393,61 +396,266 @@ proc revert_helper {txt paths} {
>  
>  	if {![lock_index begin-update]} return
>  
> +	# Common "after" functionality that waits until multiple asynchronous
> +	# operations are complete (by waiting for them to activate their notes
> +	# on the chord).

Nitpick: mention what the "multiple asynchronous operations" are exactly 
(i.e, they are the deletion and index checkout operations).

> +	set after_chord [SimpleChord new {
> +		unlock_index
> +		if {$should_reshow_diff} { reshow_diff }
> +		ui_ready
> +	}]
> +
> +	$after_chord eval { set should_reshow_diff 0 }
> +
> +	# We don't know how many notes we're going to create (it's dynamic based
> +	# on conditional paths below), so create a common note that will delay
> +	# the chord's completion until we activate it, and then activate it
> +	# after all the other notes have been created.
> +	set after_common_note [$after_chord add_note]
> +
>  	set path_list [list]
> +	set untracked_list [list]
>  	set after {}

'after' seems to be an unused variable. This line can be deleted.

>  	foreach path $paths {
>  		switch -glob -- [lindex $file_states($path) 0] {
>  		U? {continue}
> +		?O {
> +			lappend untracked_list $path
> +		}
>  		?M -
>  		?T -
>  		?D {
>  			lappend path_list $path
>  			if {$path eq $current_diff_path} {
> -				set after {reshow_diff;}
> +				$after_chord eval { set should_reshow_diff 1 }
>  			}
>  		}
>  		}
>  	}
>  
> +	set path_cnt [llength $path_list]
> +	set untracked_cnt [llength $untracked_list]
>  
> -	# Split question between singular and plural cases, because
> -	# such distinction is needed in some languages. Previously, the
> -	# code used "Revert changes in" for both, but that can't work
> -	# in languages where 'in' must be combined with word from
> -	# rest of string (in different way for both cases of course).
> -	#
> -	# FIXME: Unfortunately, even that isn't enough in some languages
> -	# as they have quite complex plural-form rules. Unfortunately,
> -	# msgcat doesn't seem to support that kind of string translation.
> -	#
> -	set n [llength $path_list]
> -	if {$n == 0} {
> -		unlock_index
> -		return
> -	} elseif {$n == 1} {
> -		set query [mc "Revert changes in file %s?" [short_path [lindex $path_list]]]
> -	} else {
> -		set query [mc "Revert changes in these %i files?" $n]
> -	}
> +	if {$path_cnt > 0} {
> +		# Split question between singular and plural cases, because
> +		# such distinction is needed in some languages. Previously, the
> +		# code used "Revert changes in" for both, but that can't work
> +		# in languages where 'in' must be combined with word from
> +		# rest of string (in different way for both cases of course).
> +		#
> +		# FIXME: Unfortunately, even that isn't enough in some languages
> +		# as they have quite complex plural-form rules. Unfortunately,
> +		# msgcat doesn't seem to support that kind of string
> +		# translation.
> +		#
> +		if {$path_cnt == 1} {
> +			set query [mc \
> +				"Revert changes in file %s?" \
> +				[short_path [lindex $path_list]] \
> +				]
> +		} else {
> +			set query [mc \
> +				"Revert changes in these %i files?" \
> +				$path_cnt]
> +		}
>  
> -	set reply [tk_dialog \
> -		.confirm_revert \
> -		"[appname] ([reponame])" \
> -		"$query
> +		set reply [tk_dialog \
> +			.confirm_revert \
> +			"[appname] ([reponame])" \
> +			"$query
>  
>  [mc "Any unstaged changes will be permanently lost by the revert."]" \
> -		question \
> -		1 \
> -		[mc "Do Nothing"] \
> -		[mc "Revert Changes"] \
> -		]
> -	if {$reply == 1} {
> -		checkout_index \
> -			$txt \
> +			question \
> +			1 \
> +			[mc "Do Nothing"] \
> +			[mc "Revert Changes"] \
> +			]
> +
> +		if {$reply == 1} {
> +			checkout_index \
> +				$txt \
> +				$path_list \
> +				[$after_chord add_note]
> +		}
> +	}
> +
> +	if {$untracked_cnt > 0} {
> +		# Split question between singular and plural cases, because
> +		# such distinction is needed in some languages.
> +		#
> +		# FIXME: Unfortunately, even that isn't enough in some languages
> +		# as they have quite complex plural-form rules. Unfortunately,
> +		# msgcat doesn't seem to support that kind of string
> +		# translation.
> +		#
> +		if {$untracked_cnt == 1} {
> +			set query [mc \
> +				"Delete untracked file %s?" \
> +				[short_path [lindex $untracked_list]] \
> +				]
> +		} else {
> +			set query [mc \
> +				"Delete these %i untracked files?" \
> +				$untracked_cnt \
> +				]
> +		}
> +
> +		set reply [tk_dialog \
> +			.confirm_revert \
> +			"[appname] ([reponame])" \
> +			"$query
> +
> +[mc "Files will be permanently deleted."]" \
> +			question \
> +			1 \
> +			[mc "Do Nothing"] \
> +			[mc "Delete Files"] \
> +			]
> +
> +		if {$reply == 1} {
> +			$after_chord eval { set should_reshow_diff 1 }
> +
> +			delete_files $untracked_list [$after_chord add_note]
> +		}
> +	}
> +
> +	# Activate the common note. If no other notes were created, this
> +	# completes the chord. If other notes were created, then this common
> +	# note prevents a race condition where the chord might complete early.
> +	$after_common_note
> +}
> +
> +# Delete all of the specified files, performing deletion in batches to allow the
> +# UI to remain responsive and updated.
> +proc delete_files {path_list after} {
> +	# Enable progress bar status updates
> +	$::main_status start [mc "Deleting"] [mc "files"]
> +
> +	set path_index 0
> +	set deletion_errors [list]
> +	set batch_size 50
> +
> +	delete_helper \
> +		$path_list \
> +		$path_index \
> +		$deletion_errors \
> +		$batch_size \
> +		$after
> +}
> +
> +# Helper function to delete a list of files in batches. Each call deletes one
> +# batch of files, and then schedules a call for the next batch after any UI
> +# messages have been processed.
> +proc delete_helper {path_list path_index deletion_errors batch_size after} {
> +	global file_states
> +
> +	set path_cnt [llength $path_list]
> +
> +	set batch_remaining $batch_size
> +
> +	while {$batch_remaining > 0} {
> +		if {$path_index >= $path_cnt} { break }
> +
> +		set path [lindex $path_list $path_index]
> +
> +		set deletion_failed [catch {file delete -- $path} deletion_error]
> +
> +		if {$deletion_failed} {
> +			lappend deletion_errors [list "$deletion_error"]
> +		} else {
> +			remove_empty_directories [file dirname $path]
> +
> +			# Don't assume the deletion worked. Remove the file from
> +			# the UI, but only if it no longer exists.
> +			if {![path_exists $path]} {
> +				unset file_states($path)
> +				display_file $path __
> +			}
> +		}
> +
> +		incr path_index 1
> +		incr batch_remaining -1
> +	}
> +
> +	# Update the progress bar to indicate that this batch has been
> +	# completed. The update will be visible when this procedure returns
> +	# and allows the UI thread to process messages.
> +	$::main_status update $path_index $path_cnt
> +
> +	if {$path_index < $path_cnt} {
> +		# The Tcler's Wiki lists this as the best practice for keeping
> +		# a UI active and processing messages during a long-running
> +		# operation.
> +
> +		after idle [list after 0 [list \
> +			delete_helper \
>  			$path_list \
> -			[concat $after [list ui_ready]]
> +			$path_index \
> +			$deletion_errors \
> +			$batch_size \
> +			$after
> +			]]
>  	} else {
> -		unlock_index
> +		# Finish the status bar operation.
> +		$::main_status stop
> +
> +		# Report error, if any, based on how many deletions failed.
> +		set deletion_error_cnt [llength $deletion_errors]
> +
> +		if {($deletion_error_cnt > 0) && ($deletion_error_cnt <= [MAX_VERBOSE_FILES_IN_DELETION_ERROR])} {

Nitpick: please split the line into two.

> +			set error_text "Encountered errors deleting files:\n"

Wrap the string in a `mc [...]` so it can be translated some time in the 
future.

> +
> +			foreach deletion_error $deletion_errors {
> +				append error_text "* [lindex $deletion_error 0]\n"
> +			}
> +
> +			error_popup $error_text
> +		} elseif {$deletion_error_cnt == $path_cnt} {
> +			error_popup [mc \
> +				"None of the %d selected files could be deleted." \
> +				$path_cnt \
> +				]
> +		} elseif {$deletion_error_cnt > 1} {
> +			error_popup [mc \
> +				"%d of the %d selected files could not be deleted." \
> +				$deletion_error_cnt \
> +				$path_cnt \
> +				]

Nice! In case someone in the future wants to have a config variable to 
change this limit, this makes it pretty easy to do so. 

> +		}
> +
> +		uplevel #0 $after
> +	}
> +}
> +
> +proc MAX_VERBOSE_FILES_IN_DELETION_ERROR {} { return 10; }

Why use a procedure, and not a global variable? My guess is to make it 
impossible for some code to change this value by mistake. Do I guess 
correctly?

> +
> +# This function is from the TCL documentation:
> +#
> +#   https://wiki.tcl-lang.org/page/file+exists
> +#
> +# [file exists] returns false if the path does exist but is a symlink to a path
> +# that doesn't exist. This proc returns true if the path exists, regardless of
> +# whether it is a symlink and whether it is broken.
> +proc path_exists {name} {
> +	expr {![catch {file lstat $name finfo}]}
> +}
> +
> +# Remove as many empty directories as we can starting at the specified path,
> +# walking up the directory tree. If we encounter a directory that is not
> +# empty, or if a directory deletion fails, then we stop the operation and
> +# return to the caller. Even if this procedure fails to delete any
> +# directories at all, it does not report failure.
> +proc remove_empty_directories {directory_path} {
> +	set parent_path [file dirname $directory_path]
> +
> +	while {$parent_path != $directory_path} {
> +		set contents [glob -nocomplain -dir $directory_path *]
> +
> +		if {[llength $contents] > 0} { break }
> +		if {[catch {file delete -- $directory_path}]} { break }
> +
> +		set directory_path $parent_path
> +		set parent_path [file dirname $directory_path]
>  	}
>  }

Wew! This took longer than I expected ;)

Tested on Linux. Works fine after fixing the extra arguments passed to 
`_close_updateindex`. Thanks.

[0] https://www.tcl.tk/man/tcl8.6/TclCmd/class.htm
Jonathan Gilbert Nov. 11, 2019, 9:55 p.m. UTC | #2
On Mon, Nov 11, 2019 at 1:25 PM Pratyush Yadav me-at-yadavpratyush.com
|GitHub Public/Example Allow| <172q77k4bxwj0zt@sneakemail.com> wrote:
> On 07/11/19 07:05AM, Jonathan Gilbert via GitGitGadget wrote:
> > --- /dev/null
> > +++ b/lib/chord.tcl
> > @@ -0,0 +1,137 @@
>
> The 'class' documentation [0] suggests adding a "package require TclOO".
> But TclOO ships by default with Tcl 8.6 and above. So, I'm not really
> sure if we need this.

I'm not super familiar with it. I just checked what Tcl version I was
myself running, since it's only there because of the Git Gui
installation bundled with Git for Windows, and it was 8.6, so I
assumed it was fair game to use. It didn't occur to me that you could
already have an older version of Tcl installed and have Git Gui use
it. :-) So, if I'm understanding correctly, `TclOO` as a package could
potentially be used to allow TclOO to be used with 8.4, the minimum
supported version you mention below, and it just happened to work for
me in my testing without that because I have 8.6 installed but that's
technically newer than the supported baseline?

> Nice to see some good documentation!
>
> One nitpick: would it make more sense to have the documentation for a
> method/constructor just above that method/constructor? This way, when
> someone updates the code some time later, they'll also hopefully
> remember to update the documentation. It is much more likely to be stale
> if all of it just stays on the top.

Hmm, what do you think of both? I was thinking of the documentation as
a single self-contained block that someone could read to put together
an understanding of how the chord system fits together, and split out,
it wouldn't have that readability. What about a more abstract
description in a block at the top, and then more technically-detailed
& specific descriptions attached to each method?

> > +oo::class create SimpleChord {
>
> This comes from the TclOO package, right?
>
> git-gui has its own object-oriented system (lib/class.tcl). It was
> written circa 2007. I suspect something like TclOO did not exist back
> then.
>
> Why not use that? Does it have some limitations that TclOO does not
> have? I do not mind using the "official" OO system. I just want to know
> why exactly you made the choice.

Having limited experience with Tcl, I did a Google search for "tcl
object oriented" and ended up writing code using TclOO because that's
what came up. Do you think it makes sense to rework this to use
`class.tcl`, or perhaps instead the opposite: have a policy of using
the standard TclOO going forward, and let the rest of Git Gui
organically upgrade itself to some hypothetical point in the future
where class.tcl is no longer used by anything?

> More importantly, TclOO ships as part of the core distribution with Tcl
> 8.6, but as of now the minimum version required for git-gui is 8.4. So,
> I think we should bump the minimum version (8.6 released circa 2012, so
> most people should have caught up by now I hope).

If I understand correctly, you mentioned that TclOO was intrinsically
available to me because I was using Tcl 8.6, and that the manual
recommends `package require TclOO` -- does that package dependency
permit the use of TclOO on 8.4? If so, could that be a way to avoid
bumping the minimum version required? Simply in the interest of
keeping the scope of the change limited. If not, then bumping the
minimum required version to 8.6 from 2012 doesn't seem entirely
unreasonable either. :-)

> > +     variable Notes
> > +     variable Body
> > +     variable IsCompleted
>
> Nitpick: Please use snake_case, here and in other places.

Okay, yep -- I had copied the convention that I saw in TclOO examples,
conscious of the fact that there might be a standard specific to
object-oriented Tcl.

> > +     method notify_note_activation {} {
>
> Since this method is for internal use only, can it be made "private"?
> Does the OO library support something like this?

I don't think so, because it's called from outside the class. What
we'd be looking for is something like C++'s "friend" syntax. Tcl
doesn't seem to have this. Though, I just did some further Googling,
and saw a hint that it might be possible to bypass member security on
a case-by-case basis, so that the method is private but `ChordNote` is
able to call it anyway. I'll see if I can't figure this out. :-)

> > +     method unknown {} {
>
> I'm a bit lost here. This method is named 'unknown', but searching for
> 'unknown' in this patch just gives me two results: this line here, and
> then one in a comment at the start of the file.
>
> From what I understand looking at the code, it some sort of a "default"
> method, and is called when you run just `$chord_note`. How exactly is
> this method designated to be the default?
>
> Also, "unknown" makes little sense in this context. Can you rename it to
> something more meaningful? Maybe something like "activate_note"?

I think it's the fact that it is named `unknown` that makes it the
"default" method. I think this just needs documentary comments next to
it. The TclOO documentation says:

> obj unknown ?methodName? ?arg ...?
> This method is called when an attempt to invoke the method methodName on
> object obj fails. The arguments that the user supplied to the method are
> given as arg arguments. If methodName is absent, the object was invoked with
> no method name at all (or any other arguments).

It was based on that last sentence that I interpreted `unknown` as,
"This is a mechanism for making an object that can be called like a
method."

> > +             if {!$IsActivated} {
> > +                     set IsActivated 1
> > +                     $Chord notify_note_activation
> > +             }
> > +     }
> > +}
>
> From what I understand, the "Note" object is effectively used as a
> count. There is no other state associated with it. When I first heard of
> your description of this abstraction, I assumed that a Note would also
> store a script to execute with it. So, when you "activate" a note, it
> would first execute the script, and then mark itself as "activated", and
> notify the chord. Would that abstraction make more sense?
>
> I don't really mind keeping it this way, but I wonder if that design
> would make the abstraction easier to wrap your head around.

I learned about the concept of chords and notes from an experimental
language that Microsoft created many years back called "Polyphonic C#"
(which in turn got rolled into "Cw" (C-omega)), and in that
abstraction, the idea was that, well, as a baseline, for starters, we
have methods and each one, conceptually, has an entrypoint with a
certain set of parameters, and when you call that entrypoint, the
parameters are all set and the body runs. With a "chord", you have
more than one entrypoint attached to the same body -- the entrypoints
themselves don't have any logic associated with them individually.
Each note has its own parameter list, and when all the notes have been
called, the body is run with _all_ of those parameters.

I drew some ASCII art, don't know if it'll translate in the message,
but here goes :-)

Basic method (or, if you will, a "chord" with only one "note"):

           (caller)
              |
    void Add(int X, int Y)
              |
      { output(X + Y) }

A "chord" with two "notes":

        (caller)                (caller)
            |                       |
    void AddX(int X)         void AddY(int Y)
            |                       |
            `-----------.-----------'
                        |
                { output(X + Y) }

The specific details differ from what I've written here. In Polyphonic
C#, you don't have to instantiate a chord, you simply start calling
methods, and the runtime matches up complete sets dynamically. (Just
thinking through the implications of this, if the notes aren't all
called at exactly the same rate this obviously leads very easily to
bugs that chew up all memory on incomplete chords. :-P) Also,
Microsoft's language has parameters to each of the notes that are
_all_ passed to the body at once. My implementation here is a "simple"
chord, I didn't bother with arguments, as they aren't needed in this
usage :-) I also found it much simpler to think of implementing the
chord with the activations being explicit instead of implicit. So
instead of saying up front, "Here is my method body and here are its 3
entrypoints", with this implementation the chord is a dynamic object,
you say "Here is my method body" and get back a thing that you can
start tacking entrypoints onto.

But, a "note" in a SimpleChord isn't a counter, it's a latch. The
chord itself is acting sort of like a counter, in that all the notes
need to be activated, but because the notes are latches, activating a
note repeatedly has the same effect as activating it once. There's no
way for one note to interfere with other notes, which wouldn't be the
case if it literally were just a counter.

It seems to me that a chord where each note has a script of its own is
actually basically just a class with methods, I guess with a common
joined epilogue?:

        (caller)                (caller)
            |                       |
    void AddX(int X)         void AddY(int Y)
            |                       |
   { script for AddX }      {script for AddY }
            |                       |
            `-----------.-----------'
                        |
                { common tail?? }

The whole point is that the notes are conceptually different "headers"
into _the same_ body. When you call a note of a chord, it is because
you want the _chord_'s script to run, and the chord is acting as a
construct that says "okay, yes, I'll satisfy your request that I
execute, but you'll have to wait, because I'm going to satisfy _all_
your requests in one go".

> >       $::main_status stop
> > -     unlock_index
> > -     uplevel #0 $after
>
> There is a call to unlock_index in the body of the if statement above
> too. Do we want to remove that too, or should it be left alone?
>
> That codepath seems to be taken when a major error happens, and we just
> resign to our fate and get a fresh start by doing a rescan and syncing
> the repo state. So it is quite likely whatever operation we were doing
> failed spectacularly.
>
> Maybe the answer is to swallow the bitter pill and introduce a
> switch/boolean in `_close_updateindex` that controls whether the index
> is unlocked or not. We unlock it when the if statement is not taken, and
> keep the current codepath when it is. I call it a "bitter pill" because
> I'm usually not a huge fan of adding knobs like that in functions. Makes
> the function harder to reason about and makes it more bug prone.
>
> If you can think of a better/cleaner way of working around this,
> suggestions are welcome!

Hmm, so, yeah, the entire if statement only occurs if it can't close
the file descriptor. Is that something that actually happens? If so,
then it should perhaps be throwing an exception, because having
started a rescan is probably more than the caller bargained for. That
would prevent the callers from unlocking the index out from under the
rescan, and also cancel any other processing they might be doing that
is probably making bad assumptions with a rescan running.

> >       if {$update_index_cp >= $total_cnt} {
> > -             _close_updateindex $fd $after
> > +             _close_updateindex $fd $do_unlock_index $after
>
> _close_updateindex takes only one argument, and you pass it 3.
> $do_unlock_index does not seem to be defined anywhere. $after is
> evaluated just after this line, and _close_updateindex doesn't accept
> the argument anyway. I suspect this is a leftover from a different
> approach you tried before this one.

It is indeed, oops!

> Also, unlike all the other places where _close_updateindex is used, this
> one does not make a call to unlock_index. Is that intended? IIUC, it
> should be intended, since this is the part which uses the "chord", but a
> confirmation would be nice.

Intentional, yes. I'll see if there's a concise way to document this.

> > +     # Common "after" functionality that waits until multiple asynchronous
> > +     # operations are complete (by waiting for them to activate their notes
> > +     # on the chord).
>
> Nitpick: mention what the "multiple asynchronous operations" are exactly
> (i.e, they are the deletion and index checkout operations).

Okeydoke.

> >       set after {}
>
> 'after' seems to be an unused variable. This line can be deleted.

Good catch.

> > +             if {($deletion_error_cnt > 0) && ($deletion_error_cnt <= [MAX_VERBOSE_FILES_IN_DELETION_ERROR])} {
>
> Nitpick: please split the line into two.

Will do.

> > +                     set error_text "Encountered errors deleting files:\n"
>
> Wrap the string in a `mc [...]` so it can be translated some time in the
> future.

Ah, yes, I did that with most messages, this was an oversight.

> > +proc MAX_VERBOSE_FILES_IN_DELETION_ERROR {} { return 10; }
>
> Why use a procedure, and not a global variable? My guess is to make it
> impossible for some code to change this value by mistake. Do I guess
> correctly?

A variable is by definition not a constant. This is the pattern that
came up when I did a search for how one makes a constant in Tcl. ""\_(
``_/ )_/""

Making it a procedure means that if someone wants to put actual logic
behind it in the future, it's already being called as a proc.

> Wew! This took longer than I expected ;)
>
> Tested on Linux. Works fine after fixing the extra arguments passed to
> `_close_updateindex`. Thanks.

Yeah, I did run things as I was changing them to verify, and felt like
I covered everything, I'm surprised I didn't bump into that, obviously
I didn't cover everything after all. Perfect demonstration of why
developers should never be exclusively responsible for testing their
own code :-D

Let me know w.r.t. which OO framework to employ and what that means
for minimum required versions and/or package references.

Thanks very much,

Jonathan Gilbert
Philip Oakley Nov. 11, 2019, 10:59 p.m. UTC | #3
On 11/11/2019 21:55, Jonathan Gilbert wrote:
> Basic method (or, if you will, a "chord" with only one "note"):
>
>             (caller)
>                |
>      void Add(int X, int Y)
>                |
>        { output(X + Y) }
>
> A "chord" with two "notes":
>
>          (caller)                (caller)
>              |                       |
>      void AddX(int X)         void AddY(int Y)
>              |                       |
>              `-----------.-----------'
>                          |
>                  { output(X + Y) }
>
> The specific details differ from what I've written here. In Polyphonic
> C#, you don't have to instantiate a chord, you simply start calling
> methods, and the runtime matches up complete sets dynamically.
sounds like "Currying" a function but with the parameters taken in any 
order, though, in a sense, perhaps not generating intermediate functions...

Philip
Jonathan Gilbert Nov. 12, 2019, 4:49 a.m. UTC | #4
On Mon, Nov 11, 2019 at 4:59 PM Philip Oakley <philipoakley@iee.email> wrote:
> sounds like "Currying" a function but with the parameters taken in any
> order, though, in a sense, perhaps not generating intermediate functions...

It's like currying if you could pass g(x) = f(x, y) to one block of
code and h(y) = f(x, y) to another block of code, so that each of g
and h are each like curried versions of f that "bake in" one of the
arguments, without having to know which one will get called first. :-)

Jonathan Gilbert
Philip Oakley Nov. 12, 2019, 10:45 a.m. UTC | #5
On 12/11/2019 04:49, Jonathan Gilbert wrote:
> On Mon, Nov 11, 2019 at 4:59 PM Philip Oakley <philipoakley@iee.email> wrote:
>> sounds like "Currying" a function but with the parameters taken in any
>> order, though, in a sense, perhaps not generating intermediate functions...
> It's like currying if you could pass g(x) = f(x, y) to one block of
> code and h(y) = f(x, y) to another block of code, so that each of g
> and h are each like curried versions of f that "bake in" one of the
> arguments, without having to know which one will get called first. :-)
>
> Jonathan Gilbert
So that would be called "Chording"...
(Is there a 'proper' technical term for that approach?)
P.
Jonathan Gilbert Nov. 12, 2019, 4:29 p.m. UTC | #6
On Tue, Nov 12, 2019 at 4:45 AM Philip Oakley <philipoakley@iee.email> wrote:
> On 12/11/2019 04:49, Jonathan Gilbert wrote:
> > On Mon, Nov 11, 2019 at 4:59 PM Philip Oakley <philipoakley@iee.email> wrote:
> >> sounds like "Currying" a function but with the parameters taken in any
> >> order, though, in a sense, perhaps not generating intermediate functions...
> > It's like currying if you could pass g(x) = f(x, y) to one block of
> > code and h(y) = f(x, y) to another block of code, so that each of g
> > and h are each like curried versions of f that "bake in" one of the
> > arguments, without having to know which one will get called first. :-)
> >
> > Jonathan Gilbert
> So that would be called "Chording"...
> (Is there a 'proper' technical term for that approach?)

Not an entirely implausible term :-) The only other implementation
I've ever seen was Microsoft's "Polyphonic C#", which got rolled into
C-omega. I'm pretty sure, though, that it was never referred to as
something you _do to_ a function, but rather as a _different type_ of
function -- as in, the function hasn't been "chorded", it "is a
chord". Very little literature one way or the other though, and this
is the first actual, live use case for the structure I've encountered
in my years of programming :-)

Jonathan Gilbert
Pratyush Yadav Nov. 12, 2019, 7:35 p.m. UTC | #7
Hi Jonathan,

On 11/11/19 03:55PM, Jonathan Gilbert wrote:
> On Mon, Nov 11, 2019 at 1:25 PM Pratyush Yadav me-at-yadavpratyush.com
> |GitHub Public/Example Allow| <172q77k4bxwj0zt@sneakemail.com> wrote:
> > On 07/11/19 07:05AM, Jonathan Gilbert via GitGitGadget wrote:
> > > --- /dev/null
> > > +++ b/lib/chord.tcl
> > > @@ -0,0 +1,137 @@
> >
> > The 'class' documentation [0] suggests adding a "package require TclOO".
> > But TclOO ships by default with Tcl 8.6 and above. So, I'm not really
> > sure if we need this.
> 
> I'm not super familiar with it. I just checked what Tcl version I was
> myself running, since it's only there because of the Git Gui
> installation bundled with Git for Windows, and it was 8.6, so I
> assumed it was fair game to use. It didn't occur to me that you could
> already have an older version of Tcl installed and have Git Gui use
> it. :-) So, if I'm understanding correctly, `TclOO` as a package could
> potentially be used to allow TclOO to be used with 8.4, the minimum
> supported version you mention below, and it just happened to work for
> me in my testing without that because I have 8.6 installed but that's
> technically newer than the supported baseline?
> 
> > Nice to see some good documentation!
> >
> > One nitpick: would it make more sense to have the documentation for a
> > method/constructor just above that method/constructor? This way, when
> > someone updates the code some time later, they'll also hopefully
> > remember to update the documentation. It is much more likely to be stale
> > if all of it just stays on the top.
> 
> Hmm, what do you think of both? I was thinking of the documentation as
> a single self-contained block that someone could read to put together
> an understanding of how the chord system fits together, and split out,
> it wouldn't have that readability. What about a more abstract
> description in a block at the top, and then more technically-detailed
> & specific descriptions attached to each method?

Since you put it this way, it does make sense to create some flow. I'm 
not sure if these relatively simple methods warrant specific detailed 
documentation.

So, if you can figure out a reasonable split, that'd be great. 
Otherwise, I guess we can just stick with this.
 
> > > +oo::class create SimpleChord {
> >
> > This comes from the TclOO package, right?
> >
> > git-gui has its own object-oriented system (lib/class.tcl). It was
> > written circa 2007. I suspect something like TclOO did not exist back
> > then.
> >
> > Why not use that? Does it have some limitations that TclOO does not
> > have? I do not mind using the "official" OO system. I just want to know
> > why exactly you made the choice.
> 
> Having limited experience with Tcl, I did a Google search for "tcl
> object oriented" and ended up writing code using TclOO because that's
> what came up. Do you think it makes sense to rework this to use
> `class.tcl`, or perhaps instead the opposite: have a policy of using
> the standard TclOO going forward, and let the rest of Git Gui
> organically upgrade itself to some hypothetical point in the future
> where class.tcl is no longer used by anything?

Replacing class.tcl would be a big effort, and seeing how things stand 
as of now in terms of active contributors, I don't think it would happen 
in the near future.

So the question really boils down to "do we want to mix these two 
flavors of OO frameworks?".

If TclOO gives us some benefit over our homegrown framework, or if our 
framework is in some way hard to use, then I would certainly side on 
just sticking with TclOO.

If not, it becomes a question of taste more of less. Which 
implementation do we like more, and which more people would be 
comfortable working with. And whether mixing the two is a good idea or 
not.

That being said, I am more inclined towards using our homegrown 
framework just for the sake of uniformity if nothing else.

So in the end I guess the answer is I dunno.
 
> > More importantly, TclOO ships as part of the core distribution with Tcl
> > 8.6, but as of now the minimum version required for git-gui is 8.4. So,
> > I think we should bump the minimum version (8.6 released circa 2012, so
> > most people should have caught up by now I hope).
> 
> If I understand correctly, you mentioned that TclOO was intrinsically
> available to me because I was using Tcl 8.6, and that the manual
> recommends `package require TclOO` -- does that package dependency
> permit the use of TclOO on 8.4? If so, could that be a way to avoid
> bumping the minimum version required? Simply in the interest of
> keeping the scope of the change limited. If not, then bumping the
> minimum required version to 8.6 from 2012 doesn't seem entirely
> unreasonable either. :-)

I looked around a bit, and it seems that TclOO would not work with 8.4 
[0]. So, a version bump is needed. Unless, of course, you decide to use 
the OO framework provided by class.tcl.

The version can be bumped by editing the line git-gui.sh:33.
 
> > > +     variable Notes
> > > +     variable Body
> > > +     variable IsCompleted
> >
> > Nitpick: Please use snake_case, here and in other places.
> 
> Okay, yep -- I had copied the convention that I saw in TclOO examples,
> conscious of the fact that there might be a standard specific to
> object-oriented Tcl.
> 
> > > +     method notify_note_activation {} {
> >
> > Since this method is for internal use only, can it be made "private"?
> > Does the OO library support something like this?
> 
> I don't think so, because it's called from outside the class. What
> we'd be looking for is something like C++'s "friend" syntax. Tcl
> doesn't seem to have this. Though, I just did some further Googling,
> and saw a hint that it might be possible to bypass member security on
> a case-by-case basis, so that the method is private but `ChordNote` is
> able to call it anyway. I'll see if I can't figure this out. :-)

I don't think too much complexity/hacking is warranted for something 
like this. If you can figure out a really simple way to do it, great! 
Otherwise, just keep it like it is.
 
> > > +     method unknown {} {
> >
> > I'm a bit lost here. This method is named 'unknown', but searching for
> > 'unknown' in this patch just gives me two results: this line here, and
> > then one in a comment at the start of the file.
> >
> > From what I understand looking at the code, it some sort of a "default"
> > method, and is called when you run just `$chord_note`. How exactly is
> > this method designated to be the default?
> >
> > Also, "unknown" makes little sense in this context. Can you rename it to
> > something more meaningful? Maybe something like "activate_note"?
> 
> I think it's the fact that it is named `unknown` that makes it the
> "default" method. I think this just needs documentary comments next to
> it. The TclOO documentation says:

Yes, a comment explaining it is the default would be nice.
 
> > obj unknown ?methodName? ?arg ...?
> > This method is called when an attempt to invoke the method methodName on
> > object obj fails. The arguments that the user supplied to the method are
> > given as arg arguments. If methodName is absent, the object was invoked with
> > no method name at all (or any other arguments).
> 
> It was based on that last sentence that I interpreted `unknown` as,
> "This is a mechanism for making an object that can be called like a
> method."

Looks like this method would also be called if someone misspelled a 
method name for this object. So say if someone by mistake writes 

  $note is_activate

this method would be called. This is a clear bug. So, add a check here 
to make sure 'methodName' is actually absent. And if it isn't, display 
an error. Displaying an error to the user on a programmer error can get 
annoying. But since we don't have something like assertions in git-gui 
yet, maybe that's the best way to get bugs noticed.
 
> > > +             if {!$IsActivated} {
> > > +                     set IsActivated 1
> > > +                     $Chord notify_note_activation
> > > +             }
> > > +     }
> > > +}
> >
> > From what I understand, the "Note" object is effectively used as a
> > count. There is no other state associated with it. When I first heard of
> > your description of this abstraction, I assumed that a Note would also
> > store a script to execute with it. So, when you "activate" a note, it
> > would first execute the script, and then mark itself as "activated", and
> > notify the chord. Would that abstraction make more sense?
> >
> > I don't really mind keeping it this way, but I wonder if that design
> > would make the abstraction easier to wrap your head around.
> 
> I learned about the concept of chords and notes from an experimental
> language that Microsoft created many years back called "Polyphonic C#"
> (which in turn got rolled into "Cw" (C-omega)), and in that
> abstraction, the idea was that, well, as a baseline, for starters, we
> have methods and each one, conceptually, has an entrypoint with a
> certain set of parameters, and when you call that entrypoint, the
> parameters are all set and the body runs. With a "chord", you have
> more than one entrypoint attached to the same body -- the entrypoints
> themselves don't have any logic associated with them individually.
> Each note has its own parameter list, and when all the notes have been
> called, the body is run with _all_ of those parameters.
> 
> I drew some ASCII art, don't know if it'll translate in the message,
> but here goes :-)
> 
> Basic method (or, if you will, a "chord" with only one "note"):
> 
>            (caller)
>               |
>     void Add(int X, int Y)
>               |
>       { output(X + Y) }
> 
> A "chord" with two "notes":
> 
>         (caller)                (caller)
>             |                       |
>     void AddX(int X)         void AddY(int Y)
>             |                       |
>             `-----------.-----------'
>                         |
>                 { output(X + Y) }
> 
> The specific details differ from what I've written here. In Polyphonic
> C#, you don't have to instantiate a chord, you simply start calling
> methods, and the runtime matches up complete sets dynamically. (Just
> thinking through the implications of this, if the notes aren't all
> called at exactly the same rate this obviously leads very easily to
> bugs that chew up all memory on incomplete chords. :-P) Also,
> Microsoft's language has parameters to each of the notes that are
> _all_ passed to the body at once. My implementation here is a "simple"
> chord, I didn't bother with arguments, as they aren't needed in this
> usage :-) I also found it much simpler to think of implementing the
> chord with the activations being explicit instead of implicit. So
> instead of saying up front, "Here is my method body and here are its 3
> entrypoints", with this implementation the chord is a dynamic object,
> you say "Here is my method body" and get back a thing that you can
> start tacking entrypoints onto.
> 
> But, a "note" in a SimpleChord isn't a counter, it's a latch. The
> chord itself is acting sort of like a counter, in that all the notes
> need to be activated, but because the notes are latches, activating a
> note repeatedly has the same effect as activating it once. There's no
> way for one note to interfere with other notes, which wouldn't be the
> case if it literally were just a counter.

Makes sense.
 
> It seems to me that a chord where each note has a script of its own is
> actually basically just a class with methods, I guess with a common
> joined epilogue?:
> 
>         (caller)                (caller)
>             |                       |
>     void AddX(int X)         void AddY(int Y)
>             |                       |
>    { script for AddX }      {script for AddY }
>             |                       |
>             `-----------.-----------'
>                         |
>                 { common tail?? }

Thanks for explaining.

I had a slightly different mental model of the abstraction. The figure 
here is what I had in mind, with the exception being that the two 
functions that the two callers call are independent of each other.

To put it in more detail, what I was thinking of was that you'd create a 
bunch of scripts that had to be evaluated separately, independent of 
each other. Each script is associated with a note. Activating a note 
runs that script. And when all the notes are activated, the common tail 
is executed.

As far as I see, the use of the chord in the patch has just two 
independent operations that need to run a common tail once both are 
complete.

That's not to say it has to be done this way. Your way works just as 
well, just in a slightly different way :)
 
> The whole point is that the notes are conceptually different "headers"
> into _the same_ body. When you call a note of a chord, it is because
> you want the _chord_'s script to run, and the chord is acting as a
> construct that says "okay, yes, I'll satisfy your request that I
> execute, but you'll have to wait, because I'm going to satisfy _all_
> your requests in one go".
> 
> > >       $::main_status stop
> > > -     unlock_index
> > > -     uplevel #0 $after
> >
> > There is a call to unlock_index in the body of the if statement above
> > too. Do we want to remove that too, or should it be left alone?
> >
> > That codepath seems to be taken when a major error happens, and we just
> > resign to our fate and get a fresh start by doing a rescan and syncing
> > the repo state. So it is quite likely whatever operation we were doing
> > failed spectacularly.
> >
> > Maybe the answer is to swallow the bitter pill and introduce a
> > switch/boolean in `_close_updateindex` that controls whether the index
> > is unlocked or not. We unlock it when the if statement is not taken, and
> > keep the current codepath when it is. I call it a "bitter pill" because
> > I'm usually not a huge fan of adding knobs like that in functions. Makes
> > the function harder to reason about and makes it more bug prone.
> >
> > If you can think of a better/cleaner way of working around this,
> > suggestions are welcome!
> 
> Hmm, so, yeah, the entire if statement only occurs if it can't close
> the file descriptor. Is that something that actually happens? If so,
> then it should perhaps be throwing an exception, because having
> started a rescan is probably more than the caller bargained for. That
> would prevent the callers from unlocking the index out from under the
> rescan, and also cancel any other processing they might be doing that
> is probably making bad assumptions with a rescan running.

This seems like defensive programming. It is accounting for something 
_really bad_ happening.

If closing the file descriptor fails, it means the buffer was not 
flushed properly for some reason. Whatever operations we thought we did 
were potentially not completed. So, we just discard all 
assumptions/state, and get a fresh start by doing a rescan. This was 
introduced in d4e890e5 ("git-gui: Make sure we get errors from 
git-update-index", 23-10-2007). The commit message says:

    I'm seeing a lot of silent failures from git-update-index on
    Windows and this is leaving the index.lock file intact, which
    means users are later unable to perform additional operations.

    When the index is locked behind our back and we are unable to
    use it we may need to allow the user to delete the index lock
    and try again.  However our UI state is probably not currect
    as we have assumed that some changes were applied but none of
    them actually did.  A rescan is the easiest (in code anyway)
    solution to correct our UI to show what the index really has
    (or doesn't have).

Since this is a _really_ old commit, I'm not sure if the problem still 
exists today though.

So, this recovery code has to go somewhere. Yes, a rescan is certainly 
more than what the caller wanted, but it is better than working on an 
inconsistent in-memory state of the repo.

The question then becomes where the best place to do so is. This seems 
like a good one if we can get our locking requirements to work with it 
properly.

The glaring problem is that we don't want the rescan to run while the 
deletion task is still running because they will interfere with each 
other. Also, deletion expects the index to be locked, so the rescan and 
deletion should be mutually exclusive.

One quick hack I can think of is to throw an error from this function, 
and let the caller handle it. Then, in the callers that don't have the 
deletion task to worry about, they just call the rescan (to be more 
specific, the body of the if statement - moved to its own function). The 
callers that do have to worry about the deletion somehow schedule it 
after the deletion process finished. Or, they somehow cancel the 
deletion operation, and then run the rescan.

Waiting till the deletion is over can probably be done by polling the 
lock in an `after idle...`.

This is what I can think of at first glance. Maybe I'm missing a better 
and cleaner way?
 
> > >       if {$update_index_cp >= $total_cnt} {
> > > -             _close_updateindex $fd $after
> > > +             _close_updateindex $fd $do_unlock_index $after
> >
> > _close_updateindex takes only one argument, and you pass it 3.
> > $do_unlock_index does not seem to be defined anywhere. $after is
> > evaluated just after this line, and _close_updateindex doesn't accept
> > the argument anyway. I suspect this is a leftover from a different
> > approach you tried before this one.
> 
> It is indeed, oops!
> 
> > Also, unlike all the other places where _close_updateindex is used, this
> > one does not make a call to unlock_index. Is that intended? IIUC, it
> > should be intended, since this is the part which uses the "chord", but a
> > confirmation would be nice.
> 
> Intentional, yes. I'll see if there's a concise way to document this.
> 
> > > +     # Common "after" functionality that waits until multiple asynchronous
> > > +     # operations are complete (by waiting for them to activate their notes
> > > +     # on the chord).
> >
> > Nitpick: mention what the "multiple asynchronous operations" are exactly
> > (i.e, they are the deletion and index checkout operations).
> 
> Okeydoke.
> 
> > >       set after {}
> >
> > 'after' seems to be an unused variable. This line can be deleted.
> 
> Good catch.
> 
> > > +             if {($deletion_error_cnt > 0) && ($deletion_error_cnt <= [MAX_VERBOSE_FILES_IN_DELETION_ERROR])} {
> >
> > Nitpick: please split the line into two.
> 
> Will do.
> 
> > > +                     set error_text "Encountered errors deleting files:\n"
> >
> > Wrap the string in a `mc [...]` so it can be translated some time in the
> > future.
> 
> Ah, yes, I did that with most messages, this was an oversight.
> 
> > > +proc MAX_VERBOSE_FILES_IN_DELETION_ERROR {} { return 10; }
> >
> > Why use a procedure, and not a global variable? My guess is to make it
> > impossible for some code to change this value by mistake. Do I guess
> > correctly?
> 
> A variable is by definition not a constant. This is the pattern that
> came up when I did a search for how one makes a constant in Tcl. ""\_(
> ``_/ )_/""
> 
> Making it a procedure means that if someone wants to put actual logic
> behind it in the future, it's already being called as a proc.

Makes sense.
 
> > Wew! This took longer than I expected ;)
> >
> > Tested on Linux. Works fine after fixing the extra arguments passed to
> > `_close_updateindex`. Thanks.
> 
> Yeah, I did run things as I was changing them to verify, and felt like
> I covered everything, I'm surprised I didn't bump into that, obviously
> I didn't cover everything after all. Perfect demonstration of why
> developers should never be exclusively responsible for testing their
> own code :-D
> 
> Let me know w.r.t. which OO framework to employ and what that means
> for minimum required versions and/or package references.
> 
> Thanks very much,
> 
> Jonathan Gilbert

[0] https://wiki.tcl-lang.org/page/MeTOO
Philip Oakley Nov. 26, 2019, 11:22 a.m. UTC | #8
On 12/11/2019 16:29, Jonathan Gilbert wrote:
> On Tue, Nov 12, 2019 at 4:45 AM Philip Oakley <philipoakley@iee.email> wrote:
>> On 12/11/2019 04:49, Jonathan Gilbert wrote:
>>> On Mon, Nov 11, 2019 at 4:59 PM Philip Oakley <philipoakley@iee.email> wrote:
>>>> sounds like "Currying" a function but with the parameters taken in any
>>>> order, though, in a sense, perhaps not generating intermediate functions...
>>> It's like currying if you could pass g(x) = f(x, y) to one block of
>>> code and h(y) = f(x, y) to another block of code, so that each of g
>>> and h are each like curried versions of f that "bake in" one of the
>>> arguments, without having to know which one will get called first. :-)
>>>
>>> Jonathan Gilbert
>> So that would be called "Chording"...
>> (Is there a 'proper' technical term for that approach?)
> Not an entirely implausible term :-) The only other implementation
> I've ever seen was Microsoft's "Polyphonic C#", which got rolled into
> C-omega. I'm pretty sure, though, that it was never referred to as
> something you _do to_ a function, but rather as a _different type_ of
> function -- as in, the function hasn't been "chorded", it "is a
> chord". Very little literature one way or the other though, and this
> is the first actual, live use case for the structure I've encountered
> in my years of programming :-)
>
A little bit of late follow up ;-)

The basic ideas that are embedded in "chording" would appear to be the 
same as those used in Data Flow Diagrams and the older attempts at data 
flow based machines such as the Transputer and it's message passing, and 
out of order execution machines. See 
https://en.wikipedia.org/wiki/Dataflow_architecture etc.

It just looks like it's now moved to the compiler, or JIT (just-in-time) 
compilation, which appears to be the same thing with different branding!

Philip
diff mbox series

Patch

diff --git a/lib/chord.tcl b/lib/chord.tcl
new file mode 100644
index 0000000000..2d13af14fc
--- /dev/null
+++ b/lib/chord.tcl
@@ -0,0 +1,137 @@ 
+# SimpleChord class:
+#   Represents a procedure that conceptually has multiple entrypoints that must
+#   all be called before the procedure executes. Each entrypoint is called a
+#   "note". The chord is only "completed" when all the notes are "activated".
+#
+#   Constructor:
+#     set chord [SimpleChord new {body}]
+#       Creates a new chord object with the specified body script. The body
+#       script is evaluated at most once, when a note is activated and the
+#       chord has no other non-activated notes.
+#
+#   Methods:
+#     $chord eval {script}
+#       Runs the specified script in the same context (namespace) in which the
+#       chord body will be evaluated. This can be used to set variable values
+#       for the chord body to use.
+#
+#     set note [$chord add_note]
+#       Adds a new note to the chord, an instance of ChordNote. Raises an
+#       error if the chord is already completed, otherwise the chord is updated
+#       so that the new note must also be activated before the body is
+#       evaluated.
+#
+#     $chord notify_note_activation
+#       For internal use only.
+#
+# ChordNote class:
+#   Represents a note within a chord, providing a way to activate it. When the
+#   final note of the chord is activated (this can be any note in the chord,
+#   with all other notes already previously activated in any order), the chord's
+#   body is evaluated.
+#
+#   Constructor:
+#     Instances of ChordNote are created internally by calling add_note on
+#     SimpleChord objects.
+#
+#   Methods:
+#     [$note is_activated]
+#       Returns true if this note has already been activated.
+#
+#     $note
+#       Activates the note, if it has not already been activated, and completes
+#       the chord if there are no other notes awaiting activation. Subsequent
+#       calls will have no further effect.
+#
+# Example:
+#
+#   # Turn off the UI while running a couple of async operations.
+#   lock_ui
+#
+#   set chord [SimpleChord new {
+#     unlock_ui
+#     # Note: $notice here is not referenced in the calling scope
+#     if {$notice} { info_popup $notice }
+#   }
+#
+#   # Configure a note to keep the chord from completing until
+#   # all operations have been initiated.
+#   set common_note [$chord add_note]
+#
+#   # Pass notes as 'after' callbacks to other operations
+#   async_operation $args [$chord add_note]
+#   other_async_operation $args [$chord add_note]
+#
+#   # Communicate with the chord body
+#   if {$condition} {
+#     # This sets $notice in the same context that the chord body runs in.
+#     $chord eval { set notice "Something interesting" }
+#   }
+#
+#   # Activate the common note, making the chord eligible to complete
+#   $common_note
+#
+# At this point, the chord will complete at some unknown point in the future.
+# The common note might have been the first note activated, or the async
+# operations might have completed synchronously and the common note is the
+# last one, completing the chord before this code finishes, or anything in
+# between. The purpose of the chord is to not have to worry about the order.
+
+oo::class create SimpleChord {
+	variable Notes
+	variable Body
+	variable IsCompleted
+
+	constructor {body} {
+		set Notes [list]
+		set Body $body
+		set IsCompleted 0
+	}
+
+	method eval {script} {
+		namespace eval [self] $script
+	}
+
+	method add_note {} {
+		if {$IsCompleted} { error "Cannot add a note to a completed chord" }
+
+		set note [ChordNote new [self]]
+
+		lappend Notes $note
+
+		return $note
+	}
+
+	method notify_note_activation {} {
+		if {!$IsCompleted} {
+			foreach note $Notes {
+				if {![$note is_activated]} { return }
+			}
+
+			set IsCompleted 1
+
+			namespace eval [self] $Body
+			namespace delete [self]
+		}
+	}
+}
+
+oo::class create ChordNote {
+	variable Chord IsActivated
+
+	constructor {chord} {
+		set Chord $chord
+		set IsActivated 0
+	}
+
+	method is_activated {} {
+		return $IsActivated
+	}
+
+	method unknown {} {
+		if {!$IsActivated} {
+			set IsActivated 1
+			$Chord notify_note_activation
+		}
+	}
+}
diff --git a/lib/index.tcl b/lib/index.tcl
index 28d4d2a54e..64046d6833 100644
--- a/lib/index.tcl
+++ b/lib/index.tcl
@@ -7,7 +7,7 @@  proc _delete_indexlock {} {
 	}
 }
 
-proc _close_updateindex {fd after} {
+proc _close_updateindex {fd} {
 	global use_ttk NS
 	fconfigure $fd -blocking 1
 	if {[catch {close $fd} err]} {
@@ -52,8 +52,6 @@  proc _close_updateindex {fd after} {
 	}
 
 	$::main_status stop
-	unlock_index
-	uplevel #0 $after
 }
 
 proc update_indexinfo {msg path_list after} {
@@ -90,7 +88,9 @@  proc write_update_indexinfo {fd path_list total_cnt batch after} {
 	global file_states current_diff_path
 
 	if {$update_index_cp >= $total_cnt} {
-		_close_updateindex $fd $after
+		_close_updateindex $fd
+		unlock_index
+		uplevel #0 $after
 		return
 	}
 
@@ -156,7 +156,9 @@  proc write_update_index {fd path_list total_cnt batch after} {
 	global file_states current_diff_path
 
 	if {$update_index_cp >= $total_cnt} {
-		_close_updateindex $fd $after
+		_close_updateindex $fd
+		unlock_index
+		uplevel #0 $after
 		return
 	}
 
@@ -233,7 +235,8 @@  proc write_checkout_index {fd path_list total_cnt batch after} {
 	global file_states current_diff_path
 
 	if {$update_index_cp >= $total_cnt} {
-		_close_updateindex $fd $after
+		_close_updateindex $fd $do_unlock_index $after
+		uplevel #0 $after
 		return
 	}
 
@@ -393,61 +396,266 @@  proc revert_helper {txt paths} {
 
 	if {![lock_index begin-update]} return
 
+	# Common "after" functionality that waits until multiple asynchronous
+	# operations are complete (by waiting for them to activate their notes
+	# on the chord).
+	set after_chord [SimpleChord new {
+		unlock_index
+		if {$should_reshow_diff} { reshow_diff }
+		ui_ready
+	}]
+
+	$after_chord eval { set should_reshow_diff 0 }
+
+	# We don't know how many notes we're going to create (it's dynamic based
+	# on conditional paths below), so create a common note that will delay
+	# the chord's completion until we activate it, and then activate it
+	# after all the other notes have been created.
+	set after_common_note [$after_chord add_note]
+
 	set path_list [list]
+	set untracked_list [list]
 	set after {}
 	foreach path $paths {
 		switch -glob -- [lindex $file_states($path) 0] {
 		U? {continue}
+		?O {
+			lappend untracked_list $path
+		}
 		?M -
 		?T -
 		?D {
 			lappend path_list $path
 			if {$path eq $current_diff_path} {
-				set after {reshow_diff;}
+				$after_chord eval { set should_reshow_diff 1 }
 			}
 		}
 		}
 	}
 
+	set path_cnt [llength $path_list]
+	set untracked_cnt [llength $untracked_list]
 
-	# Split question between singular and plural cases, because
-	# such distinction is needed in some languages. Previously, the
-	# code used "Revert changes in" for both, but that can't work
-	# in languages where 'in' must be combined with word from
-	# rest of string (in different way for both cases of course).
-	#
-	# FIXME: Unfortunately, even that isn't enough in some languages
-	# as they have quite complex plural-form rules. Unfortunately,
-	# msgcat doesn't seem to support that kind of string translation.
-	#
-	set n [llength $path_list]
-	if {$n == 0} {
-		unlock_index
-		return
-	} elseif {$n == 1} {
-		set query [mc "Revert changes in file %s?" [short_path [lindex $path_list]]]
-	} else {
-		set query [mc "Revert changes in these %i files?" $n]
-	}
+	if {$path_cnt > 0} {
+		# Split question between singular and plural cases, because
+		# such distinction is needed in some languages. Previously, the
+		# code used "Revert changes in" for both, but that can't work
+		# in languages where 'in' must be combined with word from
+		# rest of string (in different way for both cases of course).
+		#
+		# FIXME: Unfortunately, even that isn't enough in some languages
+		# as they have quite complex plural-form rules. Unfortunately,
+		# msgcat doesn't seem to support that kind of string
+		# translation.
+		#
+		if {$path_cnt == 1} {
+			set query [mc \
+				"Revert changes in file %s?" \
+				[short_path [lindex $path_list]] \
+				]
+		} else {
+			set query [mc \
+				"Revert changes in these %i files?" \
+				$path_cnt]
+		}
 
-	set reply [tk_dialog \
-		.confirm_revert \
-		"[appname] ([reponame])" \
-		"$query
+		set reply [tk_dialog \
+			.confirm_revert \
+			"[appname] ([reponame])" \
+			"$query
 
 [mc "Any unstaged changes will be permanently lost by the revert."]" \
-		question \
-		1 \
-		[mc "Do Nothing"] \
-		[mc "Revert Changes"] \
-		]
-	if {$reply == 1} {
-		checkout_index \
-			$txt \
+			question \
+			1 \
+			[mc "Do Nothing"] \
+			[mc "Revert Changes"] \
+			]
+
+		if {$reply == 1} {
+			checkout_index \
+				$txt \
+				$path_list \
+				[$after_chord add_note]
+		}
+	}
+
+	if {$untracked_cnt > 0} {
+		# Split question between singular and plural cases, because
+		# such distinction is needed in some languages.
+		#
+		# FIXME: Unfortunately, even that isn't enough in some languages
+		# as they have quite complex plural-form rules. Unfortunately,
+		# msgcat doesn't seem to support that kind of string
+		# translation.
+		#
+		if {$untracked_cnt == 1} {
+			set query [mc \
+				"Delete untracked file %s?" \
+				[short_path [lindex $untracked_list]] \
+				]
+		} else {
+			set query [mc \
+				"Delete these %i untracked files?" \
+				$untracked_cnt \
+				]
+		}
+
+		set reply [tk_dialog \
+			.confirm_revert \
+			"[appname] ([reponame])" \
+			"$query
+
+[mc "Files will be permanently deleted."]" \
+			question \
+			1 \
+			[mc "Do Nothing"] \
+			[mc "Delete Files"] \
+			]
+
+		if {$reply == 1} {
+			$after_chord eval { set should_reshow_diff 1 }
+
+			delete_files $untracked_list [$after_chord add_note]
+		}
+	}
+
+	# Activate the common note. If no other notes were created, this
+	# completes the chord. If other notes were created, then this common
+	# note prevents a race condition where the chord might complete early.
+	$after_common_note
+}
+
+# Delete all of the specified files, performing deletion in batches to allow the
+# UI to remain responsive and updated.
+proc delete_files {path_list after} {
+	# Enable progress bar status updates
+	$::main_status start [mc "Deleting"] [mc "files"]
+
+	set path_index 0
+	set deletion_errors [list]
+	set batch_size 50
+
+	delete_helper \
+		$path_list \
+		$path_index \
+		$deletion_errors \
+		$batch_size \
+		$after
+}
+
+# Helper function to delete a list of files in batches. Each call deletes one
+# batch of files, and then schedules a call for the next batch after any UI
+# messages have been processed.
+proc delete_helper {path_list path_index deletion_errors batch_size after} {
+	global file_states
+
+	set path_cnt [llength $path_list]
+
+	set batch_remaining $batch_size
+
+	while {$batch_remaining > 0} {
+		if {$path_index >= $path_cnt} { break }
+
+		set path [lindex $path_list $path_index]
+
+		set deletion_failed [catch {file delete -- $path} deletion_error]
+
+		if {$deletion_failed} {
+			lappend deletion_errors [list "$deletion_error"]
+		} else {
+			remove_empty_directories [file dirname $path]
+
+			# Don't assume the deletion worked. Remove the file from
+			# the UI, but only if it no longer exists.
+			if {![path_exists $path]} {
+				unset file_states($path)
+				display_file $path __
+			}
+		}
+
+		incr path_index 1
+		incr batch_remaining -1
+	}
+
+	# Update the progress bar to indicate that this batch has been
+	# completed. The update will be visible when this procedure returns
+	# and allows the UI thread to process messages.
+	$::main_status update $path_index $path_cnt
+
+	if {$path_index < $path_cnt} {
+		# The Tcler's Wiki lists this as the best practice for keeping
+		# a UI active and processing messages during a long-running
+		# operation.
+
+		after idle [list after 0 [list \
+			delete_helper \
 			$path_list \
-			[concat $after [list ui_ready]]
+			$path_index \
+			$deletion_errors \
+			$batch_size \
+			$after
+			]]
 	} else {
-		unlock_index
+		# Finish the status bar operation.
+		$::main_status stop
+
+		# Report error, if any, based on how many deletions failed.
+		set deletion_error_cnt [llength $deletion_errors]
+
+		if {($deletion_error_cnt > 0) && ($deletion_error_cnt <= [MAX_VERBOSE_FILES_IN_DELETION_ERROR])} {
+			set error_text "Encountered errors deleting files:\n"
+
+			foreach deletion_error $deletion_errors {
+				append error_text "* [lindex $deletion_error 0]\n"
+			}
+
+			error_popup $error_text
+		} elseif {$deletion_error_cnt == $path_cnt} {
+			error_popup [mc \
+				"None of the %d selected files could be deleted." \
+				$path_cnt \
+				]
+		} elseif {$deletion_error_cnt > 1} {
+			error_popup [mc \
+				"%d of the %d selected files could not be deleted." \
+				$deletion_error_cnt \
+				$path_cnt \
+				]
+		}
+
+		uplevel #0 $after
+	}
+}
+
+proc MAX_VERBOSE_FILES_IN_DELETION_ERROR {} { return 10; }
+
+# This function is from the TCL documentation:
+#
+#   https://wiki.tcl-lang.org/page/file+exists
+#
+# [file exists] returns false if the path does exist but is a symlink to a path
+# that doesn't exist. This proc returns true if the path exists, regardless of
+# whether it is a symlink and whether it is broken.
+proc path_exists {name} {
+	expr {![catch {file lstat $name finfo}]}
+}
+
+# Remove as many empty directories as we can starting at the specified path,
+# walking up the directory tree. If we encounter a directory that is not
+# empty, or if a directory deletion fails, then we stop the operation and
+# return to the caller. Even if this procedure fails to delete any
+# directories at all, it does not report failure.
+proc remove_empty_directories {directory_path} {
+	set parent_path [file dirname $directory_path]
+
+	while {$parent_path != $directory_path} {
+		set contents [glob -nocomplain -dir $directory_path *]
+
+		if {[llength $contents] > 0} { break }
+		if {[catch {file delete -- $directory_path}]} { break }
+
+		set directory_path $parent_path
+		set parent_path [file dirname $directory_path]
 	}
 }