diff mbox series

[v3] clone: document partial clone section

Message ID pull.745.v3.git.git.1603768321361.gitgitgadget@gmail.com (mailing list archive)
State New, archived

Commit Message

Teng Long Oct. 27, 2020, 3:12 a.m. UTC
From: Dyrone Teng <dyroneteng@gmail.com>

Partial clones are created using 'git clone', but there is no related
help information in the git-clone documentation during a period. Add
a relevant section to help users understand what partial clones are
and how they differ from normal clones.

The section briefly introduces the applicable scenarios and some
precautions of partial clone. If users want to know more about its
technical design and other details, users can view the link of
git-partial-clone(7) according to the guidelines in the section.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
---
    clone: document partial clone section
    
    Partial clones are created using 'git clone', but there is no related
    help information in the git-clone documentation during a period. Add a
    relevant section to help users understand what partial clones are and
    how they differ from normal clones.
    
    The section briefly introduces the applicable scenarios and some
    precautions of partial clone. If users want to know more about its
    technical design and other details, users can view the link of
    git-partial-clone(7) according to the guidelines in the section.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-745%2Fdyrone%2Fmaster-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-745/dyrone/master-v3
Pull-Request: https://github.com/git/git/pull/745

Range-diff vs v2:

 1:  6f340d9aad < -:  ---------- partial-clone: set default filter with --partial
 2:  9baf4c8ba3 < -:  ---------- clone: document --partial and --filter options
 3:  c1a44a3509 ! 1:  681c5dcb79 clone: document partial clone section
     @@ Commit message
          clone: document partial clone section
      
          Partial clones are created using 'git clone', but there is no related
     -    help information in the git-clone documentation. Add a relevant section
     -    to help users understand what partial clones are and how they differ
     -    from normal clones.
     +    help information in the git-clone documentation during a period. Add
     +    a relevant section to help users understand what partial clones are
     +    and how they differ from normal clones.
      
          The section briefly introduces the applicable scenarios and some
          precautions of partial clone. If users want to know more about its
          technical design and other details, users can view the link of
          git-partial-clone(7) according to the guidelines in the section.
      
     -    Signed-off-by: Dyrone Teng <dyroneteng@gmail.com>
     +    Signed-off-by: Teng Long <dyroneteng@gmail.com>
      
       ## Documentation/git-clone.txt ##
      @@ Documentation/git-clone.txt: or `--mirror` is given)
     @@ Documentation/git-clone.txt: or `--mirror` is given)
      +-------------
      +
      +By default, `git clone` will download every reachable object, including
     -+every version of every file in the history of the repository. The
     -+**partial clone** feature allows Git to transfer fewer objects and
     -+request them from the remote only when they are needed, so some
     -+reachable objects can be omitted from the initial `git clone` and
     -+subsequent `git fetch` operations.
     ++every version of every file in the history of the repository. The **partial clone**
     ++feature allows Git to transfer fewer objects and request them from the
     ++remote only when they are needed, so some reachable objects can be
     ++omitted from the initial `git clone` and subsequent `git fetch`
     ++operations. In this way, a partial clone can reduce the network traffic
     ++costs and disk space usage when git is working under a large repository.
      +
      +To use the partial clone feature, you can run `git clone` with the 
     -+`--filter=<filter-spec>` option. If you want to clone a repository
     -+without download any blobs, the form `filter=blob:none` will omit all
     -+the blobs. If the repository has some large blobs and you want to
     -+prevent some large blobs being downloaded by an appropriate threshold,
     -+the form `--filter=blob:limit=<n>[kmg]`omits blobs larger than n bytes
     -+or units (see linkgit:git-rev-list[1]).
     ++`--filter=<filter-spec>` option. If the repository has a deep history
     ++and you don't want to download any blobs, the form `filter=blob:none`
     ++will omit all the blobs. If the repository has some large blobs and you
     ++want to prevent some large blobs being downloaded by an appropriate
     ++threshold, the form `--filter=blob:limit=<n>[kmg]` omits blobs larger
     ++than n bytes or units (see linkgit:git-rev-list[1]).
      +
     -+As mentioned before, a partially cloned repository may have to request
     -+the missing objects when they are needed. So some 'local' commands may
     -+fail without a network connection to the remote repository.
     ++When using a partial clone, Git will request missing objects from the
     ++remote(s) when necessary. Several commands that do not involve a request
     ++over a network may now trigger these requests.
      +
      +For example, The <repository> contains two branches which names 'master'
      +and 'topic. Then, we clone the repository by
     @@ Documentation/git-clone.txt: or `--mirror` is given)
      +were downloaded previously.
      +
      +`git log` may also make a surprise with partial clones. `git log
     -+-- <pathspec>` will not cause downloads with the blob filters, because
     -+it's only reading commits and trees. In addition to any options that
     -+require git to look at the contents of blobs, like "-p" and "--stat"
     -+, options that cause git to report pathnames, like "--summary" and
     -+"--raw", will trigger lazy/on-demand fetching of blobs, as they are
     -+needed to detect inexact renames.
     -+
     -+linkgit:partial-clone[1]
     ++--<path>` will not cause downloads with the blob filters, because it's
     ++only reading commits. `git log -p -- <path>` will download blobs to
     ++generate the patch output and git log --raw will download all blobs
     ++that changed at recent commits in order to compute renames.
      +
       :git-clone: 1
       include::urls.txt[]


 Documentation/git-clone.txt | 69 +++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)


base-commit: e1cfff676549cdcd702cbac105468723ef2722f4

Comments

Philippe Blain Oct. 27, 2020, 1:13 p.m. UTC | #1
Hi Dyrone,

> On Oct. 26, 2020, at 23:12, Teng Long via GitGitGadget <gitgitgadget@gmail.com> wrote:
> 
> From: Dyrone Teng <dyroneteng@gmail.com>
> 
> Partial clones are created using 'git clone', but there is no related
> help information in the git-clone documentation during a period. Add
> a relevant section to help users understand what partial clones are
> and how they differ from normal clones.
> 
> The section briefly introduces the applicable scenarios and some
> precautions of partial clone. If users want to know more about its
> technical design and other details, users can view the link of
> git-partial-clone(7) according to the guidelines in the section.
> 
> Signed-off-by: Teng Long <dyroneteng@gmail.com>
> ---
>    clone: document partial clone section
> 
>    Partial clones are created using 'git clone', but there is no related
>    help information in the git-clone documentation during a period. Add a
>    relevant section to help users understand what partial clones are and
>    how they differ from normal clones.
> 
>    The section briefly introduces the applicable scenarios and some
>    precautions of partial clone. If users want to know more about its
>    technical design and other details, users can view the link of
>    git-partial-clone(7) according to the guidelines in the section.

Since your series has just the one patch now, you don't need to add a description
in your GitGitGadget (GGG) PR. That's why it appears two times here:
the text above the '---' is the commit message, and the text below is the PR description.
In the context of a one-patch series, you can use this space to add additional info that
does not fit into the commit message, for example questions about your patch, etc.
It is also a good idea (and viewed positively by reviewers) to use it to add a summary of what changed
in your series since the last version you sent. I encourage you to read MyFirstContribution [1]
for a good tutorial on the contribution process. Also, GGG understands that if you end your
PR description with a line starting with "CC:" and an email address, further iterations of your 
series will be sent to those email addresses. So it would have been good to add Stolee in there, like this:

CC: Derrick Stolee <stolee@gmail.com>

(Junio prefers not to be directly CC'ed).

> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-745%2Fdyrone%2Fmaster-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-745/dyrone/master-v3
> Pull-Request: https://github.com/git/git/pull/745
> 
> Range-diff vs v2:
> 
> 1:  6f340d9aad < -:  ---------- partial-clone: set default filter with --partial
> 2:  9baf4c8ba3 < -:  ---------- clone: document --partial and --filter options
> 3:  c1a44a3509 ! 1:  681c5dcb79 clone: document partial clone section
>     @@ Commit message
>          clone: document partial clone section
> 
>          Partial clones are created using 'git clone', but there is no related
>     -    help information in the git-clone documentation. Add a relevant section
>     -    to help users understand what partial clones are and how they differ
>     -    from normal clones.
>     +    help information in the git-clone documentation during a period. Add
>     +    a relevant section to help users understand what partial clones are
>     +    and how they differ from normal clones.

It appears that you sent the same version of the patch as in v1, instead
of the one you sent in v2? You had removed "during a period" for v2,
but here it pops up again. You should check that you are sending the most
up-to-date version of your patch before sending v4.

I will not comment on the patch below, since it's not the most up-to-date.
I will send comments shortly on the v2 version by replying to [2] (the v2 version
of your patch).

Cheers,

Philippe.

[1] https://git-scm.com/docs/MyFirstContribution
[2] https://lore.kernel.org/git/c1a44a35095e7d681c312ecaa07c46e49f2fae67.1586791560.git.gitgitgadget@gmail.com/

Junio C Hamano Oct. 27, 2020, 6:51 p.m. UTC | #2
Philippe Blain <levraiphilippeblain@gmail.com> writes:

> Hi Dyrone,
>
>> On Oct. 26, 2020, at 23:12, Teng Long via GitGitGadget <gitgitgadget@gmail.com> wrote:
>> 
>> From: Dyrone Teng <dyroneteng@gmail.com>
>> 
>> Partial clones are created using 'git clone', but there is no related
>> help information in the git-clone documentation during a period. Add
>> a relevant section to help users understand what partial clones are
>> and how they differ from normal clones.
>> 
>> The section briefly introduces the applicable scenarios and some
>> precautions of partial clone. If users want to know more about its
>> technical design and other details, users can view the link of
>> git-partial-clone(7) according to the guidelines in the section.
>> 
>> Signed-off-by: Teng Long <dyroneteng@gmail.com>

Compare this line and "From:" we see above?
They need to match.

>> ---
>>    clone: document partial clone section
>> 
>>    Partial clones are created using 'git clone', but there is no related
>>    help information in the git-clone documentation during a period. Add a
>>    relevant section to help users understand what partial clones are and
>>    how they differ from normal clones.
>> 
>>    The section briefly introduces the applicable scenarios and some
>>    precautions of partial clone. If users want to know more about its
>>    technical design and other details, users can view the link of
>>    git-partial-clone(7) according to the guidelines in the section.
>
> Since your series has just the one patch now, you don't need to
> add a description in your GitGitGadget (GGG) PR. That's why it
> appears two times here: the text above the '---' is the commit
> message, and the text below is the PR description.

Nice.  We learn new things every day---I've always wondered where
the duplicated description we sometimes see comes from.

> In the context of a one-patch series, you can use this space to
> add additional info that does not fit into the commit message, for
> example questions about your patch, etc.  It is also a good idea
> (and viewed positively by reviewers) to use it to add a summary of
> what changed in your series since the last version you sent. I
> encourage you to read MyFirstContribution [1] for a good tutorial
> on the contribution process. Also, GGG understands that if you end
> your PR description with a line starting with "CC:" and an email
> address, further iterations of your series will be sent to those
> email addresses. So it would have been good to add Stolee in
> there, like this:
>
> CC: Derrick Stolee <stolee@gmail.com>

> (Junio prefers not to be directly CC'ed).

... unless it is the "final version with consensus among reviewers",
that is.

>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-745%2Fdyrone%2Fmaster-v3
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-745/dyrone/master-v3
>> Pull-Request: https://github.com/git/git/pull/745
>> 
>> Range-diff vs v2:
>> 
>> 1:  6f340d9aad < -:  ---------- partial-clone: set default filter with --partial
>> 2:  9baf4c8ba3 < -:  ---------- clone: document --partial and --filter options
>> 3:  c1a44a3509 ! 1:  681c5dcb79 clone: document partial clone section
>>     @@ Commit message
>>          clone: document partial clone section
>> 
>>          Partial clones are created using 'git clone', but there is no related
>>     -    help information in the git-clone documentation. Add a relevant section
>>     -    to help users understand what partial clones are and how they differ
>>     -    from normal clones.
>>     +    help information in the git-clone documentation during a period. Add
>>     +    a relevant section to help users understand what partial clones are
>>     +    and how they differ from normal clones.
>
> It appears that you sent the same version of the patch as in v1, instead
> of the one you sent in v2? You had removed "during a period" for v2,
> but here it pops up again. You should check that you are sending the most
> up-to-date version of your patch before sending v4.
>
> I will not comment on the patch below, since it's not the most up-to-date.
> I will send comments shortly on the v2 version by replying to [2] (the v2 version
> of your patch).
>
> Cheers,
>
> Philippe.
>
> [1] https://git-scm.com/docs/MyFirstContribution
> [2] https://lore.kernel.org/git/c1a44a35095e7d681c312ecaa07c46e49f2fae67.1586791560.git.gitgitgadget@gmail.com/

Patch

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index c898310099..15495675a8 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -308,6 +308,75 @@  or `--mirror` is given)
 	for `host.xz:foo/.git`).  Cloning into an existing directory
 	is only allowed if the directory is empty.
 
+Partial Clone
+-------------
+
+By default, `git clone` will download every reachable object, including
+every version of every file in the history of the repository. The **partial clone**
+feature allows Git to transfer fewer objects and request them from the
+remote only when they are needed, so some reachable objects can be
+omitted from the initial `git clone` and subsequent `git fetch`
+operations. In this way, a partial clone can reduce network traffic
+and disk space usage when Git is working with a large repository.
+
+To use the partial clone feature, you can run `git clone` with the
+`--filter=<filter-spec>` option. If the repository has a deep history
+and you don't want to download any blobs, the form `--filter=blob:none`
+will omit all the blobs. If the repository has some large blobs and you
+want to prevent blobs above a chosen size from being downloaded,
+the form `--filter=blob:limit=<n>[kmg]` omits blobs larger
+than n bytes or units (see linkgit:git-rev-list[1]).
+
+When using a partial clone, Git will request missing objects from the
+remote(s) when necessary. Several commands that normally do not involve
+a request over the network may now trigger these requests.
+
+For example, suppose the <repository> contains two branches named 'master'
+and 'topic'. Then, we clone the repository by
+
+    $ git clone --filter=blob:none --no-checkout <repository>
+
+With the `--filter=blob:none` option Git will omit all the blobs, and
+with the `--no-checkout` option Git will not perform a checkout of HEAD
+after the clone is complete. Then, we check out the remote-tracking
+'topic' branch by
+
+    $ git checkout -b topic origin/topic
+
+The output looks like
+
+------------
+    remote: Enumerating objects: 1, done.
+    remote: Counting objects: 100% (1/1), done.
+    remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
+    Receiving objects: 100% (1/1), 43 bytes | 43.00 KiB/s, done.
+    Branch 'topic' set up to track remote branch 'topic' from 'origin'.
+    Switched to a new branch 'topic'
+------------
+
+The output is a bit surprising, but it shows how partial clone works.
+When we check out the branch 'topic', Git will request the missing blobs
+because they are needed. Then, we can switch back to branch 'master' by
+
+    $ git checkout master
+
+This time the output looks like
+
+------------
+    Switched to branch 'master'
+    Your branch is up to date with 'origin/master'.
+------------
+
+It shows that when we switch back to the previous location, the checkout
+is done without a download because the repository has all the blobs that
+were downloaded previously.
+
+`git log` may also behave surprisingly with partial clones. `git log
+-- <path>` will not cause downloads with the blob filters, because it's
+only reading commits and trees. `git log -p -- <path>` will download blobs to
+generate the patch output, and `git log --raw` will download all blobs
+that changed in recent commits in order to compute renames.
+
 :git-clone: 1
 include::urls.txt[]
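
The `--filter=blob:limit=<n>[kmg]` form documented in the patch above takes an
optional size suffix. As a rough illustration of what such a threshold means in
bytes, the sketch below converts a limit value to a byte count. The helper name
`blob_limit_to_bytes` is hypothetical and not part of Git, and the 1024-based,
case-insensitive multipliers are an assumption about how Git treats the `k`,
`m`, and `g` suffixes; git-rev-list(1) remains the authoritative reference.

```shell
# Hypothetical helper (not part of Git): convert a blob:limit value such as
# "512", "10k", "1m" or "2g" to a byte count.
# Assumption: suffixes are case-insensitive multiples of 1024.
blob_limit_to_bytes() {
    value=$1
    case "$value" in
        # ${value%?} strips the trailing suffix character before multiplying.
        *[kK]) echo $(( ${value%?} * 1024 )) ;;
        *[mM]) echo $(( ${value%?} * 1024 * 1024 )) ;;
        *[gG]) echo $(( ${value%?} * 1024 * 1024 * 1024 )) ;;
        # No suffix: the value is already a byte count.
        *)     echo "$value" ;;
    esac
}

blob_limit_to_bytes 10k   # prints 10240
blob_limit_to_bytes 1m    # prints 1048576
```

Under these assumptions, `git clone --filter=blob:limit=1m <repository>` would
apply a threshold of 1048576 bytes when deciding which blobs to omit.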