From c5107b9ab23c4f1db26cc4e7602a761b49e1f0bf Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Mon, 15 Sep 2014 19:50:42 -0700 Subject: [PATCH 01/32] Ch11: First edit for two sections --- book/08-customizing-git/sections/hooks.asc | 1 + book/11-git-internals/sections/objects.asc | 40 +++++++++---------- .../sections/plumbing-porcelain.asc | 14 +++---- 3 files changed, 25 insertions(+), 30 deletions(-) diff --git a/book/08-customizing-git/sections/hooks.asc b/book/08-customizing-git/sections/hooks.asc index b67551c26..baa7e57ae 100644 --- a/book/08-customizing-git/sections/hooks.asc +++ b/book/08-customizing-git/sections/hooks.asc @@ -1,3 +1,4 @@ +[[_hooks]] === Git Hooks (((hooks))) diff --git a/book/11-git-internals/sections/objects.asc b/book/11-git-internals/sections/objects.asc index d7c525d46..3ee2fc9c5 100644 --- a/book/11-git-internals/sections/objects.asc +++ b/book/11-git-internals/sections/objects.asc @@ -10,10 +10,9 @@ First, you initialize a new Git repository and verify that there is nothing in t [source,shell] ---- -$ mkdir test -$ cd test -$ git init +$ git init test Initialized empty Git repository in /tmp/test/.git/ +$ cd test $ find .git/objects .git/objects .git/objects/info @@ -117,7 +116,7 @@ blob ==== Tree Objects -The next type you’ll look at is the tree object, which solves the problem of storing the filename and also allows you to store a group of files together. +The next type we'll look at is the tree, which solves the problem of storing the filename and also allows you to store a group of files together. Git stores content in a manner similar to a UNIX filesystem, but a bit simplified. All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents. A single tree object contains one or more tree entries, each of which contains a SHA-1 pointer to a blob or subtree with its associated mode, type, and filename. 
@@ -140,9 +139,8 @@ $ git cat-file -p 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0 100644 blob 47c6340d6459e05787f644c2447d2595f5d3a54b simplegit.rb ---- -Conceptually, the data that Git is storing is something like <<treefig_a>>. +Conceptually, the data that Git is storing is something like this: -[[treefig_a]] .Simple version of the Git data model. image::images/data-model-1.png[Simple version of the Git data model.] @@ -221,9 +219,8 @@ $ git cat-file -p 3c4e9cd789d88d8d89c1073707c3585e41b0e614 ---- If you created a working directory from the new tree you just wrote, you would get the two files in the top level of the working directory and a subdirectory named `bak` that contained the first version of the test.txt file. -You can think of the data that Git contains for these structures as being like <<treefig_b>>. +You can think of the data that Git contains for these structures as being like this: -[[treefig_b]] .The content structure of your current Git data. image::images/data-model-2.png[The content structure of your current Git data.] @@ -254,7 +251,7 @@ committer Scott Chacon 1243040974 -0700 first commit ---- -The format for a commit object is simple: it specifies the top-level tree for the snapshot of the project at that point; the author/committer information pulled from your `user.name` and `user.email` configuration settings, with the current timestamp; a blank line, and then the commit message. +The format for a commit object is simple: it specifies the top-level tree for the snapshot of the project at that point; the author/committer information (which uses your `user.name` and `user.email` configuration settings and a timestamp); a blank line, and then the commit message. 
Next, you’ll write the other two commit objects, each referencing the commit that came directly before it: @@ -278,8 +275,8 @@ Date: Fri May 22 18:15:24 2009 -0700 third commit -bak/test.txt | 1 + -1 files changed, 1 insertions(+), 0 deletions(-) + bak/test.txt | 1 + + 1 file changed, 1 insertion(+) commit cac0cab538b970a37ea1e769cbbde608743bc96d Author: Scott Chacon @@ -287,18 +284,18 @@ Date: Fri May 22 18:14:29 2009 -0700 second commit -new.txt | 1 + -test.txt | 2 +- -2 files changed, 2 insertions(+), 1 deletions(-) + new.txt | 1 + + test.txt | 2 +- + 2 files changed, 2 insertions(+), 1 deletion(-) commit fdf4fc3344e67ab068f836878b6c4951e3b15f3d Author: Scott Chacon Date: Fri May 22 18:09:34 2009 -0700 - first commit + first commit -test.txt | 1 + -1 files changed, 1 insertions(+), 0 deletions(-) + test.txt | 1 + + 1 file changed, 1 insertion(+) ---- Amazing. @@ -322,9 +319,8 @@ $ find .git/objects -type f .git/objects/fd/f4fc3344e67ab068f836878b6c4951e3b15f3d # commit 1 ---- -If you follow all the internal pointers, you get an object graph something like <<commitfig_a>>. +If you follow all the internal pointers, you get an object graph something like this: -[[commitfig_a]] .All the objects in your Git directory. image::images/data-model-3.png[All the objects in your Git directory.] @@ -348,7 +344,7 @@ Then, it adds a space followed by the size of the content and finally a null byt [source,shell] ---- >> header = "blob #{content.length}\0" -=> "blob 16\000" +=> "blob 16\u0000" ---- Git concatenates the header and the original content and then calculates the SHA-1 checksum of that new content. @@ -357,7 +353,7 @@ You can calculate the SHA-1 value of a string in Ruby by including the SHA1 dige [source,shell] ---- >> store = header + content -=> "blob 16\000what is up, doc?" +=> "blob 16\u0000what is up, doc?" 
>> require 'digest/sha1' => true >> sha1 = Digest::SHA1.hexdigest(store) @@ -372,7 +368,7 @@ First, you need to require the library and then run `Zlib::Deflate.deflate()` on >> require 'zlib' => true >> zlib_content = Zlib::Deflate.deflate(store) -=> "x\234K\312\311OR04c(\317H,Q\310,V(-\320QH\311O\266\a\000_\034\a\235" +=> "x\x9CK\xCA\xC9OR04c(\xCFH,Q\xC8,V(-\xD0QH\xC9O\xB6\a\x00_\x1C\a\x9D" ---- Finally, you’ll write your zlib-deflated content to an object on disk. diff --git a/book/11-git-internals/sections/plumbing-porcelain.asc b/book/11-git-internals/sections/plumbing-porcelain.asc index d051a1a26..21defed5f 100644 --- a/book/11-git-internals/sections/plumbing-porcelain.asc +++ b/book/11-git-internals/sections/plumbing-porcelain.asc @@ -5,7 +5,7 @@ But because Git was initially a toolkit for a VCS rather than a full user-friend These commands are generally referred to as ``plumbing'' commands, and the more user-friendly commands are called ``porcelain'' commands. The book’s first eight chapters deal almost exclusively with porcelain commands. -But in this chapter, you’ll be dealing mostly with the lower-level plumbing commands, because they give you access to the inner workings of Git and help demonstrate how and why Git does what it does. +But in this chapter, you’ll be dealing mostly with the lower-level plumbing commands, because they give you access to the inner workings of Git, and help demonstrate how and why Git does what it does. These commands aren’t meant to be used manually on the command line, but rather to be used as building blocks for new tools and custom scripts. When you run `git init` in a new or existing directory, Git creates the `.git` directory, which is where almost everything that Git stores and manipulates is located. 
@@ -15,24 +15,22 @@ Here’s what it looks like: [source,shell] ---- -$ ls +$ ls -F1 HEAD -branches/ -config +config* description hooks/ -index info/ objects/ refs/ ---- You may see some other files in there, but this is a fresh `git init` repository – it’s what you see by default. -The `branches` directory isn’t used by newer Git versions, and the `description` file is only used by the GitWeb program, so don’t worry about those. +The `description` file is only used by the GitWeb program, so don’t worry about it. The `config` file contains your project-specific configuration options, and the `info` directory keeps a global exclude file for ignored patterns that you don’t want to track in a .gitignore file. -The `hooks` directory contains your client- or server-side hook scripts, which are discussed in detail in <<_customizing_git>>. +The `hooks` directory contains your client- or server-side hook scripts, which are discussed in detail in <<_hooks>>. -This leaves four important entries: the `HEAD` and `index` files and the `objects` and `refs` directories. +This leaves four important entries: the `HEAD` and (yet to be created) `index` files, and the `objects` and `refs` directories. These are the core parts of Git. The `objects` directory stores all the content for your database, the `refs` directory stores pointers into commit objects in that data (branches), the `HEAD` file points to the branch you currently have checked out, and the `index` file is where Git stores your staging area information. You’ll now look at each of these sections in detail to see how Git operates. 
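As a companion to the objects.asc hunks in this patch, the object-storage steps they touch (header, SHA-1, zlib deflate, object path) can be condensed into one Ruby script. This is a minimal sketch and not part of the patch: the helper names `git_blob_store`, `git_blob_sha1`, and `loose_object_path` are hypothetical, it uses `bytesize` where the irb session uses `content.length` (the two agree for ASCII content such as this), and it deliberately writes nothing to disk.

```ruby
require 'digest/sha1'
require 'zlib'

# Hypothetical helper: the bytes Git hashes for a blob are
# "blob <size in bytes>\0" followed by the raw content.
def git_blob_store(content)
  "blob #{content.bytesize}\0#{content}"
end

# Hypothetical helper: the object name is the SHA-1 of header + content.
def git_blob_sha1(content)
  Digest::SHA1.hexdigest(git_blob_store(content))
end

# Hypothetical helper: loose objects live under a directory named after
# the first two hex characters, in a file named after the remaining 38.
def loose_object_path(sha1)
  File.join('.git', 'objects', sha1[0, 2], sha1[2, 38])
end

content  = "what is up, doc?"
store    = git_blob_store(content)
sha1     = git_blob_sha1(content)
deflated = Zlib::Deflate.deflate(store)

puts sha1
puts loose_object_path(sha1)
# Round-trip check: inflating the deflated bytes gives back header + content.
puts Zlib::Inflate.inflate(deflated).bytes == store.bytes
```

For the string used in the chapter, the printed name should be the same `bd9dbf5aae1a3862dd1526723246b20206e5fc37` that later shows up under `.git/objects/bd/` in the packfiles examples.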
From 5561fa78ff4cc239323d566075453b5f5d85cb97 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Tue, 16 Sep 2014 18:51:01 -0700 Subject: [PATCH 02/32] Fix broken reference --- book/01-introduction/sections/basics.asc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/book/01-introduction/sections/basics.asc b/book/01-introduction/sections/basics.asc index 6c24204d8..0c90330fd 100644 --- a/book/01-introduction/sections/basics.asc +++ b/book/01-introduction/sections/basics.asc @@ -68,7 +68,7 @@ It is hard to get the system to do anything that is not undoable or to make it e As in any VCS, you can lose or mess up changes you haven’t committed yet; but after you commit a snapshot into Git, it is very difficult to lose, especially if you regularly push your database to another repository. This makes using Git a joy because we know we can experiment without the danger of severely screwing things up. -For a more in-depth look at how Git stores its data and how you can recover data that seems lost, see <<_git_objects>>. +For a more in-depth look at how Git stores its data and how you can recover data that seems lost, see <<_undoing>>. ==== The Three States From 835cde9bdd8fcd563f097422a7cc367efd832cf8 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Tue, 16 Sep 2014 19:23:13 -0700 Subject: [PATCH 03/32] Packfiles and refs --- book/11-git-internals/sections/objects.asc | 1 + book/11-git-internals/sections/packfiles.asc | 104 ++++++++++--------- book/11-git-internals/sections/refs.asc | 19 ++-- 3 files changed, 63 insertions(+), 61 deletions(-) diff --git a/book/11-git-internals/sections/objects.asc b/book/11-git-internals/sections/objects.asc index 3ee2fc9c5..7af645526 100644 --- a/book/11-git-internals/sections/objects.asc +++ b/book/11-git-internals/sections/objects.asc @@ -1,3 +1,4 @@ +[[_objects]] === Git Objects Git is a content-addressable filesystem. 
diff --git a/book/11-git-internals/sections/packfiles.asc b/book/11-git-internals/sections/packfiles.asc index 79bd9b67a..243161148 100644 --- a/book/11-git-internals/sections/packfiles.asc +++ b/book/11-git-internals/sections/packfiles.asc @@ -21,15 +21,15 @@ $ find .git/objects -type f Git compresses the contents of these files with zlib, and you’re not storing much, so all these files collectively take up only 925 bytes. You’ll add some larger content to the repository to demonstrate an interesting feature of Git. -Add the repo.rb file from the Grit library you worked with earlier – this is about a 12K source code file: +Add the repo.rb file from the Grit library you worked with earlier – this is about a 22K source code file: [source,shell] ---- -$ curl http://github.com/mojombo/grit/raw/master/lib/grit/repo.rb > repo.rb +$ curl https://raw.githubusercontent.com/mojombo/grit/master/lib/grit/repo.rb > repo.rb $ git add repo.rb $ git commit -m 'added repo.rb' [master 484a592] added repo.rb - 3 files changed, 459 insertions(+), 2 deletions(-) + 3 files changed, 709 insertions(+), 2 deletions(-) delete mode 100644 bak/test.txt create mode 100644 repo.rb rewrite test.txt (100%) @@ -41,7 +41,7 @@ If you look at the resulting tree, you can see the SHA-1 value your repo.rb file ---- $ git cat-file -p master^{tree} 100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt -100644 blob 9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e repo.rb +100644 blob 033b4468fa6b2a9547a70d88d1bbe8bf3f9ed0d5 repo.rb 100644 blob e3f094f522629ae358806b17daf78246c27c007b test.txt ---- @@ -49,8 +49,8 @@ You can then use `git cat-file` to see how big that object is: [source,shell] ---- -$ git cat-file -s 9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e -12898 +$ git cat-file -s 033b4468fa6b2a9547a70d88d1bbe8bf3f9ed0d5 +22044 ---- Now, modify that file a little, and see what happens: @@ -59,8 +59,8 @@ Now, modify that file a little, and see what happens: ---- $ echo '# testing' >> repo.rb $ git commit 
-am 'modified repo a bit' -[master ab1afef] modified repo a bit - 1 files changed, 1 insertions(+), 0 deletions(-) +[master 2431da6] modified repo.rb a bit + 1 file changed, 1 insertion(+) ---- Check the tree created by that commit, and you see something interesting: @@ -69,7 +69,7 @@ Check the tree created by that commit, and you see something interesting: ---- $ git cat-file -p master^{tree} 100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt -100644 blob 05408d195263d853f09dca71d55116663690c27c repo.rb +100644 blob b042a60ef7dff760008df33cee372b945b6e884e repo.rb 100644 blob e3f094f522629ae358806b17daf78246c27c007b test.txt ---- @@ -77,15 +77,15 @@ The blob is now a different blob, which means that although you added only a sin [source,shell] ---- -$ git cat-file -s 05408d195263d853f09dca71d55116663690c27c -12908 +$ git cat-file -s b042a60ef7dff760008df33cee372b945b6e884e +22054 ---- -You have two nearly identical 12K objects on your disk. +You have two nearly identical 22K objects on your disk. Wouldn’t it be nice if Git could store one of them in full but then the second object only as the delta between it and the first? It turns out that it can. -The initial format in which Git saves objects on disk is called a loose object format. +The initial format in which Git saves objects on disk is called a ``loose'' object format. However, occasionally Git packs up several of these objects into a single binary file called a packfile in order to save space and be more efficient. Git does this if you have too many loose objects around, if you run the `git gc` command manually, or if you push to a remote server. To see what happens, you can manually ask Git to pack up the objects by calling the `git gc` command: @@ -93,11 +93,11 @@ To see what happens, you can manually ask Git to pack up the objects by calling [source,shell] ---- $ git gc -Counting objects: 17, done. -Delta compression using 2 threads. -Compressing objects: 100% (13/13), done. 
-Writing objects: 100% (17/17), done. -Total 17 (delta 1), reused 10 (delta 0) +Counting objects: 18, done. +Delta compression using up to 8 threads. +Compressing objects: 100% (14/14), done. +Writing objects: 100% (18/18), done. +Total 18 (delta 3), reused 0 (delta 0) ---- If you look in your objects directory, you’ll find that most of your objects are gone, and a new pair of files has appeared: @@ -105,11 +105,11 @@ If you look in your objects directory, you’ll find that most of your objects a [source,shell] ---- $ find .git/objects -type f -.git/objects/71/08f7ecb345ee9d0084193f147cdad4d2998293 +.git/objects/bd/9dbf5aae1a3862dd1526723246b20206e5fc37 .git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 .git/objects/info/packs -.git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.idx -.git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.pack +.git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.idx +.git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.pack ---- The objects that remain are the blobs that aren’t pointed to by any commit – in this case, the ``what is up, doc?'' @@ -119,8 +119,8 @@ Because you never added them to any commits, they’re considered dangling and a The other files are your new packfile and an index. The packfile is a single file containing the contents of all the objects that were removed from your filesystem. The index is a file that contains offsets into that packfile so you can quickly seek to a specific object. -What is cool is that although the objects on disk before you ran the `gc` were collectively about 12K in size, the new packfile is only 6K. -You’ve halved your disk usage by packing your objects. +What is cool is that although the objects on disk before you ran the `gc` were collectively about 22K in size, the new packfile is only 7K. +You’ve cut your disk usage by ⅔ by packing your objects. How does Git do this? 
When Git packs objects, it looks for files that are named and sized similarly, and stores just the deltas from one version of the file to the next. @@ -129,34 +129,36 @@ The `git verify-pack` plumbing command allows you to see what was packed up: [source,shell] ---- -$ git verify-pack -v \ - .git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.idx -0155eb4229851634a0f03eb265b69f5a2d56f341 tree 71 76 5400 -05408d195263d853f09dca71d55116663690c27c blob 12908 3478 874 -09f01cea547666f58d6a8d809583841a7c6f0130 tree 106 107 5086 -1a410efbd13591db07496601ebc7a059dd55cfe9 commit 225 151 322 -1f7a7a472abf3dd9643fd615f6da379c4acb3e3a blob 10 19 5381 -3c4e9cd789d88d8d89c1073707c3585e41b0e614 tree 101 105 5211 -484a59275031909e19aadb7c92262719cfcdf19a commit 226 153 169 -83baae61804e65cc73a7201a7252750c76066a30 blob 10 19 5362 -9585191f37f7b0fb9444f35a9bf50de191beadc2 tag 136 127 5476 -9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e blob 7 18 5193 1 -05408d195263d853f09dca71d55116663690c27c \ - ab1afef80fac8e34258ff41fc1b867c702daa24b commit 232 157 12 -cac0cab538b970a37ea1e769cbbde608743bc96d commit 226 154 473 -d8329fc1cc938780ffdd9f94e0d364e0ea74f579 tree 36 46 5316 -e3f094f522629ae358806b17daf78246c27c007b blob 1486 734 4352 -f8f51d7d8a1760462eca26eebafde32087499533 tree 106 107 749 -fa49b077972391ad58037050f2a75f74e3671e92 blob 9 18 856 -fdf4fc3344e67ab068f836878b6c4951e3b15f3d commit 177 122 627 -chain length = 1: 1 object -pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.pack: ok ----- - -Here, the `9bc1d` blob, which if you remember was the first version of your repo.rb file, is referencing the `05408` blob, which was the second version of the file. -The third column in the output is the size of the object in the pack, so you can see that `05408` takes up 12K of the file but that `9bc1d` only takes up 7 bytes. 
+$ git verify-pack -v .git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.idx +2431da676938450a4d72e260db3bf7b0f587bbc1 commit 223 155 12 +69bcdaff5328278ab1c0812ce0e07fa7d26a96d7 commit 214 152 167 +80d02664cb23ed55b226516648c7ad5d0a3deb90 commit 214 145 319 +43168a18b7613d1281e5560855a83eb8fde3d687 commit 213 146 464 +092917823486a802e94d727c820a9024e14a1fc2 commit 214 146 610 +702470739ce72005e2edff522fde85d52a65df9b commit 165 118 756 +d368d0ac0678cbe6cce505be58126d3526706e54 tag 130 122 874 +fe879577cb8cffcdf25441725141e310dd7d239b tree 136 136 996 +d8329fc1cc938780ffdd9f94e0d364e0ea74f579 tree 36 46 1132 +deef2e1b793907545e50a2ea2ddb5ba6c58c4506 tree 136 136 1178 +d982c7cb2c2a972ee391a85da481fc1f9127a01d tree 6 17 1314 1 \ + deef2e1b793907545e50a2ea2ddb5ba6c58c4506 +3c4e9cd789d88d8d89c1073707c3585e41b0e614 tree 8 19 1331 1 \ + deef2e1b793907545e50a2ea2ddb5ba6c58c4506 +0155eb4229851634a0f03eb265b69f5a2d56f341 tree 71 76 1350 +83baae61804e65cc73a7201a7252750c76066a30 blob 10 19 1426 +fa49b077972391ad58037050f2a75f74e3671e92 blob 9 18 1445 +b042a60ef7dff760008df33cee372b945b6e884e blob 22054 5799 1463 +033b4468fa6b2a9547a70d88d1bbe8bf3f9ed0d5 blob 9 20 7262 1 \ + b042a60ef7dff760008df33cee372b945b6e884e +1f7a7a472abf3dd9643fd615f6da379c4acb3e3a blob 10 19 7282 +non delta: 15 objects +chain length = 1: 3 objects +.git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.pack: ok +---- + +Here, the `033b4` blob, which if you remember was the first version of your repo.rb file, is referencing the `b042a` blob, which was the second version of the file. +The third column in the output is the size of the object in the pack, so you can see that `b042a` takes up 22K of the file, but that `033b4` only takes up 9 bytes. 
What is also interesting is that the second version of the file is the one that is stored intact, whereas the original version is stored as a delta – this is because you’re most likely to need faster access to the most recent version of the file. The really nice thing about this is that it can be repacked at any time. -Git will occasionally repack your database automatically, always trying to save more space. -You can also manually repack at any time by running `git gc` by hand. +Git will occasionally repack your database automatically, always trying to save more space, but you can also manually repack at any time by running `git gc` by hand. diff --git a/book/11-git-internals/sections/refs.asc b/book/11-git-internals/sections/refs.asc index 4eeb3c13a..86d52e63b 100644 --- a/book/11-git-internals/sections/refs.asc +++ b/book/11-git-internals/sections/refs.asc @@ -3,7 +3,7 @@ You can run something like `git log 1a410e` to look through your whole history, but you still have to remember that `1a410e` is the last commit in order to walk that history to find all those objects. You need a file in which you can store the SHA-1 value under a simple name so you can use that pointer rather than the raw SHA-1 value. -In Git, these are called ``references'' or ``refs''; you can find the files that contain the SHA-1 values in the `.git/refs` directory. +In Git, these are called ``references'' or ``refs;'' you can find the files that contain the SHA-1 values in the `.git/refs` directory. In the current project, this directory contains no files, but it does contain a simple structure: [source,shell] @@ -57,9 +57,8 @@ cac0cab538b970a37ea1e769cbbde608743bc96d second commit fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit ---- -Now, your Git database conceptually looks something like <<reffig_a>>. +Now, your Git database conceptually looks something like this: -[[reffig_a]] .Git directory objects with branch head references included. 
image::images/data-model-4.png[Git directory objects with branch head references included.] @@ -70,7 +69,7 @@ When you run commands like `git branch (branchname)`, Git basically runs that `u The question now is, when you run `git branch (branchname)`, how does Git know the SHA-1 of the last commit? The answer is the HEAD file. The HEAD file is a symbolic reference to the branch you’re currently on. -By symbolic reference, I mean that unlike a normal reference, it doesn’t generally contain a SHA-1 value but rather a pointer to another reference. +Unlike a normal reference, a symbolic reference doesn’t contain a SHA-1 value, but rather a pointer to another reference. If you look at the file, you’ll normally see something like this: [source,shell] @@ -117,7 +116,7 @@ fatal: Refusing to point HEAD outside of refs/ ==== Tags -You’ve just gone over Git’s three main object types, but there is a fourth. +We just finished discussing Git’s three main object types, but there is a fourth. The tag object is very much like a commit object – it contains a tagger, a date, a message, and a pointer. The main difference is that a tag object points to a commit rather than a tree. It’s like a branch reference, but it never moves – it always points to the same commit but gives it a friendlier name. @@ -130,7 +129,7 @@ You can make a lightweight tag by running something like this: $ git update-ref refs/tags/v1.0 cac0cab538b970a37ea1e769cbbde608743bc96d ---- -That is all a lightweight tag is – a branch that never moves. +That is all a lightweight tag is – a reference that never moves. An annotated tag is more complex, however. If you create an annotated tag, Git creates a tag object and then writes a reference to point to it rather than directly to the commit. You can see this by creating an annotated tag (`-a` specifies that it’s an annotated tag): @@ -164,14 +163,13 @@ test tag Notice that the object entry points to the commit SHA-1 value that you tagged. 
Also notice that it doesn’t need to point to a commit; you can tag any Git object. In the Git source code, for example, the maintainer has added their GPG public key as a blob object and then tagged it. -You can view the public key by running +You can view the public key by running this in a clone of the Git repository: [source,shell] ---- $ git cat-file blob junio-gpg-pub ---- -in the Git source code repository. The Linux kernel repository also has a non-commit-pointing tag object – the first tag created points to the initial tree of the import of the source code. ==== Remotes @@ -200,5 +198,6 @@ $ cat .git/refs/remotes/origin/master ca82a6dff817ec66f44342007202690a93763949 ---- -Remote references differ from branches (`refs/heads` references) mainly in that they can’t be checked out. -Git moves them around as bookmarks to the last known state of where those branches were on those servers. +Remote references differ from branches (`refs/heads` references) mainly in that they're considered read-only. +You can `git checkout` to one, but Git won't point HEAD at one, so you'll never update it with a `commit` command. +Git manages them as bookmarks to the last known state of where those branches were on those servers. 
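The refs.asc text in this patch describes HEAD as a symbolic reference: a file that holds a pointer to another reference, which in turn holds a SHA-1. A minimal Ruby sketch of that lookup follows. It is not part of the patch, the helper name `resolve_ref` and its `max_depth` parameter are hypothetical, and it assumes loose ref files only (real Git also consults `packed-refs`, which this sketch ignores). It builds a throwaway directory instead of touching a real repository.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: follow "ref: <target>" indirections, starting at
# `name` inside `git_dir`, until a file holding a raw SHA-1 is reached.
# Assumption: loose ref files only; packed-refs is not read.
def resolve_ref(git_dir, name, max_depth = 5)
  max_depth.times do
    value = File.read(File.join(git_dir, name)).strip
    return value unless value.start_with?('ref: ')
    name = value.sub(/\Aref: /, '')
  end
  raise 'too many levels of symbolic references'
end

# Build a fake .git layout: HEAD points at a branch, the branch at a commit.
Dir.mktmpdir do |git_dir|
  FileUtils.mkdir_p(File.join(git_dir, 'refs', 'heads'))
  File.write(File.join(git_dir, 'HEAD'), "ref: refs/heads/master\n")
  File.write(File.join(git_dir, 'refs', 'heads', 'master'),
             "cac0cab538b970a37ea1e769cbbde608743bc96d\n")
  puts resolve_ref(git_dir, 'HEAD')
end
```

The same function returns the SHA-1 directly when given a plain reference such as `refs/heads/master`, which is the difference between a normal and a symbolic reference that the rewritten sentence in this patch spells out.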
From d3d877e8a1f62e59e50f8016a13114ba96e3d50a Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Tue, 16 Sep 2014 19:24:30 -0700 Subject: [PATCH 04/32] Replace single smart-quotes --- book/01-introduction/1-introduction.asc | 6 +- .../sections/about-version-control.asc | 14 +-- book/01-introduction/sections/basics.asc | 30 +++--- .../sections/first-time-setup.asc | 16 ++-- book/01-introduction/sections/help.asc | 2 +- book/01-introduction/sections/history.asc | 4 +- book/01-introduction/sections/installing.asc | 8 +- book/02-git-basics/1-git-basics.asc | 6 +- book/02-git-basics/sections/aliases.asc | 10 +- .../sections/getting-a-repository.asc | 10 +- .../sections/recording-changes.asc | 92 +++++++++---------- book/02-git-basics/sections/remotes.asc | 40 ++++---- book/02-git-basics/sections/tagging.asc | 22 ++--- book/02-git-basics/sections/undoing.asc | 20 ++-- .../sections/viewing-history.asc | 22 ++--- book/04-git-server/1-git-server.asc | 14 +-- .../sections/generating-ssh-key.asc | 14 +-- book/04-git-server/sections/git-daemon.asc | 12 +-- .../sections/git-on-a-server.asc | 22 ++--- book/04-git-server/sections/gitweb.asc | 12 +-- book/04-git-server/sections/hosted.asc | 4 +- book/04-git-server/sections/protocols.asc | 58 ++++++------ .../sections/setting-up-server.asc | 24 ++--- book/06-github/sections/projects.asc | 6 +- book/11-git-internals/1-git-internals.asc | 20 ++-- .../11-git-internals/sections/maintenance.asc | 64 ++++++------- book/11-git-internals/sections/objects.asc | 38 ++++---- book/11-git-internals/sections/packfiles.asc | 18 ++-- .../sections/plumbing-porcelain.asc | 16 ++-- book/11-git-internals/sections/refs.asc | 26 +++--- book/11-git-internals/sections/refspec.asc | 14 +-- .../sections/transfer-protocols.asc | 32 +++---- proposal.md | 8 +- 33 files changed, 352 insertions(+), 352 deletions(-) diff --git a/book/01-introduction/1-introduction.asc b/book/01-introduction/1-introduction.asc index a094bad6b..516505765 100644 --- 
a/book/01-introduction/1-introduction.asc +++ b/book/01-introduction/1-introduction.asc @@ -21,6 +21,6 @@ include::sections/help.asc[] === Summary -You should have a basic understanding of what Git is and how it’s different from the centralized version control system you may have previously been using. -You should also now have a working version of Git on your system that’s set up with your personal identity. -It’s now time to learn some Git basics. +You should have a basic understanding of what Git is and how it's different from the centralized version control system you may have previously been using. +You should also now have a working version of Git on your system that's set up with your personal identity. +It's now time to learn some Git basics. diff --git a/book/01-introduction/sections/about-version-control.asc b/book/01-introduction/sections/about-version-control.asc index d1a6438bb..89f70e79e 100644 --- a/book/01-introduction/sections/about-version-control.asc +++ b/book/01-introduction/sections/about-version-control.asc @@ -13,9 +13,9 @@ In addition, you get all this for very little overhead. ==== Local Version Control Systems (((version control,local))) -Many people’s version-control method of choice is to copy files into another directory (perhaps a time-stamped directory, if they’re clever). +Many people's version-control method of choice is to copy files into another directory (perhaps a time-stamped directory, if they're clever). This approach is very common because it is so simple, but it is also incredibly error prone. -It is easy to forget which directory you’re in and accidentally write to the wrong file or copy over files you don’t mean to. +It is easy to forget which directory you're in and accidentally write to the wrong file or copy over files you don't mean to. To deal with this issue, programmers long ago developed local VCSs that had a simple database that kept all the changes to files under revision control. 
@@ -39,19 +39,19 @@ image::images/centralized.png[Centralized version control diagram] This setup offers many advantages, especially over local VCSs. For example, everyone knows to a certain degree what everyone else on the project is doing. -Administrators have fine-grained control over who can do what; and it’s far easier to administer a CVCS than it is to deal with local databases on every client. +Administrators have fine-grained control over who can do what; and it's far easier to administer a CVCS than it is to deal with local databases on every client. However, this setup also has some serious downsides. The most obvious is the single point of failure that the centralized server represents. -If that server goes down for an hour, then during that hour nobody can collaborate at all or save versioned changes to anything they’re working on. -If the hard disk the central database is on becomes corrupted, and proper backups haven’t been kept, you lose absolutely everything – the entire history of the project except whatever single snapshots people happen to have on their local machines. +If that server goes down for an hour, then during that hour nobody can collaborate at all or save versioned changes to anything they're working on. +If the hard disk the central database is on becomes corrupted, and proper backups haven't been kept, you lose absolutely everything – the entire history of the project except whatever single snapshots people happen to have on their local machines. Local VCS systems suffer from this same problem – whenever you have the entire history of the project in a single place, you risk losing everything. ==== Distributed Version Control Systems (((version control,distributed))) This is where Distributed Version Control Systems (DVCSs) step in. -In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don’t just check out the latest snapshot of the files: they fully mirror the repository. 
+In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don't just check out the latest snapshot of the files: they fully mirror the repository. Thus if any server dies, and these systems were collaborating via it, any of the client repositories can be copied back up to the server to restore it. Every checkout is really a full backup of all the data. @@ -59,4 +59,4 @@ Every checkout is really a full backup of all the data. image::images/distributed.png[Distributed version control diagram] Furthermore, many of these systems deal pretty well with having several remote repositories they can work with, so you can collaborate with different groups of people in different ways simultaneously within the same project. -This allows you to set up several types of workflows that aren’t possible in centralized systems, such as hierarchical models. +This allows you to set up several types of workflows that aren't possible in centralized systems, such as hierarchical models. diff --git a/book/01-introduction/sections/basics.asc b/book/01-introduction/sections/basics.asc index 0c90330fd..66b45f43b 100644 --- a/book/01-introduction/sections/basics.asc +++ b/book/01-introduction/sections/basics.asc @@ -14,10 +14,10 @@ These systems (CVS, Subversion, Perforce, Bazaar, and so on) think of the inform .Storing data as changes to a base version of each file. image::images/deltas.png[Storing data as changes to a base version of each file.] -Git doesn’t think of or store its data this way. +Git doesn't think of or store its data this way. Instead, Git thinks of its data more like a set of snapshots of a miniature filesystem. Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. -To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored. 
+To be efficient, if files have not changed, Git doesn't store the file again, just a link to the previous identical file it has already stored. Git thinks about its data more like a *stream of snapshots*. .Storing data as snapshots of the project over time. @@ -26,31 +26,31 @@ image::images/snapshots.png[Git stores data as snapshots of the project over tim This is an important distinction between Git and nearly all other VCSs. It makes Git reconsider almost every aspect of version control that most other systems copied from the previous generation. This makes Git more like a mini filesystem with some incredibly powerful tools built on top of it, rather than simply a VCS. -We’ll explore some of the benefits you gain by thinking of your data this way when we cover Git branching in <<_git_branching>>. +We'll explore some of the benefits you gain by thinking of your data this way when we cover Git branching in <<_git_branching>>. ==== Nearly Every Operation Is Local Most operations in Git only need local files and resources to operate – generally no information is needed from another computer on your network. -If you’re used to a CVCS where most operations have that network latency overhead, this aspect of Git will make you think that the gods of speed have blessed Git with unworldly powers. +If you're used to a CVCS where most operations have that network latency overhead, this aspect of Git will make you think that the gods of speed have blessed Git with unworldly powers. Because you have the entire history of the project right there on your local disk, most operations seem almost instantaneous. -For example, to browse the history of the project, Git doesn’t need to go out to the server to get the history and display it for you – it simply reads it directly from your local database. 
+For example, to browse the history of the project, Git doesn't need to go out to the server to get the history and display it for you – it simply reads it directly from your local database. This means you see the project history almost instantly. If you want to see the changes introduced between the current version of a file and the file a month ago, Git can look up the file a month ago and do a local difference calculation, instead of having to either ask a remote server to do it or pull an older version of the file from the remote server to do it locally. -This also means that there is very little you can’t do if you’re offline or off VPN. +This also means that there is very little you can't do if you're offline or off VPN. If you get on an airplane or a train and want to do a little work, you can commit happily until you get to a network connection to upload. -If you go home and can’t get your VPN client working properly, you can still work. +If you go home and can't get your VPN client working properly, you can still work. In many other systems, doing so is either impossible or painful. -In Perforce, for example, you can’t do much when you aren’t connected to the server; and in Subversion and CVS, you can edit files, but you can’t commit changes to your database (because your database is offline). +In Perforce, for example, you can't do much when you aren't connected to the server; and in Subversion and CVS, you can edit files, but you can't commit changes to your database (because your database is offline). This may not seem like a huge deal, but you may be surprised what a big difference it can make. ==== Git Has Integrity Everything in Git is check-summed before it is stored and is then referred to by that checksum. -This means it’s impossible to change the contents of any file or directory without Git knowing about it. +This means it's impossible to change the contents of any file or directory without Git knowing about it. 
This functionality is built into Git at the lowest levels and is integral to its philosophy. -You can’t lose information in transit or get file corruption without Git being able to detect it. +You can't lose information in transit or get file corruption without Git being able to detect it. The mechanism that Git uses for this checksumming is called a SHA-1 hash.(((SHA-1))) This is a 40-character string composed of hexadecimal characters (0–9 and a–f) and calculated based on the contents of a file or directory structure in Git. @@ -65,7 +65,7 @@ In fact, Git stores everything in its database not by file name but by the hash When you do actions in Git, nearly all of them only add data to the Git database. It is hard to get the system to do anything that is not undoable or to make it erase data in any way. -As in any VCS, you can lose or mess up changes you haven’t committed yet; but after you commit a snapshot into Git, it is very difficult to lose, especially if you regularly push your database to another repository. +As in any VCS, you can lose or mess up changes you haven't committed yet; but after you commit a snapshot into Git, it is very difficult to lose, especially if you regularly push your database to another repository. This makes using Git a joy because we know we can experiment without the danger of severely screwing things up. For a more in-depth look at how Git stores its data and how you can recover data that seems lost, see <<_undoing>>. @@ -91,7 +91,7 @@ The working directory is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify. The staging area is a file, generally contained in your Git directory, that stores information about what will go into your next commit. -It’s sometimes referred to as the index, but it’s also common to refer to it as the staging area. 
+It's sometimes referred to as the index, but it's also common to refer to it as the staging area. The basic Git workflow goes something like this: @@ -99,7 +99,7 @@ The basic Git workflow goes something like this: 2. You stage the files, adding snapshots of them to your staging area. 3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory. -If a particular version of a file is in the Git directory, it’s considered committed. -If it’s modified but has been added to the staging area, it is staged. +If a particular version of a file is in the Git directory, it's considered committed. +If it's modified but has been added to the staging area, it is staged. And if it was changed since it was checked out but has not been staged, it is modified. -In <<_git_basics_chapter>>, you’ll learn more about these states and how you can either take advantage of them or skip the staged part entirely. +In <<_git_basics_chapter>>, you'll learn more about these states and how you can either take advantage of them or skip the staged part entirely. diff --git a/book/01-introduction/sections/first-time-setup.asc b/book/01-introduction/sections/first-time-setup.asc index 1fef3ce9d..de771ece0 100644 --- a/book/01-introduction/sections/first-time-setup.asc +++ b/book/01-introduction/sections/first-time-setup.asc @@ -1,7 +1,7 @@ === First-Time Git Setup -Now that you have Git on your system, you’ll want to do a few things to customize your Git environment. -You should have to do these things only once on any given computer; they’ll stick around between upgrades. +Now that you have Git on your system, you'll want to do a few things to customize your Git environment. +You should have to do these things only once on any given computer; they'll stick around between upgrades. You can also change them at any time by running through the commands again. 
Git comes with a tool called `git config` that lets you get and set configuration variables that control all aspects of how Git looks and operates.(((git commands, config))) @@ -11,30 +11,30 @@ These variables can be stored in three different places: If you pass the option` --system` to `git config`, it reads and writes from this file specifically. 2. `~/.gitconfig` or `~/.config/git/config` file: Specific to your user. You can make Git read and write to this file specifically by passing the `--global` option. -3. `config` file in the Git directory (that is, `.git/config`) of whatever repository you’re currently using: Specific to that single repository. +3. `config` file in the Git directory (that is, `.git/config`) of whatever repository you're currently using: Specific to that single repository. Each level overrides values in the previous level, so values in `.git/config` trump those in `/etc/gitconfig`. On Windows systems, Git looks for the `.gitconfig` file in the `$HOME` directory (`C:\Users\$USER` for most people). -It also still looks for `/etc/gitconfig`, although it’s relative to the MSys root, which is wherever you decide to install Git on your Windows system when you run the installer. +It also still looks for `/etc/gitconfig`, although it's relative to the MSys root, which is wherever you decide to install Git on your Windows system when you run the installer. ==== Your Identity The first thing you should do when you install Git is to set your user name and e-mail address. 
-This is important because every Git commit uses this information, and it’s immutably baked into the commits you start creating: +This is important because every Git commit uses this information, and it's immutably baked into the commits you start creating: $ git config --global user.name "John Doe" $ git config --global user.email johndoe@example.com Again, you need to do this only once if you pass the `--global` option, because then Git will always use that information for anything you do on that system. -If you want to override this with a different name or e-mail address for specific projects, you can run the command without the `--global` option when you’re in that project. +If you want to override this with a different name or e-mail address for specific projects, you can run the command without the `--global` option when you're in that project. Many of the GUI tools will help you do this when you first run them. ==== Your Editor Now that your identity is set up, you can configure the default text editor that will be used when Git needs you to type in a message. -If not configured, Git uses your system’s default editor, which is generally Vim. +If not configured, Git uses your system's default editor, which is generally Vim. If you want to use a different text editor, such as Emacs, you can do the following: $ git config --global core.editor emacs @@ -60,7 +60,7 @@ If you want to check your settings, you can use the `git config --list` command You may see keys more than once, because Git reads the same key from different files (`/etc/gitconfig` and `~/.gitconfig`, for example). In this case, Git uses the last value for each unique key it sees. 
-You can also check what Git thinks a specific key’s value is by typing `git config <key>`:(((git commands, config))) +You can also check what Git thinks a specific key's value is by typing `git config <key>`:(((git commands, config))) $ git config user.name John Doe diff --git a/book/01-introduction/sections/help.asc b/book/01-introduction/sections/help.asc index 7d771594b..6e12517ca 100644 --- a/book/01-introduction/sections/help.asc +++ b/book/01-introduction/sections/help.asc @@ -11,5 +11,5 @@ For example, you can get the manpage help for the config command by running(((gi $ git help config These commands are nice because you can access them anywhere, even offline. -If the manpages and this book aren’t enough and you need in-person help, you can try the `#git` or `#github` channel on the Freenode IRC server (irc.freenode.net). +If the manpages and this book aren't enough and you need in-person help, you can try the `#git` or `#github` channel on the Freenode IRC server (irc.freenode.net). These channels are regularly filled with hundreds of people who are all very knowledgeable about Git and are often willing to help.(((IRC))) diff --git a/book/01-introduction/sections/history.asc b/book/01-introduction/sections/history.asc index 5631ff045..df3ebe449 100644 --- a/book/01-introduction/sections/history.asc +++ b/book/01-introduction/sections/history.asc @@ -6,7 +6,7 @@ The Linux kernel is an open source software project of fairly large scope.(((Lin For most of the lifetime of the Linux kernel maintenance (1991–2002), changes to the software were passed around as patches and archived files. In 2002, the Linux kernel project began using a proprietary DVCS called BitKeeper.(((BitKeeper))) -In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down, and the tool’s free-of-charge status was revoked. 
+In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down, and the tool's free-of-charge status was revoked. This prompted the Linux development community (and in particular Linus Torvalds, the creator of Linux) to develop their own tool based on some of the lessons they learned while using BitKeeper.(((Linus Torvalds))) Some of the goals of the new system were as follows: @@ -17,4 +17,4 @@ Some of the goals of the new system were as follows: * Able to handle large projects like the Linux kernel efficiently (speed and data size) Since its birth in 2005, Git has evolved and matured to be easy to use and yet retain these initial qualities. -It’s incredibly fast, it’s very efficient with large projects, and it has an incredible branching system for non-linear development (See <<_git_branching>>). +It's incredibly fast, it's very efficient with large projects, and it has an incredible branching system for non-linear development (See <<_git_branching>>). diff --git a/book/01-introduction/sections/installing.asc b/book/01-introduction/sections/installing.asc index 812948c75..18ad16875 100644 --- a/book/01-introduction/sections/installing.asc +++ b/book/01-introduction/sections/installing.asc @@ -8,11 +8,11 @@ You can either install it as a package or via another installer, or download the (((Linux, installing))) If you want to install Git on Linux via a binary installer, you can generally do so through the basic package-management tool that comes with your distribution. 
-If you’re on Fedora for example, you can use yum: +If you're on Fedora for example, you can use yum: $ yum install git -If you’re on a Debian-based distribution like Ubuntu, try apt-get: +If you're on a Debian-based distribution like Ubuntu, try apt-get: $ apt-get install git @@ -50,11 +50,11 @@ We'll learn more about those things a little later, but suffice it to say they'r ==== Installing from Source -Some people may instead find it useful to install Git from source, because you’ll get the most recent version. +Some people may instead find it useful to install Git from source, because you'll get the most recent version. The binary installers tend to be a bit behind, though as Git has matured in recent years, this tends to make a little less of a difference. If you do want to install Git from source, you need to have the following libraries that Git depends on: curl, zlib, openssl, expat, and libiconv. -For example, if you’re on a system that has yum (such as Fedora) or apt-get (such as a Debian based system), you can use one of these commands to install all of the dependencies: +For example, if you're on a system that has yum (such as Fedora) or apt-get (such as a Debian based system), you can use one of these commands to install all of the dependencies: $ yum install curl-devel expat-devel gettext-devel \ openssl-devel zlib-devel diff --git a/book/02-git-basics/1-git-basics.asc b/book/02-git-basics/1-git-basics.asc index 7b9db9bf0..14d3d34b0 100644 --- a/book/02-git-basics/1-git-basics.asc +++ b/book/02-git-basics/1-git-basics.asc @@ -2,9 +2,9 @@ == Git Basics If you can read only one chapter to get going with Git, this is it. -This chapter covers every basic command you need to do the vast majority of the things you’ll eventually spend your time doing with Git. +This chapter covers every basic command you need to do the vast majority of the things you'll eventually spend your time doing with Git. 
By the end of the chapter, you should be able to configure and initialize a repository, begin and stop tracking files, and stage and commit changes. -We’ll also show you how to set up Git to ignore certain files and file patterns, how to undo mistakes quickly and easily, how to browse the history of your project and view changes between commits, and how to push and pull from remote repositories. +We'll also show you how to set up Git to ignore certain files and file patterns, how to undo mistakes quickly and easily, how to browse the history of your project and view changes between commits, and how to push and pull from remote repositories. include::sections/getting-a-repository.asc[] @@ -23,4 +23,4 @@ include::sections/aliases.asc[] === Summary At this point, you can do all the basic local Git operations – creating or cloning a repository, making changes, staging and committing those changes, and viewing the history of all the changes the repository has been through. -Next, we’ll cover Git’s killer feature: its branching model. +Next, we'll cover Git's killer feature: its branching model. diff --git a/book/02-git-basics/sections/aliases.asc b/book/02-git-basics/sections/aliases.asc index 57b68eb15..c0080aaae 100644 --- a/book/02-git-basics/sections/aliases.asc +++ b/book/02-git-basics/sections/aliases.asc @@ -2,10 +2,10 @@ (((aliases))) Before we finish this chapter on basic Git, there's just one little tip that can make your Git experience simpler, easier, and more familiar: aliases. -We won’t refer to them or assume you’ve used them later in the book, but you should probably know how to use them. +We won't refer to them or assume you've used them later in the book, but you should probably know how to use them. -Git doesn’t automatically infer your command if you type it in partially. 
-If you don’t want to type the entire text of each of the Git commands, you can easily set up an alias for each command using `git config`.(((git commands, config))) +Git doesn't automatically infer your command if you type it in partially. +If you don't want to type the entire text of each of the Git commands, you can easily set up an alias for each command using `git config`.(((git commands, config))) Here are a couple of examples you may want to set up: [source,shell] @@ -17,7 +17,7 @@ $ git config --global alias.st status ---- This means that, for example, instead of typing `git commit`, you just need to type `git ci`. -As you go on using Git, you’ll probably use other commands frequently as well; don’t hesitate to create new aliases. +As you go on using Git, you'll probably use other commands frequently as well; don't hesitate to create new aliases. This technique can also be very useful in creating commands that you think should exist. For example, to correct the usability problem you encountered with unstaging a file, you can add your own unstage alias to Git: @@ -36,7 +36,7 @@ $ git reset HEAD fileA ---- This seems a bit clearer. -It’s also common to add a `last` command, like this: +It's also common to add a `last` command, like this: [source,shell] ---- diff --git a/book/02-git-basics/sections/getting-a-repository.asc b/book/02-git-basics/sections/getting-a-repository.asc index b7223f1a9..dea4fe200 100644 --- a/book/02-git-basics/sections/getting-a-repository.asc +++ b/book/02-git-basics/sections/getting-a-repository.asc @@ -6,7 +6,7 @@ The second clones an existing Git repository from another server. 
==== Initializing a Repository in an Existing Directory -If you’re starting to track an existing project in Git, you need to go to the project’s directory and type +If you're starting to track an existing project in Git, you need to go to the project's directory and type [source,shell] ---- @@ -27,13 +27,13 @@ $ git add LICENSE $ git commit -m 'initial project version' ---- -We’ll go over what these commands do in just a minute. +We'll go over what these commands do in just a minute. At this point, you have a Git repository with tracked files and an initial commit. ==== Cloning an Existing Repository -If you want to get a copy of an existing Git repository – for example, a project you’d like to contribute to – the command you need is `git clone`. -If you’re familiar with other VCS systems such as Subversion, you’ll notice that the command is "clone" and not "checkout". +If you want to get a copy of an existing Git repository – for example, a project you'd like to contribute to – the command you need is `git clone`. +If you're familiar with other VCS systems such as Subversion, you'll notice that the command is "clone" and not "checkout". This is an important distinction – instead of getting just a working copy, Git receives a full copy of nearly all data that the server has. Every version of every file for the history of the project is pulled down by default when you run `git clone`. In fact, if your server disk gets corrupted, you can often use nearly any of the clones on any client to set the server back to the state it was in when it was cloned (you may lose some server-side hooks and such, but all the versioned data would be there – see <<_git_on_the_server>> for more details). @@ -47,7 +47,7 @@ $ git clone https://github.com/libgit2/libgit2 ---- That creates a directory named ``libgit2'', initializes a `.git` directory inside it, pulls down all the data for that repository, and checks out a working copy of the latest version. 
-If you go into the new `libgit2` directory, you’ll see the project files in there, ready to be worked on or used. +If you go into the new `libgit2` directory, you'll see the project files in there, ready to be worked on or used. If you want to clone the repository into a directory named something other than ``libgit2'', you can specify that as the next command-line option: [source,shell] diff --git a/book/02-git-basics/sections/recording-changes.asc b/book/02-git-basics/sections/recording-changes.asc index 46e7a40fc..6338b91d5 100644 --- a/book/02-git-basics/sections/recording-changes.asc +++ b/book/02-git-basics/sections/recording-changes.asc @@ -6,9 +6,9 @@ You need to make some changes and commit snapshots of those changes into your re Remember that each file in your working directory can be in one of two states: tracked or untracked. Tracked files are files that were in the last snapshot; they can be unmodified, modified, or staged. Untracked files are everything else – any files in your working directory that were not in your last snapshot and are not in your staging area. -When you first clone a repository, all of your files will be tracked and unmodified because you just checked them out and haven’t edited anything. +When you first clone a repository, all of your files will be tracked and unmodified because you just checked them out and haven't edited anything. -As you edit files, Git sees them as modified, because you’ve changed them since your last commit. +As you edit files, Git sees them as modified, because you've changed them since your last commit. You stage these modified files and then commit all your staged changes, and the cycle repeats. .The lifecycle of the status of your files. @@ -27,13 +27,13 @@ nothing to commit, working directory clean ---- This means you have a clean working directory – in other words, there are no tracked and modified files. -Git also doesn’t see any untracked files, or they would be listed here. 
-Finally, the command tells you which branch you’re on and informs you that it has not diverged from the same branch on the server. -For now, that branch is always ``master'', which is the default; you won’t worry about it here. +Git also doesn't see any untracked files, or they would be listed here. +Finally, the command tells you which branch you're on and informs you that it has not diverged from the same branch on the server. +For now, that branch is always ``master'', which is the default; you won't worry about it here. <<_git_branching>> will go over branches and references in detail. -Let’s say you add a new file to your project, a simple README file. -If the file didn’t exist before, and you run `git status`, you see your untracked file like so: +Let's say you add a new file to your project, a simple README file. +If the file didn't exist before, and you run `git status`, you see your untracked file like so: [source,shell] ---- @@ -48,10 +48,10 @@ Untracked files: nothing added to commit but untracked files present (use "git add" to track) ---- -You can see that your new README file is untracked, because it’s under the ``Untracked files'' heading in your status output. -Untracked basically means that Git sees a file you didn’t have in the previous snapshot (commit); Git won’t start including it in your commit snapshots until you explicitly tell it to do so. -It does this so you don’t accidentally begin including generated binary files or other files that you did not mean to include. -You do want to start including README, so let’s start tracking the file. +You can see that your new README file is untracked, because it's under the ``Untracked files'' heading in your status output. +Untracked basically means that Git sees a file you didn't have in the previous snapshot (commit); Git won't start including it in your commit snapshots until you explicitly tell it to do so. 
+It does this so you don't accidentally begin including generated binary files or other files that you did not mean to include. +You do want to start including README, so let's start tracking the file. ==== Tracking New Files @@ -76,14 +76,14 @@ Changes to be committed: ---- -You can tell that it’s staged because it’s under the ``Changes to be committed'' heading. +You can tell that it's staged because it's under the ``Changes to be committed'' heading. If you commit at this point, the version of the file at the time you ran `git add` is what will be in the historical snapshot. You may recall that when you ran `git init` earlier, you then ran `git add (files)` – that was to begin tracking files in your directory.(((git commands, init)))(((git commands, add))) -The `git add` command takes a path name for either a file or a directory; if it’s a directory, the command adds all the files in that directory recursively. +The `git add` command takes a path name for either a file or a directory; if it's a directory, the command adds all the files in that directory recursively. ==== Staging Modified Files -Let’s change a file that was already tracked. +Let's change a file that was already tracked. If you change a previously tracked file called ``benchmarks.rb'' and then run your `git status` command again, you get something that looks like this: [source,shell] @@ -105,7 +105,7 @@ Changes not staged for commit: The ``benchmarks.rb'' file appears under a section named ``Changed but not staged for commit'' – which means that a file that is tracked has been modified in the working directory but not yet staged. To stage it, you run the `git add` command. `git add` is a multipurpose command – you use it to begin tracking new files, to stage files, and to do other things like marking merge-conflicted files as resolved. 
It may be helpful to think of it more as ``add this content to the next commit'' rather than ``add this file to the project''.(((git commands, add))) -Let’s run `git add` now to stage the ``benchmarks.rb'' file, and then run `git status` again: +Let's run `git add` now to stage the ``benchmarks.rb'' file, and then run `git status` again: [source,shell] ---- @@ -122,8 +122,8 @@ Changes to be committed: Both files are staged and will go into your next commit. At this point, suppose you remember one little change that you want to make in `benchmarks.rb` before you commit it. -You open it again and make that change, and you’re ready to commit. -However, let’s run `git status` one more time: +You open it again and make that change, and you're ready to commit. +However, let's run `git status` one more time: [source,shell] ---- @@ -182,7 +182,7 @@ New files that aren't tracked have a `??` next to them, new files that have been [[_ignoring]] ==== Ignoring Files -Often, you’ll have a class of files that you don’t want Git to automatically add or even show you as being untracked. +Often, you'll have a class of files that you don't want Git to automatically add or even show you as being untracked. These are generally automatically generated files such as log files or files produced by your build system. In such cases, you can create a file listing patterns to match them named `.gitignore`.(((ignoring files))) Here is an example `.gitignore` file: @@ -197,7 +197,7 @@ $ cat .gitignore The first line tells Git to ignore any files ending in ``.o'' or ``.a'' – object and archive files that may be the product of building your code. The second line tells Git to ignore all files that end with a tilde (`~`), which is used by many text editors such as Emacs to mark temporary files. You may also include a log, tmp, or pid directory; automatically generated documentation; and so on. 
-Setting up a `.gitignore` file before you get going is generally a good idea so you don’t accidentally commit files that you really don’t want in your Git repository. +Setting up a `.gitignore` file before you get going is generally a good idea so you don't accidentally commit files that you really don't want in your Git repository. The rules for the patterns you can put in the `.gitignore` file are as follows: @@ -230,11 +230,11 @@ GitHub maintains a fairly comprehensive list of good `.gitignore` file examples ==== Viewing Your Staged and Unstaged Changes If the `git status` command is too vague for you – you want to know exactly what you changed, not just which files were changed – you can use the `git diff` command.(((git commands, diff))) -We’ll cover `git diff` in more detail later, but you’ll probably use it most often to answer these two questions: What have you changed but not yet staged? +We'll cover `git diff` in more detail later, but you'll probably use it most often to answer these two questions: What have you changed but not yet staged? And what have you staged that you are about to commit? Although `git status` answers those questions very generally by listing the file names, `git diff` shows you the exact lines added and removed – the patch, as it were. -Let’s say you edit and stage the `README` file again and then edit the `benchmarks.rb` file without staging it. +Let's say you edit and stage the `README` file again and then edit the `benchmarks.rb` file without staging it. 
If you run your `git status` command, you once again see something like this: [source,shell] @@ -253,7 +253,7 @@ Changes not staged for commit: modified: benchmarks.rb ---- -To see what you’ve changed but not yet staged, type `git diff` with no other arguments: +To see what you've changed but not yet staged, type `git diff` with no other arguments: [source,shell] ---- @@ -276,9 +276,9 @@ index 3cb747f..e445e28 100644 ---- That command compares what is in your working directory with what is in your staging area. -The result tells you the changes you’ve made that you haven’t yet staged. +The result tells you the changes you've made that you haven't yet staged. -If you want to see what you’ve staged that will go into your next commit, you can use `git diff --staged`. +If you want to see what you've staged that will go into your next commit, you can use `git diff --staged`. This command compares your staged changes to your last commit: [source,shell] @@ -296,8 +296,8 @@ index 0000000..03902a1 + ---- -It’s important to note that `git diff` by itself doesn’t show all changes made since your last commit – only changes that are still unstaged. -This can be confusing, because if you’ve staged all of your changes, `git diff` will give you no output. +It's important to note that `git diff` by itself doesn't show all changes made since your last commit – only changes that are still unstaged. +This can be confusing, because if you've staged all of your changes, `git diff` will give you no output. 
For another example, if you stage the `benchmarks.rb` file and then edit it, you can use `git diff` to see the changes in the file that are staged and the changes that are unstaged: @@ -335,7 +335,7 @@ index e445e28..86b2f7c 100644 +# test line ---- -and `git diff --cached` to see what you’ve staged so far: +and `git diff --cached` to see what you've staged so far: [source,shell] ---- @@ -360,9 +360,9 @@ index 3cb747f..e445e28 100644 ==== Committing Your Changes Now that your staging area is set up the way you want it, you can commit your changes. -Remember that anything that is still unstaged – any files you have created or modified that you haven’t run `git add` on since you edited them – won’t go into this commit. +Remember that anything that is still unstaged – any files you have created or modified that you haven't run `git add` on since you edited them – won't go into this commit. They will stay as modified files on your disk. -In this case, the last time you ran `git status`, you saw that everything was staged, so you’re ready to commit your changes.(((git commands, status))) +In this case, the last time you ran `git status`, you saw that everything was staged, so you're ready to commit your changes.(((git commands, status))) The simplest way to commit is to type `git commit`:(((git commands, commit))) [source,shell] @@ -371,7 +371,7 @@ $ git commit ---- Doing so launches your editor of choice. 
-(This is set by your shell’s `$EDITOR` environment variable – usually vim or emacs, although you can configure it with whatever you want using the `git config --global core.editor` command as you saw in <<_getting_started>>).(((editor, changing default)))(((git commands, config))) +(This is set by your shell's `$EDITOR` environment variable – usually vim or emacs, although you can configure it with whatever you want using the `git config --global core.editor` command as you saw in <<_getting_started>>).(((editor, changing default)))(((git commands, config))) The editor displays the following text (this example is a Vim screen): @@ -392,8 +392,8 @@ The editor displays the following text (this example is a Vim screen): ---- You can see that the default commit message contains the latest output of the `git status` command commented out and one empty line on top. -You can remove these comments and type your commit message, or you can leave them there to help you remember what you’re committing. -(For an even more explicit reminder of what you’ve modified, you can pass the `-v` option to `git commit`. +You can remove these comments and type your commit message, or you can leave them there to help you remember what you're committing. +(For an even more explicit reminder of what you've modified, you can pass the `-v` option to `git commit`. Doing so also puts the diff of your change in the editor so you can see exactly what you did.) When you exit the editor, Git creates your commit with that commit message (with the comments and diff stripped out). @@ -407,12 +407,12 @@ $ git commit -m "Story 182: Fix benchmarks for speed" create mode 100644 README ---- -Now you’ve created your first commit! +Now you've created your first commit! 
You can see that the commit has given you some output about itself: which branch you committed to (`master`), what SHA-1 checksum the commit has (`463dc4f`), how many files were changed, and statistics about lines added and removed in the commit. Remember that the commit records the snapshot you set up in your staging area. -Anything you didn’t stage is still sitting there modified; you can do another commit to add it to your history. -Every time you perform a commit, you’re recording a snapshot of your project that you can revert to or compare to later. +Anything you didn't stage is still sitting there modified; you can do another commit to add it to your history. +Every time you perform a commit, you're recording a snapshot of your project that you can revert to or compare to later. ==== Skipping the Staging Area @@ -437,13 +437,13 @@ $ git commit -a -m 'added new benchmarks' 1 file changed, 5 insertions(+), 0 deletions(-) ---- -Notice how you don’t have to run `git add` on the ``benchmarks.rb'' file in this case before you commit. +Notice how you don't have to run `git add` on the ``benchmarks.rb'' file in this case before you commit. ==== Removing Files (((files, removing))) To remove a file from Git, you have to remove it from your tracked files (more accurately, remove it from your staging area) and then commit. -The `git rm` command does that, and also removes the file from your working directory so you don’t see it as an untracked file the next time around. +The `git rm` command does that, and also removes the file from your working directory so you don't see it as an untracked file the next time around. 
If you simply remove the file from your working directory, it shows up under the ``Changed but not updated'' (that is, _unstaged_) area of your `git status` output: @@ -461,7 +461,7 @@ Changes not staged for commit: no changes added to commit (use "git add" and/or "git commit -a") ---- -Then, if you run `git rm`, it stages the file’s removal: +Then, if you run `git rm`, it stages the file's removal: [source,shell] ---- @@ -477,7 +477,7 @@ Changes to be committed: The next time you commit, the file will be gone and no longer tracked. If you modified the file and added it to the index already, you must force the removal with the `-f` option. -This is a safety feature to prevent accidental removal of data that hasn’t yet been recorded in a snapshot and that can’t be recovered from Git. +This is a safety feature to prevent accidental removal of data that hasn't yet been recorded in a snapshot and that can't be recovered from Git. Another useful thing you may want to do is to keep the file in your working tree but remove it from your staging area. In other words, you may want to keep the file on your hard drive but not have Git track it anymore. @@ -498,7 +498,7 @@ $ git rm log/\*.log ---- Note the backslash (`\`) in front of the `*`. -This is necessary because Git does its own filename expansion in addition to your shell’s filename expansion. +This is necessary because Git does its own filename expansion in addition to your shell's filename expansion. This command removes all files that have the `.log` extension in the `log/` directory. Or, you can do something like this: @@ -512,11 +512,11 @@ This command removes all files that end with `~`. ==== Moving Files (((files, moving))) -Unlike many other VCS systems, Git doesn’t explicitly track file movement. +Unlike many other VCS systems, Git doesn't explicitly track file movement. If you rename a file in Git, no metadata is stored in Git that tells it you renamed the file. 
-However, Git is pretty smart about figuring that out after the fact – we’ll deal with detecting file movement a bit later. +However, Git is pretty smart about figuring that out after the fact – we'll deal with detecting file movement a bit later. -Thus it’s a bit confusing that Git has a `mv` command. +Thus it's a bit confusing that Git has a `mv` command. If you want to rename a file in Git, you can run something like [source,shell] @@ -525,7 +525,7 @@ $ git mv file_from file_to ---- and it works fine. -In fact, if you run something like this and look at the status, you’ll see that Git considers it a renamed file: +In fact, if you run something like this and look at the status, you'll see that Git considers it a renamed file: [source,shell] ---- @@ -547,6 +547,6 @@ $ git rm README.md $ git add README ---- -Git figures out that it’s a rename implicitly, so it doesn’t matter if you rename a file that way or with the `mv` command. -The only real difference is that `mv` is one command instead of three – it’s a convenience function. +Git figures out that it's a rename implicitly, so it doesn't matter if you rename a file that way or with the `mv` command. +The only real difference is that `mv` is one command instead of three – it's a convenience function. More important, you can use any tool you like to rename a file, and address the add/rm later, before you commit. diff --git a/book/02-git-basics/sections/remotes.asc b/book/02-git-basics/sections/remotes.asc index ef3d01f25..da191ad0e 100644 --- a/book/02-git-basics/sections/remotes.asc +++ b/book/02-git-basics/sections/remotes.asc @@ -5,13 +5,13 @@ Remote repositories are versions of your project that are hosted on the Internet You can have several of them, each of which generally is either read-only or read/write for you. Collaborating with others involves managing these remote repositories and pushing and pulling data to and from them when you need to share work. 
Managing remote repositories includes knowing how to add remote repositories, remove remotes that are no longer valid, manage various remote branches and define them as being tracked or not, and more. -In this section, we’ll cover some of these remote-management skills. +In this section, we'll cover some of these remote-management skills. ==== Showing Your Remotes To see which remote servers you have configured, you can run the `git remote` command.(((git commands, remote))) -It lists the shortnames of each remote handle you’ve specified. -If you’ve cloned your repository, you should at least see origin – that is the default name Git gives to the server you cloned from: +It lists the shortnames of each remote handle you've specified. +If you've cloned your repository, you should at least see origin – that is the default name Git gives to the server you cloned from: [source,shell] ---- @@ -57,11 +57,11 @@ origin git@github.com:mojombo/grit.git (push) This means we can pull contributions from any of these users pretty easily. We may additionally have permission to push to one or more of these, though we can't tell that here. -Notice that these remotes use a variety of protocols; we’ll cover why more about this in <<_git_on_the_server>>. +Notice that these remotes use a variety of protocols; we'll cover more about this in <<_git_on_the_server>>. 
==== Adding Remote Repositories -I’ve mentioned and given some demonstrations of adding remote repositories in previous sections, but here is how to do it explicitly.(((git commands, remote))) +I've mentioned and given some demonstrations of adding remote repositories in previous sections, but here is how to do it explicitly.(((git commands, remote))) To add a new remote Git repository as a shortname you can reference easily, run `git remote add [shortname] [url]`: [source,shell] @@ -77,7 +77,7 @@ pb https://github.com/paulboone/ticgit (push) ---- Now you can use the string `pb` on the command line in lieu of the whole URL. -For example, if you want to fetch all the information that Paul has but that you don’t yet have in your repository, you can run `git fetch pb`: +For example, if you want to fetch all the information that Paul has but that you don't yet have in your repository, you can run `git fetch pb`: [source,shell] ---- @@ -91,8 +91,8 @@ From https://github.com/paulboone/ticgit * [new branch] ticgit -> pb/ticgit ---- -Paul’s master branch is now accessible locally as `pb/master` – you can merge it into one of your branches, or you can check out a local branch at that point if you want to inspect it. -(We’ll go over what branches are and how to use them in much more detail in <<_git_branching>>.) +Paul's master branch is now accessible locally as `pb/master` – you can merge it into one of your branches, or you can check out a local branch at that point if you want to inspect it. +(We'll go over what branches are and how to use them in much more detail in <<_git_branching>>.) ==== Fetching and Pulling from Your Remotes @@ -104,17 +104,17 @@ As you just saw, to get data from your remote projects, you can run:(((git comma $ git fetch [remote-name] ---- -The command goes out to that remote project and pulls down all the data from that remote project that you don’t have yet. 
+The command goes out to that remote project and pulls down all the data from that remote project that you don't have yet. After you do this, you should have references to all the branches from that remote, which you can merge in or inspect at any time. If you clone a repository, the command automatically adds that remote repository under the name origin. So, `git fetch origin` fetches any new work that has been pushed to that server since you cloned (or last fetched from) it. -It’s important to note that the `git fetch` command pulls the data to your local repository – it doesn’t automatically merge it with any of your work or modify what you’re currently working on. -You have to merge it manually into your work when you’re ready. +It's important to note that the `git fetch` command pulls the data to your local repository – it doesn't automatically merge it with any of your work or modify what you're currently working on. +You have to merge it manually into your work when you're ready. If you have a branch set up to track a remote branch (see the next section and <<_git_branching>> for more information), you can use the `git pull` command to automatically fetch and then merge a remote branch into your current branch.(((git commands, pull))) This may be an easier or more comfortable workflow for you; and by default, the `git clone` command automatically sets up your local master branch to track the remote master branch (or whatever the default branch is called) on the server you cloned from. -Running `git pull` generally fetches data from the server you originally cloned from and automatically tries to merge it into the code you’re currently working on. +Running `git pull` generally fetches data from the server you originally cloned from and automatically tries to merge it into the code you're currently working on. 
==== Pushing to Your Remotes @@ -129,7 +129,7 @@ $ git push origin master This command works only if you cloned from a server to which you have write access and if nobody has pushed in the meantime. If you and someone else clone at the same time and they push upstream and then you push upstream, your push will rightly be rejected. -You’ll have to pull down their work first and incorporate it into yours before you’ll be allowed to push. +You'll have to pull down their work first and incorporate it into yours before you'll be allowed to push. See <<_git_branching>> for more detailed information on how to push to remote servers. ==== Inspecting a Remote @@ -154,11 +154,11 @@ $ git remote show origin ---- It lists the URL for the remote repository as well as the tracking branch information. -The command helpfully tells you that if you’re on the master branch and you run `git pull`, it will automatically merge in the master branch on the remote after it fetches all the remote references. +The command helpfully tells you that if you're on the master branch and you run `git pull`, it will automatically merge in the master branch on the remote after it fetches all the remote references. It also lists all the remote references it has pulled down. -That is a simple example you’re likely to encounter. -When you’re using Git more heavily, however, you may see much more information from `git remote show`: +That is a simple example you're likely to encounter. +When you're using Git more heavily, however, you may see much more information from `git remote show`: [source,shell] ---- @@ -185,11 +185,11 @@ $ git remote show origin ---- This command shows which branch is automatically pushed to when you run `git push` while on certain branches. -It also shows you which remote branches on the server you don’t yet have, which remote branches you have that have been removed from the server, and multiple branches that are automatically merged when you run `git pull`. 
+It also shows you which remote branches on the server you don't yet have, which remote branches you have that have been removed from the server, and multiple branches that are automatically merged when you run `git pull`. ==== Removing and Renaming Remotes -If you want to rename a reference you can run `git remote rename` to change a remote’s shortname.(((git commands, remote))) +If you want to rename a reference you can run `git remote rename` to change a remote's shortname.(((git commands, remote))) For instance, if you want to rename `pb` to `paul`, you can do so with `git remote rename`: [source,shell] @@ -200,10 +200,10 @@ origin paul ---- -It’s worth mentioning that this changes your remote branch names, too. +It's worth mentioning that this changes your remote branch names, too. What used to be referenced at `pb/master` is now at `paul/master`. -If you want to remove a remote for some reason – you’ve moved the server or are no longer using a particular mirror, or perhaps a contributor isn’t contributing anymore – you can use `git remote rm`: +If you want to remove a remote for some reason – you've moved the server or are no longer using a particular mirror, or perhaps a contributor isn't contributing anymore – you can use `git remote rm`: [source,shell] ---- diff --git a/book/02-git-basics/sections/tagging.asc b/book/02-git-basics/sections/tagging.asc index 799a330b0..b70c7a099 100644 --- a/book/02-git-basics/sections/tagging.asc +++ b/book/02-git-basics/sections/tagging.asc @@ -3,7 +3,7 @@ (((tags))) Like most VCSs, Git has the ability to tag specific points in history as being important. Typically people use this functionality to mark release points (v1.0, and so on). -In this section, you’ll learn how to list the available tags, how to create new tags, and what the different types of tags are. +In this section, you'll learn how to list the available tags, how to create new tags, and what the different types of tags are. 
==== Listing Your Tags @@ -21,7 +21,7 @@ This command lists the tags in alphabetical order; the order in which they appea You can also search for tags with a particular pattern. The Git source repo, for instance, contains more than 500 tags. -If you’re only interested in looking at the 1.8.5 series, you can run this: +If you're only interested in looking at the 1.8.5 series, you can run this: [source,shell] ---- @@ -42,11 +42,11 @@ v1.8.5.5 Git uses two main types of tags: lightweight and annotated. -A lightweight tag is very much like a branch that doesn’t change – it’s just a pointer to a specific commit. +A lightweight tag is very much like a branch that doesn't change – it's just a pointer to a specific commit. Annotated tags, however, are stored as full objects in the Git database. -They’re checksummed; contain the tagger name, e-mail, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG). -It’s generally recommended that you create annotated tags so you can have all this information; but if you want a temporary tag or for some reason don’t want to keep the other information, lightweight tags are available too. +They're checksummed; contain the tagger name, e-mail, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG). +It's generally recommended that you create annotated tags so you can have all this information; but if you want a temporary tag or for some reason don't want to keep the other information, lightweight tags are available too. ==== Annotated Tags @@ -64,7 +64,7 @@ v1.4 ---- The `-m` specifies a tagging message, which is stored with the tag. -If you don’t specify a message for an annotated tag, Git launches your editor so you can type it in. +If you don't specify a message for an annotated tag, Git launches your editor so you can type it in. 
You can see the tag data along with the commit that was tagged by using the `git show` command: @@ -91,7 +91,7 @@ That shows the tagger information, the date the commit was tagged, and the annot (((tags, lightweight))) Another way to tag commits is with a lightweight tag. This is basically the commit checksum stored in a file – no other information is kept. -To create a lightweight tag, don’t supply the `-a`, `-s`, or `-m` option: +To create a lightweight tag, don't supply the `-a`, `-s`, or `-m` option: [source,shell] ---- @@ -104,7 +104,7 @@ v1.4-lw v1.5 ---- -This time, if you run `git show` on the tag, you don’t see the extra tag information.(((git commands, show))) +This time, if you run `git show` on the tag, you don't see the extra tag information.(((git commands, show))) The command just shows the commit: [source,shell] @@ -119,7 +119,7 @@ Date: Mon Mar 17 21:52:11 2008 -0700 ==== Tagging Later -You can also tag commits after you’ve moved past them. +You can also tag commits after you've moved past them. Suppose your commit history looks like this: [source,shell] @@ -146,7 +146,7 @@ To tag that commit, you specify the commit checksum (or part of it) at the end o $ git tag -a v1.2 9fceb02 ---- -You can see that you’ve tagged the commit:(((git commands, tag))) +You can see that you've tagged the commit:(((git commands, tag))) [source,shell] ---- @@ -174,7 +174,7 @@ Date: Sun Apr 27 20:43:35 2008 -0700 ==== Sharing Tags -By default, the `git push` command doesn’t transfer tags to remote servers.(((git commands, push))) +By default, the `git push` command doesn't transfer tags to remote servers.(((git commands, push))) You will have to explicitly push tags to a shared server after you have created them. This process is just like sharing remote branches – you can run `git push origin [tagname]`. 
diff --git a/book/02-git-basics/sections/undoing.asc b/book/02-git-basics/sections/undoing.asc index e49c466ca..926a93262 100644 --- a/book/02-git-basics/sections/undoing.asc +++ b/book/02-git-basics/sections/undoing.asc @@ -2,8 +2,8 @@ === Undoing Things At any stage, you may want to undo something. -Here, we’ll review a few basic tools for undoing changes that you’ve made. -Be careful, because you can’t always undo some of these undos. +Here, we'll review a few basic tools for undoing changes that you've made. +Be careful, because you can't always undo some of these undos. This is one of the few areas in Git where you may lose some work if you do it wrong. One of the common undos takes place when you commit too early and possibly forget to add some files, or you mess up your commit message. @@ -15,7 +15,7 @@ $ git commit --amend ---- This command takes your staging area and uses it for the commit. -If you’ve made no changes since your last commit (for instance, you run this command immediately after your previous commit), then your snapshot will look exactly the same, and all you’ll change is your commit message. +If you've made no changes since your last commit (for instance, you run this command immediately after your previous commit), then your snapshot will look exactly the same, and all you'll change is your commit message. The same commit-message editor fires up, but it already contains the message of your previous commit. You can edit the message the same as always, but it overwrites your previous commit. @@ -36,7 +36,7 @@ You end up with a single commit – the second commit replaces the results of th The next two sections demonstrate how to wrangle your staging area and working directory changes. The nice part is that the command you use to determine the state of those two areas also reminds you how to undo changes to them. 
-For example, let’s say you’ve changed two files and want to commit them as two separate changes, but you accidentally type `git add *` and stage them both. +For example, let's say you've changed two files and want to commit them as two separate changes, but you accidentally type `git add *` and stage them both. How can you unstage one of the two? The `git status` command reminds you: @@ -53,7 +53,7 @@ Changes to be committed: ---- Right below the ``Changes to be committed'' text, it says use `git reset HEAD <file>...` to unstage. -So, let’s use that advice to unstage the `benchmarks.rb` file: +So, let's use that advice to unstage the `benchmarks.rb` file: [source,shell] ---- @@ -86,7 +86,7 @@ For now this magic invocation is all you need to know about the `git reset` comm ==== Unmodifying a Modified File -What if you realize that you don’t want to keep your changes to the `benchmarks.rb` file? +What if you realize that you don't want to keep your changes to the `benchmarks.rb` file? How can you easily unmodify it – revert it back to what it looked like when you last committed (or initially cloned, or however you got it into your working directory)? Luckily, `git status` tells you how to do that, too. In the last example output, the unstaged area looks like this: @@ -100,8 +100,8 @@ Changes not staged for commit: modified: benchmarks.rb ---- -It tells you pretty explicitly how to discard the changes you’ve made. -Let’s do what it says: +It tells you pretty explicitly how to discard the changes you've made. +Let's do what it says: [source,shell] ---- @@ -120,10 +120,10 @@ You can see that the changes have been reverted. [IMPORTANT] ===== It's important to understand that `git checkout -- [file]` is a dangerous command. Any changes you made to that file are gone – you just copied another file over it. -Don’t ever use this command unless you absolutely know that you don’t want the file. +Don't ever use this command unless you absolutely know that you don't want the file. 
===== -If you would like to keep the changes you've made to that file but still need to get it out of the way for now, we’ll go over stashing and branching in the next chapter; these are generally better ways to go. +If you would like to keep the changes you've made to that file but still need to get it out of the way for now, we'll go over stashing and branching in the next chapter; these are generally better ways to go. Remember, anything that is __committed__ in Git can almost always be recovered. Even commits that were on branches that were deleted or commits that were overwritten with an `--amend` commit can be recovered (see <> for data recovery). diff --git a/book/02-git-basics/sections/viewing-history.asc b/book/02-git-basics/sections/viewing-history.asc index 29c8124fd..64d3db21d 100644 --- a/book/02-git-basics/sections/viewing-history.asc +++ b/book/02-git-basics/sections/viewing-history.asc @@ -1,6 +1,6 @@ === Viewing the Commit History -After you have created several commits, or if you have cloned a repository with an existing commit history, you’ll probably want to look back to see what has happened. +After you have created several commits, or if you have cloned a repository with an existing commit history, you'll probably want to look back to see what has happened. The most basic and powerful tool to do this is the `git log` command. These examples use a very simple project called ``simplegit''. @@ -36,10 +36,10 @@ Date: Sat Mar 15 10:31:28 2008 -0700 ---- By default, with no arguments, `git log` lists the commits made in that repository in reverse chronological order – that is, the most recent commits show up first. -As you can see, this command lists each commit with its SHA-1 checksum, the author’s name and e-mail, the date written, and the commit message. +As you can see, this command lists each commit with its SHA-1 checksum, the author's name and e-mail, the date written, and the commit message. 
-A huge number and variety of options to the `git log` command are available to show you exactly what you’re looking for. -Here, we’ll show you some of the most popular. +A huge number and variety of options to the `git log` command are available to show you exactly what you're looking for. +Here, we'll show you some of the most popular. One of the more helpful options is `-p`, which shows the difference introduced in each commit. You can also use `-2`, which limits the output to only the last two entries: @@ -133,7 +133,7 @@ It also puts a summary of the information at the end. Another really useful option is `--pretty`. This option changes the log output to formats other than the default. A few prebuilt options are available for you to use. -The `oneline` option prints each commit on a single line, which is useful if you’re looking at a lot of commits. +The `oneline` option prints each commit on a single line, which is useful if you're looking at a lot of commits. In addition, the `short`, `full`, and `fuller` options show the output in roughly the same format but with less or more information, respectively: [source,shell] @@ -145,7 +145,7 @@ a11bef06a3f659402fe7563abf99ad00de2209e6 first commit ---- The most interesting option is `format`, which allows you to specify your own log output format. -This is especially useful when you’re generating output for machine parsing – because you specify the format explicitly, you know it won’t change with updates to Git:(((log formatting))) +This is especially useful when you're generating output for machine parsing – because you specify the format explicitly, you know it won't change with updates to Git:(((log formatting))) [source,shell] ---- @@ -182,7 +182,7 @@ a11bef0 - Scott Chacon, 6 years ago : first commit You may be wondering what the difference is between _author_ and _committer_. The author is the person who originally wrote the work, whereas the committer is the person who last applied the work. 
So, if you send in a patch to a project and one of the core members applies the patch, both of you get credit – you as the author, and the core member as the committer. -We’ll cover this distinction a bit more in <<_distributed_git>>. +We'll cover this distinction a bit more in <<_distributed_git>>. The oneline and format options are particularly useful with another `log` option called `--graph`. This option adds a nice little ASCII graph showing your branch and merge history: @@ -205,7 +205,7 @@ $ git log --pretty=format:"%h %s" --graph This type of output will become more interesting as we go through branching and merging in the next chapter. Those are only some simple output-formatting options to `git log` – there are many more. -<<log_options>> lists the options we’ve covered so far, as well as some other common formatting options that may be useful, along with how they change the output of the log command. +<<log_options>> lists the options we've covered so far, as well as some other common formatting options that may be useful, along with how they change the output of the log command. [[log_options]] .Common options to `git log` @@ -226,9 +226,9 @@ Those are only some simple output-formatting options to `git log` – there are ==== Limiting Log Output In addition to output-formatting options, `git log` takes a number of useful limiting options – that is, options that let you show only a subset of commits. -You’ve seen one such option already – the `-2` option, which show only the last two commits. +You've seen one such option already – the `-2` option, which shows only the last two commits. In fact, you can do `-<n>`, where `n` is any integer to show the last `n` commits. -In reality, you’re unlikely to use that often, because Git by default pipes all output through a pager so you see only one page of log output at a time. +In reality, you're unlikely to use that often, because Git by default pipes all output through a pager so you see only one page of log output at a time. 
However, the time-limiting options such as `--since` and `--until` are very useful. For example, this command gets the list of commits made in the last two weeks: @@ -255,7 +255,7 @@ The last really useful option to pass to `git log` as a filter is a path. If you specify a directory or file name, you can limit the log output to commits that introduced a change to those files. This is always the last option and is generally preceded by double dashes (`--`) to separate the paths from the options. -In <<limit_options>> we’ll list these and a few other common options for your reference. +In <<limit_options>> we'll list these and a few other common options for your reference. [[limit_options]] .Options to limit the output of `git log` diff --git a/book/04-git-server/1-git-server.asc b/book/04-git-server/1-git-server.asc index 1413d7cf2..a3165392e 100644 --- a/book/04-git-server/1-git-server.asc +++ b/book/04-git-server/1-git-server.asc @@ -1,9 +1,9 @@ == Git on the Server (((serving repositories))) -At this point, you should be able to do most of the day-to-day tasks for which you’ll be using Git. -However, in order to do any collaboration in Git, you’ll need to have a remote Git repository. -Although you can technically push changes to and pull changes from individuals’ repositories, doing so is discouraged because you can fairly easily confuse what they’re working on if you’re not careful. +At this point, you should be able to do most of the day-to-day tasks for which you'll be using Git. +However, in order to do any collaboration in Git, you'll need to have a remote Git repository. +Although you can technically push changes to and pull changes from individuals' repositories, doing so is discouraged because you can fairly easily confuse what they're working on if you're not careful. Furthermore, you want your collaborators to be able to access the repository even if your computer is offline – having a more reliable common repository is often useful. 
Therefore, the preferred method for collaborating with someone is to set up an intermediate repository that you both have access to, and push to and pull from that. @@ -11,13 +11,13 @@ Running a Git server is simple. First, you choose which protocols you want your server to communicate with. The first section of this chapter will cover the available protocols and the pros and cons of each. The next sections will explain some typical setups using those protocols and how to get your server running with them. -Last, we’ll go over a few hosted options, if you don’t mind hosting your code on someone else’s server and don’t want to go through the hassle of setting up and maintaining your own server. +Last, we'll go over a few hosted options, if you don't mind hosting your code on someone else's server and don't want to go through the hassle of setting up and maintaining your own server. If you have no interest in running your own server, you can skip to the last section of the chapter to see some options for setting up a hosted account and then move on to the next chapter, where we discuss the various ins and outs of working in a distributed source control environment. A remote repository is generally a _bare repository_ – a Git repository that has no working directory. -Because the repository is only used as a collaboration point, there is no reason to have a snapshot checked out on disk; it’s just the Git data. -In the simplest terms, a bare repository is the contents of your project’s `.git` directory and nothing else. +Because the repository is only used as a collaboration point, there is no reason to have a snapshot checked out on disk; it's just the Git data. +In the simplest terms, a bare repository is the contents of your project's `.git` directory and nothing else. 
include::sections/protocols.asc[] @@ -42,6 +42,6 @@ include::sections/hosted.asc[] You have several options to get a remote Git repository up and running so that you can collaborate with others or share your work. Running your own server gives you a lot of control and allows you to run the server within your own firewall, but such a server generally requires a fair amount of your time to set up and maintain. -If you place your data on a hosted server, it’s easy to set up and maintain; however, you have to be able to keep your code on someone else’s servers, and some organizations don’t allow that. +If you place your data on a hosted server, it's easy to set up and maintain; however, you have to be able to keep your code on someone else's servers, and some organizations don't allow that. It should be fairly straightforward to determine which solution or combination of solutions is appropriate for you and your organization. diff --git a/book/04-git-server/sections/generating-ssh-key.asc b/book/04-git-server/sections/generating-ssh-key.asc index 5b2428231..e413ede87 100644 --- a/book/04-git-server/sections/generating-ssh-key.asc +++ b/book/04-git-server/sections/generating-ssh-key.asc @@ -3,10 +3,10 @@ (((SSH keys))) That being said, many Git servers authenticate using SSH public keys. -In order to provide a public key, each user in your system must generate one if they don’t already have one. +In order to provide a public key, each user in your system must generate one if they don't already have one. This process is similar across all operating systems. -First, you should check to make sure you don’t already have a key. -By default, a user’s SSH keys are stored in that user’s `~/.ssh` directory. +First, you should check to make sure you don't already have a key. +By default, a user's SSH keys are stored in that user's `~/.ssh` directory. 
You can easily check to see if you have a key already by going to that directory and listing the contents: [source,shell] @@ -17,9 +17,9 @@ authorized_keys2 id_dsa known_hosts config id_dsa.pub ---- -You’re looking for a pair of files named something like `id_dsa` or `id_rsa` and a matching file with a `.pub` extension. +You're looking for a pair of files named something like `id_dsa` or `id_rsa` and a matching file with a `.pub` extension. The `.pub` file is your public key, and the other file is your private key. -If you don’t have these files (or you don’t even have a `.ssh` directory), you can create them by running a program called `ssh-keygen`, which is provided with the SSH package on Linux/Mac systems and comes with the MSysGit package on Windows: +If you don't have these files (or you don't even have a `.ssh` directory), you can create them by running a program called `ssh-keygen`, which is provided with the SSH package on Linux/Mac systems and comes with the MSysGit package on Windows: [source,shell] ---- @@ -35,9 +35,9 @@ The key fingerprint is: d0:82:24:8e:d7:f1:bb:9b:33:53:96:93:49:da:9b:e3 schacon@mylaptop.local ---- -First it confirms where you want to save the key (`.ssh/id_rsa`), and then it asks twice for a passphrase, which you can leave empty if you don’t want to type a password when you use the key. +First it confirms where you want to save the key (`.ssh/id_rsa`), and then it asks twice for a passphrase, which you can leave empty if you don't want to type a password when you use the key. -Now, each user that does this has to send their public key to you or whoever is administrating the Git server (assuming you’re using an SSH server setup that requires public keys). +Now, each user that does this has to send their public key to you or whoever is administrating the Git server (assuming you're using an SSH server setup that requires public keys). All they have to do is copy the contents of the `.pub` file and e-mail it. 
The public keys look something like this: diff --git a/book/04-git-server/sections/git-daemon.asc b/book/04-git-server/sections/git-daemon.asc index 0cc3e0b4a..014f12204 100644 --- a/book/04-git-server/sections/git-daemon.asc +++ b/book/04-git-server/sections/git-daemon.asc @@ -3,8 +3,8 @@ (((serving repositories, git protocol))) Next we'll set up a daemon serving repositories over the ``Git'' protocol. This is common choice for fast, unauthenticated access to your Git data. Remember that since it's not an authenticated service, anything you serve over this protocol is public within it's network. -If you’re running this on a server outside your firewall, it should only be used for projects that are publicly visible to the world. -If the server you’re running it on is inside your firewall, you might use it for projects that a large number of people or computers (continuous integration or build servers) have read-only access to, when you don’t want to have to add an SSH key for each. +If you're running this on a server outside your firewall, it should only be used for projects that are publicly visible to the world. +If the server you're running it on is inside your firewall, you might use it for projects that a large number of people or computers (continuous integration or build servers) have read-only access to, when you don't want to have to add an SSH key for each. In any case, the Git protocol is relatively easy to set up. Basically, you need to run this command in a daemonized manner:(((git commands, daemon))) @@ -15,9 +15,9 @@ git daemon --reuseaddr --base-path=/opt/git/ /opt/git/ ---- `--reuseaddr` allows the server to restart without waiting for old connections to time out, the `--base-path` option allows people to clone projects without specifying the entire path, and the path at the end tells the Git daemon where to look for repositories to export. 
-If you’re running a firewall, you’ll also need to punch a hole in it at port 9418 on the box you’re setting this up on. +If you're running a firewall, you'll also need to punch a hole in it at port 9418 on the box you're setting this up on. -You can daemonize this process a number of ways, depending on the operating system you’re running. +You can daemonize this process a number of ways, depending on the operating system you're running. On an Ubuntu machine, you can use an Upstart script. So, in the following file @@ -41,7 +41,7 @@ respawn ---- For security reasons, it is strongly encouraged to have this daemon run as a user with read-only permissions to the repositories – you can easily do this by creating a new user 'git-ro' and running the daemon as them. -For the sake of simplicity we’ll simply run it as the same 'git' user that Gitosis is running as. +For the sake of simplicity we'll simply run it as the same 'git' user that Gitosis is running as. When you restart your machine, your Git daemon will start automatically and respawn if it goes down. To get it running without having to reboot, you can run this: @@ -61,4 +61,4 @@ $ cd /path/to/project.git $ touch git-daemon-export-ok ---- -The presence of that file tells Git that it’s OK to serve this project without authentication. +The presence of that file tells Git that it's OK to serve this project without authentication. diff --git a/book/04-git-server/sections/git-on-a-server.asc b/book/04-git-server/sections/git-on-a-server.asc index b3c19d986..29424e203 100644 --- a/book/04-git-server/sections/git-on-a-server.asc +++ b/book/04-git-server/sections/git-on-a-server.asc @@ -8,7 +8,7 @@ Here we'll be demonstrating the commands and steps needed to do basic installati Actually setting up a production server within your infrastructure will certainly entail differences in security measures or operating system tools, but hopefully this will give you the general idea of what's involved. 
==== -In order to initially set up any Git server, you have to export an existing repository into a new bare repository – a repository that doesn’t contain a working directory. +In order to initially set up any Git server, you have to export an existing repository into a new bare repository – a repository that doesn't contain a working directory. This is generally straightforward to do. In order to clone your repository to create a new bare repository, you run the clone command with the `--bare` option.(((git commands, clone, bare))) By convention, bare repository directories end in `.git`, like so: @@ -35,7 +35,7 @@ It takes the Git repository by itself, without a working directory, and creates ==== Putting the Bare Repository on a Server Now that you have a bare copy of your repository, all you need to do is put it on a server and set up your protocols. -Let’s say you’ve set up a server called `git.example.com` that you have SSH access to, and you want to store all your Git repositories under the `/opt/git` directory. +Let's say you've set up a server called `git.example.com` that you have SSH access to, and you want to store all your Git repositories under the `/opt/git` directory. Assuming that `/opt/git` exists on that server, you can set up your new repository by copying your bare repository over: [source,shell] @@ -62,29 +62,29 @@ $ git init --bare --shared ---- You see how easy it is to take a Git repository, create a bare version, and place it on a server to which you and your collaborators have SSH access. -Now you’re ready to collaborate on the same project. +Now you're ready to collaborate on the same project. -It’s important to note that this is literally all you need to do to run a useful Git server to which several people have access – just add SSH-able accounts on a server, and stick a bare repository somewhere that all those users have read and write access to. -You’re ready to go – nothing else needed. 
+It's important to note that this is literally all you need to do to run a useful Git server to which several people have access – just add SSH-able accounts on a server, and stick a bare repository somewhere that all those users have read and write access to. +You're ready to go – nothing else needed. -In the next few sections, you’ll see how to expand to more sophisticated setups. +In the next few sections, you'll see how to expand to more sophisticated setups. This discussion will include not having to create user accounts for each user, adding public read access to repositories, setting up web UIs, using the Gitosis tool, and more. However, keep in mind that to collaborate with a couple of people on a private project, all you _need_ is an SSH server and a bare repository. ==== Small Setups -If you’re a small outfit or are just trying out Git in your organization and have only a few developers, things can be simple for you. +If you're a small outfit or are just trying out Git in your organization and have only a few developers, things can be simple for you. One of the most complicated aspects of setting up a Git server is user management. If you want some repositories to be read-only to certain users and read/write to others, access and permissions can be a bit more difficult to arrange. ===== SSH Access (((serving repositories, SSH))) -If you have a server to which all your developers already have SSH access, it’s generally easiest to set up your first repository there, because you have to do almost no work (as we covered in the last section). +If you have a server to which all your developers already have SSH access, it's generally easiest to set up your first repository there, because you have to do almost no work (as we covered in the last section). If you want more complex access control type permissions on your repositories, you can handle them with the normal filesystem permissions of the operating system your server runs. 
-If you want to place your repositories on a server that doesn’t have accounts for everyone on your team whom you want to have write access, then you must set up SSH access for them. -We assume that if you have a server with which to do this, you already have an SSH server installed, and that’s how you’re accessing the server. +If you want to place your repositories on a server that doesn't have accounts for everyone on your team whom you want to have write access, then you must set up SSH access for them. +We assume that if you have a server with which to do this, you already have an SSH server installed, and that's how you're accessing the server. There are a few ways you can give access to everyone on your team. The first is to set up accounts for everybody, which is straightforward but can be cumbersome. @@ -92,7 +92,7 @@ You may not want to run `adduser` and set temporary passwords for every user. A second method is to create a single 'git' user on the machine, ask every user who is to have write access to send you an SSH public key, and add that key to the `~/.ssh/authorized_keys` file of your new 'git' user. At that point, everyone will be able to access that machine via the 'git' user. -This doesn’t affect the commit data in any way – the SSH user you connect as doesn’t affect the commits you’ve recorded. +This doesn't affect the commit data in any way – the SSH user you connect as doesn't affect the commits you've recorded. Another way to do it is to have your SSH server authenticate from an LDAP server or some other centralized authentication source that you may already have set up. As long as each user can get shell access on the machine, any SSH authentication mechanism you can think of should work. 
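The single 'git' user method described in the hunk above (collect each developer's public key and append it to that user's `~/.ssh/authorized_keys`) can be sketched roughly as follows. This is an illustration only, not part of the patch: it works in a scratch directory standing in for the 'git' user's home, and the key material and the names `alice` and `bob` are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: simulates the shared 'git' account's ~/.ssh inside a scratch
# directory, so nothing on the real system is touched.
set -eu

git_home=$(mktemp -d)            # hypothetical stand-in for the 'git' user's home
mkdir -p "$git_home/.ssh"
chmod 700 "$git_home/.ssh"       # sshd generally expects restrictive permissions

# Placeholder public keys; real ones come from each developer's .pub file.
printf '%s\n' 'ssh-rsa AAAAB3PLACEHOLDER1 alice@example.com' > "$git_home/alice.pub"
printf '%s\n' 'ssh-rsa AAAAB3PLACEHOLDER2 bob@example.com'   > "$git_home/bob.pub"

# Append every received key to the shared account's authorized_keys file.
for key in "$git_home"/*.pub; do
    cat "$key" >> "$git_home/.ssh/authorized_keys"
done
chmod 600 "$git_home/.ssh/authorized_keys"

# Count the installed keys as a quick sanity check.
key_count=$(grep -c '^ssh-' "$git_home/.ssh/authorized_keys")
echo "authorized keys: $key_count"
```

On a real server the same append step would presumably be run as the 'git' user against that account's actual `~/.ssh/authorized_keys`; as the text notes, the commits themselves still carry each developer's own author information regardless of the SSH account used.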
diff --git a/book/04-git-server/sections/gitweb.asc b/book/04-git-server/sections/gitweb.asc index 8cafdbd59..b6448f236 100644 --- a/book/04-git-server/sections/gitweb.asc +++ b/book/04-git-server/sections/gitweb.asc @@ -10,7 +10,7 @@ image::images/git-instaweb.png[The GitWeb web-based user interface.] If you want to check out what GitWeb would look like for your project, Git comes with a command to fire up a temporary instance if you have a lightweight server on your system like `lighttpd` or `webrick`. On Linux machines, `lighttpd` is often installed, so you may be able to get it to run by typing `git instaweb` in your project directory. -If you’re running a Mac, Leopard comes preinstalled with Ruby, so `webrick` may be your best bet. +If you're running a Mac, Leopard comes preinstalled with Ruby, so `webrick` may be your best bet. To start `instaweb` with a non-lighttpd handler, you can run it with the `--httpd` option.(((git commands, instaweb))) [source,shell] @@ -21,17 +21,17 @@ $ git instaweb --httpd=webrick ---- That starts up an HTTPD server on port 1234 and then automatically starts a web browser that opens on that page. -It’s pretty easy on your part. -When you’re done and want to shut down the server, you can run the same command with the `--stop` option: +It's pretty easy on your part. +When you're done and want to shut down the server, you can run the same command with the `--stop` option: [source,shell] ---- $ git instaweb --httpd=webrick --stop ---- -If you want to run the web interface on a server all the time for your team or for an open source project you’re hosting, you’ll need to set up the CGI script to be served by your normal web server. +If you want to run the web interface on a server all the time for your team or for an open source project you're hosting, you'll need to set up the CGI script to be served by your normal web server. 
Some Linux distributions have a `gitweb` package that you may be able to install via `apt` or `yum`, so you may want to try that first. -We’ll walk though installing GitWeb manually very quickly. +We'll walk though installing GitWeb manually very quickly. First, you need to get the Git source code, which GitWeb comes with, and generate the custom CGI script: [source,shell] @@ -66,5 +66,5 @@ Now, you need to make Apache use CGI for that script, for which you can add a Vi ---- -Again, GitWeb can be served with any CGI or Perl capable web server; if you prefer to use something else, it shouldn’t be difficult to set up. +Again, GitWeb can be served with any CGI or Perl capable web server; if you prefer to use something else, it shouldn't be difficult to set up. At this point, you should be able to visit `http://gitserver/` to view your repositories online. diff --git a/book/04-git-server/sections/hosted.asc b/book/04-git-server/sections/hosted.asc index 5a121089d..e9b8b06f2 100644 --- a/book/04-git-server/sections/hosted.asc +++ b/book/04-git-server/sections/hosted.asc @@ -1,8 +1,8 @@ === Third Party Hosted Options -If you don’t want to go through all of the work involved in setting up your own Git server, you have several options for hosting your Git projects on an external dedicated hosting site. +If you don't want to go through all of the work involved in setting up your own Git server, you have several options for hosting your Git projects on an external dedicated hosting site. Doing so offers a number of advantages: a hosting site is generally quick to set up and easy to start projects on, and no server maintenance or monitoring is involved. -Even if you set up and run your own server internally, you may still want to use a public hosting site for your open source code – it’s generally easier for the open source community to find and help you with. 
+Even if you set up and run your own server internally, you may still want to use a public hosting site for your open source code – it's generally easier for the open source community to find and help you with. These days, you have a huge number of hosting options to choose from, each with different advantages and disadvantages. To see an up-to-date list, check out the GitHosting page on the main Git wiki at https://git.wiki.kernel.org/index.php/GitHosting[] diff --git a/book/04-git-server/sections/protocols.asc b/book/04-git-server/sections/protocols.asc index fc431104d..9dc0f6ab3 100644 --- a/book/04-git-server/sections/protocols.asc +++ b/book/04-git-server/sections/protocols.asc @@ -1,14 +1,14 @@ === The Protocols Git can use four major protocols to transfer data: Local, HTTP, Secure Shell (SSH) and Git. -Here we’ll discuss what they are and in what basic circumstances you would want (or not want) to use them. +Here we'll discuss what they are and in what basic circumstances you would want (or not want) to use them. ==== Local Protocol (((protocols, local))) The most basic is the _Local protocol_, in which the remote repository is in another directory on disk. This is often used if everyone on your team has access to a shared filesystem such as an NFS mount, or in the less likely case that everyone logs in to the same computer. -The latter wouldn’t be ideal, because all your code repository instances would reside on the same computer, making a catastrophic loss much more likely. +The latter wouldn't be ideal, because all your code repository instances would reside on the same computer, making a catastrophic loss much more likely. If you have a shared mounted filesystem, then you can clone, push to, and pull from a local file-based repository. To clone a repository like this or to add one as a remote to an existing project, use the path to the repository as the URL. 
@@ -30,7 +30,7 @@ Git operates slightly differently if you explicitly specify `file://` at the beg If you just specify the path, Git tries to use hardlinks or directly copy the files it needs. If you specify `file://`, Git fires up the processes that it normally uses to transfer data over a network which is generally a lot less efficient method of transferring the data. The main reason to specify the `file://` prefix is if you want a clean copy of the repository with extraneous references or objects left out – generally after an import from another version-control system or something similar (see <<_git_internals>> for maintenance tasks). -We’ll use the normal path here because doing so is almost always faster. +We'll use the normal path here because doing so is almost always faster. To add a local repository to an existing Git project, you can run something like this: @@ -43,20 +43,20 @@ Then, you can push to and pull from that remote as though you were doing so over ===== The Pros -The pros of file-based repositories are that they’re simple and they use existing file permissions and network access. +The pros of file-based repositories are that they're simple and they use existing file permissions and network access. If you already have a shared filesystem to which your whole team has access, setting up a repository is very easy. You stick the bare repository copy somewhere everyone has shared access to and set the read/write permissions as you would for any other shared directory. -We’ll discuss how to export a bare repository copy for this purpose in <<_getting_git_on_a_server>>. +We'll discuss how to export a bare repository copy for this purpose in <<_getting_git_on_a_server>>. -This is also a nice option for quickly grabbing work from someone else’s working repository. +This is also a nice option for quickly grabbing work from someone else's working repository. 
If you and a co-worker are working on the same project and they want you to check something out, running a command like `git pull /home/john/project` is often easier than them pushing to a remote server and you pulling down. ===== The Cons The cons of this method are that shared access is generally more difficult to set up and reach from multiple locations than basic network access. -If you want to push from your laptop when you’re at home, you have to mount the remote disk, which can be difficult and slow compared to network-based access. +If you want to push from your laptop when you're at home, you have to mount the remote disk, which can be difficult and slow compared to network-based access. -It’s also important to mention that this isn’t necessarily the fastest option if you’re using a shared mount of some kind. +It's also important to mention that this isn't necessarily the fastest option if you're using a shared mount of some kind. A local repository is fast only if you have fast access to the data. A repository on NFS is often slower than the repository over SSH on the same server, allowing Git to run off local disks on each system. @@ -84,7 +84,7 @@ In fact, for services like GitHub, the URL you use to view the repository online If the server does not respond with a Git HTTP smart service, the Git client will try to fall back to the simpler ``dumb'' HTTP protocol. The Dumb protocol expects the bare Git repository to be served like normal files from the web server. The beauty of the Dumb HTTP protocol is the simplicity of setting it up. -Basically, all you have to do is put a bare Git repository under your HTTP document root and set up a specific `post-update` hook, and you’re done (See <<_git_hooks>>). +Basically, all you have to do is put a bare Git repository under your HTTP document root and set up a specific `post-update` hook, and you're done (See <<_git_hooks>>). 
At that point, anyone who can access the web server under which you put the repository can also clone your repository. To allow read access to your repository over HTTP, do something like this: @@ -97,7 +97,7 @@ $ mv hooks/post-update.sample hooks/post-update $ chmod a+x hooks/post-update ---- -That’s all.(((hooks, post-update))) +That's all.(((hooks, post-update))) The `post-update` hook that comes with Git by default runs the appropriate command (`git update-server-info`) to make HTTP fetching and cloning work properly. This command is run when you push to this repository (over SSH perhaps); then, other people can clone via something like @@ -106,8 +106,8 @@ This command is run when you push to this repository (over SSH perhaps); then, o $ git clone https://example.com/gitproject.git ---- -In this particular case, we’re using the `/var/www/htdocs` path that is common for Apache setups, but you can use any static web server – just put the bare repository in its path. -The Git data is served as basic static files (see <<_git_internals>> for details about exactly how it’s served). +In this particular case, we're using the `/var/www/htdocs` path that is common for Apache setups, but you can use any static web server – just put the bare repository in its path. +The Git data is served as basic static files (see <<_git_internals>> for details about exactly how it's served). Generally you would either choose to run a read/write Smart HTTP server or simply have the files accessible as read-only in the Dumb manner. It's rare to run a mix of the two services. @@ -135,8 +135,8 @@ If you're using HTTP for authenticated pushing, providing your credentials is so (((protocols, SSH))) A common transport protocol for Git when self-hosting is over SSH. -This is because SSH access to servers is already set up in most places – and if it isn’t, it’s easy to do. -SSH is also an authenticated network protocol; and because it’s ubiquitous, it’s generally easy to set up and use. 
+This is because SSH access to servers is already set up in most places – and if it isn't, it's easy to do. +SSH is also an authenticated network protocol; and because it's ubiquitous, it's generally easy to set up and use. To clone a Git repository over SSH, you can specify ssh:// URL like this: @@ -145,14 +145,14 @@ To clone a Git repository over SSH, you can specify ssh:// URL like this: $ git clone ssh://user@server:project.git ---- -Or you can not specify a protocol – Git assumes SSH if you aren’t explicit: +Or you can not specify a protocol – Git assumes SSH if you aren't explicit: [source,shell] ---- $ git clone user@server:project.git ---- -You can also not specify a user, and Git assumes the user you’re currently logged in as. +You can also not specify a user, and Git assumes the user you're currently logged in as. ===== The Pros @@ -163,34 +163,34 @@ Last, like the Git and Local protocols, SSH is efficient, making the data as com ===== The Cons -The negative aspect of SSH is that you can’t serve anonymous access of your repository over it. -People must have access to your machine over SSH to access it, even in a read-only capacity, which doesn’t make SSH access conducive to open source projects. -If you’re using it only within your corporate network, SSH may be the only protocol you need to deal with. -If you want to allow anonymous read-only access to your projects and also want to use SSH, you’ll have to set up SSH for you to push over but something else for others to pull over. +The negative aspect of SSH is that you can't serve anonymous access of your repository over it. +People must have access to your machine over SSH to access it, even in a read-only capacity, which doesn't make SSH access conducive to open source projects. +If you're using it only within your corporate network, SSH may be the only protocol you need to deal with. 
+If you want to allow anonymous read-only access to your projects and also want to use SSH, you'll have to set up SSH for you to push over but something else for others to pull over. ==== The Git Protocol (((protocols, git))) Next is the Git protocol. This is a special daemon that comes packaged with Git; it listens on a dedicated port (9418) that provides a service similar to the SSH protocol, but with absolutely no authentication. -In order for a repository to be served over the Git protocol, you must create the `git-export-daemon-ok` file – the daemon won’t serve a repository without that file in it – but other than that there is no security. -Either the Git repository is available for everyone to clone or it isn’t. +In order for a repository to be served over the Git protocol, you must create the `git-export-daemon-ok` file – the daemon won't serve a repository without that file in it – but other than that there is no security. +Either the Git repository is available for everyone to clone or it isn't. This means that there is generally no pushing over this protocol. -You can enable push access; but given the lack of authentication, if you turn on push access, anyone on the internet who finds your project’s URL could push to your project. +You can enable push access; but given the lack of authentication, if you turn on push access, anyone on the internet who finds your project's URL could push to your project. Suffice it to say that this is rare. ===== The Pros The Git protocol is the fastest transfer protocol available. -If you’re serving a lot of traffic for a public project or serving a very large project that doesn’t require user authentication for read access, it’s likely that you’ll want to set up a Git daemon to serve your project. +If you're serving a lot of traffic for a public project or serving a very large project that doesn't require user authentication for read access, it's likely that you'll want to set up a Git daemon to serve your project. 
It uses the same data-transfer mechanism as the SSH protocol but without the encryption and authentication overhead. ===== The Cons The downside of the Git protocol is the lack of authentication. -It’s generally undesirable for the Git protocol to be the only access to your project. -Generally, you’ll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use `git://` for read-only access. -It’s also probably the most difficult protocol to set up. -It must run its own daemon, which requires `xinetd` configuration or the like, which isn’t always a walk in the park. -It also requires firewall access to port 9418, which isn’t a standard port that corporate firewalls always allow. +It's generally undesirable for the Git protocol to be the only access to your project. +Generally, you'll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use `git://` for read-only access. +It's also probably the most difficult protocol to set up. +It must run its own daemon, which requires `xinetd` configuration or the like, which isn't always a walk in the park. +It also requires firewall access to port 9418, which isn't a standard port that corporate firewalls always allow. Behind big corporate firewalls, this obscure port is commonly blocked. diff --git a/book/04-git-server/sections/setting-up-server.asc b/book/04-git-server/sections/setting-up-server.asc index ea21464ff..25f4fb52f 100644 --- a/book/04-git-server/sections/setting-up-server.asc +++ b/book/04-git-server/sections/setting-up-server.asc @@ -1,8 +1,8 @@ === Setting Up the Server -Let’s walk through setting up SSH access on the server side. -In this example, you’ll use the `authorized_keys` method for authenticating your users. -We also assume you’re running a standard Linux distribution like Ubuntu. +Let's walk through setting up SSH access on the server side. 
+In this example, you'll use the `authorized_keys` method for authenticating your users. +We also assume you're running a standard Linux distribution like Ubuntu. First, you create a 'git' user and a `.ssh` directory for that user. [source,shell] @@ -14,7 +14,7 @@ $ mkdir .ssh ---- Next, you need to add some developer SSH public keys to the `authorized_keys` file for that user. -Let’s assume you’ve received a few keys by e-mail and saved them to temporary files. +Let's assume you've received a few keys by e-mail and saved them to temporary files. Again, the public keys look something like this: [source,shell] @@ -50,8 +50,8 @@ Initialized empty Git repository in /opt/git/project.git/ Then, John, Josie, or Jessica can push the first version of their project into that repository by adding it as a remote and pushing up a branch. Note that someone must shell onto the machine and create a bare repository every time you want to add a project. -Let’s use `gitserver` as the hostname of the server on which you’ve set up your 'git' user and repository. -If you’re running it internally, and you set up DNS for `gitserver` to point to that server, then you can use the commands pretty much as is: +Let's use `gitserver` as the hostname of the server on which you've set up your 'git' user and repository. +If you're running it internally, and you set up DNS for `gitserver` to point to that server, then you can use the commands pretty much as is: [source,shell] ---- @@ -79,9 +79,9 @@ With this method, you can quickly get a read/write Git server up and running for You should note that currently all these users can also log into the server and get a shell as the ``git'' user. If you want to restrict that, you will have to change the shell to something else in the `passwd` file. You can easily restrict the 'git' user to only doing Git activities with a limited shell tool called `git-shell` that comes with Git. 
-If you set this as your 'git' user’s login shell, then the 'git' user can’t have normal shell access to your server. -To use this, specify `git-shell` instead of bash or csh for your user’s login shell. -To do so, you’ll likely have to edit your `/etc/passwd` file: +If you set this as your 'git' user's login shell, then the 'git' user can't have normal shell access to your server. +To use this, specify `git-shell` instead of bash or csh for your user's login shell. +To do so, you'll likely have to edit your `/etc/passwd` file: [source,shell] ---- @@ -95,7 +95,7 @@ At the bottom, you should find a line that looks something like this: git:x:1000:1000::/home/git:/bin/sh ---- -Change `/bin/sh` to `/usr/bin/git-shell` (or run `which git-shell` to see where it’s installed).(((git-shell))) +Change `/bin/sh` to `/usr/bin/git-shell` (or run `which git-shell` to see where it's installed).(((git-shell))) The line should look something like this: [source,shell] @@ -103,8 +103,8 @@ The line should look something like this: git:x:1000:1000::/home/git:/usr/bin/git-shell ---- -Now, the 'git' user can only use the SSH connection to push and pull Git repositories and can’t shell onto the machine. -If you try, you’ll see a login rejection like this: +Now, the 'git' user can only use the SSH connection to push and pull Git repositories and can't shell onto the machine. +If you try, you'll see a login rejection like this: [source,shell] ---- diff --git a/book/06-github/sections/projects.asc b/book/06-github/sections/projects.asc index 6b8ba6ef4..6d046f65f 100644 --- a/book/06-github/sections/projects.asc +++ b/book/06-github/sections/projects.asc @@ -37,7 +37,7 @@ It is often preferable to share the HTTP based URL for a public project, since t ==== Adding Collaborators -Let’s add the rest of the team. +Let's add the rest of the team. 
If John, Josie, and Jessica all sign up for accounts on GitHub, and you want to give them push access to your repository, you can add them to your project as collaborators. Doing so will give them ``push'' access, which means they have both read and write access to the project and Git repository. @@ -86,7 +86,7 @@ We'll be covering them in the upcoming sections. ==== Forking Projects (((forking))) -If you want to contribute to an existing project to which you don’t have push access, you can ``fork'' the project. +If you want to contribute to an existing project to which you don't have push access, you can ``fork'' the project. What this means is that GitHub will make a copy of the project that is entirely yours; it lives in your user's namespace, and you can push to it. [NOTE] @@ -95,7 +95,7 @@ Historically, the term ``fork'' has been somewhat negative in context, meaning t In GitHub, a ``fork'' is simply the same project in your own namespace, allowing you to make changes to a project publicly as a way to contribute in a more open manner. ==== -This way, projects don’t have to worry about adding users as collaborators to give them push access. +This way, projects don't have to worry about adding users as collaborators to give them push access. People can fork a project, push to it, and contribute their changes back to the original repository by creating what's called a Pull Request, which we'll cover next. This opens up a discussion thread with code review, and the owner and the contributor can then communicate about the change until the owner is happy with it, at which point the owner can merge it in. 
diff --git a/book/11-git-internals/1-git-internals.asc b/book/11-git-internals/1-git-internals.asc index cf03573ef..ebbb4a732 100644 --- a/book/11-git-internals/1-git-internals.asc +++ b/book/11-git-internals/1-git-internals.asc @@ -1,19 +1,19 @@ [[_git_internals]] == Git Internals -You may have skipped to this chapter from a previous chapter, or you may have gotten here after reading the rest of the book – in either case, this is where you’ll go over the inner workings and implementation of Git. +You may have skipped to this chapter from a previous chapter, or you may have gotten here after reading the rest of the book – in either case, this is where you'll go over the inner workings and implementation of Git. I found that learning this information was fundamentally important to understanding how useful and powerful Git is, but others have argued to me that it can be confusing and unnecessarily complex for beginners. -Thus, I’ve made this discussion the last chapter in the book so you could read it early or later in your learning process. +Thus, I've made this discussion the last chapter in the book so you could read it early or later in your learning process. I leave it up to you to decide. -Now that you’re here, let’s get started. -First, if it isn’t yet clear, Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it. -You’ll learn more about what this means in a bit. +Now that you're here, let's get started. +First, if it isn't yet clear, Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it. +You'll learn more about what this means in a bit. In the early days of Git (mostly pre 1.5), the user interface was much more complex because it emphasized this filesystem rather than a polished VCS. 
-In the last few years, the UI has been refined until it’s as clean and easy to use as any system out there; but often, the stereotype lingers about the early Git UI that was complex and difficult to learn. +In the last few years, the UI has been refined until it's as clean and easy to use as any system out there; but often, the stereotype lingers about the early Git UI that was complex and difficult to learn. -The content-addressable filesystem layer is amazingly cool, so I’ll cover that first in this chapter; then, you’ll learn about the transport mechanisms and the repository maintenance tasks that you may eventually have to deal with. +The content-addressable filesystem layer is amazingly cool, so I'll cover that first in this chapter; then, you'll learn about the transport mechanisms and the repository maintenance tasks that you may eventually have to deal with. include::sections/plumbing-porcelain.asc[] @@ -33,9 +33,9 @@ include::sections/environment.asc[] === Summary -You should have a pretty good understanding of what Git does in the background and, to some degree, how it’s implemented. -This chapter has covered a number of plumbing commands – commands that are lower level and simpler than the porcelain commands you’ve learned about in the rest of the book. -Understanding how Git works at a lower level should make it easier to understand why it’s doing what it’s doing and also to write your own tools and helping scripts to make your specific workflow work for you. +You should have a pretty good understanding of what Git does in the background and, to some degree, how it's implemented. +This chapter has covered a number of plumbing commands – commands that are lower level and simpler than the porcelain commands you've learned about in the rest of the book. 
+Understanding how Git works at a lower level should make it easier to understand why it's doing what it's doing and also to write your own tools and helping scripts to make your specific workflow work for you. Git as a content-addressable filesystem is a very powerful tool that you can easily use as more than just a VCS. I hope you can use your newfound knowledge of Git internals to implement your own cool application of this technology and feel more comfortable using Git in more advanced ways. diff --git a/book/11-git-internals/sections/maintenance.asc b/book/11-git-internals/sections/maintenance.asc index 0637df793..4db26a204 100644 --- a/book/11-git-internals/sections/maintenance.asc +++ b/book/11-git-internals/sections/maintenance.asc @@ -8,7 +8,7 @@ This section will cover some of these scenarios. Occasionally, Git automatically runs a command called ``auto gc''. Most of the time, this command does nothing. However, if there are too many loose objects (objects not in a packfile) or too many packfiles, Git launches a full-fledged `git gc` command. -The `gc` stands for garbage collect, and the command does a number of things: it gathers up all the loose objects and places them in packfiles, it consolidates packfiles into one big packfile, and it removes objects that aren’t reachable from any commit and are a few months old. +The `gc` stands for garbage collect, and the command does a number of things: it gathers up all the loose objects and places them in packfiles, it consolidates packfiles into one big packfile, and it removes objects that aren't reachable from any commit and are a few months old. You can run auto gc manually as follows: @@ -33,7 +33,7 @@ $ find .git/refs -type f .git/refs/tags/v1.1 ---- -If you run `git gc`, you’ll no longer have these files in the `refs` directory. +If you run `git gc`, you'll no longer have these files in the `refs` directory. 
Git will move them for the sake of efficiency into a file named `.git/packed-refs` that looks like this: [source,shell] @@ -47,9 +47,9 @@ cac0cab538b970a37ea1e769cbbde608743bc96d refs/tags/v1.0 ^1a410efbd13591db07496601ebc7a059dd55cfe9 ---- -If you update a reference, Git doesn’t edit this file but instead writes a new file to `refs/heads`. +If you update a reference, Git doesn't edit this file but instead writes a new file to `refs/heads`. To get the appropriate SHA for a given reference, Git checks for that reference in the `refs` directory and then checks the `packed-refs` file as a fallback. -However, if you can’t find a reference in the `refs` directory, it’s probably in your `packed-refs` file. +However, if you can't find a reference in the `refs` directory, it's probably in your `packed-refs` file. Notice the last line of the file, which begins with a `^`. This means the tag directly above is an annotated tag and that line is the commit that the annotated tag points to. @@ -61,8 +61,8 @@ At some point in your Git journey, you may accidentally lose a commit. Generally, this happens because you force-delete a branch that had work on it, and it turns out you wanted the branch after all; or you hard-reset a branch, thus abandoning commits that you wanted something from. Assuming this happens, how can you get your commits back? -Here’s an example that hard-resets the master branch in your test repository to an older commit and then recovers the lost commits. -First, let’s review where your repository is at this point: +Here's an example that hard-resets the master branch in your test repository to an older commit and then recovers the lost commits. +First, let's review where your repository is at this point: [source,shell] ---- @@ -86,15 +86,15 @@ cac0cab538b970a37ea1e769cbbde608743bc96d second commit fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit ---- -You’ve effectively lost the top two commits – you have no branch from which those commits are reachable. 
+You've effectively lost the top two commits – you have no branch from which those commits are reachable. You need to find the latest commit SHA and then add a branch that points to it. -The trick is finding that latest commit SHA – it’s not like you’ve memorized it, right? +The trick is finding that latest commit SHA – it's not like you've memorized it, right? Often, the quickest way is to use a tool called `git reflog`. -As you’re working, Git silently records what your HEAD is every time you change it. +As you're working, Git silently records what your HEAD is every time you change it. Each time you commit or change branches, the reflog is updated. The reflog is also updated by the `git update-ref` command, which is another reason to use it instead of just writing the SHA value to your ref files, as we covered in <<_git_references>>. -You can see where you’ve been at any time by running `git reflog`: +You can see where you've been at any time by running `git reflog`: [source,shell] ---- @@ -142,7 +142,7 @@ fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit Cool – now you have a branch named `recover-branch` that is where your `master` branch used to be, making the first two commits reachable again. Next, suppose your loss was for some reason not in the reflog – you can simulate that by removing `recover-branch` and deleting the reflog. -Now the first two commits aren’t reachable by anything: +Now the first two commits aren't reachable by anything: [source,shell] ---- @@ -153,7 +153,7 @@ $ rm -Rf .git/logs/ Because the reflog data is kept in the `.git/logs/` directory, you effectively have no reflog. How can you recover that commit at this point? One way is to use the `git fsck` utility, which checks your database for integrity. 
-If you run it with the `--full` option, it shows you all objects that aren’t pointed to by another object: +If you run it with the `--full` option, it shows you all objects that aren't pointed to by another object: [source,shell] ---- @@ -172,17 +172,17 @@ You can recover it the same way, by adding a branch that points to that SHA. There are a lot of great things about Git, but one feature that can cause issues is the fact that a `git clone` downloads the entire history of the project, including every version of every file. This is fine if the whole thing is source code, because Git is highly optimized to compress that data efficiently. However, if someone at any point in the history of your project added a single huge file, every clone for all time will be forced to download that large file, even if it was removed from the project in the very next commit. -Because it’s reachable from the history, it will always be there. +Because it's reachable from the history, it will always be there. -This can be a huge problem when you’re converting Subversion or Perforce repositories into Git. -Because you don’t download the whole history in those systems, this type of addition carries few consequences. +This can be a huge problem when you're converting Subversion or Perforce repositories into Git. +Because you don't download the whole history in those systems, this type of addition carries few consequences. If you did an import from another system or otherwise find that your repository is much larger than it should be, here is how you can find and remove large objects. Be warned: this technique is destructive to your commit history. It rewrites every commit object downstream from the earliest tree you have to modify to remove a large file reference. -If you do this immediately after an import, before anyone has started to base work on the commit, you’re fine – otherwise, you have to notify all contributors that they must rebase their work onto your new commits. 
+If you do this immediately after an import, before anyone has started to base work on the commit, you're fine – otherwise, you have to notify all contributors that they must rebase their work onto your new commits. -To demonstrate, you’ll add a large file into your test repository, remove it in the next commit, find it, and remove it permanently from the repository. +To demonstrate, you'll add a large file into your test repository, remove it in the next commit, find it, and remove it permanently from the repository. First, add a large object to your history: [source,shell] @@ -195,7 +195,7 @@ $ git commit -am 'added git tarball' create mode 100644 git.tbz2 ---- -Oops – you didn’t want to add a huge tarball to your project. +Oops – you didn't want to add a huge tarball to your project. Better get rid of it: [source,shell] @@ -208,7 +208,7 @@ $ git commit -m 'oops - removed large tarball' delete mode 100644 git.tbz2 ---- -Now, `gc` your database and see how much space you’re using: +Now, `gc` your database and see how much space you're using: [source,shell] ---- @@ -220,7 +220,7 @@ Writing objects: 100% (21/21), done. Total 21 (delta 3), reused 15 (delta 1) ---- -You can run the `count-objects` command to quickly see how much space you’re using: +You can run the `count-objects` command to quickly see how much space you're using: [source,shell] ---- @@ -234,16 +234,16 @@ prune-packable: 0 garbage: 0 ---- -The `size-pack` entry is the size of your packfiles in kilobytes, so you’re using 2MB. -Before the last commit, you were using closer to 2K – clearly, removing the file from the previous commit didn’t remove it from your history. +The `size-pack` entry is the size of your packfiles in kilobytes, so you're using 2MB. +Before the last commit, you were using closer to 2K – clearly, removing the file from the previous commit didn't remove it from your history. 
Every time anyone clones this repository, they will have to clone all 2MB just to get this tiny project, because you accidentally added a big file. -Let’s get rid of it. +Let's get rid of it. First you have to find it. In this case, you already know what file it is. -But suppose you didn’t; how would you identify what file or files were taking up so much space? +But suppose you didn't; how would you identify what file or files were taking up so much space? If you run `git gc`, all the objects are in a packfile; you can identify the big objects by running another plumbing command called `git verify-pack` and sorting on the third field in the output, which is file size. -You can also pipe it through the `tail` command because you’re only interested in the last few largest files: +You can also pipe it through the `tail` command because you're only interested in the last few largest files: [source,shell] ---- @@ -254,9 +254,9 @@ e3f094f522629ae358806b17daf78246c27c007b blob 1486 734 4667 ---- The big object is at the bottom: 2MB. -To find out what file it is, you’ll use the `rev-list` command, which you used briefly in <<_customizing_git>>. +To find out what file it is, you'll use the `rev-list` command, which you used briefly in <<_customizing_git>>. If you pass `--objects` to `rev-list`, it lists all the commit SHAs and also the blob SHAs with the file paths associated with them. -You can use this to find your blob’s name: +You can use this to find your blob's name: [source,shell] ---- @@ -286,11 +286,11 @@ Rewrite da3f30d019005479c99eb4c3406225613985a1db (2/2) Ref 'refs/heads/master' was rewritten ---- -The `--index-filter` option is similar to the `--tree-filter` option used in <<_git_tools>>, except that instead of passing a command that modifies files checked out on disk, you’re modifying your staging area or index each time. 
+The `--index-filter` option is similar to the `--tree-filter` option used in <<_git_tools>>, except that instead of passing a command that modifies files checked out on disk, you're modifying your staging area or index each time. Rather than remove a specific file with something like `rm file`, you have to remove it with `git rm --cached` – you must remove it from the index, not from disk. -The reason to do it this way is speed – because Git doesn’t have to check out each revision to disk before running your filter, the process can be much, much faster. +The reason to do it this way is speed – because Git doesn't have to check out each revision to disk before running your filter, the process can be much, much faster. You can accomplish the same task with `--tree-filter` if you want. -The `--ignore-unmatch` option to `git rm` tells it not to error out if the pattern you’re trying to remove isn’t there. +The `--ignore-unmatch` option to `git rm` tells it not to error out if the pattern you're trying to remove isn't there. Finally, you ask `filter-branch` to rewrite your history only from the `6df7640` commit up, because you know that is where this problem started. Otherwise, it will start from the beginning and will unnecessarily take longer. @@ -310,7 +310,7 @@ Writing objects: 100% (19/19), done. Total 19 (delta 3), reused 16 (delta 1) ---- -Let’s see how much space you saved. +Let's see how much space you saved. [source,shell] ---- @@ -325,5 +325,5 @@ garbage: 0 ---- The packed repository size is down to 7K, which is much better than 2MB. -You can see from the size value that the big object is still in your loose objects, so it’s not gone; but it won’t be transferred on a push or subsequent clone, which is what is important. +You can see from the size value that the big object is still in your loose objects, so it's not gone; but it won't be transferred on a push or subsequent clone, which is what is important. 
If you really wanted to, you could remove the object completely by running `git prune --expire now`. diff --git a/book/11-git-internals/sections/objects.asc b/book/11-git-internals/sections/objects.asc index 7af645526..2d37f8585 100644 --- a/book/11-git-internals/sections/objects.asc +++ b/book/11-git-internals/sections/objects.asc @@ -31,9 +31,9 @@ d670460b4b4aece5915caf5c68d12f560a9fe3e4 ---- The `-w` tells `hash-object` to store the object; otherwise, the command simply tells you what the key would be. -`--stdin` tells the command to read the content from stdin; if you don’t specify this, `hash-object` expects the path to a file. +`--stdin` tells the command to read the content from stdin; if you don't specify this, `hash-object` expects the path to a file. The output from the command is a 40-character checksum hash. -This is the SHA-1 hash – a checksum of the content you’re storing plus a header, which you’ll learn about in a bit. +This is the SHA-1 hash – a checksum of the content you're storing plus a header, which you'll learn about in a bit. Now you can see how Git has stored your data: [source,shell] @@ -105,7 +105,7 @@ $ cat test.txt version 2 ---- -But remembering the SHA-1 key for each version of your file isn’t practical; plus, you aren’t storing the filename in your system – just the content. +But remembering the SHA-1 key for each version of your file isn't practical; plus, you aren't storing the filename in your system – just the content. This object type is called a blob. You can have Git tell you the object type of any object in Git, given its SHA-1 key, with `cat-file -t`: @@ -132,7 +132,7 @@ $ git cat-file -p master^{tree} ---- The `master^{tree}` syntax specifies the tree object that is pointed to by the last commit on your `master` branch.
-Notice that the `lib` subdirectory isn’t a blob but a pointer to another tree: +Notice that the `lib` subdirectory isn't a blob but a pointer to another tree: [source,shell] ---- @@ -150,7 +150,7 @@ Git normally creates a tree by taking the state of your staging area or index an So, to create a tree object, you first have to set up an index by staging some files. To create an index with a single entry – the first version of your test.txt file – you can use the plumbing command `update-index`. You use this command to artificially add the earlier version of the test.txt file to a new staging area. -You must pass it the `--add` option because the file doesn’t yet exist in your staging area (you don’t even have a staging area set up yet) and `--cacheinfo` because the file you’re adding isn’t in your directory but is in your database. +You must pass it the `--add` option because the file doesn't yet exist in your staging area (you don't even have a staging area set up yet) and `--cacheinfo` because the file you're adding isn't in your directory but is in your database. Then, you specify the mode, SHA-1, and filename: [source,shell] @@ -159,12 +159,12 @@ $ git update-index --add --cacheinfo 100644 \ 83baae61804e65cc73a7201a7252750c76066a30 test.txt ---- -In this case, you’re specifying a mode of `100644`, which means it’s a normal file. -Other options are `100755`, which means it’s an executable file; and `120000`, which specifies a symbolic link. +In this case, you're specifying a mode of `100644`, which means it's a normal file. +Other options are `100755`, which means it's an executable file; and `120000`, which specifies a symbolic link. The mode is taken from normal UNIX modes but is much less flexible – these three modes are the only ones that are valid for files (blobs) in Git (although other modes are used for directories and submodules). Now, you can use the `write-tree` command to write the staging area out to a tree object.
-No `-w` option is needed – calling `write-tree` automatically creates a tree object from the state of the index if that tree doesn’t yet exist: +No `-w` option is needed – calling `write-tree` automatically creates a tree object from the state of the index if that tree doesn't yet exist: [source,shell] ---- @@ -182,7 +182,7 @@ $ git cat-file -t d8329fc1cc938780ffdd9f94e0d364e0ea74f579 tree ---- -You’ll now create a new tree with the second version of test.txt and a new file as well: +You'll now create a new tree with the second version of test.txt and a new file as well: [source,shell] ---- @@ -204,7 +204,7 @@ $ git cat-file -p 0155eb4229851634a0f03eb265b69f5a2d56f341 ---- Notice that this tree has both file entries and also that the test.txt SHA is the ``version 2'' SHA from earlier (`1f7a7a`). -Just for fun, you’ll add the first tree as a subdirectory into this one. +Just for fun, you'll add the first tree as a subdirectory into this one. You can read trees into your staging area by calling `read-tree`. In this case, you can read an existing tree into your staging area as a subtree by using the `--prefix` option to `read-tree`: @@ -228,7 +228,7 @@ image::images/data-model-2.png[The content structure of your current Git data.] ==== Commit Objects You have three trees that specify the different snapshots of your project that you want to track, but the earlier problem remains: you must remember all three SHA-1 values in order to recall the snapshots. -You also don’t have any information about who saved the snapshots, when they were saved, or why they were saved. +You also don't have any information about who saved the snapshots, when they were saved, or why they were saved. This is the basic information that the commit object stores for you. To create a commit object, you call `commit-tree` and specify a single tree SHA-1 and which commit objects, if any, directly preceded it. 
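To make the shape of a commit object more concrete, here is a minimal Ruby sketch that assembles the fields `commit-tree` records: the top-level tree, any parent commits, the author and committer lines, a blank line, and the message. It is an illustration under stated assumptions rather than Git's own code: the helper name `build_commit`, its keyword arguments, and the `scott@example.com` address are all hypothetical placeholders, and the same identity is used for author and committer for simplicity.

```ruby
require 'digest/sha1'

# Hypothetical helper (not part of Git): builds the text of a commit object
# and computes the SHA-1 key under which it would be stored.
def build_commit(tree:, parents: [], author:, email:, time:, tz: "+0000", message:)
  lines = ["tree #{tree}"]
  # A root commit has no parent line; a merge commit has more than one.
  parents.each { |parent| lines << "parent #{parent}" }
  ident = "#{author} <#{email}> #{time} #{tz}"
  lines << "author #{ident}"
  lines << "committer #{ident}"
  # A blank line separates the header fields from the commit message.
  body = lines.join("\n") + "\n\n" + message + "\n"
  # Like every Git object, the stored form is "<type> <size>\0" plus content.
  store = "commit #{body.bytesize}\0" + body
  { body: body, sha1: Digest::SHA1.hexdigest(store) }
end

commit = build_commit(
  tree: "d8329fc1cc938780ffdd9f94e0d364e0ea74f579", # tree SHA-1 from the text
  author: "Scott Chacon",
  email: "scott@example.com",                       # placeholder address
  time: 1243040974,
  tz: "-0700",
  message: "first commit"
)
puts commit[:body]
puts commit[:sha1]
```

Because the author, committer, and timestamp are all part of the hashed content, the resulting SHA-1 will differ from the one shown in the book unless every field matches exactly.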
@@ -254,7 +254,7 @@ first commit The format for a commit object is simple: it specifies the top-level tree for the snapshot of the project at that point; the author/committer information (which uses your `user.name` and `user.email` configuration settings and a timestamp); a blank line, and then the commit message. -Next, you’ll write the other two commit objects, each referencing the commit that came directly before it: +Next, you'll write the other two commit objects, each referencing the commit that came directly before it: [source,shell] ---- @@ -300,7 +300,7 @@ Date: Fri May 22 18:09:34 2009 -0700 ---- Amazing. -You’ve just done the low-level operations to build up a Git history without using any of the front ends. +You've just done the low-level operations to build up a Git history without using any of the front ends. This is essentially what Git does when you run the `git add` and `git commit` commands – it stores blobs for the files that have changed, updates the index, writes out trees, and writes commit objects that reference the top-level trees and the commits that came immediately before them. These three main Git objects – the blob, the tree, and the commit – are initially stored as separate files in your `.git/objects` directory. Here are all the objects in the example directory now, commented with what they store: @@ -328,8 +328,8 @@ image::images/data-model-3.png[All the objects in your Git directory.] ==== Object Storage I mentioned earlier that a header is stored with the content. -Let’s take a minute to look at how Git stores its objects. -You’ll see how to store a blob object – in this case, the string ``what is up, doc?'' – interactively in the Ruby scripting language. +Let's take a minute to look at how Git stores its objects. +You'll see how to store a blob object – in this case, the string ``what is up, doc?'' – interactively in the Ruby scripting language. 
You can start up interactive Ruby mode with the `irb` command: [source,shell] @@ -372,9 +372,9 @@ First, you need to require the library and then run `Zlib::Deflate.deflate()` on => "x\x9CK\xCA\xC9OR04c(\xCFH,Q\xC8,V(-\xD0QH\xC9O\xB6\a\x00_\x1C\a\x9D" ---- -Finally, you’ll write your zlib-deflated content to an object on disk. -You’ll determine the path of the object you want to write out (the first two characters of the SHA-1 value being the subdirectory name, and the last 38 characters being the filename within that directory). -In Ruby, you can use the `FileUtils.mkdir_p()` function to create the subdirectory if it doesn’t exist. +Finally, you'll write your zlib-deflated content to an object on disk. +You'll determine the path of the object you want to write out (the first two characters of the SHA-1 value being the subdirectory name, and the last 38 characters being the filename within that directory). +In Ruby, you can use the `FileUtils.mkdir_p()` function to create the subdirectory if it doesn't exist. Then, open the file with `File.open()` and write out the previously zlib-compressed content to the file with a `write()` call on the resulting file handle: [source,shell] @@ -389,6 +389,6 @@ Then, open the file with `File.open()` and write out the previously zlib-compres => 32 ---- -That’s it – you’ve created a valid Git blob object. +That's it – you've created a valid Git blob object. All Git objects are stored the same way, just with different types – instead of the string blob, the header will begin with commit or tree. Also, although the blob content can be nearly anything, the commit and tree content are very specifically formatted. 
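The interactive steps from the Object Storage walkthrough can be collected into one small Ruby script. This is a minimal sketch, not Git's own implementation: the helper name `git_blob` is hypothetical, and it only computes the object's SHA-1, its path relative to the repository, and the zlib-compressed bytes, without writing anything to disk.

```ruby
require 'digest/sha1'
require 'zlib'

# Hypothetical helper (not part of Git): computes the pieces of a loose blob
# object for the given content without touching the filesystem.
def git_blob(content)
  # The header is the object type, a space, the content size in bytes,
  # and a trailing null byte.
  header = "blob #{content.bytesize}\0"
  store = header + content
  sha1 = Digest::SHA1.hexdigest(store)
  {
    sha1: sha1,
    # The first two hex characters name the subdirectory; the remaining 38
    # are the filename inside it.
    path: ".git/objects/#{sha1[0, 2]}/#{sha1[2, 38]}",
    zlib_content: Zlib::Deflate.deflate(store)
  }
end

blob = git_blob("what is up, doc?")
puts blob[:sha1]
puts blob[:path]
```

Calling `git_blob("test content\n")` should reproduce the `d670460b4b4aece5915caf5c68d12f560a9fe3e4` key that `hash-object` printed for the same content, which is a quick way to check that the header format is right.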
diff --git a/book/11-git-internals/sections/packfiles.asc b/book/11-git-internals/sections/packfiles.asc index 243161148..8aa94f767 100644 --- a/book/11-git-internals/sections/packfiles.asc +++ b/book/11-git-internals/sections/packfiles.asc @@ -1,6 +1,6 @@ === Packfiles -Let’s go back to the objects database for your test Git repository. +Let's go back to the objects database for your test Git repository. At this point, you have 11 objects – 4 blobs, 3 trees, 3 commits, and 1 tag: [source,shell] @@ -19,8 +19,8 @@ $ find .git/objects -type f .git/objects/fd/f4fc3344e67ab068f836878b6c4951e3b15f3d # commit 1 ---- -Git compresses the contents of these files with zlib, and you’re not storing much, so all these files collectively take up only 925 bytes. -You’ll add some larger content to the repository to demonstrate an interesting feature of Git. +Git compresses the contents of these files with zlib, and you're not storing much, so all these files collectively take up only 925 bytes. +You'll add some larger content to the repository to demonstrate an interesting feature of Git. Add the repo.rb file from the Grit library you worked with earlier – this is about a 22K source code file: [source,shell] @@ -82,7 +82,7 @@ $ git cat-file -s b042a60ef7dff760008df33cee372b945b6e884e ---- You have two nearly identical 22K objects on your disk. -Wouldn’t it be nice if Git could store one of them in full but then the second object only as the delta between it and the first? +Wouldn't it be nice if Git could store one of them in full but then the second object only as the delta between it and the first? It turns out that it can. The initial format in which Git saves objects on disk is called a ``loose'' object format. @@ -100,7 +100,7 @@ Writing objects: 100% (18/18), done. 
Total 18 (delta 3), reused 0 (delta 0) ---- -If you look in your objects directory, you’ll find that most of your objects are gone, and a new pair of files has appeared: +If you look in your objects directory, you'll find that most of your objects are gone, and a new pair of files has appeared: [source,shell] ---- @@ -112,15 +112,15 @@ $ find .git/objects -type f .git/objects/pack/pack-978e03944f5c581011e6998cd0e9e30000905586.pack ---- -The objects that remain are the blobs that aren’t pointed to by any commit – in this case, the ``what is up, doc?'' +The objects that remain are the blobs that aren't pointed to by any commit – in this case, the ``what is up, doc?'' example and the ``test content'' example blobs you created earlier. -Because you never added them to any commits, they’re considered dangling and aren’t packed up in your new packfile. +Because you never added them to any commits, they're considered dangling and aren't packed up in your new packfile. The other files are your new packfile and an index. The packfile is a single file containing the contents of all the objects that were removed from your filesystem. The index is a file that contains offsets into that packfile so you can quickly seek to a specific object. What is cool is that although the objects on disk before you ran the `gc` were collectively about 22K in size, the new packfile is only 7K. -You’ve cut your disk usage by ⅔ by packing your objects. +You've cut your disk usage by ⅔ by packing your objects. How does Git do this? When Git packs objects, it looks for files that are named and sized similarly, and stores just the deltas from one version of the file to the next. @@ -158,7 +158,7 @@ chain length = 1: 3 objects Here, the `033b4` blob, which if you remember was the first version of your repo.rb file, is referencing the `b042a` blob, which was the second version of the file. 
The third column in the output is the size of the object in the pack, so you can see that `b042a` takes up 22K of the file, but that `033b4` only takes up 9 bytes. -What is also interesting is that the second version of the file is the one that is stored intact, whereas the original version is stored as a delta – this is because you’re most likely to need faster access to the most recent version of the file. +What is also interesting is that the second version of the file is the one that is stored intact, whereas the original version is stored as a delta – this is because you're most likely to need faster access to the most recent version of the file. The really nice thing about this is that it can be repacked at any time. Git will occasionally repack your database automatically, always trying to save more space, but you can also manually repack at any time by running `git gc` by hand. diff --git a/book/11-git-internals/sections/plumbing-porcelain.asc b/book/11-git-internals/sections/plumbing-porcelain.asc index 21defed5f..040c853b0 100644 --- a/book/11-git-internals/sections/plumbing-porcelain.asc +++ b/book/11-git-internals/sections/plumbing-porcelain.asc @@ -4,14 +4,14 @@ This book covers how to use Git with 30 or so verbs such as `checkout`, `branch` But because Git was initially a toolkit for a VCS rather than a full user-friendly VCS, it has a bunch of verbs that do low-level work and were designed to be chained together UNIX style or called from scripts. These commands are generally referred to as ``plumbing'' commands, and the more user-friendly commands are called ``porcelain'' commands. -The book’s first eight chapters deal almost exclusively with porcelain commands. -But in this chapter, you’ll be dealing mostly with the lower-level plumbing commands, because they give you access to the inner workings of Git, and help demonstrate how and why Git does what it does. 
-These commands aren’t meant to be used manually on the command line, but rather to be used as building blocks for new tools and custom scripts. +The book's first eight chapters deal almost exclusively with porcelain commands. +But in this chapter, you'll be dealing mostly with the lower-level plumbing commands, because they give you access to the inner workings of Git, and help demonstrate how and why Git does what it does. +These commands aren't meant to be used manually on the command line, but rather to be used as building blocks for new tools and custom scripts. When you run `git init` in a new or existing directory, Git creates the `.git` directory, which is where almost everything that Git stores and manipulates is located. If you want to back up or clone your repository, copying this single directory elsewhere gives you nearly everything you need. This entire chapter basically deals with the stuff in this directory. -Here’s what it looks like: +Here's what it looks like: [source,shell] ---- @@ -25,12 +25,12 @@ objects/ refs/ ---- -You may see some other files in there, but this is a fresh `git init` repository – it’s what you see by default. -The `description` file is only used by the GitWeb program, so don’t worry about it. -The `config` file contains your project-specific configuration options, and the `info` directory keeps a global exclude file for ignored patterns that you don’t want to track in a .gitignore file. +You may see some other files in there, but this is a fresh `git init` repository – it's what you see by default. +The `description` file is only used by the GitWeb program, so don't worry about it. +The `config` file contains your project-specific configuration options, and the `info` directory keeps a global exclude file for ignored patterns that you don't want to track in a .gitignore file. The `hooks` directory contains your client- or server-side hook scripts, which are discussed in detail in <<_hooks>>. 
This leaves four important entries: the `HEAD` and (yet to be created) `index` files, and the `objects` and `refs` directories. These are the core parts of Git. The `objects` directory stores all the content for your database, the `refs` directory stores pointers into commit objects in that data (branches), the `HEAD` file points to the branch you currently have checked out, and the `index` file is where Git stores your staging area information. -You’ll now look at each of these sections in detail to see how Git operates. +You'll now look at each of these sections in detail to see how Git operates. diff --git a/book/11-git-internals/sections/refs.asc b/book/11-git-internals/sections/refs.asc index 86d52e63b..77eb8ef62 100644 --- a/book/11-git-internals/sections/refs.asc +++ b/book/11-git-internals/sections/refs.asc @@ -32,7 +32,7 @@ cac0cab538b970a37ea1e769cbbde608743bc96d second commit fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit ---- -You aren’t encouraged to directly edit the reference files. +You aren't encouraged to directly edit the reference files. Git provides a safer command to do this if you want to update a reference called `update-ref`: [source,shell] @@ -40,7 +40,7 @@ Git provides a safer command to do this if you want to update a reference called $ git update-ref refs/heads/master 1a410efbd13591db07496601ebc7a059dd55cfe9 ---- -That’s basically what a branch in Git is: a simple pointer or reference to the head of a line of work. +That's basically what a branch in Git is: a simple pointer or reference to the head of a line of work. To create a branch back at the second commit, you can do this: [source,shell] @@ -62,15 +62,15 @@ Now, your Git database conceptually looks something like this: .Git directory objects with branch head references included. image::images/data-model-4.png[Git directory objects with branch head references included.] 
-When you run commands like `git branch (branchname)`, Git basically runs that `update-ref` command to add the SHA-1 of the last commit of the branch you’re on into whatever new reference you want to create. +When you run commands like `git branch (branchname)`, Git basically runs that `update-ref` command to add the SHA-1 of the last commit of the branch you're on into whatever new reference you want to create. ==== The HEAD The question now is, when you run `git branch (branchname)`, how does Git know the SHA-1 of the last commit? The answer is the HEAD file. -The HEAD file is a symbolic reference to the branch you’re currently on. -Unlike a normal reference, a symbolic reference doesn’t contain a SHA-1 value, but rather a pointer to another reference. -If you look at the file, you’ll normally see something like this: +The HEAD file is a symbolic reference to the branch you're currently on. +Unlike a normal reference, a symbolic reference doesn't contain a SHA-1 value, but rather a pointer to another reference. +If you look at the file, you'll normally see something like this: [source,shell] ---- @@ -106,7 +106,7 @@ $ cat .git/HEAD ref: refs/heads/test ---- -You can’t set a symbolic reference outside of the refs style: +You can't set a symbolic reference outside of the refs style: [source,shell] ---- @@ -116,10 +116,10 @@ fatal: Refusing to point HEAD outside of refs/ ==== Tags -We just finished discussing Git’s three main object types, but there is a fourth. +We just finished discussing Git's three main object types, but there is a fourth. The tag object is very much like a commit object – it contains a tagger, a date, a message, and a pointer. The main difference is that a tag object points to a commit rather than a tree. -It’s like a branch reference, but it never moves – it always points to the same commit but gives it a friendlier name. +It's like a branch reference, but it never moves – it always points to the same commit but gives it a friendlier name. 
As discussed in <<_git_basics_chapter>>, there are two types of tags: annotated and lightweight. You can make a lightweight tag by running something like this: @@ -132,14 +132,14 @@ $ git update-ref refs/tags/v1.0 cac0cab538b970a37ea1e769cbbde608743bc96d That is all a lightweight tag is – a reference that never moves. An annotated tag is more complex, however. If you create an annotated tag, Git creates a tag object and then writes a reference to point to it rather than directly to the commit. -You can see this by creating an annotated tag (`-a` specifies that it’s an annotated tag): +You can see this by creating an annotated tag (`-a` specifies that it's an annotated tag): [source,shell] ---- $ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'test tag' ---- -Here’s the object SHA-1 value it created: +Here's the object SHA-1 value it created: [source,shell] ---- @@ -161,7 +161,7 @@ test tag ---- Notice that the object entry points to the commit SHA-1 value that you tagged. -Also notice that it doesn’t need to point to a commit; you can tag any Git object. +Also notice that it doesn't need to point to a commit; you can tag any Git object. In the Git source code, for example, the maintainer has added their GPG public key as a blob object and then tagged it. You can view the public key by running this in a clone of the Git repository: @@ -174,7 +174,7 @@ The Linux kernel repository also has a non-commit-pointing tag object – the fi ==== Remotes -The third type of reference that you’ll see is a remote reference. +The third type of reference that you'll see is a remote reference. If you add a remote and push to it, Git stores the value you last pushed to that remote for each branch in the `refs/remotes` directory. 
For instance, you can add a remote called `origin` and push your `master` branch to it: diff --git a/book/11-git-internals/sections/refspec.asc b/book/11-git-internals/sections/refspec.asc index 74010b4dd..10aeae668 100644 --- a/book/11-git-internals/sections/refspec.asc +++ b/book/11-git-internals/sections/refspec.asc @@ -1,7 +1,7 @@ [[_refspec]] === The Refspec -Throughout this book, you’ve used simple mappings from remote branches to local references; but they can be more complex. +Throughout this book, you've used simple mappings from remote branches to local references, but they can be more complex. Suppose you add a remote like this: [source,shell] @@ -19,7 +19,7 @@ It adds a section to your `.git/config` file, specifying the name of the remote ---- The format of the refspec is an optional `+`, followed by `<src>:<dst>`, where `<src>` is the pattern for references on the remote side and `<dst>` is where those references will be written locally. -The `+` tells Git to update the reference even if it isn’t a fast-forward. +The `+` tells Git to update the reference even if it isn't a fast-forward. In the default case that is automatically written by a `git remote add` command, Git fetches all the references under `refs/heads/` on the server and writes them to `refs/remotes/origin/` locally. So, if there is a `master` branch on the server, you can access the log of that branch locally via @@ -31,7 +31,7 @@ $ git log remotes/origin/master $ git log refs/remotes/origin/master ---- -They’re all equivalent, because Git expands each of them to `refs/remotes/origin/master`. +They're all equivalent, because Git expands each of them to `refs/remotes/origin/master`.
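To make the mapping concrete, here is a minimal Ruby sketch of how a fetch refspec of the form `+<src>:<dst>` turns a reference name on the remote into a local one. It is not how Git itself is implemented; `parse_refspec` and `local_ref_for` are hypothetical names, and the sketch only understands an exact name or a trailing `/*` pattern.

[source,ruby]
----
# A parsed refspec: whether it started with '+', plus its two sides.
Refspec = Struct.new(:force, :src, :dst) do
  # Return the local reference that +remote_ref+ is written to,
  # or nil if this refspec does not cover it.
  def local_ref_for(remote_ref)
    if src.end_with?('/*')
      prefix = src.chomp('*')                    # "refs/heads/"
      return nil unless remote_ref.start_with?(prefix)
      dst.chomp('*') + remote_ref[prefix.length..-1]
    elsif remote_ref == src
      dst
    end
  end
end

# Split "[+]<src>:<dst>" into its parts (hypothetical helper).
def parse_refspec(text)
  force = text.start_with?('+')
  src, dst = text.sub(/\A\+/, '').split(':', 2)
  Refspec.new(force, src, dst)
end

spec = parse_refspec('+refs/heads/*:refs/remotes/origin/*')
puts spec.local_ref_for('refs/heads/master')
----

With the default refspec, `refs/heads/master` on the server maps to `refs/remotes/origin/master` locally, while a name outside `refs/heads/` (a tag, say) yields `nil` because the pattern does not cover it.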
If you want Git instead to pull down only the `master` branch each time, and not every other branch on the remote server, you can change the fetch line to @@ -61,7 +61,7 @@ From git@github.com:schacon/simplegit * [new branch] topic -> origin/topic ---- -In this case, the master branch pull was rejected because it wasn’t a fast-forward reference. +In this case, the master branch pull was rejected because it wasn't a fast-forward reference. You can override that by specifying the `+` in front of the refspec. You can also specify multiple refspecs for fetching in your configuration file. @@ -75,7 +75,7 @@ If you want to always fetch the master and experiment branches, add two lines: fetch = +refs/heads/experiment:refs/remotes/origin/experiment ---- -You can’t use partial globs in the pattern, so this would be invalid: +You can't use partial globs in the pattern, so this would be invalid: [source] ---- @@ -83,7 +83,7 @@ fetch = +refs/heads/qa*:refs/remotes/origin/qa* ---- However, you can use namespacing to accomplish something like that. -If you have a QA team that pushes a series of branches, and you want to get the master branch and any of the QA team’s branches but nothing else, you can use a config section like this: +If you have a QA team that pushes a series of branches, and you want to get the master branch and any of the QA team's branches but nothing else, you can use a config section like this: [source,ini] ---- @@ -97,7 +97,7 @@ If you have a complex workflow process that has a QA team pushing branches, deve ==== Pushing Refspecs -It’s nice that you can fetch namespaced references that way, but how does the QA team get their branches into a `qa/` namespace in the first place? +It's nice that you can fetch namespaced references that way, but how does the QA team get their branches into a `qa/` namespace in the first place? You accomplish that by using refspecs to push. 
If the QA team wants to push their `master` branch to `qa/master` on the remote server, they can run diff --git a/book/11-git-internals/sections/transfer-protocols.asc b/book/11-git-internals/sections/transfer-protocols.asc index aa9cd010c..8f0049848 100644 --- a/book/11-git-internals/sections/transfer-protocols.asc +++ b/book/11-git-internals/sections/transfer-protocols.asc @@ -7,7 +7,7 @@ This section will quickly cover how these two main protocols operate. Git transport over HTTP is often referred to as the dumb protocol because it requires no Git-specific code on the server side during the transport process. The fetch process is a series of GET requests, where the client can assume the layout of the Git repository on the server. -Let’s follow the `http-fetch` process for the simplegit library: +Let's follow the `http-fetch` process for the simplegit library: [source,shell] ---- @@ -24,7 +24,7 @@ ca82a6dff817ec66f44342007202690a93763949 refs/heads/master ---- Now you have a list of the remote references and SHAs. -Next, you look for what the HEAD reference is so you know what to check out when you’re finished: +Next, you look for what the HEAD reference is so you know what to check out when you're finished: [source] ---- @@ -32,8 +32,8 @@ Next, you look for what the HEAD reference is so you know what to check out when ref: refs/heads/master ---- -You need to check out the `master` branch when you’ve completed the process. -At this point, you’re ready to start the walking process. +You need to check out the `master` branch when you've completed the process. +At this point, you're ready to start the walking process. Because your starting point is the `ca82a6` commit object you saw in the `info/refs` file, you start by fetching that: [source] @@ -73,7 +73,7 @@ Grab the tree object: (404 - Not Found) ---- -Oops – it looks like that tree object isn’t in loose format on the server, so you get a 404 response back. 
+Oops – it looks like that tree object isn't in loose format on the server, so you get a 404 response back. There are a couple of reasons for this – the object could be in an alternate repository, or it could be in a packfile in this repository. Git checks for any listed alternates first: @@ -93,7 +93,7 @@ To see what packfiles are available on this server, you need to get the `objects P pack-816a9b2334da9953e530f27bcac22082a9f5b835.pack ---- -There is only one packfile on the server, so your object is obviously in there, but you’ll check the index file to make sure. +There is only one packfile on the server, so your object is obviously in there, but you'll check the index file to make sure. This is also useful if you have multiple packfiles on the server, so you can see which packfile contains the object you need: [source] @@ -112,7 +112,7 @@ Your object is there, so go ahead and get the whole packfile: ---- You have your tree object, so you continue walking your commits. -They’re all also within the packfile you just downloaded, so you don’t have to do any more requests to your server. +They're all also within the packfile you just downloaded, so you don't have to do any more requests to your server. Git checks out a working copy of the `master` branch that was pointed to by the HEAD reference you downloaded at the beginning. The entire output of this process looks like this: @@ -158,16 +158,16 @@ $ ssh -x git@github.com "git-receive-pack 'schacon/simplegit-progit.git'" ---- The `git-receive-pack` command immediately responds with one line for each reference it currently has – in this case, just the `master` branch and its SHA. -The first line also has a list of the server’s capabilities (here, `report-status` and `delete-refs`). +The first line also has a list of the server's capabilities (here, `report-status` and `delete-refs`). Each line starts with a 4-byte hex value specifying how long the rest of the line is. 
Your first line starts with 005b, which is hexadecimal for 91; strictly speaking, that count includes the four length characters themselves, so 87 bytes remain on that line. The next line starts with 003e, which is 62, so you read the remaining 58 bytes. The next line is 0000, meaning the server is done with its references listing. -Now that it knows the server’s state, your `send-pack` process determines what commits it has that the server doesn’t. +Now that it knows the server's state, your `send-pack` process determines what commits it has that the server doesn't. For each reference that this push will update, the `send-pack` process tells the `receive-pack` process that information. -For instance, if you’re updating the `master` branch and adding an `experiment` branch, the `send-pack` response may look something like this: +For instance, if you're updating the `master` branch and adding an `experiment` branch, the `send-pack` response may look something like this: [source] ---- @@ -176,12 +176,12 @@ For instance, if you’re updating the `master` branch and adding an `experiment 0000 ---- -The SHA-1 value of all '0's means that nothing was there before – because you’re adding the experiment reference. +The SHA-1 value of all '0's means that nothing was there before – because you're adding the experiment reference. If you were deleting a reference, you would see the opposite: all '0's on the right side. -Git sends a line for each reference you’re updating with the old SHA, the new SHA, and the reference that is being updated. -The first line also has the client’s capabilities. -Next, the client uploads a packfile of all the objects the server doesn’t have yet. +Git sends a line for each reference you're updating with the old SHA, the new SHA, and the reference that is being updated. +The first line also has the client's capabilities. +Next, the client uploads a packfile of all the objects the server doesn't have yet.
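The length-prefixed lines used throughout this exchange are easy to read and write yourself. The following Ruby sketch is an illustration only; `read_pkt_lines` and `pkt_line` are hypothetical helper names. It relies on the rule from Git's protocol documentation that the four hex digits count themselves, so the payload is the stated length minus four.

[source,ruby]
----
# Split a stream of length-prefixed lines into their payloads.
# A length of 0000 is the flush packet that ends a listing; it is
# returned as nil. (Hypothetical helper, not part of Git.)
def read_pkt_lines(data)
  lines = []
  pos = 0
  while pos < data.bytesize
    len = data.byteslice(pos, 4).to_i(16)
    if len.zero?
      lines << nil
      pos += 4
    elsif len < 4
      raise ArgumentError, "length #{len} is not handled by this sketch"
    else
      lines << data.byteslice(pos + 4, len - 4)  # payload excludes the prefix
      pos += len
    end
  end
  lines
end

# Frame a payload: four hex digits giving payload size plus the prefix.
def pkt_line(payload)
  format('%04x', payload.bytesize + 4) + payload
end

framed = pkt_line("ca82a6dff817ec66f44342007202690a93763949 refs/heads/master\n")
puts framed[0, 4]
p read_pkt_lines(framed + '0000')
----

The payload here is 59 bytes, so the prefix comes out as `003f` (63), and reading the framed line followed by `0000` gives back the payload and then `nil` for the flush.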
Finally, the server responds with a success (or failure) indication: [source] @@ -204,11 +204,11 @@ The `fetch-pack` process sends data that looks like this to the daemon after con 003fgit-upload-pack schacon/simplegit-progit.git\0host=myserver.com\0 ---- -It starts with the 4 bytes specifying how much data is following, then the command to run followed by a null byte, and then the server’s hostname followed by a final null byte. +It starts with the 4 bytes specifying how much data is following, then the command to run followed by a null byte, and then the server's hostname followed by a final null byte. The Git daemon checks that the command can be run and that the repository exists and has public permissions. If everything is cool, it fires up the `upload-pack` process and hands off the request to it. -If you’re doing the fetch over SSH, `fetch-pack` instead runs something like this: +If you're doing the fetch over SSH, `fetch-pack` instead runs something like this: [source,shell] ---- diff --git a/proposal.md b/proposal.md index 1335f66b5..abfe710be 100644 --- a/proposal.md +++ b/proposal.md @@ -10,11 +10,11 @@ Pro Git (Second Edition) is your fully-updated guide to Git and its usage in the Pro Git (Second Edition) is your fully-updated guide to Git and its usage in the modern world. Git has come a long way since it was first developed by Linus Torvalds for Linux kernel development. It has taken the open source world by storm since its inception in 2005, and this book teaches you how to use it like a pro. -Effective and well-implemented version control is a necessity for successful web projects, whether large or small. With this book you’ll learn how to master the world of distributed version workflow, use the distributed features of Git to the full, and extend Git to meet your every need. +Effective and well-implemented version control is a necessity for successful web projects, whether large or small. 
With this book you'll learn how to master the world of distributed version workflow, use the distributed features of Git to the full, and extend Git to meet your every need. -Written by Git pros Scott Chacon and Ben Straub, Pro Git (Second Edition) builds on the hugely successful first edition, and is now fully updated for Git version 2.0, as well as including an indispensable chapter on GitHub. It’s the best book for all your Git needs. +Written by Git pros Scott Chacon and Ben Straub, Pro Git (Second Edition) builds on the hugely successful first edition, and is now fully updated for Git version 2.0, as well as including an indispensable chapter on GitHub. It's the best book for all your Git needs. -## What You’ll Learn +## What You'll Learn * Effectively use Git, either as a programmer or a project leader * Become a fluent Git user @@ -26,7 +26,7 @@ Written by Git pros Scott Chacon and Ben Straub, Pro Git (Second Edition) builds ## Who This Book Is For -This book is for all open source developers: you are bound to encounter Git somewhere in the course of your working life. Proprietary software developers will appreciate Git’s enormous scalability, since it is used for the Linux project, which comprises thousands of developers and testers. +This book is for all open source developers: you are bound to encounter Git somewhere in the course of your working life. Proprietary software developers will appreciate Git's enormous scalability, since it is used for the Linux project, which comprises thousands of developers and testers. 
## Short Table of Contents From 7c7160aac1f3bb951173123b7aec75aa3ac598eb Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Tue, 16 Sep 2014 19:37:03 -0700 Subject: [PATCH 05/32] Refspec --- book/11-git-internals/sections/refspec.asc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/book/11-git-internals/sections/refspec.asc b/book/11-git-internals/sections/refspec.asc index 10aeae668..7902dd1a3 100644 --- a/book/11-git-internals/sections/refspec.asc +++ b/book/11-git-internals/sections/refspec.asc @@ -1,7 +1,7 @@ [[_refspec]] === The Refspec -Throughout this book, you've used simple mappings from remote branches to local references, but they can be more complex. +Throughout this book, we've used simple mappings from remote branches to local references, but they can be more complex. Suppose you add a remote like this: [source,shell] @@ -14,8 +14,8 @@ It adds a section to your `.git/config` file, specifying the name of the remote [source,ini] ---- [remote "origin"] - url = git@github.com:schacon/simplegit-progit.git - fetch = +refs/heads/*:refs/remotes/origin/* + url = https://github.com/schacon/simplegit-progit.git + fetch = +refs/heads/*:refs/remotes/origin/* ---- The format of the refspec is an optional `+`, followed by `<src>:<dst>`, where `<src>` is the pattern for references on the remote side and `<dst>` is where those references will be written locally. @@ -82,7 +82,7 @@ You can't use partial globs in the pattern, so this would be invalid: fetch = +refs/heads/qa*:refs/remotes/origin/qa* ---- -However, you can use namespacing to accomplish something like that. +However, you can use namespaces (or directories) to accomplish something like that.
If you have a QA team that pushes a series of branches, and you want to get the master branch and any of the QA team's branches but nothing else, you can use a config section like this: [source,ini] From 074439f381f1af8ba2084044078643fa823382de Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Wed, 17 Sep 2014 20:21:50 -0700 Subject: [PATCH 06/32] Maintenance --- book/07-git-tools/1-git-tools.asc | 1 + book/08-customizing-git/sections/policy.asc | 1 + .../11-git-internals/sections/maintenance.asc | 135 ++++++++++-------- 3 files changed, 80 insertions(+), 57 deletions(-) diff --git a/book/07-git-tools/1-git-tools.asc b/book/07-git-tools/1-git-tools.asc index 50efce0f5..bb04ff34f 100644 --- a/book/07-git-tools/1-git-tools.asc +++ b/book/07-git-tools/1-git-tools.asc @@ -1455,6 +1455,7 @@ f3cc40e changed my name a bit Once again, this changes the SHAs of all the commits in your list, so make sure no commit shows up in that list that you’ve already pushed to a shared repository. +[[_filter_branch]] ==== The Nuclear Option: filter-branch There is another history-rewriting option that you can use if you need to rewrite a larger number of commits in some scriptable way – for instance, changing your e-mail address globally or removing a file from every commit. diff --git a/book/08-customizing-git/sections/policy.asc b/book/08-customizing-git/sections/policy.asc index 671ee9d5d..3a9d3b41a 100644 --- a/book/08-customizing-git/sections/policy.asc +++ b/book/08-customizing-git/sections/policy.asc @@ -37,6 +37,7 @@ puts "(#{$refname}) (#{$oldrev[0,6]}) (#{$newrev[0,6]})" Yes, those are global variables. Don't judge – it's easier to demonstrate this way. +[[_enforcing_commit_message_format]] ===== Enforcing a Specific Commit-Message Format Your first challenge is to enforce that each commit message adheres to a particular format. 
diff --git a/book/11-git-internals/sections/maintenance.asc b/book/11-git-internals/sections/maintenance.asc index 4db26a204..4beb97ce5 100644 --- a/book/11-git-internals/sections/maintenance.asc +++ b/book/11-git-internals/sections/maintenance.asc @@ -8,7 +8,7 @@ This section will cover some of these scenarios. Occasionally, Git automatically runs a command called ``auto gc''. Most of the time, this command does nothing. However, if there are too many loose objects (objects not in a packfile) or too many packfiles, Git launches a full-fledged `git gc` command. -The `gc` stands for garbage collect, and the command does a number of things: it gathers up all the loose objects and places them in packfiles, it consolidates packfiles into one big packfile, and it removes objects that aren't reachable from any commit and are a few months old. +The ``gc'' stands for garbage collect, and the command does a number of things: it gathers up all the loose objects and places them in packfiles, it consolidates packfiles into one big packfile, and it removes objects that aren't reachable from any commit and are a few months old. 
You can run auto gc manually as follows: @@ -39,7 +39,7 @@ Git will move them for the sake of efficiency into a file named `.git/packed-ref [source,shell] ---- $ cat .git/packed-refs -# pack-refs with: peeled +# pack-refs with: peeled fully-peeled cac0cab538b970a37ea1e769cbbde608743bc96d refs/heads/experiment ab1afef80fac8e34258ff41fc1b867c702daa24b refs/heads/master cac0cab538b970a37ea1e769cbbde608743bc96d refs/tags/v1.0 @@ -99,8 +99,9 @@ You can see where you've been at any time by running `git reflog`: [source,shell] ---- $ git reflog -1a410ef HEAD@{0}: 1a410efbd13591db07496601ebc7a059dd55cfe9: updating HEAD -ab1afef HEAD@{1}: ab1afef80fac8e34258ff41fc1b867c702daa24b: updating HEAD +1a410ef HEAD@{0}: reset: moving to 1a410ef +ab1afef HEAD@{1}: commit: modified repo.rb a bit +484a592 HEAD@{2}: commit: added repo.rb ---- Here we can see the two commits that we have had checked out, however there is not much information here. @@ -123,7 +124,7 @@ Reflog message: updating HEAD Author: Scott Chacon Date: Fri May 22 18:15:24 2009 -0700 - modified repo a bit + modified repo.rb a bit ---- It looks like the bottom commit is the one you lost, so you can recover it by creating a new branch at that commit. @@ -158,6 +159,8 @@ If you run it with the `--full` option, it shows you all objects that aren't poi [source,shell] ---- $ git fsck --full +Checking object directories: 100% (256/256), done. +Checking objects: 100% (18/18), done. dangling blob d670460b4b4aece5915caf5c68d12f560a9fe3e4 dangling commit ab1afef80fac8e34258ff41fc1b867c702daa24b dangling tree aea790b9a58f6cf6f2804eeac9f0abbe9631e4c9 @@ -178,8 +181,8 @@ This can be a huge problem when you're converting Subversion or Perforce reposit Because you don't download the whole history in those systems, this type of addition carries few consequences. If you did an import from another system or otherwise find that your repository is much larger than it should be, here is how you can find and remove large objects. 
-Be warned: this technique is destructive to your commit history. -It rewrites every commit object downstream from the earliest tree you have to modify to remove a large file reference. +*Be warned: this technique is destructive to your commit history.* +It rewrites every commit object since the earliest tree you have to modify to remove a large file reference. If you do this immediately after an import, before anyone has started to base work on the commit, you're fine – otherwise, you have to notify all contributors that they must rebase their work onto your new commits. To demonstrate, you'll add a large file into your test repository, remove it in the next commit, find it, and remove it permanently from the repository. @@ -187,12 +190,12 @@ First, add a large object to your history: [source,shell] ---- -$ curl http://kernel.org/pub/software/scm/git/git-1.6.3.1.tar.bz2 > git.tbz2 -$ git add git.tbz2 -$ git commit -am 'added git tarball' -[master 6df7640] added git tarball - 1 files changed, 0 insertions(+), 0 deletions(-) - create mode 100644 git.tbz2 +$ curl https://www.kernel.org/pub/software/scm/git/git-2.1.0.tar.gz > git.tgz +$ git add git.tgz +$ git commit -m 'add git tarball' +[master 7b30847] add git tarball + 1 file changed, 0 insertions(+), 0 deletions(-) + create mode 100644 git.tgz ---- Oops – you didn't want to add a huge tarball to your project. 
@@ -200,12 +203,12 @@ Better get rid of it: [source,shell] ---- -$ git rm git.tbz2 -rm 'git.tbz2' +$ git rm git.tgz +rm 'git.tgz' $ git commit -m 'oops - removed large tarball' -[master da3f30d] oops - removed large tarball - 1 files changed, 0 insertions(+), 0 deletions(-) - delete mode 100644 git.tbz2 +[master dadf725] oops - removed large tarball + 1 file changed, 0 insertions(+), 0 deletions(-) + delete mode 100644 git.tgz ---- Now, `gc` your database and see how much space you're using: @@ -213,11 +216,11 @@ Now, `gc` your database and see how much space you're using: [source,shell] ---- $ git gc -Counting objects: 21, done. -Delta compression using 2 threads. -Compressing objects: 100% (16/16), done. -Writing objects: 100% (21/21), done. -Total 21 (delta 3), reused 15 (delta 1) +Counting objects: 17, done. +Delta compression using up to 8 threads. +Compressing objects: 100% (13/13), done. +Writing objects: 100% (17/17), done. +Total 17 (delta 1), reused 10 (delta 0) ---- You can run the `count-objects` command to quickly see how much space you're using: @@ -225,18 +228,19 @@ You can run the `count-objects` command to quickly see how much space you're usi [source,shell] ---- $ git count-objects -v -count: 4 -size: 16 -in-pack: 21 +count: 7 +size: 32 +in-pack: 17 packs: 1 -size-pack: 2016 +size-pack: 4868 prune-packable: 0 garbage: 0 +size-garbage: 0 ---- -The `size-pack` entry is the size of your packfiles in kilobytes, so you're using 2MB. +The `size-pack` entry is the size of your packfiles in kilobytes, so you're using almost 5MB. Before the last commit, you were using closer to 2K – clearly, removing the file from the previous commit didn't remove it from your history. -Every time anyone clones this repository, they will have to clone all 2MB just to get this tiny project, because you accidentally added a big file. 
+Every time anyone clones this repository, they will have to clone all 5MB just to get this tiny project, because you accidentally added a big file. Let's get rid of it. First you have to find it. @@ -247,21 +251,23 @@ You can also pipe it through the `tail` command because you're only interested i [source,shell] ---- -$ git verify-pack -v .git/objects/pack/pack-3f8c0...bb.idx | sort -k 3 -n | tail -3 -e3f094f522629ae358806b17daf78246c27c007b blob 1486 734 4667 -05408d195263d853f09dca71d55116663690c27c blob 12908 3478 1189 -7a9eb2fba2b1811321254ac360970fc169ba2330 blob 2056716 2056872 5401 +$ git verify-pack -v .git/objects/pack/pack-29…69.idx \ + | sort -k 3 -n \ + | tail -3 +dadf7258d699da2c8d89b09ef6670edb7d5f91b4 commit 229 159 12 +033b4468fa6b2a9547a70d88d1bbe8bf3f9ed0d5 blob 22044 5792 4977696 +82c99a3e86bb1267b236a4b6eff7868d97489af1 blob 4975916 4976258 1438 ---- -The big object is at the bottom: 2MB. -To find out what file it is, you'll use the `rev-list` command, which you used briefly in <<_customizing_git>>. +The big object is at the bottom: 5MB. +To find out what file it is, you'll use the `rev-list` command, which you used briefly in <<_enforcing_commit_message_format>>. If you pass `--objects` to `rev-list`, it lists all the commit SHAs and also the blob SHAs with the file paths associated with them. You can use this to find your blob's name: [source,shell] ---- -$ git rev-list --objects --all | grep 7a9eb2fb -7a9eb2fba2b1811321254ac360970fc169ba2330 git.tbz2 +$ git rev-list --objects --all | grep 82c99a3 +82c99a3e86bb1267b236a4b6eff7868d97489af1 git.tgz ---- Now, you need to remove this file from all trees in your past. 
@@ -269,24 +275,24 @@ You can easily see what commits modified this file: [source,shell] ---- -$ git log --pretty=oneline -- git.tbz2 -da3f30d019005479c99eb4c3406225613985a1db oops - removed large tarball -6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 added git tarball +$ git log --oneline -- git.tgz +dadf725 oops - removed large tarball +7b30847 add git tarball ---- -You must rewrite all the commits downstream from `6df76` to fully remove this file from your Git history. -To do so, you use `filter-branch`, which you used in <<_git_tools>>: +You must rewrite all the commits downstream from `7b30847` to fully remove this file from your Git history. +To do so, you use `filter-branch`, which you saw in <<_filter_branch>>: [source,shell] ---- $ git filter-branch --index-filter \ - 'git rm --cached --ignore-unmatch git.tbz2' -- 6df7640^.. -Rewrite 6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 (1/2)rm 'git.tbz2' -Rewrite da3f30d019005479c99eb4c3406225613985a1db (2/2) + 'git rm --cached --ignore-unmatch git.tgz' -- 7b30847^.. +Rewrite 7b30847d080183a1ab7d18fb202473b3096e9f34 (1/2)rm 'git.tgz' +Rewrite dadf7258d699da2c8d89b09ef6670edb7d5f91b4 (2/2) Ref 'refs/heads/master' was rewritten ---- -The `--index-filter` option is similar to the `--tree-filter` option used in <<_git_tools>>, except that instead of passing a command that modifies files checked out on disk, you're modifying your staging area or index each time. +The `--index-filter` option is similar to the `--tree-filter` option used in <<_filter_branch>>, except that instead of passing a command that modifies files checked out on disk, you're modifying your staging area or index each time. Rather than remove a specific file with something like `rm file`, you have to remove it with `git rm --cached` – you must remove it from the index, not from disk. The reason to do it this way is speed – because Git doesn't have to check out each revision to disk before running your filter, the process can be much, much faster. 
You can accomplish the same task with `--tree-filter` if you want. @@ -303,11 +309,11 @@ You need to get rid of anything that has a pointer to those old commits before y $ rm -Rf .git/refs/original $ rm -Rf .git/logs/ $ git gc -Counting objects: 19, done. -Delta compression using 2 threads. -Compressing objects: 100% (14/14), done. -Writing objects: 100% (19/19), done. -Total 19 (delta 3), reused 16 (delta 1) +Counting objects: 15, done. +Delta compression using up to 8 threads. +Compressing objects: 100% (11/11), done. +Writing objects: 100% (15/15), done. +Total 15 (delta 1), reused 12 (delta 0) ---- Let's see how much space you saved. @@ -315,15 +321,30 @@ Let's see how much space you saved. [source,shell] ---- $ git count-objects -v -count: 8 -size: 2040 -in-pack: 19 +count: 11 +size: 4904 +in-pack: 15 packs: 1 -size-pack: 7 +size-pack: 8 prune-packable: 0 garbage: 0 +size-garbage: 0 ---- -The packed repository size is down to 7K, which is much better than 2MB. +The packed repository size is down to 8K, which is much better than 5MB. You can see from the size value that the big object is still in your loose objects, so it's not gone; but it won't be transferred on a push or subsequent clone, which is what is important. -If you really wanted to, you could remove the object completely by running `git prune --expire`. 
+If you really wanted to, you could remove the object completely by running `git prune` with the `--expire` option: + +[source,shell] +---- +$ git prune --expire now +$ git count-objects -v +count: 0 +size: 0 +in-pack: 15 +packs: 1 +size-pack: 8 +prune-packable: 0 +garbage: 0 +size-garbage: 0 +---- From c24f22d6a6b0572f5756afd64f0c2e9fa6f804fa Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Fri, 19 Sep 2014 21:01:07 -0700 Subject: [PATCH 07/32] Partially-written, partially-organized env vars --- .../11-git-internals/sections/environment.asc | 92 +++++++++++++++++++ 1 file changed, 92 insertions(+) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index 22083e329..dbc43b339 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -1 +1,93 @@ === Environment Variables + +Git always runs inside a `bash` shell, and uses a number of shell environment variables to determine how it behaves. +Occasionally, it comes in handy to know what these are, and how they can be used to make Git behave the way you want it to. + + +==== Global Behavior + +Some of Git's behavior as a computer program depends on environment variables. + +* `GIT_EXEC_PATH` determines where Git looks for its sub-programs (like `git-commit`, `git-diff`, and others). + You can check the current setting by running `git --exec-path`. + +* `HOME` isn't usually considered customizable (too many other things depend on it), but it's where Git looks for the global configuration file. + If you want a truly portable Git installation, complete with global configuration, you can override `HOME` in the portable Git's shell profile. + +* `PREFIX` is similar, but for the system-wide configuration. + Git looks for this file at `$PREFIX/etc/gitconfig`. + +* `GIT_CONFIG_NOSYSTEM`, if set, disables the use of the system-wide configuration file. 
+ This is useful if your system config is interfering with your commands, but you don't have access to change or remove it. + +==== Repository Locations + +Git uses several environment variables to find the paths to files related to the current repository: + +* `GIT_DIR` is the location of the `.git` folder. + If this isn't specified, Git walks up the directory tree until it gets to `~`, looking for a `.git` directory at every step. + +* `GIT_CEILING_DIRECTORIES` controls the behavior of searching for a `.git` directory. + If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise. + +* `GIT_DISCOVERY_ACROSS_FILESYSTEM` can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). + +* `GIT_WORK_TREE` is the location of the root of the working directory for a non-bare repository. + If not specified, the parent directory of `$GIT_DIR` is used. + +* `GIT_INDEX_FILE` is the path to the index file (non-bare repositories only). + +* `GIT_OBJECT_DIRECTORY` can be used to specify the location of the directory that usually resides at `.git/objects`. + +* `GIT_ALTERNATE_OBJECT_DIRECTORIES` is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. + If you happen to have a lot of projects with large files that have the exact same contents, this can be used to avoid storing too many copies of them. 
+ + +==== Pathspecs + +* `GIT_LITERAL_PATHSPECS` + +* `GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS` + +* `GIT_ICASE_PATHSPECS` + +==== Commiting + +* `GIT_AUTHOR_NAME` + +* `GIT_AUTHOR_EMAIL` + +* `GIT_AUTHOR_DATE` + +* `GIT_COMMITTER_NAME` + +* `GIT_COMMITTER_EMAIL` + +* `GIT_COMMITTER_DATE` + +* `EMAIL` + + +==== Diffing and Merging + +GIT_DIFF_OPTS +GIT_EXTERNAL_DIFF +GIT_DIFF_PATH_COUNTER +GIT_DIFF_PATH_TOTAL +GIT_MERGE_VERBOSITY + +==== Miscellaneous + +GIT_PAGER +GIT_EDITOR +GIT_SSH +GIT_ASKPASS +GIT_NAMESPACE +GIT_FLUSH +GIT_TRACE +GIT_TRACE_PACK_ACCESS +GIT_TRACE_PACKET +GIT_TRACE_PERFORMANCE +GIT_TRACE_SETUP +GIT_TRACE_SHALLOW +GIT_REFLOG_ACTION From a18ec025f126beed5c52d429bf86f2af9e396331 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Sun, 21 Sep 2014 19:57:58 -0700 Subject: [PATCH 08/32] Categorization, formatting test --- .../11-git-internals/sections/environment.asc | 107 ++++++++++++------ 1 file changed, 72 insertions(+), 35 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index dbc43b339..cc7945fdc 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -2,71 +2,93 @@ Git always runs inside a `bash` shell, and uses a number of shell environment variables to determine how it behaves. Occasionally, it comes in handy to know what these are, and how they can be used to make Git behave the way you want it to. +This isn't an exhaustive list of all the environment variables Git pays attention to, but we gathered the most useful. ==== Global Behavior -Some of Git's behavior as a computer program depends on environment variables. +Some of Git's general behavior as a computer program depends on environment variables. -* `GIT_EXEC_PATH` determines where Git looks for its sub-programs (like `git-commit`, `git-diff`, and others). 
+* *`GIT_EXEC_PATH`* determines where Git looks for its sub-programs (like `git-commit`, `git-diff`, and others). You can check the current setting by running `git --exec-path`. -* `HOME` isn't usually considered customizable (too many other things depend on it), but it's where Git looks for the global configuration file. +* *`HOME`* isn't usually considered customizable (too many other things depend on it), but it's where Git looks for the global configuration file. If you want a truly portable Git installation, complete with global configuration, you can override `HOME` in the portable Git's shell profile. -* `PREFIX` is similar, but for the system-wide configuration. +* *`PREFIX`* is similar, but for the system-wide configuration. Git looks for this file at `$PREFIX/etc/gitconfig`. -* `GIT_CONFIG_NOSYSTEM`, if set, disables the use of the system-wide configuration file. +* *`GIT_CONFIG_NOSYSTEM`*, if set, disables the use of the system-wide configuration file. This is useful if your system config is interfering with your commands, but you don't have access to change or remove it. + +[cols="1,2",options="header"] +|================================ +| Variable | Notes +| `GIT_PAGER` | Controls the program used to display multi-page output on the command line. + If this is unset, `PAGER` will be used as a fallback. +| `GIT_EDITOR` | The editor Git will launch when the user needs to edit some text (a commit message, for example). + If unset, `EDITOR` will be used. +|================================ + + + ==== Repository Locations Git uses several environment variables to find the paths to files related to the current repository: -* `GIT_DIR` is the location of the `.git` folder. +* *`GIT_DIR`* is the location of the `.git` folder. If this isn't specified, Git walks up the directory tree until it gets to `~`, looking for a `.git` directory at every step. -* `GIT_CEILING_DIRECTORIES` controls the behavior of searching for a `.git` directory. 
+* *`GIT_CEILING_DIRECTORIES`* controls the behavior of searching for a `.git` directory. If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise. -* `GIT_DISCOVERY_ACROSS_FILESYSTEM` can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). +* *`GIT_DISCOVERY_ACROSS_FILESYSTEM`* can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). -* `GIT_WORK_TREE` is the location of the root of the working directory for a non-bare repository. +* *`GIT_WORK_TREE`* is the location of the root of the working directory for a non-bare repository. If not specified, the parent directory of `$GIT_DIR` is used. -* `GIT_INDEX_FILE` is the path to the index file (non-bare repositories only). +* *`GIT_INDEX_FILE`* is the path to the index file (non-bare repositories only). -* `GIT_OBJECT_DIRECTORY` can be used to specify the location of the directory that usually resides at `.git/objects`. +* *`GIT_OBJECT_DIRECTORY`* can be used to specify the location of the directory that usually resides at `.git/objects`. -* `GIT_ALTERNATE_OBJECT_DIRECTORIES` is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. +* *`GIT_ALTERNATE_OBJECT_DIRECTORIES`* is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. If you happen to have a lot of projects with large files that have the exact same contents, this can be used to avoid storing too many copies of them. 
==== Pathspecs -* `GIT_LITERAL_PATHSPECS` +* *`GIT_LITERAL_PATHSPECS`* -* `GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS` +* *`GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS`* -* `GIT_ICASE_PATHSPECS` +* *`GIT_ICASE_PATHSPECS`* ==== Commiting -* `GIT_AUTHOR_NAME` +* *`GIT_AUTHOR_NAME`* + +* *`GIT_AUTHOR_EMAIL`* -* `GIT_AUTHOR_EMAIL` +* *`GIT_AUTHOR_DATE`* -* `GIT_AUTHOR_DATE` +* *`GIT_COMMITTER_NAME`* -* `GIT_COMMITTER_NAME` +* *`GIT_COMMITTER_EMAIL`* -* `GIT_COMMITTER_EMAIL` +* *`GIT_COMMITTER_DATE`* -* `GIT_COMMITTER_DATE` +* *`EMAIL`* -* `EMAIL` +==== Networking + +* *`GIT_CURL_VERBOSE`* + +GIT_SSL_NO_VERIFY +GIT_HTTP_LOW_SPEED_LIMIT +GIT_HTTP_LOW_SPEED_TIME +GIT_HTTP_USER_AGENT ==== Diffing and Merging @@ -76,18 +98,33 @@ GIT_DIFF_PATH_COUNTER GIT_DIFF_PATH_TOTAL GIT_MERGE_VERBOSITY +==== Debugging + +Want to _really_ know what Git is up to? +Git has a fairly complete set of traces embedded, and all you need to do is turn them on. +If any of these are set to ``true'' (or 1 or 2), that trace category will be output to stderr; if the value is an absolute path (starts with `/`), the trace output will be written to that file. 
+ +* *`GIT_TRACE`* + +* *`GIT_TRACE_PACK_ACCESS`* + +* *`GIT_TRACE_PACKET`* + +* *`GIT_TRACE_PERFORMANCE`* + +* *`GIT_TRACE_SETUP`* + +* *`GIT_TRACE_SHALLOW`* + + ==== Miscellaneous -GIT_PAGER -GIT_EDITOR -GIT_SSH -GIT_ASKPASS -GIT_NAMESPACE -GIT_FLUSH -GIT_TRACE -GIT_TRACE_PACK_ACCESS -GIT_TRACE_PACKET -GIT_TRACE_PERFORMANCE -GIT_TRACE_SETUP -GIT_TRACE_SHALLOW -GIT_REFLOG_ACTION +* *`GIT_SSH`* + +* *`GIT_ASKPASS`* + +* *`GIT_NAMESPACE`* + +* *`GIT_FLUSH`* + +* *`GIT_REFLOG_ACTION`* From e998870a6be8b58f7a6475a074c03f7516afcb82 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Sun, 21 Sep 2014 20:24:01 -0700 Subject: [PATCH 09/32] Tracing progress --- .../11-git-internals/sections/environment.asc | 201 ++++++++++++------ 1 file changed, 140 insertions(+), 61 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index cc7945fdc..2a2ccd22d 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -9,122 +9,201 @@ This isn't an exhaustive list of all the environment variables Git pays attentio Some of Git's general behavior as a computer program depends on environment variables. -* *`GIT_EXEC_PATH`* determines where Git looks for its sub-programs (like `git-commit`, `git-diff`, and others). +*`GIT_EXEC_PATH`* determines where Git looks for its sub-programs (like `git-commit`, `git-diff`, and others). You can check the current setting by running `git --exec-path`. -* *`HOME`* isn't usually considered customizable (too many other things depend on it), but it's where Git looks for the global configuration file. +*`HOME`* isn't usually considered customizable (too many other things depend on it), but it's where Git looks for the global configuration file. If you want a truly portable Git installation, complete with global configuration, you can override `HOME` in the portable Git's shell profile. 
-* *`PREFIX`* is similar, but for the system-wide configuration. +*`PREFIX`* is similar, but for the system-wide configuration. Git looks for this file at `$PREFIX/etc/gitconfig`. -* *`GIT_CONFIG_NOSYSTEM`*, if set, disables the use of the system-wide configuration file. +*`GIT_CONFIG_NOSYSTEM`*, if set, disables the use of the system-wide configuration file. This is useful if your system config is interfering with your commands, but you don't have access to change or remove it. +*`GIT_PAGER`* controls the program used to display multi-page output on the command line. +If this is unset, `PAGER` will be used as a fallback. -[cols="1,2",options="header"] -|================================ -| Variable | Notes -| `GIT_PAGER` | Controls the program used to display multi-page output on the command line. - If this is unset, `PAGER` will be used as a fallback. -| `GIT_EDITOR` | The editor Git will launch when the user needs to edit some text (a commit message, for example). - If unset, `EDITOR` will be used. -|================================ - +*`GIT_EDITOR`* is the editor Git will launch when the user needs to edit some text (a commit message, for example). +If unset, `EDITOR` will be used. ==== Repository Locations -Git uses several environment variables to find the paths to files related to the current repository: +Git uses several environment variables to find the paths to files related to the current repository. -* *`GIT_DIR`* is the location of the `.git` folder. - If this isn't specified, Git walks up the directory tree until it gets to `~`, looking for a `.git` directory at every step. +*`GIT_DIR`* is the location of the `.git` folder. +If this isn't specified, Git walks up the directory tree until it gets to `~`, looking for a `.git` directory at every step. -* *`GIT_CEILING_DIRECTORIES`* controls the behavior of searching for a `.git` directory. 
- If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise. +*`GIT_CEILING_DIRECTORIES`* controls the behavior of searching for a `.git` directory. +If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise. -* *`GIT_DISCOVERY_ACROSS_FILESYSTEM`* can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). +*`GIT_DISCOVERY_ACROSS_FILESYSTEM`* can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). -* *`GIT_WORK_TREE`* is the location of the root of the working directory for a non-bare repository. - If not specified, the parent directory of `$GIT_DIR` is used. +*`GIT_WORK_TREE`* is the location of the root of the working directory for a non-bare repository. +If not specified, the parent directory of `$GIT_DIR` is used. -* *`GIT_INDEX_FILE`* is the path to the index file (non-bare repositories only). +*`GIT_INDEX_FILE`* is the path to the index file (non-bare repositories only). -* *`GIT_OBJECT_DIRECTORY`* can be used to specify the location of the directory that usually resides at `.git/objects`. +*`GIT_OBJECT_DIRECTORY`* can be used to specify the location of the directory that usually resides at `.git/objects`. -* *`GIT_ALTERNATE_OBJECT_DIRECTORIES`* is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. - If you happen to have a lot of projects with large files that have the exact same contents, this can be used to avoid storing too many copies of them. 
+*`GIT_ALTERNATE_OBJECT_DIRECTORIES`* is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. +If you happen to have a lot of projects with large files that have the exact same contents, this can be used to avoid storing too many copies of them. ==== Pathspecs -* *`GIT_LITERAL_PATHSPECS`* +*`GIT_LITERAL_PATHSPECS`* -* *`GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS`* +*`GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS`* -* *`GIT_ICASE_PATHSPECS`* +*`GIT_ICASE_PATHSPECS`* ==== Commiting -* *`GIT_AUTHOR_NAME`* +*`GIT_AUTHOR_NAME`* -* *`GIT_AUTHOR_EMAIL`* +*`GIT_AUTHOR_EMAIL`* -* *`GIT_AUTHOR_DATE`* +*`GIT_AUTHOR_DATE`* -* *`GIT_COMMITTER_NAME`* +*`GIT_COMMITTER_NAME`* -* *`GIT_COMMITTER_EMAIL`* +*`GIT_COMMITTER_EMAIL`* -* *`GIT_COMMITTER_DATE`* +*`GIT_COMMITTER_DATE`* -* *`EMAIL`* +*`EMAIL`* ==== Networking -* *`GIT_CURL_VERBOSE`* +*`GIT_CURL_VERBOSE`* -GIT_SSL_NO_VERIFY -GIT_HTTP_LOW_SPEED_LIMIT -GIT_HTTP_LOW_SPEED_TIME -GIT_HTTP_USER_AGENT +*GIT_SSL_NO_VERIFY* -==== Diffing and Merging +*GIT_HTTP_LOW_SPEED_LIMIT* -GIT_DIFF_OPTS -GIT_EXTERNAL_DIFF -GIT_DIFF_PATH_COUNTER -GIT_DIFF_PATH_TOTAL -GIT_MERGE_VERBOSITY +*GIT_HTTP_LOW_SPEED_TIME* -==== Debugging +*GIT_HTTP_USER_AGENT* -Want to _really_ know what Git is up to? -Git has a fairly complete set of traces embedded, and all you need to do is turn them on. -If any of these are set to ``true'' (or 1 or 2), that trace category will be output to stderr; if the value is an absolute path (starts with `/`), the trace output will be written to that file. -* *`GIT_TRACE`* +==== Diffing and Merging + +*GIT_DIFF_OPTS* + +*GIT_EXTERNAL_DIFF* + +*GIT_DIFF_PATH_COUNTER* -* *`GIT_TRACE_PACK_ACCESS`* +*GIT_DIFF_PATH_TOTAL* -* *`GIT_TRACE_PACKET`* +*GIT_MERGE_VERBOSITY* -* *`GIT_TRACE_PERFORMANCE`* -* *`GIT_TRACE_SETUP`* +==== Debugging -* *`GIT_TRACE_SHALLOW`* +Want to _really_ know what Git is up to? 
+Git has a fairly complete set of traces embedded, and all you need to do is turn them on. +The possible values of these variables are as follows: + +* ``true'', ``1'', or ``2'' – the trace category is written to stderr. +* An absolute path starting with `/` – the trace output will be written to that file. + +*`GIT_TRACE`* controls general traces, which don't fit into any specific category. +This includes the expansion of aliases, and delegation to other sub-programs. + +[source,shell] +---- +$ GIT_TRACE=true git lga +20:12:49.877982 git.c:554 trace: exec: 'git-lga' +20:12:49.878369 run-command.c:341 trace: run_command: 'git-lga' +20:12:49.879529 git.c:282 trace: alias expansion: lga => 'log' '--graph' '--pretty=oneline' '--abbrev-commit' '--decorate' '--all' +20:12:49.879885 git.c:349 trace: built-in: git 'log' '--graph' '--pretty=oneline' '--abbrev-commit' '--decorate' '--all' +20:12:49.899217 run-command.c:341 trace: run_command: 'less' +20:12:49.899675 run-command.c:192 trace: exec: 'less' +---- + +*`GIT_TRACE_PACK_ACCESS`* controls tracing of packfile access. +The first field is the packfile being accessed, the second is the offset within that file: + +[source,shell] +---- +$ GIT_TRACE_PACK_ACCESS=true git status +20:10:12.081397 sha1_file.c:2088 .git/objects/pack/pack-c3fa...291e.pack 12 +20:10:12.081886 sha1_file.c:2088 .git/objects/pack/pack-c3fa...291e.pack 34662 +20:10:12.082115 sha1_file.c:2088 .git/objects/pack/pack-c3fa...291e.pack 35175 +# […] +20:10:12.087398 sha1_file.c:2088 .git/objects/pack/pack-e80e...e3d2.pack 56914983 +20:10:12.087419 sha1_file.c:2088 .git/objects/pack/pack-e80e...e3d2.pack 14303666 +On branch master +Your branch is up-to-date with 'origin/master'. +nothing to commit, working directory clean +---- + +*`GIT_TRACE_PACKET`* enables packet-level tracing for network operations. 
+ +[source,shell] +---- +$ GIT_TRACE_PACKET=true git ls-remote origin +20:15:14.867043 pkt-line.c:46 packet: git< # service=git-upload-pack +20:15:14.867071 pkt-line.c:46 packet: git< 0000 +20:15:14.867079 pkt-line.c:46 packet: git< 97b8860c071898d9e162678ea1035a8ced2f8b1f HEAD\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master agent=git/2.0.4 +20:15:14.867088 pkt-line.c:46 packet: git< 0f20ae29889d61f2e93ae00fd34f1cdb53285702 refs/heads/ab/add-interactive-show-diff-func-name +20:15:14.867094 pkt-line.c:46 packet: git< 36dc827bc9d17f80ed4f326de21247a5d1341fbc refs/heads/ah/doc-gitk-config +# […] +---- + +*`GIT_TRACE_PERFORMANCE`* controls logging of performance data. +The output shows how long each particular git invocation takes. + +[source,shell] +---- +$ GIT_TRACE_PERFORMANCE=true git gc +20:18:19.499676 trace.c:414 performance: 0.374835000 s: git command: 'git' 'pack-refs' '--all' '--prune' +20:18:19.845585 trace.c:414 performance: 0.343020000 s: git command: 'git' 'reflog' 'expire' '--all' +Counting objects: 170994, done. +Delta compression using up to 8 threads. +Compressing objects: 100% (43413/43413), done. +Writing objects: 100% (170994/170994), done. +Total 170994 (delta 126176), reused 170524 (delta 125706) +20:18:23.567927 trace.c:414 performance: 3.715349000 s: git command: 'git' 'pack-objects' '--keep-true-parents' '--honor-pack-keep' '--non-empty' '--all' '--reflog' '--unpack-unreachable=2.weeks.ago' '--local' '--delta-base-offset' '.git/objects/pack/.tmp-49190-pack' +20:18:23.584728 trace.c:414 performance: 0.000910000 s: git command: 'git' 'prune-packed' +20:18:23.605218 trace.c:414 performance: 0.017972000 s: git command: 'git' 'update-server-info' +20:18:23.606342 trace.c:414 performance: 3.756312000 s: git command: 'git' 'repack' '-d' '-l' '-A' '--unpack-unreachable=2.weeks.ago' +Checking connectivity: 170994, done. 
+20:18:25.225424 trace.c:414 performance: 1.616423000 s: git command: 'git' 'prune' '--expire' '2.weeks.ago' +20:18:25.232403 trace.c:414 performance: 0.001051000 s: git command: 'git' 'rerere' 'gc' +20:18:25.233159 trace.c:414 performance: 6.112217000 s: git command: 'git' 'gc' +---- + +*`GIT_TRACE_SETUP`* shows information about what Git is discovering about the repository and environment it's interacting with. + +[source,shell] +---- +$ GIT_TRACE_SETUP=true git status +20:19:47.086765 trace.c:315 setup: git_dir: .git +20:19:47.087184 trace.c:316 setup: worktree: /Users/ben/src/git +20:19:47.087191 trace.c:317 setup: cwd: /Users/ben/src/git +20:19:47.087194 trace.c:318 setup: prefix: (null) +On branch master +Your branch is up-to-date with 'origin/master'. +nothing to commit, working directory clean +---- + + +*`GIT_TRACE_SHALLOW`* ==== Miscellaneous -* *`GIT_SSH`* +*`GIT_SSH`* -* *`GIT_ASKPASS`* +*`GIT_ASKPASS`* -* *`GIT_NAMESPACE`* +*`GIT_NAMESPACE`* -* *`GIT_FLUSH`* +*`GIT_FLUSH`* -* *`GIT_REFLOG_ACTION`* +*`GIT_REFLOG_ACTION`* From fcd36ab1efdbe5bbebbed707a156fb8b23e2e882 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Wed, 24 Sep 2014 05:45:41 -0700 Subject: [PATCH 10/32] Wrap up tracing section --- book/11-git-internals/sections/environment.asc | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index 2a2ccd22d..02a26a512 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -50,6 +50,8 @@ If not specified, the parent directory of `$GIT_DIR` is used. *`GIT_ALTERNATE_OBJECT_DIRECTORIES`* is a colon-separated list (formatted like `/dir/one:/dir/two:…`) which tells Git where to check for objects if they aren't in `GIT_OBJECT_DIRECTORY`. If you happen to have a lot of projects with large files that have the exact same contents, this can be used to avoid storing too many copies of them. 
+*`GIT_SHALLOW_FILE`* overrides the default location of the ``shallow'' file, which usually lives at `.git/shallow`. + ==== Pathspecs @@ -192,10 +194,6 @@ Your branch is up-to-date with 'origin/master'. nothing to commit, working directory clean ---- - -*`GIT_TRACE_SHALLOW`* - - ==== Miscellaneous *`GIT_SSH`* From 5c9428c5017609ee070713c6db275f4bac4cf86d Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Wed, 24 Sep 2014 15:12:00 -0700 Subject: [PATCH 11/32] Getting closer --- .../11-git-internals/sections/environment.asc | 42 ++++++++++++------- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index 02a26a512..76f8d7211 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -55,40 +55,52 @@ If you happen to have a lot of projects with large files that have the exact sam ==== Pathspecs -*`GIT_LITERAL_PATHSPECS`* +A ``pathspec'' refers to how you specify paths to things in Git. +These are used in the `.gitignore` file, but also on the command-line (`git add *.c`). -*`GIT_GLOB_PATHSPECS/GIT_NOGLOB_PATHSPECS`* +*`GIT_GLOB_PATHSPECS` and `GIT_NOGLOB_PATHSPECS`* control the default behavior of wildcards in pathspecs. +If `GIT_GLOB_PATHSPECS` is set to 1, wildcard characters act as wildcards (which is the default); if `GIT_NOGLOB_PATHSPECS` is set to 1, wildcard characters only match themselves, meaning something like `*.c` would only match a file _named_ ``*.c'', rather than any file whose name ends with `.c`. +You can override this in individual cases by starting the pathspec with `:(glob)` or `:(literal)`, as in `:(glob)*.c`. + +*`GIT_LITERAL_PATHSPECS`* disables both of the above behaviors; no wildcard characters will work, and the override prefixes are disabled as well. + +*`GIT_ICASE_PATHSPECS`* sets all pathspecs to work in a case-insensitive manner. 
-*`GIT_ICASE_PATHSPECS`* ==== Commiting -*`GIT_AUTHOR_NAME`* +The final creation of a Git commit object is usually done by `git-commit-tree`, which uses these environment variables as its primary source of information, falling back to configuration values only if these aren't present. + +*`GIT_AUTHOR_NAME`* is the human-readable name in the ``author'' field. -*`GIT_AUTHOR_EMAIL`* +*`GIT_AUTHOR_EMAIL`* is the email for the ``author'' field. -*`GIT_AUTHOR_DATE`* +*`GIT_AUTHOR_DATE`* is the timestamp used for the ``author'' field. -*`GIT_COMMITTER_NAME`* +*`GIT_COMMITTER_NAME`* sets the human name for the ``committer'' field. -*`GIT_COMMITTER_EMAIL`* +*`GIT_COMMITTER_EMAIL`* is the email address for the ``committer'' field. -*`GIT_COMMITTER_DATE`* +*`GIT_COMMITTER_DATE`* is used for the timestamp in the ``committer'' field. -*`EMAIL`* +*`EMAIL`* is the fallback email address in case the `user.email` configuration value isn't set. +If _this_ isn't set, Git falls back to the system user and host names. ==== Networking -*`GIT_CURL_VERBOSE`* +Git uses the `curl` library to do network operations over HTTP, so *`GIT_CURL_VERBOSE`* tells Git to emit all the messages generated by that library. +This is similar to doing `curl -v` on the command line. -*GIT_SSL_NO_VERIFY* +*GIT_SSL_NO_VERIFY* tells Git not to verify SSL certificates. +This can sometimes be necessary if you're using a self-signed certificate to serve Git repositories over HTTPS, or you're in the middle of setting up a Git server but haven't installed a full certificate yet. -*GIT_HTTP_LOW_SPEED_LIMIT* -*GIT_HTTP_LOW_SPEED_TIME* +If an HTTP operation is lower than *`GIT_HTTP_LOW_SPEED_LIMIT`* bytes per second for longer than *GIT_HTTP_LOW_SPEED_TIME* seconds, Git will abort that operation. +These values override the `http.lowSpeedLimit` and `http.lowSpeedTime` configuration values. -*GIT_HTTP_USER_AGENT* +*GIT_HTTP_USER_AGENT* sets the user-agent string used by Git when communicating over HTTP. 
+The default is a value like `git/2.0.0`. ==== Diffing and Merging From aea93a7370aa0468825f37972f8a495f414d498d Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Thu, 25 Sep 2014 16:45:21 -0700 Subject: [PATCH 12/32] Diffing and merging --- .../11-git-internals/sections/environment.asc | 20 ++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index 76f8d7211..2f35cf3a2 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -105,16 +105,26 @@ The default is a value like `git/2.0.0`. ==== Diffing and Merging -*GIT_DIFF_OPTS* +*GIT_DIFF_OPTS* is a bit of a misnomer. +The only valid values are `-u<n>` or `--unified=<n>`, which controls the number of context lines shown in a `git diff` command. -*GIT_EXTERNAL_DIFF* +*GIT_EXTERNAL_DIFF* is used as an override for the `diff.external` configuration value. +If it's set, Git will invoke this program when `git diff` is invoked. -*GIT_DIFF_PATH_COUNTER* +*GIT_DIFF_PATH_COUNTER* and *GIT_DIFF_PATH_TOTAL* are useful when you're using `GIT_EXTERNAL_DIFF` or `diff.external`. +The former represents which file in a series is being diffed (starting with 1), and the latter is the total number of files in the batch. -*GIT_DIFF_PATH_TOTAL* +*GIT_MERGE_VERBOSITY* controls the output for the recursive merge strategy. +The allowed values are as follows: -*GIT_MERGE_VERBOSITY* +* 0 outputs nothing, except possibly a single error message. +* 1 shows only conflicts. +* 2 also shows file changes. +* 3 shows when files are skipped because they haven't changed. +* 4 shows all paths as they are processed. +* 5 and above show detailed debugging information. +The default value is 2. ==== Debugging From 13843c681abc51bf910e6771020824e88c69d036 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Thu, 25 Sep 2014 18:54:14 -0700 Subject: [PATCH 13/32] Misc.
env vars --- .../11-git-internals/sections/environment.asc | 27 +++++++++++++++---- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index 2f35cf3a2..d5918720d 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -218,12 +218,29 @@ nothing to commit, working directory clean ==== Miscellaneous -*`GIT_SSH`* +*`GIT_SSH`*, if specified, is a program that is invoked instead of `ssh` when Git tries to connect to an SSH host. +It is invoked like `$GIT_SSH [username@]host [-p <port>] <command>`. +Note that this isn't the easiest way to customize how `ssh` is invoked; it won't support extra command-line parameters, so you'd have to write a wrapper script and set `GIT_SSH` to point to it. +It's probably easier just to use the `~/.ssh/config` file for that. -*`GIT_ASKPASS`* +*`GIT_ASKPASS`* is an override for the `core.askpass` configuration value. +This is the program invoked whenever Git needs to ask the user for credentials, which can expect a text prompt as a command-line argument, and should return the answer on `stdout`. +(See <<_credential_caching>> for more on this subsystem.) -*`GIT_NAMESPACE`* +*`GIT_NAMESPACE`* controls access to namespaced refs, and is equivalent to the `--namespace` flag. +This is mostly useful on the server side, where you may want to store multiple forks of a single repository in one repository, only keeping the refs separate. -*`GIT_FLUSH`* +*`GIT_FLUSH`* can be used to force Git to use non-buffered I/O when writing incrementally to stdout. +A value of 1 causes Git to flush more often, a value of 0 causes all output to be buffered. +The default value (if this variable is not set) is to choose an appropriate buffering scheme depending on the activity and the output mode. -*`GIT_REFLOG_ACTION`* +*`GIT_REFLOG_ACTION`* lets you specify the descriptive text written to the reflog.
+Here's an example: + +[source,shell] +---- +$ GIT_REFLOG_ACTION="my action" git commit --allow-empty -m 'my message' +[master 9e3d55a] my message +$ git reflog -1 +9e3d55a HEAD@{0}: my action: my message +---- From 936c9c524407c389100f7ac0ade5587b039620c0 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Thu, 25 Sep 2014 19:10:19 -0700 Subject: [PATCH 14/32] Cleanups and corrections --- .../11-git-internals/sections/environment.asc | 24 +++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/book/11-git-internals/sections/environment.asc b/book/11-git-internals/sections/environment.asc index d5918720d..cba74c785 100644 --- a/book/11-git-internals/sections/environment.asc +++ b/book/11-git-internals/sections/environment.asc @@ -2,7 +2,7 @@ Git always runs inside a `bash` shell, and uses a number of shell environment variables to determine how it behaves. Occasionally, it comes in handy to know what these are, and how they can be used to make Git behave the way you want it to. -This isn't an exhaustive list of all the environment variables Git pays attention to, but we gathered the most useful. +This isn't an exhaustive list of all the environment variables Git pays attention to, but we'll cover the most useful. ==== Global Behavior @@ -30,13 +30,13 @@ If unset, `EDITOR` will be used. ==== Repository Locations -Git uses several environment variables to find the paths to files related to the current repository. +Git uses several environment variables to determine how it interfaces with the current repository. *`GIT_DIR`* is the location of the `.git` folder. -If this isn't specified, Git walks up the directory tree until it gets to `~`, looking for a `.git` directory at every step. +If this isn't specified, Git walks up the directory tree until it gets to `~` or `/`, looking for a `.git` directory at every step. *`GIT_CEILING_DIRECTORIES`* controls the behavior of searching for a `.git` directory. 
-If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise. +If you access directories that are slow to load (such as those on a tape drive, or across a slow network connection), you may want to have Git stop trying earlier than it might otherwise, especially if Git is invoked when building your shell prompt. *`GIT_DISCOVERY_ACROSS_FILESYSTEM`* can be used to allow Git to cross filesystem boundaries when searching for a `.git` directory (the default behavior is not to cross the boundary). @@ -55,7 +55,7 @@ If you happen to have a lot of projects with large files that have the exact sam ==== Pathspecs -A ``pathspec'' refers to how you specify paths to things in Git. +A ``pathspec'' refers to how you specify paths to things in Git, including the use of wildcards. These are used in the `.gitignore` file, but also on the command-line (`git add *.c`). *`GIT_GLOB_PATHSPECS` and `GIT_NOGLOB_PATHSPECS`* control the default behavior of wildcards in pathspecs. @@ -92,29 +92,29 @@ If _this_ isn't set, Git falls back to the system user and host names. Git uses the `curl` library to do network operations over HTTP, so *`GIT_CURL_VERBOSE`* tells Git to emit all the messages generated by that library. This is similar to doing `curl -v` on the command line. -*GIT_SSL_NO_VERIFY* tells Git not to verify SSL certificates. +*`GIT_SSL_NO_VERIFY`* tells Git not to verify SSL certificates. This can sometimes be necessary if you're using a self-signed certificate to serve Git repositories over HTTPS, or you're in the middle of setting up a Git server but haven't installed a full certificate yet. -If an HTTP operation is lower than *`GIT_HTTP_LOW_SPEED_LIMIT`* bytes per second for longer than *GIT_HTTP_LOW_SPEED_TIME* seconds, Git will abort that operation. 
+If the data rate of an HTTP operation is lower than *`GIT_HTTP_LOW_SPEED_LIMIT`* bytes per second for longer than *`GIT_HTTP_LOW_SPEED_TIME`* seconds, Git will abort that operation. These values override the `http.lowSpeedLimit` and `http.lowSpeedTime` configuration values. -*GIT_HTTP_USER_AGENT* sets the user-agent string used by Git when communicating over HTTP. +*`GIT_HTTP_USER_AGENT`* sets the user-agent string used by Git when communicating over HTTP. The default is a value like `git/2.0.0`. ==== Diffing and Merging -*GIT_DIFF_OPTS* is a bit of a misnomer. +*`GIT_DIFF_OPTS`* is a bit of a misnomer. The only valid values are `-u<n>` or `--unified=<n>`, which controls the number of context lines shown in a `git diff` command. -*GIT_EXTERNAL_DIFF* is used as an override for the `diff.external` configuration value. +*`GIT_EXTERNAL_DIFF`* is used as an override for the `diff.external` configuration value. If it's set, Git will invoke this program when `git diff` is invoked. -*GIT_DIFF_PATH_COUNTER* and *GIT_DIFF_PATH_TOTAL* are useful when you're using `GIT_EXTERNAL_DIFF` or `diff.external`. +*`GIT_DIFF_PATH_COUNTER`* and *`GIT_DIFF_PATH_TOTAL`* are useful from inside the program specified by `GIT_EXTERNAL_DIFF` or `diff.external`. The former represents which file in a series is being diffed (starting with 1), and the latter is the total number of files in the batch. -*GIT_MERGE_VERBOSITY* controls the output for the recursive merge strategy. +*`GIT_MERGE_VERBOSITY`* controls the output for the recursive merge strategy. The allowed values are as follows: * 0 outputs nothing, except possibly a single error message.
From 5c1163adbf9886cc72f2282488e59fb29514b5c1 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Fri, 26 Sep 2014 09:35:44 -0700 Subject: [PATCH 15/32] Index entry for excludes --- book/11-git-internals/sections/plumbing-porcelain.asc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/book/11-git-internals/sections/plumbing-porcelain.asc b/book/11-git-internals/sections/plumbing-porcelain.asc index 040c853b0..f46ba3edf 100644 --- a/book/11-git-internals/sections/plumbing-porcelain.asc +++ b/book/11-git-internals/sections/plumbing-porcelain.asc @@ -27,7 +27,7 @@ refs/ You may see some other files in there, but this is a fresh `git init` repository – it's what you see by default. The `description` file is only used by the GitWeb program, so don't worry about it. -The `config` file contains your project-specific configuration options, and the `info` directory keeps a global exclude file for ignored patterns that you don't want to track in a .gitignore file. +The `config` file contains your project-specific configuration options, and the `info` directory keeps a global exclude file (((excludes))) for ignored patterns that you don't want to track in a .gitignore file. The `hooks` directory contains your client- or server-side hook scripts, which are discussed in detail in <<_hooks>>. This leaves four important entries: the `HEAD` and (yet to be created) `index` files, and the `objects` and `refs` directories. 
From ebb56f2c1f8627e6939035e7d4c671d2c80e2c59 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Tue, 30 Sep 2014 20:34:35 -0700 Subject: [PATCH 16/32] HTTPSmart (part I) --- .../sections/transfer-protocols.asc | 80 +++++++++++-------- 1 file changed, 48 insertions(+), 32 deletions(-) diff --git a/book/11-git-internals/sections/transfer-protocols.asc b/book/11-git-internals/sections/transfer-protocols.asc index 8f0049848..71986a7a8 100644 --- a/book/11-git-internals/sections/transfer-protocols.asc +++ b/book/11-git-internals/sections/transfer-protocols.asc @@ -1,17 +1,25 @@ === Transfer Protocols -Git can transfer data between two repositories in two major ways: over HTTP and via the so-called smart protocols used in the `file://`, `ssh://`, and `git://` transports. +Git can transfer data between two repositories in two major ways: the ``dumb'' protocol and the ``smart'' protocol. This section will quickly cover how these two main protocols operate. ==== The Dumb Protocol -Git transport over HTTP is often referred to as the dumb protocol because it requires no Git-specific code on the server side during the transport process. -The fetch process is a series of GET requests, where the client can assume the layout of the Git repository on the server. +If you're setting up a repository to be served read-only over HTTP, the dumb protocol is likely what will be used. +This protocol is called ``dumb'' because it requires no Git-specific code on the server side during the transport process; the fetch process is a series of HTTP `GET` requests, where the client can assume the layout of the Git repository on the server. + +[NOTE] +==== +The dumb protocol is all but deprecated at this point. +It's read-only to begin with, and not secure or private, so most Git hosts (both cloud-based and on-premises) will refuse to use it. +It's advised to use the smart protocol, which we describe a bit further on. 
+==== + Let's follow the `http-fetch` process for the simplegit library: [source,shell] ---- -$ git clone http://github.com/schacon/simplegit-progit.git +$ git clone http://server/simplegit-progit.git ---- The first thing this command does is pull down the `info/refs` file. @@ -115,29 +123,11 @@ You have your tree object, so you continue walking your commits. They're all also within the packfile you just downloaded, so you don't have to do any more requests to your server. Git checks out a working copy of the `master` branch that was pointed to by the HEAD reference you downloaded at the beginning. -The entire output of this process looks like this: - -[source,shell] ----- -$ git clone http://github.com/schacon/simplegit-progit.git -Initialized empty Git repository in /private/tmp/simplegit-progit/.git/ -got ca82a6dff817ec66f44342007202690a93763949 -walk ca82a6dff817ec66f44342007202690a93763949 -got 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 -Getting alternates list for http://github.com/schacon/simplegit-progit.git -Getting pack list for http://github.com/schacon/simplegit-progit.git -Getting index for pack 816a9b2334da9953e530f27bcac22082a9f5b835 -Getting pack 816a9b2334da9953e530f27bcac22082a9f5b835 - which contains cfda3bf379e4f8dba8717dee55aab78aef7f4daf -walk 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 -walk a11bef06a3f659402fe7563abf99ad00de2209e6 ----- - ==== The Smart Protocol -The HTTP method is simple but a bit inefficient. -Using smart protocols is a more common method of transferring data. -These protocols have a process on the remote end that is intelligent about Git – it can read local data and figure out what the client has or needs and generate custom data for it. +The dumb protocol is simple but a bit inefficient, and it can't handle writing of data from the client to the server. +The smart protocol is a more common method of transferring data. 
+This method requires a process on the remote end that is intelligent about Git – it can read local data, figure out what the client has and needs, and generate custom data for it. There are two sets of processes for transferring data: a pair for uploading data and a pair for downloading data. ===== Uploading Data @@ -145,22 +135,24 @@ There are two sets of processes for transferring data: a pair for uploading data To upload data to a remote process, Git uses the `send-pack` and `receive-pack` processes. The `send-pack` process runs on the client and connects to a `receive-pack` process on the remote side. +====== SSH + For example, say you run `git push origin master` in your project, and `origin` is defined as a URL that uses the SSH protocol. Git fires up the `send-pack` process, which initiates a connection over SSH to your server. It tries to run a command on the remote server via an SSH call that looks something like this: [source,shell] ---- -$ ssh -x git@github.com "git-receive-pack 'schacon/simplegit-progit.git'" -005bca82a6dff817ec66f4437202690a93763949 refs/heads/master report-status delete-refs +$ ssh -x git@server "git-receive-pack 'simplegit-progit.git'" +005bca82a6dff817ec66f4437202690a93763949 refs/heads/master report-status delete-refs side-band-64k quiet ofs-delta agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e delete-refs 003e085bb3bcb608e1e84b2432f8ecbe6306e7e7 refs/heads/topic 0000 ---- The `git-receive-pack` command immediately responds with one line for each reference it currently has – in this case, just the `master` branch and its SHA. -The first line also has a list of the server's capabilities (here, `report-status` and `delete-refs`). +The first line also has a list of the server's capabilities (here, `report-status`, `delete-refs`, and some others, including the client identifier). -Each line starts with a 4-byte hex value specifying how long the rest of the line is. 
+Each line starts with a 4-character hex value specifying how long the rest of the line is. Your first line starts with 005b, which is 91 in hex, meaning that 91 bytes remain on that line. The next line starts with 003e, which is 62, so you read the remaining 62 bytes. The next line is 0000, meaning the server is done with its references listing. @@ -176,12 +168,12 @@ For instance, if you're updating the `master` branch and adding an `experiment` 0000 ---- +Git sends a line for each reference you're updating with the line's length, the old SHA, the new SHA, and the reference that is being updated. +The first line also has the client's capabilities. The SHA-1 value of all '0's means that nothing was there before – because you're adding the experiment reference. If you were deleting a reference, you would see the opposite: all '0's on the right side. -Git sends a line for each reference you're updating with the old SHA, the new SHA, and the reference that is being updated. -The first line also has the client's capabilities. -Next, the client uploads a packfile of all the objects the server doesn't have yet. +Next, the client sends a packfile of all the objects the server doesn't have yet. Finally, the server responds with a success (or failure) indication: [source] @@ -189,6 +181,30 @@ Finally, the server responds with a success (or failure) indication: 000Aunpack ok ---- +====== HTTP(S) + +This process is mostly the same over HTTP, though the handshaking is a bit different. +The connection is initiated with this request: + +[source] +---- +=> GET http://server/simplegit-progit.git/info/refs?service=git-receive-pack +001f# service=git-receive-pack +000000ab6c5f0e45abd7832bf23074a333f739977c9e8188 refs/heads/master report-status delete-refs side-band-64k quiet ofs-delta agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e +0000 +---- + +That's the end of the first client-server exchange. 
+The client then makes another request, this time a `POST`, with the data that `send-pack` provides. + +[source] +---- +=> POST http://server/simplegit-progit.git/git-receive-pack +---- + +The `POST` request includes the `send-pack` output and the packfile as its payload. +The server then indicates success or failure with its HTTP response. + ===== Downloading Data When you download data, the `fetch-pack` and `upload-pack` processes are involved. From 629f7ef45efddcdcc95103bdda9c07cac89c8931 Mon Sep 17 00:00:00 2001 From: Ben Straub Date: Thu, 2 Oct 2014 21:11:42 -0700 Subject: [PATCH 17/32] HTTPSmartest --- .../sections/transfer-protocols.asc | 83 +++++++++++++------ 1 file changed, 56 insertions(+), 27 deletions(-) diff --git a/book/11-git-internals/sections/transfer-protocols.asc b/book/11-git-internals/sections/transfer-protocols.asc index 71986a7a8..d22c52e06 100644 --- a/book/11-git-internals/sections/transfer-protocols.asc +++ b/book/11-git-internals/sections/transfer-protocols.asc @@ -126,12 +126,12 @@ Git checks out a working copy of the `master` branch that was pointed to by the ==== The Smart Protocol The dumb protocol is simple but a bit inefficient, and it can't handle writing of data from the client to the server. -The smart protocol is a more common method of transferring data. -This method requires a process on the remote end that is intelligent about Git – it can read local data, figure out what the client has and needs, and generate custom data for it. +The smart protocol is a more common method of transferring data, but it requires a process on the remote end that is intelligent about Git – it can read local data, figure out what the client has and needs, and generate a custom packfile for it. There are two sets of processes for transferring data: a pair for uploading data and a pair for downloading data.
===== Uploading Data +(((git commands, send-pack)))(((git commands, receive-pack))) To upload data to a remote process, Git uses the `send-pack` and `receive-pack` processes. The `send-pack` process runs on the client and connects to a `receive-pack` process on the remote side. @@ -144,7 +144,9 @@ It tries to run a command on the remote server via an SSH call that looks someth [source,shell] ---- $ ssh -x git@server "git-receive-pack 'simplegit-progit.git'" -005bca82a6dff817ec66f4437202690a93763949 refs/heads/master report-status delete-refs side-band-64k quiet ofs-delta agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e delete-refs +005bca82a6dff817ec66f4437202690a93763949 refs/heads/master report-status \ + delete-refs side-band-64k quiet ofs-delta \ + agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e delete-refs 003e085bb3bcb608e1e84b2432f8ecbe6306e7e7 refs/heads/topic 0000 ---- @@ -163,8 +165,10 @@ For instance, if you're updating the `master` branch and adding an `experiment` [source] ---- -0085ca82a6dff817ec66f44342007202690a93763949 15027957951b64cf874c3557a0f3547bd83b3ff6 refs/heads/master report-status -00670000000000000000000000000000000000000000 cdfdb42577e2506715f8cfeacdbabc092bf63e8d refs/heads/experiment +0085ca82a6dff817ec66f44342007202690a93763949 15027957951b64cf874c3557a0f3547bd83b3ff6 \ + refs/heads/master report-status +00670000000000000000000000000000000000000000 cdfdb42577e2506715f8cfeacdbabc092bf63e8d \ + refs/heads/experiment 0000 ---- @@ -190,7 +194,9 @@ The connection is initiated with this request: ---- => GET http://server/simplegit-progit.git/info/refs?service=git-receive-pack 001f# service=git-receive-pack -000000ab6c5f0e45abd7832bf23074a333f739977c9e8188 refs/heads/master report-status delete-refs side-band-64k quiet ofs-delta agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e +000000ab6c5f0e45abd7832bf23074a333f739977c9e8188 refs/heads/master \ + report-status delete-refs side-band-64k quiet ofs-delta \ + 
agent=git/2:2.1.1~vmg-bitmaps-bugaloo-608-g116744e 0000 ---- @@ -207,43 +213,33 @@ The server then indicates success or failure with its HTTP response. ===== Downloading Data +(((git commands, fetch-pack)))(((git commands, upload-pack))) When you download data, the `fetch-pack` and `upload-pack` processes are involved. The client initiates a `fetch-pack` process that connects to an `upload-pack` process on the remote side to negotiate what data will be transferred down. -There are different ways to initiate the `upload-pack` process on the remote repository. -You can run via SSH in the same manner as the `receive-pack` process. -You can also initiate the process via the Git daemon, which listens on a server on port 9418 by default. -The `fetch-pack` process sends data that looks like this to the daemon after connecting: - -[source] ----- -003fgit-upload-pack schacon/simplegit-progit.git\0host=myserver.com\0 ----- - -It starts with the 4 bytes specifying how much data is following, then the command to run followed by a null byte, and then the server's hostname followed by a final null byte. -The Git daemon checks that the command can be run and that the repository exists and has public permissions. -If everything is cool, it fires up the `upload-pack` process and hands off the request to it. 
+====== SSH If you're doing the fetch over SSH, `fetch-pack` instead runs something like this: [source,shell] ---- -$ ssh -x git@github.com "git-upload-pack 'schacon/simplegit-progit.git'" +$ ssh -x git@server "git-upload-pack 'simplegit-progit.git'" ---- -In either case, after `fetch-pack` connects, `upload-pack` sends back something like this: +After `fetch-pack` connects, `upload-pack` sends back something like this: [source] ---- -0088ca82a6dff817ec66f44342007202690a93763949 HEAD\0multi_ack thin-pack \ - side-band side-band-64k ofs-delta shallow no-progress include-tag +00dfca82a6dff817ec66f44342007202690a93763949 HEADmulti_ack thin-pack \ + side-band side-band-64k ofs-delta shallow no-progress include-tag \ + multi_ack_detailed symref=HEAD:refs/heads/master \ + agent=git/2:2.1.1+github-607-gfba4028 003fca82a6dff817ec66f44342007202690a93763949 refs/heads/master -003e085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 refs/heads/topic 0000 ---- This is very similar to what `receive-pack` responds with, but the capabilities are different. -In addition, it sends back the HEAD reference so the client knows what to check out if this is a clone. +In addition, it sends back what HEAD points to (`symref=HEAD:refs/heads/master`) so the client knows what to check out if this is a clone. At this point, the `fetch-pack` process looks at what objects it has and responds with the objects that it needs by sending ``want'' and then the SHA it wants. It sends all the objects it already has with ``have'' and then the SHA. @@ -257,5 +253,38 @@ At the end of this list, it writes ``done'' to initiate the `upload-pack` proces 0009done ---- -That is a very basic case of the transfer protocols. -In more complex cases, the client supports `multi_ack` or `side-band` capabilities; but this example shows you the basic back and forth used by the smart protocol processes. +====== HTTP(S) + +The handshake for a fetch operation takes two HTTP requests. 
+The first is a `GET` to the same endpoint used in the dumb protocol: + +[source] +---- +=> GET $GIT_URL/info/refs?service=git-upload-pack +001e# service=git-upload-pack +000000e7ca82a6dff817ec66f44342007202690a93763949 HEADmulti_ack thin-pack \ + side-band side-band-64k ofs-delta shallow no-progress include-tag \ + multi_ack_detailed no-done symref=HEAD:refs/heads/master \ + agent=git/2:2.1.1+github-607-gfba4028 +003fca82a6dff817ec66f44342007202690a93763949 refs/heads/master +0000 +---- + +This is very similar to invoking `git-upload-pack` over an SSH connection, but the second exchange is performed as a separate request: + +[source] +---- +=> POST $GIT_URL/git-upload-pack HTTP/1.0 +0032want 0a53e9ddeaddad63ad106860237bbf53411d11a7 +0032have 441b40d833fdfa93eb2908e52742248faf0ee993 +0000 +---- + +Again, this is the same format as above. +The response to this request indicates success or failure, and includes the packfile. + +==== Protocols Summary + +This section contains a very basic overview of the transfer protocols. +The protocol includes many other features, such as `multi_ack` or `side-band` capabilities, but covering them is outside the scope of this book. +We've tried to give you a sense of the general back-and-forth between client and server; if you need more knowledge than this, you'll probably want to take a look at the Git source code. From 1d1283632948d2caf9c9cf9e85c8a8db86ce4398 Mon Sep 17 00:00:00 2001 From: Scott Chacon Date: Mon, 6 Oct 2014 13:02:53 +0200 Subject: [PATCH 18/32] remove leftover text in ch7 index result of a poor merge possibly? 
--- book/07-git-tools/1-git-tools.asc | 877 +----------------------------- 1 file changed, 1 insertion(+), 876 deletions(-) diff --git a/book/07-git-tools/1-git-tools.asc b/book/07-git-tools/1-git-tools.asc index 706e62fc2..073eae5fd 100644 --- a/book/07-git-tools/1-git-tools.asc +++ b/book/07-git-tools/1-git-tools.asc @@ -32,882 +32,7 @@ include::sections/bundling.asc[] include::sections/replace.asc[] -In this case, choose `1c002dd....` If you `git show` that commit, the following commands are equivalent (assuming the shorter versions are unambiguous): - -[source,shell] ----- -$ git show 1c002dd4b536e7479fe34593e72e6c6c1819e53b -$ git show 1c002dd4b536e7479f -$ git show 1c002d ----- - -Git can figure out a short, unique abbreviation for your SHA-1 values. -If you pass `--abbrev-commit` to the `git log` command, the output will use shorter values but keep them unique; it defaults to using seven characters but makes them longer if necessary to keep the SHA-1 unambiguous: - -[source,shell] ----- -$ git log --abbrev-commit --pretty=oneline -ca82a6d changed the version number -085bb3b removed unnecessary test code -a11bef0 first commit ----- - -Generally, eight to ten characters are more than enough to be unique within a project. -One of the largest Git projects, the Linux kernel, is beginning to need 12 characters out of the possible 40 to stay unique. - -==== A SHORT NOTE ABOUT SHA-1 - -A lot of people become concerned at some point that they will, by random happenstance, have two objects in their repository that hash to the same SHA-1 value. -What then? - -If you do happen to commit an object that hashes to the same SHA-1 value as a previous object in your repository, Git will see the previous object already in your Git database and assume it was already written. -If you try to check out that object again at some point, you’ll always get the data of the first object. - -However, you should be aware of how ridiculously unlikely this scenario is. 
The SHA-1 digest is 20 bytes or 160 bits. The number of randomly hashed objects needed to ensure a 50% probability of a single collision is about 2^80 (the formula for determining collision probability is `p = (n(n-1)/2) * (1/2^160))`. 2^80 is 1.2 x 10^24 or 1 million billion billion. That’s 1,200 times the number of grains of sand on the earth. - -Here’s an example to give you an idea of what it would take to get a SHA-1 collision. -If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (1 million Git objects) and pushing it into one enormous Git repository, it would take 5 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision. -A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night. - -==== Branch References - -The most straightforward way to specify a commit requires that it have a branch reference pointed at it. -Then, you can use a branch name in any Git command that expects a commit object or SHA-1 value. -For instance, if you want to show the last commit object on a branch, the following commands are equivalent, assuming that the `topic1` branch points to `ca82a6d`: - -[source,shell] ----- -$ git show ca82a6dff817ec66f44342007202690a93763949 -$ git show topic1 ----- - -If you want to see which specific SHA a branch points to, or if you want to see what any of these examples boils down to in terms of SHAs, you can use a Git plumbing tool called `rev-parse`. -You can see <<_git_internals>> for more information about plumbing tools; basically, `rev-parse` exists for lower-level operations and isn’t designed to be used in day-to-day operations. -However, it can be helpful sometimes when you need to see what’s really going on. -Here you can run `rev-parse` on your branch. 
- -[source,shell] ----- -$ git rev-parse topic1 -ca82a6dff817ec66f44342007202690a93763949 ----- - -==== RefLog Shortnames - -One of the things Git does in the background while you’re working away is keep a reflog – a log of where your HEAD and branch references have been for the last few months. - -You can see your reflog by using `git reflog`: - -[source,shell] ----- -$ git reflog -734713b... HEAD@{0}: commit: fixed refs handling, added gc auto, updated -d921970... HEAD@{1}: merge phedders/rdocs: Merge made by recursive. -1c002dd... HEAD@{2}: commit: added some blame and merge stuff -1c36188... HEAD@{3}: rebase -i (squash): updating HEAD -95df984... HEAD@{4}: commit: # This is a combination of two commits. -1c36188... HEAD@{5}: rebase -i (squash): updating HEAD -7e05da5... HEAD@{6}: rebase -i (pick): updating HEAD ----- - -Every time your branch tip is updated for any reason, Git stores that information for you in this temporary history. -And you can specify older commits with this data, as well. -If you want to see the fifth prior value of the HEAD of your repository, you can use the `@{n}` reference that you see in the reflog output: - -[source,shell] ----- -$ git show HEAD@{5} ----- - -You can also use this syntax to see where a branch was some specific amount of time ago. -For instance, to see where your `master` branch was yesterday, you can type - -[source,shell] ----- -$ git show master@{yesterday} ----- - -That shows you where the branch tip was yesterday. -This technique only works for data that’s still in your reflog, so you can’t use it to look for commits older than a few months. 
- -To see reflog information formatted like the `git log` output, you can run `git log -g`: - -[source,shell] ----- -$ git log -g master -commit 734713bc047d87bf7eac9674765ae793478c50d3 -Reflog: master@{0} (Scott Chacon ) -Reflog message: commit: fixed refs handling, added gc auto, updated -Author: Scott Chacon -Date: Fri Jan 2 18:32:33 2009 -0800 - - fixed refs handling, added gc auto, updated tests - -commit d921970aadf03b3cf0e71becdaab3147ba71cdef -Reflog: master@{1} (Scott Chacon ) -Reflog message: merge phedders/rdocs: Merge made by recursive. -Author: Scott Chacon -Date: Thu Dec 11 15:08:43 2008 -0800 - - Merge commit 'phedders/rdocs' ----- - -It’s important to note that the reflog information is strictly local – it’s a log of what you’ve done in your repository. -The references won’t be the same on someone else’s copy of the repository; and right after you initially clone a repository, you'll have an empty reflog, as no activity has occurred yet in your repository. -Running `git show HEAD@{2.months.ago}` will work only if you cloned the project at least two months ago – if you cloned it five minutes ago, you’ll get no results. - -==== Ancestry References - -The other main way to specify a commit is via its ancestry. -If you place a `^` at the end of a reference, Git resolves it to mean the parent of that commit. -Suppose you look at the history of your project: - -[source,shell] ----- -$ git log --pretty=format:'%h %s' --graph -* 734713b fixed refs handling, added gc auto, updated tests -* d921970 Merge commit 'phedders/rdocs' -|\ -| * 35cfb2b Some rdoc changes -* | 1c002dd added some blame and merge stuff -|/ -* 1c36188 ignore *.gem -* 9b29157 add open3_detach to gemspec file list ----- - -Then, you can see the previous commit by specifying `HEAD^`, which means ``the parent of HEAD'': - -[source,shell] ----- -$ git show HEAD^ -commit d921970aadf03b3cf0e71becdaab3147ba71cdef -Merge: 1c002dd... 35cfb2b... 
-Author: Scott Chacon -Date: Thu Dec 11 15:08:43 2008 -0800 - - Merge commit 'phedders/rdocs' ----- - -You can also specify a number after the `^` – for example, `d921970^2` means ``the second parent of d921970.'' -This syntax is only useful for merge commits, which have more than one parent. -The first parent is the branch you were on when you merged, and the second is the commit on the branch that you merged in: - -[source,shell] ----- -$ git show d921970^ -commit 1c002dd4b536e7479fe34593e72e6c6c1819e53b -Author: Scott Chacon -Date: Thu Dec 11 14:58:32 2008 -0800 - - added some blame and merge stuff - -$ git show d921970^2 -commit 35cfb2b795a55793d7cc56a6cc2060b4bb732548 -Author: Paul Hedderly -Date: Wed Dec 10 22:22:03 2008 +0000 - - Some rdoc changes ----- - -The other main ancestry specification is the `~`. -This also refers to the first parent, so `HEAD~` and `HEAD^` are equivalent. -The difference becomes apparent when you specify a number. -`HEAD~2` means ``the first parent of the first parent,'' or ``the grandparent'' – it traverses the first parents the number of times you specify. -For example, in the history listed earlier, `HEAD~3` would be - -[source,shell] ----- -$ git show HEAD~3 -commit 1c3618887afb5fbcbea25b7c013f4e2114448b8d -Author: Tom Preston-Werner -Date: Fri Nov 7 13:47:59 2008 -0500 - - ignore *.gem ----- - -This can also be written `HEAD^^^`, which again is the first parent of the first parent of the first parent: - -[source,shell] ----- -$ git show HEAD^^^ -commit 1c3618887afb5fbcbea25b7c013f4e2114448b8d -Author: Tom Preston-Werner -Date: Fri Nov 7 13:47:59 2008 -0500 - - ignore *.gem ----- - -You can also combine these syntaxes – you can get the second parent of the previous reference (assuming it was a merge commit) by using `HEAD~3^2`, and so on. - -==== Commit Ranges - -Now that you can specify individual commits, let’s see how to specify ranges of commits. 
-This is particularly useful for managing your branches – if you have a lot of branches, you can use range specifications to answer questions such as, ``What work is on this branch that I haven’t yet merged into my main branch?'' - -===== Double Dot - -The most common range specification is the double-dot syntax. -This basically asks Git to resolve a range of commits that are reachable from one commit but aren’t reachable from another. -For example, say you have a commit history that looks like <<double_dot>>. - -[[double_dot]] -.Example history for range selection. -image::images/double-dot.png[Example history for range selection.] - -You want to see what is in your experiment branch that hasn’t yet been merged into your master branch. -You can ask Git to show you a log of just those commits with `master..experiment` – that means ``all commits reachable by experiment that aren’t reachable by master.'' -For the sake of brevity and clarity in these examples, I’ll use the letters of the commit objects from the diagram in place of the actual log output in the order that they would display: - -[source,shell] ----- -$ git log master..experiment -D -C ----- - -If, on the other hand, you want to see the opposite – all commits in `master` that aren’t in `experiment` – you can reverse the branch names. -`experiment..master` shows you everything in `master` not reachable from `experiment`: - -[source,shell] ----- -$ git log experiment..master -F -E ----- - -This is useful if you want to keep the `experiment` branch up to date and preview what you’re about to merge in. -Another very frequent use of this syntax is to see what you’re about to push to a remote: - -[source,shell] ----- -$ git log origin/master..HEAD ----- - -This command shows you any commits in your current branch that aren’t in the `master` branch on your `origin` remote.
-If you run a `git push` and your current branch is tracking `origin/master`, the commits listed by `git log origin/master..HEAD` are the commits that will be transferred to the server. -You can also leave off one side of the syntax to have Git assume HEAD. -For example, you can get the same results as in the previous example by typing `git log origin/master..` – Git substitutes HEAD if one side is missing. - -===== Multiple Points - -The double-dot syntax is useful as a shorthand; but perhaps you want to specify more than two branches to indicate your revision, such as seeing what commits are in any of several branches that aren’t in the branch you’re currently on. -Git allows you to do this by using either the `^` character or `--not` before any reference from which you don’t want to see reachable commits. -Thus these three commands are equivalent: - -[source,shell] ----- -$ git log refA..refB -$ git log ^refA refB -$ git log refB --not refA ----- - -This is nice because with this syntax you can specify more than two references in your query, which you cannot do with the double-dot syntax. -For instance, if you want to see all commits that are reachable from `refA` or `refB` but not from `refC`, you can type one of these: - -[source,shell] ----- -$ git log refA refB ^refC -$ git log refA refB --not refC ----- - -This makes for a very powerful revision query system that should help you figure out what is in your branches. - -===== Triple Dot - -The last major range-selection syntax is the triple-dot syntax, which specifies all the commits that are reachable by either of two references but not by both of them. -Look back at the example commit history in Figure 6-1. 
-If you want to see what is in `master` or `experiment` but not any common references, you can run - -[source,shell] ----- -$ git log master...experiment -F -E -D -C ----- - -Again, this gives you normal `log` output but shows you only the commit information for those four commits, appearing in the traditional commit date ordering. - -A common switch to use with the `log` command in this case is `--left-right`, which shows you which side of the range each commit is in. -This helps make the data more useful: - -[source,shell] ----- -$ git log --left-right master...experiment -< F -< E -> D -> C ----- - -With these tools, you can much more easily let Git know what commit or commits you want to inspect. - -=== Interactive Staging - -Git comes with a couple of scripts that make some command-line tasks easier. -Here, you’ll look at a few interactive commands that can help you easily craft your commits to include only certain combinations and parts of files. -These tools are very helpful if you modify a bunch of files and then decide that you want those changes to be in several focused commits rather than one big messy commit. -This way, you can make sure your commits are logically separate changesets and can be easily reviewed by the developers working with you. -If you run `git add` with the `-i` or `--interactive` option, Git goes into an interactive shell mode, displaying something like this: - -[source,shell] ----- -$ git add -i - staged unstaged path - 1: unchanged +0/-1 TODO - 2: unchanged +1/-1 index.html - 3: unchanged +5/-1 lib/simplegit.rb - -*** Commands *** - 1: status 2: update 3: revert 4: add untracked - 5: patch 6: diff 7: quit 8: help -What now> ----- - -You can see that this command shows you a much different view of your staging area – basically the same information you get with `git status` but a bit more succinct and informative. -It lists the changes you’ve staged on the left and unstaged changes on the right. 
- -After this comes a Commands section. -Here you can do a number of things, including staging files, unstaging files, staging parts of files, adding untracked files, and seeing diffs of what has been staged. - -==== Staging and Unstaging Files - -If you type `2` or `u` at the `What now>` prompt, the script prompts you for which files you want to stage: - -[source,shell] ----- -What now> 2 - staged unstaged path - 1: unchanged +0/-1 TODO - 2: unchanged +1/-1 index.html - 3: unchanged +5/-1 lib/simplegit.rb -Update>> ----- - -To stage the TODO and index.html files, you can type the numbers: - -[source,shell] ----- -Update>> 1,2 - staged unstaged path -* 1: unchanged +0/-1 TODO -* 2: unchanged +1/-1 index.html - 3: unchanged +5/-1 lib/simplegit.rb -Update>> ----- - -The `*` next to each file means the file is selected to be staged. -If you press Enter after typing nothing at the `Update>>` prompt, Git takes anything selected and stages it for you: - -[source,shell] ----- -Update>> -updated 2 paths - -*** Commands *** - 1: status 2: update 3: revert 4: add untracked - 5: patch 6: diff 7: quit 8: help -What now> 1 - staged unstaged path - 1: +0/-1 nothing TODO - 2: +1/-1 nothing index.html - 3: unchanged +5/-1 lib/simplegit.rb ----- - -Now you can see that the TODO and index.html files are staged and the simplegit.rb file is still unstaged. 
-If you want to unstage the TODO file at this point, you use the `3` or `r` (for revert) option: - -[source,shell] ----- -*** Commands *** - 1: status 2: update 3: revert 4: add untracked - 5: patch 6: diff 7: quit 8: help -What now> 3 - staged unstaged path - 1: +0/-1 nothing TODO - 2: +1/-1 nothing index.html - 3: unchanged +5/-1 lib/simplegit.rb -Revert>> 1 - staged unstaged path -* 1: +0/-1 nothing TODO - 2: +1/-1 nothing index.html - 3: unchanged +5/-1 lib/simplegit.rb -Revert>> [enter] -reverted one path ----- - -Looking at your Git status again, you can see that you’ve unstaged the TODO file: - -[source,shell] ----- -*** Commands *** - 1: status 2: update 3: revert 4: add untracked - 5: patch 6: diff 7: quit 8: help -What now> 1 - staged unstaged path - 1: unchanged +0/-1 TODO - 2: +1/-1 nothing index.html - 3: unchanged +5/-1 lib/simplegit.rb ----- - -To see the diff of what you’ve staged, you can use the `6` or `d` (for diff) command. -It shows you a list of your staged files, and you can select the ones for which you would like to see the staged diff. -This is much like specifying `git diff --cached` on the command line: - -[source,shell] ----- -*** Commands *** - 1: status 2: update 3: revert 4: add untracked - 5: patch 6: diff 7: quit 8: help -What now> 6 - staged unstaged path - 1: +1/-1 nothing index.html -Review diff>> 1 -diff --git a/index.html b/index.html -index 4d07108..4335f49 100644 ---- a/index.html -+++ b/index.html -@@ -16,7 +16,7 @@ Date Finder - -

...

- -- -+ - -