Friday, December 29, 2017

Resources for diving into Git internals

One of the things my boss asked me to do before I leave Farm Bureau for the bustling Bay Area was to put together some resources related to using Visual Studio Team Services.  One of my co-workers on the database team also asked me to put together some resources for learning git.  So, to that end, I pieced together a few things that would make my absence less disruptive. 

As part of the process, I ended up uncovering a few presentations that delved pretty deep under the covers.  Now, these are lessons that are best consumed by someone who is already pretty familiar with git basics, but I think the reward is a deeper, richer understanding of git behavior that can make us more effective with the tool.  I'm not going to rehash everything that is in the videos (just watch the videos, they're excellent).  I'll just provide a link and a brief synopsis for each.





Deep Dive into Git - Edward Thomson

A common theme that reappears in many of these talks is the "object" in git, and what is and isn't an object.  Edward demonstrates "git cat-file" and "git ls-tree", two commands I was previously unaware of that let us inspect these objects and their metadata.  He also talks quite a bit about how branching and merging work, demonstrates how to recover files that were added to the index (i.e. "git add <something>") but never committed, and looks briefly at stash.



GOTO 2015 - Understanding Git's Behaviour  - Steve Smith

Steve focuses a lot on the git data model, which consists of three main types of objects: blobs (file content), trees (folders), and commits.  He shows how changing a file actually results in the creation of a completely new blob representing the new file, and how this maps to the trees and commits.  He looks at "refs", which are really just pointers to an object (another talk compares these to "annotated tags", which are first class objects...).  I think the coolest thing was learning that the --exec option lets you run shell commands as part of a rebase, and that "git bisect run" does something similar for bisect.
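
To make the data model a little more concrete, here is a rough Python sketch (mine, not something from Steve's talk) of how blobs, trees, and commits get their ids.  It assumes a flat folder of regular files (mode 100644), and the author identity and file names are made up for illustration.  The function names are my own; git itself does a lot more than this.

```python
from __future__ import annotations

import hashlib


def object_id(kind: bytes, body: bytes) -> str:
    """Name an object the way git does: SHA-1 of '<kind> <size>\\0<body>'."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """A blob is just the file content; the file name is not part of it."""
    return object_id(b"blob", content)


def tree_id(files: dict[str, bytes]) -> str:
    """Flat tree only (assumption: no subfolders, every file is mode 100644)."""
    body = b""
    for name in sorted(files, key=lambda n: n.encode()):
        raw_blob_id = bytes.fromhex(blob_id(files[name]))  # 20 raw bytes
        body += b"100644 " + name.encode() + b"\0" + raw_blob_id
    return object_id(b"tree", body)


def commit_id(tree: str, parent: str | None, message: str) -> str:
    """A commit points at one tree and, usually, at a parent commit."""
    who = "Example Author <author@example.com> 0 +0000"  # hypothetical identity
    lines = ["tree " + tree]
    if parent:
        lines.append("parent " + parent)
    lines += ["author " + who, "committer " + who, "", message, ""]
    return object_id(b"commit", "\n".join(lines).encode())


# Two snapshots of a made-up project: only README changes between them.
v1 = {"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}
v2 = dict(v1, README=b"hello, world\n")

c1 = commit_id(tree_id(v1), None, "first commit")
c2 = commit_id(tree_id(v2), c1, "edit README")

# A ref is only a name that points at a commit id; committing moves it.
refs = {"refs/heads/master": c1}
refs["refs/heads/master"] = c2

print("README blob changed:", blob_id(v1["README"]) != blob_id(v2["README"]))
print("main.c blob reused: ", blob_id(v1["main.c"]) == blob_id(v2["main.c"]))
print("tree changed:       ", tree_id(v1) != tree_id(v2))
```

The edited README gets a brand new blob, the untouched main.c keeps its old one, and the tree and commit ids change because they (indirectly) contain the new blob id.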


Advanced GIT for Developers - Lorna Jane Mitchell - Laracon EU 2015

While Lorna's audience apparently didn't fully appreciate her humor, I thought she was pretty fun to watch.  The notion that “Git is how we communicate as a team of developers” hit home for me, and really drove home why effective commit messages and rational branching strategies can make or break how easy it is to work in a particular repo.  The biggest differentiator for this talk was the demo of the "rerere" functionality, which allows you to reuse a merge conflict resolution (like when you are doing a rebase or merging multiple feature branches).  She covers a lot of ground, and I enjoyed her takes on diffing, resolving merge conflicts, interactive rebase, bisect, and submodules.



Advanced Git: Graphs, Hashes, and Compression, Oh My!

This one dives deep into the plumbing, looking at how the content blobs are actually encoded.  The speaker shows how to use the "git hash-object" command to manually create a git "object" out of a file, and how to use "git update-index" to manually add this object to the index.  He uses "git cat-file" quite heavily to inspect the various objects.  I learned that "git rev-parse" is what actually converts shortened hashes and friendly names into the full 40-character hashes used everywhere by git.  The goal of this talk is to demystify what is happening inside the .git folder, and I think that goal is achieved.
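
For anyone who wants to poke at the encoding without a repo handy, here is a small Python sketch of roughly what "git hash-object -w" and "git cat-file" do for a loose blob.  This is my own approximation, not code from the talk: it writes into a temporary folder instead of a real .git/objects, it ignores packfiles entirely, and expand_prefix is a hypothetical helper that only mimics the "short hash to full hash" part of "git rev-parse".

```python
from __future__ import annotations

import hashlib
import tempfile
import zlib
from pathlib import Path


def hash_object(objects_dir: Path, content: bytes) -> str:
    """Store content as a loose blob and return its 40-character id."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38>.
    path = objects_dir / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


def cat_file(objects_dir: Path, oid: str) -> tuple[str, bytes]:
    """Read a loose object back and return (type, content)."""
    raw = zlib.decompress((objects_dir / oid[:2] / oid[2:]).read_bytes())
    header, _, body = raw.partition(b"\0")
    kind, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content")
    return kind.decode(), body


def expand_prefix(objects_dir: Path, prefix: str) -> str:
    """Hypothetical helper: expand a short hash by scanning loose objects."""
    matches = [
        d.name + f.name
        for d in objects_dir.iterdir() if d.is_dir()
        for f in d.iterdir()
        if (d.name + f.name).startswith(prefix)
    ]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} matches {len(matches)} objects")
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"  # stand-in for .git/objects
    objects.mkdir()
    oid = hash_object(objects, b"hello world\n")
    kind, body = cat_file(objects, oid)
    full = expand_prefix(objects, oid[:7])

print(oid)         # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(kind, body)  # blob b'hello world\n'
```

The id printed here should match what "git hash-object" reports for a file containing "hello world" followed by a newline, since the id depends only on the type, the size, and the bytes.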



Git From the Bits Up

This talk actually overlaps quite a bit with the "Graphs, Hashes, and Compression" talk, but I still found it a valuable and interesting talk.



Honorable Mention:  Tech Talk: Linus Torvalds on git.  Linus doesn't cover usage or internals, but does discuss the motivations for some of the design decisions and gives a historical perspective on git that is interesting, even if it isn't particularly useful.

