git

complex, but worth it

aaron stacy

frontend engineer

@ waterfall mobile

waterfall is hiring!

terminology

  • vcs == source control
  • git != github

motivation

we can't do our job without source control

the joel test: point #1

to understand why, understand how we got here

generation #1: tracking changes

  • focused on one file at a time
  • all local to your machine
  • rcs, sccs

generation #2: sharing

  • centralized "repositories"
  • worked with directory trees
  • networked
  • svn, cvs, sourcesafe, tfs

but concurrency is hard

  • hard for humans just like computers
    • multi-core processors
    • the mythical man-month
  • context switching
  • context merging

generation #3: distributed vcs

  • everyone has a copy of the entire history
  • change-set oriented
  • branching and merging are primary concerns
  • git, mercurial, bazaar, bitkeeper, darcs

git vs. <something else>

  • generally considered very fast
  • it doesn't try to be too intelligent
  • many find it less intuitive

the most important part: the social side

  • git has a huge community with tons of momentum
  • it's good to be involved in a vibrant developer community
  • post your projects online
    • don't wait until they're perfect!

alright let's draw

  • a git repo is a DAG
  • like a tree where branches grow back into each other
  • commits are the vertices
    • snapshots
    • diffs (incremental change sets)
    • tracked by sha1 hashes, not version numbers
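
a minimal sketch of the commit graph, assuming a throwaway repo in a temp directory. the file name notes.txt, the user identity, and the commit messages are made up for illustration. it makes two commits and shows that each is named by a hash and that the second points back at the first.

```shell
set -eu
# hypothetical identity so the commits work on a machine with no git config
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "first commit"

echo "hello again" >> notes.txt
git commit -q -am "second commit"

# every commit is named by a sha1 hash, not a version number
first=$(git rev-parse HEAD~1)
second=$(git rev-parse HEAD)
echo "first:  $first"
echo "second: $second"

# the second commit records the first as its parent: that's an edge in the DAG
git cat-file -p "$second"
```

the `parent` line in the last command's output is what links one vertex to the next.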

branches

  • branches are just pointers or references to commits
  • the main one is called master
  • but there's not actually anything special about it
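
a small sketch of "branches are just pointers", assuming a scratch repo. the branch name experiment and the file app.txt are hypothetical. a new branch starts out naming the same commit as master, and only the branch you commit on moves.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
# call the first branch master no matter what the local default is
git symbolic-ref HEAD refs/heads/master

echo "v1" > app.txt
git add app.txt
git commit -q -m "initial"

# a new branch is just another name for the same commit
git branch experiment
git rev-parse master experiment   # prints the same hash twice

# commit on master: master moves forward, experiment stays put
echo "v2" > app.txt
git commit -q -am "update"
git rev-parse master experiment   # now two different hashes
```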

checkouts

  • special pointer to your current branch: HEAD
  • staging area of what you're about to commit: index
  • working copy: everything outside of .git
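
one way to see HEAD, the index, and the working copy side by side, assuming a scratch repo with made-up files a.txt and b.txt. one change is left in the working copy only, the other is staged.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo "original" > a.txt
git add a.txt
git commit -q -m "initial"

echo "edited" >> a.txt   # changed in the working copy only
echo "new" > b.txt
git add b.txt            # b.txt is now in the index

git symbolic-ref HEAD           # HEAD points at the current branch: refs/heads/master
git diff --name-only            # working copy vs. index: a.txt
git diff --cached --name-only   # index vs. HEAD: b.txt
```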

remotes

  • could be on the same file system or another machine
  • can be accessed over different protocols (http|ssh|git)
  • main remote is called origin
  • remotes are basically just URLs
  • remote tracking branches
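
a sketch of a remote on the same file system, assuming a bare repo stands in for the shared server. the paths central.git and local are hypothetical. after the push, origin/master is the remote tracking branch.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
# a bare repo (no working copy) plays the role of the remote
git init -q --bare "$work/central.git"

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
echo "hi" > readme.txt
git add readme.txt
git commit -q -m "initial"

# a remote is basically a name for a URL (here, a plain path)
git remote add origin "$work/central.git"
git remote -v

git push -q -u origin master
git branch -r   # lists origin/master, the remote tracking branch
```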

commits

  • change your working copy
  • git add adds it to the index
  • git commit creates a new vertex in your repo's graph
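
the three steps above, sketched in a scratch repo. the file todo.txt and the commit messages are made up.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "first line" > todo.txt
git add todo.txt
git commit -q -m "initial"

echo "second line" >> todo.txt        # 1. change your working copy
git status --short                    # " M todo.txt": modified, not staged
git add todo.txt                      # 2. add it to the index
git status --short                    # "M  todo.txt": staged
git commit -q -m "add second line"    # 3. create the new commit
git log --oneline
```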

fetching and merging

  • fetching just adds vertices from a remote
  • merging is just moving the branch pointers together
    • (and usually updating the HEAD)
  • it happens one of two ways:
    • fast-forward
    • three-way merge
  • pull: fetch + merge
  • push: inverse fetch + merge
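
a sketch of fetch followed by a fast-forward merge, assuming a bare repo as the remote and two hypothetical developers, alice and bob. bob has no local commits, so merging only moves his master pointer forward.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
git init -q --bare "$work/central.git"
git --git-dir="$work/central.git" symbolic-ref HEAD refs/heads/master

# alice makes the first commit and pushes it
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "one"
git remote add origin "$work/central.git"
git push -q origin master

# bob clones, then alice pushes a second commit
git clone -q "$work/central.git" "$work/bob"
echo "two" >> file.txt
git commit -q -am "two"
git push -q origin master

# bob fetches: new vertices arrive, but his master has not moved
cd "$work/bob"
before=$(git rev-parse master)
git fetch -q origin
git log --oneline master..origin/master   # the commit bob doesn't have yet

# merge: nothing to reconcile, so this is a fast-forward
git merge -q --ff-only origin/master
```

`git pull` would do the fetch and the merge in one step.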

crazy next-level stuff #1: feature branches

  • you're working on a feature that takes a while
  • you find out there's some massively horrible bug
  • put all of your feature work in a separate branch
  • do the bugfix and push it to production
  • and pick up where you left off
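
the scenario above as a sketch, assuming a scratch repo. the branch name new-feature and the files are hypothetical. the half-done feature lives on its own branch, the bugfix goes onto master, and then work resumes.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "app with a bug" > app.txt
git add app.txt
git commit -q -m "initial"

# the long-running feature goes in a separate branch
git checkout -q -b new-feature
echo "half-done feature" > feature.txt
git add feature.txt
git commit -q -m "wip: start feature"

# the horrible bug shows up: switch back and fix it on master
git checkout -q master
echo "app, bug fixed" > app.txt
git commit -q -am "fix the bug"
# (this is where you'd push master to production)

# pick up where you left off
git checkout -q new-feature
```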

crazy next-level stuff #2: tangled working copy

  • find a bug while you're doing something else
  • you fix the bug
  • but you don't want to commit it with the other thing
  • git add --interactive

crazy next-level stuff #3: async development

  • one team member is responsible for the database
  • one is responsible for the UI
  • in a perfect world, do the database work first, then the UI
  • in the real world, stuff is forgotten, miscommunicated, etc.
  • shared git branch allows you to pass this work back and forth
  • rest of the team doesn't see the half-baked change

how do people actually use it?

  • git is incredibly flexible
  • "framework for workflows"
  • can be confusing, but some notable patterns are emerging
  • note: there's always one central git repo

git flow

  • two main branches:
    • develop: push to this any time you finish something
    • master: should always be deployable
  • other short-lived branches:
    • feature branches: kept local unless needed to share
    • release branches:
      • start these shortly prior to release
      • usually deployed to a staging area
      • only bugfixes are committed
      • merged back into develop
    • hotfix branches

(source)

github flow

  • one branch: master
  • master is always deployable
  • features and bugs all happen in descriptive branches off of master
  • usually involves rigorous code reviews
    • github pull requests work well for this
    • forking also supports this process
  • deploy immediately after review

trade-offs

  • github flow is simpler
    • …to learn
    • …to build (graphical) tools for
    • …to screw up production
  • github flow requires a greater understanding of the code
    • with every merge, you need to know it doesn't break things
    • less appropriate for legacy code
  • github flow allows you to move faster
  • if you've got multiple deployments, you need git flow

(you probably don't need git flow for school)

thanks!

@aaronj1335

slides at github.com/aaronj1335/git-complex-but-worth-it

don't forget waterfall is hiring!

questions?