2010-11-12

Git/Hub Workflow Experiences

The Jetpack project recently migrated its SDK repository to Git (hosted on GitHub), and we've been working out changes to the bug/review/commit workflow that GitHub's tools enable (specifically, pull requests).
 
Here are some of my initial experiences and my thoughts on them (which I've also posted to the Jetpack discussion group).
 
Warning: Git wonkery ahead, with excruciating details. I would not want to read this post. I recommend you skip it. ;-)


Part 1: Wherein I Handle My First Pull Request

Atul submitted GitHub pull request 33 to fix some test failures. I reviewed the changes (comprising two commits) on GitHub, and then I pushed them to the canonical repository via the following set of commands:
  1. git checkout -b toolness-4.0b7-bustage-fixes master
  2. git pull https://github.com/toolness/jetpack-sdk.git 4.0b7-bustage-fixes
  3. git checkout master
  4. git merge toolness-4.0b7-bustage-fixes
  5. git push upstream master

That landed the two commits in the canonical repository, but it isn't obvious that they were related (i.e. part of the same pull request), that I was the one who reviewed them, or that I was the one who pushed them.
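To make that concrete, here is a minimal sketch in a throwaway repository. The names "Reviewer" and "Contributor", the commit messages, and the use of empty commits are all assumptions for illustration, not the actual Jetpack history. It shows that when master has not moved, the merge is a fast-forward: no merge commit is created, and the person who merged appears nowhere in the new commits.

```shell
#!/bin/sh
# Sketch only: hypothetical names, throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"
git commit -q --allow-empty -m "initial commit"

# Two commits on a topic branch, recorded as if the contributor made them.
git checkout -q -b topic master
for msg in "fix first test" "fix second test"; do
  GIT_COMMITTER_NAME="Contributor" GIT_COMMITTER_EMAIL="contributor@example.com" \
    git commit -q --allow-empty -m "$msg" \
    --author "Contributor <contributor@example.com>"
done

# master has not moved, so this merge is a fast-forward.
# (--ff-only just makes the sketch fail loudly if it is not.)
git checkout -q master
git merge -q --ff-only topic

merges=$(git rev-list --merges --count master)
reviewer_hits=$(git log --format='%an %cn' master~2..master | grep -c "Reviewer" || true)
echo "merge commits on master: $merges"
echo "mentions of the reviewer in the two new commits: $reviewer_hits"
```

Both counts come out as zero here, which matches the complaint above: nothing in history ties the two commits together or names the reviewer.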


Part 2: Wherein I Handle My Second Pull Request

So for the fix for bug 611042, which Atul submitted as GitHub pull request 34, I again reviewed the changes (also comprising two commits) on GitHub. This time, after discussion with Atul and Patrick Walton of the Rust team, I pushed them to the canonical repository via a different set of commands:
  1. git checkout -b toolness-bug-611042 master
  2. git pull https://github.com/toolness/jetpack-sdk.git bug-611042
  3. (There might have been something else here, since the pull request resulted in a merge; I don't quite remember.)
  4. git checkout master
  5. git merge --no-ff --no-commit toolness-bug-611042
  6. git commit --signoff -m "bug 611042: remove request.response.xml for e10s compatibility; r=myk" --author "atul"
  7. git push upstream master

Because Atul's pull request was no longer against the tip (since I had just merged those previous changes), when I pulled the remote bug-611042 branch into my local toolness-bug-611042 branch (step 2), I had to merge his changes, which resulted in a merge commit.

Merging the changes to my local master with "--no-ff" and "--no-commit" (step 5) then allowed me to commit the merge to my master branch manually (step 6), resulting in another merge commit.

For the second merge commit, I specified "--signoff", which added "Signed-off-by: Myk Melez " to the commit message; crafted a custom commit message that included "r=myk"; and specified '--author "atul"', which made Atul the author of the merge.

I dislike having the former merge commit in history, since it records extraneous, unhelpful details about how I did the merging locally before I pushed to the canonical repository. I'm not sure how to avoid it, though.

On the other hand, I like having the latter merge commit in history, since it provides context for Atul's two commits: the bug number, the fact that the changes were reviewed, and a commit message that describes the changes as a whole.

I'm ambivalent about --signoff vs. adding "r=myk" to the commit message, as they seem equivalentish, with --signoff being more explicit (so in theory it might form part of an enlightened workflow in the future), while "r=myk" is simpler.

And I dislike having made Atul the author of the merge, since it's incorrect: he wasn't the author of the merge, he was only the author of the changes (for which he is correctly credited). And if the merge itself caused problems (e.g. if I accidentally backed out other recent changes in the process), I would be the one responsible for fixing those problems, not Atul.
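As a sketch of the variation these observations point toward, the commands below repeat the --no-ff/--no-commit merge in a throwaway repository but leave out --author, so the merge is credited to whoever performs it. Everything named here ("Reviewer", "Contributor", the file fix.txt, the placeholder "bug 000000") is hypothetical.

```shell
#!/bin/sh
# Sketch only: hypothetical names, throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"
git commit -q --allow-empty -m "initial commit"

# The contributor's change on a topic branch.
git checkout -q -b topic master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "contributor's change" \
  --author "Contributor <contributor@example.com>"

# Stage the merge without fast-forwarding and without committing...
git checkout -q master
git merge -q --no-ff --no-commit topic
# ...then commit it by hand, with a sign-off and a review note, but no --author.
git commit -q --signoff -m "bug 000000: example summary; r=reviewer"

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
trailer=$(git log -1 --format=%B | grep "^Signed-off-by:")
merge_author=$(git log -1 --format=%an)
change_author=$(git log -1 --format=%an topic)
echo "parents of merge commit: $parents"
echo "$trailer"
echo "merge author: $merge_author, change author: $change_author"
```

In this sketch the merge commit has two parents and carries the Signed-off-by line, the merge is attributed to the reviewer, and the contributor stays the author of the change itself.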


Part 3: Pushing Patches

In addition to pull requests, one can also contribute via patches. I've pushed a few of these via something like the following set of commands:
  1. git apply patch.diff
  2. git commit -a -m "bug : ; r=myk" --author ""
  3. git push upstream master

That results in a commit like this one, which shows me as the committer and the patch author as the author. And that seems like a fine record of what happened.
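A minimal sketch of how the author/committer split shows up, using a throwaway repository. The patch, the names "Reviewer" and "Patch Author", and the placeholder "bug 000000" are invented for illustration.

```shell
#!/bin/sh
# Sketch only: hypothetical names, throwaway repository.
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"
echo "old line" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Produce a patch file, then restore the clean tree, as if the patch
# had arrived from someone else.
echo "new line" > file.txt
git diff > "$work/patch.diff"
git checkout -q -- file.txt

# Apply the patch and commit it on the patch author's behalf.
git apply "$work/patch.diff"
git commit -q -a -m "bug 000000: example summary; r=reviewer" \
  --author "Patch Author <patch.author@example.com>"

author=$(git log -1 --format=%an)
committer=$(git log -1 --format=%cn)
echo "author: $author"
echo "committer: $committer"
```

The %an and %cn placeholders of git log print the author and committer names, which is a quick way to check the record before pushing.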


Part 4: To Bug or Not To Bug?

One of the questions GitHub raises is whether or not every change deserves a bug report. And if not, how do we differentiate those that do from the rest?

I don't have definitive answers to these questions, but my sense, from my experience so far, is that we shouldn't require every change to be accompanied by a bug report. Larger, riskier, more time-consuming, and/or controversial changes should have reports to capture history, provide a forum for discussion, and permit project planning. Bug reports should be optional for smaller, safer, quickly resolved, and/or non-controversial changes.

3 comments:

Dustin J. Mitchell said...

As for the pull producing a merge, there are two solutions. First, you can simply pull into master without first creating a non-master branch. I have an alias that runs 'git pull --no-ff --no-commit':

ghpull = !sh -c 'git pull --no-ff --no-commit git://github.com/$0/buildbot.git $1'

This leaves the changes staged without committing them. I usually run a quick 'git diff --cached' to verify what I'm committing is what I reviewed, and then commit. You could add your --signoff, --author, etc. on the 'git commit' line.
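The first option can be tried out locally with a sketch like the following. It is an illustration under assumptions, not the alias above: "." stands in for the GitHub URL by pulling from the same throwaway repository, the branch name "contrib" and the file fix.txt are made up, and --no-rebase is added only so that newer Git versions do not ask how to reconcile branches.

```shell
#!/bin/sh
# Sketch only: hypothetical names, throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"
git commit -q --allow-empty -m "initial commit"

# The contributor's branch, standing in for the branch on their fork.
git checkout -q -b contrib master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "contributor's change"

# Pull straight into master; the changes end up staged, not committed.
git checkout -q master
git pull -q --no-rebase --no-ff --no-commit . contrib

# Check what is about to be committed, then conclude the merge by hand.
staged=$(git diff --cached --name-only)
git commit -q -m "merge contributor's change; r=reviewer"

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "staged before commit: $staged"
echo "parents of merge commit: $parents"
```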

The second, slightly more difficult option that allows you to continue to use your topic branch is to use 'git pull', and when you see that it's performed a merge, use 'git reset --hard HEAD^2' to reset the branch to the second parent -- the youngest of the commits you just merged -- rather than the merge of that commit and your master. You can even do this if there are merge conflicts!
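The second option can be sketched the same way. Again, the branch names, file names, and the use of "." in place of the remote URL are assumptions for illustration; --no-rebase and --no-edit are there only to keep the pull non-interactive on newer Git versions.

```shell
#!/bin/sh
# Sketch only: hypothetical names, throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# The contributor branched from the initial commit...
git checkout -q -b contrib master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "contributor's change"

# ...and master has moved on since, so a pull can no longer fast-forward.
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "newer change on master"

# Pulling into a topic branch cut from master produces a merge commit.
git checkout -q -b topic master
git pull -q --no-rebase --no-edit . contrib
merges_before=$(git rev-list --merges --count topic)

# Reset the topic branch to the second parent: the tip of what was pulled.
git reset -q --hard 'HEAD^2'
merges_after=$(git rev-list --merges --count topic)
topic_tip=$(git rev-parse topic)
contrib_tip=$(git rev-parse contrib)
echo "merge commits on topic before reset: $merges_before, after: $merges_after"
```

After the reset, the topic branch points at exactly the contributor's tip, so a later 'git merge --no-ff' into master records only one merge commit.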

dietrich said...

for pushing patches, you don't need the upstream alias since you're pushing directly to the add-on sdk master. i use 'git push origin master'.

myk said...

Dietrich: I actually use a local repo I cloned from my own GitHub fork, as recommended by Lloyd Hilaiel, so my "origin" is not the canonical repository, which is why I use the "upstream" alias (to indicate the repository that is "upstream" from my fork).
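For anyone unfamiliar with that setup, here is a rough sketch using local bare repositories in place of GitHub. The paths canonical.git and fork.git are stand-ins I made up; on GitHub they would be the canonical repository URL and the URL of your own fork.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for GitHub.
set -e
work=$(mktemp -d)
git init -q --bare "$work/canonical.git"   # the canonical repository
git init -q --bare "$work/fork.git"        # my fork of it

# Clone the fork, so "origin" is the fork, not the canonical repository.
git clone -q "$work/fork.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Reviewer"
git config user.email "reviewer@example.com"

# Add the canonical repository under the name "upstream".
git remote add upstream "$work/canonical.git"

git commit -q --allow-empty -m "reviewed change"
git push -q upstream master   # lands in the canonical repository
git push -q origin master     # keeps the fork in step

upstream_url=$(git config remote.upstream.url)
canonical_tip=$(git --git-dir="$work/canonical.git" rev-parse master)
local_tip=$(git rev-parse master)
git remote -v
```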