2010-11-27

Further Adventures In Git(/Hub)ery

This evening I decided to check if there were any outstanding pull requests for the SDK repository (to which I haven't been paying attention).

There were! The oldest was pull request 29 from Thomas Bassetto, which contains two small fixes (first, second) to the docs.

So I fetched the branch of his fork in which the changes reside:

$ git fetch https://github.com/tbassetto/addon-sdk.git master

But that branch (and the fork in general) is a few weeks out-of-date, so "git diff HEAD FETCH_HEAD" showed a bunch of changes, and it was unclear how painful the merge would be.

Thus I decided to try cherry-picking the changes, my first time using "git cherry-pick".

The first one went great:

$ git cherry-pick 8268334070d03a896d5c006d1b4db94d4cb44b17
Finished one cherry-pick.
[master ceadb1f] Fixed an internal link in the widget doc
 1 files changed, 1 insertions(+), 1 deletions(-)

Except that I realized afterward I hadn't added "r,a=myk" to the commit message. So I tried "git commit --amend" for the first time, which worked just fine:

$ git commit --amend
[master 2d674a6] Fixed an internal link in the widget doc; r,a=myk
 1 files changed, 1 insertions(+), 1 deletions(-)

Next time I'll remember to use the "--edit" flag to "git cherry-pick", which lets one "edit the commit message prior to committing."

The second cherry-pick was more complicated, because I only wanted one of the two changes in the commit (in my review, I had identified the second change as unnecessary); and, as it turned out, also because there was a merge conflict with other commits.

I started by cherry-picking the commit with the "--no-commit" option (so I could remove the second change):

$ git cherry-pick --no-commit 666ad7a99e05e338348dfc579d5b1f75e8d3bb1b
Automatic cherry-pick failed.  After resolving the conflicts,
mark the corrected paths with 'git add <paths>' or 'git rm <paths>' and commit the result.
When commiting, use the option '-c 666ad7a' to retain authorship and message.

The conflict was trivial, and I knew where it was, so I resolved it manually (instead of trying "git mergetool" for the first time), removed the second change, added the merged file, and committed the result, using the "-c" option to preserve the original author and commit message while allowing me to edit the message to add "r,a=myk":

$ git add packages/addon-kit/docs/request.md
$ git commit -c 666ad7a
[master 774d1cb] Completed the example in the Request module documentation; r,a=myk
 1 files changed, 1 insertions(+), 0 deletions(-)

Then I used "gitg" and "git log master ^upstream/master" to verify that the commits looked good to go, after which I pushed them:

$ git push upstream master
[git's standard obscure and disconcerting gobbledygook]

Finally, I closed the pull request with this comment that summarized what I did and provided links to the cherry-picked commits.

It would have been nice if the cherry-picked commit that didn't have merge conflicts (and which I didn't change in the process of merging) had kept its original commit ID, but I sense that that is somehow a fundamental violation of the model.

It would also have been nice if the cherry-picked commit messages had been automatically annotated with references to the original commits.

But overall the process seemed pretty reasonable, it was fairly easy to do what I wanted and recover from mistakes, and the author, committer, reviewer, and approver are clearly indicated in the cherry-picked commits (first, second).

[Also posted to the discussion group.]

More Git/Hub Workflow Experiences

After posting about my first Git/Hub workflow experiences, I got lots of helpful input from various folks, particularly Erik Vold, Irakli Gozalishvili, and Brian Warner, which led me to refine my process for handling pull requests:

  1. From the "how to merge this pull request" section of the pull request page (f.e. pull request 34), copy the command from step two, but change the word "pull" to "fetch" to fetch the remote branch containing the changes without also merging it:

    git fetch https://github.com/toolness/jetpack-sdk.git bug-610507

  2. Use the magic FETCH_HEAD reference to the last fetched branch to verify that the set of changes is what you expect:

    git diff HEAD FETCH_HEAD

    (The exact syntax here may need some work; HEAD..FETCH_HEAD? three dots?)

  3. Merge the remote branch into your local branch with a custom commit message:

    git merge FETCH_HEAD --no-ff -m"bug 610507: get rid of the nsjetpack package; r=myk"

  4. Push the changes upstream:

    git push upstream master

I like this set of commands because it doesn't require me to add a remote, I can copy/paste the fetch command from GitHub (being careful not to issue the pull before I change it to a fetch), and I always type the same FETCH_HEAD reference to the remote branch in step three.

However, I wish the merge commit page explicitly referenced the specific commits that were merged. It does mention that it's a branch merge, it isn't obvious how to get from that page to the pages for the commits I merged from the branch.

"git log --oneline --graph", gitg, and gitk do give me that information, though, so I'm ok on the command line, anyway.

[More discussion can be found in the discussion group thread.]

2010-11-12

Git/Hub Workflow Experiences

The Jetpack project recently migrated its SDK repository to Git (hosted on GitHub), and we've been working out changes to the bug/review/commit workflow that GitHub's tools enable (specifically, pull requests).
 
Here are some of my initial experiences and my thoughts on them (which I've also posted to the Jetpack discussion group).
 
Warning: Git wonkery ahead, with excruciating details. I would not want to read this post. I recommend you skip it. ;-)


Part 1: Wherein I Handle My First Pull Request

To fix some test failures, Atul submitted GitHub pull request 33, I reviewed the changes (comprising two commits) on GitHub, and then I pushed them to the canonical repository via the following set of commands:
  1. git checkout -b toolness-4.0b7-bustage-fixes master
  2. git pull https://github.com/toolness/jetpack-sdk.git 4.0b7-bustage-fixes
  3. git checkout master
  4. git merge toolness-4.0b7-bustage-fixes
  5. git push upstream master

That landed the two commits in the canonical repository, but it isn't obvious that they were related (i.e. part of the same pull request), that I was the one who reviewed them, or that I was the one who pushed them.


Part 2: Wherein I Handle My Second Pull Request

Thus, for the fix for bug 611042, for which Atul submitted GitHub pull request 34, I again reviewed the changes (also comprising two commits) on GitHub, but then I pushed them to the canonical repository via this different set of commands (after discussion with Atul and Patrick Walton of the Rust team):
  1. git checkout -b toolness-bug-611042 master
  2. git pull https://github.com/toolness/jetpack-sdk.git bug-611042
  3. (There might have been something else here, since the pull request resulted in a merge; I don't quite remember.)
  4. git checkout master
  5. git merge --no-ff --no-commit toolness-bug-611042
  6. git commit --signoff -m "bug 611042: remove request.response.xml for e10s compatibility; r=myk" --author "atul"
  7. git push upstream master

Because Atul's pull request was no longer against the tip (since I had just merged those previous changes), when I pulled the remote bug-611042 branch into my local toolness-bug-611042 branch (step 2), I had to merge his changes, which resulted in a merge commit.

Merging the changes to my local master with "--no-ff" and "--no-commit" (step 5) then allowed me to commit the merge to my master branch manually (step 6), resulting in another merge commit.

For the second merge commit, I specified "--signoff", which added "Signed-off-by: Myk Melez " to the commit message; crafted a custom commit message that included "r=myk"; and specified '--author "atul"', which made Atul the author of the merge.

I dislike having the former merge commit in history, since it's extraneous, unuseful details about how I did the merging locally before I pushed to the canonical repository. I'm not sure how to avoid it, though.

On the other hand, I like having the latter merge commit in history, since it provides context for Atul's two commits: the bug number, the fact that the changes were reviewed, and a commit message that describes the changes as a whole.

I'm ambivalent about --signoff vs. adding "r=myk" to the commit message, as they seem equivalentish, with --signoff being more explicit (so in theory it might form part of an enlightened workflow in the future), while "r=myk" is simpler.

And I dislike having made Atul the author of the merge, since it's incorrect: he wasn't the author of the merge, he was only the author of the changes (for which he is correctly credited). And if the merge itself caused problems (f.e. I accidentally backed out other recent changes in the process), I would be the one responsible for fixing those problems, not Atul.


Part 3: Pushing Patches

In addition to pull requests, one can also contribute via patches. I've pushed a few of these via something like the following set of commands:
  1. git apply patch.diff
  2. git commit -a -m "bug : ; r=myk" --author ""
  3. git push upstream master
That results in a commit like this one, which shows me as the committer and the patch author as the author. And that seems like a fine record of what happened.


Part 4: To Bug or Not To Bug?

One of the questions GitHub raises is whether or not every change deserves a bug report. And if not, how do we differentiate those that do from the rest?

I don't have the definitive answers to these questions, but my sense, from my experience so far, is that we shouldn't require all changes to be accompanied by bug reports, but larger, riskier, time-consuming, and/or controversial changes should have reports to capture history, provide a forum for discussion, and permit project planning; while bug reports should be optional for smaller, safer, quickly-resolved, and/or non-controversial changes.