Blog Archives

What is Git cherry picking and how do you use it?

“Cherry pick a commit”. I’ve heard the phrase often. It sounds kind of endearing, yet scarily technical at the same time. What is cherry picking and why would you want to do it? One fine day I found that I needed it, and suddenly I appreciated the what and the why. So I figured out the how. I hope this post will help you towards the same understanding.

Here’s the scenario: I’d applied a change to the latest version of the Kubeflow docs. Specifically, the change added a banner and associated logic to inform readers if they’re reading an archived version of the docs. Now I needed to copy the same banner and logic to the older (archived) versions of the docs.

More details of the scenario

The screenshot below shows the banner that I wanted to add to all the archived versions of the docs:

The way we store archived versions of the Kubeflow docs is to make a branch of the current version (that is, a branch from the master). For example, here’s v0.6 of the docs, for which the source is in this branch on GitHub. The master branch contains the current version of the docs.

I’d added the banner and accompanying logic to the master branch in this pull request (PR). Now I needed to copy the code to all the archived branches. I didn’t want to have to copy/paste all my changes into the relevant files in every affected branch.

Enter cherry picking.

Picking sweet cherries

It’s useful to know that, when you’re using GitHub, cherry picking a commit is equivalent to cherry-picking a PR. GitHub squashes all the commits in a PR into a single commit when merging the PR into the code base.

What does a cherry-picked PR look like? No different from any other PR. It’s a collection of changes that you want to make, pointing to the branch on which you want to make them. For example, PR #1550 is a cherry pick of PR #1535, with a few extra changes added after cherry picking.

Below are the steps that I figured out to prepare and do the cherry picking. One thing to note in particular is that I had to do something different if my fork of the repository already contained a copy of the branch into which I intended to cherry pick.

The first step is to check out the master branch, which contains the updates that I want to copy to the archive branches:

git checkout master

Make sure my local working directory is up to date, by pulling all content from the remote master branch. (I’m working on a fork of the Kubeflow website repository. The convention is to give the name upstream to the repository from which you forked.)

git pull upstream master

Get a log of commits made to the master branch, to find the commit that I want to cherry pick:

git log upstream/master

A commit name consists of a long string of letters and numbers. Let’s say that I need the commit named e895a107edba5e68cc0e36fa3a05a687e806cc19.

Check to see which branches I have locally:

git branch -v

Also check my fork on GitHub to see which branches I already have there.

Now I’m ready to prepare the first archived branch for cherry picking. Let’s say I start with the version 0.6 branch of the docs, named v0.6-branch. If I don’t already have the branch on my fork, I need to get a copy of the branch from the remote master, and then push that copy up to my fork, so that I have a clean slate to apply the cherry pick to. So, I pull the branch down to my local working directory then push it up to my fork. In this example, the branch name is v0.6-branch:

git checkout master
git pull upstream v0.6-branch:v0.6-branch
git checkout v0.6-branch
git push origin v0.6-branch

(I’m working on a fork of the Kubeflow website repository. By default, the name of your fork of the repository is origin.)

In the cases where I do already have the branch on my fork, I need to copy the branch from my fork down to my local working directory, check that the branch is up to date by fetching updates from the main repository, then push the branch back up to my fork. In this example, the branch name is v0.5-branch:

git fetch origin v0.5-branch:v0.5-branch
git checkout v0.5-branch
git status
git fetch upstream v0.5-branch
git push origin v0.5-branch

Now I’m ready to cherry pick the changes I need. Remember, I’m cherry picking from master into an archive branch. Let’s say I want to cherry pick into the v0.6-branch:

git checkout v0.6-branch
git cherry-pick e895a107edba5e68cc0e36fa3a05a687e806cc19

The long string of letters and numbers is the name of the commit, which I obtained earlier by running git log.

The changes are now in my local copy of the branch. I can make extra changes if I want to. (For example, in my case I needed to update some metadata that relates specifically to the branch, including an archive flag used in the logic that determines whether to display the banner on the doc pages.)

When I’m happy with the cherry-picked updates and any other changes I’ve made, I push the updated branch up to my fork:

git push origin v0.6-branch

Then I create a PR and specify the base branch to be the name of the branch into which I’m cherry picking the changes. In the case of the above example, the base branch should be “v0.6-branch”. The screenshot below shows the base option, currently pointing to “master”, on the GitHub UI when creating a PR:

Can the cherries turn sour?

In the above scenario, I used cherry picking to apply a change going backwards in time. The requirement was to apply an update to older versions of the docs, which as a rule we don’t update very often. I didn’t cherry pick from a feature branch into the master branch. There are plenty of warnings on the web about things that could go wrong when you cherry pick. I found this post by Rob Friesel helpful in providing context in a non-scary way.

How did I make the banner itself?

That’s another story. 🙂

%d bloggers like this: