A Beginner's Guide to Contributing to Open-Source Projects: Navigating Git and GitHub

Making Sense of Remote Upstream and Remote Origin

As I made the transition to a career in software engineering, I kept hearing about the benefits of contributing to open-source projects. Although the idea initially intimidated me, I eventually found an open-source project that aligned with my values and goals and eagerly signed up to help solve real-world problems.

While I knew that contributing to an open-source project would be challenging and involve a lot of learning, I also recognized that the experience would provide valuable skills that are difficult to acquire through solo projects. Now, after three months of contributing to the Hack for LA website, I want to share my notes and realizations with fellow junior developers. My hope is that this series will inspire and motivate others to seek out open-source projects that excite them.

In this first blog article, I'll discuss my experience with GitHub. While I was already familiar with GitHub before I began contributing to HfLA, I realized that my understanding of the relationships between local and remote repositories was limited. Although I had heard about the benefits of version control for continuous product development, I hadn't taken full advantage of its features. However, working on open-source projects (or any team project) made terms like upstream, origin, feature branch, fetch, and merge more common. Despite the availability of cheat sheets for git and GitHub, I've found that gaining a deeper understanding of what and why, rather than just how, is essential for consistent contributions to a project.

Prerequisites: You have git installed on your local machine, have a GitHub account, and have experience creating remote repos, cloning them to your local machine, and pushing local commits to them.

Disclaimer: My learning notes focus on technical solutions and routines. I omitted the steps to join a project as a contributor, as I assume the process and requirements differ from project to project. If you are interested in Hack for LA's onboarding process, check out Hack for LA Getting Started.

Upstream vs Origin

Both terms refer to remote repositories, but before discussing their differences, let's first review the—

Steps Typically Followed When Contributing to an Open-Source (Or Team) Project:

  1. Find the "official" project repository (the project you want to contribute to) on GitHub.

  2. Create a new fork, which is a copy of the "official" repository to your own GitHub account.

  3. Clone your forked repository to your local machine.

After completing these steps, you will have a local repository whose origin is your forked remote repository. Cloning only sets up origin; the upstream, which is the "official" project repository, is a second remote that you add yourself with the git remote add upstream command covered in the troubleshooting section below. The naming can be confusing, as the upstream is actually the true original repository. To help remember these relationships, I tend to think of my local repository as a child, the remote origin as a parent, and the remote upstream as the ancestor.
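To see these relationships without touching GitHub, here is a minimal sandbox sketch. It assumes local folders can stand in for the remote repositories: the names official, my-fork.git, and local are hypothetical, and the bare clone only imitates what the Fork button does. The demo identity is set so the commit works on a machine without a git user configured.

```shell
#!/bin/sh
# Sandbox sketch: local folders stand in for GitHub repositories (assumption).
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# 1. The "official" project repository (hypothetical name: official).
git init -q official
git -C official commit -q --allow-empty -m "Initial commit"
git -C official branch -M main

# 2. "Fork" it. On GitHub this is the Fork button; here a bare clone plays that role.
git clone -q --bare official my-fork.git

# 3. Clone the fork to get a local working repository.
git clone -q my-fork.git local
cd local

# A fresh clone knows only one remote: origin (the fork).
remotes_after_clone=$(git remote)
echo "remotes after clone: $remotes_after_clone"

# Register the official repository as upstream.
git remote add upstream "$sandbox/official"
upstream_url=$(git remote get-url upstream)
echo "upstream points to: $upstream_url"

# Show fetch and push URLs for both remotes.
git remote -v
```

In a real project, the two local paths would be the GitHub URLs of your fork and of the official repository.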

If you are unsure whether your origin and upstream are set up correctly, you can use the following commands to—

Double-Check Remotes:

git remote show origin

git remote show upstream

or git remote -v to show both

Troubleshoot Remotes:

Cloning your fork sets up origin automatically, but upstream usually has to be added by hand. If either remote is missing, here is how to:

Add an Upstream or Origin to Your Local Repo:

git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git
git remote add origin https://github.com/YOU/YOUR_FORKED_REPOSITORY.git

On the rare occasion that a remote points to the wrong URL, you will need to:

Reset Remote Upstream or Origin:

git remote set-url upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git
git remote set-url origin https://github.com/YOU/YOUR_FORKED_REPOSITORY.git
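When is add the right command and when is set-url? The sketch below is one way to see the difference. It assumes a throwaway repository and uses the placeholder URLs from above plus a hypothetical mistyped one (WRONG_NAME). Nothing is contacted over the network, because git only stores the URL strings at this point.

```shell
#!/bin/sh
# Sandbox sketch: a repo whose upstream URL was mistyped, then corrected.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q local
cd local

# Placeholder URLs; WRONG_NAME is a hypothetical typo.
git remote add origin https://github.com/YOU/YOUR_FORKED_REPOSITORY.git
git remote add upstream https://github.com/ORIGINAL_OWNER/WRONG_NAME.git

# Adding a remote name that already exists fails, so set-url is the fix.
if git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git 2>/dev/null; then
  echo "unexpected: duplicate remote was accepted"
else
  echo "upstream already exists, updating its URL instead"
  git remote set-url upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git
fi

fixed_url=$(git remote get-url upstream)
echo "upstream now points to: $fixed_url"
```

In short: add is for a remote name that does not exist yet, and set-url is for changing where an existing name points.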

🤯 Why does this matter?

It matters because while you hack the code and make changes locally, other contributors are doing the same, and new changes may have been merged into the upstream repo. You want to keep your local repo synced up with the upstream as much as you can to avoid confusion and wasted time and energy.

Local Repo

Assuming that you have forked the project repo, cloned a copy to your local machine, and feel ready to tackle your first "good first issue" (congrats on completing the basic GitHub setup for contributing to an open-source or team project!), the next step is to create a feature branch. I call it a feature branch, but it really means any branch that you make changes in, whether it's a bug fix, refactoring, editing comments, or adding a feature 😉. It's a good practice because 1) you get the chance to give the branch a descriptive name, which will come in super handy when you submit a pull request, and 2) you'll have the freedom to break things and test bold changes without worrying about "messing up" your main branch.

Main Branch

By default, you're on the main branch right after cloning. The git branch command lists all branches and highlights the branch you're currently on. I think it's also a good habit to run this command from time to time to ensure that you're making changes to the intended branch.

Before creating a feature branch, now it's time to—

Sync up the Main Branch With the Remote Upstream:

git pull upstream main-branch-name

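You may be prompted to enter a merge message if the upstream repo has new commits (which is highly probable if the team has many contributors) and your local branch also has commits that the upstream lacks. If your branch has no commits of its own, git simply fast-forwards without asking. When the prompt does appear, your terminal opens the vi editor with a default merge message on the first line.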

If you're like this 🫣😬😵🫠, you're not alone—the first time I was "forced" to use vi (or vim editor), I was screaming inside: "Get me out of here!!!" 😱 but I overcame my fear of using the vi editor, and you can too. Here are three solutions to—

Merge Remote Branch into Local Branch and Add Message in VI Editor:

  1. Learn and get used to the basics of the vi editor. It's not as bad as it seems (or is it?). These are the steps to add a commit message:

    1. Press "i" (for "insert", as you may have guessed), and you'll see -- INSERT -- at the bottom of the window.

    2. Now you'll notice that you can move the cursor using the arrow keys on your keyboard. The first line is the default merge message, but it's your choice to keep, edit, or delete and rewrite it.

    3. Once you're happy with the merge message, press the "esc" button. You'll notice that the -- INSERT -- at the bottom disappeared.

    4. But we haven't completely "escaped" yet. You need to type ":wq", which stands for "write and quit."

    5. Finally, press the "return" key.

  2. If the steps above seem too tedious, an alternative solution would be to configure git to use an editor of your choice. However, I haven't tried doing this, so I can't endorse a specific StackOverflow answer. Please do your own research. Once I memorized the cryptic commands above, I didn't mind the first solution anymore because it makes me feel like a hacker 🥷🏻👩🏻‍💻.

  3. I'm not sure if this is considered good practice, but if, like me, you pull from the upstream repo purely to sync up and merge commits into your local branch, the merge message becomes almost irrelevant. In that case, you could use git pull upstream main-branch-name --no-edit instead of just git pull upstream main-branch-name.

    From the git documentation: "--edit, -e, --no-edit: Invoke an editor before committing successful mechanical merge to further edit the auto-generated merge message, so that the user can explain and justify the merge. The --no-edit option can be used to accept the auto-generated message (this is generally discouraged)." 👈🏻
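Solutions 2 and 3 can be tried safely in a sandbox. The sketch below is a rough illustration under a few assumptions: local folders replace GitHub, the names official and local are hypothetical, nano is just an example editor, and the commits are empty to keep things short. It also passes --no-rebase, because recent git versions want to be told whether a diverged pull should merge or rebase, and merging is the behavior described in this article.

```shell
#!/bin/sh
# Sandbox sketch: a diverged main branch merged without opening an editor.
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# "official" stands in for the upstream project (hypothetical name).
git init -q official
git -C official commit -q --allow-empty -m "Initial commit"
git -C official branch -M main

# Sandbox shortcut: clone the official repo directly and call it upstream.
git clone -q official local
git -C local remote rename origin upstream

# Both sides gain a commit, so the two histories diverge.
git -C official commit -q --allow-empty -m "Upstream change"
git -C local commit -q --allow-empty -m "Local change"

cd local

# Solution 2: choose a friendlier editor for this repo only (nano is an example).
git config core.editor nano

# Solution 3: --no-edit accepts the auto-generated message, so no editor opens.
git pull -q --no-rebase --no-edit upstream main

merge_subject=$(git log -1 --format=%s)
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "merge message: $merge_subject"
echo "number of parents: $parent_count"
```

A commit with two parents is a merge commit, which is what the vi prompt would otherwise have asked you to describe. To apply the editor choice to every repository on your machine, git config accepts a --global flag.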

Feature Branch

Now that our local main branch is synced up with the upstream main branch, we can create and switch to a feature branch! Feature branches are created off of the main branch. Therefore, if you have just pulled from the upstream in your main branch and created a feature branch immediately after, your feature branch should be up to date with the upstream too.

Create and Switch to a Feature Branch

  1. Double-check that we are on the main branch by running the following command:

    git branch

  2. To create and switch to a feature branch, run the following command:

    git checkout -b feature-branch-name

    Depending on the project or team, you may be asked to name your feature branch following specific guidelines. If not, it's still a good practice to give an informative name to your feature branch. Use a few words to sum up what the changes are for and include the number of the issue you are trying to resolve. For example: "debug-update-label-gha-3341" for a feature branch that fixed a bug in the GitHub Actions update_label.js file.

    Note that git branch feature-branch-name and git checkout -b feature-branch-name both create a new branch, but the latter also switches to the new branch.

Now that we are in the feature branch, let's start hacking! If you have doubts, run git branch to check which branch you are currently on.
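If you want to watch the difference between the two branch commands, here is a small sandbox sketch. It assumes a throwaway repository. The first branch name is the example from above, while docs-fix-typo-0000 is a made-up name for illustration. git branch --show-current prints only the current branch name and needs a reasonably recent git.

```shell
#!/bin/sh
# Sandbox sketch: creating branches with and without switching to them.
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q local
cd local
git commit -q --allow-empty -m "Initial commit"
git branch -M main

# Create the branch and switch to it in one step.
git checkout -q -b debug-update-label-gha-3341
current=$(git branch --show-current)
echo "now on: $current"

# git branch <name> only creates the branch; the current branch stays the same.
git branch docs-fix-typo-0000
still_current=$(git branch --show-current)
echo "still on: $still_current"

# Lists all three branches, with * marking the current one.
git branch
```

Newer git versions also offer git switch -c feature-branch-name, which does the same job as git checkout -b.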

Sync Up Feature Branch with Remote Upstream

In your feature branch, you can and should still run git pull upstream main-branch-name from time to time to merge in any commits the upstream repository has received since you created the feature branch. How often you should run this command will depend on the project and team, so it's hard to give an exact frequency. However, it's essential to keep your feature branch up to date with the upstream main branch to avoid or *minimize any conflicts that may arise later.

* You may need to resolve merge conflicts when running git pull upstream main-branch-name. I will write a separate article on how to resolve merge conflicts.
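One way to decide whether it is time to sync is to ask git how many upstream commits your feature branch is missing. The sketch below is a sandbox illustration under stated assumptions: local folders replace GitHub, and the names official, local, and my-feature are hypothetical. It uses git fetch to download upstream commits without merging them and git rev-list --count to count the missing ones. As before, --no-rebase asks for a merge and --no-edit skips the editor.

```shell
#!/bin/sh
# Sandbox sketch: checking how far a feature branch is behind upstream, then syncing.
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

git init -q official
git -C official commit -q --allow-empty -m "Initial commit"
git -C official branch -M main

# Sandbox shortcut: clone the official repo directly and call it upstream.
git clone -q official local
git -C local remote rename origin upstream

cd local
git checkout -q -b my-feature
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Start feature"

# Meanwhile, a maintainer merges something into the official repo.
echo "project docs" > ../official/README.md
git -C ../official add README.md
git -C ../official commit -q -m "Add README"

# Download upstream commits without merging, then count what is missing.
git fetch -q upstream
behind_before=$(git rev-list --count HEAD..upstream/main)
echo "upstream commits missing from my-feature: $behind_before"

# Merge them into the feature branch.
git pull -q --no-rebase --no-edit upstream main

behind_after=$(git rev-list --count HEAD..upstream/main)
echo "missing after the pull: $behind_after"
```

After the pull, both feature.txt and the upstream README.md are present in the working directory, and the count drops to zero.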

Remote Feature Branch

Now that your feature branch has merged all recent commits from the remote upstream repo and you have finished making changes locally, you can do the following to—

Commit Local Changes and Push Local Feature Branch to Remote Origin:

  1. Run git add . or git add file-path (if you have only changed one file, it's better to specify the file path).

  2. Run git commit -m "your-brief-yet-descriptive-commit-message" to commit the changes locally.

  3. Finally, push the changes to your remote origin so that you can create a Pull Request from your remote feature branch.

git push --set-upstream origin your-feature-branch-name

You can also use git push -u origin your-feature-branch-name, but let's break down what this command really does:

  • git push uploads the local repository content to a remote repository.

  • origin refers to your remote forked copy of the project on GitHub.

  • your-feature-branch-name specifies which local feature branch you want to push to the remote repo.

  • Finally, the --set-upstream or -u flag tells git to link your local feature branch to the remote feature branch, so that later git push and git pull commands on this branch know which remote branch to use.

The git push -u origin your-feature-branch-name command is only required the first time you push your local feature branch to your remote origin repository. After setting up the upstream link, you can use the git push command to push any future commits made to this feature branch locally to the remote feature branch.
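To check what the -u flag actually records, here is a sandbox sketch. It assumes local folders in place of GitHub: my-fork.git is a bare repository imitating your fork, and the names official, local, and my-feature are hypothetical. The @{upstream} notation asks git which remote branch the current branch is linked to.

```shell
#!/bin/sh
# Sandbox sketch: first push with -u, then a plain git push.
set -eu
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

git init -q official
git -C official commit -q --allow-empty -m "Initial commit"
git -C official branch -M main

# A bare clone imitates the fork on GitHub; cloning it makes it origin.
git clone -q --bare official my-fork.git
git clone -q my-fork.git local

cd local
git checkout -q -b my-feature
git commit -q --allow-empty -m "First feature commit"

# First push: -u (same as --set-upstream) links my-feature to origin/my-feature.
git push -q -u origin my-feature
tracking=$(git rev-parse --abbrev-ref '@{upstream}')
echo "my-feature tracks: $tracking"

# Later pushes on this branch need no arguments.
git commit -q --allow-empty -m "Second feature commit"
git push -q

local_tip=$(git rev-parse HEAD)
remote_tip=$(git -C ../my-fork.git rev-parse refs/heads/my-feature)
echo "local tip:  $local_tip"
echo "remote tip: $remote_tip"
```

The matching commit IDs at the end show that the second, argument-free git push reached the fork's feature branch.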

Thank you for reading this lengthy article. I hope you found it helpful.

In future articles of this series, I plan to cover topics such as submitting and reviewing pull requests, troubleshooting Git and GitHub, and sharing my experience hacking GitHub Actions and writing bots that automatically check issues for updates and add/remove labels.

I welcome feedback and suggestions! Please feel free to leave a comment below.

Cheers!

Bitian