Learn Enough Git to be Useful

By Jeff Hale

In this article we’re going to look at how you use Git with GitHub in four common scenarios.

  1. Alone with your own original project pushing to the master branch.
  2. Alone with your own original project with pull requests.
  3. Alone with a cloned project.
  4. Collaboratively contributing to a project.

Git is essential version control technology for developers, data scientists, and product managers to understand.

GitHub is the most popular hosting service for remote Git repositories. It hosts over 100 million repositories and is where a lot of open source software is built.

If you’re a developer, you’re expected to have a GitHub account that shows you can do things with code. Although GitLab and Bitbucket are popular hosting options, GitHub (recently acquired by Microsoft) is the clear market leader.

If you have a GitHub account and have installed and configured Git on your computer, skip ahead.

If you don’t have a GitHub account, go here and sign up for a free one. See how to install and configure Git here. You want the command line tool. Follow the links to download it, install it, set your username, and set your commit email address.

Once that’s out of the way, you’re ready to git started with our basic example. So punny! 😆

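Scenario 1: Alone with your own original project, pushing to the master branch

1. Create a new repository on GitHub. Enter a repository name. Then enter a description; this description becomes the README at initialization if you toggle Initialize this repository with a README. You can change the README content later.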

Here, I’m making a repo for my article on Python OS commands.

Keep the Public radio button highlighted if you want other folks to be able to find your work. Private repos used to cost money, but users now have unlimited free private repos.

Add a .gitignore from the dropdown list. I chose Python for my os-examples project because I was working on a Python project. The .gitignore file will match common folders and file types you’ll want to keep out of your Git repo. You can alter your .gitignore later to exclude other unnecessary or sensitive files.

I suggest you choose a license from the Add a license dropdown. The license defines what users of your repository content can do. Some licenses are more permissive than others. Default copyright laws apply if no license is chosen. Learn more about licenses here.

Now you’ve got a repo on GitHub, sweet! Time to copy it to your local machine.

2. Grab the URL from GitHub for cloning.

3. Start a terminal session and navigate to your local directory where you want your repo to be located. Consider this location for a moment — you might want to make a new directory to keep things tidy.

Then run:

git clone https://github.com/your_username/your_repo

ls to see your new directory

cd into your directory

ls to see your new directories and files

Now it’s time to jump into your Push to Master Workflow.

1. When you clone a repo, you start on your local repo’s master branch by default. It’s good practice to create and work in a temporary branch.

Make a new branch and switch to it with git checkout -b my_feature_branch.

You can name your branch whatever you like. However, as with most names in programming, it’s a good idea to be both brief and descriptive. 😄

2. Write some code in your code editor and save it. I’m a fan of Atom. It’s free, open source, popular, and works well.

3. Check the status of your local file changes with git status. You should see a list of new files in red. These are untracked files. Git sees these files in your working directory, but they have not been added to Git’s index (staged) yet. These files do not match any pattern in your .gitignore, so they won’t be ignored by Git.

For example, here’s my untracked file.
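To see how untracked and ignored files differ, here is a minimal sketch you can run in a throwaway directory. The file names (os_examples.py, run.log), the ignore patterns, and the commit identity are placeholders chosen for illustration, not taken from the repo above.

```shell
set -e
# Placeholder identity so the commit works on a machine without Git configured.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)          # throwaway directory
cd "$repo"
git init --quiet

# A tiny .gitignore, committed so it is already tracked.
printf '__pycache__/\n*.log\n' > .gitignore
git add .gitignore
git commit --quiet -m "add .gitignore"

echo 'import os' > os_examples.py    # new file, matches no ignore pattern
echo 'debug output' > run.log        # matches *.log, so Git ignores it

git status --short
# ?? os_examples.py
```

The ?? prefix is the short-format marker for an untracked file. run.log does not appear at all because it matches a pattern in .gitignore.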

4. To stage all changes, enter git add -A.

You might see that a file that you don’t want to add to your GitHub repo is listed. In that case, you can add the files you do want staged individually with git add my_filename.

Also, you should alter your .gitignore file to exclude file types and folders you don’t want to add to your repo. Save your .gitignore file and it should show up in red as a file with unstaged changes.

Run git status again to make sure that the files you want to commit are staged. File changes must be staged to be committed.

To use a baseball or softball analogy, I think of staged changes like a batter in the on deck circle. They are preparing for what’s up next: the real thing.

Batter in on deck circle. Credit: Tdorante10 [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)]

Files to be committed will show in green.
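As a rough illustration of staging files one at a time versus all at once, the sketch below uses two made-up files in a throwaway repo. The file names are assumptions for the example only.

```shell
set -e
repo=$(mktemp -d)          # throwaway directory
cd "$repo"
git init --quiet

echo 'import os' > os_examples.py
echo 'scratch notes' > notes.txt

# Stage only the file you want in the next commit.
git add os_examples.py
git status --short
# A  os_examples.py
# ?? notes.txt

# Stage everything that is left, including new files.
git add -A
git diff --cached --name-only    # lists the files that are now staged
```

In the short status format, A in the first column means the file is staged as a new file, while ?? means it is still untracked.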

5. Commit changes with git commit -m "my commit message". Use the present tense, imperative mood, e.g. “update links to references”.

6. Switch to the local master branch with git checkout master.

You can see all of your branches with git branch -v.

7. Merge your local feature branch into your local master branch with git merge my_feature_branch.

8. Push the updated master branch to GitHub with git push origin master.

9. Delete your old, local, non-master branch with git branch -d my_feature_branch.

REPEAT 🔁

That’s it if you’re working alone. Now your work is backed up, changes can be rolled back, and other people can use and see your code. 😄
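The nine steps above can be sketched end to end without touching GitHub. In this sketch a local bare repository stands in for the GitHub remote, and the file names, commit messages, and identity are placeholders. With a real GitHub repo you would clone it as shown earlier and skip the setup part.

```shell
set -e
# Placeholder identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Setup: a bare repo plays the role of GitHub, "local" is your working copy.
base=$(mktemp -d)
git init --quiet --bare "$base/remote.git"
git init --quiet "$base/local"
cd "$base/local"
git symbolic-ref HEAD refs/heads/master      # make sure the branch is named master
git remote add origin "$base/remote.git"
echo '# os-examples' > README.md
git add README.md
git commit --quiet -m "initial commit"
git push --quiet origin master

# Steps 1-5: branch, write code, check, stage, commit.
git checkout --quiet -b my_feature_branch
echo 'import os' > os_examples.py
git status --short
git add -A
git commit --quiet -m "add os examples script"

# Steps 6-9: merge into master, push, delete the feature branch.
git checkout --quiet master
git merge --quiet my_feature_branch
git push --quiet origin master
git branch -d my_feature_branch
```

Because master had no new commits while you worked, the merge here is a fast-forward and creates no merge commit.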

Next we’ll look at a slightly more involved workflow with pull requests.

Scenario 2: Alone with your own original project, with pull requests

I’ll demonstrate this workflow by pushing a change to my GitHub repo used to make an example for an article about how to use argparse.

Everything is the same as Scenario 1’s setup and workflow until you get to Step 6. Don’t switch to your local master branch; instead, stay on your local feature branch. Here’s the new Step 6:

6. Push your feature branch to your remote repository with git push origin my_feature_branch.

7. Navigate to your GitHub repo in your browser and you should see an alert that you can Compare & pull request.

Click the green button and you’ll be able to create the pull request.

8. Click Create pull request. I don’t have continuous integration set up here, but I’ll dig into that topic in a future article about creating an open source project. Follow me to make sure you don’t miss it. 😄

9. Confirm you want to merge the pull request.

10. You’ll be given the option to delete the remote branch. Delete it.

11. Go back to your terminal and switch back to your local master branch with git checkout master.

12. Download the master branch from your remote repository with git pull origin master. Alternatively, you could just merge your local changes from your feature branch. However, it’s good to develop the habit of pulling down changes from your remote repository in case changes from your collaborators have been merged into master. Additionally, you might have made a quick fix to a markdown document through the browser and forgotten about it. So better safe than sorry. 😄

13. Delete the local feature branch you don’t need anymore with git branch -d my_feature_branch.

Start the cycle again by choosing your next bug or feature to work on and creating and checking out a new branch. 🔁

Note that you could create and manage the pull request process from the command line, but I find it easier to manage the process from the browser.
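The terminal side of this scenario can be sketched as follows. A local bare repository stands in for GitHub, and the two update-ref commands are only a stand-in for clicking Merge and Delete branch in the browser; they are not how GitHub works internally. File names, messages, and the identity are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Setup: a bare repo plays the role of GitHub.
base=$(mktemp -d)
git init --quiet --bare "$base/remote.git"
git init --quiet "$base/local"
cd "$base/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$base/remote.git"
echo '# argparse-example' > README.md
git add -A
git commit --quiet -m "initial commit"
git push --quiet origin master

# Steps 1-5: same as Scenario 1.
git checkout --quiet -b my_feature_branch
echo 'import argparse' > cli.py
git add -A
git commit --quiet -m "add argparse skeleton"

# Step 6: push the feature branch instead of merging locally.
git push --quiet origin my_feature_branch
git ls-remote --heads origin      # the remote now lists both branches

# Steps 7-10 happen in the browser. Simulated here on the stand-in remote:
git --git-dir="$base/remote.git" update-ref refs/heads/master \
    "$(git rev-parse my_feature_branch)"
git --git-dir="$base/remote.git" update-ref -d refs/heads/my_feature_branch

# Steps 11-13: sync local master, then clean up.
git checkout --quiet master
git pull --quiet origin master
git branch -d my_feature_branch
```

After the pull, local master contains the feature commit, which is why git branch -d agrees to delete the branch.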

Scenario 3: Alone with a cloned project

Here’s the scenario: you want to play with code someone else wrote and put on GitHub. You might find yourself doing this for a tutorial. Let’s assume you aren’t sending your work to GitHub.

For example, you could download the repository with the example code from Müller and Guido’s Introduction to Machine Learning with Python (a book I highly recommend for data scientists).

Grab the URL from GitHub and make a local copy with git clone https://github.com/amueller/introduction_to_ml_with_python.git.

Fire up Jupyter Lab in your cloned repo and start playing with the examples. That’s it if you’re just playing locally. Short and sweet!

Scenario 4: Collaboratively contributing to a project

You want to contribute to an existing open source project. Bully for you! 👏

Let’s use firstcontributions/first-contributions. This GitHub repo is designed to help you make a first contribution to an open source project. It’s pretty basic: you make a pull request to add your name to a list.

In this scenario, you start by forking the repo.

  1. Click Fork in the top right of the GitHub repo.

GitHub will fork the repo for you and take you to your fork.

2. Now clone your repo locally like before, substituting your GitHub username.

git clone https://github.com/your_username/first-contributions.git

cd into your project folder.

3. Set up your local repository to track changes in the original repository.

git remote -v shows you your remote repositories on GitHub. You should see something like this:

origin https://github.com/discdiver/first-contributions.git (fetch)
origin https://github.com/discdiver/first-contributions.git (push)

To track the original firstcontributions repository we need to add it as a remote upstream repository. The command to use is git remote add upstream https://github.com/firstcontributions/first-contributions.git

git remote -v should now show you something like this:

origin https://github.com/discdiver/first-contributions.git (fetch)
origin https://github.com/discdiver/first-contributions.git (push)
upstream https://github.com/firstcontributions/first-contributions.git (fetch)
upstream https://github.com/firstcontributions/first-contributions.git (push)

You’re set!

4. Now make a new local branch with git checkout -b add_my_name.

Whenever you need it, git branch will show you your local branches and which branch you are on. 😄

5. Open the Contributors.md file in your code editor. Add your name to a line other than the first line or last line. Save the file.

6. Check that everything looks good with git status.

7. Stage changes with git add -A.

8. Commit your changes with git commit -m "my commit message".

9. Switch to the master branch with git checkout master.

10. Merge your changes into your local master branch from your local feature branch with git merge add_my_name.

11. Grab the latest changes from the remote firstcontributions repo to make sure that your changes won’t interfere with other changes that have been made since you cloned the repo. Use git fetch upstream to get the changes into your local Git repository.

12. If there are changes in the original repo, you can merge the changes into your local master branch with git merge upstream/master.
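Steps 11 and 12 can be tried out locally. In the sketch below, two local bare repositories stand in for the original project (upstream) and your fork (origin), and a seed repo plays the other contributor. The names in Contributors.md, the extra file, and the identity are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

base=$(mktemp -d)

# Seed repo: used to create the "original project" and, later, someone else's commit.
git init --quiet "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/master
echo 'Fred' > Contributors.md
git add -A
git commit --quiet -m "start contributors list"

git clone --quiet --bare "$base/seed" "$base/upstream.git"       # original project
git clone --quiet --bare "$base/upstream.git" "$base/fork.git"   # your fork
git clone --quiet "$base/fork.git" "$base/local"                 # your local clone

# Someone else's change lands in the original project after you forked.
echo 'Be kind.' > CODE_OF_CONDUCT.md
git add -A
git commit --quiet -m "add code of conduct"
git push --quiet "$base/upstream.git" master

# Your work: add the upstream remote, commit on a branch, merge into master.
cd "$base/local"
git remote add upstream "$base/upstream.git"
git checkout --quiet -b add_my_name
echo 'Wilma' >> Contributors.md
git commit --quiet -am "add Wilma to contributors"
git checkout --quiet master
git merge --quiet add_my_name

# Steps 11-12: fetch the original project's changes and merge them.
git fetch --quiet upstream
git merge --quiet --no-edit upstream/master
git log --oneline --graph
```

Because both sides added commits, this merge creates a merge commit; --no-edit accepts the default merge message instead of opening an editor.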

Instead of merging changes, you could integrate the changes and create a linear commit history with git rebase. Rebasing rewrites commit history, which can introduce problems, so it’s not advised unless you’re sure what you’re doing. Here’s a discussion of the topic, and here is a strong anti-rebase argument.

13. When integrating your commits, your changes will sometimes conflict with other people’s changes that have already been merged to the remote repository. For example, you might have changed a line of code that another person also changed.

Atom makes it easy to resolve these merge conflicts, by showing you which sections of a file were changed by you or them.

Wilma wants to add her name

Select whose changes you want to use for each conflict. When all changes are resolved you can commit your changes.
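If you prefer to see a conflict outside an editor, here is a small sketch that produces one on purpose and resolves it by hand. Two branches insert a different name at the same spot in Contributors.md. The branch name add_wilma, the other names, and the identity are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine without Git configured.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master
printf 'Fred\nBarney\n' > Contributors.md
git add -A
git commit --quiet -m "start list"

# Your branch adds Wilma between the two existing names.
git checkout --quiet -b add_wilma
printf 'Fred\nWilma\nBarney\n' > Contributors.md
git commit --quiet -am "add Wilma"

# Meanwhile, master gets Betty in the same place.
git checkout --quiet master
printf 'Fred\nBetty\nBarney\n' > Contributors.md
git commit --quiet -am "add Betty"

# The merge stops with a conflict; "|| true" keeps set -e from aborting the script.
git merge add_wilma || true
git status --short        # UU Contributors.md
cat Contributors.md       # shows the <<<<<<<, =======, >>>>>>> conflict markers

# Resolve by keeping both names, then stage and commit to finish the merge.
printf 'Fred\nBetty\nWilma\nBarney\n' > Contributors.md
git add Contributors.md
git commit --quiet -m "merge add_wilma, keeping both names"
```

UU in the short status means both sides modified the file. Editors like Atom show the same marked sections and let you pick a side with a click.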

On more advanced projects you may need to run tests and/or build documents to make sure your changes didn’t break things. For this basic example, neither step is necessary.

14. Push your local changes to your fork of the original repository with git push origin master.

15. Go to your repo in your browser and create a pull request.

On more complicated projects there may be continuous integration (CI) tests that run automatically at this point. If so, you’ll be able to see whether your changes make the repo pass or fail its CI tests. If it fails, you’ll probably have to make changes. 😦

You can push additional commits to the same branch and the CI tests will rerun automatically.

For this little example, you don’t have to worry about CI tests. Just sit back and let the project maintainers review your PR and merge it into the master branch. Congratulations on making a PR to an open source project! 👍

REPEAT 🔁

Let’s recap our workflows.

Scenario 1: Alone with your own original project, pushing to the master branch

Create a repo and clone it locally first. Then:

  1. git checkout -b my_feature_branch
  2. Write and save code.
  3. git status
  4. git add -A
  5. git commit -m "my commit message"
  6. git checkout master
  7. git merge my_feature_branch
  8. git push origin master
  9. git branch -d my_feature_branch

Scenario 2: Alone with your own original project, with pull requests

  1. git checkout -b my_feature_branch
  2. Write code.
  3. git status
  4. git add -A
  5. git commit -m "my commit message"
  6. git push origin my_feature_branch
  7. Open PR in browser.
  8. Confirm PR.
  9. Merge PR.
  10. Delete remote feature branch.
  11. git checkout master
  12. git pull origin master
  13. git branch -d my_feature_branch

Scenario 3: Alone with a cloned project

  1. git clone repo_to_be_cloned

Scenario 4: Collaboratively contributing to a project

  1. Fork original repo in browser.
  2. git clone url_of_your_fork
  3. git remote add upstream original_repo
  4. git checkout -b my_feature_branch
  5. Write code.
  6. git status
  7. git add -A
  8. git commit -m "my commit message"
  9. git checkout master
  10. git merge my_feature_branch
  11. git fetch upstream
  12. git merge upstream/master
  13. Resolve any merge conflicts. If there were conflicts, commit your changes.
  14. git push origin master
  15. From your browser, make a pull request to merge your branch into the remote upstream master branch.

Knowing how to use Git and GitHub is powerful, essential, and occasionally confusing. Hopefully, this guide has provided you with some actionable workflows you can refer to when needed. If it did, please share it on your favorite social media channels so that others can find it too! 👏

In a future article we’ll look at other important Git commands to know. Follow me to make sure you don’t miss it. 😄

I write articles about Data Science, Python, Docker, and other tech topics. If you’re interested in any of those things, check them out here.

Happy Git-ing!