Git Terminology
When you start working with Git, you will probably come across many terms for the first time. Git terminology reuses a lot of common words but gives them different meanings, which can be confusing.
This article is focused on providing brief definitions of the most commonly used Git terms.
Version Control System (VCS)
Software that is used to manage a project and its different versions by keeping track of the changes that we make to it. Git, SVN, and Mercurial are some of the well-known version control systems.
Git Repository
The repository is the place where Git permanently stores the commits that we made. Along with commits, it also stores additional information that is needed for version control.
Local Repository
A Git repository present on our local system is known as a local repository. A local repository has a working directory associated with it where we can check out different files and modify them.
Remote Repository
A Git repository hosted on a server provided by a Git hosting service like GitHub or Bitbucket is called a remote repository. In most cases, the remote repository is a bare repository, which means that it has no working directory associated with it and contains only the version-control data that a local repository keeps in its .git folder. This is the repository that we push changes to and pull changes from when working in our local repository.
Remote
A remote is a short name (such as origin) that refers to a remote repository. The term is sometimes also used for the connection established between our local repository and the remote repository.
Working Directory
It is the place where we can make changes to our project. We can create, update and delete files in the working directory.
Staging Area/Index
It is the place where we add the changes that we want to include in our next commit. It is used to segregate and organize changes, so that a commit contains only the changes we choose. We can also review the staged changes and compare them with the previous commit before committing, which helps catch errors.
Commit
A commit is a snapshot of the version of our project at a particular instant in time. It is permanently stored in the Git repository and we can always roll back to this point. A commit also stores additional information like the name and email of the author who made the commit, the date and time when the commit was made, etc.
Commit Hash/SHA
Each commit has a unique identifier, which is called that commit's hash or SHA. By default, it is a 40-character hexadecimal string generated using the cryptographic hash function SHA-1, which stands for Secure Hash Algorithm 1.
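As a rough illustration of how such an identifier is derived, the sketch below hashes an object the way Git does: a short header with the object type and size, a NUL byte, then the raw content. The function name git_object_hash is hypothetical and not part of Git. The example uses an empty blob (file content) because its ID is the same everywhere; a commit is hashed the same way with the type "commit" and the commit's text as content.

```python
import hashlib


def git_object_hash(obj_type: str, data: bytes) -> str:
    """Return the SHA-1 hex digest Git assigns to an object.

    Git hashes the header "<type> <size>", a NUL byte, and then the
    raw object content.
    """
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # The empty blob has the same well-known ID in every SHA-1 repository.
    print(git_object_hash("blob", b""))
    # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the hash covers the full content, changing anything in a commit (message, author, parent, or files) produces a different identifier.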
Staged Files
Files currently present in the staging area are called staged files. The changes made to these files will be included in the next commit.
Unstaged Files
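Tracked files (files that were added to the staging area at some earlier point and were part of a previous commit) that have been modified in the working directory, but whose latest changes have not yet been added to the staging area, are called unstaged files.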
Untracked Files
Newly created files that have never been added to the staging area are called untracked files. They are called so because any changes made to these files are not tracked by Git.
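The three file states can be read from the output of git status --porcelain, where each line starts with two status characters (the first for the staging area, the second for the working directory) followed by the path. The sketch below sorts such lines into the three groups. The function name classify_status and the sample file names are hypothetical, and renames and other special cases are deliberately ignored.

```python
from typing import Dict, List


def classify_status(porcelain_output: str) -> Dict[str, List[str]]:
    """Group paths from `git status --porcelain` (v1 format) by state."""
    groups: Dict[str, List[str]] = {"staged": [], "unstaged": [], "untracked": []}
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # not a valid "XY path" line
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if index_state == "?" and worktree_state == "?":
            groups["untracked"].append(path)
            continue
        if index_state == "!":
            continue  # ignored file
        if index_state != " ":
            groups["staged"].append(path)
        if worktree_state != " ":
            groups["unstaged"].append(path)
    return groups


# Hypothetical sample output; a real run would capture it from Git.
sample = "\n".join([
    "M  README.md",   # change staged
    " M app.py",      # change not staged
    "MM utils.py",    # some changes staged, newer ones not staged
    "?? notes.txt",   # untracked
])
print(classify_status(sample))
```

Note that a single file can appear as both staged and unstaged when it was modified again after being added to the staging area.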
Diff
Diff is short for difference. A diff shows the changes between two versions of the same file. The git diff command is used to show these diffs.
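To give a feel for what a diff looks like, the sketch below uses Python's standard difflib module to produce a unified diff, the same general format that Git prints. This is not Git's own diff implementation, and the file name greeting.txt and its contents are made up for illustration.

```python
import difflib

# Two hypothetical versions of the same file, as lists of lines.
old_version = ["Hello\n", "World\n"]
new_version = ["Hello\n", "Git\n"]

diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="a/greeting.txt",
        tofile="b/greeting.txt",
    )
)

# Lines starting with "-" were removed, lines starting with "+" were added,
# and lines starting with a space are unchanged context.
print("".join(diff_lines), end="")
```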
Patch
A patch is a file that contains the diffs between files and may include additional metadata like the commit hash, the author's name and email, the date and time, etc. Patches are mostly used to suggest changes to the authors of repositories that we don't have write access to.
Stash
Stashing is the process of temporarily storing the uncommitted changes of our project in a safe location. By default, both staged and unstaged changes to tracked files are stashed, while untracked files are left out unless we explicitly ask for them (git stash --include-untracked). Stashing is done to set aside changes that we don't want to commit yet so that we can move our focus to some other task. We can then un-stash these changes and continue from where we left off.
Branch
A branch is a simple pointer to a commit. Branches are used to provide an independent workspace to the developers where they can develop features and experiment with new things without worrying about corrupting the rest of the project. These branches can then be merged into each other.
Master
Master is the default name given to the first branch which is created by Git when we initialize a Git repository. The master branch will usually have the up-to-date, production-ready code. This is the branch where changes from other branches will be merged.
Main
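Main is an alternative name for the default branch, used in the same way as master to indicate the branch that holds the final working project and the production-ready code. Hosting services such as GitHub now use main as the default branch name for new repositories.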
HEAD
HEAD is a pointer to what is currently checked out, which is usually the tip (the most recent commit) of the current working branch. Normally, HEAD points to a branch, which in turn points to a commit. If HEAD points directly to a commit instead, which happens when we check out a specific commit, a tag, or a remote-tracking branch, it is called a detached HEAD.
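Git stores HEAD as a small text file, .git/HEAD. When HEAD is attached, the file contains a line such as "ref: refs/heads/master"; when it is detached, the file contains a commit hash. The sketch below tells the two cases apart. The function name describe_head is hypothetical, and the hash in the example is only a placeholder value.

```python
import re


def describe_head(head_contents: str) -> str:
    """Report whether the given .git/HEAD contents are attached or detached."""
    text = head_contents.strip()
    if text.startswith("ref: "):
        # Attached: HEAD names a branch, and the branch points to a commit.
        return "attached to " + text[len("ref: "):]
    # Detached: HEAD holds a commit hash directly
    # (40 hex digits for SHA-1, 64 for SHA-256 repositories).
    if re.fullmatch(r"[0-9a-f]{40}|[0-9a-f]{64}", text):
        return "detached at " + text[:7]
    raise ValueError("unrecognized HEAD contents: " + repr(head_contents))


print(describe_head("ref: refs/heads/master\n"))
print(describe_head("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391\n"))
```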
Tag
A tag is a way of marking a specific point (commit) in the history of our project so that we can reference it in the future. Tags are mostly used to mark software release versions like v1.0.1, v2.0, etc. Tags are of two types: lightweight (unannotated) and annotated. A lightweight tag simply points to a commit. Annotated tags carry additional information (called metadata) like the tagger name, tagger email, and a tag message.
Checkout
Checkout is the process of navigating between different entities in Git. These entities can be branches, commits, tags, or even files. To check out an entity means to switch to that particular entity.
Merge
Merging is the process of combining different branches into a single one. It is a way of adding changes to a branch that were made on some other branch. There are two merging strategies in Git - Fast Forward Merge and Three-Way Merge.
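Which of the two kinds of merge applies depends on how the branch tips are related in the commit history. The sketch below models history as a mapping from each commit to its parents and reports the outcome. It is a simplified model rather than Git's implementation, and the names is_ancestor, merge_kind, and the single-letter commit IDs are all hypothetical.

```python
from typing import Dict, List


def is_ancestor(parents: Dict[str, List[str]], ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def merge_kind(parents: Dict[str, List[str]], current_tip: str, other_tip: str) -> str:
    """Classify merging the branch at `other_tip` into the branch at `current_tip`."""
    if is_ancestor(parents, other_tip, current_tip):
        return "already up to date"   # nothing new to merge
    if is_ancestor(parents, current_tip, other_tip):
        return "fast-forward"         # just move the branch pointer forward
    return "three-way"                # histories diverged; a merge commit is needed


# Hypothetical history: A <- B, then C and D both branch off B.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
print(merge_kind(history, "B", "C"))  # fast-forward
print(merge_kind(history, "D", "C"))  # three-way
```

In the fast-forward case no new commit is created, while a three-way merge combines the two tips with their common ancestor and records the result as a merge commit.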
Merge Conflict
When the same part of a file is modified on two different branches and we try to merge these branches, Git will stop the merge because it doesn't know which version of the file to keep and which one to discard. This situation is called a merge conflict, and the merge can only be completed after we resolve it.
Rebase
Rebasing is the process of changing the commit from which a branch originated (the base commit). It is similar to merging, as we are adding the newer commits of some other branch to the history of the rebased branch.
Cherry-Pick
Cherry-picking is the process of selecting specific commits from a branch and applying them to the tip of another branch. It is different from merging or rebasing, as we copy only the selected commits to the other branch instead of bringing over the entire branch.
Clone
Cloning is the process of making a copy of a repository. This copy will be available to us on our local system.
Fetch
Fetching is the process of retrieving changes from the remote repository or any other repository. The commits that are fetched are added to the remote-tracking branches and these branches can then be merged with our local ones.
Push
When we are done working on a feature, we can push that change to a remote repository to share with other developers on our team. Pushing is the process of updating the remote repository with the changes that we made in our local repository. It is advised to pull changes before pushing as Git will block a push if the changes of the remote repository are absent from our local repository.
Pull
Pulling is the process of fetching the changes made to the remote branches and merging them into our corresponding local branches. By default, the git pull command is a combination of the git fetch and git merge commands.
Fork
A fork is a copy of a remote repository that also lives on a server. The process of copying a remote repository to create an identical server-side repository is called forking. Forking is mostly done to contribute to open source projects. Forking is different from cloning: forking creates a remote copy of a repository on a Git hosting service, whereas cloning creates a local copy of a repository.
Pull Request
A pull request is a way of asking other team members to review and merge the changes that we have pushed to a shared repository or a fork. We can discuss these changes, and others can suggest modifications or follow-up commits. It is a feature provided by Git repository hosting services like GitHub, not by Git itself.
Origin
Origin is the default name given to the repository from where we cloned our local repository. It is the place from where our local repository originated.
origin/master
origin/master is a remote-tracking branch that tracks the changes made to the master branch of the origin repository, hence the name. It is updated whenever we fetch from or push to origin.
Upstream and Downstream
Any repository that we clone from, push to, or pull from is called an upstream. The repository that clones from, pulls from, or pushes to the upstream is called the downstream. Typically, the upstream is the central remote repository that all collaborators clone from, and the downstreams are the local repositories of these developers.
Hook
A hook is a script that Git runs automatically when a particular event occurs, such as before a commit is created or after a merge. Hooks are stored in the .git/hooks directory.
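A hook can be written in any language that the system can execute. As a minimal sketch, the script below could be saved as .git/hooks/commit-msg and made executable. Git calls the commit-msg hook with the path of the file holding the proposed commit message, and a non-zero exit status aborts the commit. The 72-character limit and the function name message_is_acceptable are hypothetical choices for illustration, not a Git rule.

```python
#!/usr/bin/env python3
import sys

MAX_SUBJECT_LENGTH = 72  # hypothetical team policy, not enforced by Git itself


def message_is_acceptable(message: str) -> bool:
    """Accept messages whose first line is non-empty and not too long."""
    lines = message.splitlines()
    subject = lines[0].strip() if lines else ""
    return 0 < len(subject) <= MAX_SUBJECT_LENGTH


def main(argv: list) -> int:
    # Git passes the path of the commit message file as the first argument.
    with open(argv[1], encoding="utf-8") as handle:
        message = handle.read()
    if message_is_acceptable(message):
        return 0
    print("commit-msg: subject line is empty or too long", file=sys.stderr)
    return 1  # a non-zero status makes Git abort the commit


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```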
Summary
In this tutorial, we learned about some of the most frequently used terms in Git. These terms may sound confusing to a beginner but are very important to understand the working of Git.