The Inner Workings of Git: A Step-by-Step Guide.

Git is a distributed version control system designed to track changes in source code during software development. It allows multiple developers to work on a project simultaneously while maintaining a complete history of all changes.

Version Control System:

A version control system (VCS) is software that helps developers work together and maintain a complete history of their work.

Types Of VCS

  1. Centralized Version Control System.
  2. Distributed/Decentralized Version Control System.

This guide concentrates on distributed version control systems, and on Git in particular.

Distributed/Decentralized version control system.

DVCS clients do not just check out the latest snapshot of the directory; they fully mirror the repository, including its history. If the server goes down, the repository from any client can be copied back to the server to restore it, so every clone is effectively a full backup of the repository. Git does not rely on a central server, which is why you can commit changes, create branches, view logs, and perform most other operations while offline. You need a network connection only to publish your changes and to fetch the latest changes from others.
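
As a rough illustration of the offline operations described above, the following sketch commits, branches, and reads the log in a repository that has no remote configured at all. The directory, file name, and identity values are placeholders chosen for this example.

```shell
# Work in a throwaway directory; no remote is ever configured here.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity so the commit succeeds on a fresh machine.
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"     # commit: purely local
git branch experiment            # create a branch: purely local
git log --oneline                # view history: purely local

# Record what happened so it can be inspected afterwards.
remote_count="$(git remote | wc -l)"          # number of configured remotes
commit_count="$(git rev-list --count HEAD)"   # number of commits on this branch
```

None of these commands contacts a server; the remote count stays at zero throughout.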

Git Architecture:

Git uses a three-stage architecture (working directory, staging area, and local repository) to track changes.

Key concepts like committing, branching, merging, and remotes enable powerful version control workflows.

Git maintains an extensive history and provides commands like git log and git diff to analyze changes over time.
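
To make git log and git diff concrete, here is a minimal sketch in a scratch repository. The file name app.txt and the identity values are placeholders for this example only.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "version 1" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

echo "version 2" > app.txt     # edit the file, but do not stage it yet
git diff                       # working directory compared with the staging area

git add app.txt
git diff --staged              # staging area compared with the last commit

git commit -q -m "Update app.txt"
git log --oneline              # one line per commit, newest first
git diff HEAD~1 HEAD           # what the latest commit changed

# Names of the files touched by the latest commit.
changed="$(git diff --name-only HEAD~1 HEAD)"
```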

Advantages Of Git:

Free and open source:

Git is released under the GNU General Public License version 2 (GPLv2), an open source license. It is freely available over the internet, and you can use Git to manage proprietary projects without paying a single penny. Because it is open source, you can download its source code and modify it to suit your requirements.

Implicit backup:
The chances of losing data are very rare when there are multiple copies of it. Data present on any client side mirrors the repository, hence it can be used in the event of a crash or disk corruption.

Fast and small:
As most operations are performed locally, Git gives a huge benefit in terms of speed. Git does not rely on a central server, so there is no need to interact with a remote server for every operation. The core part of Git is written in C, which avoids the runtime overheads associated with higher-level languages. Although Git mirrors the entire repository, the size of the data on the client side is small, which illustrates how efficiently Git compresses and stores data.

No need of powerful hardware:
With a CVCS, the central server must be powerful enough to serve requests from the entire team. For smaller teams this is not an issue, but as the team grows, the server's hardware limitations can become a performance bottleneck. With a DVCS, developers do not interact with the server unless they need to push or pull changes. All the heavy lifting happens on the client side, so the server hardware can be very simple.

Security:
Git uses a common cryptographic hash function, Secure Hash Algorithm 1 (SHA-1), to name and identify objects within its database. Every file and commit is checksummed and retrieved by its checksum at the time of checkout. This means that no file content, date, commit message, or other data in the Git database can be changed without Git knowing about it, because any change produces a different checksum.
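
The checksum naming can be observed directly. The sketch below asks Git for the object name of a small piece of content and then recomputes it by hand, assuming the default SHA-1 object format and that the sha1sum utility (GNU coreutils) is available. Git hashes a short header ("blob", the size in bytes, and a NUL byte) followed by the content.

```shell
# Run outside any repository so the default SHA-1 object format applies.
cd "$(mktemp -d)"

# Ask Git for the object name of the content "hello" plus a newline.
git_hash="$(echo "hello" | git hash-object --stdin)"

# Recompute it by hand: header "blob 6" + NUL byte, then the 6 bytes of content.
manual_hash="$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)"

echo "git hash-object: $git_hash"
echo "manual sha1sum:  $manual_hash"

# Any change to the content, however small, yields a different name.
other_hash="$(echo "hello!" | git hash-object --stdin)"
echo "changed content: $other_hash"
```

The first two values should match, and the third should differ, which is what lets Git notice altered content.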

Git Components:

Git has several key components that work together to manage version control effectively. Here’s an overview of the main components:

Working Directory: This is the actual directory where project files are located. New files are ‘untracked’ until they are first added, and edits to tracked files stay unstaged until they are explicitly staged for commit.

Staging Area: The staging area (also called the index) acts as an intermediate step between the working directory and the repository stored in the .git directory. Files in the staging area are ‘staged’ to be included in the next commit. This allows for a selective and controlled approach to committing changes.

Repository: A Git repository (repo) is where all your project files and their entire version history are stored. Repositories can be local (on your machine) or remote (hosted on services like GitHub).

Remote Repositories: Remote repositories serve as centralized hubs where team members can push and pull changes, ensuring a synchronized and collaborative development process.
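
One way to watch a file move through these components is to check git status --short after each step. This is a sketch in a scratch repository; readme.txt and the identity values are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "hello" > readme.txt
untracked="$(git status --short)"   # "?? readme.txt": exists only in the working directory

git add readme.txt
staged="$(git status --short)"      # "A  readme.txt": added to the staging area

git commit -q -m "Add readme"
clean="$(git status --short)"       # empty: the file is recorded in the repository

echo "edit" >> readme.txt
modified="$(git status --short)"    # " M readme.txt": tracked, changed, not yet staged

printf '%s\n' "$untracked" "$staged" "$modified"
```

In the two-character status code, the first column describes the staging area and the second describes the working directory.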

Key Git Concepts:

Git facilitates powerful version control and collaboration by employing a model based on commits, branches, merging, and remote repositories. This section explains how these key concepts work together to enable version control.

Committing Changes: A commit holds a snapshot of the repository at a point in time and is named by its SHA-1 hash. You can think of a commit object as a node in a linked list: every commit object has a pointer to its parent commit, so from a given commit you can traverse back through the parent pointers to view its history. (A merge commit has more than one parent.) A commit is created in two steps:

  • Adding changes – The git add command stages edits from the working directory to be included in the next commit. This adds files to the staging area.
  • Committing locally – The git commit command snapshots changes from the staging area and adds the commit to the local repository timeline creating a new revision. Commits always include metadata like a timestamp and author.

By repeating this edit, stage, and commit cycle, developers build up linear project history over time.
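
The edit, stage, and commit cycle and the parent pointer can be seen in a small sketch like the following. The file name and identity values are placeholders for this example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# First pass through the cycle: edit, stage, commit.
echo "one" > file.txt
git add file.txt
git commit -q -m "First"
first="$(git rev-parse HEAD)"       # full hash of the first commit

# Second pass: the new commit records the first one as its parent.
echo "two" >> file.txt
git add file.txt
git commit -q -m "Second"

git cat-file -p HEAD                # raw commit object: tree, parent, author, committer, message
parent="$(git rev-parse HEAD~1)"    # follow the parent pointer one step back
git log --format='%h <- parent: %p' # abbreviated hash of each commit and of its parent
```

The first commit has no parent, so its parent field in the last command is empty.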

Branching: Branches are used to create another line of development. A new repository starts with a single default branch, traditionally named master (many projects and hosting services now use main), which plays the same role as trunk in Subversion. Usually, a branch is created to work on a new feature. Once the feature is completed, it is merged back into the default branch and the feature branch is deleted. Each branch is a pointer to the latest commit on that line of development, and HEAD refers to the branch that is currently checked out. Whenever you make a commit, the current branch, and with it HEAD, moves to the new commit.

We can create a new branch to add a feature without impacting the main codebase:

git branch new-feature
git switch new-feature

  • Creating branches – The git branch command generates new branch pointers, creating independent streams of development.
  • Switching branches – Developers toggle between branches using the git switch command to work on features in isolation.
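
A fuller version of the two commands above might look like this sketch. It assumes Git 2.28 or newer (for git init -b) and uses placeholder file names and identity values; the point is that commits on new-feature leave main untouched.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                        # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git branch new-feature                     # create a new branch pointer
git switch -q new-feature                  # move HEAD to that branch
current="$(git symbolic-ref --short HEAD)" # name of the branch HEAD refers to

echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# The commit landed on new-feature only; main still has a single commit.
main_count="$(git rev-list --count main)"
feature_count="$(git rev-list --count new-feature)"
echo "on $current: main=$main_count new-feature=$feature_count"
```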

Merging: Once you’ve made changes in a branch and are ready to integrate them back into another branch (like main), you use git merge. This combines the histories of the branches and can sometimes result in conflicts if changes overlap.

For example, to merge new-feature into main:

git checkout main
git merge new-feature
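
The merge can be tried end to end in a scratch repository, as in the sketch below. It assumes Git 2.28 or newer and uses placeholder names. The --no-ff flag is added here only to force a merge commit; in this simple situation a plain git merge would fast-forward main instead.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q -b main                        # -b needs Git 2.28 or newer
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git switch -q -c new-feature               # create the branch and switch to it
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git switch -q main
git merge -q --no-ff -m "Merge new-feature" new-feature

# A merge commit records both branch tips as its parents.
parents="$(git log -1 --format=%P | wc -w)"
git log --oneline --graph
```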

Remotes: Remotes refer to shared repositories stored on remote servers. Teams collaborate across a network by pulling and pushing changes:

  • Pull: When you want to update your local repository with changes from a remote repository, you use git pull. This fetches changes and merges them into your current branch.
  • Push: To send your local commits to a remote repository, you use git push. This updates the remote branch with your local commits.
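
Push and pull can be tried without a real server by using a local bare repository as a stand-in remote. In this sketch, hub.git, alice, and bob are hypothetical names, the identity values are placeholders, and Git 2.28 or newer is assumed.

```shell
work="$(mktemp -d)"
cd "$work"
git init -q --bare -b main hub.git         # bare repository standing in for a server

# Alice creates the project and publishes it.
git init -q -b main alice
cd alice
git config user.name "Alice Example"       # placeholder identity
git config user.email "alice@example.com"
git remote add origin "$work/hub.git"
echo "shared" > shared.txt
git add shared.txt
git commit -q -m "Add shared file"
git push -q -u origin main                 # send local commits to the remote
cd "$work"

# Bob clones the remote and receives the full history.
git clone -q hub.git bob

# Alice publishes another commit.
cd alice
echo "more" >> shared.txt
git commit -q -am "Extend shared file"
git push -q origin main
cd "$work"

# Bob pulls: fetch the new commit and merge it (a fast-forward here).
cd bob
git pull -q --ff-only origin main
bob_head="$(git rev-parse HEAD)"
alice_head="$(git -C "$work/alice" rev-parse HEAD)"
```

After the pull, both working copies point at the same commit.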

Common Commands in Git:

git init – Initializes a new Git repository in the current directory.

git clone – Creates a copy of an existing remote repository on your local machine.

git status – Displays the state of the working directory and staging area, showing which changes are staged, unstaged, or untracked.

git add – Marks files in the working directory that have been newly created or altered to be included in the next commit. This adds them to the staging area.

git push/pull – Synchronizes changes between a local repository and a remote repository. git push transfers committed changes to the remote repository, making them accessible to others collaborating on the same project. git pull retrieves the latest commits from the remote and merges them into the current branch, updating the local repository with the changes made by others.

git commit – Records the changes in the staging area as a new commit in the repository's history.

git log – Shows the commit history for the current branch, displaying commit hashes, messages, and authors.

Conclusion:

Git is powerful for version control, enabling collaboration, backup, and a detailed history of changes. By understanding its core concepts, you can manage projects effectively and handle code changes efficiently.

shamitha