Let’s start this post by reviewing the history of software development. If you have pursued an education in computer science or a related field, you are probably familiar with the “Waterfall model” of the software development life cycle. It takes us back to the early days, when software was not too complex, the requirements were defined up front, and revisions were close to none. Everything was going smoothly until software suddenly became “immortal” and clients began asking for revisions to keep up with changing market trends 🤖
Then came the “Agile” way of doing things, which is still very popular in many companies. Agile does not require you to submit a multi-page requirements document up front; instead, the team stays ready for revisions and updates every week. How great is that for customers and clients alike? But there is a simple problem! When the codebase runs to millions of lines of code (LOC), even a simple change demands additional integration work and requires the tests to be analyzed and run again on the integrated code. This might happen every week, and with such scenarios repeating, humans are bound to make mistakes! Hence, we needed something that could automate the process and take this load off our shoulders.
This gave birth to the CI/CD pipeline mechanism 💥 It could automate the integration, testing, and building of code. With just a single manual push, you could also deploy the website and serve your users in no time. This is made possible by tools built just for this purpose: CI/CD tools. This way of working is known as DevOps, and involving Git in operations in a similar fashion is known as GitOps. This post will guide you through this fairly new term, GitOps, and how infrastructure as code is changing the overall operations process today. By covering the tools and processes used in GitOps, I hope to leave you with a good understanding of it by the end of this post.
Table of Contents – Guide to GitOps
- Infrastructure as Code (IaC)
- What is GitOps?
- Difference Between IaC and GitOps
- Why GitOps?
- How GitOps Works?
- Tools For GitOps
Infrastructure as Code (IaC) is a new flower in the bouquet of infrastructure management and operations in software development 💐 The concept is very simple, and I am quite certain you have already got the idea from the name.
Infrastructure as Code denotes the development and maintenance of the infrastructure with the help of code. Conventionally, we used to maintain the infrastructure by physically maintaining the servers and the data center. Infrastructure as Code lets us define the requirements in a file (normally a YAML or JSON file) and submit it to a provisioning tool. The tool then reads the file, resolves the requirements, and allocates the required components to the user.
For example, on AWS, if I would like to launch an EC2 instance for an ECS cluster, I can just declare it in a YAML file and push it through the command line. The EC2 instance will be allocated to me automatically without any hassle. Obviously, this requires appropriate permissions, but it is definitely possible.
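To make the idea of "infrastructure in a file" a little more concrete, here is a minimal sketch in Python that builds a CloudFormation-style template and renders it as JSON. The logical name `AppServer`, the instance type, and the `ImageId` value are placeholders I made up for illustration; a real template would need a valid image ID and would be submitted through the provider's own tooling.

```python
import json

# Hypothetical template: the logical name "AppServer" and the ImageId value
# are placeholders for illustration, not values from a real account.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "One EC2 instance, described as code",
    "Resources": {
        "AppServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": "t3.micro",
                "ImageId": "ami-PLACEHOLDER",
            },
        }
    },
}


def render(template: dict) -> str:
    """Serialize the template to the JSON text that would be committed."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    # This text is what would live in the repository and be reviewed like code.
    print(render(template))
```

The point of the sketch is only that the infrastructure request becomes a plain, reviewable text file; nothing here talks to AWS.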
The bottom line is: if we maintain or initialize the infrastructure via code files, we call it Infrastructure as Code, or IaC. The reason for discussing this concept will become clear later in the article, but remember: even though GitOps relates to IaC, IaC does not mean GitOps❗ You may see these terms used interchangeably on the internet, but there is a difference, which is covered in the third section below.
With the term “GitOps”, the first similar term that comes to mind is “DevOps”. DevOps combines development and operations tools and practices into one. In GitOps, we have moved operations from the cloud console into Git. Since Git is already used for software development, involving Git in operations not only keeps the practices of DevOps, but also leverages the power of a version control system as strong as Git 💪
The main difference between DevOps and GitOps is that GitOps uses IaC stored in Git for its operations instead of maintaining operations directly on the infrastructure. It is quite accurate to consider GitOps a recent evolution of DevOps. With the power of a version control system (including its push and pull features), GitOps allows developers to push the infrastructure code into the environment repository of the software. Once a change is detected, the GitOps tooling applies the required changes to the software environment and infrastructure and moves them further along the CI/CD pipeline.
While chatting with different developers and newcomers at meetings, I have noticed that people often treat IaC and GitOps as a single term. I am sure some readers have only a vague idea of the two or confuse one with the other. Hence, it is important to address this difference. This comparison table can help clarify things:
| Aspect | GitOps | IaC |
|---|---|---|
| Code Change Process | Via Merge Request | Review/Approval |
| Code Storage | Git | May or may not depend on Git |
| Infrastructure Updates | Automated | FTP/SSH/Manual/CLI etc. |
As you can see, IaC is a different process and stands on its own, while GitOps builds on both IaC and Git.
Although GitOps is fairly new, its use is important and promising because of a few notable aspects: ✍
- Involvement Of Git: As I mentioned in the sections above, involving a version control system as popular as Git means you do not need to learn an entirely new tool such as Amazon’s CodeDeploy. As a developer, there is a great chance you are already familiar with Git and the GitHub user interface and facilities. On top of that, you get all the benefits of a world-class version control system in your operations management.
- Transparency: Since Git is familiar to everyone in the team, anyone (among authorized users) can just open the platform and look at the changes very easily. You can also pull the changes and analyze them on the system. Such transparency is extremely useful for the new joiners who want to understand the overall operations and infrastructure.
- Developer Centric: DevOps is not developer-centric. DevOps requires two teams: one for development and one for operations management. What is happening in the infrastructure and cloud is the responsibility of the operations team. With GitOps, since Git is handling the infrastructure, developers just need to push the changes to the repository. This makes GitOps a developer-centric platform where the developers themselves can manage and control the infrastructure and environment.
- Revisiting Previous Environments is Easier: With GitOps, it is extremely easy to revisit the previous environment configurations of the software. All you need to do is open the previous version, which is just two steps in Git, and look at the code of that time to understand the difference between current changes in the environment and infrastructure.
- Rollbacks are Easier: Before GitOps, rollbacks were a headache, since it was very hard to maintain the previous version and keep the system compatible with it. With GitOps, you can roll back to the previous state with just a few clicks if something goes wrong in the new state. If you are familiar with Git, you already know this from software development.
- Collaborating is Easier: Git, like other version control systems, allows collaboration on software development. By leveraging the same capabilities, GitOps allows authorized users to collaborate smoothly from anywhere in the world.
- Environment Duplication: When we are dealing with the traditional methods of developing the infrastructure, duplicating the environment for any purpose requires heavy work and a lot of time. With GitOps, as we use IaC underneath, we can duplicate the environment in no time for other teams, regions, etc. since everything is saved as code inside the repository.
- Cost Efficient: GitOps is very cost-efficient, as infrastructure management is a major source of expense in a project. Converting everything to code and moving it into Git reduces the cost of infrastructure management considerably.
- Deploy Faster: GitOps allows faster deployments, since infrastructure and code management become much easier.
- Extremely Secure: Git-based workflows allow only authorized users to access the environment or make changes to it. In addition, Git is regarded as very robust against malicious attempts to pry into sensitive repositories.
- Extremely Easy To Audit: Finally, when the time comes to analyze what changes were made, how they were implemented, and what events were executed at what time, the Git log provides all of this and no external tools are required. This makes the audit process very easy.
Phew 💥 I hope I was able to convince you that GitOps is not just some random talk that will fade away with time. With so many important features and strong results, GitOps is certainly here to stay for a long time!
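The rollback and audit points in the list above can be sketched with a tiny in-memory model of an environment repository. This is only an illustration under my own assumptions: `Commit`, `EnvironmentRepo`, and the `web_replicas` setting are hypothetical names, and a real setup would rely on Git itself rather than a Python list. The rollback is modeled as a new commit that restores an older configuration, so the history stays complete for auditing.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    author: str
    message: str
    config: dict


@dataclass
class EnvironmentRepo:
    """Toy stand-in for a Git repository holding environment configuration."""
    history: list = field(default_factory=list)

    def commit(self, author: str, message: str, config: dict) -> None:
        # Store a copy so later edits to the caller's dict cannot rewrite history.
        self.history.append(Commit(author, message, dict(config)))

    def head(self) -> dict:
        """Return the configuration of the most recent commit."""
        return self.history[-1].config

    def rollback(self, author: str, steps: int = 1) -> None:
        # Restore an older configuration as a NEW commit, keeping the audit trail.
        target = self.history[-1 - steps]
        self.commit(author, f"Revert to: {target.message}", target.config)

    def audit_log(self) -> list:
        """List who changed what, oldest first."""
        return [f"{i}: {c.author}: {c.message}" for i, c in enumerate(self.history)]


if __name__ == "__main__":
    repo = EnvironmentRepo()
    repo.commit("asha", "Start with 2 replicas", {"web_replicas": 2})
    repo.commit("ben", "Scale to 5 replicas", {"web_replicas": 5})
    repo.rollback("asha")  # something went wrong after scaling up
    print(repo.head())
    print("\n".join(repo.audit_log()))
```

Because the rollback adds an entry instead of deleting one, the log answers both "what is deployed now?" and "how did we get here?".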
As shown by the image placed in the introductory section above, GitOps works by the combination of two things: an IaC system and a CI/CD pipeline. Therefore, we need to create a mechanism to get things working the GitOps way. To understand the pipeline and development process, you need to understand the following prerequisites:
Environment Repository and IaC
There are two types of repositories on a system working under GitOps:
- The environment repository.
- The code repository of the application.
There is only one environment repository in the system, and it contains the environment configuration code. This configuration code describes the commands for, and the creation of, the environment. The environment repository is the heart of IaC 💗 The developer pushes a file to this repository whose code works as the instructions for deployment. When we want to make changes again in the future, we repeat the same process. The second type is the code repository, which is a normal GitHub repository that you might already be familiar with.
Another component of this system is the CI/CD pipeline. So far, then, we have two components in hand: the environment repository (IaC) and the CI/CD pipeline.
Can you guess what we need now? 🤔
Yes! A method to connect these two boxes, so that the CI/CD tool knows when to run the tests and deploy, i.e. when to run the pipeline. For this, we explore the methods of deployment in GitOps 💡
Deployments in GitOps
The following section explains the deployment process in GitOps. In layman’s terms, this section will explore the methods as to how CI/CD tools will know that some changes have happened in the repository (environment or code) since everything is code-based now.
Push-based deployment is the first of the two methods used for deploying changes in GitOps. It is the traditional deployment strategy that we use in DevOps as well as with various CI/CD tools.
Push-based deployment in GitOps is a simple process of pushing the code to the repository which goes into the build pipeline. If the code is intended to change the environment configuration, the environment repository is updated and this change will trigger the CI/CD pipeline.
Since we are focused only on the environment, we will not take code deployment or container management into account. As a Git user, you are assumed to know how application code is deployed. The same changes are shown below with a flow image:
The above image shows that a change in the application repository triggers the build pipeline. If the change affects the environment configuration, the environment repository is updated accordingly. This change then triggers the deployment pipeline and all changes are deployed. The tools used in push-based deployment are Jenkins, TeamCity, CircleCI, Travis, etc.
Push-based deployment looks simple and familiar to us, but it suffers from one major disadvantage. It is a one-way road. Changes in the environment repository trigger the deployment pipeline and, through it, update the infrastructure. That’s great! But what if something happens to the infrastructure itself? 🤦 We have no way to find out except by manually comparing the infrastructure state with the current state of the environment repository from time to time. This disadvantage is so big that it cannot be ignored ❌ As a result, push-based deployments are not recommended in GitOps, and now you know why.
Push-based deployments were smooth except for one major disadvantage, and pull-based deployment covers exactly that.
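The one-way nature of push-based deployment can be shown with a small sketch. Everything here is a simplified assumption of mine: the environment repository and the infrastructure are plain dictionaries, and `deployment_pipeline` is a hypothetical stand-in for whatever a CI/CD tool would actually run.

```python
def deployment_pipeline(desired: dict, infrastructure: dict) -> None:
    """Apply the desired configuration; runs only when a push triggers it."""
    infrastructure.clear()
    infrastructure.update(desired)


# Hypothetical environment configuration stored in the environment repository.
environment_repo = {"web_replicas": 3, "instance_type": "t3.micro"}
infrastructure = {}

# A push to the environment repository triggers the pipeline exactly once.
deployment_pipeline(environment_repo, infrastructure)
in_sync_after_push = infrastructure == environment_repo

# Later, someone changes the infrastructure by hand. No push happened,
# so the pipeline never runs and nothing corrects or reports the change.
infrastructure["web_replicas"] = 1
drifted = infrastructure != environment_repo

if __name__ == "__main__":
    print("in sync after push:", in_sync_after_push)
    print("drifted without anyone noticing:", drifted)
```

The comparison on the last lines is exactly the manual check described above: in a push-based setup, somebody has to remember to run it.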
Pull-based deployment is push-based deployment + operator
An operator is a tool that can not only update the infrastructure but can also observe it for unintended changes regularly. The operator can detect any difference between the deployed infrastructure and the desired infrastructure and act accordingly on the environment repository. The flow diagram for the pull-based deployment might look like the following:
There are two additions in the above image: the first is the operator itself, and the second is the double-ended arrow between the operator and the deployment pipeline. The operator can not only trigger the deployment pipeline for changes; it can also reach out to it directly, observe any difference between the desired state and the deployed state, and write that difference to the environment repository. The operator is the protagonist here, so it is essential to have a good command of it.
Since we are now working with a version control system that can contain many branches, we can even connect different operators to different branches to observe, although this will increase the complexity of the system. You can use Helm Operator for this, or explore further and share your favorite operator in the comment section below 👇
In the following table, I have listed different tools for each process required from code push to deployment using GitOps. I recommend comparing the tools and using the one that suits your requirements best: 👇
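Here is a minimal sketch of what the operator's observe-and-correct step could look like. It assumes, for illustration only, that the desired and deployed states are flat dictionaries and that the operator resolves drift by re-applying the desired state; `find_drift` and `reconcile` are hypothetical names, not the API of any real operator.

```python
_MISSING = object()  # marks a key that is absent on one side


def find_drift(desired: dict, deployed: dict) -> dict:
    """Return {key: (desired_value, deployed_value)} for every mismatch."""
    drift = {}
    for key in desired.keys() | deployed.keys():
        want = desired.get(key, _MISSING)
        have = deployed.get(key, _MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


def reconcile(desired: dict, deployed: dict) -> dict:
    """Bring the deployed state back in line with the desired state."""
    drift = find_drift(desired, deployed)
    for key, (want, _have) in drift.items():
        if want is _MISSING:
            del deployed[key]      # not declared in the repository: remove it
        else:
            deployed[key] = want   # declared value wins
    return drift


if __name__ == "__main__":
    desired = {"web_replicas": 3, "instance_type": "t3.micro"}
    deployed = {"web_replicas": 1, "instance_type": "t3.micro", "debug_port": 9000}
    corrected = reconcile(desired, deployed)
    print("corrected keys:", sorted(corrected))
    print("in sync:", deployed == desired)
```

A real operator would run a step like this on a schedule or in response to events, which is what closes the gap left by the push-only flow.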
| Process | Tools |
|---|---|
| Code and Repository Storage | Git |
| Infrastructure Provisioning | Terraform, AWS CloudFormation, Pulumi |
While writing this post, I found the above tools to be popular and to perform well. If you have found another suitable tool that is not listed above, feel free to share it with us! 😃
GitOps is a recent advancement in the software development life cycle. Given the pace at which it is growing, there is no doubt that GitOps will stay with us for a long time. But it is also true that, since GitOps is fairly new, organizations are yet to shift their full focus towards it. That focus will act as a catalyst for methods and tools that enable better software development through GitOps.
This is the best time to learn this technology and be ahead in the race 🏁 GitOps uses infrastructure as code as the underlying mechanism to deploy changes easily, cheaply, and quickly. GitOps is a developer-centric process and is therefore widely accepted in the community.
Usually, on the internet and in workshops, you will see Kubernetes used with GitOps, and I have also mentioned the same in the table above. This makes Kubernetes seem necessary for GitOps, which is not true. GitOps can be achieved without Kubernetes too, but container orchestration then becomes additional work. If that is not a problem for you, there is no harm in not using Kubernetes.
The only challenge for GitOps is that, while it revolves around Git, not everything in the SDLC does. We are moving towards that goal, but we are not there yet. GitOps is definitely easier and achievable, but complex tasks in big projects might force you to go outside the GitOps bubble 💭 This can only be judged by building the model and architecture of the project. If you ask me, GitOps is definitely worth a try and, as I see it, it has a bright future ✨
💬 I invite you to share your thoughts and experiences with me in the comment section below and help everyone explore more about GitOps 🔍