People still argue about what, or who, DevOps is. Some say it is a person who writes CI/CD pipelines or manages cloud infrastructure; others claim that it is a culture and a knowledge-sharing attitude. If you are reading this post, you most probably have your own opinion on the matter, and there probably isn’t a single “right” answer. Whichever way you understand “DevOps”, you need specific tools, and in this article I’ll share some insights on how we handle our processes at Netguru. I could have just listed all the common DevOps tools, but there are plenty of articles about that on the Internet, so let me focus on Netguru's way of using five selected tools. As a tasty starter, let me tell you that some of them were developed internally!
One of the most common tools in the DevOps inventory, Terraform gives you an easy way to write your infrastructure as code. Let's talk about how we use it at Netguru. First of all, we use modules, which allow you to split infrastructure into smaller bricks. Each brick is responsible for an indivisible part of the environment. One example is a VPC module for AWS, which manages all the networking: subnets, routing tables, and so on. This is the base module we start with: it spins up the basic environment with no additional services. All other modules are built on top of this infrastructure. We also use something that we call templates. Templates are groups of modules used to spin up a specific environment. If you want to set up a 3-tier application, we already have a template for it, so you don’t have to look for each module separately! Additionally, we follow standard Terraform best practices such as remote state files and state locking, and we use Atlantis for pull-request automation. Sometimes we throw Terragrunt into the mix if the project is demanding enough.
Each organization has some domains to manage. This can be as simple as one domain with a single entry or as complex as a full-blown DNS setup. In the case of Netguru, we have quite a few, and I’m counting only internal domains owned by our organization. Some of those are for public use, like netguru.com, and some are reserved for internal development. The latter can have a few hundred entries each. So how do we manage them? That’s an excellent question! We created a single source of truth in a Git repository, where each domain has its own file. When a file is updated on the master branch, it triggers an automation that syncs the changes with the domain registrar through its API. This way we can track all historical DNS changes, and developers can request their own updates via Git pull requests.
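To make the sync step more concrete, here is a minimal sketch of how such an automation could work out what to change. Everything in it is an assumption made for illustration: the one-record-per-line file format, the `Record` type, the function names, and the sample data are hypothetical, and the actual call to the registrar's API is left out. The idea is simply to compare the records in the Git file with the live ones and plan the difference.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One DNS entry (hypothetical model, not Netguru's actual format)."""
    name: str
    type: str
    value: str


def parse_zone_file(text: str) -> set[Record]:
    """Parse lines of the form 'name type value'; skip blanks and # comments."""
    records = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, rtype, value = line.split(maxsplit=2)
        records.add(Record(name, rtype.upper(), value))
    return records


def plan_sync(desired: set[Record], live: set[Record]) -> tuple[list[Record], list[Record]]:
    """Return (to_create, to_delete) so that the live zone matches the file."""
    to_create = sorted(desired - live)
    to_delete = sorted(live - desired)
    return to_create, to_delete


# Desired state: the file for one domain, as it would sit in the repository.
desired = parse_zone_file("""
# example.dev, a made-up internal domain
www   A      203.0.113.10
api   CNAME  www.example.dev.
""")

# Live state: what the registrar currently reports (hard-coded here).
live = {
    Record("www", "A", "203.0.113.10"),
    Record("old", "A", "203.0.113.99"),
}

to_create, to_delete = plan_sync(desired, live)
print("create:", to_create)
print("delete:", to_delete)
```

Planning the difference first, instead of overwriting the whole zone, keeps each run small and makes it easy to show the pending changes in a pull request before anything is applied.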
In every application's development lifecycle, there comes a time when the application should be tested in a production-like environment. Usually this environment is called staging, and it can be temporary or permanent. There can be a few concurrent staging environments running different branches of an application. At Netguru we have an internal tool to spin up such infrastructure quickly. Each developer can go to a dedicated website, click through a GUI, and select the desired services, such as databases, compute power, the number of running applications, and exposed ports. Keep in mind that those are only example settings; there are many more to choose from! Remember the Terraform templates I was talking about in the first section? On the backend, Terraform applies those templates to spin up the entire environment. Developers get a staging infrastructure that is ready for deployment in 5 minutes. What’s more, at Netguru we have a special team dedicated to this kind of project, and the one I’m describing here is actively developed and improved each day. Unfortunately, I cannot share more technical details here, but if you join us one day, you will get to learn all the insights.
Imagine that you have hundreds of servers to manage across different projects, and each team requires SSH access to a selected set of servers. Access is granted based on SSH keys, which can be personal or shared within the team. How do you manage and rotate keys in this scenario? Keep in mind that servers can be spread across multiple cloud providers or run on bare-metal machines owned by the client, so vendor-locked solutions won’t work. At Netguru, we have implemented a centralized SSH access control system. Two types of files are required for this to work. The first type describes each member or team within Netguru; the most important information here is their public SSH key. The second type describes each server and who can access it, and it can refer to people, teams, or SSH keys directly. How does it all work? All of these definitions are processed by an internal application which exposes an API endpoint. When an authorized request hits this endpoint, it lists the members who can access a given machine. We configure the SSH daemon on each server to ask this API for authorized keys. That’s it. Simple solutions are the best!
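The lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `MEMBERS`, `TEAMS`, and `SERVERS` structures, the key strings, and the `authorized_keys` function are all made up, and the real internal application is not public. On the server side, OpenSSH's `AuthorizedKeysCommand` option in `sshd_config` is one standard way to have the daemon fetch keys from an external program; the post does not say which mechanism Netguru actually uses.

```python
# Hypothetical contents of the "member/team" files: name -> public key.
MEMBERS = {
    "alice": "ssh-ed25519 KEY_ALICE",
    "bob": "ssh-ed25519 KEY_BOB",
    "carol": "ssh-ed25519 KEY_CAROL",
}

# Hypothetical team definitions: team -> member names.
TEAMS = {
    "backend": ["alice", "bob"],
}

# Hypothetical "server" files: who may log in, given as people,
# teams, or raw SSH keys (for example a key supplied by the client).
SERVERS = {
    "staging-1": {
        "members": ["carol"],
        "teams": ["backend"],
        "keys": ["ssh-ed25519 KEY_CLIENT"],
    },
    "prod-1": {"members": ["alice"]},
}


def authorized_keys(server: str) -> list[str]:
    """Resolve a server definition into the list of public keys allowed on it."""
    entry = SERVERS.get(server)
    if entry is None:
        return []  # unknown server: nobody gets in

    # Collect member names, directly listed ones first, then team members.
    names = list(entry.get("members", []))
    for team in entry.get("teams", []):
        names.extend(TEAMS.get(team, []))

    # Map names to keys, then append any keys listed directly.
    keys = [MEMBERS[name] for name in names if name in MEMBERS]
    keys.extend(entry.get("keys", []))

    # Remove duplicates while preserving order.
    return list(dict.fromkeys(keys))


# What an API response for one machine could contain, one key per line.
print("\n".join(authorized_keys("staging-1")))
```

Because keys are resolved at lookup time, rotating a key or removing someone from a team means changing one file, with no need to touch the servers themselves.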
As you might have already noticed, we like to automate things. The same goes for our monitoring system. We keep all of our monitoring configurations in a Git repository, and each project is described in a simple .yml config file. Such a file lists all servers that belong to a project and which elements should be monitored (for example node stats, Docker containers, public endpoints, etc.). Additionally, you can specify alert policies and notification channels. When someone updates or adds a configuration file, an automation runs to update the monitoring tool's settings and set up all dashboards. This way we don’t have to manage projects from the GUI, and everything stays standardized. We can also integrate multiple monitoring solutions for high availability, to double-check critical application components with high SLA requirements.
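As a rough sketch of what that automation might do with one project file, the snippet below expands a project description into individual checks. The field names (`servers`, `monitor`, `endpoints`, `alerts`), the check names, and the `expand_checks` function are assumptions for illustration, not our real schema. The config is written as a Python dict to keep the example self-contained; in practice it would be loaded from the .yml file.

```python
# Hypothetical project config, shown as the dict a YAML loader would produce.
PROJECT = {
    "project": "shop",
    "servers": ["shop-app-1", "shop-app-2"],
    "monitor": ["node_stats", "docker"],
    "endpoints": ["https://shop.example.com/health"],
    "alerts": {"channel": "#shop-alerts"},
}


def expand_checks(config: dict) -> list[dict]:
    """Turn one project config into a flat list of checks to register."""
    channel = config.get("alerts", {}).get("channel")
    checks = []

    # One check per (server, monitored element) pair.
    for server in config.get("servers", []):
        for kind in config.get("monitor", []):
            checks.append({
                "project": config["project"],
                "target": server,
                "check": kind,
                "notify": channel,
            })

    # One HTTP check per public endpoint.
    for url in config.get("endpoints", []):
        checks.append({
            "project": config["project"],
            "target": url,
            "check": "http",
            "notify": channel,
        })
    return checks


checks = expand_checks(PROJECT)
for check in checks:
    print(check["target"], check["check"], check["notify"])
```

A flat list like this is easy to push to more than one monitoring backend, which fits the idea of double-checking critical components with several tools.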
All of the above are just examples of the tools we use every day at work, and there are many more secret tips and tricks to discover. Has this post caught your eye? Do you see room for improvement? Well, we are always open to change! Maybe you can share your knowledge with us as well? If you’re interested in joining, now is a good time: we are growing and our DevOps team is expanding. You can find more details here.