How we speed up backend development
Every time a rider requests a journey, a driver performs a drop-off, or a corporate client books a ride at Cabify, a lot of the pieces that compose our technical puzzle start to interact with one another. These pieces are called microservices, and together they create an interconnected ecosystem that allows us to provide service to our customers. To develop them we mainly use Golang, Elixir, and Ruby, the first being the most widely adopted language.
In their daily job our developers face a lot of different problems that can be grouped into two categories: those related to the business domain, and those related to the technical infrastructure. Middleware, the team I belong to, is in charge of tackling the accidental complexity of the infrastructure so other teams can focus on addressing problems in their own domain.
One of our main missions is to ensure the teams in charge of developing the microservices do not spend time trying to fit them into the Cabify platform. As Golang is the most widely adopted language, we have focused on finding tools and libraries for it that abstract the integration of the services into the platform.
As you might know, there is no industry-standard framework for building services in Golang (aka Go). While Ruby has Ruby on Rails and Java has Spring, Go has a rich ecosystem composed of many loosely coupled libraries. The official Go libraries provide a standard way of handling basic HTTP communications, data encoding/decoding, time management, etc., but many other decisions are left to the programmers. As no existing tool satisfied our needs, we made a new one. And we called it Servant.
Servant is not a framework but an integration of multiple libraries and tools, properly configured to match the infrastructure we have at Cabify. It tries to hide as many irrelevant details as possible so developers can focus on their domain logic instead of infrastructure problems. This internal library allows us to speed up backend development. Do you want to know how it works? Let’s do a quick tour.
This is the smallest microservice we can write thanks to Servant.
package main

import "gopkg.cabify.tools/servant"

func main() {
	service := servant.NewService()
	service.Run()
}
servant.NewService() returns a new service that we’ll use as an entry point for almost every interaction. It will read the configuration from the environment variables and use it to initialize the internal machinery. service.Run() executes the service in an infinite loop.
Let’s execute it:
{"appname":"main","appversion":"","level":"info","msg":"Creating a new service","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Initializing default Gin server","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status/liveness","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status/readiness","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server Prometheus metrics route","path":"/metrics","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server middleware to obtain HTTP metrics","time":"2022-04-01T09:49:48+02:00"}
{"address":":8080","level":"info","msg":"Starting task","task-name":"http-server","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Service is up and running","time":"2022-04-01T09:49:48+02:00"}
Structured logging.
Servant configures the log backend according to the Cabify standards, using the Logrus library. Logs are written in JSON and include several normalized fields such as level, msg, and time. Having those normalized fields makes it easier to dive into the logs via Kibana, since all the services use the same name for the same purpose, and it avoids the cardinality explosion we would have if every service named them on its own (we could end up with msg, message, memo, report, etc.).
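To give an idea of what this looks like in code, here is a minimal sketch of the kind of Logrus setup Servant performs for you (the exact configuration is internal to the library):

package main

import "github.com/sirupsen/logrus"

func main() {
	log := logrus.New()
	// JSON output with normalized fields, as in the logs shown above.
	log.SetFormatter(&logrus.JSONFormatter{})
	log.WithFields(logrus.Fields{
		"appname":    "main",
		"appversion": "",
	}).Info("Creating a new service")
}

Running this prints a JSON line with the same appname, appversion, level, msg, and time fields we saw in the service output.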
Health checks.
Take a look at:
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status/liveness","time":"2022-04-01T09:49:48+02:00"}
{"level":"info","msg":"Configuring HTTP server healthcheck route","path":"/status/readiness","time":"2022-04-01T09:49:48+02:00"}
Three endpoints have been created:
/status
/status/liveness
/status/readiness
These endpoints are called health checks, and they are used by our container orchestrators and load balancers to determine whether the service container is up and ready to receive traffic, or whether it is no longer alive and must be restarted. To test how they work we can do:
$ curl -w "\n" http://localhost:8080/status
{"status":"ok"}
Nice. Our service is alive and ready to work.
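For illustration, a health-check route like the ones Servant registers could look roughly like this on a plain Gin server (just a sketch; Servant wires these handlers up for you, and its actual implementation is internal):

package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()
	// Reply with a simple JSON body, matching the curl output above.
	r.GET("/status", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{"status": "ok"})
	})
	r.Run(":8080")
}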
Metrics.
The next interesting line is:
{"level":"info","msg":"Configuring HTTP server Prometheus metrics route","path":"/metrics","time":"2022-04-01T09:49:48+02:00"}
It tells us that a /metrics route has been configured in our HTTP server in order to export metrics to a Prometheus server. Why do we do that? Because we need to collect, aggregate, and analyze different metrics to be able to evaluate the response time of our application, its memory usage, CPU consumption, bandwidth, availability, error rate, etc.
Let’s see how many requests have been handled:
$ curl -w "\n" http://localhost:8080/metrics | grep promhttp_metric_handler_requests_total
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 7
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
Awesome. Every scrape so far has returned a 200, with no errors.
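Servant sets this up for you as well; as a standalone sketch, exposing a /metrics route with the official Prometheus Go client looks roughly like this:

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Serve the default Prometheus registry on /metrics; the
	// promhttp_metric_handler_* series above come from this handler.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}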
Bootstrap and shutdown.
Let’s check now these lines:
{"appname":"main","appversion":"","level":"info","msg":"Creating a new service","time":"2022-04-01T09:49:48+02:00"}
[...]
{"level":"info","msg":"Service is up and running","time":"2022-04-01T09:49:48+02:00"}
As you can see, our service is up and running. Servant provides a way of starting the different parts of our service in a given order. And what will happen if we kill our app using Ctrl+C?
{"delay":"2s","level":"info","msg":"Termination started, waiting before shutting down resources","time":"2022-04-01T09:54:39+02:00"}
{"address":":8080","level":"info","msg":"Shutting down task","task-name":"http-server","time":"2022-04-01T09:54:41+02:00"}
Servant listens for the SIGHUP, SIGINT, and SIGTERM signals. When one of them is captured, the shutdown sequence starts so our program terminates gracefully. First, Servant waits a few seconds to let ongoing actions finish; then it shuts down the service’s tasks in the opposite order they were started. After that, the HTTP server is closed and the program exits with a status code of 0.
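As an illustration of the pattern (not Servant’s actual code), a graceful shutdown on those signals can be sketched with the standard library like this:

package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	go srv.ListenAndServe()

	// Block until one of the termination signals arrives.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM)
	<-sig

	// Wait before shutting down resources, like the "delay":"2s" in the logs.
	time.Sleep(2 * time.Second)

	// Let in-flight requests complete, then close the HTTP server.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	srv.Shutdown(ctx)
}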
Servant in a nutshell.
In summary, what we have shown you is the essence of Servant. This code represents the minimum service that can be deployed on the Cabify platform. It provides health checks used by our container orchestrators and load balancers to determine whether the service container is up and ready to receive traffic, or must be restarted. It also provides Prometheus instrumentation to be scraped and served by our monitoring infrastructure. It writes the application logs in a normalized way so we can easily dive into them via Kibana. And the service lifecycle is managed by the library to perform graceful bootstraps and shutdowns.
All this without making our developers worry about the underlying infrastructure. And all of it with just two lines of source code.
But that is not all: Servant provides a lot of useful functionality and tools, and facilitates their integration and configuration. Thanks to it, developers get HTTP and gRPC clients and servers, integration with caching services and MySQL databases, distributed tracing, recoverers, circuit breakers, retriers, sane default configuration, and much more. A small sketch of one of those building blocks follows.
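As an example, the circuit-breaker pattern can be sketched with a common Go library such as sony/gobreaker (not necessarily the one Servant integrates):

package main

import (
	"fmt"
	"net/http"

	"github.com/sony/gobreaker"
)

func main() {
	// The breaker opens after repeated failures, protecting the caller
	// from hammering an unhealthy dependency.
	cb := gobreaker.NewCircuitBreaker(gobreaker.Settings{Name: "upstream"})

	_, err := cb.Execute(func() (interface{}, error) {
		resp, err := http.Get("http://localhost:8080/status")
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	})
	if err != nil {
		fmt.Println("request failed or circuit open:", err)
	}
}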
We are hiring!
Join our team and help us transform our cities with sustainable mobility.
Check out our open positions.