What is Jaeger?
Jaeger is software that you can use to monitor and troubleshoot problems in interconnected software components called microservices. Several microservices communicate with each other to complete a single software function. Developers use Jaeger to visualize the chain of events in these microservice interactions and isolate the problem when something goes wrong. Jaeger is also called Jaeger Tracing because it follows, or traces, the path of a request through a series of microservice interactions.
Why is Jaeger important?
In the past, software design was monolithic, with several functions bundled together in a single code base. For example, to design a food ordering app, the food menu, restaurants, and payment systems were all bundled into a single software unit. This type of software design made the solution more complex and difficult to maintain. When developers made code changes in one area, it affected the entire system, making change management a long and tedious process. To solve this problem, architecture design became more modular.
Microservices
Modern applications function as a collection of smaller independent units called microservices. For example, a modern food-ordering app might consist of these parts:
- A geolocation service to detect where the customer is
- A service that collects and sends orders to restaurants
- A payment gateway with several payment options
Each microservice works as an independent application and has access to its own database and logic. Microservices communicate with each other using requests and responses, like a web application. A microservices-based system is also called a distributed system.
Troubleshooting microservices architecture
It is challenging to investigate problems in distributed systems because of the complex behavior of microservices. For example, placing a food order on a modular app might trigger several requests to different microservices. These requests can happen concurrently and independently and need not be sequential. If a problem occurs with the food order, developers need to determine which microservice caused it. Conventional problem tracking approaches provide only a partial picture of the request, making troubleshooting microservices tedious.
Jaeger is a software tool that IT teams use to gain visibility and clarity on the whole chain of events. They can resolve problems faster and improve the customer experience.
What is Jaeger used for?
Developers use Jaeger to improve distributed system performance in several different ways. We give some examples below.
Distributed transaction monitoring
Jaeger has features that monitor data movements between microservices. Developers can take a proactive approach to detect and resolve problems before they disrupt the user experience.
Latency optimization
Jaeger analytics can locate bottlenecks in microservices that slow down an application. Developers use Jaeger to inspect the behavior of microservices and find ways to make them faster.
Root cause analysis
In a microservice architecture, one problem can lead to others. Developers can use Jaeger to find the starting point of a series of related issues in an application.
Service dependency analysis
Service dependency means an application depends on several components to run. For example, a navigation app depends on the location services of the mobile device. Developers use Jaeger to understand the complex relationships between the different microservices.
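As a rough illustration of how service dependencies can be derived from trace data, the sketch below counts the calls that cross a service boundary. It is not Jaeger's implementation: the `SpanRecord` type, its fields, and the `dependency_edges` function are hypothetical names chosen for this example.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class SpanRecord(NamedTuple):
    """Hypothetical, simplified span: just enough to link callers to callees."""
    span_id: str
    parent_id: Optional[str]  # None for the first span of a request
    service: str


def dependency_edges(spans: List[SpanRecord]) -> Counter:
    """Count (caller service, callee service) pairs across parent-child spans."""
    service_of = {s.span_id: s.service for s in spans}
    edges: Counter = Counter()
    for s in spans:
        parent_service = service_of.get(s.parent_id)
        # Skip root spans and calls that stay inside one service.
        if parent_service is not None and parent_service != s.service:
            edges[(parent_service, s.service)] += 1
    return edges


spans = [
    SpanRecord("1", None, "orders"),
    SpanRecord("2", "1", "payments"),
    SpanRecord("3", "1", "geolocation"),
    SpanRecord("4", "2", "payments"),  # internal work, not a dependency
]
edges = dependency_edges(spans)
```

Each key in `edges` is a directed dependency, so the result can be drawn as a graph of which service calls which.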
Distributed context propagation
Distributed context propagation is the way an application passes descriptive information along with the data. This helps developers to evaluate microservice performance as a whole. For example, Jaeger tags order requests with the customer’s name so that developers can associate the request path with the specific customer.
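The sketch below shows one way descriptive information, such as a customer name, could travel with a request in its headers. It assumes Jaeger's native `uber-trace-id` header and `uberctx-` baggage prefix. The `inject` and `extract` helpers are hypothetical names for this example, not the Jaeger client API, and the sketch skips value encoding.

```python
from typing import Dict, Tuple

TRACE_HEADER = "uber-trace-id"  # carries {trace-id}:{span-id}:{parent-span-id}:{flags}
BAGGAGE_PREFIX = "uberctx-"     # one header per baggage item


def inject(trace_id: str, span_id: str, baggage: Dict[str, str]) -> Dict[str, str]:
    """Encode trace context and baggage as outgoing request headers."""
    # Parent ID 0 marks a root span here; flags=1 marks the trace as sampled.
    headers = {TRACE_HEADER: f"{trace_id}:{span_id}:0:1"}
    for key, value in baggage.items():
        headers[BAGGAGE_PREFIX + key] = value
    return headers


def extract(headers: Dict[str, str]) -> Tuple[str, str, Dict[str, str]]:
    """Recover the trace ID, caller span ID, and baggage in the receiving service."""
    trace_id, span_id, _parent_id, _flags = headers[TRACE_HEADER].split(":")
    baggage = {
        name[len(BAGGAGE_PREFIX):]: value
        for name, value in headers.items()
        if name.startswith(BAGGAGE_PREFIX)
    }
    return trace_id, span_id, baggage


headers = inject("abc123", "def456", {"customer": "alice"})
trace_id, caller_span_id, baggage = extract(headers)
```

Because the receiving service reuses the extracted trace ID, its spans join the same trace, and the baggage lets developers associate the request path with a specific customer.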
How does Jaeger work?
Jaeger works on the principles of distributed tracing and uses the OpenTracing framework.
Distributed tracing
Distributed tracing is a software technique that monitors sequences of events among microservices. It keeps track of all connections and provides charts and graphs to visualize request paths in an application. As a distributed tracing tool, Jaeger tracks request movements by assigning a unique identifier to each request and collecting information when a particular service processes the request.
OpenTracing
OpenTracing is an open-source, or freely available, framework that defines vendor-neutral standards for distributed tracing across modern software systems. For example, it provides a common standard for defining the structure of monitored information that travels between microservices. Jaeger uses OpenTracing to provide a complete solution to collect, store, manage, analyze, and visualize microservices data.
OpenTracing data model
The OpenTracing data model provides the basic definition to connect data from different components. The two main terms it uses are span and trace.
Span
A span is a single logical unit of work done in a distributed tracing system. Each span has these components:
- An operation name
- A start time and stop time
- Tags or values that help developers to analyze the span
- Logs that store any messages the microservice generates
- Span context or additional descriptions of the span
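The span components listed above can be pictured as a small data structure. This is a minimal sketch for illustration only: the `Span` and `SpanContext` classes and their field names are assumptions made for this example, not the types that the Jaeger client libraries expose.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class SpanContext:
    """Identifiers that travel with the request (hypothetical field names)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


@dataclass
class Span:
    operation_name: str
    context: SpanContext
    start_time: float                    # seconds
    finish_time: Optional[float] = None  # set when the unit of work completes
    tags: Dict[str, Any] = field(default_factory=dict)
    logs: List[Tuple[float, str]] = field(default_factory=list)

    def log(self, timestamp: float, message: str) -> None:
        """Attach a timestamped message generated by the microservice."""
        self.logs.append((timestamp, message))

    def finish(self, timestamp: float) -> None:
        self.finish_time = timestamp

    @property
    def duration(self) -> Optional[float]:
        """Elapsed time, or None while the span is still open."""
        if self.finish_time is None:
            return None
        return self.finish_time - self.start_time


span = Span("process-payment", SpanContext("trace-1", "span-2", "span-1"), start_time=10.0)
span.tags["payment.method"] = "card"
span.log(10.1, "charge authorized")
span.finish(10.25)
```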
Trace
A trace is a collection of one or more spans that belong to the same request. It represents the sequence of events that occur while the system handles that request. Spans that belong to the same trace share the same Trace ID. For example, a trace that is generated when a customer orders food results in the following spans:
- Customer submits an order
- Payment is processed
- Order list is submitted to the restaurant
- Food is picked up
- Food is delivered
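Because every span carries its Trace ID, spans reported by different services can be reassembled into traces. The sketch below groups a mixed list of spans by Trace ID and orders each group by start time. The `SpanRecord` type and `group_into_traces` function are hypothetical names for this example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class SpanRecord(NamedTuple):
    """Hypothetical, simplified span used only for this illustration."""
    trace_id: str
    operation: str
    start_ms: int


def group_into_traces(spans: List[SpanRecord]) -> Dict[str, List[SpanRecord]]:
    """Collect spans that share a Trace ID and order them by start time."""
    traces: Dict[str, List[SpanRecord]] = defaultdict(list)
    for s in spans:
        traces[s.trace_id].append(s)
    return {tid: sorted(group, key=lambda s: s.start_ms) for tid, group in traces.items()}


# Spans from two food orders arrive interleaved and out of order.
reported = [
    SpanRecord("order-1", "process-payment", 120),
    SpanRecord("order-2", "submit-order", 50),
    SpanRecord("order-1", "submit-order", 0),
    SpanRecord("order-1", "send-to-restaurant", 300),
]
traces = group_into_traces(reported)
order_1_steps = [s.operation for s in traces["order-1"]]
```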
What are the components of Jaeger?
The Jaeger distributed tracing platform consists of the following components.
Jaeger client
The Jaeger client contains language-specific implementations of the OpenTracing API in programming languages like Go, JavaScript, Java, Python, Ruby, and PHP.
Developers use these APIs to create Jaeger spans without writing the distributed tracing logic themselves.
Jaeger agent
The Jaeger agent is a network daemon, a process that runs continuously in the background to perform functions that other processes require. It listens for spans that the client sends over User Datagram Protocol (UDP), a communication method that applications use to exchange messages over a network.
The agent connects to the client in container environments like Amazon Elastic Kubernetes Service. The agent groups the spans into batches and sends them to the collector. This allows the application to run without actively sending trace information to the Jaeger backend.
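The batching behavior can be sketched as a small buffer that forwards spans in groups. This is an illustration of the idea only, not the agent's actual code: the `BatchingAgent` class, the `max_batch_size` parameter, and the `send_batch` callback are hypothetical names.

```python
from typing import Callable, List


class BatchingAgent:
    """Buffers incoming spans and forwards them to the collector in batches."""

    def __init__(self, send_batch: Callable[[List[dict]], None], max_batch_size: int = 3):
        self._send_batch = send_batch  # stands in for the call to the collector
        self._max_batch_size = max_batch_size
        self._buffer: List[dict] = []

    def receive(self, span: dict) -> None:
        """Accept one span from the client; flush once the batch is full."""
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, for example on a timer or at shutdown."""
        if self._buffer:
            self._send_batch(self._buffer)
            self._buffer = []


batches_sent: List[List[dict]] = []
agent = BatchingAgent(batches_sent.append, max_batch_size=2)
for name in ("submit-order", "process-payment", "send-to-restaurant"):
    agent.receive({"operation": name})
agent.flush()  # forward the final, partially filled batch
```

The application only hands spans to the agent, so the cost of talking to the backend is paid once per batch rather than once per span.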
Jaeger collector
A Jaeger collector is a software component that receives traces from the Jaeger agent. It checks, processes, and stores the traces in the database.
Storage
The Jaeger tracing system receives spans and stores them in a persistent storage backend or database. Persistent storage means the stored data remains intact even if the computer is powered off. For instance, developers use Amazon OpenSearch Service as persistent storage for spans.
Ingester
One way to deploy Jaeger is by sending trace data to Kafka, a distributed system that applications use to store and retrieve streams of information. An ingester is a module that reads trace data from Kafka and writes it to a storage backend.
Query
The query service retrieves trace information from the database. Developers use queries to find traces by time, tags, duration, and operation.
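The kind of lookup the query service performs can be sketched as a filter over stored spans. The example below runs in memory and is only an illustration: the `SpanRecord` type, the `find_trace_ids` function, and its parameters are hypothetical, and the real service queries the storage backend instead.

```python
from typing import Dict, List, NamedTuple, Optional


class SpanRecord(NamedTuple):
    """Hypothetical, simplified stored span."""
    trace_id: str
    operation: str
    start_ms: int
    duration_ms: int
    tags: Dict[str, str]


def find_trace_ids(
    spans: List[SpanRecord],
    operation: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
    min_duration_ms: int = 0,
    start_after_ms: int = 0,
) -> List[str]:
    """Return IDs of traces that contain at least one span matching every filter."""
    wanted = tags or {}
    matches: List[str] = []
    for s in spans:
        if operation is not None and s.operation != operation:
            continue
        if s.duration_ms < min_duration_ms or s.start_ms < start_after_ms:
            continue
        if any(s.tags.get(key) != value for key, value in wanted.items()):
            continue
        if s.trace_id not in matches:
            matches.append(s.trace_id)
    return matches


stored = [
    SpanRecord("t1", "process-payment", 1000, 80, {"error": "false"}),
    SpanRecord("t2", "process-payment", 2000, 950, {"error": "true"}),
    SpanRecord("t3", "submit-order", 2500, 40, {"error": "false"}),
]
slow_failed_payments = find_trace_ids(
    stored, operation="process-payment", tags={"error": "true"}, min_duration_ms=500
)
```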
Jaeger Console
Jaeger Console is a software program with a user interface that you can use to view and analyze traces. It displays trace data in graphs and charts.
How do developers use Jaeger?
When developers build an application, they use Jaeger client libraries to create spans. By adding code to the program to produce trace data, they create what is known as an instrumented application. The instrumented application automatically generates the following:
- Spans containing span ID, trace ID, tags, logs, and span context
- Traces for every request
Developers use the Jaeger Console to search, filter, visualize, and analyze this distributed tracing data. In the console, they can view detailed information such as process duration, errors, and logs from the microservices.
What are Jaeger sampling strategies?
An instrumented application automatically transmits trace data, also called telemetry data, whenever the application is running. You can use this trace data to measure application performance. To prevent the Jaeger backend from being overwhelmed with excessive telemetry data, you can filter or sample it by configuring sampling strategies in your Jaeger implementation. These are some sampling strategies:
- Constant sampling makes the same decision for every trace: it samples either all traces or none.
- Probabilistic sampling samples each trace at random with a fixed probability, so roughly the configured percentage of traces is collected.
- Rate limiting sampling collects up to a specific number of traces every second.
- Adaptive sampling automatically adjusts the sampling rate to reach a target number of traces per unit of time.
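The first three strategies can be sketched in a few lines each. The classes below are simplified illustrations with hypothetical names, not Jaeger's sampler implementations. They assume that constant sampling is an all-or-nothing decision, that probabilistic sampling draws independently for each trace, and that rate limiting behaves like a token bucket. Adaptive sampling is omitted because it depends on feedback from the backend.

```python
import random
from typing import Optional


class ConstSampler:
    """Makes the same decision for every trace."""

    def __init__(self, decision: bool):
        self.decision = decision

    def is_sampled(self) -> bool:
        return self.decision


class ProbabilisticSampler:
    """Samples each trace independently with a fixed probability."""

    def __init__(self, rate: float, rng: Optional[random.Random] = None):
        self.rate = rate
        self._rng = rng or random.Random()

    def is_sampled(self) -> bool:
        return self._rng.random() < self.rate


class RateLimitingSampler:
    """Token bucket: allows roughly `traces_per_second` sampled traces per second."""

    def __init__(self, traces_per_second: float, now: float = 0.0):
        self.rate = traces_per_second
        self.capacity = max(traces_per_second, 1.0)
        self.tokens = self.capacity
        self.last = now

    def is_sampled(self, now: float) -> bool:
        # Refill tokens for the time elapsed since the previous decision.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = RateLimitingSampler(traces_per_second=2.0)
burst = [limiter.is_sampled(now=0.0) for _ in range(3)]  # third request exceeds the limit
after_one_second = limiter.is_sampled(now=1.0)           # bucket has refilled
```

Passing the timestamp in explicitly keeps the rate limiter deterministic for testing. A production sampler would read a clock instead.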
What is AWS App Mesh?
AWS App Mesh is a service mesh or a software infrastructure that does the heavy lifting so that you can more easily manage microservices-based distributed systems. AWS App Mesh does the following:
- Provides consistent end-to-end visibility and high availability for your applications.
- Configures each service to export telemetry data and implement consistent communications control logic across your application.
- Provides network traffic control to help developers build secure cloud applications.
You can use AWS App Mesh as a stand-alone solution for your distributed tracing needs. It also supports several non-AWS third-party tools, like Jaeger, that you might use to monitor, log, or trace microservices communications.
Get started with Jaeger on App Mesh by creating an AWS account today.