Developing Event-driven Applications with Apache Kafka and Red Hat AMQ Streams
Abstract
| Goal | Describe the principles of event-driven applications. |
| Objectives |
Describe event-driven applications. |
| Sections |
|
User requirements for modern applications have changed drastically in the past decade. Users expect applications to be responsive and always available. At the same time, applications experience large numbers of users that transfer unprecedented amounts of data. Users also expect new application features, which means developers must redeploy the application in a rapid fashion.
To meet the demands of users and application stakeholders, developers commonly use public and private clouds. Cloud features, such as elastic resource allocation or exposed APIs for self-service deployments, help developers to meet the demands.
Modern applications are commonly structured as distributed applications (or systems). Distributed applications enable developers to fully utilize the cloud, as well as distributed systems orchestrators, such as Red Hat OpenShift Container Platform.
Distributed systems divide an application into logically separate components that communicate with each other by exchanging messages. Developers often implement these distributed applications by using a microservice architecture. Developers can design microservices with varying degrees of separation and isolation, or coupling. Loosely coupled microservices are designed to provide:
Horizontal scalability of individual system parts.
A high degree of encapsulation. This can lead to more resilient and reliable systems.
Independence of each component. This enables polyglot systems.
Developers that implement distributed systems by using microservices, however, often encounter issues with stateful data management and synchronous communication.
Developers might design microservices that need to share system state, for example:
A gateway microservice implements sticky session management because of a legacy component.
Multiple microservices share state to provide transaction management.
Depending on the implementation of the state management, some services might become tightly coupled to the shared state. Shared state can lead to issues, such as:
The need for additional infrastructure to share and manage state, which can reduce system resiliency.
Difficulty implementing a rolling zero-downtime upgrade because administrators must migrate distributed state.
The need for implementing logic to prevent inconsistent state or race conditions.
Developers can maintain and distribute the application state by using a distributed cache or database system. Distributing state, however, can be difficult to manage as the number of replicas increase, for example, due to caching issues.
In distributed systems, computation of a problem is split between multiple components. Consequently, components of a distributed system must be able to communicate with each other. Developers often choose a synchronous protocol for communication, such as:
REST
gRPC (in synchronous mode)
SOAP
In synchronous communication mode, a message blocks the thread until the component is ready to respond. For example, a REST API GET request that returns a list of users might block the execution thread until the database responds with the result.
Developers might find synchronous point-to-point protocols easier to work with, because those protocols enable procedural-style programming in distributed systems. Familiarity with the protocols enables faster initial development. Because synchronous communication increases coupling between services, however, relying on synchronous communication provides a number of challenges.
Synchronous communication can increase overall latency of the system. If one part of a distributed system synchronously waits for a response from a slower part of the system, then performance issues propagate through both parts.
Applying synchronous communication in microservices increases coupling between services. Consequently, clients might experience increased latency, or timeout exceptions.
Similarly to latency, synchronous communication can propagate failures from one part of the system to another. Synchronous microservices often implement mitigating design patterns that prevent cascading failures, such as bulkhead or circuit breakers. However, depending on the system architecture, a failure in one part of the system can still render the overall system unreliable or unresponsive.
Event-driven architecture (EDA) is a software architecture model that promotes capturing, communication, and processing of events. Events in EDA enable distributed applications to use asynchronous communication, and promote loose coupling between parts of a distributed application. EDA also enables implementing distributed state as events, which results in services de-coupled from application state.
The following are some of the benefits of EDA:
Improved scalability, because loosely coupled components scale independently.
Enhanced stability, because loosely coupled components fail independently.
Easier parallelization, because multiple components can process the same events independently. Additionally, this simplifies the creation of components unrelated to business logic, such as logging, monitoring, auditing, and alerting.
Apache Kafka is used to implement message-driven communication patterns in distributed systems. Kafka provides the infrastructure that is necessary to produce, consume, and react to events in your distributed systems.
Quarkus is a Java framework for distributed systems that enables developers to create microservices. Quarkus microservices then use the Kafka infrastructure for producing and consuming events in a non-blocking, asynchronous manner.
Events EDA typically represent one of the following:
Real-world event, such as a customer placing an order.
Internal event, such as a notification that a customer order was persisted to a database.
An event carries data, and is separated from an action that might result from the event. This means an event has no implementation of the business logic that processes the event.
Developers that use events to model real-world events often employ Domain Driven Design (DDD). DDD is a methodology for structuring code and defining loosely coupled microservices based on the application business domains. In DDD, developers work with domain experts to split a complex system into multiple domains. Events within a domain are called domain events.
DDD defines the following domain event types:
- Commands
A request for an action. For example, a command event can instruct a system to persist an order.
- Event
A notification about a real-world event. The
eventnotification is often a result of acommandnotification. For example, a customer placing an order might emit anOrderPlacedevent.- Query
A request for information. Queries imply no side effects. This means that queries do not alter state in a system. For example, finding a customer based on customer ID might emit a
CustomerQueryevent.
Similarly, distributed systems can emit internal state requests as events.
For example, a part of the system emits a PersistOrderCommand event.
Another part of the system processes the event and emits a OrderPersisted internal state notification.
Internal events are useful for state replication and scalability of stateful system parts. Such events are commonly called integration events.
Kafka enables developers to implement a number of event-driven patterns, such as the following:
- Publisher-Subscriber (Pub-Sub)
The pub-sub architecture defines a set of publishers (or producers), that send messages (or events) to message brokers. Subscribers receive and process messages from message brokers.
- Event Streaming
Kafka enables developers to create streaming pipelines. Because Kafka streaming pipelines are easy to design, parallelize, and scale, streaming pipelines enable developers to process a large amount of data in real time.
Event streaming is useful for processing large amounts of data in real time.
In Kafka, developers can use Kafka Streams to implement the event streaming pattern.
Reactive architecture describes a set of principles that developers should follow to adapt software systems to modern demands.
The preceding diagram follows the Reactive Manifesto and defines reactive principles with their causal relationships.
The following table illustrates some of the changes between traditional and modern application systems.
Table 1.1. Change of Application Requirements
| Demand | Traditional workloads | Modern workloads |
|---|---|---|
| Number of nodes | Tens of nodes | Up to thousands of nodes |
| Amount of data | Gigabytes | Up to petabytes (1 PB = 1,000,000 GB) |
| Uptime requirements | Between 99% and 99.9% uptime (hours of downtime per month) | Up to 99.999% uptime (seconds of downtime per month) |
According to the Reactive Manifesto, systems that aim to handle modern requirements should follow the following reactive principles:
- Responsive
Each part of a reactive system must respond as quickly as possible if it is feasible. Responsive services should deliver responses at similar rates at all times. This enables fast error detection and increased reliability of the overall system.
- Resilient
A reactive system stays responsive in the event of failures. Developers can achieve reactive resiliency by using replication, error containment and isolation, and delegation.
Reactive services often achieve resiliency by using replication on different threads, nodes, and networks.
- Elastic
A reactive system can adapt to varying loads. As the load increases, developers can meet the demand by providing more resources to the system, such as adding nodes. Similarly, when the load decreases, developers can remove additional resources.
Elasticity of a reactive system implies removing points of contention, such as a database that limits the number of nodes.
- Message Driven
A reactive system uses asynchronous, non-blocking, message-driven (or event-driven) communication patterns. The message-driven architecture enables reactive systems to load balance workload.
For example, a part of the system that is becoming overloaded starts applying backpressure, which means it does not consume all messages. Another replica of that system part can then process the remaining messages. This leads to greater failure isolation and resiliency.
Quarkus uses Eclipse Vert.x and Netty to enable developers to implement non-blocking, asynchronous, input/output (I/O) applications. Quarkus then uses SmallRye Reactive Messaging to enable developers to implement event-driven, reactive applications.
Because reactive systems require application architecture that is different from synchronous, imperative-style microservices, Quarkus enables developers to also implement blocking, synchronous, imperative programs. This is useful for migrating systems to the reactive architecture.
Quarkus and Kafka enable developers to use reactive programming techniques to implement reactive-style distributed systems. For example, Kafka topics can provide event resiliency while Quarkus enables developers to reactively consume and process the events.
This course uses the Quarkus framework to implement event-driven distributed systems with reactive elements. See the Quarkus Reactive Architecture document in the references to learn more about how Quarkus implements reactive-style or fully reactive systems.