Developing Event-driven Applications with Apache Kafka and Red Hat AMQ Streams
- Describing Kafka and AMQ Streams
- Quiz: Describing Kafka and AMQ Streams
- Describing the Kafka Ecosystem and Architecture
- Quiz: Describing the Kafka Ecosystem and Architecture
- Creating Topics
- Guided Exercise: Creating Topics
- Sending Data with Producers
- Guided Exercise: Sending Data with Producers
- Receiving Data with Consumers
- Guided Exercise: Receiving Data with Consumers
- Defining Data Formats and Structures
- Guided Exercise: Defining Data Formats and Structures
- Lab: Introducing Kafka and AMQ Streams Concepts
- Summary
Abstract
| Goal | Build applications with basic read and write messaging capabilities. |
| Objectives | |
| Sections | |
| Lab | Introducing Kafka and AMQ Streams Concepts |
After completing this section, you should be able to describe the history and use cases of Kafka and AMQ Streams.
Apache Kafka is an open source distributed system composed of servers and clients that communicate over the TCP protocol. Kafka was created as a high-performance messaging system, and it is often described as a distributed commit log or a distributed streaming platform.
Kafka was initially developed at LinkedIn to rebuild the user activity tracking pipeline as a set of real-time publish/subscribe communication channels. It was designed as a high-performance messaging platform that handles the real-time data feeds of a large company. Kafka powers some of the largest data pipelines in the world and is used by more than 80% of all Fortune 100 companies. It was released as an open source project in late 2010, was accepted as an Apache Software Foundation incubator project in July 2011, and graduated from the incubator in October 2012.
The Red Hat AMQ Streams component is a scalable, distributed, and high-performance data streaming platform based on the Apache Kafka project. You can use AMQ Streams on Red Hat OpenShift, or on Red Hat Enterprise Linux.
At the core of AMQ Streams, Kafka provides:
- A publish/subscribe messaging model, similar to a traditional enterprise messaging system.
- A distributed system optimized for high throughput of messages and low latency.
- Durable, distributed, and fault-tolerant storage of data.
- The ability to replay streams of events.
- The ability to scale horizontally as the data streams grow.
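The commit-log model behind several of these capabilities can be illustrated with a short sketch. This is a hypothetical, in-memory analogy, not the Kafka API: producers append records to an ordered log, each record gets an offset, and every consumer tracks its own position, so a late-joining consumer can replay the stream from the beginning.

```python
from dataclasses import dataclass, field

@dataclass
class CommitLog:
    """Toy analogy of a Kafka topic partition: an append-only, ordered log."""
    records: list = field(default_factory=list)

    def append(self, record):
        self.records.append(record)       # ordered, append-only write
        return len(self.records) - 1      # offset assigned to the new record

    def read_from(self, offset):
        return self.records[offset:]      # any consumer can read from any offset

log = CommitLog()
for event in ["created", "updated", "deleted"]:
    log.append(event)

# A consumer that joined late can still replay the full stream:
replayed = log.read_from(0)
# A consumer that already processed offsets 0 and 1 reads only the tail:
tail = log.read_from(2)
```

Because records are never removed on read, many independent consumers can process the same stream at their own pace, which is what enables both the publish/subscribe model and event replay.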
All of these capabilities make AMQ Streams suitable for event-driven and event sourcing architectures.
AMQ Streams provides a streaming platform that allows the exchange of data with high throughput and low latency. This is a benefit to many common use cases:
- Messaging
Replaces traditional messaging systems such as Apache ActiveMQ or RabbitMQ. AMQ Streams was designed for fault tolerance, and therefore provides strong durability and replication with higher throughput than traditional messaging systems. For example, you can use AMQ Streams as the communication layer in your event-driven application.
- Stream Processing
Responds to real-time events by storing, aggregating, enriching, and processing data streams. For example, you can capture the interactions of the users of your website, analyze the behavior, and then build customer profiles that help you to increase sales.
- Data Integration
Captures streams of events or data changes, and generates feeds compatible with other data systems to consume. For example, you can capture database changes in a monolithic application, and send events based on those changes without changing the application code.
- Metrics
Aggregates statistics from the components of your distributed applications to produce centralized feeds of operational data. For example, you can send real-time metrics of the vehicles of your company to AMQ Streams, and then build an application that consumes all that data and generates a real-time status report.
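The vehicle-metrics example above amounts to folding a stream of events into a current-state view. The following is a minimal sketch of that idea; the record fields (`vehicle`, `speed_kmh`) are illustrative assumptions, not part of any AMQ Streams API.

```python
# Hypothetical stream of metric events, as a consumer might receive them.
events = [
    {"vehicle": "truck-1", "speed_kmh": 80},
    {"vehicle": "truck-2", "speed_kmh": 95},
    {"vehicle": "truck-1", "speed_kmh": 60},
]

def status_report(stream):
    """Fold a stream of metric events into the latest reading per vehicle."""
    latest = {}
    for event in stream:
        latest[event["vehicle"]] = event["speed_kmh"]  # last value wins
    return latest

report = status_report(events)
# truck-1's later reading (60) replaces its earlier one (80)
```

In a real deployment, the events would arrive continuously from a Kafka topic and the report would be updated incrementally rather than computed over a fixed list.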
- Log Aggregation
Abstracts away the file details and transforms logs into a stream of data. This makes it easier to support multiple data sources and to distribute data consumption. For example, you can send the access logs of your Apache and NGINX servers to AMQ Streams, and then build a monthly access report.