Data Streaming Products
Jakob Jenkov
There are many data streaming products, and it can be hard to know where to start studying them or which product does what. Therefore I will give an overview of some of the popular data streaming products in this tutorial.
Data streaming products fall into two basic categories:
- Data Streaming Services
- Data Stream Processing Tools
Data streaming services store data sequentially to form a stream. A data producer writes data to the data streaming service. Once stored, the data can be read by data consumers.
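The producer / consumer model described above can be sketched as a small in-memory, append-only log. This is only an illustration under simplifying assumptions, not the API of any product listed in this tutorial: the class `InMemoryStream` and its methods `append` and `read_from` are names invented for this sketch, and a real service would also persist and distribute the data.

```python
from typing import Any, List, Tuple


class InMemoryStream:
    """Hypothetical append-only log: records are kept in arrival order."""

    def __init__(self) -> None:
        self._records: List[Any] = []

    def append(self, record: Any) -> int:
        """Producer side: store a record and return its offset (position)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[Tuple[int, Any]]:
        """Consumer side: return (offset, record) pairs from offset onward."""
        return list(enumerate(self._records[offset:], start=offset))


# A producer writes three records to the stream.
stream = InMemoryStream()
for event in ["login", "click", "logout"]:
    stream.append(event)

# Two independent consumers read the same stored data from different offsets.
all_records = stream.read_from(0)
latest_only = stream.read_from(2)
```

Reading does not remove anything from the stream, so each consumer can keep its own offset and read at its own pace.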
Data stream processing tools are APIs, applications and services that read data from a data streaming service and analyze the data, or help you analyze it.
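As a rough illustration of what stream processing means, the sketch below reads records one at a time, in arrival order, and keeps a running count per key. The record shape `(user, action)` and the function name `count_by_key` are assumptions made for this example. A real tool would read from a data streaming service rather than from a Python list.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def count_by_key(records: Iterable[Tuple[str, str]]) -> Iterator[Dict[str, int]]:
    """Yield a snapshot of the per-key counts after each record is processed."""
    counts: Counter = Counter()
    for key, _value in records:
        counts[key] += 1
        yield dict(counts)


# Stand-in for records read from a stream, in arrival order.
events = [("alice", "login"), ("bob", "login"), ("alice", "click")]
snapshots = list(count_by_key(events))
```

Because the function is a generator, it produces an updated result after every record instead of waiting for the stream to end, which matters when the stream has no end.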
Several data streaming products contain tools both for storing and accessing the stream data and for analyzing it. I have listed the most commonly known products and their overall features in the table below. I apologize if my categorization is not 100% correct. Products do change over time! If you find a wrong categorization, please notify me (e.g. via email, LinkedIn or Twitter).
Product | Stream Storage | Stream Processing |
---|---|---|
Apache Spark | Yes | Yes |
Apache Kafka | Yes | Yes |
Confluent Kafka | Yes | Yes |
Amazon Kinesis | Yes | No |
Akka Streams | Yes | Yes |
Apache Flink | Yes | Yes |
Apache Pulsar | Yes | No |
Apache Kafka Consumer / Producer API | Part of Kafka | Yes |
Apache Kafka Streams | Part of Kafka | Yes |
ReactiveX | No | Yes |
Apache Storm | No | Yes |
Apache Samza | No | Yes |