Adding MQ section in level101 (#150)

* wip

* wip

* wip

* wip

* enrich MQ Key Concepts

* update MQ types intro

* reformat intro.md

* enrich mq type

* Update intro.md

* Update key_concepts.md

* Update key_concepts.md

* Update further_reading.md

* add link to mq intro

* remove non MQ section
pull/157/head
Dick Tang 7 months ago committed by GitHub
parent 8751885b3b
commit c63e90212d

@ -0,0 +1,23 @@
# Conclusion
We have covered basic concepts of Message Services. There is much more to learn and do. We hope this course gives you a good start and inspires you to explore further.
# Further reading
[https://sudhir.io/the-big-little-guide-to-message-queues](https://sudhir.io/the-big-little-guide-to-message-queues)
[Understanding message brokers: learn the mechanics of messaging though ActiveMQ and Kafka](http://www.oreilly.com/programming/free/understanding-message-brokers.csp)
[Video: The Myth of the Magical Messaging Fabric by Jakub Korab](https://www.youtube.com/watch?v=Ie3--CSpCGs)
[G. Fu, Y. Zhang and G. Yu, "A Fair Comparison of Message Queuing Systems," in IEEE Access, vol. 9, pp. 421-432, 2021, doi: 10.1109/ACCESS.2020.3046503.](https://ieeexplore.ieee.org/document/9303425) ([PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9303425))
[Design Patterns for Cloud Native Applications: Chapter 2 Communication Patterns]()
[Choose between Azure messaging services - Event Grid, Event Hubs, and Service Bus](https://docs.microsoft.com/en-us/azure/event-grid/compare-messaging-services)
[Exactly-once message delivery](https://exactly-once.github.io/posts/exactly-once-delivery/)
[Task Queues](https://taskqueues.com/)
[RabbitMQ tutorial](https://www.rabbitmq.com/getstarted.html)

@ -0,0 +1,52 @@
# Messaging services
## What to expect from this course
At the end of this training, you will understand what a Message Service is, know the different types of Message Service implementations, and understand some of the underlying concepts and trade-offs.
## What is not covered under this course
We will not be deep diving into any specific Message Service.
## Course Contents
* [Introduction to Messaging Service](https://linkedin.github.io/school-of-sre/level101/messagequeue/intro/#introduction)
* [Delivery guarantees](https://linkedin.github.io/school-of-sre/level101/messagequeue/key_concepts/#delivery-guarantees)
* [Messages ordering and parallelism](https://linkedin.github.io/school-of-sre/level101/messagequeue/key_concepts/#messages-ordering-and-parallelism)
* [Fan Out / In](https://linkedin.github.io/school-of-sre/level101/messagequeue/key_concepts/#fan-out--in)
* [Poison Pills and Dead Letters](https://linkedin.github.io/school-of-sre/level101/messagequeue/key_concepts/#poison-pills-and-dead-letters)
## Introduction
In today's distributed systems and microservices architectures, messaging services play a crucial role in ensuring reliable communication and coordination between different components. These services enable the asynchronous exchange of messages, providing a wide range of benefits, such as increased performance, improved fault tolerance, and enhanced scalability.
This article will provide an overview of the various types of messaging services available, including general-purpose message queues, pub/sub messaging, stream processing, brokerless messaging, and database-as-queue systems. We will also explore key concepts like delivery guarantees, message ordering, parallelism, poison pills, and dead letters, which are essential to understanding how messaging services function and how they can be effectively utilized.
### Types of Messaging services:
In this section, we will explore various types of messaging services, each designed to address different requirements and use cases in distributed systems.
1. **General-purpose message queue:** General-purpose message queues are versatile and can be used in a variety of scenarios, from distributing tasks and buffering requests to enabling communication between microservices. These messaging systems are designed to provide reliable message delivery and ensure that messages are processed in the correct order, and they typically handle volumes of up to 100,000 messages per second. Message queues often support multiple messaging patterns, such as point-to-point and publish-subscribe, providing flexibility for different use cases. Examples of general-purpose message queues include [RabbitMQ](https://www.rabbitmq.com/), [ActiveMQ](https://activemq.apache.org/), and [Amazon SQS](https://aws.amazon.com/sqs/). By using these message queues, developers can decouple their applications and scale them independently, improving overall system resilience and performance.
2. **Pub/Sub messaging:** Publish-Subscribe (Pub/Sub) messaging services allow publishers to send messages to multiple subscribers without direct point-to-point connections. This enables decoupling of producers and consumers, making the system more scalable and fault-tolerant. Pub/Sub systems are particularly useful in scenarios where multiple consumers need to receive and process the same messages, such as sending notifications, logging, or data replication. The Pub/Sub model supports dynamic subscription management, allowing consumers to subscribe and unsubscribe from specific topics or channels at runtime. Examples of Pub/Sub messaging services include [Google Cloud Pub/Sub](https://cloud.google.com/pubsub), [Apache Pulsar](https://pulsar.apache.org/), [Azure Event Grid](https://azure.microsoft.com/en-us/products/event-grid), [AWS SNS](https://aws.amazon.com/sns/) and [NATS](https://nats.io/). By adopting a Pub/Sub messaging system, developers can create event-driven architectures, reduce system complexity, and streamline the integration of new services.
3. **Stream processing:** Stream processing services are designed to handle large volumes of real-time data (say 1 million messages per second), allowing continuous processing and analysis of data streams. These services enable complex event processing, time-windowed aggregations, and stateful transformations. They provide a robust platform for building real-time analytics, monitoring, and alerting applications. Stream processing systems often use a combination of in-memory and disk-based storage to balance performance and durability. They also support horizontal scaling, allowing developers to process massive data volumes with low latency. Examples of stream processing services include [Apache Kafka](https://kafka.apache.org/), [Amazon Kinesis Data Streams](https://aws.amazon.com/kinesis/data-streams/), [Azure Event Hubs](https://azure.microsoft.com/en-us/products/event-hubs), [RocketMQ](https://rocketmq.apache.org/), [Apache Pulsar](https://pulsar.apache.org/) and [Redis Streams](https://redis.io/docs/data-types/streams/). By leveraging stream processing services, organizations can gain valuable insights from their data in real-time and make data-driven decisions more effectively.
4. **Brokerless:** Brokerless messaging systems enable direct communication between producers and consumers without relying on a central broker, thereby reducing latency and improving performance. In brokerless systems, nodes communicate with each other using a peer-to-peer architecture, which can simplify deployment and reduce the need for dedicated message broker infrastructure. These systems are particularly suitable for high-performance, low-latency applications or situations where network connectivity is intermittent or unreliable. Examples of brokerless messaging systems include [ZeroMQ](https://zeromq.org/), [nanomsg](https://nanomsg.org/), [Chronicle Queue](https://github.com/OpenHFT/Chronicle-Queue) and [the Disruptor](https://lmax-exchange.github.io/disruptor/). By adopting a brokerless messaging system, developers can build lightweight, fast, and efficient communication channels between components in their distributed systems.
5. **Database-as-queue:** In some cases, a traditional relational or NoSQL database can be used as a message queue, allowing for simpler integration with existing systems and providing familiar tools for data management. This approach can be particularly useful for smaller-scale applications or as a transitional step when migrating from monolithic to distributed architectures. Using a database-as-queue can leverage built-in database features like transactions, indexing, and querying to manage messages effectively. Examples of using a database-as-queue include PostgreSQL's LISTEN/NOTIFY feature or leveraging Amazon DynamoDB Streams. A database-as-queue might not provide the same performance and scalability as dedicated messaging services, and it is sometimes considered an anti-pattern, but it can be a suitable option for specific use cases or when the application requirements are less demanding.
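
The database-as-queue idea can be sketched with a single table and a guarded update. This is a minimal illustration using Python's built-in `sqlite3` module, not a production design; the `jobs` table, its `status` column, and the function names are assumptions made for this example.

```python
import sqlite3


def create_queue(conn):
    # Hypothetical schema: one row per message, with a status column
    # that tracks where the message is in its lifecycle.
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "status TEXT NOT NULL DEFAULT 'pending')"
        )


def enqueue(conn, payload):
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_next(conn):
    """Claim the oldest pending job, or return None if the queue is empty."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:
            # The status check in the WHERE clause means only one worker
            # can win the claim, even if several selected the same row.
            cursor = conn.execute(
                "UPDATE jobs SET status = 'processing' "
                "WHERE id = ? AND status = 'pending'",
                (row[0],),
            )
        if cursor.rowcount == 1:
            return row
        # Another worker claimed it first; look for the next pending job.


def complete(conn, job_id):
    with conn:
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

A worker would call `claim_next` in a loop, process the returned payload, and then call `complete`. Polling like this is one reason the approach tends to scale less well than a dedicated messaging service.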
### Comparison
| | Performance | Scalability | Flexibility | Complexity | Functionality | Ease of Use |
|------------------------|--------------------|--------------------|----------------------|---------------------|----------------------|----------------------|
| General-purpose MQ | Moderate | Moderate | High | Moderate | High | Moderate |
| Pub/Sub | Moderate to High | High | High | Moderate | Moderate to High | Moderate to High |
| Stream processing | High | High | Moderate to High | High | High | Moderate |
| Brokerless | High | Moderate | Moderate | Low to Moderate | Moderate | High |
| Database-as-queue | Low to Moderate | Low to Moderate | Moderate | Low | Low to Moderate | High |

@ -0,0 +1,63 @@
# Key Concepts
Let's look at some of the key concepts that come up when we talk about messaging systems.
### Delivery guarantees
One of the essential aspects of messaging services is ensuring that messages are delivered to their intended recipients. Different systems offer varying levels of delivery guarantees, and it is crucial to understand these guarantees to choose the right messaging service for your needs.
* **at-most-once-delivery** This guarantee ensures that a message is delivered at most once to its intended recipient. In other words, messages may be lost, but they will never be delivered more than once. This approach is suitable for scenarios where message loss is tolerable and duplication is not desired.
* **at-least-once-delivery** Under this guarantee, a message will be delivered to its intended recipient at least once, but it may be delivered multiple times in case of failures. This approach is appropriate for situations where message loss is unacceptable, but duplication can be managed by the recipient.
* **exactly-once-delivery** This guarantee ensures that a message is delivered exactly once to its intended recipient, with no loss or duplication. This is the most stringent level of delivery guarantee and is suitable for applications where both message loss and duplication are unacceptable. However, it's important to note that it is challenging, if not impossible, for any messaging system to guarantee exactly-once delivery due to the inherent complexities and potential failures in distributed systems.
Selecting the right delivery guarantee depends on the specific requirements of your application. For example, financial transactions need exactly-once semantics to avoid double-processing of payments or missed transactions; because true exactly-once delivery is so hard to guarantee, this is usually approximated with at-least-once delivery combined with processing that tolerates duplicates. In contrast, a log monitoring system may only require at-most-once delivery to reduce system overhead.
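
One common way to live with at-least-once delivery is to make the consumer idempotent, so that a redelivered message has no additional effect. The sketch below assumes every message carries a unique ID; the `IdempotentConsumer` class and its in-memory `_seen` set are hypothetical, and a real system would keep the processed IDs in durable storage.

```python
class IdempotentConsumer:
    """Wraps a handler so duplicate deliveries are processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only, lost on restart

    def receive(self, message_id, payload):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can be retried by the messaging system.
        self._seen.add(message_id)
        return True
```

For example, if a payment message with ID `m1` is delivered twice, the handler that credits the account runs only for the first delivery.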
### Messages ordering and parallelism
Ensuring the correct order of messages and processing them in parallel can be a challenge in distributed messaging systems. The following strategies help maintain order and ensure parallelism:
* **Strict ordering**: In some cases, maintaining strict order is essential, such as when processing financial transactions. This may require additional overhead, such as sequencing numbers, buffering, and reordering messages.
* **Partial ordering**: Partial ordering can be used when only a subset of messages must be ordered. For example, messages within a specific group or partition must be processed in order, but messages between groups or partitions can be processed independently.
* **Unordered processing**: In some scenarios, processing messages in any order is acceptable. This approach reduces complexity and enables higher parallelism, improving overall system performance.
Strategies to maintain order and ensure parallelism:
Partitioning messages by a key, using sequencing numbers, and buffering can help maintain order while still allowing parallel processing. It is crucial to strike the right balance between ordering requirements and parallelism to optimize system performance.
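
Partitioning by key can be illustrated as follows. This is a simplified sketch: the function names are made up for this example, and CRC32 is used only because it is a stable hash available in the standard library, not because any particular messaging system uses it.

```python
import zlib


def partition_for(key, num_partitions):
    # A stable hash keeps a given key on the same partition across runs.
    # (Python's built-in hash() for strings is randomized per process.)
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Distribute (key, payload) pairs across partitions, keeping arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in messages:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions
```

All messages with the same key land in the same partition in their original order, so each partition can be handled by its own consumer in parallel while ordering is preserved per key.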
### Fan Out / In
Fan out and fan in are two crucial concepts in messaging systems that deal with the distribution of messages among multiple consumers and the aggregation of messages from multiple producers, respectively.
#### Fan Out
Fan out is a pattern where a single message is sent to multiple consumers, ensuring that each consumer receives a copy of the message. This can be achieved using the Publish-Subscribe (Pub/Sub) messaging pattern or, in a message broker like RabbitMQ, by binding multiple queues to the same exchange (for example, a fanout exchange) so that each queue receives its own copy. Fan out is useful in scenarios where multiple services or applications need to process the same messages independently, such as sending notifications to multiple subscribers or replicating data across multiple databases.
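
The following in-memory sketch shows the core of fan out: every subscriber gets its own queue, and each published message is copied into all of them. The `FanOutBroker` class is a toy written for illustration and does not model any real broker's API.

```python
from collections import defaultdict, deque


class FanOutBroker:
    """Toy broker that copies each published message to every subscriber."""

    def __init__(self):
        # topic -> {subscriber name: that subscriber's private queue}
        self._subscribers = defaultdict(dict)

    def subscribe(self, topic, name):
        subscriber_queue = deque()
        self._subscribers[topic][name] = subscriber_queue
        return subscriber_queue

    def publish(self, topic, message):
        """Deliver the message to all subscribers and return how many received it."""
        queues = self._subscribers[topic].values()
        for subscriber_queue in queues:
            subscriber_queue.append(message)
        return len(queues)
```

Because each subscriber reads from its own queue, a slow consumer does not prevent the others from making progress.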
#### Fan In
Fan in is a pattern where messages from multiple producers are aggregated and processed by a single consumer or a group of consumers. This can be achieved by using a message broker with multiple producers sending messages to a shared queue, which is then consumed by one or more consumers. Fan in is beneficial when you need to centralize processing or consolidate data from multiple sources, such as aggregating logs from various services or combining sensor data from multiple IoT devices.
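
Fan in can be sketched with several producer threads writing to one shared queue that a single consumer drains. The example uses Python's standard `queue` and `threading` modules; the function names and the shape of the `sources` argument are assumptions for this illustration.

```python
import queue
import threading


def producer(name, readings, shared):
    # Each producer tags its messages with its own name.
    for value in readings:
        shared.put((name, value))


def fan_in(sources):
    """Collect messages from all producers into one list via a shared queue."""
    shared = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(name, readings, shared))
        for name, readings in sources.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # All producers have finished, so the queue can be drained safely.
    collected = []
    while not shared.empty():
        collected.append(shared.get())
    return collected
```

Messages from different producers may interleave in any order, but messages from a single producer keep their relative order in the shared queue.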
### Poison Pills and Dead Letters
In messaging systems, poison pills are problematic messages that can cause failures or crashes in the message processing pipeline. To handle these messages, messaging services often employ Dead Letter Queues (DLQs).
A poison pill is a message that cannot be processed due to various reasons, such as invalid format, missing information, or incorrect data. When a consumer encounters a poison pill, it must handle it gracefully to avoid crashing or getting stuck in an infinite processing loop.
A Dead Letter Queue (DLQ) is a separate queue used to store poison pills or messages that could not be processed successfully. Instead of discarding problematic messages, they are redirected to a DLQ, allowing engineers to analyze and resolve the issues.
To handle poison pills effectively, you can implement error handling and monitoring, set up retries with backoff policies, and use DLQs for further analysis and resolution. Regularly monitor the DLQ, identify patterns causing poison pills, and implement fixes in the message processing pipeline to prevent future issues.
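
Retries with backoff and a dead letter queue can be combined roughly as shown here. This is a sketch under simplifying assumptions: the DLQ is a plain list, every exception is treated as retryable, and the `process_with_dlq` function with its `max_attempts`, `base_delay`, and `sleep` parameters is hypothetical.

```python
import time


def process_with_dlq(messages, handler, max_attempts=3, base_delay=0.1,
                     sleep=time.sleep):
    """Process messages, retrying failures and dead-lettering poison pills."""
    processed = []
    dead_letters = []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                processed.append(handler(message))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up and keep the message plus context for analysis.
                    dead_letters.append(
                        {"message": message, "error": repr(exc),
                         "attempts": attempt}
                    )
                else:
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** (attempt - 1))
    return processed, dead_letters
```

The poison pill is moved aside after the final attempt, so the messages behind it are still processed instead of being blocked indefinitely.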
### Messaging Patterns
#### Point-to-Point (Queue-based)
In this pattern, messages are sent from a single producer to a single consumer via a message queue. The message is consumed by only one consumer, even if there are multiple consumers listening to the queue. This pattern ensures that the message is processed by a single consumer, making it suitable for scenarios where messages must be processed in sequence or by specific consumers.
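
The point-to-point behaviour with several listeners on one queue can be sketched with worker threads. The `run_competing_consumers` function and the `STOP` sentinel are illustrative names; which consumer receives which message depends on thread scheduling and is not deterministic.

```python
import queue
import threading

STOP = object()  # sentinel telling a consumer to exit


def run_competing_consumers(messages, num_consumers):
    """Return, per consumer, the list of messages that consumer handled."""
    work = queue.Queue()
    handled = [[] for _ in range(num_consumers)]

    def consume(index):
        while True:
            item = work.get()  # each item is handed to exactly one consumer
            if item is STOP:
                break
            handled[index].append(item)

    threads = [
        threading.Thread(target=consume, args=(i,))
        for i in range(num_consumers)
    ]
    for thread in threads:
        thread.start()
    for message in messages:
        work.put(message)
    for _ in threads:
        work.put(STOP)  # one sentinel per consumer
    for thread in threads:
        thread.join()
    return handled
```

Across all consumers, every message appears exactly once, which is the property that distinguishes this pattern from publish-subscribe.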
#### Publish-subscribe pattern
The publish-subscribe pattern involves a producer sending messages to a topic, and multiple consumers subscribing to that topic to receive the messages. This pattern allows for one-to-many communication, where a single message can be delivered to multiple consumers simultaneously. It is ideal for event-driven architectures and applications that require real-time updates or notifications.

@ -53,6 +53,10 @@ nav:
- Introduction: level101/databases_nosql/intro.md
- Key Concepts: level101/databases_nosql/key_concepts.md
- Conclusion: level101/databases_nosql/further_reading.md
- Message Queue:
- Introduction: level101/messagequeue/intro.md
- Key Concepts: level101/messagequeue/key_concepts.md
- Conclusion: level101/messagequeue/further_reading.md
- Big Data:
- Introduction: level101/big_data/intro.md
- Evolution and Architecture of Hadoop: level101/big_data/evolution.md
