Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications

September 28, 2023 · 1135 words · 3 min · Distributed System Transaction

It has been a while since I last studied, and I wanted to learn something interesting. This time, I’ll be covering Percolator, a distributed transaction system. I won’t translate the paper or delve into detailed algorithms; I’ll just document my understanding. Percolator and 2PC 2PC The Two-Phase Commit (2PC) protocol involves two types of roles: Coordinator and Participant. The coordinator manages the entire process to ensure multiple participants reach a

Dynamo: Amazon’s Highly Available Key-value Store

August 1, 2023 · 425 words · 2 min · Distributed System Storage

An old paper by AWS, Dynamo has been in the market for a long time, and the architecture has likely evolved since the paper’s publication. Despite this, the paper was selected as one of the SIGMOD best papers of the year, and there are still many valuable lessons to learn. Design Dynamo is a NoSQL product that provides a key-value storage interface. It emphasizes high availability rather than consistency, which leads to differences in architectural design and technical choices compared to other systems.

MIT6.824 AuroraDB

August 1, 2023 · 524 words · 2 min · Distributed System Database Cloud-Native MIT6.824

This article introduces the design considerations of AWS’s database product, Aurora, including storage-compute separation, single-writer multi-reader architecture, and quorum-based NRW consistency protocol. The article also mentions how PolarDB was inspired by Aurora, with differences in addressing network bottlenecks and system call overhead. Aurora is a database product provided by AWS, primarily aimed at OLTP business scenarios. In terms of design, there are several aspects worth noting: The design premise of

MIT6.824 Chain Replication

February 8, 2023 · 463 words · 3 min · Distributed System MIT6.824 ChainReplication

This post provides a brief overview of the Chain Replication (CR) paper, which introduces a simple but effective algorithm for providing linearizable consistency in storage services. For those interested in the detailed design, it’s best to refer directly to the original paper. Introduction In short, the Chain Replication (CR) paper presents a replicated state machine algorithm designed for storage services that require linearizable consistency. It uses a chain replication method to improve throughput and relies on multiple replicas to ensure service availability.

MIT6.824-ZooKeeper

January 3, 2023 · 399 words · 2 min · Distributed System MIT6.824 ZooKeeper

This article mainly discusses the design and practical considerations of the ZooKeeper system, such as wait-free and lock mechanisms, consistency choices, system-provided APIs, and specific semantic decisions. These trade-offs are the most insightful aspects of this article. Positioning ZooKeeper is a wait-free, high-performance coordination service for distributed applications. It supports the coordination needs of distributed applications by providing coordination primitives (specific APIs and data models). Design Keywords There are two key phrases in ZooKeeper’s positioning: high performance and distributed application coordination service.