March 10, 2025 · 881 words · 5 min
·
Performance
Profiling
What: What is CPU Profiling? A technique for analyzing a program's CPU performance. By collecting detailed data during program execution (such as function call frequency, time consumption, call stacks, …
February 8, 2025 · 502 words · 3 min
·
LevelDB
MVCC
Storage
LevelDB implements concurrent sstable read/write operations and snapshot reads through MVCC. Let’s examine its implementation.
Sequence Number: LevelDB uses sequence numbers as logical clocks to …
December 31, 2024 · 4802 words · 10 min
·
Prometheus
TSDB
Storage
Having recently been promoted, I took a moment to summarize some of my previous work. A significant part of my job was building large-scale database observability systems, which are quite different from …
February 19, 2024 · 557 words · 3 min
·
Borg
Kubernetes
Cluster Management
Paper Reading
Borg is a cluster management system, roughly a closed-source counterpart to Kubernetes (k8s).
It achieves high utilization through admission control, efficient task packing, overcommitment, machine …
September 28, 2023 · 1135 words · 3 min
·
Distributed System
Transaction
Paper Reading
It has been a while since I last studied, and I wanted to learn something interesting. This time, I’ll be covering Percolator, a distributed transaction system. I won’t translate the paper …
August 1, 2023 · 425 words · 2 min
·
Distributed System
Storage
Dynamo is an old paper from AWS. The system has been on the market for a long time, and the architecture has likely evolved since the paper’s publication. Despite this, the paper was selected as one of the …
August 1, 2023 · 524 words · 2 min
·
Distributed System
Database
Cloud-Native
MIT6.824
This article introduces the design considerations of AWS’s database product, Aurora, including storage-compute separation, single-writer multi-reader architecture, and quorum-based NRW …
February 8, 2023 · 463 words · 3 min
·
Distributed System
MIT6.824
Chain Replication
This post provides a brief overview of the Chain Replication (CR) paper, which introduces a simple but effective algorithm for providing linearizable consistency in storage services. For those …
January 3, 2023 · 638 words · 3 min
·
Distributed System
MIT6.824
ZooKeeper
This article mainly discusses the design and practical considerations of the ZooKeeper system, such as wait-free and lock mechanisms, consistency choices, system-provided APIs, and specific semantic …
October 10, 2022 · 1056 words · 5 min
·
Big Data
Lakehouse
Stream Compute
Storage
The Iceberg community provides an official Flink Connector, and this chapter’s source code analysis is based on that.
Overview of the Write Submission Process: Flink writes data through RowData …
May 10, 2022 · 712 words · 4 min
·
LevelDB
LSM
Storage
This is the second chapter of my notes on reading the LevelDB source code, focusing on the write flow of LevelDB. This article is not a step-by-step source code tutorial, but rather a learning note …
April 15, 2022 · 1039 words · 5 min
·
Raft
Distributed System
MIT6.824
Earlier, I looked at the code of Casbin-Mesh because I wanted to try GSoC. Casbin-Mesh is a distributed Casbin application based on Raft. The RaftKV in MIT6.824 is quite similar, so I took the …
April 9, 2022 · 1312 words · 7 min
·
LevelDB
LSM
Storage
This is the first chapter of my notes on reading the LevelDB source code, focusing on the startup process of LevelDB. This article is not a step-by-step source code tutorial, but rather a learning …
February 21, 2022 · 953 words · 5 min
·
Paper Reading
Consensus
Distributed System
MIT6.824
I finally managed to complete Lab 02 during this winter break, after it had been on hold for quite some time. I was stuck on one of the cases in Test 2B for a while. During the winter break, I revisited …
November 21, 2021 · 705 words · 4 min
·
Data Structure
SkipList
Some time ago, I decided to implement a simple LSM storage engine model. As part of that, I implemented a basic SkipList and BloomFilter with BitSet. However, due to work demands and after-hours …
October 6, 2021 · 1284 words · 7 min
·
DFS
Paper Reading
Distributed System
The primary project in my group is a distributed file system (DFS) that provides POSIX file system semantics. The approach to handling “lots of small files” (LOSF) is inspired by Haystack, …
September 16, 2021 · 1908 words · 9 min
·
Paper Reading
MIT6.824
DFS
Distributed System
I recently found a translated version of the Bigtable paper online and saved it, but hadn’t gotten around to reading it. Lately, I’ve noticed that Bigtable shares many design similarities …
September 9, 2021 · 1121 words · 6 min
·
GFS
MIT6.824
Paper Reading
This article introduces the Google File System (GFS) paper published in 2003, which proposed a distributed file system designed to store large volumes of data reliably, meeting Google’s data …
August 15, 2021 · 834 words · 4 min
·
OS
Linux
Network
IO
Let’s start with epoll.
epoll is an I/O event notification mechanism in the Linux kernel, designed to replace select and poll. It aims to efficiently handle large numbers of file descriptors and …
May 2, 2021 · 566 words · 3 min
·
CPU
Cache
Performance
The motivation for this post comes from an interview question I was asked: What is CPU false sharing?
CPU Cache: Let’s start by discussing CPU cache.
CPU cache is a type of storage medium …
February 21, 2021 · 564 words · 3 min
·
Network
HTTPS
HTTP
HTTPS (HTTP over SSL) was introduced to address the security vulnerabilities of HTTP, such as eavesdropping and identity spoofing. It uses SSL or TLS to encrypt communication between the client and …
January 22, 2021 · 1541 words · 8 min
·
MIT6.824
Distributed System
Paper Reading
The third year of university has been quite intense, leaving me with little time to continue my studies on 6.824, so my progress stalled at Lab 1. With a bit more free time during the winter break, I …