Blogs

CPU Profiling: What, How, and When

March 10, 2025 · 881 words · 5 min · Performance Analysis

What: What is CPU Profiling? CPU profiling is a technique for analyzing how a program uses the CPU. By collecting detailed data during program execution (such as function call frequency, time consumption, and call stacks), it helps developers identify performance bottlenecks and optimize code efficiency. It is typically used for performance analysis and root cause diagnosis. How: How Profiling Data is Collected. Common tools like perf collect a process's stack information. These tools rely on statistical sampling: they periodically capture the call stack currently executing on the CPU, and the accumulated samples are then used for performance analysis.
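
To make the sampling idea concrete, here is a minimal sketch of how captured stack samples might be aggregated into per-function counts. It is not how perf works internally; the `aggregate_samples` function and the sample data are hypothetical, and each sample is assumed to be a tuple of function names ordered from root to leaf.

```python
from collections import Counter


def aggregate_samples(samples):
    """Aggregate stack samples into self and total counts per function.

    samples: iterable of stacks, each a tuple of function names,
    root first and leaf (the function on-CPU) last. Hypothetical format.
    """
    self_counts = Counter()   # samples where the function was on-CPU
    total_counts = Counter()  # samples where the function was anywhere on the stack
    for stack in samples:
        if not stack:
            continue
        self_counts[stack[-1]] += 1
        # Use a set so recursive functions are counted once per sample.
        for fn in set(stack):
            total_counts[fn] += 1
    return self_counts, total_counts


# Made-up samples, as if captured at a fixed sampling interval.
samples = [
    ("main", "parse", "read"),
    ("main", "parse", "read"),
    ("main", "compute"),
    ("main", "parse"),
]
self_counts, total_counts = aggregate_samples(samples)
for fn, n in total_counts.most_common():
    print(f"{fn}: total={n} self={self_counts[fn]}")
```

With enough samples, a function's share of the self count approximates the share of CPU time spent directly in it, which is why sampling can locate hot spots without instrumenting every call.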

LevelDB MVCC

February 8, 2025 · 502 words · 3 min · LevelDB MVCC

LevelDB implements concurrent sstable read/write operations and snapshot reads through MVCC. Let’s examine its implementation. Sequence Number: LevelDB uses Sequence Numbers as logical clocks to maintain a total order of KV write operations. The Sequence Number is encoded in the last few bytes of the InternalKey. This encoding ensures data ordering during memory writes. Versioning: Every change to the sstable file collection triggers a version upgrade in LevelDB. Each Version represents the database state at a specific moment, containing sstable metadata and compaction-related information.
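
The InternalKey encoding can be sketched roughly as follows. This is a Python illustration rather than LevelDB's C++ code, and it assumes the layout I understand LevelDB to use: the user key followed by an 8-byte little-endian trailer holding `(sequence << 8) | type`. The helper names (`pack_internal_key`, `snapshot_get`, and so on) are hypothetical.

```python
import struct

# Value types, assumed to match LevelDB's deletion/value tags.
TYPE_DELETION = 0
TYPE_VALUE = 1


def pack_internal_key(user_key: bytes, seq: int, value_type: int) -> bytes:
    """Append an 8-byte trailer: 56-bit sequence number plus 8-bit type."""
    return user_key + struct.pack("<Q", (seq << 8) | value_type)


def unpack_internal_key(ikey: bytes):
    """Split an internal key into (user_key, sequence, value_type)."""
    trailer = struct.unpack("<Q", ikey[-8:])[0]
    return ikey[:-8], trailer >> 8, trailer & 0xFF


def sort_key(ikey: bytes):
    """Order by user key ascending, then by trailer descending (newest first)."""
    user_key, seq, value_type = unpack_internal_key(ikey)
    return (user_key, -((seq << 8) | value_type))


def snapshot_get(entries, user_key: bytes, snapshot_seq: int):
    """Return the newest value visible at snapshot_seq, or None."""
    for ikey, value in sorted(entries, key=lambda e: sort_key(e[0])):
        k, seq, value_type = unpack_internal_key(ikey)
        if k == user_key and seq <= snapshot_seq:
            # The first visible entry wins; a deletion hides older values.
            return value if value_type == TYPE_VALUE else None
    return None


entries = [
    (pack_internal_key(b"a", 1, TYPE_VALUE), b"v1"),
    (pack_internal_key(b"a", 3, TYPE_VALUE), b"v3"),
    (pack_internal_key(b"a", 5, TYPE_DELETION), b""),
]
print(snapshot_get(entries, b"a", 4))
```

Because newer entries for the same user key sort first, a snapshot read only needs to skip entries whose sequence number is greater than the snapshot's and take the first one that remains.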

Prometheus--TSDB

December 31, 2024 · 4802 words · 10 min · Prometheus TSDB

Having recently been promoted, I took a moment to summarize some of my previous work. A significant part of my job was building large-scale database observability systems, which are quite different from cloud-native monitoring solutions like Prometheus. Now, I’m diving into the standard open-source monitoring system. This article mainly discusses the built-in single-node time series database (TSDB) of Prometheus, outlining its TSDB design without delving into source code analysis. Analyzing the

Borg: Large-scale Cluster Management at Google with Borg

February 19, 2024 · 557 words · 3 min · Borg K8s Cluster Management

Borg is a cluster management system, roughly a closed-source counterpart to Kubernetes (k8s). It achieves high utilization through admission control, efficient task packing, overcommitment, machine sharing, and process-level performance isolation. It provides runtime features that reduce failure recovery time for high-availability applications, and scheduling policies that reduce the probability of correlated failures. It offers a declarative job description language, DNS integration, real-time job monitoring, and tools for analyzing and simulating system behavior, simplifying usage for end users.

Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications

September 28, 2023 · 1135 words · 3 min · Distributed System Transaction

It has been a while since I last studied, and I wanted to learn something interesting. This time, I’ll be covering Percolator, a distributed transaction system. I won’t translate the paper or delve into detailed algorithms; I’ll just document my understanding. Percolator and 2PC. 2PC: The Two-Phase Commit (2PC) protocol involves two types of roles: Coordinator and Participant. The coordinator manages the entire process to ensure multiple participants reach a