How to Implement SkipList

November 21, 2021 · 705 words · 4 min · DataStructure SkipList

Some time ago, I decided to implement a simple LSM storage engine model. As part of that, I implemented a basic SkipList and a BloomFilter backed by a BitSet. However, due to work demands and after-hours laziness, the project was put on hold. Now that I’m thinking about it again, I realize I’ve forgotten some of the details, so I’m writing them down for future reference. What is a SkipList? A SkipList is an ordered data structure that can be seen as an alternative to balanced trees.

Kylin Overview

November 10, 2021 · 803 words · 4 min · Thesis OLAP DB Distributed System Differential Privacy Kylin

Previously, I was hoping to work on an interesting thesis, but I couldn’t find a suitable advisor nearby. I did find a good advisor before the college started topic selection, but it turned out they couldn’t take me on. In any case, I wasn’t that interested in that advisor’s field, so I decided to look for something else. Recently, the college’s thesis selection process started, and I found an interesting topic in the list.

DFS-Haystack

October 6, 2021 · 1284 words · 7 min · DFS Paper Reading Distributed System

The primary project in my group is a distributed file system (DFS) that provides POSIX file system semantics. Its approach to handling “lots of small files” (LOSF) is inspired by Haystack, which is specifically designed for small files. I decided to read through the Haystack paper and take some notes as a learning exercise. These notes are not an in-depth analysis of specific details but rather a record of my thoughts on the problem and design approach.

MIT6.824 Bigtable

September 16, 2021 · 1908 words · 9 min · Paper Reading MIT6.824 DFS Distributed System

I recently found a translated version of the Bigtable paper online and saved it, but hadn’t gotten around to reading it. Lately, I’ve noticed that Bigtable shares many design similarities with a current project in our group, so I took some time over the weekend to read through it. This is the last of Google’s three foundational distributed systems papers, and although it wasn’t originally part of the MIT6.824 reading list, I’ve categorized it here for consistency.

MIT6.824 GFS

September 9, 2021 · 1121 words · 6 min · GFS MIT6.824 Paper Reading

This article introduces the Google File System (GFS) paper published in 2003, which proposed a distributed file system designed to store large volumes of data reliably, meeting Google’s data storage needs. This write-up reflects on the design goals, trade-offs, and architectural choices of GFS. Introduction: GFS is a distributed file system developed by Google to meet the needs of data-intensive applications, using commodity hardware to provide a scalable and fault-tolerant solution.