MySQL Index Overview

March 21, 2021 · 516 words · 3 min · DB MySQL

Database indexes are sorted data structures that a DBMS maintains so it can locate rows quickly when querying and updating data. Common data structures for building indexes include B-trees, B+ trees, and hash tables. MySQL (in its default InnoDB storage engine) uses B+ trees to build indexes. The reason for this choice is that in a B+ tree only leaf nodes store data, while non-leaf nodes store only keys, so each node can hold more keys and the tree stays shallow.
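
To illustrate why a sorted structure speeds up lookups, here is a minimal sketch in Python. It is not how MySQL implements B+ trees; it only mimics the sorted leaf level with a sorted list and binary search. The table contents and the names `rows`, `age_index`, and `find_by_age_range` are made up for illustration.

```python
import bisect

# Hypothetical table: row id -> (name, age). All values are invented.
rows = {
    1: ("alice", 30),
    2: ("bob", 25),
    3: ("carol", 35),
    4: ("dave", 25),
}

# A secondary "index" on age: (age, row_id) pairs kept in sorted order,
# loosely mirroring the sorted leaf level of a B+ tree.
age_index = sorted((age, row_id) for row_id, (_, age) in rows.items())


def find_by_age_range(low, high):
    """Return row ids with low <= age <= high using binary search."""
    start = bisect.bisect_left(age_index, (low, float("-inf")))
    end = bisect.bisect_right(age_index, (high, float("inf")))
    return [row_id for _, row_id in age_index[start:end]]
```

Because the entries are sorted, a range query needs two binary searches and one contiguous slice instead of a scan over every row.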

HTTPS Introduction

February 21, 2021 · 564 words · 3 min · Network HTTPS HTTP

HTTPS (HTTP over SSL/TLS) was introduced to address the security weaknesses of HTTP, such as eavesdropping and identity spoofing. It uses SSL or TLS to encrypt communication between the client and the server. Problems with HTTP: Communication uses plain text, making it susceptible to eavesdropping. The identity of the communicating party cannot be verified, making it vulnerable to spoofing and unable to reject meaningless requests (e.g., Denial of Service attacks). Message integrity cannot be guaranteed, making it possible for messages to be altered (e.

MIT6.824-MapReduce

January 22, 2021 · 1541 words · 8 min · MIT6.824 Distributed System Paper Reading

The third year of university has been quite intense, leaving me with little time to continue my studies on 6.824, so my progress stalled at Lab 1. With a bit more free time during the winter break, I decided to pick it up again. Each paper and lab will be recorded in these notes. This is the first chapter of my Distributed System study notes. About the Paper: The core of the paper is the MapReduce distributed computing model and an approach to implementing a distributed MapReduce system, including the master's data structures, fault tolerance, and several refinements.
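
As a rough illustration of the MapReduce model, here is a single-process word-count sketch in Python. It is not the paper's distributed implementation and not the lab code: there is no master, no workers, and no fault tolerance. The function names (`map_fn`, `shuffle`, `reduce_fn`, `run_word_count`) are hypothetical.

```python
from collections import defaultdict


def map_fn(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word, 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def run_word_count(documents):
    """Run map, shuffle, and reduce sequentially in one process."""
    intermediate = []
    for doc in documents:
        intermediate.extend(map_fn(doc))
    return dict(reduce_fn(k, v) for k, v in shuffle(intermediate).items())
```

In a real deployment the map and reduce calls would run on many machines, with the grouping step handled by the framework.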

Chinese Spam Email Classification Based on Naive Bayes

May 6, 2020 · 897 words · 2 min · ML

Training and Testing Data: This project primarily uses open-source data from GitHub. Data Processing: First, we use regular expressions to filter the content of the Chinese emails in the training set, removing all non-Chinese characters. The remaining content is then segmented into words using jieba, and stopwords are filtered out using a Chinese stopword list. The processed results for spam and normal emails
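
The preprocessing steps described above could be sketched roughly as follows. This is an assumption-laden outline, not the project's code: the character range, the function names, and the `tokenize` parameter are all hypothetical. To stay dependency-free, the tokenizer is passed in by the caller; the project uses jieba for that step.

```python
import re

# Assumption: "Chinese characters" means the CJK Unified Ideographs block.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]+")


def keep_chinese(text):
    """Remove every character outside the assumed Chinese range."""
    return NON_CHINESE.sub("", text)


def remove_stopwords(tokens, stopwords):
    """Drop tokens that appear in the stopword collection."""
    return [token for token in tokens if token not in stopwords]


def preprocess(text, tokenize, stopwords):
    """Filter non-Chinese characters, tokenize, then remove stopwords.

    `tokenize` is any callable mapping a string to a list of tokens.
    """
    return remove_stopwords(tokenize(keep_chinese(text)), stopwords)
```

For a quick check without a segmentation library, `preprocess("Hello 你好, 世界123!", list, {"好"})` splits the filtered text into single characters and drops the stopword.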

Java Multithreading Programming

November 1, 2019 · 1626 words · 8 min · JAVA

Yesterday evening, while revisiting the book “Advanced Java: Multithreading and Parallel Programming” by Liang Yung, I thought it would be a good opportunity to document my understanding. Java provides built-in support for multithreading. A thread is a single sequential flow of control within a process, and multiple threads can run concurrently within a process, each performing a different task. Multithreading is a specialized form of multitasking that typically consumes fewer resources than running separate processes, because threads share their process's memory.