Yunpeng's Blog

Re-understand JavaScript

2019-06-18T05:56:39.000Z

In this post, we will look at a few interesting (but potentially challenging) JavaScript questions.

Most of them actually test the so-called "downside" of JavaScript. You certainly should not write such code in a real-world codebase. As I have repeated many times, code should be clear, precise and concise, in that order of importance. Nevertheless, they are good questions to test your competency in JavaScript.

9 or 10?

You are given a function called magic_length, which is defined as follows:

function magic_length(input) {
    return input.length == 10 && input == ",,,,,,,,,";
}

Please give one possible value of input such that magic_length(input) returns true. Note: input should be of a basic data type provided by the built-in libraries.
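One candidate answer, offered as a sketch rather than the only solution: an array created with new Array(10) has a length of 10, and loose equality (==) converts it to a string by joining its ten empty slots with commas, which gives exactly nine commas. The variable name candidate is only for illustration.

```javascript
// Same function as in the question above.
function magic_length(input) {
    return input.length == 10 && input == ",,,,,,,,,";
}

// An array with ten empty slots: its length is 10, and its string form
// joins ten empty elements with nine commas.
const candidate = new Array(10);

console.log(candidate.length);         // 10
console.log(String(candidate).length); // 9 (nine commas)
console.log(magic_length(candidate));  // true
```

This relies on the implicit type conversion performed by ==, which is one reason such comparisons are best avoided in real code.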

Introducing your Guide to be AWS Certified

2019-06-15T12:34:12.000Z

Hi guys, it has been a long time since my last post. In the past few months, I have been preparing for the AWS Certified Solutions Architect – Associate examination, which is part of the AWS certification examination series.

This examination (and its siblings) focuses on some cloud computing concepts, as well as a lot of details specific to the services provided by AWS. To prepare for this examination, there are a few important learning resources:

Official Examination Guide and sample questions;
AWS whitepapers, which are collected and available at https://aws.amazon.com/whitepapers/;
AWS Training Learn Library, which currently offers a free subscription;
Some online courses, of which A Cloud Guru is one of the most popular providers.

Consistency between Redis Cache and SQL Database

2019-05-04T09:02:18.000Z

Nowadays, Redis has become one of the most popular cache solutions in the Internet industry. Although relational database systems (SQL) bring many awesome properties such as ACID, maintaining these properties causes the performance of the database to degrade under high load.

In order to fix this problem, many companies & websites have decided to add a cache layer between the application layer (i.e., the backend code which handles the business logic) and the storage layer (i.e., the SQL database). This cache layer is usually implemented using an in-memory cache. This is because, as stated in many textbooks, the performance bottleneck of traditional SQL databases is usually I/O to secondary storage (i.e., the hard disk). As the price of main memory (RAM) has gone down in the past decade, it is now feasible to store (at least part of) the data in main memory to improve performance. One popular choice is Redis.
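The read path through such a cache layer can be sketched as follows. This is a minimal illustration of the idea (often called cache-aside), assuming two hypothetical stand-ins: SlowDatabase plays the role of the SQL database and InMemoryCache plays the role of Redis. Neither is a real client library.

```javascript
// Hypothetical stand-in for the SQL database (the storage layer).
class SlowDatabase {
    constructor(rows) {
        this.rows = new Map(rows);
        this.reads = 0; // counts how often the storage layer is hit
    }
    get(key) {
        this.reads += 1;
        return this.rows.get(key);
    }
}

// Hypothetical stand-in for Redis: an in-memory key-value cache.
class InMemoryCache {
    constructor() {
        this.entries = new Map();
    }
    get(key) {
        return this.entries.get(key);
    }
    set(key, value) {
        this.entries.set(key, value);
    }
}

// Application-layer read: try the cache first; on a miss, read from the
// database and populate the cache for later reads.
function readWithCache(cache, db, key) {
    const cached = cache.get(key);
    if (cached !== undefined) return cached;
    const value = db.get(key);
    if (value !== undefined) cache.set(key, value);
    return value;
}

const db = new SlowDatabase([["user:1", "Alice"]]);
const cache = new InMemoryCache();
readWithCache(cache, db, "user:1"); // miss: goes to the database
readWithCache(cache, db, "user:1"); // hit: served from memory
console.log(db.reads); // 1
```

The sketch covers reads only. Writes are where consistency between the cache and the database becomes the hard part.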

How Query Optimizer Works in RDBMS

2019-02-06T16:09:39.000Z

In a previous post, we discussed how the various relational operators are implemented in relational database systems. If you have read that post, you probably still remember that there are a few alternative implementations for every operator. So how should an RDBMS determine which algorithm (or implementation) to use?

Obviously, to optimize the performance of any query, the RDBMS has to select the correct algorithm based on the query. It would not be desirable to always use the same algorithm. Also, SQL is a declarative language (i.e., as programmers we only declare what we want, not how the system should accomplish the task). Therefore, it would be an anti-pattern if the user of the database system needed to specify which algorithm to use when writing the query. Instead, the user should be able to treat the entire system as a black box. The end user should not care which algorithm is picked, but should simply expect that performance optimization is taken care of.

Understanding How is Data Stored in RDBMS

2019-01-20T13:16:19.000Z

We all know that a DBMS (database management system) is used to store (a massive amount of) data. However, have you ever wondered how data is stored in a DBMS? In this post, we will focus on data storage in RDBMS, the traditional relational database systems.

Physical Storage

Data can be stored on many different kinds of media or devices, from the fastest but costly registers to the slow but cheap hard drives, or even magnetic tapes. Nowadays, IaaS providers such as AWS even provide services such as S3 Glacier as a low-cost archival storage solution. The diagram below shows the memory hierarchy of common devices.

Evaluation & Implementation of Relational Operators

2019-01-05T14:10:55.000Z

This post talks about some basic implementations of relational operators in traditional RDBMS (relational database management systems). It is based on Chapter 14 of the textbook by Raghu Ramakrishnan and Johannes Gehrke.

Below we will talk about the classical evaluation & implementation of relational operators one-by-one, namely:

Selection
Projection
Join, cross product
Set operations (intersection, union, difference)
Grouping & aggregation
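To make the first three operators concrete, here is a minimal in-memory sketch. It treats a relation as an array of plain objects, which is a simplification: a real RDBMS works on disk pages and may use indexes. The function names and sample tables are hypothetical.

```javascript
// Selection: keep only the tuples that satisfy a predicate.
function select(relation, predicate) {
    return relation.filter(predicate);
}

// Projection: keep only the named attributes
// (duplicates are not eliminated in this sketch).
function project(relation, attributes) {
    return relation.map((tuple) =>
        Object.fromEntries(attributes.map((name) => [name, tuple[name]])));
}

// Tuple-based nested loop join: compare every pair of tuples.
function nestedLoopJoin(left, right, predicate) {
    const result = [];
    for (const l of left) {
        for (const r of right) {
            if (predicate(l, r)) result.push({ ...l, ...r });
        }
    }
    return result;
}

const students = [
    { sid: 1, name: "Ann" },
    { sid: 2, name: "Bob" },
];
const enrolments = [
    { sid: 1, course: "CS101" },
    { sid: 1, course: "CS202" },
];

const joined = nestedLoopJoin(students, enrolments, (s, e) => s.sid === e.sid);
const names = project(select(joined, (t) => t.course === "CS202"), ["name"]);
console.log(names); // one tuple, with name "Ann"
```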

Literature Review on Join Reorderability

2018-12-22T08:24:11.000Z

Recently, I was looking at some research papers on join reorderability. To start with, let's understand what we mean by "join reorderability" and why it is important.

Background Knowledge

Here, we are looking at a query optimization problem, specifically join optimization. As mentioned by Benjamin Nevarez, there are two factors in join optimization: selection of a join order and choice of a join algorithm.

As stated in Tan Kian Lee's lecture notes, common join algorithms include iteration-based nested loop join (tuple-based, page-based, block-based), sort-based merge join and partition-based hash join. We should consider a few factors when deciding which algorithm to use: 1) the type of the join predicate (equality vs. non-equality predicate); 2) the sizes of the left vs. right join operands; 3) the available buffer space and access methods.
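To see why the predicate type and operand sizes matter, consider this simplified in-memory hash join. It only works for an equality predicate on a single attribute, and it builds its hash table on the first operand, so that operand should ideally be the smaller one. The partition-based variant would additionally partition both operands so that each piece fits into the available buffer space, which is omitted here. All names and sample data are hypothetical.

```javascript
// Simplified hash join on an equality predicate: build[key] === probe[key].
function hashJoin(build, probe, key) {
    // Build phase: bucket the (ideally smaller) operand by its join key.
    const buckets = new Map();
    for (const tuple of build) {
        const k = tuple[key];
        if (!buckets.has(k)) buckets.set(k, []);
        buckets.get(k).push(tuple);
    }
    // Probe phase: look up each tuple of the other operand in the buckets.
    const result = [];
    for (const tuple of probe) {
        for (const match of buckets.get(tuple[key]) || []) {
            result.push({ ...match, ...tuple });
        }
    }
    return result;
}

const departments = [
    { dept: 10, title: "R&D" },
    { dept: 20, title: "Sales" },
];
const employees = [
    { emp: "Ann", dept: 10 },
    { emp: "Bob", dept: 20 },
    { emp: "Cid", dept: 10 },
    { emp: "Dee", dept: 30 }, // no matching department
];

const rows = hashJoin(departments, employees, "dept");
console.log(rows.length); // 3
```

A non-equality predicate such as a.x < b.y cannot be answered by a hash lookup, which is why the predicate type is the first factor in the list above.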

For a query attempting to join n tables together, we need n - 1 individual joins. Apart from the join algorithm applied to each join, we have to decide in which order these n tables should be joined. We could represent such join queries on multiple tables as a tree. The tree could have different shapes, such as left-deep tree, right-deep tree and bushy tree. The 3 types of trees are compared below on an example of joining 4 tables together.
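The three tree shapes can also be sketched in code. In this illustrative representation (an assumption of this sketch, not a standard notation), a join tree is either a table name or a two-element array [left, right] that stands for one join.

```javascript
// Left-deep tree: the right operand of every join is a base table.
function leftDeep(tables) {
    return tables.slice(1).reduce((tree, table) => [tree, table], tables[0]);
}

// Right-deep tree: the left operand of every join is a base table.
function rightDeep(tables) {
    return tables
        .slice(0, -1)
        .reduceRight((tree, table) => [table, tree], tables[tables.length - 1]);
}

// One possible bushy tree: split the tables in half and join the two halves.
function bushy(tables) {
    if (tables.length === 1) return tables[0];
    const mid = Math.ceil(tables.length / 2);
    return [bushy(tables.slice(0, mid)), bushy(tables.slice(mid))];
}

// Count the join nodes in a tree; leaves (table names) contribute zero.
function countJoins(tree) {
    return Array.isArray(tree) ? 1 + countJoins(tree[0]) + countJoins(tree[1]) : 0;
}

const tables = ["A", "B", "C", "D"];
console.log(JSON.stringify(leftDeep(tables)));  // [[["A","B"],"C"],"D"]
console.log(JSON.stringify(rightDeep(tables))); // ["A",["B",["C","D"]]]
console.log(JSON.stringify(bushy(tables)));     // [["A","B"],["C","D"]]
console.log(countJoins(bushy(tables)));         // 3
```

Whatever the shape, joining 4 tables takes 3 joins, in line with the n - 1 count above.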

Redis Cluster & Common Partition Techniques in Distributed Cache

2018-07-27T05:09:53.000Z

In this post, I will discuss a few common partition techniques in distributed cache. In particular, I will elaborate on my understanding of the use of Redis Cluster.

Please note that at the time of writing, the latest version of Redis is 4.0.10. Many articles on the same topic reach conclusions that differ from this post, mainly because they are probably outdated. In particular, they may refer to the Redis Cluster implementation in Redis 3. Redis Cluster has been improved a lot since Redis 4.

(This article was based on part of my project report. You may want to take a look at the full report here. You may need a valid account to gain access to NUS SoC Digital Library.)

Common Partition Techniques

Here, we refer to horizontal partitioning, which is also known as data sharding. Traditionally, there are 3 approaches to achieve data partitioning, namely, server-side partitioning, cluster proxy, and client-side partitioning.
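As a rough illustration of client-side partitioning, the client itself can hash each key and pick a node from a fixed list. The sketch below uses a simple FNV-1a-style hash over the key's character codes and a modulo over the node count. The node addresses are made up, and this is a generic illustration, not the scheme Redis Cluster itself uses.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function hashKey(text) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
    }
    return hash;
}

// Client-side partitioning: the client decides which node owns a key.
function pickNode(key, nodes) {
    return nodes[hashKey(key) % nodes.length];
}

// Hypothetical node addresses, for illustration only.
const nodes = ["cache-a:6379", "cache-b:6379", "cache-c:6379"];

for (const key of ["user:1", "user:2", "session:42"]) {
    console.log(key, "->", pickNode(key, nodes));
}
```

A plain modulo is easy to reason about, but changing the number of nodes remaps most keys, which is one motivation for the other approaches.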

To Select the Correct Technical Stack for Web

2018-04-29T14:04:05.000Z

When I planned to upgrade the CS1101S DG Website project, selection of the technical stack became a big headache. The current decision is

Backend: Spring Boot 2.x
Frontend: Vue.js 2.x + Bootstrap 4.x (integrated with Vue.js using Bootstrap Vue)

In this post, I would like to present the decision-making process.

What are the possible languages, frameworks?

Certainly, there are many different choices. Let's compare them as follows. Selecting a backend framework essentially means selecting a server-side programming language.

Java (current choice): good for scalability and maintainability, used in many enterprise applications. As a relatively old language, its robustness is beyond doubt.
PHP: also a traditional choice. However, its performance is not as good as Java's (Java is compiled to bytecode that the JVM can JIT-compile, while PHP is parsed into opcodes that are executed by the Zend Engine).
Ruby: a dynamically typed language, which became famous due to Ruby on Rails. You can write less code to achieve more functionality. However, its performance is even worse and its development environment is not trivial to set up.
Node.js: a newer technology than the others. It provides a unified language for both frontend and backend development. It is fast since it leverages the JavaScript event loop to provide non-blocking I/O.
Python: clear and compact syntax that is helpful to developers. Similar to Ruby, it has potential performance issues.

Blogging with Hexo.js

2018-04-11T05:00:17.000Z

As you may already know, this blog is built using Hexo.js with the Next theme. In this post, I will discuss the reasons why I selected this static site generator and this theme.

Why do I select Hexo.js?

I want a blog website that only consists of static webpages. Thus, I cannot use any content management system (CMS) with dynamic pages, like WordPress and Drupal.
- This provides me with more options to host it. For instance, GitHub Pages only supports static webpages.
- Static webpages are generally faster. They do not need any server-side pre-rendering.
It may be a waste of time to write raw HTML, CSS & JavaScript code for every page of the blog. Much of the code can be reused. Thus, I need a framework to help me generate the static webpages.
I want to develop in both Windows and Linux-based environments. This means some programming languages like Ruby may be troublesome. Thus, I will not choose engines like Jekyll.
The body of my blog posts should not be in plain text. I need basic styling of the text. Also, I may insert code snippets to technical posts sometimes.
- Therefore, the framework should support Markdown and/or AsciiDoc.
- I know how to use LaTeX. My slides for my CS1101S classes are all typeset in LaTeX with the Beamer package. However, although LaTeX is very powerful, I have to say its syntax is way too complex.
  - In fact, the Next theme also supports math equation rendering by either MathJax or KaTeX.

Given all the factors mentioned above, I chose Hexo.js in the end.

Hello World

2017-10-13T04:40:58.000Z

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.

Quick Start

Create a new post

$ hexo new "My New Post"

More info: Writing