Reading Notes: “Interviewing Life – Designing a Simple Distributed Job Scheduler”

Original link: https://www.hwchiu.com/read-notes-63.html

Title: “Interviewing Life – Designing a Simple Distributed Job Scheduler”
Category: others
Link: https://medium.com/@raxshah/system-design-design-a-distributed-job-scheduler-kiss-interview-series-753107c0104c

This is a technical interview article. It discusses how to design the overall system so that it meets the requirements when building something like a job scheduler. The architecture follows the KISS principle, that is, keeping things simple.

The overall process is basically:

  1. Understand all requirements, both functional and non-functional
  2. Understand the data involved, and estimate the overall scale from the data volume and the functional requirements
  3. Plan the overall architecture according to the above requirements. The scale estimate can sometimes also reveal the ratio of reads to writes, which affects the architecture design.

Common functional requirements include:

  1. The operations offered to users, such as submitting a job and listing all jobs (current and historical)
  2. The running time limit of each job (e.g., 5 minutes), and whether a job runs repeatedly or only once
  3. Job priority, so that a more urgent job can jump the queue
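The requirements listed above can be sketched as a small in-memory model. This is only an illustration, not the article's design: the names `Job`, `JobStore`, `timeout_s`, and `repeat_every_s` are hypothetical, and the convention that a larger `priority` number runs first is an assumption.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    priority: int = 0                      # assumption: larger number = more urgent
    timeout_s: int = 300                   # running time limit, e.g. 5 minutes
    repeat_every_s: Optional[int] = None   # None means the job runs only once


class JobStore:
    """In-memory stand-in for the submit and list operations."""

    def __init__(self) -> None:
        self._heap = []                # pending jobs, most urgent first
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority
        self._history: List[Job] = []  # jobs already handed to an executor

    def submit(self, job: Job) -> None:
        # Negate the priority because heapq is a min-heap.
        heapq.heappush(self._heap, (-job.priority, next(self._seq), job))

    def next_job(self) -> Optional[Job]:
        if not self._heap:
            return None
        _, _, job = heapq.heappop(self._heap)
        self._history.append(job)
        return job

    def list_pending(self) -> List[Job]:
        # Sort on (priority, sequence) only, so Job objects are never compared.
        return [job for _, _, job in sorted(self._heap, key=lambda e: e[:2])]

    def list_history(self) -> List[Job]:
        return list(self._history)
```

With this sketch, a job submitted later with `priority=5` is returned by `next_job()` before earlier jobs that kept the default priority, which is the "jump the queue" behavior described above.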

Non-functional requirements include:

  1. The system can scale dynamically to support different levels of demand
  2. Whatever errors occur, the job information submitted by the user must not be lost
  3. Asynchronous design: the user can continue with other work after submitting a job, and is actively notified once the job completes
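The asynchronous point in the list above can be illustrated with Python's standard `concurrent.futures` module. This is a minimal sketch under assumptions: `run_job`, `submit_async`, and `demo` are hypothetical names, and appending to a list stands in for a real notification channel such as email or a webhook. It does not address durability or scaling.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List


def run_job(name: str) -> str:
    time.sleep(0.01)  # stand-in for the real work of a job
    return f"{name} finished"


def submit_async(pool: ThreadPoolExecutor, name: str,
                 on_done: Callable[[str], None]) -> Future:
    """Submit a job and return immediately; on_done fires when it completes."""
    future = pool.submit(run_job, name)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


def demo() -> List[str]:
    notifications: List[str] = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        submit_async(pool, "nightly-report", notifications.append)
        # The caller is free to do other work here; nothing blocks on the job.
    # Leaving the with-block waits for outstanding jobs to finish.
    return notifications
```

The caller never polls for the result. The completion callback pushes the notification, which matches the "actively notified" wording of the requirement.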

With the functional requirements settled, the next step is the scale requirements: counts and sizes. For example, the architecture should be able to handle 1000 jobs per second (1000 QPS).
From these figures, estimate how much CPU and memory are needed. The estimate must also honor the functional requirements, for example that each job can run for up to five minutes.

So you might end up needing something like 10,000 16-core machines and 100 16 GB machines to provide the service. A basic calculation like this quickly shows whether the demand calls for a distributed architecture. With the sample figures in the article, scaling up a single machine is obviously not an option.
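The estimate can be written out as a short calculation. The 1000 jobs per second and the 5-minute limit come from the notes. Everything else is an assumption made for illustration (one job per hardware thread, two threads per core, 5 MB of memory per job), so the output only lands in the same ballpark as the figures quoted above rather than reproducing them.

```python
import math

JOBS_PER_SECOND = 1000            # from the notes: 1000 QPS
MAX_JOB_SECONDS = 5 * 60          # from the notes: up to 5 minutes per job

# Assumptions for illustration only; they are not stated in the notes.
CORES_PER_MACHINE = 16
THREADS_PER_CORE = 2              # assumed: one job per hardware thread
MEMORY_PER_MACHINE_MB = 16 * 1024
MEMORY_PER_JOB_MB = 5             # assumed memory footprint of one job


def concurrent_jobs(rate_per_s: int, duration_s: int) -> int:
    """Worst case: every job runs for the full limit (arrival rate x duration)."""
    return rate_per_s * duration_s


def machines_for_cpu(jobs: int) -> int:
    jobs_per_machine = CORES_PER_MACHINE * THREADS_PER_CORE
    return math.ceil(jobs / jobs_per_machine)


def machines_for_memory(jobs: int) -> int:
    jobs_per_machine = MEMORY_PER_MACHINE_MB // MEMORY_PER_JOB_MB
    return math.ceil(jobs / jobs_per_machine)


if __name__ == "__main__":
    in_flight = concurrent_jobs(JOBS_PER_SECOND, MAX_JOB_SECONDS)
    print("jobs in flight:", in_flight)
    print("machines if CPU-bound:", machines_for_cpu(in_flight))
    print("machines if memory-bound:", machines_for_memory(in_flight))
```

Either way the answer is far more than one machine, which is the point of the exercise: the workload has to be spread across many nodes.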

Next, the article designs the individual components of a distributed architecture, including:

  1. Load Balancer
  2. Backend
  3. DB
  4. Job scheduler
  5. Job Executor
  6. Queue
  7. File system

The article plans these components step by step, explores how they communicate with each other, and shows how those interactions combine to meet the functional and non-functional requirements.
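To make the interaction between components concrete, here is a single-process sketch of one possible flow: the backend records a job, the scheduler moves it onto a queue, and executors pull from the queue and record the result. This is not the article's design. The class `MiniScheduler`, the status names, and the use of a dict and `queue.Queue` as stand-ins for the DB and the message queue are all assumptions, and the load balancer and file system are left out.

```python
import queue
import threading
from typing import Dict, List


class MiniScheduler:
    def __init__(self, workers: int = 2) -> None:
        self.db: Dict[str, str] = {}           # stand-in for the DB: job id -> status
        self._lock = threading.Lock()
        self._queue: queue.Queue = queue.Queue()  # stand-in for the message queue
        self._workers: List[threading.Thread] = [
            threading.Thread(target=self._executor_loop, daemon=True)
            for _ in range(workers)
        ]
        for worker in self._workers:
            worker.start()

    def submit(self, job_id: str) -> None:
        """Backend: persist the job first, then return to the user."""
        with self._lock:
            self.db[job_id] = "SUBMITTED"

    def schedule(self) -> None:
        """Scheduler: move every submitted job onto the queue."""
        with self._lock:
            ready = [j for j, s in self.db.items() if s == "SUBMITTED"]
            for job_id in ready:
                self.db[job_id] = "QUEUED"
        for job_id in ready:
            self._queue.put(job_id)

    def _executor_loop(self) -> None:
        """Executor: take jobs off the queue and record the outcome."""
        while True:
            job_id = self._queue.get()
            try:
                if job_id is None:             # sentinel: stop this worker
                    return
                with self._lock:
                    self.db[job_id] = "DONE"   # a real executor would run the job here
            finally:
                self._queue.task_done()

    def shutdown(self) -> None:
        self._queue.join()                     # wait until queued jobs are processed
        for _ in self._workers:
            self._queue.put(None)
        for worker in self._workers:
            worker.join()
```

Writing the status to the store before anything is queued is one simple way to respect the "submitted jobs must not be lost" requirement: if an executor dies, the record still exists and can be rescheduled.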

For the details, please refer to the full article.

This article is reprinted from: https://www.hwchiu.com/read-notes-63.html
This site is for inclusion only, and the copyright belongs to the original author.
