Original link: https://zdyxry.github.io/2022/07/31/Weekly-Issue-2022-07-31/
Articles
Technology
Some System Design Interview Tips
Define requirements; define system interfaces and business entities; draw an architecture diagram; discuss specific details.
Maintaining your own checklist gives you more control over what you’re doing.
Alerts, what are they good for?
How should alert rules be set? At our company we usually go by experience and gut feeling. The author proposes a quantifiable score for evaluating each alert:
Impact: The more severe the higher the value
Frequency: The higher the frequency of occurrence, the higher the value
Recoverability: the more manual effort the problem takes to resolve, the higher the value
The final score is (i + f) * r, and the author suggests these thresholds:
1–19 ignore
20–49 alert
50–79 evaluate the event. Is it rated properly and, if so, what improvements can be made, if any? The original article's example of US East 1 going down is a worst-case scenario that relies on DR; with good monitoring, however, it can be detected and actioned
80–100 IMHO, any event with this score should NOT exist, and if it does, we are in dire trouble
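The scoring rule above can be sketched in Python as follows. The source does not state the input scales for impact, frequency, and recoverability, so this sketch simply applies (i + f) * r to whatever numbers it is given. The function names and band labels are illustrative, not the author's.

```python
def alert_score(impact, frequency, recoverability):
    """Score an alert as (i + f) * r, per the rule quoted above."""
    return (impact + frequency) * recoverability


def classify(score):
    """Map a score to one of the four suggested bands.

    Band labels are hypothetical shorthand for the author's descriptions.
    """
    if score < 1 or score > 100:
        raise ValueError("score is expected to fall in 1..100")
    if score < 20:
        return "ignore"
    if score < 50:
        return "alert"
    if score < 80:
        return "evaluate"
    return "should not exist"
```

For instance, an event rated impact 3, frequency 2, recoverability 5 scores 25 and lands in the "alert" band.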
Why floating point arithmetic is imprecise
Discussion on [[Open Source]]
User settings, Lamport clocks and lightweight formal methods
Further reading: The Weekly (Issue 21): Introduction to the Lamport Clock
The RED Method: key metrics for microservices architecture
[[weaveworks]] on the RED method for choosing monitoring metrics:
Rate: the number of requests served per second
Errors: the number of failed requests per second
Duration: the distribution of the time each request takes
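As a rough illustration of the three RED metrics, the sketch below computes them from a list of request records collected over a fixed window. The `Request` type, its field names, and the choice of median and maximum as a summary of the duration distribution are all assumptions made for this example. Real systems usually export these as counters and histograms instead.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Request:
    duration_seconds: float  # how long the request took
    failed: bool             # whether the request ended in an error


def red_metrics(requests, window_seconds):
    """Summarize Rate, Errors, and Duration for one observation window."""
    durations = [r.duration_seconds for r in requests]
    failures = sum(1 for r in requests if r.failed)
    return {
        "rate": len(requests) / window_seconds,   # requests per second
        "errors": failures / window_seconds,      # failed requests per second
        "duration_median": median(durations),     # crude view of the distribution
        "duration_max": max(durations),
    }
```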
For comparison, the USE method for resource metrics:
Utilization: expressed as a percentage over a time interval. For example, “One disk is running at 90% utilization”.
Saturation: expressed as a queue length. For example, “The average run queue length of the CPU is 4”.
Errors: the count of error events.
Vertical CPU Scaling: Reduce Cost of Capacity and Increase Reliability
The CPU cores allocated to a Pod are derived from the Pod's measured CPU usage plus headroom for the failures the cluster is expected to tolerate.
Every Pod is sized for the busiest one because responsibilities within a storage cluster can change over time, so all Pods must be allocated enough resources to become the busiest Pod in the cluster.
Collect the CPU utilization of the Pods over the past 14 days. From this data, extract the P99 value for every 8-hour window, take the Pod with the highest utilization as the cluster benchmark, and use the third-highest value in that Pod's data as the peak usage for the calculation. Quota = Peak usage / Utilization target.
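The quota calculation described above can be sketched as follows. This is a simplified reading of the summary, not the original article's implementation: the input shape (per-Pod lists of `(timestamp_seconds, cpu_cores)` samples), the nearest-rank percentile, and the definition of "busiest Pod" as the one with the highest single window P99 are all assumptions.

```python
from collections import defaultdict

WINDOW_SECONDS = 8 * 3600  # 8-hour windows


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n), 1-based
    return ordered[rank - 1]


def window_p99s(samples):
    """Bucket (timestamp_seconds, cpu_cores) samples into 8-hour windows
    and return the P99 of each window."""
    windows = defaultdict(list)
    for ts, cores in samples:
        windows[ts // WINDOW_SECONDS].append(cores)
    return [p99(vals) for vals in windows.values()]


def cpu_quota(samples_by_pod, utilization_target):
    """Quota = peak usage / utilization target."""
    per_pod = {pod: window_p99s(s) for pod, s in samples_by_pod.items()}
    # Assumption: the busiest Pod is the one with the highest window P99.
    busiest = max(per_pod, key=lambda pod: max(per_pod[pod]))
    ranked = sorted(per_pod[busiest], reverse=True)
    # Third-highest window value; fall back to the lowest if fewer than three.
    peak = ranked[2] if len(ranked) >= 3 else ranked[-1]
    return peak / utilization_target
```

With 14 days of data each Pod has 42 windows, so taking the third-highest value discards the two most extreme windows as outliers.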
Three git processes and release models
Comparison of various Git development processes.
Implementing a simple remote login server requires care in how the tty is used. For related reading, see the earlier introduction to tty.
Life
How do I get better at giving feedback?
How to give better feedback? (I think it depends on the person.)
What are some paintings that laymen think are ugly but are actually very powerful?
I can’t tell whether they are really that powerful, but they certainly are ugly...