A single ordinary engineer maintaining the API can easily bring down Twitter; Musk: let’s rewrite the code

Author|Chu Xingjuan, Nuka Cola

“From the beginning to the end, we were always smiling.”

On Monday morning local time, there was a new glitch on the Twitter website.

Users who logged in ran into a string of related problems. The first was that clicking a link failed to redirect; instead, a cryptic error message popped up saying “Your current API plan does not include access to this endpoint”.

“I guess this means Twitter is in desperate need of cash, and it started charging for access to the Twitter API, which Twitter itself can’t pay for,” Princeton computer science professor Arvind Narayanan tweeted wryly.

Narayanan also wrote: “To add insult to injury, everyone is posting screenshots of the error message, but the image is also broken.” Indeed, images on Twitter stopped loading properly soon afterward. Users also reported being unable to access TweetDeck, Twitter’s client for professional users.

“Twitter is broken enough to joke about, but powerful enough that we can joke about it on Twitter, the platonic ideal of hardcore software,” quipped tech analyst Benedict Evans on the volatile social media site.

Once images stopped loading, Twitter began to fall into chaos, and countless users rushed to share news of the glitch. Some pointed out that “incoming and outgoing access to the Twitter API was broken”, while others, happy to enjoy the spectacle, replied that “it’s more interesting if this app is broken”.

In a tweet, the company gave a rather vague explanation: “Some parts of Twitter may not work as expected right now. We made internal adjustments that had some unintended consequences.” The issue was later confirmed to stem from a change related to the planned shutdown of free access to the Twitter API.

On February 1, the company announced that it would no longer support free access to its API, effectively ending the existence of third-party clients and greatly limiting the ability of outside researchers to study the Twitter network. The company has been building new paid APIs for use by outside developers.

It is worth noting that Twitter closed off its user-data API to third-party application developers in 2014, and then strictly limited the tokens used for login. Developers had to pay Twitter to use its API. Twitter co-founder Jack Dorsey later called it “the worst thing we’ve ever done,” said in his own defense that he “wasn’t running the company at the time,” and added that “the company has been working hard to fully reopen it.”

Maintained by one person, who cut off Twitter’s own internal access

Musk’s large-scale layoffs have reportedly cut the number of Twitter engineers so sharply that only one person was left working on a major project involving the platform’s API.

According to a current employee, on Monday the lone site reliability engineer made a “misconfigured change that basically broke the normal functioning of the Twitter API.” Internal tools and public-facing APIs all went down. Engineers scrambled to fix the problem while posting “It’s over” and “Twitter is down” on Slack.

Musk is said to have been furious when he learned of the situation.

Later that day, Musk tweeted, “A small change to the API can have a huge impact. The code stack is already extremely fragile and will eventually need to be completely rewritten.” Before that, Twitter investor Marc Andreessen had also tweeted a screenshot showing the company’s API glitches spreading across the site.

“Musk’s explanation seems to have strayed from the real reason. It appears that Musk simply didn’t understand the dependencies in his tech stack, and inadvertently ordered the shutdown of Twitter’s own internal API access while trying to cut off free access for external users,” commented columnist Ahmed Bab.
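The kind of dependency the columnist describes can be illustrated with a minimal sketch. Everything in it is hypothetical: the plan table, the endpoint paths, and the client names are invented for illustration and do not reflect Twitter’s real architecture. It only shows how retiring a free tier can lock out an internal tool that was quietly sharing the same access gate as external users, and how a simple pre-change check would surface that.

```python
from dataclasses import dataclass

# Hypothetical plan table: which endpoints each API plan may call.
PLAN_ENDPOINTS = {
    "free": {"/links/redirect", "/media/image"},
    "paid": {"/links/redirect", "/media/image", "/search"},
}

DENIED = "Your current API plan does not include access to this endpoint"


@dataclass(frozen=True)
class Client:
    name: str
    plan: str
    internal: bool = False


def check_access(client, endpoint, plans):
    """Return "ok" if the client's plan covers the endpoint, else the error text."""
    allowed = plans.get(client.plan, set())
    return "ok" if endpoint in allowed else DENIED


def retire_free_plan(plans):
    """Return a copy of the plan table without the free tier."""
    return {name: eps for name, eps in plans.items() if name != "free"}


def clients_broken_by(clients, endpoint, before, after):
    """Names of clients that could reach `endpoint` before the change but not after."""
    return [
        c.name
        for c in clients
        if check_access(c, endpoint, before) == "ok"
        and check_access(c, endpoint, after) != "ok"
    ]


# Assumed setup: one internal tool was never moved off the free plan.
CLIENTS = [
    Client("third_party_bot", plan="free"),
    Client("web_link_redirector", plan="free", internal=True),
    Client("partner_app", plan="paid"),
]

if __name__ == "__main__":
    after = retire_free_plan(PLAN_ENDPOINTS)
    print(clients_broken_by(CLIENTS, "/links/redirect", PLAN_ENDPOINTS, after))
```

Running a check like `clients_broken_by` before applying the change would list the internal client alongside the external one, which is the dependency the quote says was missed.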

When Musk took over Twitter, he promised to dramatically improve the site’s speed and stability. His associates screened employees for technical skill, and thousands who were judged not “capable” enough to succeed under Musk were eventually laid off.

Employees have grown used to it

However, the continuous layoffs have left Twitter with fewer than 550 full-time engineers. According to media reports, Musk has laid off about 80% of the company’s employees. Things are now playing out as former employees predicted: the loss of personnel has left Twitter suffering catastrophic outages with increasing frequency.

This Monday’s misconfiguration change is already Twitter’s sixth service outage this year that has caused widespread impact:

  • On January 23, Android users were temporarily unable to load or post new tweets.

  • On February 8, an error message told users that “you have exceeded the daily tweet limit”, preventing them from posting normally.

  • On February 15, tweets failed to load.

  • On February 18, the tweet timeline broke and replies disappeared.

  • On March 1, the timeline did not work properly.

These are only the outright outages. Other issues, such as Musk’s tweets being shown more prominently than other users’ on the timeline, have also disrupted the normal user experience.

“These outages have become more and more frequent, and I even feel that everyone has become numb,” said a current employee. The atmosphere at Twitter’s headquarters during the incident was reportedly “relaxed and harmonious.” “From the beginning to the end, we were always smiling,” said one employee.

It took Twitter all morning to fix the problem, as Twitter had run out of experienced employees to restore service. “This is the inevitable result of laying off 90% of the staff.”

In many ways, Monday’s outage was the culmination of Musk’s impact on Twitter. Focused on recouping the $44 billion acquisition cost, Musk has been laying off staff and scaling back the free services Twitter offers.

A single engineer was left in charge of a major project that served both users and employees and tied together multiple critical systems, and it eventually blew up without warning.

Should technical debt also be blamed?

However, some current employees believe that many of Twitter’s current technical problems existed long before Musk took over. It is not for nothing that the early Twitter was known for its “Fail Whale.”

A current employee noted, “Twitter 1.0 created too much technical debt. If you make a change now, everything will fall apart.”

For example, in its early days Twitter built its MVP on Rails. Rapid development made it possible to validate the product quickly, but the inefficiency of Rails meant Twitter soon hit a technical ceiling: around 2007, Twitter went down at every turn, at one point for as long as three days. Later, after a new technical director took office, he paid down the debt drastically, abandoning Rails, embracing the Java ecosystem, and rewriting many core services in Scala, which finally stabilized the service.

In 2011, Twitter again ran into site stability issues. When the Twitter API was used over HTTP, calling OAuth-authenticated methods such as statuses or home_timeline would cause problems.

However, it is unclear whether Musk, who is now at the helm of Twitter, really cares about Twitter’s technical debt.

Earlier, Musk said on Twitter, “I apologize for the super slow speed of Twitter in many countries / regions. App is performing more than 1000 RPCs with improper batch processing, just to render a home page timeline!”
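To make the phrase “improper batch processing” concrete, here is a minimal sketch of the general idea, not a description of Twitter’s app. The `CountingBackend` class, its `get_many` method, and the batch size of 100 are all invented for illustration; the backend simply counts round trips instead of doing network I/O.

```python
class CountingBackend:
    """Hypothetical stand-in for a remote service; counts round trips."""

    def __init__(self):
        self.round_trips = 0

    def get_many(self, tweet_ids):
        # One call equals one simulated network round trip,
        # regardless of how many IDs it carries.
        self.round_trips += 1
        return {tid: f"tweet {tid}" for tid in tweet_ids}


def render_unbatched(backend, tweet_ids):
    """Fetch each item with its own call: one round trip per item."""
    return [backend.get_many([tid])[tid] for tid in tweet_ids]


def render_batched(backend, tweet_ids, batch_size=100):
    """Fetch items in chunks: one round trip per chunk of up to batch_size."""
    rendered = []
    for start in range(0, len(tweet_ids), batch_size):
        chunk = tweet_ids[start:start + batch_size]
        found = backend.get_many(chunk)
        rendered.extend(found[tid] for tid in chunk)
    return rendered


if __name__ == "__main__":
    ids = list(range(1000))
    slow, fast = CountingBackend(), CountingBackend()
    render_unbatched(slow, ids)
    render_batched(fast, ids)
    print(slow.round_trips, fast.round_trips)
```

Both functions return the same timeline; the difference is only in how many round trips are paid for it, which is why batching matters most on high-latency connections.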

Subsequently, Eric Frohnhoefer, an engineer who was still at Twitter at the time, publicly disputed Musk’s claims. The reasons he gave for the slowness included years of technical debt, accumulated as Twitter traded performance for development speed and features. He was fired the next day.

Interestingly, though, Frohnhoefer also noted at the time that “we should probably prioritize some major rewrites to combat 10+ years of technical debt and call for aggressive feature removal.”

Now Musk, too, has said in a tweet that the code will eventually need to be completely rewritten.

While Twitter managed to recover within a few hours this time, the story behind the glitch seemed to point to more trouble ahead for Musk.

The text and pictures in this article are from AI Frontline


This article is reproduced from https://www.techug.com/post/the-only-ordinary-engineer-who-maintains-the-api-can-easily-destroy-twitter-musk-let-s-rewda19e85f829a1cdccff0/