After 5 years of reviewing code for startups, I've summed up these 16 lessons

When I was at PKC, I did more than 20 code reviews, and many of those clients were startups that had just raised their Series A or B round. At that point they usually had some extra cash, had gotten past the "do or die" stage of finding product-market fit, and decided it was time to dig into their security issues, so they came to us for a code review.

It was exhausting, sleep-depriving work. We dug into tech stacks and architectures across a wide variety of domains and uncovered all kinds of security issues, some catastrophic, some merely amusing. We also got the chance to chat with their tech leads and CTOs about the engineering and product challenges they faced as they started to scale.

Some of those reviews were seven or eight years ago, and it's interesting to look at those startups now and see which ones are doing well and which ones are gone.

What I’d like to share here, though, are some of the things I’ve learned from these observations, which may be a little unexpected. I’ve roughly ordered them from least to most relevant to security.

1. It doesn’t take hundreds of engineers to build great products

I also wrote a long article about this. Although the startups that came to us for code reviews all started from a similar place, the size of their engineering teams varied widely. Surprisingly, the most impressive and feature-complete products were sometimes built by the smallest teams. A few years later, it was these small-but-mighty teams that flourished in their markets.

2. Simple is better than clever

I'm a self-confessed elitist, and it pains me to admit this: the startups we reviewed that are doing the best right now usually followed an almost dogmatic "keep it simple" approach to engineering. Cleverness, on the other hand, was punished; the companies that made us think "wow, these guys are brilliant" are mostly gone. Generally speaking, the main culprits that got companies into trouble were premature moves to microservices, architectures that relied on distributed computing, and messaging-heavy designs.

3. The most impactful findings of a review always come in the first and last few hours

When you think about it, it makes sense: in the first few hours, the obvious problems surface easily; just grep the code or poke at some basic functionality and they show up. In the last few hours, you have fully absorbed their codebase, so it becomes much easier to spot problems.

4. In the last 10 years, writing secure software has become much easier

I haven't run the statistics, but it feels like code written before 2012 had far more vulnerabilities per SLOC (source lines of code) than code written after 2012 (we started reviewing in 2014). Maybe it's the adoption of web 2.0 frameworks, or developers' increased security awareness. Whatever the cause, I think the tools and default settings software engineers now have really have fundamentally improved security.

5. All the serious security flaws we found were easy to spot

In maybe one out of every five codebases we reviewed, we found a "big fish": a vulnerability so severe that we called the customer right away to get it fixed. I can't recall a single case where that vulnerability was so clever that it took real contortions to find. That's part of what makes these serious vulnerabilities so bad: they're easy to find and easy to exploit.

"Discoverability", meaning how likely an attacker is to find the flaw, has always been an integral part of vulnerability analysis, so it's no surprise that this is how vulnerabilities turn up. But I still think discoverability deserves more emphasis. Before a vulnerability is exposed, exploitation is a matter of probability; once it's exposed, it's a certainty. Hackers are lazy; they pick the lowest-hanging fruit on the tree.

If a user's password-reset token is included in the API response (as Uber discovered circa 2016), attackers will just use it to reset the user's password; they won't bother with a heap-spray bug, even if it's technically more severe. One might object that putting too much emphasis on discoverability perpetuates "security through obscurity", because it relies so heavily on guessing what an attacker can or should know. But again, in my personal experience, discoverability is a good practical predictor of whether a vulnerability will actually be exploited.
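To make the reset-token mistake concrete, here is a minimal Python sketch (a hypothetical endpoint and mailer, not Uber's actual code): the token should only ever travel out-of-band to the user, never back in the API response.

```python
import secrets

def send_reset_email(email: str, token: str) -> None:
    """Stand-in for a real mailer; the token travels out-of-band only."""
    print(f"emailing a reset link containing the token to {email}")

def request_password_reset(email: str, token_store: dict) -> dict:
    token = secrets.token_urlsafe(32)   # unguessable, single-use reset token
    token_store[email] = token          # persisted server-side for later verification
    send_reset_email(email, token)

    # VULNERABLE: returning the token means anyone who can trigger a reset
    # for a victim's email address can take over that account.
    # return {"status": "ok", "reset_token": token}

    # Safe: acknowledge the request and reveal nothing.
    return {"status": "ok"}

print(request_password_reset("victim@example.com", {}))
```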

6. Secure defaults in frameworks and infrastructure greatly improve security

I also wrote a long article on this. But in essence, things like React escaping HTML by default to prevent cross-site scripting, or serverless stacks taking OS and web server configuration out of developers' hands (and out of reach of their mistakes), have significantly improved security. In contrast, the PHP code we reviewed was riddled with XSS. These newer stacks and frameworks are not without flaws, but their attack surface is smaller in exactly the places where it makes a huge difference in practice.
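For anyone who hasn't had to think about it, here is a minimal Python sketch of what "escape by default" buys you; React does the equivalent automatically whenever it interpolates values into the DOM.

```python
from html import escape

def render_comment(author: str, body: str) -> str:
    # Escaping turns <script> into &lt;script&gt;, so attacker-supplied input
    # renders as inert text instead of running in the victim's browser.
    return f"<p><b>{escape(author)}</b>: {escape(body)}</p>"

print(render_comment("mallory", "<script>steal(document.cookie)</script>"))
```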

7. A single repository is easier to review

From an ergonomic point of view, it's easier for a security researcher to review a single repository than a codebase split across a series of service repositories. A single repository saves us from writing wrapper scripts around our tools, makes it easier to determine whether a given piece of code is called from elsewhere, and, best of all, removes any worry about common libraries sitting at different versions in different repositories.

8. You can easily spend an entire review chasing vulnerabilities in dependencies

It's hard to tell whether a particular vulnerability in a dependency is actually exploitable in a given codebase. As an industry we under-invest in securing foundational libraries, which is why a library like Log4j had such a huge impact. Node and npm are absolutely horrifying in this respect; the dependency chains simply aren't auditable. GitHub's Dependabot has been a godsend here: in most cases we can just tell customers to upgrade things in priority order.

9. Never deserialize untrusted data

This came up most often in PHP, because for some reason PHP developers love to serialize/deserialize objects instead of using JSON. In almost every case, a server that deserializes and parses objects passed in by the client ends up with horrific vulnerabilities. If you're not familiar with the issue, PortSwigger has a breakdown of what can go wrong (it focuses on PHP, by the way; coincidence? I think not).

What all deserialization vulnerabilities have in common is that they let the user manipulate an object that the server will later operate on, which is a very powerful capability with a large attack surface. Conceptually, it's similar to prototype pollution or letting users supply their own HTML templates. Want a fix? It's much better to have the user send a JSON object (it supports only a handful of data types, so it's hard to abuse) and then construct the object manually from the fields you expect. It's slightly more work, but well worth it!
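Here is a minimal Python stand-in for that fix (the codebases we reviewed were PHP using unserialize(), but the shape of the problem is the same): never hand client-supplied bytes to a native object deserializer; parse JSON and build the object yourself from whitelisted fields.

```python
import json
import pickle  # shown only to illustrate the dangerous pattern
from dataclasses import dataclass

@dataclass
class Order:
    item_id: int
    quantity: int

def load_order_unsafe(raw: bytes) -> object:
    # DANGEROUS: like PHP's unserialize(), pickle will happily reconstruct
    # arbitrary objects, which attacker-controlled input can abuse.
    return pickle.loads(raw)

def load_order_safe(raw: bytes) -> Order:
    data = json.loads(raw)  # JSON offers only a handful of primitive types
    # Construct the object manually from the fields we expect, coercing types.
    return Order(item_id=int(data["item_id"]), quantity=int(data["quantity"]))

print(load_order_safe(b'{"item_id": 7, "quantity": 2}'))
```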

10. Business logic flaws are rare, but when they occur, they are often very bad

Think about it: a flaw in business logic is, by definition, a problem for the business. An interesting corollary is that even if your protocol is built to provide provably secure properties, human error is still extremely common, and it shows up as bad business logic (see the string of absolutely devastating exploits enabled by poorly written smart contracts).

11. Custom fuzzing is very effective

A couple of years into doing these reviews, I started requiring all of our code reviews to include custom fuzzers for testing the product's APIs, authentication, and so on. This is a reasonably common practice; I stole the idea from Thomas Ptacek, who mentions it in his hiring posts. Before actually trying it I assumed it would be a waste of time, an example of misapplied engineering, and that review hours were better spent reading code and testing hypotheses. But fuzzing turned out to be a very effective and efficient use of time, especially on larger codebases.
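For flavor, here is a very small sketch of the kind of custom fuzzer I mean, in Python against a hypothetical staging endpoint (the URL, parameter name, and junk list are all made up); a real harness would also exercise authentication flows and log every request/response pair for triage.

```python
import random
import string
import urllib.error
import urllib.parse
import urllib.request

# A grab-bag of inputs that tend to shake out unhandled code paths.
JUNK = ["", "'", '"', "<script>alert(1)</script>", "%00", "{{7*7}}",
        "../../etc/passwd", "A" * 10_000, "-1", "9" * 30, "null", "[]", "{}"]

def random_payload() -> str:
    if random.random() < 0.5:
        return random.choice(JUNK)
    return "".join(random.choice(string.printable) for _ in range(random.randint(1, 200)))

def fuzz(base_url: str, rounds: int = 200) -> None:
    for _ in range(rounds):
        payload = random_payload()
        query = urllib.parse.urlencode({"q": payload})
        try:
            with urllib.request.urlopen(f"{base_url}?{query}", timeout=5) as resp:
                resp.read()
        except urllib.error.HTTPError as err:
            if err.code >= 500:  # 5xx usually means an unhandled code path
                print(f"possible bug: status {err.code} for payload {payload!r}")
        except urllib.error.URLError:
            pass  # network hiccups aren't findings

# fuzz("https://staging.example.com/api/search")  # hypothetical target
```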

12. Acquisitions make security more complicated

There are more coding patterns to look at, more AWS accounts, and more kinds of SDLC tooling. And of course, an acquisition often means an entirely different language and/or framework, each with its own usage patterns.

13. There is always at least one security enthusiast hiding in the software engineering team

You can almost never tell at a glance who that person is. As security skills increasingly shift toward software, finding this kind of talent is like striking gold.

14. Quick turnarounds on vulnerability fixes correlate with overall operational excellence

In the best cases, the customer asked us to send findings over as soon as we uncovered them, and they fixed the issues right away.

15. Few people get JWTs and webhooks right the first time

With webhooks, people almost always forget to authenticate the incoming request (or the service they're using doesn't support authentication at all, which is frustrating!). One of our researchers, Josh, pursued a series of questions about this kind of situation that eventually led to a DefCON/Black Hat talk. JWTs are also notoriously difficult to handle correctly, even with off-the-shelf libraries. A lot of implementations fail to expire tokens on logout, validate the JWT incorrectly, or simply trust it by default.
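As a concrete illustration of the webhook half, here is a minimal Python sketch of authenticating incoming webhook requests with an HMAC signature (the header name and shared secret are assumptions; providers that support signing generally follow this shape).

```python
import hashlib
import hmac

WEBHOOK_SECRET = b"shared-secret-from-the-provider-dashboard"  # assumption

def verify_webhook(raw_body: bytes, signature_header: str) -> bool:
    """Return True only if the request really came from the provider."""
    expected = hmac.new(WEBHOOK_SECRET, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the expected value via timing.
    return hmac.compare_digest(expected, signature_header)

# In the request handler: reject anything that fails verification, e.g.
# if not verify_webhook(request_body, headers.get("X-Signature", "")): return 401
```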

16. MD5 is still used in many places, but in most cases it is fine

In fact, MD5 gets used for many things that don't require a collision-resistant cryptographic hash (which it isn't). For example, because it's very fast to compute, it's often used in automated tests to quickly generate large numbers of pseudo-random GUIDs. In those cases it doesn't really matter that MD5 is broken, even though static analysis tools will still sound the alarm. (Ignore them.)
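A minimal Python sketch of that benign pattern: MD5 as a fast way to derive deterministic pseudo-random GUIDs for test data. For anything security-sensitive (passwords, signatures, integrity checks), reach for SHA-256 or better instead.

```python
import hashlib
import uuid

def fake_guid(seed: str) -> str:
    # MD5 is fast and deterministic; collision resistance is irrelevant here.
    digest = hashlib.md5(seed.encode()).hexdigest()  # 32 hex characters
    return str(uuid.UUID(digest))                    # format them as a GUID

print(fake_guid("user-42"))  # same seed -> same GUID, handy for test fixtures
```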

I don't know whether these observations match your experience. If you have thoughts, feel free to leave a comment.

Original link: https://kenkantzer.com/learnings-from-5-years-of-tech-startup-code-audits/
