Kotlin Lead: How did we design Kotlin step by step?

Author | Roman Elizarov

Translator | Liu Yameng

Planning | Deng Yanqin

A real programming language is alive, constantly changing and evolving. As with any production code, most of its designers’ time is spent on bug fixes and small improvements, not radical new features. What makes Kotlin unique is that it has been developed in a use-case-driven and community-driven way over the years, starting long before the 1.0 stable release in 2016, and for a while even before it was publicly announced in 2011.

1 The story of Kotlin null safety

Take Kotlin null-safety as an example. The first question that any language design must answer is: Why should this or that feature be incorporated into a language when there are so many potentially interesting features in the research literature and other languages?

Kotlin is designed to be a better Java for those who are already programming in Java, so its design goals focus on addressing all the known drawbacks that Java programmers often suffer from. Based on this, adding null safety was a natural choice, since NullPointerException is the most common problem encountered in real-world Java code.

Null safety and its question mark syntax existed in many research languages for many years before Kotlin. The conceptual idea is simple and time-tested, so integrating it into a practical language seems easy. However, as this straightforward design with non-null and nullable types started to be used in real code, it quickly began to conflict with another goal of Kotlin language design (seamless interoperability with Java).

Most of the Java code that Kotlin has to interoperate with carries no nullability annotations. A null-safe language must therefore assume that any Java method can return null, but giving every Java method a nullable result type would make the calling code impractically verbose. To solve this problem for Kotlin, we experimented extensively with real-life projects, tried several promising approaches, and collaborated on Kotlin-specific research with Ross Tate at Cornell University. The work culminated in flexible types (a flexible type is colloquially known in Kotlin as a platform type).

The basic idea behind flexible types is this: to interoperate with a less strictly typed language like Java, we use neither the broader nullable type, such as String? (which includes all strings and null), nor the narrower type, such as String. Instead, we use a flexible type, a range of types from String to String?, to represent a type coming from Java whose nullability is unknown but lies within that range. This relaxes the type system: every operation that is allowed on any type within the range of the flexible type is permitted, and its correctness is checked at runtime. The result is a pragmatic compromise in developer experience: Kotlin developers are no worse off when using a Java API than they would be in Java itself, yet they still enjoy a safer type system when using Kotlin APIs. For details, see Flexible Types in JVMLS 2015 – Kotlin.
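
A minimal sketch of how a flexible (platform) type behaves in practice, assuming a JVM target. System.getProperty is a real Java API declared without nullability information; the property key and the two function names are hypothetical and exist only for illustration.

```kotlin
// System.getProperty comes from Java without nullability information,
// so Kotlin sees its result as the flexible type String! (String..String?).

// Treat the result as nullable: the compiler then enforces null handling.
fun propertyOrNull(key: String): String? = System.getProperty(key)

// Treat the result as non-null: this compiles, but the assumption is
// only checked at runtime, much as it would be in Java.
fun propertyLength(key: String): Int {
    val value: String = System.getProperty(key)
    return value.length
}

fun main() {
    val missing = "hypothetical.missing.key" // assumed not to be set
    println(propertyOrNull(missing)?.length ?: -1)
    println(runCatching { propertyLength(missing) }.isFailure)
    println(propertyLength("java.version") > 0)
}
```

Both functions call the same Java method. Which side of the String..String? range to commit to is the caller's choice, and only the non-null choice can fail at runtime.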

Why didn’t anyone do this before Kotlin? Before this, no one had attempted to integrate null safety into a language’s type system while maintaining both safety and interoperability at such a large scale. The same collaboration also produced a mixed-site variance solution, which Kotlin needs for similar Java interoperability reasons (see FOOL 2013: Mixed site variance). Even today, Java interoperability still consumes a significant portion of the time spent on Kotlin language design.
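
As a rough illustration of what mixed-site variance means for everyday code, the sketch below combines declaration-site variance (the out modifier on a type parameter) with use-site variance (projections on an invariant type, comparable to Java wildcards). Producer, Constant and copyText are hypothetical names.

```kotlin
// Declaration-site variance: Producer is covariant in T at every use.
interface Producer<out T> {
    fun produce(): T
}

class Constant<T>(private val value: T) : Producer<T> {
    override fun produce(): T = value
}

// Use-site variance: MutableList is invariant, so each parameter is
// projected at the point of use.
fun copyText(
    source: MutableList<out CharSequence>,
    target: MutableList<in CharSequence>
) {
    for (item in source) target.add(item)
}

fun main() {
    // Allowed because Producer declares `out T`.
    val anyProducer: Producer<Any> = Constant("kotlin")
    println(anyProducer.produce())

    // MutableList<String> is read from, MutableList<Any> is written to.
    val target = mutableListOf<Any>(1)
    copyText(mutableListOf("a", "b"), target)
    println(target)
}
```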

2 Evolution and coroutines

In the initial design of a language, the most important consideration is which features to leave out, not which ones to include. Many research languages revolve around a few core ideas. Practical languages eventually become more inclusive, especially since they have to be understandable and easy to learn for professional developers who are used to writing in other industrial languages. Keeping a language small at first is easy, however. It happens naturally, because the resources of a language development team are limited.

As the language is used more and more in real life, it meets real-life industrial code with its idiosyncrasies, quirks, and patterns. A real-life language is under pressure to support all of them better. As a language evolves, the focus of language design inevitably shifts from the initial design goals to feature interaction and support. The challenge is to maintain the conceptual integrity of the language, ensuring that new features are not only implementable, but also easily understood and adopted by the existing users of the language and integrated into its ecosystem.

Kotlin coroutines were added after the 1.0 stable release of the language, with the first experimental support introduced in 2017. They were heavily inspired by C# async/await, but the final Kotlin design turned out very different, as explained in the Onward! 2021 paper Kotlin Coroutines: Design and Implementation.

One of the reasons for this difference is hindsight. By then it had become clear that the C# yield keyword, which supports synchronous enumerator coroutines, and the async/await mechanism for asynchronous coroutines have almost the same internal implementation. A natural desire was to unite the two. That way, the language team spends less effort by implementing a single, simpler language feature with compiler support, while the various kinds of coroutines, synchronous and asynchronous, are provided separately through libraries.
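
To make the "one language feature, many library-provided coroutine kinds" idea concrete, here is a small sketch of a synchronous generator. It relies on the sequence builder and its yield function from the Kotlin standard library, both of which are ordinary library declarations built on suspending functions. The fibonacci function is only an illustration.

```kotlin
// `sequence` is a standard-library function and `yield` is a suspending
// function on its scope; neither is a language keyword.
fun fibonacci(): Sequence<Int> = sequence {
    var current = 0
    var next = 1
    while (true) {
        yield(current) // suspends until the consumer asks for the next value
        val sum = current + next
        current = next
        next = sum
    }
}

fun main() {
    // Only the first eight values are ever computed.
    println(fibonacci().take(8).toList())
}
```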

Another reason is the aforementioned conceptual integrity. The Kotlin language already had its own legacy and a lot of existing code, so the new coroutine support had to fit into existing codebases and had to help existing users. Therefore, a lot of emphasis was placed on interoperability with all the asynchronous and reactive Java programming frameworks used by Kotlin developers, and on performance and ease of use in desktop UI and mobile applications (which were getting a lot of attention in the Kotlin ecosystem at the time).

Differences in the use cases in focus, and in their shape, inevitably lead to differences in design. Unlike future/promise-based designs, which would bring yet another future type to an already diverse ecosystem, this design is based directly on the underlying continuations and introduces a LISP-inspired call-with-current-continuation primitive (called suspendCoroutine in Kotlin), which makes it easier to integrate Kotlin coroutines with all existing libraries.
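
The integration story can be sketched with the standard library alone. Below, suspendCoroutine adapts a callback-style API to a suspending function. The callback API loadLengthAsync and the tiny runner runNow are hypothetical helpers written for this sketch, and the runner assumes the coroutine finishes before startCoroutine returns, which holds here only because the callback fires synchronously.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-based API standing in for an existing library.
// It invokes the callback synchronously to keep the sketch simple.
fun loadLengthAsync(text: String, onDone: (Int) -> Unit) {
    onDone(text.length)
}

// suspendCoroutine captures the current continuation and hands it to the
// callback, turning the callback API into a suspending function.
suspend fun loadLength(text: String): Int = suspendCoroutine { continuation ->
    loadLengthAsync(text) { length -> continuation.resume(length) }
}

// Minimal runner built only on the standard library (an assumption of this
// sketch, not a replacement for a real coroutine library).
fun <T> runNow(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return (outcome ?: error("coroutine did not complete synchronously")).getOrThrow()
}

fun main() {
    println(runNow { loadLength("continuation") })
}
```

No future or promise type appears anywhere in this sketch: the adapter works directly with the continuation, which is what lets the same primitive wrap very different callback and reactive APIs.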

3 Tradeoffs

The design of many new features is full of trade-offs. For example, we recently improved type inference for recursive generics in Kotlin 1.6 (see KT-40804 Infer types based on upper bounds). The original enhancement request came from users of APIs that use recursive generic types in the builder pattern, where a function is called without explicitly specifying its type parameter and without any context from which to infer it. In this case, the user wants a wildcard type to be inferred to represent the whole family of types.

However, Kotlin is designed not to infer types in situations like this. In Kotlin, a call to the function listOf(1) infers the result type List<Int>, because the type of the argument gives a type hint. The call listOf(), which has neither arguments nor an expected type from the context, fails to compile, although technically it could be inferred as List<Any?>, the widest type that this function can return. Instead, Kotlin forces the developer to specify the type explicitly in the call, like listOf<Int>(). This spares the compiler from guessing the developer’s intent, a guess that is often wrong in real code, and thus prevents further bugs.
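
The inference rule described above can be sketched in a few lines. The variable names are illustrative, and the failing call is left commented out so that the snippet compiles.

```kotlin
fun main() {
    val fromArgument = listOf(1)                 // inferred as List<Int>
    val fromExplicitArgument = listOf<String>()  // type given in the call
    val fromContext: List<Double> = listOf()     // type given by the declaration

    // val ambiguous = listOf() // rejected: nothing to infer the element type from

    println(fromArgument)
    println(fromExplicitArgument.size + fromContext.size)
}
```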

The difficulty with recursive generics is that Kotlin has no explicit syntax for writing down such a recursive type just to make the code compile. So we had multiple options. One of the most popular was a special syntax that tells the compiler to infer type parameters from their upper bounds. In practice, however, this meant that in all the use cases under consideration users would have to write extra boilerplate just to keep the compiler happy. So we ended up with a special set of rules that detect usage patterns of recursive generics in the called functions and automatically enable upper-bound type inference for all such calls.
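
A sketch of the builder pattern in question, assuming Kotlin 1.6 or later. RequestBuilder and describeRequest are hypothetical; the shape mirrors self-typed builders commonly found in Java libraries.

```kotlin
// Hypothetical builder with a self-referencing ("recursive") type parameter.
open class RequestBuilder<SELF : RequestBuilder<SELF>> {
    var url: String = ""
        private set
    var retries: Int = 0
        private set

    @Suppress("UNCHECKED_CAST")
    private fun self(): SELF = this as SELF

    fun withUrl(value: String): SELF {
        url = value
        return self()
    }

    fun withRetries(value: Int): SELF {
        retries = value
        return self()
    }
}

fun describeRequest(): String {
    // No type argument and no expected type. With Kotlin 1.6 or later
    // (an assumption of this sketch) the type argument is inferred from
    // the upper bound, so the chained calls type-check.
    val builder = RequestBuilder()
        .withUrl("https://example.invalid/api")
        .withRetries(3)
    return "${builder.url} retries=${builder.retries}"
}

fun main() {
    println(describeRequest())
}
```

On compilers without this inference, the constructor call needs an explicit type argument, which is exactly the kind of boilerplate the special rules remove.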

Eventually the language became less regular and more complex due to the addition of special rules, but it was more straightforward and simpler to use for the actual code written in it.

4 Fine-tuning and improvements

Most language design work is not about big features, but about fixing small problems and inconveniences everywhere. These little problems are often inconsistencies in language design. First let’s discuss how they might appear.

Once a new feature is added, it starts interacting with all other language features. These interactions tend to create a lot of corner cases. Designing for all of them is time-consuming and often becomes impossible when there are no practical use cases for a given corner case. Kotlin’s approach to this is pragmatic. If we can’t find or imagine a use case for a specific corner case, we disable it, and the compiler reports an error when the corresponding combination of features is used. Sometimes there are known use cases, but they do not justify the design and implementation effort.

For example, when Kotlin coroutines became stable in Kotlin 1.3, they introduced a new kind of function, the suspending function, together with corresponding suspending function types. However, using a suspending function type as a supertype was not allowed. Deciding how to represent such classes at runtime while supporting runtime type checks with Kotlin’s is operator requires a very complex design. Support was added later, in Kotlin 1.6, as coroutines became more widely used and the need for this feature interaction grew (see KT-18707 Support for suspending functions as supertypes).

Sometimes the inconsistency is historical, even predating the initial version of the language. Currently, the Kotlin team is working on a massive engineering project to rewrite the entire Kotlin compiler. The compiler’s architecture is being redesigned to improve performance and future extensibility. In this work, we have encountered dozens of edge cases where a compiler written from scratch according to a consistent set of rules starts to behave differently on some real code. Some of these findings come back to language design, where we rethink whether the old compiler’s behavior makes sense or needs to be replaced. The cases we have found range from quirks in type inference to behavior that depends on the order in which supertypes appear in the source code.
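
A minimal sketch of the feature interaction discussed above, assuming Kotlin 1.6 or later. Greeter is a hypothetical class, and runToCompletion is a hypothetical stdlib-only runner that assumes the block finishes without actually suspending.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// A class that lists a suspending function type as its supertype.
class Greeter(private val name: String) : suspend () -> String {
    override suspend fun invoke(): String = "Hello, $name"
}

// Accepts a suspending lambda or any object implementing the function type.
fun <T> runToCompletion(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return (outcome ?: error("coroutine did not complete synchronously")).getOrThrow()
}

fun main() {
    println(runToCompletion(Greeter("Kotlin")))
    println(runToCompletion { "Hello from a lambda" })
}
```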

5 Deprecation

When a language is stable and changes are needed, it is often impossible or impractical to make them in a fully backwards-compatible way, especially when you are intentionally fixing an old design flaw. We are lucky when the flaw is severe enough that the previous version of the compiler crashes on it, or the generated code crashes immediately. But sometimes the code does compile and may even do something sensible.

Much of the design work goes into assessing the impact of such changes and designing a migration plan to introduce them into the language. When the potential impact of a change is not negligible, a migration plan may span multiple versions and multiple years. In some cases, warnings and automatic code fixes are implemented in older versions of the compiler and the IDE, so that developers affected by the change have enough time to update their code before the release of the new compiler version that treats this code differently.

This work is also about trade-offs. The simplest decision is often not to change anything and to keep the old behavior forever, even if it is flawed. However, this builds up design debt in the language and technical debt in the compiler. It is not a sustainable approach, because it makes further development of the language increasingly difficult. Therefore, a balance must be found between maintaining backward compatibility and evolving the language.

For example, the original compiler historically handled combinations of safe calls and various Kotlin operator conventions (like a?.x += 1) very inconsistently. The behavior therefore had to be redesigned in a way that minimizes breakage of existing code that might depend on some of it. We ran several experiments on existing Kotlin codebases with various prototype solutions to select such a design. Details on the original issue and our final design can be found in KT-41034.

6 Conclusion

Language design in the real world is the maintenance of a complex system. We believe that with care we can keep Kotlin modern and relevant for decades to come. This is a very interesting design and engineering challenge.

Along the way, we continue to encounter novel research questions about type systems, feature interactions, usability, real code patterns in large codebases, and more. Research collaboration in these areas is essential to put all improvements on a sound basis.

About the Author

Roman Elizarov is the project lead for Kotlin at JetBrains and currently focuses on language design for Kotlin as the lead language designer. He has been working on Kotlin at JetBrains since 2016 and has contributed to the design of Kotlin coroutines and the development of the Kotlin coroutine library.

Original link: language-design-in-the-real-world

The text and pictures in this article are from InfoQ Architecture Headlines

This article is reprinted from https://www.techug.com/post/kotlin-leader-how-do-we-design-kotlin-step-by-step/