Given the last article discussing “Why Java?”, it is worth answering the other half of this question: “Why not WASM?”
This is a difficult question to answer without getting deeply technical, since almost every comparison of virtual machines is a niche technical discussion. My hope is to describe how we evaluated this decision and then flesh out the technical details of each point, so that at least the high-level direction makes sense.
Meet the contenders
We are going to talk about how EVM, WASM, and JVM address different concerns within the blockchain space so let’s first explain what these are.
EVM is the “Ethereum Virtual Machine”. It is what currently runs the programs which live on the Ethereum blockchain. It was designed specifically for the original needs of Ethereum and it is what Solidity compiles into, when writing a smart contract.
WASM is “WebAssembly”. It was designed as a very minimalistic virtual CPU for running programs in a web browser, replacing the use of JavaScript as the “assembly language of the internet” (JavaScript can remain in its own domain). Programs written in most languages which the LLVM compiler supports (notably C, C++, and Rust) can be compiled into this portable WASM code. Given this multi-language support and relatively simple virtual machine, it is a common choice for “portable executable” problems, most recently on the blockchain. Even Ethereum is looking to replace its own EVM with eWASM in the future.
JVM is the “Java Virtual Machine”. It was written in the mid-1990s to enable writing portable programs with high-level languages (originally Java, but now others such as Kotlin and Scala). In its long life, it has now grown to include a considerable class library programs can rely on, a vibrant tooling ecosystem, and a massive developer community. It now dominates areas where sustainable large-scale software design or high-performance portability are required.
Points of comparison
When conducting an analysis of various options, it is important to use a common framework. I’ve put together the important categories for comparison when looking at virtual machines in a blockchain environment.
Security: This specifically refers to the safety of running untrusted code within such a virtual machine. What possible opportunities exist for the program to attack the node where it is running?
Determinism: One of the key requirements of modern blockchain systems is that they can reliably reach consensus such that the same program, run with the same input, will produce the same output. This requirement of deterministic behavior is absolute, so we will touch on any special requirements or limitations which must be applied to make this possible on each virtual machine.
Trust: This refers to the confidence that the broader community has in the reliability of the virtual machine. This is based somewhat on age, but also on the environments where it has been successfully deployed.
Language/Feature support: No matter how great the virtual machine is, it is only as good as the software written for it (otherwise, it doesn’t do anything). To this end, we will describe what kinds of programming languages are supported, and what other machine features are exposed that could enable more software.
Tooling: Similar to the language support, software developers need tools to build/test/monitor their software so we will also talk about the maturity and some of the scope of each tooling ecosystem.
Compatibility/libraries/patterns: While the blockchain environment is new, familiarity with the virtual machine and tools, from existing environments, could be valuable.
Performance: In this section, we will talk about the implications of not just pure execution speed (considering both CPU and memory throughput), but also opportunities to improve the often-overlooked performance bottleneck that is disk or database access patterns.
A high-level analysis comparing the EVM, WASM and the JVM:
To quickly summarize, the trends are not surprising when considering the original domains of the respective technologies: EVM does well in security and determinism, WASM does well in portability and embedding concerns, and JVM excels in large-scale production concerns like performance/compatibility/tooling.
Technical discussion
While that is the summary of what we found and what kind of methodology we used, it is worth explaining the technical meanings of these concerns, how we think any weaknesses could be mitigated, and our ultimate justification for choosing the JVM as the basis for AVM.
Security
Security is one of the first factors to consider when looking to build a blockchain virtual machine since its task is to run untrusted code, all the time, and this code cannot be allowed to harm the node.
For the most part, this is where EVM and WASM both shine while the JVM’s traditional use-case is based on the assumption that the user is choosing to run trusted code.
The reason for this is that the EVM was built specifically for this environment, so it already assumes it is always under attack, while WASM has the similar requirement of safely running code from a remote (typically untrusted) website in a local browser.
That said, we realized that we could use bytecode instrumentation to restrict access to the class library, which is where such access must be realized (that is the wall of the sandbox within Java). This allows us to make this untrusted code safe.
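As a rough illustration of the allowlist idea behind this kind of sandbox, the sketch below flags any class-library reference that falls outside an approved set. It is not the AVM's instrumentation: a real transformer would extract the referenced class names from the bytecode and rewrite or reject them, whereas here the names are simply passed in. `ClassAllowlist`, `ALLOWED`, and `findViolations` are hypothetical names, and the contents of the allowlist are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the names and the allowlist contents are illustrative
// assumptions, not the actual AVM rules.
final class ClassAllowlist {
    // Class-library types (JVM internal names) that untrusted code may reference.
    private static final Set<String> ALLOWED = Set.of(
            "java/lang/Object",
            "java/lang/String",
            "java/lang/Math",
            "java/math/BigInteger");

    // Returns every referenced class that is not on the allowlist, in input order.
    static List<String> findViolations(List<String> referencedClasses) {
        List<String> violations = new ArrayList<>();
        for (String name : referencedClasses) {
            if (!ALLOWED.contains(name)) {
                violations.add(name);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // References that a contract's bytecode is assumed to contain.
        List<String> refs = List.of("java/lang/String", "java/io/File", "java/lang/Thread");
        System.out.println(findViolations(refs)); // prints [java/io/File, java/lang/Thread]
    }
}
```

A deployment step could refuse any contract for which `findViolations` returns a non-empty list, which keeps file, network, and thread access out of reach before the code ever runs.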
It is also worth discussing JIT, at this point, since it can be used as an attack vector. Many virtual machines contain a “just-in-time compiler” (JIT): instead of interpreting their input language, they compile it to the native language of their hardware. In many environments, this compiler is run when the smart contract is deployed or every time it is run. This front-loads the cost of compilation ahead of execution, which can be exploited by sending a program that is cheap to submit but expensive to compile.
We have two ways of looking at this problem, when considering the JVM. First of all, the JVM wants to avoid compilation cost where possible: it only runs the JIT on code that keeps running, and it runs it on the same code multiple times, increasing the effort it is willing to expend in optimization each time. This means that an attacker would have to pay for many executions before triggering expensive compilation, so such an attack would incur a high cost to the attacker. Secondly, we have complete control over the incoming user code, based on our instrumenting transformers. This means that we could identify problematic idioms and change or reject them, in the future.
Our conclusion: While EVM and WASM have superior security, out of the box, we could enforce security requirements within the JVM by transforming incoming untrusted bytecode.
Determinism
Determinism is probably the second point to consider, next to security, when evaluating a virtual machine for the blockchain, as consensus breaks if nodes disagree on the result of a transaction. While sources of nondeterminism are typically externals, they can also be details of the hardware or aspects of the virtual machine’s implementation.
As this is a core blockchain concern, it is little surprise that EVM comes out on top. It was built for this, so of course it has its bases covered.
Even though WASM doesn’t, by default, expose any kind of externals, it does expose floating-point math support and doesn’t define that intermediate calculations are strict (although some implementations expose this option). This can allow details in the CPU implementation (architecture, vendor, or even generation) to be observed by the program, thus allowing different results.
The JVM scores poorly on this due to the same externals concern within its class library, as previously mentioned in the security section, but also due to an aspect of its implementation: the identity hash code. Every object allocated within Java has an identity hash code whose value is not deterministic. This could allow a plethora of differences to be observed between nodes.
It turns out that these concerns can be mitigated. WASM floating-point support can be disabled, at the cost of losing the feature (note that EVM does not support floating-point, at all), or a strict mode can be enabled, if available on the WASM implementation. We are easily able to enable the strict floating-point mode in the user code, on the JVM.
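For readers unfamiliar with the JVM's strict mode, here is a minimal sketch of what it looks like at the source level. The `strictfp` modifier requests strict IEEE 754 evaluation for every floating-point expression in the class; on Java 17 and later this is already the default, so the modifier is redundant there and the compiler may warn about it. The class name `StrictInterest` and the compounding calculation are assumptions made for the example, and a blockchain VM would more likely set the equivalent flag while transforming bytecode than rely on contract authors writing the keyword.

```java
// Hypothetical example class; only the strictfp modifier is the point here.
// On Java 17+ strict evaluation is the default and the modifier is redundant.
strictfp final class StrictInterest {
    // Repeated multiplication: with strict evaluation every intermediate
    // product is rounded to a 64-bit double on every platform.
    static double compound(double principal, double rate, int periods) {
        double value = principal;
        for (int i = 0; i < periods; i++) {
            value = value * (1.0 + rate);
        }
        return value;
    }

    public static void main(String[] args) {
        double result = compound(1000.0, 0.05, 12);
        // Printing the raw bit pattern makes it easy to compare nodes exactly.
        System.out.println(Long.toHexString(Double.doubleToLongBits(result)));
    }
}
```

Comparing the raw bit pattern rather than a rounded decimal string is the stricter check, since consensus needs nodes to agree on every bit.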
The JVM work-around for class library security can also be applied here. Similarly, this can be extended to make the hash code deterministic since it gives us control over the base class library the user code sees.
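One way to picture the hash code fix: if user code only ever sees a substituted base class, that class can hand out hash codes from a counter instead of relying on the JVM's identity hash. The sketch below is an assumption about how such a shim might look, not the AVM's actual class; `DeterministicObject` and `resetCounter` are hypothetical names.

```java
// Hypothetical stand-in for the base class that user code would be rewritten to extend.
class DeterministicObject {
    // Next hash code to hand out; assumed to be reset before each execution
    // so that every node starts from the same value.
    private static int nextHashCode = 1;

    private final int assignedHashCode;

    DeterministicObject() {
        // Allocation order is the same on every node, so the assigned values are too.
        this.assignedHashCode = nextHashCode++;
    }

    static void resetCounter() {
        nextHashCode = 1;
    }

    @Override
    public int hashCode() {
        return assignedHashCode;
    }
}

final class HashCodeDemo {
    public static void main(String[] args) {
        DeterministicObject.resetCounter();
        DeterministicObject a = new DeterministicObject();
        DeterministicObject b = new DeterministicObject();
        System.out.println(a.hashCode() + " " + b.hashCode()); // prints "1 2"

        // By contrast, this value is not specified and can differ between runs and machines.
        System.out.println(System.identityHashCode(new Object()));
    }
}
```

Because the counter only depends on how many objects the program has allocated, two nodes running the same transaction observe the same hash codes, and anything derived from them (such as iteration order in a hash-based collection) stays consistent.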
Our conclusion: Much like the security section, EVM is the clear winner here, but the shortcomings of WASM and JVM can be mitigated to make them options, as well.
Trust
The trust that a community has in a virtual machine is based less on its pure technical merits and more on its maturity and how many successes have been observed, in deployment.
This is an area which penalizes WASM for being a very new idea with largely untested implementations.
EVM does reasonably well in this area, as it has proven itself reliable over the past few years of Ethereum success stories.
The JVM wins this contest, hands down. It is over 20 years old and a few of its implementations (most obviously the standard Oracle implementation) have been the core technology across several domains, including high-scrutiny enterprise environments, for much of that time.
Our conclusion: The trust in the JVM is practically unprecedented within the virtual machine space so we felt that organizations building applications at scale would be more familiar and comfortable with this choice.
Language/Feature support
While this wasn’t a traditional concern of blockchains, it is growing in importance as we try to drive mainstream adoption and expand the possible scope of applications.
In this case, EVM scores poorly since languages targeting it are domain-specific and there is a history of bugs being hard to find in its type system.
Both WASM and JVM score highly in this area but for different reasons. WASM was designed to be a good fit as an LLVM back-end, meaning that most languages which LLVM can handle are able to generate code for WASM (notably C, C++, and Rust).
The JVM does well here due to the mainstream popularity of Java and the large number of JVM languages (notably Kotlin and Scala) with growing user bases.
Regarding other adoption-relevant features which matter here, we are again brought to floating-point. While financial applications have no interest in this feature, other application domains do. EVM doesn’t support this and WASM needs it disabled for determinism (or the implementation needs to offer a strict mode). The JVM, on the other hand, does have the ability to operate in a strict mode, which keeps intermediate results consistent across hardware for a small cost.
Our conclusion: Either WASM or JVM would be a good option here, but the JVM does get a small advantage due to the popularity of Java within the enterprise space and the ability to support strict floating-point.
Tooling
Beyond the basics of just writing a program is the requirement to perform more complex operations on it. This requires tooling. From editors, to compilers, to IDEs, to debuggers, to profilers, and beyond, this can go in many directions.
EVM has grown a set of useful tools over the past few years but they are still quite rudimentary and there has been little foray into more complex domains such as monitoring systems.
WASM is difficult to assess here since the answer depends on the tool, the implementation, and any additional restrictions imposed. On the level of the input languages, many existing tools can be used, but tools for a running virtual machine instance vary wildly and don’t go too far into profiling or monitoring.
The JVM, on the other hand, has a very rich history of tooling for every step of the process: IDEs, debuggers, build automation, testing infrastructure, monitoring, logging, JVMTI agents, etc. There are many options in this space, and most are quite mature.
Our conclusion: The maturity of the ecosystem around the JVM puts this one handily in its court. While we may eventually see similar developments in the WASM space, it is unclear whether its target environment (the web browser) will see the need to build these tools or whether there will ever be standards of interoperability between implementations.
Compatibility/libraries/patterns
The history of software development far predates the blockchain and there are many developers, libraries, and development patterns which came from these earlier, or different, domains. Ideally, we would be able to capitalize on this wealth of knowledge and expertise.
EVM, being very domain-specific, scores poorly in this area. There is essentially nothing to address these concerns within its design.
WASM approaches this from a few different angles. It brings developer familiarity from its original web domain and is able to capitalize on existing code and patterns due to its ability to compile from existing popular languages.
The JVM scores highly on this due to the long history of the JVM and its large developer base. This means a rich set of patterns and lots of preexisting code can be leveraged.
Our conclusion: Either WASM or JVM would be a good choice, from this perspective, since they both have effective ways of benefiting from preexisting developer communities and code.
Performance
Performance is a popular topic when comparing or discussing blockchains. In fact, this popularity may explain much of the interest in WASM within the Ethereum community. The problem is that this discussion is usually simplified to just a discussion of CPU throughput, or broadly explained as transactions per second of a never-shown benchmark application in an undescribed environment.
CPU performance of a virtual machine, when a JIT is involved, is not easy to compare. As mentioned in the security section, the JVM runs its JIT several times on frequently executed code, using application behavior it has observed to guide decisions at the higher optimization levels. This, combined with the maturity of the JVM’s JIT, has resulted in performance that exceeds most other virtual machines.
Memory locality is also a factor in modern systems, as the cost of a cache miss becomes substantial, in terms of CPU cycles. The flat heaps of EVM and WASM leave little room to improve this situation, while the type-safe heap of the JVM means that its GC (garbage collector) does have an opportunity. In a flat heap, related pieces of data must stay where they were allocated. The JVM’s GC, however, can dynamically move related pieces of data closer together, increasing cache density and resulting in a substantial performance win.
The reality of performance is far more nuanced and CPU is, in many modern scenarios, not the bottleneck. It is for this reason that we wanted to capture other factors, specifically disk or database access. In modern blockchain applications, lots of time (often the majority) is spent waiting for data from the database so capturing that within performance is an absolute must.
This is where we find particular value in the JVM: the aforementioned type-safe heap. We can build on top of the data relationship logic the GC uses for memory locality to determine long-lived data relationships, meaning we can bring this memory optimization to the disk!
Both EVM and WASM expose memory as a flat space, meaning that we can’t infer these relationships so such optimizations are not possible.
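To make the contrast concrete, here is a small sketch of why explicit references matter. In a typed heap, a runtime can walk from a root object to everything it points at and treat that group as a unit, for example by writing it to storage contiguously. The layout policy and the names `HeapNode`, `GraphClustering`, and `clusterFrom` are illustrative assumptions, not a description of the AVM's storage format.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical object with explicit, typed references to other objects.
final class HeapNode {
    final String label;
    final List<HeapNode> references = new ArrayList<>();

    HeapNode(String label) {
        this.label = label;
    }
}

final class GraphClustering {
    // Breadth-first walk from a root. The returned order is one in which the
    // objects could be stored next to each other, so a single read fetches them all.
    static List<HeapNode> clusterFrom(HeapNode root) {
        // Identity-based set: objects are tracked by reference, not by equals().
        Set<HeapNode> seen = Collections.newSetFromMap(new IdentityHashMap<HeapNode, Boolean>());
        Deque<HeapNode> queue = new ArrayDeque<>();
        List<HeapNode> cluster = new ArrayList<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            HeapNode current = queue.remove();
            cluster.add(current);
            for (HeapNode next : current.references) {
                if (seen.add(next)) { // add() returns false for already-visited objects
                    queue.add(next);
                }
            }
        }
        return cluster;
    }

    public static void main(String[] args) {
        HeapNode account = new HeapNode("account");
        HeapNode balance = new HeapNode("balance");
        HeapNode history = new HeapNode("history");
        account.references.add(balance);
        account.references.add(history);
        history.references.add(account); // cycles are handled by the seen set

        for (HeapNode node : clusterFrom(account)) {
            System.out.println(node.label); // prints account, balance, history on separate lines
        }
    }
}
```

With a flat byte array there is no equivalent of `references`: an integer stored in memory might be an address or might be a balance, so the runtime has nothing reliable to traverse.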
Our conclusion: The opportunity to optimize disk access on top of the well-known performance of the JVM’s JIT and GC was too compelling an argument to pass up.
Final thoughts
We suspect that EVM, WASM, and JVM are all feasible approaches, each with its own trade-offs or (in the cases of WASM and JVM) feature restrictions. The ideal (at least for this generation of problems) is probably somewhere in between, and maybe we will glimpse that as these various approaches progress.
WASM isn’t a bad choice for the blockchain, by any means, but it also isn’t an obviously good choice. The JVM gave us confidence and the ability to get this into the hands of a large developer base and tooling ecosystem quickly.