RION Performance Benchmarks

Jakob Jenkov
Last update: 2019-09-14

One of the RION Design Goals is that RION should be fast to read and write. To verify that we have indeed met that design goal we have benchmarked RION against other data formats and toolkits. This page contains the results of these benchmarks. Of course, no data format is best at everything, but as you can see, RION does pretty well across the many situations measured.

RION vs. JSON vs. Protobuf vs. MessagePack vs. CBOR

We have compared RION to JSON, Protobuf (Google Protocol Buffers), MessagePack and CBOR.

First of all we have compared RION to JSON because JSON is a commonly used format for exchanging data over a network. JSON is a natural choice if the client is a web browser because web browsers have built-in support for parsing JSON to JavaScript objects. But JSON is also often used as a data format between backend services, despite not being the fastest, most compact or most flexible and expressive data format you could use for that purpose. For JSON we have used Jackson's JSON APIs, which are known to be among the fastest JSON APIs out there.

Second, we have benchmarked RION against Protobuf, MessagePack and CBOR, which are all binary data formats. Since RION is a binary data format, it is fairer to benchmark RION against these data formats than against JSON. For Protobuf we have used Google's Protocol Buffers implementation. For MessagePack and CBOR we have used Jackson's implementations.

Toolkits and APIs

We have benchmarked IAP Tools, Jackson (2.5.3 + 2.6.3) and Google Protocol Buffers (3.0.0-alpha-2). Furthermore, both IAP Tools and Jackson have multiple APIs you can use, so we have benchmarked (or will soon benchmark) those too.

Jackson is used for JSON, MessagePack and CBOR.

Both IAP Tools and Jackson have a Java Reflection based API which figures out what fields to serialize via reflection. Benchmarks of reflection based APIs are suffixed with a (R).
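To illustrate what "figuring out what fields to serialize via reflection" involves, here is a minimal sketch using only the standard java.lang.reflect API. It is not the IAP Tools or Jackson implementation; the SearchResult and ReflectionFieldScanner classes are hypothetical names made up for this example. The lookup work done per field hints at why reflection based APIs tend to be slower than hand coded ones.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example class; not part of IAP Tools or Jackson.
class SearchResult {
    boolean promoted = true;
    int rank = 1;
    String title = "RION";
    static int instances = 0;          // static fields are skipped
    transient String cacheKey = "x";   // transient fields are skipped
}

// Hypothetical helper showing reflection based field discovery.
class ReflectionFieldScanner {

    // Returns field name -> field value for every instance field
    // that is neither static, transient nor compiler generated.
    static Map<String, Object> readFields(Object source) {
        Map<String, Object> values = new LinkedHashMap<>();
        for (Field field : source.getClass().getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (Modifier.isStatic(modifiers)
                    || Modifier.isTransient(modifiers)
                    || field.isSynthetic()) {
                continue;
            }
            try {
                field.setAccessible(true);
                values.put(field.getName(), field.get(source));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Prints the three serializable fields and their values.
        // The iteration order of getDeclaredFields() is not specified.
        System.out.println(readFields(new SearchResult()));
    }
}
```

A real toolkit would typically cache the discovered Field objects per class instead of scanning on every call.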

Both IAP Tools and Jackson also have an API where you need to "hand code" the reading and writing of objects. These APIs perform better than the reflection based APIs, but require more hand coding by the developers using them. Benchmarks measuring hand coded APIs are suffixed with an (H).

Google Protocol Buffer reads and writes are always hand coded.

IAP Tools also has an "optimized" option where the property names of objects are left out, so only property values are written. Benchmarks measuring this option have an extra O added to the suffix, for instance (HO) or (RO).
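The effect of leaving out property names can be sketched with a toy encoding. The example below does not use the RION encoding or the IAP Tools API; it uses java.io.DataOutputStream purely as a stand-in binary format, and the class name and the field names (id, price, inStock) are assumptions made for illustration. Note that when names are omitted, the reader must already know the order and types of the fields.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: compares encoded size with and without field names.
class FieldNameOverheadSketch {

    // Encodes three values. If includeNames is true, each value is
    // preceded by its field name (2 byte length prefix + name bytes).
    static byte[] encode(boolean includeNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (includeNames) out.writeUTF("id");
            out.writeInt(42);            // 4 bytes
            if (includeNames) out.writeUTF("price");
            out.writeDouble(9.95);       // 8 bytes
            if (includeNames) out.writeUTF("inStock");
            out.writeBoolean(true);      // 1 byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("with names:    " + encode(true).length + " bytes");
        System.out.println("without names: " + encode(false).length + " bytes");
    }
}
```

In this toy encoding the three values take 13 bytes on their own and 33 bytes with names, so the names account for more than half of the output. The exact ratio for RION will differ, but the principle is the same: fewer bytes to write, transfer and read.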

We have used a red color for JSON (textual format), yellow colors for other binary formats (MessagePack, CBOR, Protobuf) and a green color for RION formats.

Benchmark Information

The benchmarks are all implemented using JMH (the Java Microbenchmark Harness). We have attempted to make the benchmarks as fair as possible (to our knowledge). Of course we may have overlooked something. Therefore the benchmark code is publicly available on GitHub:

https://github.com/jjenkov/iap-tools-java-benchmarks

The benchmarks are executed on an Intel Core i7-4770 Quad-Core Haswell server which has no other workload than these benchmarks. The benchmarks are executed with Java JDK 1.8.0_u60, 64-bit edition, with no special JVM flags enabled.

Length vs. Throughput

We have measured both the serialized length of the various formats and the throughput of read and write operations. For serialized length, a lower number is generally better. For throughput, a higher number is generally better.

The length of serialized data matters. More compact data transfers faster over networks, especially over encrypted connections, where it is currently recommended to turn off compression because of the CRIME and BREACH attacks.

Benchmark Configurations

We have benchmarked the reading and writing of the supported data types individually, and we have benchmarked the reading and writing of objects with mixed data types to get a picture of the average performance you can expect.

We have measured the individual data types in the following configurations:

1 object with 1 field - of the given data type
10 objects with 1 field - of the given data type
100 objects with 1 field - of the given data type
1000 objects with 1 field - of the given data type
1 object with 10 fields - of the given data type
10 objects with 10 fields - of the given data type
100 objects with 10 fields - of the given data type
1000 objects with 10 fields - of the given data type

A single object with a single field gives an impression of how RION performs with small objects containing a few fields of the given data type.

The single object with 10 fields gives an impression of how RION performs with objects with more fields of the given data type. 10 fields means that the overhead of writing the object (the object overhead) is spread out over more fields. The performance of reading and writing an object with 10 fields of a given data type thus gives you a more precise impression of the read / write performance of that data type.

Reading and writing arrays of objects with fields of the given data type is done to show the performance of reading and writing bigger data structures. We have measured these configurations with 10, 100 and 1000 objects in the array. We chose 10 because 10 is a common number of objects to send back from e.g. a web service (e.g. 10 search results, or browsing through a larger number of results 10 at a time). We chose 100 and 1000 to give a picture of the performance of reading and writing larger numbers of objects.
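The eight configurations are simply every combination of four object counts and two field counts. The sketch below enumerates them in the order listed above. The class and method names are hypothetical and are not taken from the published benchmark code; in JMH such a matrix is commonly expressed with @Param annotations, but whether the actual benchmarks do so is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the benchmark configuration matrix.
class BenchmarkConfigurations {

    static final int[] FIELD_COUNTS  = {1, 10};
    static final int[] OBJECT_COUNTS = {1, 10, 100, 1000};

    // Returns {objectCount, fieldCount} pairs for all 8 configurations.
    static List<int[]> all() {
        List<int[]> configurations = new ArrayList<>();
        for (int fieldCount : FIELD_COUNTS) {
            for (int objectCount : OBJECT_COUNTS) {
                configurations.add(new int[] {objectCount, fieldCount});
            }
        }
        return configurations;
    }

    // Total number of field values read or written in one configuration.
    static int totalFieldValues(int[] configuration) {
        return configuration[0] * configuration[1];
    }

    public static void main(String[] args) {
        for (int[] configuration : all()) {
            System.out.println(configuration[0] + " object(s) with "
                    + configuration[1] + " field(s) = "
                    + totalFieldValues(configuration) + " field values");
        }
    }
}
```

The largest configuration, 1000 objects with 10 fields, moves 10,000 field values per operation, which is why throughput numbers are only comparable within the same configuration.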

Read Throughput

This section contains read throughput benchmarks for a variety of objects with different numbers of properties and data types. The objects are the same as those used in the write throughput and serialized length benchmarks later.

Throughput here means the number of times per second a given API can read an object from its serialized form. The higher the throughput, the better.
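As a rough illustration of this definition, the sketch below counts how many times an operation completes within a time window and converts that to operations per second. It is not how the published benchmarks are implemented: they use JMH, which additionally handles warm-up, forking and other JVM effects that this naive loop ignores. The class and method names are hypothetical, and Integer.parseInt is just a stand-in for "read an object from serialized form".

```java
import java.util.function.Supplier;

// Hypothetical, simplified throughput measurement (not JMH).
class ThroughputSketch {

    // Results are folded into this field so the JIT compiler
    // cannot discard the measured work as dead code.
    static volatile int sink;

    // Throughput = completed operations divided by elapsed seconds.
    static double opsPerSecond(long operations, long elapsedNanos) {
        return operations * 1_000_000_000.0 / elapsedNanos;
    }

    // Runs the operation repeatedly for at least durationNanos
    // and returns the observed operations per second.
    static double measure(Supplier<Object> readOperation, long durationNanos) {
        long operations = 0;
        long start = System.nanoTime();
        long elapsed;
        do {
            Object result = readOperation.get();
            sink = sink ^ (result == null ? 0 : result.hashCode());
            operations++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return opsPerSecond(operations, elapsed);
    }

    public static void main(String[] args) {
        // Stand-in "read" operation: parse a number from text for one second.
        double throughput = measure(() -> Integer.parseInt("12345"), 1_000_000_000L);
        System.out.println("ops/second: " + throughput);
    }
}
```

For example, 1000 reads completed in half a second corresponds to a throughput of 2000 reads per second.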

Mixed Types

The mixed type throughput benchmark uses an object with a boolean, int, float, double and string field (5 fields).