I've now completed the first phase of testing Riak/leveled with 2i query load on top of standard GET/PUT Key/Value pressure.
In the previous tests without 2i, Riak/leveldb was throttled by write pressure, which was exacerbated by enabling sync_on_write, using traditional HDDs or using larger objects. To provide an interesting comparison with secondary indexes I've so far tested this only with SSDs, with no sync_on_write and with midsize objects (8KB).
The test results follow previous trends, with Riak + leveldb performing well until disk busyness becomes a commonly recurring factor - and at this point the achievable throughput becomes increasingly volatile. By comparison Riak + leveled is constantly constrained by available CPU, and performs at a more consistent rate with reduced tail latency throughout the test. However Riak + leveled cannot perform as fast as Riak + leveldb over short intervals when Riak + leveldb is not subject to resource pressure.
A six-hour test was used with a load ratio of 100 GETs to 20 updates to 2 secondary index range queries, with half the updates carrying four new index entries each. Over the course of the test throughput was roughly equivalent - with Riak/leveled achieving 4.48% more throughput (which is close to the margin of error in cloud-based testing). However in the last hour of the test the throughput advantage for Riak/leveled had increased to 22.36%, and this appears to be a consistent sustainable advantage.
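To make the load mix concrete, the ratio above can be turned into per-operation shares. This is only an illustrative sketch of the arithmetic, not the configuration of the actual test harness; the names `MIX`, `mix_shares` and `index_entries_per_cycle` are hypothetical.

```python
from fractions import Fraction

# Operation ratio as described above: 100 GETs : 20 updates : 2 2i range queries.
MIX = {"get": 100, "update": 20, "2i_range_query": 2}

# From the description above: half of the updates carry four new index entries each.
INDEXED_UPDATE_SHARE = Fraction(1, 2)
INDEX_ENTRIES_PER_INDEXED_UPDATE = 4


def mix_shares(mix):
    """Return each operation's share of the total load as an exact fraction."""
    total = sum(mix.values())
    return {op: Fraction(count, total) for op, count in mix.items()}


def index_entries_per_cycle(mix):
    """New index entries written per full cycle of the ratio (122 operations)."""
    return int(mix["update"] * INDEXED_UPDATE_SHARE * INDEX_ENTRIES_PER_INDEXED_UPDATE)


shares = mix_shares(MIX)
for op, share in shares.items():
    print(f"{op}: {float(share):.2%}")
print("new index entries per cycle:", index_entries_per_cycle(MIX))
```

Under these assumptions roughly 82% of operations are GETs, about 16% are updates and under 2% are 2i range queries, with 40 new index entries written for every 122 operations.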
As ever, multiple caveats apply. I've attempted a more complete write-up of the 2i test results here:
There were a number of optimisations made to leveled as part of this round of testing, mainly to try and improve the efficiency of the layout of the SST files within the Ledger (key and metadata store).
In the short term I'm going to attempt some additional refactoring to reduce the overhead of starting a 2i query within leveled. I will then look to demonstrate potential improvements in hashtree rebuild times and overheads in Riak/leveled when compared to Riak/leveldb. I also believe the improvements made as part of the 2i testing round should benefit the previous comparisons - so I will revisit those tests too and get up-to-date results.
In the medium term I intend to use leveled as a starting point for investigating new Riak features, and also new configurations (e.g. cloud-optimised, single backend for RiakCS) which may benefit from the flexibility gained from separation of Key and Value stores.
Many thanks to Russell Brown for his help and guidance over the past month. Also a big thanks to Mark Shaw and Angus McAllister from Amazon for their continued support running the tests.