why innodb?

why innodb?

Richard Bucker
With the many fully open and free databases out there, I was wondering why the team selected innodb? If memory serves, there are some tools, like hotbackup/restore, that are not free or included. Also, while there is an innodb/erlang library, it would seem that there is an impedance mismatch there.

/r

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: why innodb?

Dave Smith


On Fri, Feb 12, 2010 at 5:12 AM, Richard Bucker <[hidden email]> wrote:
With the many fully open and free databases out there, I was wondering why the team selected innodb? If memory serves, there are some tools, like hotbackup/restore, that are not free or included. Also, while there is an innodb/erlang library, it would seem that there is an impedance mismatch there.

Hi Richard,

The major drivers for choosing Embedded Inno over other storage libraries available right now were 1. predictability and 2. stability. 

For Riak's purposes, we need something that is going to have predictable latency under significant loads. After evaluating Tokyo Cabinet (TC), BerkeleyDB-C (BDB) and Embedded Inno, it was quite clear that Inno won this aspect hands down. TC was quite fast until the dataset got large, and then write latency went through the roof. BDB-C had an excellent average latency, but the 95th+ percentile latencies were highly variable (I saw 95th percentile times > 15 seconds); there were also some pretty icky threading bugs in the MVCC subsystem of BDB-C when using a lot of parallelism. Embedded Inno had none of these problems and performed better than the other two in all the test scenarios that I was able to come up with.
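To illustrate the point about average versus 95th percentile latency, here is a minimal Python sketch. It is not the benchmark used for the evaluation above; the function names (`percentile`, `summarize`) and the sample numbers are made up for illustration. It only shows how two stores with similar mean write latency can look very different once you compare tail percentiles.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Summarize a list of per-operation latencies (milliseconds)."""
    return {
        "mean": statistics.mean(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "max": max(latencies_ms),
    }


# Hypothetical data, not real measurements:
# "steady" takes 5 ms on every write; "spiky" is usually faster
# but stalls on 6 writes out of 100.
steady = [5.0] * 100
spiky = [1.0] * 94 + [100.0] * 6

if __name__ == "__main__":
    for name, data in (("steady", steady), ("spiky", spiky)):
        print(name, summarize(data))
```

With these made-up samples the two means are close, and the median even favours the spiky store, but its 95th percentile is twenty times worse. That is the kind of gap a mean-only comparison would hide.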

In terms of stability, Embedded Inno did require a few patches (which can be found on the Embedded Inno forums). However, with those patches in place, we've not had any problems with long-running deployments and significant parallelism (i.e. 512+ threads). More importantly, when bugs have been found in Inno, the team on the support forums is quite responsive in answering questions. Contrast this with the BDB forums where, when we found a bug with the MVCC subsystem, it took a few months for anyone to get back to us on what the problem _might_ be.

To be honest, I originally thought that BDB-C would be the ticket. It does have a better set of support tools and a great reputation. But when matched up against the requirements for a distributed k/v store and when it was MEASURED, it fell short. Embedded Inno has won me over pretty completely with fast performance (even for an unusual use case of storing only BLOBs), solid behaviour and good support forums.

Hope that helps,

D.


Re: why innodb?

DavidDabbs
In reply to this post by Richard Bucker


Thanks for the storage research details David. I'd be curious as to the performance of BDB now versus when you evaluated it. The recent versions seem to have exorcised some of the MVCC demons you referenced:


dd
david dabbs



4.8.26 03 Feb 2010
------------------------
Changes between 4.8.24 and 4.8.26:

        truncate log record could be too large when freeing too many pages during a compact. [#17313]

        deadlock detector might not run properly. [#17555]

        three bugs properly detecting thread local storage for DbStl. [#17609] [#18001] [#18038]

        "unable to allocate space from buffer cache" error was improperly generated. [#17630]

        DB->exists() did not accept the DB_AUTO_COMMIT flag. [#17687]

        DB_TXN_SNAPSHOT was not getting ignored when DB_MULTIVERSION not set. [#17706]

        bug prevented callback based partitioning through the Java API. [#17735]

        replication bug where log files were not automatically removed from the client side. [#17899]

        bug that prevented a sequence from closing properly after the EntityStore closed. [#17951]

        gets fail if the DB_GET_BOTH_FLAG is specified in a hash, sorted duplicates database. [#17997]


Known bugs in 4.8

        Sharing logs across mixed-endian systems does not work. [#18032]



4.8.24 17 Aug 2009
------------------------
Changes between 4.8.21 and 4.8.24:

        bug in MVCC where an exclusive latch was not removed when we couldn't obtain a buffer. [#17479]

        bug which could trigger an assertion when performing a B-tree page split and running out of log space or with MVCC enabled. [#17531]

        bug where a lock wasn't removed on a non-transactional locker. [#17509]

        incorrect representation of log system configuration info. [#17532]

        GCC 4.4 compiler bugs when building the examples and dbstl API. [#17504] [#17476]







Re: why innodb?

Dave Smith
On Fri, Feb 12, 2010 at 11:21 AM, David Dabbs <[hidden email]> wrote:

Thanks for the storage research details David. I'd be curious as to the performance of BDB now versus when you evaluated it. The recent versions seem to have exorcised some of the MVCC demons you referenced:

Well, I'll see about it. Embedded Inno still wins on predictable latency and support turnaround, so BDB isn't exactly something I'm motivated to spend time on. :)

D.
