Riak maximum throughput

LKolin
We’re investigating Riak as a means of persisting some data in a very read-heavy environment. Part of this process is determining the maximum number of requests per second that a Riak instance can handle, and I’ve created a simple JUnit test to hammer the box with multiple worker threads that do reads, writes and deletes. I’ve found that at around 8 concurrent reader threads, the instance tops out at 950-975 requests per second (these are very small JSON objects with a single value in them), on a 4-core 3 GHz Xeon with 4 GB of RAM running SuSE x64.

I seem to have pegged the Riak box, since CPU utilization across all 4 cores is around 90% user, 5-7% kernel and about 3-5% idle. Nothing else on the box is doing anything, and there’s around 1+ GB of physical RAM available. No swap file usage. I tried different back-end storage types, but ets and dets seemed about the same (??) and fs was noticeably slower. Does 1000 requests per second seem like a reasonable upper end, or have I missed something obvious, tuning-wise? I’m somewhat new to this.
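For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a multi-threaded throughput harness. It is written in Python rather than JUnit, and it is not the test described above: the function name `measure_throughput`, its parameters, and the sleeping stand-in request are all assumptions made for illustration. Replace the stand-in with a real read against the node.

```python
import threading
import time


def measure_throughput(do_request, num_threads=8, duration_s=1.0):
    """Call do_request() in a loop on num_threads threads; return requests/second."""
    counts = [0] * num_threads
    stop = threading.Event()

    def worker(slot):
        # Each thread keeps exactly one request in flight at a time.
        while not stop.is_set():
            do_request()
            counts[slot] += 1  # each thread writes only its own slot

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return sum(counts) / elapsed


if __name__ == "__main__":
    # Stand-in for a real GET against the node (hypothetical): ~1 ms per "request".
    rate = measure_throughput(lambda: time.sleep(0.001), num_threads=8, duration_s=1.0)
    print(f"{rate:.0f} requests/second")
```

If each of the 8 threads really does keep one request in flight, then roughly 960 requests per second works out to an average of about 8 / 960 ≈ 8.3 ms per request, which is a useful number to compare against network round-trip time.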

Cheers!

Luke

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Riak maximum throughput

bryan-basho
On Thu, Feb 11, 2010 at 12:11 PM, LKolin <[hidden email]> wrote:

> We’re investigating Riak as a means of persisting some data in a very
> read-heavy environment. Part of this process is determining the maximum
> number of requests a second that a Riak instance can handle, and I’ve
> created a simple Junit test to hammer the box with multiple worker threads
> that do reads, writes and deletes. I’ve found that around 8 concurrent
> reader threads, the instance tops out at 950-975 requests a second (these
> are very small JSON objects with a single value in them), on a 4-core 3Ghz
> Xeon running SuSE x64. 4GB of RAM.
>
> I seem to have pegged the Riak box, since CPU utilization across all 4 cores
> is around 90% user, 5-7% kernel and about 3-5% idle. Nothing else on the box
> is doing anything, and there’s around 1+ GB of physical RAM available. No
> swap file usage. I tried different back-end storage types, but ets and dets
> seemed about the same (??) and fs was noticeably slower.  Does 1000 requests
> a second seem like a reasonable upper-end, or have I missed something
> obvious, tuning-wise? I’m somewhat new to this.
>
> Cheers!
>
> Luke


Hi, Luke.  Out of the box, 1000 requests/second on a single node
doesn't sound that out of line to me.  But, there are quite a few
knobs to play with.  I'll list a few here, as well as some questions.
Please feel free to respond here, or to our internal, basho-only list,
[hidden email].

- You said you're using a JUnit test suite.  Is it running on the same
machine as Riak?  On the same switch as the machine running Riak?
We've built our own benchmarking, if you're interested in having a
look at it.

- I assume you're using one of the HTTP interfaces: is it /jiak or
/raw?  We're planning to deprecate /jiak as it's not as efficient, and
doesn't provide much other benefit over /raw.

- Did you happen to try the innostore backend
(http://hg.basho.com/innostore)?  We've found that one to be the
fastest and most predictable in the large/heavy use case.

- How big was your total dataset (# of documents)?  How random were
your accesses?  What percentage of your requests were gets vs. puts
vs. deletes?

- Are you using the standard configuration, or have you tweaked
anything, like ring_creation_size?

- Standard N-value (3) for all data?

- Are you sure you obeyed all guidance on X-Riak-Vclock and
X-Riak-ClientId headers?  Omitting vector clock or client-id headers
can cause vector clocks to grow very large, inflating the size of the
data being pushed around.
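To illustrate the last point, here is a rough sketch of how a client might build the headers for a PUT that follows a GET of the same key. Only the header names X-Riak-Vclock and X-Riak-ClientId come from the message above; the helper name `riak_put_headers`, the UUID-based client id, and the sample vclock string are assumptions for illustration, not part of any Riak client library.

```python
import uuid


def riak_put_headers(client_id, vclock=None, content_type="application/json"):
    """Build headers for a PUT that follows a GET of the same key.

    vclock is the X-Riak-Vclock value returned by the preceding GET,
    or None when the key is being written for the first time.
    """
    headers = {"Content-Type": content_type, "X-Riak-ClientId": client_id}
    if vclock is not None:
        # Echo the vector clock back unchanged so the server can order the update.
        headers["X-Riak-Vclock"] = vclock
    return headers


# One stable id per client process, reused on every request.
# The id format here is an assumption; check what your Riak version expects.
CLIENT_ID = uuid.uuid4().hex

if __name__ == "__main__":
    first_write = riak_put_headers(CLIENT_ID)
    update = riak_put_headers(CLIENT_ID, vclock="opaque-value-from-GET")
    print(sorted(first_write))
    print(sorted(update))
```

The point of the sketch is the pattern: read, keep the vector clock from the response, and send it back with the same client id on the next write, rather than generating a fresh id or dropping the clock each time.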

Please get in touch any way you feel comfortable, and we'll be happy
to give you a hand figuring out if there's a way to improve your
performance.

-Bryan

Re: Riak maximum throughput

Vick Khera
On Thu, Feb 11, 2010 at 9:01 PM, Bryan Fink <[hidden email]> wrote:
> - I assume you're using one of the HTTP interfaces: is it /jiak or
> /raw?  We're planning to deprecate /jiak as it's not as efficient, and
> doesn't provide much other benefit over /raw.
>

I have a question on this: isn't the setup/teardown of the TCP
connection overwhelming at a high rate of queries?  Is there a way to
keep a persistent connection going?  I'm most interested in a Perl
API.

Re: Riak maximum throughput

Jack Moffitt
> I have a question on this: isn't the setup/teardown of the TCP
> connection overwhelming for high rate of queries?  Is there a way to
> keep a persistent connection going?  I'm most interested in a Perl
> API.

I've no idea if Webmachine does this, but HTTP 1.1 solves this problem
for free with persistence and pipelining.

jack.

Re: Riak maximum throughput

Justin Sheehy
On Fri, Feb 12, 2010 at 12:04 PM, Jack Moffitt <[hidden email]> wrote:

>> I have a question on this: isn't the setup/teardown of the TCP
>> connection overwhelming for high rate of queries?  Is there a way to
>> keep a persistent connection going?  I'm most interested in a Perl
>> API.
>
> I've no idea if Webmachine does this, but HTTP 1.1 solves this problem
> for free with persistence and pipelining.

Correct, Jack.

The current tip of Riak, via Webmachine, handles persistent connections.

You do not incur a TCP connection per request.
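As a small illustration of what connection reuse looks like from the client side, the sketch below sends several GETs over one HTTP/1.1 connection and counts how many TCP connections the server accepted. It is written in Python (not Perl) and runs against a tiny in-process stand-in server, not Riak or Webmachine; the class `CountingHandler`, the functions `fetch_many` and `run_demo`, and the path `/raw/bucket/key` are all hypothetical names chosen for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the stand-in server keep connections alive
    connections = 0

    def setup(self):
        super().setup()
        CountingHandler.connections += 1  # runs once per accepted TCP connection

    def do_GET(self):
        body = b"{}"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_many(host, port, path, n):
    """Issue n GETs over a single persistent connection; return the status codes."""
    conn = http.client.HTTPConnection(host, port)
    statuses = []
    try:
        for _ in range(n):
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body so the socket can be reused
            statuses.append(resp.status)
    finally:
        conn.close()
    return statuses


def run_demo(n=5):
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        statuses = fetch_many("127.0.0.1", server.server_address[1], "/raw/bucket/key", n)
    finally:
        server.shutdown()
        server.server_close()
    return statuses, CountingHandler.connections


if __name__ == "__main__":
    statuses, connections = run_demo(5)
    print(f"{len(statuses)} requests over {connections} TCP connection(s)")
```

A benchmark client that opens a new connection per request would instead show one accepted connection per request, so it is worth checking which behaviour your HTTP library defaults to before blaming the server.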

-Justin
