Riak response time steadily increasing

ricardo.ekm
Hi all,
I'm running some performance tests with Riak and facing an issue where the average response time increases from 3 ms to 500 ms or even higher after some load.

There was no disk contention and memory was fine. CPU usage was high, though stable.

I'm using the multi backend, with Bitcask as the default backend and the one used in this test. Bitcask has its own separate disk.

The OS was tuned (except for net.core.wmem_max, net.core.rmem_max and net.core.netdev_max_backlog, which were not set at the time of the test).
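For anyone reproducing the setup, here is a minimal sketch of how the three parameters listed above could be inspected on a Linux node. It assumes the standard /proc/sys layout; the helper names (`sysctl_path`, `read_sysctl`) are made up for illustration, and the script only reads the current values without suggesting what they should be.

```python
from pathlib import Path

# The three kernel parameters named above as not set during the test.
UNSET_PARAMS = [
    "net.core.wmem_max",
    "net.core.rmem_max",
    "net.core.netdev_max_backlog",
]


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under root (hypothetical helper)."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for param in UNSET_PARAMS:
        print(f"{param} = {read_sysctl(param)}")
```

Running it on each node before a test makes it easy to confirm that all five nodes share the same settings.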

Merge was off (the merge window was set to another time), and the expiry grace time is set to 1 hour.

I'm using a five node cluster with m3.xlarge EC2 instance type.

You can find attached the test results and some related data.

Any input, a way to improve the results, or a way to drill down to the cause is appreciated!

Thanks!

--
Ricardo Mayerhofer

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Attachment: Riakloadtest-181115-1707-281.pdf (2M)
Re: Riak response time steadily increasing

ricardo.ekm
It seems the problem was the load itself. As with any technology, there's a throughput limit you can achieve with given hardware.

I was able to scale to m3.2xlarge and it handled a higher load.

I'm just wondering whether there's a way to debug this kind of situation, and whether CPU can really be the bottleneck for Riak (as it seems to be in my case). Is there a way to improve CPU usage?
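One possible starting point for drilling down, sketched here under assumptions: Riak exposes node statistics as JSON over its HTTP interface at /stats, including GET/PUT FSM latencies in microseconds. The snippet assumes the HTTP listener is enabled on 127.0.0.1:8098 (adjust for your nodes); the helper names are hypothetical, and this is only a sampling sketch, not a full diagnosis procedure.

```python
import json
from urllib.request import urlopen

# FSM latency statistics, which Riak reports in microseconds.
LATENCY_KEYS = [
    "node_get_fsm_time_mean",
    "node_get_fsm_time_95",
    "node_put_fsm_time_mean",
    "node_put_fsm_time_95",
]


def latency_ms(stats):
    """Convert the selected stats to milliseconds, skipping missing or non-numeric values."""
    return {
        key: stats[key] / 1000.0
        for key in LATENCY_KEYS
        if isinstance(stats.get(key), (int, float))
    }


def fetch_stats(base_url="http://127.0.0.1:8098"):
    """Fetch the stats document from one node (host and port are assumptions)."""
    with urlopen(base_url + "/stats", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for key, value in latency_ms(fetch_stats()).items():
        print(f"{key}: {value:.1f} ms")
```

Sampling this on every node during the load test, alongside CPU usage, would show whether the latency climb is uniform across the cluster or concentrated on a few nodes.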

Thanks.

--
Ricardo Mayerhofer
