how does riak scale?

Tux Racer
Hello Riak users!

Let me first present, with a simple example, what I consider to be an
application that 'scales':
If I have a web server (e.g. Apache) serving 100GB of files and I get
4000 GETs per second, one way to scale the application is to copy
(rsync) the 100GB to another web server and balance the load across the
two nodes: that way, my cluster of 2 nodes can serve 2 x 4000 GETs
per second overall.
I can say that my architecture scales because if I double the number
of nodes, I also double the number of requests per second that the
system can handle.

With Riak, I am not sure I understand how the scaling works. Are we
talking about a global 'key GET' rate (requests per second) that scales
with the number of nodes added?
My web server example above also assumed that all the data (100GB) could
fit on a single node. As I understand it, Riak can be used to serve data
too large to fit on one disk. So maybe the scaling is about the data
itself: a web client (browser) would see no speed difference between
the responses from a Riak cluster serving K keys on N nodes and another
Riak cluster serving 2*K keys on 2*N nodes.

Also, the role of the number of replicas in scaling is not clear to me:
it seems to me that having a lot of replicas speeds up reads but slows
down writes. Is there a simple scaling law for this?

Thanks in advance,

TuX

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: how does riak scale?

Sean Cribbs-2
A number of points:

* The replication factor (N) * the total original data size = total data size stored in the cluster.  For example, with your 100GB of data and N=3, you have 300GB of data stored across the cluster.  If you had 6 nodes, this would be about 50GB per node.
* Riak can be tuned in your setup for GET speed, but in general it is optimized for fault-tolerance and availability.  Even in the best of conditions, it will not perform as well as a web server reading static files off the local disk.
* The replication factor will affect both reads and writes.  If your cluster size is larger than your N, you should see increased throughput as you add nodes.
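
As a rough illustration of the arithmetic in the first point, here is a minimal Python sketch. The function names are made up for this example, the 100GB figure comes from the original question, and it assumes replicas are spread perfectly evenly across the nodes, which a real cluster only approximates.

```python
def stored_total_gb(original_gb: float, n_val: int) -> float:
    """Total data held by the cluster: original size times replication factor N."""
    return original_gb * n_val


def per_node_gb(original_gb: float, n_val: int, nodes: int) -> float:
    """Approximate data per node, ASSUMING a perfectly even spread of replicas."""
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    return stored_total_gb(original_gb, n_val) / nodes


if __name__ == "__main__":
    original_gb = 100  # data size from the original question
    n_val = 3          # replication factor (N)
    for nodes in (6, 12, 24):
        share = per_node_gb(original_gb, n_val, nodes)
        print(f"{nodes} nodes: about {share:.1f}GB per node")
```

Under that assumption, doubling the node count halves the amount of data each node has to hold, while the total stored in the cluster stays at N times the original size.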

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Apr 12, 2010, at 1:19 PM, TuX RaceR wrote:

> Hello Riak users!
>
> Let me first present, with a simple example, what I consider to be an application that 'scales':
> If I have a web server (e.g. Apache) serving 100GB of files and I get 4000 GETs per second, one way to scale the application is to copy (rsync) the 100GB to another web server and balance the load across the two nodes: that way, my cluster of 2 nodes can serve 2 x 4000 GETs per second overall.
> I can say that my architecture scales because if I double the number of nodes, I also double the number of requests per second that the system can handle.
>
> With Riak, I am not sure I understand how the scaling works. Are we talking about a global 'key GET' rate (requests per second) that scales with the number of nodes added?
> My web server example above also assumed that all the data (100GB) could fit on a single node. As I understand it, Riak can be used to serve data too large to fit on one disk. So maybe the scaling is about the data itself: a web client (browser) would see no speed difference between the responses from a Riak cluster serving K keys on N nodes and another Riak cluster serving 2*K keys on 2*N nodes.
>
> Also, the role of the number of replicas in scaling is not clear to me: it seems to me that having a lot of replicas speeds up reads but slows down writes. Is there a simple scaling law for this?
>
> Thanks in advance,
>
> TuX