Worried about the backends to use

Worried about the backends to use

Suman Kumar
Hi,

We currently have around 53 million key/value pairs, and each week we add about 1.4 million more keys. Can I use bitcask as my backend? How many nodes and what ring size should I go with, and how should I scale this linearly once every 3 months? (Will adding a node with the same configuration as the existing ones scale Riak and increase the capacity of all the nodes?)

I also want to know if there is any way to move the data from bitcask to leveldb once every three months, since my old data is accessed less frequently; this would help free up the RAM used by bitcask. Any help on this is appreciated.

Thanks,
Suman Kumar Dey.

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Worried about the backends to use

Jeremiah Peschka
Responses inline.
---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Thu, Nov 28, 2013 at 9:57 PM, Suman Kumar <[hidden email]> wrote:
> Hi,
>
> We currently have around 53 million key/value pairs, and each week we add about 1.4 million more keys. Can I use bitcask as my backend? How many nodes and what ring size should I go with, and how should I scale this linearly once every 3 months? (Will adding a node with the same configuration as the existing ones scale Riak and increase the capacity of all the nodes?)

There's a bitcask capacity planning tool available at http://docs.basho.com/riak/latest/ops/building/planning/bitcask/

This will be valid for 1.4.x and earlier. If I recall, some changes are being made to both LevelDB and bitcask that should reduce the per-object overhead associated with each key. These changes only apply to Riak 2.0, though.
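To get a rough feel for the numbers before reaching for that calculator, here is a back-of-the-envelope version of the estimate the page describes: keydir RAM is roughly the number of keys times (static per-key overhead plus average bucket+key length), multiplied by n_val because every key lives on n_val nodes. The ~45-byte static overhead, the 30-byte average bucket+key length, and n_val = 3 below are assumptions to replace with your own figures; the calculator above is the authoritative source for your Riak version.

# Rough bitcask keydir RAM estimate; every constant here is an assumption.
STATIC_PER_KEY_OVERHEAD = 45   # bytes of keydir RAM per key (assumed; check the calculator)
AVG_BUCKET_PLUS_KEY_LEN = 30   # bytes; depends entirely on your bucket/key naming
N_VAL = 3                      # default replication factor

def bitcask_ram_bytes(num_keys):
    """Total keydir RAM across the whole cluster (each key is stored N_VAL times)."""
    return num_keys * N_VAL * (STATIC_PER_KEY_OVERHEAD + AVG_BUCKET_PLUS_KEY_LEN)

keys_now = 53_000_000       # current key count from the question
weekly_growth = 1_400_000   # keys added per week

for months in (0, 3, 6, 12):
    keys = keys_now + weekly_growth * 4 * months   # roughly 4 weeks per month
    gib = bitcask_ram_bytes(keys) / 2**30
    print(f"{months:2d} months: {keys:>12,} keys -> ~{gib:.1f} GiB keydir RAM cluster-wide")

Divide the cluster-wide figure by your node count (and leave plenty of headroom for growth and for a node being down) to see whether a given cluster keeps the keydir comfortably in RAM.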
 

> I also want to know if there is any way to move the data from bitcask to leveldb once every three months, since my old data is accessed less frequently; this would help free up the RAM used by bitcask. Any help on this is appreciated.

There are a few approaches to moving your data. Part of the design decision is going to be based on how you've structured your buckets.

1) You can add keys to a LevelDB bucket and then delete them from the bitcask bucket. This will carry some storage overhead, and you may run into the problems outlined in this previous thread [1], namely that the data won't be removed until the next bitcask merge. If you set the bitcask merge thresholds too high, this could cause problems. (A rough client-side sketch of this approach follows below.)

2) You could set your default backend in the configs to use the multi-backend. When it comes time to age out your data, you could switch the backend on a bucket-by-bucket basis and hope for the best. This isn't supported or tested by Basho (as far as I know), so you'll be on your own with this one.

The smaller the time granularity for a bucket, the easier either option becomes. Moving one day is easier and faster than moving one week of data.
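Purely as an illustration of approach 1, and assuming the Python riak client plus a multi-backend setup where 'hot_data' is bitcask-backed and 'cold_data' is leveldb-backed (both bucket names are made up here, and the same idea works with any client):

# Sketch of approach 1: copy old keys to a leveldb-backed bucket, then
# delete them from the bitcask-backed bucket. Bucket names are hypothetical.
import riak

client = riak.RiakClient(pb_port=8087)
hot = client.bucket('hot_data')     # bitcask-backed (hypothetical name)
cold = client.bucket('cold_data')   # leveldb-backed (hypothetical name)

def age_out(keys_to_move):
    """Copy each key into the leveldb bucket, then delete it from bitcask."""
    for key in keys_to_move:
        obj = hot.get(key)
        if obj.exists:
            cold.new(key, data=obj.data, content_type=obj.content_type).store()
            hot.delete(key)   # disk space only comes back at the next bitcask merge

age_out(['orders/2013-08-01/123', 'orders/2013-08-01/124'])

Feeding age_out() is the hard part: a full key listing is expensive in Riak, so in practice you would drive this from your own index of keys (for example, one small object per day that records that day's keys).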

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Worried about the backends to use

John Daily
>
> 2) You could set your default backend in the configs to use the multi-backend. When it comes time to age out your data, you could switch the backend on a bucket-by-bucket basis and hope for the best. This isn't supported or tested by Basho (as far as I know), so you'll be on your own with this one.

Worse than unsupported, it simply won’t work. Once the backend is changed on a bucket, Riak will no longer know how to find the old data.

Rather than trying to migrate data, think about whether you can write the data twice on arrival: once to a bitcask-backed bucket with an appropriate expiry value, and once to a leveldb-backed bucket. This may seem wasteful, but it will be far simpler, and the extra capacity can be handled by just adding more servers.
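As a rough sketch of what that could look like from the application side (assuming the Python riak client, and a multi-backend setup where 'recent' is a bitcask-backed bucket with expiry_secs set and 'archive' is leveldb-backed; the client and bucket names here are illustrative only):

# Sketch of the dual-write idea: every incoming value is written to a
# bitcask bucket (which expires data on its own) and to a leveldb archive.
import riak

client = riak.RiakClient(pb_port=8087)
hot = client.bucket('recent')        # bitcask backend, expiry configured server-side
archive = client.bucket('archive')   # leveldb backend, keeps everything

def store(key, value):
    """Write every value to both buckets on arrival."""
    hot.new(key, data=value).store()
    archive.new(key, data=value).store()

def fetch(key):
    """Serve reads from the hot bucket first, falling back to the archive."""
    obj = hot.get(key)
    if not obj.exists:
        obj = archive.get(key)
    return obj.data if obj.exists else None

store('orders/2013-11-28/42', {'total': 199})
print(fetch('orders/2013-11-28/42'))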

-John


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com