How much memory for 20GB of data?

Maria Neise
Hey,
I would like to store 20GB of data with Riak. Does anyone know how
much memory Riak would need for that?

Cheers,
Maria

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: How much memory for 20GB of data?

Will Moss
If you're planning on using bitcask, this site should be helpful:
http://wiki.basho.com/Bitcask-Capacity-Planning.html

Innostore and LevelDB store their keys on disk, so their minimum memory overhead would be quite small.


On Thu, Jul 14, 2011 at 3:02 PM, Maria Neise <[hidden email]> wrote:
> Hey,
> I would like to store 20GB of data with Riak. Does anyone know how
> much memory Riak would need for that?
>
> Cheers,
> Maria


Re: How much memory for 20GB of data?

Mark Phillips-4
On Thu, Jul 14, 2011 at 3:09 PM, Will Moss <[hidden email]> wrote:
> If you're planning on using bitcask, this site should be helpful:
> http://wiki.basho.com/Bitcask-Capacity-Planning.html

Fair warning - this page is slated to be revised per Nico Meyer's
awesome write-up. (See this for more details -
https://github.com/basho/riak_wiki/issues/101 )

That said, it should still give you a good starting point for what a
Bitcask-backed cluster (say that five times fast) will need.

Mark


Re: How much memory for 20GB of data?

Justin Sheehy
In reply to this post by Maria Neise
Hi, Maria.

In addition to what others have said, I would note that (at least) the
following issues matter quite a bit for such planning:

- how many items the data is broken up into
- how large the keys will be (especially if they are very large due to
embedded structure)
- what storage engine ("backend") is in use
- how many machines are in the cluster
- the N-val, or how many replicas are being stored (default is 3)

If you know those things, then you can make a more meaningful estimate.

I hope that this helps.

-Justin
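
As a rough illustration of how the factors listed above combine for a Bitcask backend, which keeps every key in memory, here is a minimal sketch. The function names and the 40-byte per-key overhead are assumptions made for illustration only and do not come from this thread; take the real overhead figure from the Bitcask capacity planning page.

```python
def bitcask_keydir_ram_bytes(num_keys, avg_bucket_plus_key_bytes, n_val=3,
                             per_key_overhead_bytes=40):
    """Rough cluster-wide RAM needed for Bitcask's in-memory key index.

    per_key_overhead_bytes is an ASSUMED placeholder, not a figure from
    this thread; check the capacity planning page for the current value.
    """
    # Every replica (n_val copies) of every key has its own index entry.
    return num_keys * n_val * (avg_bucket_plus_key_bytes + per_key_overhead_bytes)


def per_node_ram_bytes(total_bytes, num_nodes):
    """Spread the cluster-wide total evenly over the machines."""
    return total_bytes / num_nodes


# Hypothetical inputs: 1,000,000 items, 20-byte bucket+key, default N-val of 3.
total = bitcask_keydir_ram_bytes(1_000_000, 20)
print(total, "bytes across the cluster")
print(per_node_ram_bytes(total, 3), "bytes per node on a 3-node cluster")
```

The sketch only covers the key index; backends that keep keys on disk would need a different model.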





Re: How much memory for 20GB of data?

Maria Neise
Hey,
thank you very much for your hints.
I have 20,000,000 records of 1 KB each. The key is a string like
"user123456789". I am using the default backend, Bitcask. There is just
one machine in the cluster and I didn't change the N-val. I already
tried to insert the 20 GB of data, but 40 GB of memory was obviously
not enough, because only 7,000,000 records were inserted. So I thought
maybe 150 GB would be enough?

Cheers,
Maria


Re: How much memory for 20GB of data?

Justin Sheehy
Do you perhaps mean disk space instead of memory?

If so, and if you have left the N-val at the default of 3, then you
will need at least 60 GB of disk space before any other overhead is
accounted for.

-Justin




Re: How much memory for 20GB of data?

Maria Neise
Yes, I meant disk space. Sorry -.-

Cheers,
Maria


Re: How much memory for 20GB of data?

Will Moss
In that case, Riak adds a minimum of around 500 bytes of overhead per record, but that can grow as the size of the vector clock grows. As Justin mentioned, if N = 3 you'll be storing three copies of the data, so you need at least 60 GB, but probably closer to 90 GB (20 million records at about 1.5 KB each, times three).
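
The arithmetic behind the 60 GB and 90 GB figures can be sketched as follows. The function name is hypothetical, and the sketch assumes 1 KB = 1,000 bytes and decimal gigabytes; it uses the record count, record size, overhead, and N-val mentioned in this thread and models nothing else (for example, vector clock growth or stale data awaiting a merge).

```python
def riak_disk_bytes(num_records, value_bytes, n_val=3,
                    per_record_overhead_bytes=500):
    """Lower-bound disk estimate: each replica stores the value plus overhead."""
    return num_records * n_val * (value_bytes + per_record_overhead_bytes)


# Figures from the thread: 20,000,000 records of about 1 KB, N-val of 3.
raw = riak_disk_bytes(20_000_000, 1_000, per_record_overhead_bytes=0)
with_overhead = riak_disk_bytes(20_000_000, 1_000)

print(f"{raw / 1e9:.0f} GB raw, {with_overhead / 1e9:.0f} GB with overhead")
# prints "60 GB raw, 90 GB with overhead"
```

Treat the result as a floor when sizing a disk, since the per-record overhead is described above as a minimum.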

