Issues with capacity planning pages on wiki


Issues with capacity planning pages on wiki

Anthony Molinaro

Hi,

  As I'm about to dramatically increase our riak investment by putting
lots more data into it, I figured I'd run through the
capacity planning on the wiki.

Since my current setup is fairly small and manageable I decided to try
to see how accurately the capacity planning matches what I see.

So first reality

  A: Number of Machines   : 8
  B: Memory per Machine   : 24 GB
  C: Length of Bucket Name: 10 bytes
  D: Length of Keys       : 36 bytes
  E: Length of Values     : 36 bytes
  F: Replication Factor   : 3
  G: Number of Keys       : 183915891
  H: Disk Space used      : 341898018816 bytes (341 GB)
  I: RAM                  : 70536691712 bytes (70 GB)

  G was calculated using riak_kv_bitcask_backend:key_counts/0 for
    each bitcask on a node, summing, then dividing by 3
  H was calculated with 'du -sk /var/lib/riak/bitcask/ | cut -f1', summing
    and multiplying by 1024
  I was calculated with 'ps -U riak -o vsz h', summing and multiplying
    by 1024
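
For reference, the per-node piece of G can be pulled from an attached
console with something like the sketch below (the exact return shape of
key_counts/0 is an assumption here; adjust the comprehension if it differs):

    %% Minimal sketch, run on each node from an attached console.
    %% Assumes riak_kv_bitcask_backend:key_counts/0 returns a list of
    %% {Partition, Count} tuples; sum across nodes, then divide by n_val.
    NodeKeys = lists:sum([Count || {_Partition, Count} <- riak_kv_bitcask_backend:key_counts()]).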

Now from entering A-G on the Bitcask-Capacity-Planning page I get

  Total Key Space: 34.9 GB
  Node Count : 3 (7 GB Storage per Node)

in the first section and

  Key Overhead: 73 Bytes (22 Byte Overhead)
  Total Documents: 1,010,580,541
  Total Disk Used: 102 GB of Disk Space

Also when using the Cluster Capacity Planning page I get

  (static bitcask per key overhead
   + estimated average bucket+key length in bytes)
   * estimate total number of keys
   * n_val
   = Approximate RAM Needed for Bitcask

So plugging in values

  ( 22 + 10 + 36 ) * 183915891 * 3 = 37518841764 = 34.9 GB

and

  Disk = Estimated Total Objects * Average Object Size * n_val
  Disk = 183915891 * 36 * 3 = 19862916228 = 18.49 GB
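
For concreteness, the same two wiki formulas as a small Erlang sketch
(function names are mine, not the wiki's; GB here means 1024^3 bytes):

    %% RAM formula from the Cluster Capacity Planning page, as quoted above.
    wiki_ram_estimate(BucketLen, KeyLen, NumKeys, NVal) ->
        (22 + BucketLen + KeyLen) * NumKeys * NVal.

    %% Disk formula from the same page (note: no per-entry overhead at all).
    wiki_disk_estimate(ValueLen, NumKeys, NVal) ->
        ValueLen * NumKeys * NVal.

    %% wiki_ram_estimate(10, 36, 183915891, 3) / math:pow(1024, 3) -> ~34.9
    %% wiki_disk_estimate(36, 183915891, 3)    / math:pow(1024, 3) -> ~18.5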

So either the equations are drastically wrong or my calculations are.  I find
it very suspect that the equation for the amount of disk includes zero
overhead, when from reading the bitcask paper it seems like each entry
consists of

  CRC, timestamp, keysz, valsz, key, value

Well anyway, there's obviously something off, as I end up with the following

      Bitcask-Capacity-Planning     Cluster-Capacity-Planning   Reality
RAM            34.9 GB                        34.9 GB             70 GB
Disk          102 GB                          18.49 GB           341 GB

So it looks to me like the numbers for RAM are about 1/2 of actual and
the numbers for Disk are completely off; they differ depending on
which page you look at on the wiki and vastly underestimate reality.

I'm hoping someone from basho can clarify so I can really determine
capacity.

Thanks,

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>


Re: Issues with capacity planning pages on wiki

Dave Smith
On Mon, May 23, 2011 at 9:39 PM, Anthony Molinaro
<[hidden email]> wrote:
>
>      Bitcask-Capacity-Planning     Cluster-Capacity-Planning   Reality
> RAM            34.9 GB                        34.9 GB             70 GB
> Disk          102 GB                          18.49 GB           341 GB
>
> So it looks to me like the numbers for RAM are about 1/2 of actual and
> the number for Disk are completely off, they are different depending on
> which page you look at on the wiki and vastly underestimate reality.

So RAM would require a little digging to figure out; disk is easier to
explain. The disk calculations do not take into account (as best I can
tell) the fact that bitcask is an append-only store and requires
periodic merging/compaction of the on-disk files. Thus, depending on
your merge triggers, more space can be used than is strictly necessary
to store the data.

D.

--
Dave Smith
Director, Engineering
Basho Technologies, Inc.
[hidden email]


Re: Issues with capacity planning pages on wiki

Anthony Molinaro

On Mon, May 23, 2011 at 09:57:25PM -0600, David Smith wrote:

> On Mon, May 23, 2011 at 9:39 PM, Anthony Molinaro
> <[hidden email]> wrote:
> >
> >     Bitcask-Capacity-Planning  Cluster-Capacity-Planning  Reality
> > RAM   34.9 GB                     34.9 GB                   70 GB
> > Disk 102 GB                       18.49 GB                 341 GB
> >
> > So it looks to me like the numbers for RAM are about 1/2 of actual and
> > the number for Disk are completely off, they are different depending on
> > which page you look at on the wiki and vastly underestimate reality.
>
> So RAM would require a little digging to figure out;

Anything I can do there to help?  I'd really like to get to the bottom
of the discrepancy with these numbers.  I assume everything is stored
as binaries, and I'm not seeing some sort of 64-bit doubling (I know
I convert my keys and values to binaries before sending them to riak).

Here's the output of memory/0 on an attached shell

(riak@10.1.1.31)1> memory().
[{total,7281790968},
 {processes,18543872},
 {processes_used,18132704},
 {system,7263247096},
 {atom,825105},
 {atom_used,815183},
 {binary,603512},
 {code,8306646},
 {ets,536440}]

That seems like it's almost all used by 'system', which I assume is the keydirs
in the driver.  Also, does the number of partitions impact this value at all?
I have 1024 partitions total on 8 nodes; the ring currently looks like

ring_ownership : <<"[{'riak@10.1.8.10',128},\n {'riak@10.1.6.30',128},\n
{'riak@10.1.10.20',129},\n {'riak@10.1.1.31',129},\n {'riak@10.1.2.32',128},\n
{'riak@10.1.7.6',128},\n {'riak@10.1.11.18',126},\n {'riak@10.1.9.9',128}]">>

That also seems a bit odd; I would expect them all to be 128, but anyway.

> disk is easier to
> explain. The disk calculations do not take into account (as best I can
> tell) the fact that bitcask is an append-only store and requires
> periodic merging/compaction of the on-disk files.

Is there any way to force a merge/compaction so I can attempt to better
understand my usage?  I know with cassandra I had a way to run compactions
with their nodetool, but riak-admin doesn't seem to have any sort of
controls, unless a backup causes merging to occur.
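
(For what it's worth, the bitcask application does export a merge function
that takes a data directory; whether pointing it at the live directories of
a running node is safe is an assumption, so treat this attached-console
sketch as speculative:)

    %% Speculative sketch: merge every partition's bitcask directory.
    %% bitcask:merge/1 operates on a directory; running it while the vnodes
    %% hold their own bitcask handles has not been verified as safe here.
    BitcaskRoot = "/var/lib/riak/bitcask",
    {ok, Partitions} = file:list_dir(BitcaskRoot),
    [bitcask:merge(filename:join(BitcaskRoot, P)) || P <- Partitions].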

> Thus, depending on
> your merge triggers, more space can be used than is strictly necessary
> to store the data.

So the lack of any overhead in the calculation is expected?  I mean
according to http://wiki.basho.com/Cluster-Capacity-Planning.html

Disk = Estimated Total Objects * Average Object Size * n_val

Which just seems wrong, doesn't it?  I don't quite understand the
bitcask code well enough yet to see what data it actually stores,
but the whitepaper suggested several things were involved in the on-disk
representation.

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>


Re: Issues with capacity planning pages on wiki

Anthony Molinaro

On Mon, May 23, 2011 at 10:53:29PM -0700, Anthony Molinaro wrote:

>
> On Mon, May 23, 2011 at 09:57:25PM -0600, David Smith wrote:
> > On Mon, May 23, 2011 at 9:39 PM, Anthony Molinaro
> > Thus, depending on
> > your merge triggers, more space can be used than is strictly necessary
> > to store the data.
>
> So the lack of any overhead in the calculation is expected?  I mean
> according to http://wiki.basho.com/Cluster-Capacity-Planning.html
>
> Disk = Estimated Total Objects * Average Object Size * n_val
>
> Which just seems wrong, doesn't it?  I don't quite understand the
> bitcask code well enough yet to see what the actual data it stores is,
> but the whitepaper suggested several things were involved in the on
> disk representation.

Okay, I finally found the code for this part; I kept looking in the NIF,
but that's only the keydir, not the data files.  It looks like

   %% Setup io_list for writing -- avoid merging binaries if we can help it
   Bytes0 = [<<Tstamp:?TSTAMPFIELD>>, <<KeySz:?KEYSIZEFIELD>>,
             <<ValueSz:?VALSIZEFIELD>>, Key, Value],
   Bytes  = [<<(erlang:crc32(Bytes0)):?CRCSIZEFIELD>> | Bytes0],

And looking at the header, it seems that there's 14 bytes of overhead
(4 for CRC, 4 for timestamp, 2 for keysize, 4 for valsize).

So disk calculation should be

( 14 + Key + Value ) * Num Entries * N_Val

So using my numbers from before that gives

( 14 + 36 + 36 ) * 183915891 * 3 = 47450299878 = 44.1 GB

which actually isn't much closer to 341 GB than the previous calculation :(

So all my questions from the previous email still apply.

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>


Re: Issues with capacity planning pages on wiki

Anthony Molinaro
Just curious if anyone has any ideas.  For the moment, I'm just taking
the RAM calculation and multiplying by 2, and the Disk calculation and
multiplying by 8, based on my findings with my current cluster.  But
I would like to know why my values are so much higher than those I should
be getting.

Also, I'd still like to know how the forms calculate things, as the disk
calculation there matches neither reality nor the formula.

Also, I'm still waiting to hear if there is any way to force a merge to run,
so I can more accurately gauge whether multiple copies are affecting disk usage.

Thanks,

-Anthony

On Mon, May 23, 2011 at 11:06:31PM -0700, Anthony Molinaro wrote:

>
> Okay, finally found the code for this part, I kept looking in the nif
> but that's only the keydir, not the data files.
> [snip]

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>


Re: Issues with capacity planning pages on wiki

Nico Meyer
Hi Anthony,

I think I can explain at least a big chunk of the difference in RAM and disk consumption you see.

Let's start with RAM. I could of course be wrong here, but I believe the 'static bitcask per key overhead' is simply too small. Let me explain why.
The bitcask_keydir_entry struct for each entry looks like this:

typedef struct
{
    uint32_t file_id;
    uint32_t total_sz;
    uint64_t offset;
    uint32_t tstamp;
    uint16_t key_sz;
    char     key[0];
} bitcask_keydir_entry;

This indeed has a size of 22 bytes (the array 'key' has zero entries because the key is written to the memory address directly after the keydir entry).
As is done in the capacity planner, you need to add the size of the bucket and key to get the size of the keydir entry, but that is not the whole story.

The thing that is actually stored in key is the result of this Erlang expression:
   erlang:term_to_binary( {<<"bucket">>, <<"key">>} )
that is, a tuple of two binaries converted to the Erlang external term format.

So let's see:
1> term_to_binary({<<>>,<<>>}).
<<131,104,2,109,0,0,0,0,109,0,0,0,0>>
2> iolist_size(term_to_binary({<<>>,<<>>})).
13
3> iolist_size(term_to_binary({<<"a">>,<<"b">>})).
15
4> iolist_size(term_to_binary({<<"aa">>,<<"b">>})). 
16
5> iolist_size(term_to_binary({<<"aa">>,<<"bb">>})).
17
So even an empty bucket/key pair takes 13 bytes to store.

Also, since the hashtable storing the keydir entries is essentially an array of pointers to bitcask_keydir_entry objects, there are another 8 bytes of overhead per key, assuming you are running a 64-bit system.

So the real static overhead per key is not 22 but 22 + 13 + 8 = 43 bytes.

Let's run the numbers for your predicted memory consumption again:
  ( 43 + 10 + 36 ) * 183915891 * 3 = 49105542897 = 45.7 GB
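
Or, as a small Erlang sketch of the revised estimate (the function name is
illustrative; the 43 bytes are the struct + term_to_binary framing +
hashtable pointer derived above):

    %% Revised keydir RAM estimate using the breakdown above.
    keydir_ram_estimate(BucketLen, KeyLen, NumKeys, NVal) ->
        StaticOverhead = 22 + 13 + 8,   % struct, {Bucket,Key} framing, pointer
        (StaticOverhead + BucketLen + KeyLen) * NumKeys * NVal.

    %% keydir_ram_estimate(10, 36, 183915891, 3) / math:pow(1024, 3) -> ~45.7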

Your actual RAM consumption of 70 GB seems to be at odds with the output of erlang:memory/0 that you sent:

{total,7281790968} =>  RAM: 7281790968 * 8 nodes = 58254327744 = 54.3 GB

So that is much closer, within about 20 percent. Some additional overhead is to be expected, but it is hard to say how much of that is due to Erlang's internal usage and how much is due to bitcask.

So let's examine the disk consumption next.
As you rightly concluded, the equation here (http://wiki.basho.com/Cluster-Capacity-Planning.html) is somewhat simplified, and you are also right that the real equation would be
( 14 + Key + Value ) * Num Entries * N_Val
On the other hand, 14 bytes + key size might be quite irrelevant if your values are at least 2 KB in size (as in the example), which seems to be the general assumption in some aspects of the design of riak and bitcask.
As you also noticed, this additional small overhead brings you nowhere near the disk usage that you observe.

First, the key that is stored in the bitcask files is not just the key part of the bucket/key pair that riak calls a key, but the serialized bucket/key pair described above, so the calculation becomes:
( 14 + ( 13 + Bucket + Key) + Value ) * Num Entries * N_Val

( 14 + ( 13 + 10 + 36) + 36 ) * 183915891 * 3 = 56 GB
Still not enough :-/.
So next let's examine what is actually stored as the value in bitcask. It is not simply the data you provide, but a riak object (an r_object record), which is again serialized by the erlang:term_to_binary/1 function. So let's see. I create a new riak object with a zero-byte bucket, key and value:
3> Obj = riak_object:new(<<>>,<<>>,<<>>).      
{r_object,<<>>,<<>>,
          [{r_content,{dict,0,16,16,8,80,48,
                            {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                            {{[],[],[],[],[],[],[],[],[],[],[],[],...}}},
                      <<>>}],
          [],
          {dict,1,16,16,8,80,48,
                {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}},
          undefined}
4> iolist_size(erlang:term_to_binary(Obj)).
205
Also, bucket and key are contained in the riak object itself (and therefore in the bitcask notion of the value). So with this information the predicted disk usage becomes:
( 14 + ( 13 + Bucket + Key ) + ( 205 + Bucket + Key + Value ) ) * Num Entries * N_Val

( 14 + ( 13 + 10 + 36) + ( 205 + 10 + 36 ) ) * 183915891 * 3 = 166.5 GB
which is way closer to the 341 GB you observe.

But we can get even closer, although the details become somewhat fuzzier. Bear with me.
I again create a riak object, but this time with a non-empty bucket/key so I can store it in riak:
([hidden email])7> Obj = riak_object:new(<<"a">>,<<"a">>,<<>>). 
{r_object,<<"a">>,<<"a">>,
          [{r_content,{dict,0,16,16,8,80,48,
                            {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                            {{[],[],[],[],[],[],[],[],[],[],[],[],...}}},
                      <<>>}],
          [],
          {dict,1,16,16,8,80,48,
                {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}},
          undefined}

([hidden email])8> iolist_size(erlang:term_to_binary(Obj)).
207
([hidden email])9> {ok,C}=riak:local_client().
{ok,{riak_client,'[hidden email]',<<2,123,179,255>>}}
([hidden email])10> C:put(Obj,1,1).            
ok
([hidden email])12> {ok,ObjStored} = C:get(<<"a">>,<<"a">>, 1).
{ok,{r_object,<<"a">>,<<"a">>,
         [{r_content,{dict,2,16,16,8,80,48,
			 {[],[],[],[],[],[],[],[],[],[],[],[],...},
                         {{[],[],[],[],[],[],[],[],[],[],...}}},
                     <<>>}],
              [{<<2,123,179,255>>,{1,63473554112}}],
              {dict,1,16,16,8,80,48,
                    {[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                    {{[],[],[],[],[],[],[],[],[],[],[],...}}},
               undefined}}
([hidden email])13> iolist_size(erlang:term_to_binary(ObjStored)).
358
OK, what happened? The object we retrieved is considerably larger than the one we stored. One culprit is the vector clock data, which was an empty list for Obj and now has one entry:

([hidden email])14> riak_object:vclock(Obj).
[]
([hidden email])15> riak_object:vclock(ObjStored). 
[{<<2,123,179,255>>,{1,63473554112}}]
([hidden email])23> iolist_size(term_to_binary(riak_object:vclock(Obj))).      
2
([hidden email])24> iolist_size(term_to_binary(riak_object:vclock(ObjStored))).
30
So that's 28 bytes each time the object is updated with a new client ID (so always use a meaningful client ID!), until vclock pruning sets in. The default bucket property is {big_vclock,50}, so in the worst case this could account for 28*50 = 1400 bytes!
But each object that has been stored has at least one entry in the vclock, so that is another 28 bytes of overhead.

The other part of the growth stems from some standard entries, which are added to the object metadata during the put operation:

([hidden email])35> dict:to_list(riak_object:get_metadata(Obj)).      
[]
([hidden email])37> iolist_size(term_to_binary(riak_object:get_metadata(Obj))).
60

([hidden email])36> dict:to_list(riak_object:get_metadata(ObjStored)).
[{<<"X-Riak-VTag">>,"7PoD9FEMUBzNmQeMnjUbas"},
 {<<"X-Riak-Last-Modified">>,{1306,334912,424099}}]
([hidden email])38> iolist_size(term_to_binary(riak_object:get_metadata(ObjStored))).
183
So there are the other 123 bytes.

In total, this 356-byte* overhead per object leads us to the following calculation (* 2 bytes of the 358 above came from the bucket and key, which are already accounted for):
( 14 + ( 13 + Bucket + Key ) + ( 356 + Bucket + Key + Value ) ) * Num Entries * N_Val
( 14 + ( 13 + 10 + 36) + ( 356 + 10 + 36 ) ) * 183915891 * 3 = 244 GB

We are getting closer!
If you loaded the data via the REST API the overhead is somewhat larger still, since the object will also contain 'content-type', 'X-Riak-Meta' and 'Link' metadata entries:

xxxx@node2:~$ curl -v -d '' -H "Content-Type: text/plain" http://127.0.0.1:8098/riak/a/a

([hidden email])44> {ok,ObjStored} = C:get(<<"a">>,<<"a">>, 1).
{ok,{r_object,<<"a">>,<<"a">>,
              [{r_content,{dict,5,16,16,8,80,48,
                                {[],[],[],[],[],[],[],[],[],[],[],[],...},
                                {{[],[],[[<<"Links">>]],[],[],[],[],[],[],[],...}}},
                          <<>>}],
              [{<<5,134,53,93>>,{1,63473557230}}],
              {dict,1,16,16,8,80,48,
                    {[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                    {{[],[],[],[],[],[],[],[],[],[],[],...}}},
              undefined}}
([hidden email])45> dict:to_list(riak_object:get_metadata(ObjStored)).               
[{<<"Links">>,[]},
 {<<"X-Riak-VTag">>,"3TQzJznzXXWtZefntWXPDR"},
 {<<"content-type">>,"text/plain"},
 {<<"X-Riak-Last-Modified">>,{1306,338030,682871}},
 {<<"X-Riak-Meta">>,[]}]

([hidden email])46> iolist_size(erlang:term_to_binary(ObjStored)).                   
449

Which leads to: (remember again to subtract 2 bytes)
( 14 + ( 13 + Bucket + Key ) + ( 447 + Bucket + Key + Value ) ) * Num Entries * N_Val
( 14 + ( 13 + 10 + 36) + ( 447 + 10 + 36 ) ) * 183915891 * 3 = 290.8 GB

Nearly there!

Now there are also the hintfiles, which are a kind of index into the bitcask data files to speed up the start of a riak node. The hintfiles contain one entry per key, and the code that creates one entry looks like this:
    [<<Tstamp:?TSTAMPFIELD>>, <<KeySz:?KEYSIZEFIELD>>,
     <<TotalSz:?TOTALSIZEFIELD>>, <<Offset:?OFFSETFIELD>>, Key].

So that's 4 + 2 + 4 + 8 + KeySize (= 18 + KeySize) additional bytes per key.
So the final result, if you inserted the keys via the REST API, is:
( 14 + ( 13 + Bucket + Key ) + ( 447 + Bucket + Key + Value ) + (18 + ( 13 + Bucket + Key ) ) ) * Num Entries * N_Val = ( 505 + 3 * (Bucket + Key) + Value ) * Num Entries * N_Val

( 505 + 3 * (10 + 36) + 36 ) * 183915891 * 3 = 374636669967 = 348.9 GB

And if you used Erlang (or probably any ProtocolBuffers client):

( 14 + ( 13 + Bucket + Key ) + ( 356 + Bucket + Key + Value ) + (18 + ( 13 + Bucket + Key ) ) ) * Num Entries * N_Val = ( 414 + 3 * (Bucket + Key) + Value ) * Num Entries * N_Val

( 414 + 3 * (10 + 36) + 36 ) * 183915891 * 3 = 324427631724 = 302.1 GB
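
Put together as one hedged Erlang sketch (names are mine; the 356/447-byte
object overheads are the values measured above, not constants from the riak
source):

    %% Per-object on-disk estimate: data file record + serialized {Bucket,Key}
    %% + serialized r_object value + hintfile entry.
    %% ObjOverhead ~= 356 for Erlang/PB clients, ~= 447 via the REST API.
    disk_per_object(ObjOverhead, BucketLen, KeyLen, ValueLen) ->
        BK = 13 + BucketLen + KeyLen,            % term_to_binary({Bucket, Key})
        14 + BK                                  % data file header + key
          + ObjOverhead + BucketLen + KeyLen + ValueLen   % serialized r_object
          + 18 + BK.                             % hintfile entry

    disk_estimate(ObjOverhead, BucketLen, KeyLen, ValueLen, NumKeys, NVal) ->
        disk_per_object(ObjOverhead, BucketLen, KeyLen, ValueLen) * NumKeys * NVal.

    %% disk_estimate(447, 10, 36, 36, 183915891, 3) / math:pow(1024, 3) -> ~348.9
    %% disk_estimate(356, 10, 36, 36, 183915891, 3) / math:pow(1024, 3) -> ~302.1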

So the truth is somewhere in between. But as David wrote, there can be additional overhead due to the append-only nature of bitcask.

Cheers,
Nico

Am 24.05.2011 23:48, schrieb Anthony Molinaro:
Just curious if anyone has any ideas, for the moment, I'm just taking
the RAM calculation and multiplying by 2 and the Disk calculation and
multiplying by 8, based on my findings with my current cluster.
[snip]



Re: Issues with capacity planning pages on wiki

Jonathan Langevin
That was one hell of a response. You need to post that as a Wiki article or such, after all that work :-O


Jonathan Langevin
Systems Administrator

Loom Inc.
Wilmington, NC: (910) 241-0433 - [hidden email] - www.loomlearning.com - Skype: intel352



On Wed, May 25, 2011 at 12:22 PM, Nico Meyer <[hidden email]> wrote:
Hi Anthony,

I think I can explain at least a big chunk of the difference in RAM and disk consumption you see.

Let's start with RAM. I could of course be wrong here, but I believe the 'static bitcask per key overhead' is simply too small. Let me explain why.
The bitcask_keydir_entry struct for each entry looks like this:
[snip]




Re: Issues with capacity planning pages on wiki

Nico Meyer
Everybody feel free to steal from my mail to his/her heart's content :-).
At the very least it's now available in the mailing list archive for easy reference (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-May/004292.html).
I just hope my use of HTML formatting made it through in a readable state.

Am 25.05.2011 18:55, schrieb Jonathan Langevin:
That was one hell of a response. You need to post that as a Wiki article or such, after all that work :-O


Jonathan Langevin
Systems Administrator

Loom Inc.
Wilmington, NC: (910) 241-0433 - [hidden email] - www.loomlearning.com - Skype: intel352



On Wed, May 25, 2011 at 12:22 PM, Nico Meyer <[hidden email]> wrote:
Hi Anthony,

I think, I can explain at least a big chunk of the difference in RAM and disk consumption you see.

Let start with RAM. I could of course be wrong here, but I believe the 'static bitcask per key overhead' is just plainly too small. Let me explain why.
The bitcask_keydir_entry struct for each entry looks like this:
[snip]




Re: Issues with capacity planning pages on wiki

Mark Phillips
Nico - 

To echo Jonathan's sentiments from above, thanks for putting this together! This is an amazing write-up.

I went ahead and opened a new issue to make sure we get this information incorporated into the wiki in a timely manner. 


We'll take a stab at refining those pages with this and the other Capacity Planning/Calculator feedback when time permits. Pull requests are, of course, always encouraged. 

Mark 

On Wed, May 25, 2011 at 10:16 AM, Nico Meyer <[hidden email]> wrote:
Everybody feel free to steal from my mail to his/her heart's content :-).
At the very least its now available in the mailing list archive for easy reference (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-May/004292.html).
I just hope my use of HTML formatting made it through in a readable state.

Am 25.05.2011 18:55, schrieb Jonathan Langevin:
That was one hell of a response. You need to post that as a Wiki article or such, after all that work :-O
[snip]




Re: Issues with capacity planning pages on wiki

Anthony Molinaro
Hi Nico,

  Thanks for the awesome analysis.  The one part I was at first confused
about was this part:

On Wed, May 25, 2011 at 06:22:53PM +0200, Nico Meyer wrote:
> Your actual RAM consumption of 70 GB seems to be at odd with the
> output of erlang:memory/0 that you sent:
>
> {total,7281790968} =>   RAM: 7281790968 * 8 = 54.3 GB

Because at first I thought you were multiplying by the word size, which is 8 as
well, but then I realized you were multiplying by the number of nodes (also 8).

Anyway, things make a lot more sense now, and I'm thinking I may need
to fork bitcask and get rid of some of that extra overhead.  For instance,
13 bytes of overhead to store a tuple of binaries seems unnecessary; it's
probably better to just have a single binary with the bucket size as a
prefix, so something like
<<BucketSize:16,Bucket,Key>>

That way you turn 13 bytes of overhead into 2.
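
As a sketch, that encoding and its inverse might look like the following
(illustrative only, not what riak_kv_bitcask_backend does today):

    %% Proposed compact encoding: 2 bytes of framing instead of the 13 that
    %% term_to_binary({Bucket, Key}) adds.
    encode_bk(Bucket, Key) when is_binary(Bucket), is_binary(Key) ->
        <<(byte_size(Bucket)):16, Bucket/binary, Key/binary>>.

    decode_bk(<<BucketSize:16, Bucket:BucketSize/binary, Key/binary>>) ->
        {Bucket, Key}.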

Of course I'd need some way to work with old data, but a one time migration
shouldn't be too bad.

It also seems like there should be some way to trim down some of that on-disk
usage.  I mean, 300+ bytes to store 36 bytes is a lot.

Again, many thanks this is an awesome analysis.

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>


Re: Issues with capacity planning pages on wiki

Justin Sheehy
Hi, Anthony.

There are really three different things below:

1- reducing the minimum overhead of the {Bucket, Key} encoding when
riak is storing into bitcask

2- reducing the size of the vector clock encoding

3- reducing the size of the overall riak_object structure and metadata

All three of these are worth doing.  The reason they are the way they
are now is that the initial assumption for most Riak deployments was
of a high enough mean object size that these few bytes per object
would proportionally be small noise -- but that's just history and not
a reason to avoid improvements.

In fact, preliminary work has been done on all three of these.  It
just hasn't yet been such a high priority that it got pushed through
to the finish.  One tricky part with all three is backward
compatibility, as most production Riak clusters do not expect to need
a full stop every time we want to make an improvement like these.

Solving #1, by the way, isn't really in bitcask itself but rather in
riak_kv_bitcask_backend.  I can take a swing at that (with backward
compatibility) shortly.  I might also be able to help dig up some of
the old work on #2 that is nearly a year old, and I think Andy Gross
may have done some of what's needed for #3.

With less words: I agree, all this should be made smaller.

And don't let this stop you if you want to jump ahead and give some of it a try!

-Justin



On Wed, May 25, 2011 at 1:50 PM, Anthony Molinaro
<[hidden email]> wrote:

> Anyway, things make a lot more sense now, and I'm thinking I may need
> to fork bitcask and get rid of some of that extra overhead.  For instance
> 13 bytes of overhead to store a tuple of binaries seems unnecessary, it's
> probably better to just have a single binary with the bucket size as a
> prefix, so something like
>
> <<BucketSize:16,Bucket,Key>>
>
> That way you turn 13 bytes of overhead to 2.
>
> Of course I'd need some way to work with old data, but a one time migration
> shouldn't be too bad.
>
> It also seems like there should be some way to trim down some of that on
> disk usage.  I mean 300+ bytes to store 36 bytes is a lot.


Re: Issues with capacity planning pages on wiki

Anthony Molinaro

Hi Justin,

   Thanks for the reply.  Good to know you may have some partial
solutions for the sizing of these items.  Our use case may in the long
term require us to write our own backend just for space efficiency, but
I'm hoping we can make it quite far with bitcask.  I've got enough on
my plate at the moment, so I'm unlikely to get to this before you guys
do, but as I recently forked riak_kv for something else, you never
know.

Thanks

-Anthony

On Wed, May 25, 2011 at 08:10:29PM -0400, Justin Sheehy wrote:

> Hi, Anthony.
>
> There are really three different things below:
> [snip]

--
------------------------------------------------------------------------
Anthony Molinaro                           <[hidden email]>
