Number of replica for Luwak

Number of replica for Luwak

Aditya Patadia-2
How can I set the number of replicas for Luwak? I want to store
at least 4 replicas of the files I upload to Luwak. One more thing: how do I
set the 'r' and 'w' values for Luwak?

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Number of replica for Luwak

Ryan Zezeski
Hi Aditya,

There are no concepts of N, W, or R in Luwak.  This is because Luwak is built _on top_ of Riak.  To be more specific, Luwak takes a large file and chunks it into smaller pieces.  These smaller pieces are then hashed, and stored under a key equal to the hash value.  That is, Luwak uses something called a hash tree to store data.  This has the nice benefit that you don't take a storage hit for redundant data because Luwak files will share data they have in common.  However, it has the drawback that deleting something is non-trivial, and currently Luwak does not actually remove the data when you delete a file.
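
The chunk-and-hash scheme described above can be sketched roughly as follows. This is an illustration only, not Luwak's code: the block size, the choice of SHA-256, the flat key list (Luwak uses a tree) and the `ChunkStore` name are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


class ChunkStore:
    """In-memory stand-in for a bucket of immutable, hash-keyed blocks."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block bytes

    def put_file(self, data: bytes) -> list:
        """Split data into blocks, store each under its hash, return the keys."""
        keys = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            key = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(key, block)  # identical blocks stored once
            keys.append(key)
        return keys

    def get_file(self, keys) -> bytes:
        """Reassemble a file from its ordered list of block keys."""
        return b"".join(self.blocks[k] for k in keys)


store = ChunkStore()
file_a = store.put_file(b"AAAABBBBCCCC")
file_b = store.put_file(b"AAAABBBBDDDD")
```

The two files share their first two blocks, so six blocks of input occupy only four stored blocks. The same sharing is why a delete cannot simply remove every block a file points at.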

If you want to change the N/W/R props of Luwak (as a whole) then you can do so by setting the luwak_node and luwak_tld bucket props appropriately.

-Ryan

[Sent from my iPhone]



Re: Number of replica for Luwak

Ryan Zezeski
I realized what I just said is confusing.  I start out saying there is no N/W/R in Luwak and end by saying essentially the opposite.  Maybe I shouldn't reply on my iPhone while drinking a beer on my front stoop?  Such a nice night in Baltimore.

Luwak has no _direct_ notion of N/W/R.  You can't call a Luwak-specific URL or Luwak Erlang API and pass it those properties.  No.  However, underneath the surface, Luwak uses two Riak buckets to store all its data.  They are 'luwak_tld' and 'luwak_node'.  Without getting into much detail, the first stores the file metadata object (Top Level Document), and the latter stores the actual data chunks.  That said, if you want all objects stored in Luwak to have N=4, then you should first set the bucket props accordingly on the two Luwak buckets.
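
A minimal sketch of setting those props over HTTP, assuming the bucket-properties endpoint of the Riak HTTP interface (`PUT /riak/<bucket>` with a JSON `props` body) and a node listening on the default local port. The `r` and `w` values of 2 are hypothetical, and `bucket_props_request` is a helper name invented for this example. The code only builds the requests; sending them is left commented out.

```python
import json
import urllib.request

RIAK_URL = "http://127.0.0.1:8098"  # assumption: local node, default HTTP port


def bucket_props_request(bucket, props):
    """Build (but do not send) a PUT request that sets props on one bucket."""
    body = json.dumps({"props": props}).encode("utf-8")
    return urllib.request.Request(
        url=f"{RIAK_URL}/riak/{bucket}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Both Luwak buckets get the same settings; r and w here are example values.
requests_to_send = [
    bucket_props_request(bucket, {"n_val": 4, "r": 2, "w": 2})
    for bucket in ("luwak_tld", "luwak_node")
]

# To apply against a live cluster, uncomment:
# for req in requests_to_send:
#     urllib.request.urlopen(req)
```

Set the props before storing any files, so that every object in both buckets is written with the same N.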

Maybe it would be nice to have these properties available via the app.config to make it more obvious?  Then again, I've been drinking.

-Ryan




Re: Number of replica for Luwak

Les Mikesell
On 2/18/11 6:34 PM, Ryan Zezeski wrote:

> I realized what I just said is confusing.  I start out saying there is no N/W/R
> in Luwak and end by saying essentially the opposite.  Maybe I shouldn't reply on
> my iPhone while drinking a beer on my front stoop?  Such a nice night in Baltimore.
>
> Luwak has no _direct_ notion of N/W/R.  You can't call a Luwak-specific URL or
> Luwak Erlang API and pass it those properties.  No.  However, underneath the
> surface, Luwak uses two Riak buckets to store all its data.  They are
> 'luwak_tld' and 'luwak_node'.  Without getting into much detail, the first
> stores the file metadata object (Top Level Document), and the latter stores the
> actual data chunks.  That said, if you want all objects stored in Luwak to have
> N=4, then you should first set the bucket props accordingly on the two Luwak
> buckets.

What happens if there is a read of the object while it is in the process of
being updated if the update is several different operations?

--
   Les Mikesell
    [hidden email]




Re: Number of replica for Luwak

bryan-basho
Administrator
On Fri, Feb 18, 2011 at 8:54 PM, Les Mikesell <[hidden email]> wrote:
> What happens if there is a read of the object while it is in the process of
> being updated if the update is several different operations?

Luwak streams work in an "all or nothing" fashion.  That is, no read
will see the result of any stream until that stream is flushed.  Luwak
blocks are immutable, so old file trees will still reference
completely valid old blocks while new ones are being written.  The
last action of flushing a stream is to point the file-metadata object
(in the luwak_tld bucket) at the head of the new tree.

A flush will only occur when a stream closes, unless your program
explicitly calls luwak_put_stream:flush/1.

-Bryan


Re: Number of replica for Luwak

Les Mikesell
On 2/23/2011 9:11 AM, Bryan Fink wrote:

> On Fri, Feb 18, 2011 at 8:54 PM, Les Mikesell<[hidden email]>  wrote:
>> What happens if there is a read of the object while it is in the process of
>> being updated if the update is several different operations?
>
> Luwak streams work in an "all or nothing" fashion.  That is, no read
> will see the result of any stream until that stream is flushed.  Luwak
> blocks are immutable, so old file trees will still reference
> completely valid old blocks while new ones are being written.  The
> last action of flushing a stream is to point the file-metadata object
> (in the luwak_tld bucket) at the head of the new tree.
>
> A flush will only occur when a stream closes, unless your program
> explicitly calls luwak_put_stream:flush/1.


Thanks!  A couple more somewhat related questions: is that atomic update
nature hard to duplicate outside of luwak (say by a client that needs to
keep several items in sync), and if the luwak blocks are immutable, how
do you ever clean up the space used by data that has been deleted or
modified and no longer referenced?

--
   Les Mikesell
    [hidden email]


Re: Number of replica for Luwak

Ryan Zezeski


On Wed, Feb 23, 2011 at 10:40 AM, Les Mikesell <[hidden email]> wrote:


> Thanks!  A couple more somewhat related questions: is that atomic update nature hard to duplicate outside of luwak (say by a client that needs to keep several items in sync), and if the luwak blocks are immutable, how do you ever clean up the space used by data that has been deleted or modified and no longer referenced?


For your first question, I believe the onus to make multiple object updates atomic is on you, the application developer.  One of the easier ways to achieve this, perhaps, would be to wrap all the data in one object.

Second, you don't; not at this time at least.  Luwak allows you to delete the file reference, but not the data itself.  This follows from Luwak being an immutable, persistent data structure.  If two files share a block, then you can't simply delete the blocks under a file, but instead must perform something more like garbage collection.

If you're up for it, I have some proof of concept code on my fork of Luwak.  I got GC to work, to an extent.  IIRC, once I got past 10-15GB things started to degrade quickly.  When I get more time I plan to return to it. 


-Ryan


Re: Number of replica for Luwak

Les Mikesell
On 2/23/2011 10:39 AM, Ryan Zezeski wrote:

>
>     Thanks!  A couple more somewhat related questions: is that atomic
>     update nature hard to duplicate outside of luwak (say by a client
>     that needs to keep several items in sync), and if the luwak blocks
>     are immutable, how do you ever clean up the space used by data that
>     has been deleted or modified and no longer referenced?
>
>
> For your first question, I believe the onus to make multiple object
> updates atomic is on you, the application developer.  One of the,
> perhaps easier, ways to achieve this would be to wrap all the data in
> one object?

And luwak accomplishes this by putting the list of keys comprising the
whole stream in one object that is updated last, so a reader will get
one or the other?

> Second, you don't; not at this time at least.  Luwak allows you to
> delete the file reference, but not the data itself.  It's the very
> nature of the fact that it's an immutable, persistent data structure
> that makes this so.  If two files share a block, then you can't simply
> delete the blocks under a file, but instead must perform something more
> like garbage collection.

If I understand what is going on correctly, you'd have to maintain a
reference count atomically with the keys since files with duplicate
sections would reuse some data blocks.  Hmmm, that makes it sound like a
really good place to throw backups for a dedup effect...

> If you're up for it, I have some proof of concept code on my fork of
> Luwak.  I got GC to work, to an extent.  IIRC, once I got past 10-15GB
> things started to degrade quickly.  When I get more time I plan to
> return to it.

What is the trick to knowing if a block is currently referenced or not?
Would it be possible to have some sort of bucket versioning and
periodically copy currently-referenced blocks forward, flip the bucket
reference and drop everything in the old bucket?  I guess you'd still
have to deal with possible re-use during the copy.

--
   Les Mikesell
    [hidden email]



Re: Number of replica for Luwak

bryan-basho
Administrator
In reply to this post by Les Mikesell
On Wed, Feb 23, 2011 at 10:40 AM, Les Mikesell <[hidden email]> wrote:

> Thanks!  A couple more somewhat related questions: is that atomic update
> nature hard to duplicate outside of luwak (say by a client that needs to
> keep several items in sync), and if the luwak blocks are immutable, how do
> you ever clean up the space used by data that has been deleted or modified
> and no longer referenced?

(Ryan Zezeski sent correct answers before I could finish this, but I'm
sending anyway, with hopefully extra information.)

Well, these two behaviors are partially related.

It's easy to duplicate this behavior: write the new versions of your
items, without removing the old versions, then when you're finished,
replace the object that says which version of those items is the
latest.  It's akin to the old filesystem trick of writing out a new
file, then using 'rename' to move it in place of the old one.  (In
reference to your followup email, yes, Luwak accomplishes this by
effectively (tree-wise) putting all the keys in one object, which is
updated last.)
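
The write-then-repoint pattern just described might look like the sketch below. It uses an in-memory dict in place of a real key/value store and assumes a single writer; `VersionedStore`, its key layout and the sample data are hypothetical names chosen for the example, and conflict (sibling) handling is left out.

```python
class VersionedStore:
    """In-memory stand-in for a key/value store holding versioned items."""

    def __init__(self):
        self.kv = {}

    def publish(self, group, items):
        """Write a new version of every item, then switch the pointer last."""
        new_version = self.kv.get((group, "latest"), 0) + 1
        for name, value in items.items():
            # New keys only; the previous version's objects stay untouched.
            self.kv[(group, new_version, name)] = value
        # One single-object update moves readers from the old set to the new.
        self.kv[(group, "latest")] = new_version
        return new_version

    def read(self, group, names):
        """Read all items belonging to whichever version is current."""
        version = self.kv[(group, "latest")]
        return {n: self.kv[(group, version, n)] for n in names}


store = VersionedStore()
store.publish("profile", {"name": "alice", "email": "a@example.com"})

# Simulate a half-finished update: a new object exists, pointer not yet moved.
store.kv[("profile", 2, "name")] = "bob"
before = store.read("profile", ["name", "email"])

store.publish("profile", {"name": "bob", "email": "b@example.com"})
after = store.read("profile", ["name", "email"])
```

A reader in the middle of the update still sees the complete old version, because nothing it can reach has changed until the pointer object is replaced. The old version's objects are never removed here, which is the cleanup problem discussed next.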

But, you've hit on one of Luwak's major specializations: it was
originally designed for immutable data, and so it does nothing about
cleaning up unreferenced blocks.  At this point, it's a distributed
online garbage collection problem that we haven't written a solution
for yet.  If you can pause all updates to Luwak, and be sure that the
data is stable (i.e. no conflicts hidden by unreachable nodes), it's
relatively simple to mark&sweep the luwak_node bucket, based on
pointers from the luwak_tld bucket.  There's even some history (look
at the "ancestors" property of the file object) that might help out.
But, (as both you and Ryan figured out) doing this live involves much
more bookkeeping to keep track of not only blocks shared between
files, but also blocks that are not linked solely because a stream
hasn't finished flushing yet.
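
For the paused, stable case, a mark-and-sweep pass could be sketched like this. Plain dicts stand in for the two buckets, and the node layout (each node key mapping to a list of child keys) is a simplifying assumption, not Luwak's actual object format. It does not handle the live case or the "ancestors" history.

```python
def mark_and_sweep(tld, nodes):
    """Offline garbage-collection sketch; assumes all writes are paused.

    tld:   file name -> root node key           (stand-in for luwak_tld)
    nodes: node key  -> list of child node keys (stand-in for luwak_node;
           leaf blocks map to an empty list)
    Deletes unreachable entries from nodes and returns their keys.
    """
    reachable = set()
    stack = list(tld.values())
    while stack:  # mark phase: walk every tree from its root
        key = stack.pop()
        if key in reachable or key not in nodes:
            continue
        reachable.add(key)
        stack.extend(nodes[key])

    garbage = set(nodes) - reachable  # sweep phase
    for key in garbage:
        del nodes[key]
    return garbage


tld = {"a.txt": "rootA"}
nodes = {
    "rootA": ["b1", "b2"],
    "b1": [],
    "b2": [],
    "rootOld": ["b1", "b3"],  # tree of a deleted file, shares b1 with a.txt
    "b3": [],
}
removed = mark_and_sweep(tld, nodes)
```

Here `rootOld` and `b3` are swept, while `b1` survives because the live file still references it.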

/me takes down a note to review Ryan's GC experiments

-Bryan
