Question regarding backends

Question regarding backends

Lev Walkin

I have a few questions regarding riak backends:

1. Besides fs, ets, dets and osmos backends, do you know of any other  
backends whether in private or public use?

2. The arch doc (http://riak.basho.com/arch.html) says that the  
backend needs to respond to "list keys" request, among others:
        Each node may be configured with a different module for managing  
local storage. This module only needs to define "get", "put",  
"delete", and "list keys" functions that operate on binary blobs. The  
backend can consider these binaries completely opaque data, or examine  
them to make decisions about how best to store them.

My question is whether this is efficient when a database holds, say,
several billion objects. At that scale it becomes infeasible to let the
"list keys" operation execute. Under which circumstances is this
function invoked?


--
vlm


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Question regarding backends

Eric Cestari
Hi,

> 1. Besides fs, ets, dets and osmos backends, do you know of any other backends whether in private or public use?


The Basho guys have written Innostore, which uses InnoDB as a Riak backend:
http://hg.basho.com/innostore/

And I wrote a Riak Redis backend:
http://github.com/cstar/riak_redis_backend


Re: Question regarding backends

Ben Browning
In reply to this post by Lev Walkin
Hi,

On Tue, Feb 16, 2010 at 5:56 AM, Lev Walkin <[hidden email]> wrote:
> My question is whether this is efficient when a database holds, say,
> several billion objects. At that scale it becomes infeasible to let the
> "list keys" operation execute. Under which circumstances is this
> function invoked?


The list keys operation is invoked by the client and definitely won't
scale well to billions of keys. The stream_list_keys function will
scale better because it doesn't have to load the entire list of keys
into memory at once. However, looking through the code it looks like
it still loads all of a vnode's keys at once.

Documentation for both is here: http://riak.basho.com/edoc/riak_client.html
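
To make the difference concrete, here is a minimal sketch in Python rather
than Erlang. It is not the riak_client API: the VNODES table and both
function bodies are hypothetical stand-ins, and only the names list_keys and
stream_list_keys come from the discussion above. It shows why streaming
lowers peak memory for the caller while each batch still holds all of one
vnode's keys.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for a cluster: vnode id -> keys stored on that vnode.
VNODES: Dict[int, List[str]] = {
    0: ["a", "b", "c"],
    1: ["d", "e"],
    2: ["f"],
}


def list_keys(vnodes: Dict[int, List[str]]) -> List[str]:
    """Collect every key from every vnode into one list before returning."""
    keys: List[str] = []
    for vnode_keys in vnodes.values():
        keys.extend(vnode_keys)
    return keys


def stream_list_keys(vnodes: Dict[int, List[str]]) -> Iterator[List[str]]:
    """Yield one batch per vnode.

    The caller never holds the full key list, but each batch still
    contains all of a single vnode's keys at once.
    """
    for vnode_keys in vnodes.values():
        yield list(vnode_keys)


if __name__ == "__main__":
    print(len(list_keys(VNODES)), "keys materialized at once")
    for batch in stream_list_keys(VNODES):
        print("batch of", len(batch), "keys")
```

With billions of keys, the first function needs memory proportional to the
whole keyspace, while the second needs memory proportional to the largest
vnode.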

Storing lists of keys as new objects seems to work pretty well if you
need to maintain lists of keys.
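
One way this could look, sketched in Python against a hypothetical in-memory
store (TinyStore, put_with_index, known_keys and the "indexes" bucket name
are all invented for illustration and are not part of Riak): every write
also updates a small index object that holds the list of keys, so the
application reads that one object instead of asking the store to list keys.

```python
import json
from typing import Dict, List, Optional, Tuple


class TinyStore:
    """Hypothetical in-memory key/value store holding opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], bytes] = {}

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        return self._data.get((bucket, key))

    def put(self, bucket: str, key: str, value: bytes) -> None:
        self._data[(bucket, key)] = value


def known_keys(store: TinyStore, bucket: str,
               index_bucket: str = "indexes") -> List[str]:
    """Read the key list for `bucket` from its index object."""
    raw = store.get(index_bucket, bucket)
    return json.loads(raw) if raw is not None else []


def put_with_index(store: TinyStore, bucket: str, key: str, value: bytes,
                   index_bucket: str = "indexes") -> None:
    """Store the value, then record its key in the index object."""
    store.put(bucket, key, value)
    keys = known_keys(store, bucket, index_bucket)
    if key not in keys:
        keys.append(key)
        store.put(index_bucket, bucket, json.dumps(keys).encode("utf-8"))


if __name__ == "__main__":
    store = TinyStore()
    put_with_index(store, "users", "alice", b"...")
    put_with_index(store, "users", "bob", b"...")
    print(known_keys(store, "users"))
```

The index update here is a plain read-modify-write, so concurrent writers
would need some conflict handling that this sketch leaves out.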


Ben


Re: Question regarding backends

Lucas Di Pentima

On 16/02/2010, at 12:43, Ben Browning wrote:

> The list keys operation is invoked by the client and definitely won't
> scale well to billions of keys. The stream_list_keys function will
> scale better because it doesn't have to load the entire list of keys
> into memory at once. However, looking through the code it looks like
> it still loads all of a vnode's keys at once.
>
> Documentation for both is here: http://riak.basho.com/edoc/riak_client.html
>
> Storing lists of keys as new objects seems to work pretty well if you
> need to maintain lists of keys.


Regarding this issue: if the "Riak way" of saving the key list is to keep it in an object, can I set a replication level for a specific object? I would certainly want a higher replication level for this one than for "normal" objects.

Best regards
--
Lucas Di Pentima - Santa Fe, Argentina
Jabber: [hidden email]
MSN: [hidden email]






Re: Question regarding backends

Ben Browning
On Tue, Feb 16, 2010 at 12:50 PM, Lucas Di Pentima
<[hidden email]> wrote:
>
> [...] can I set a replication level for a specific object? I would certainly want a higher replication level for this one than for "normal" objects.

I'm pretty sure you can only control the replication level per bucket.
But there's nothing preventing you from putting key lists in a
different bucket from the keys they reference.
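
As a rough illustration in Python (a toy model, not Riak's API): replication
is looked up by bucket name, so giving key lists their own bucket lets them
carry a higher replica count than the data they index. The property name
n_val follows Riak's bucket property naming; the bucket names, the default
of 3, and the helper function are assumptions made for this sketch.

```python
from typing import Dict

# Assumed default replica count for buckets with no explicit setting.
DEFAULT_N_VAL = 3

# Hypothetical per-bucket properties: key lists live in their own bucket
# so they can be replicated more widely than the objects they reference.
bucket_props: Dict[str, Dict[str, int]] = {
    "users": {"n_val": 3},
    "user_key_lists": {"n_val": 5},
}


def n_val_for(bucket: str) -> int:
    """Return the replica count for a bucket, falling back to the default."""
    return bucket_props.get(bucket, {}).get("n_val", DEFAULT_N_VAL)


if __name__ == "__main__":
    for name in ("users", "user_key_lists", "unconfigured"):
        print(name, "->", n_val_for(name), "replicas")
```

Because the lookup takes only the bucket name, no individual object can
override it, which matches the per-bucket restriction described above.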


Re: Question regarding backends

bryan-basho
Administrator
In reply to this post by Lev Walkin
Hi, Lev.  Ben has already given lots of good answers on this topic,
but there was just a little bit of clarification I wanted to add.

On Tue, Feb 16, 2010 at 5:56 AM, Lev Walkin <[hidden email]> wrote:

> 2. The arch doc (http://riak.basho.com/arch.html) says that the backend
> needs to respond to "list keys" request, among others:
>        Each node may be configured with a different module for managing
> local storage. This module only needs to define "get", "put", "delete", and
> "list keys" functions that operate on binary blobs. The backend can consider
> these binaries completely opaque data, or examine them to make decisions
> about how best to store them.
>
> My question is whether this is efficient when a database holds, say,
> several billion objects. At that scale it becomes infeasible to let the
> "list keys" operation execute. Under which circumstances is this
> function invoked?

The "list keys" operation mentioned in that arch doc is actually not
related to the "list keys" that the client can request.  The "list
keys" that is relevant here is the one used to build a Merkle tree for
hinted handoff.

Indeed, you're right: for a partition with a very large number of keys
stored in it, such an operation could be extremely costly.  This is
why version 0.8 of Riak did away with building that Merkle tree for
handoff.

That portion of that document is now incorrect, and instead of "list
keys", all backends are now required to implement "fold".  Switching
handoff to fold solves the problem you brought up by allowing an
incremental progression across the backend's data, instead of building
a giant structure all at once.
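
The idea can be sketched in Python (the real backends are Erlang modules,
and the class, method signatures, and handoff helper below are hypothetical,
not Riak's actual backend interface): fold visits one stored item at a time
and threads an accumulator through, so handoff can forward each item as it
is visited instead of first building a complete key list or tree.

```python
from typing import Callable, Dict, Optional, TypeVar

Acc = TypeVar("Acc")


class DictBackend:
    """Hypothetical storage backend over opaque binary blobs."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)

    def fold(self, fun: Callable[[bytes, bytes, Acc], Acc], acc: Acc) -> Acc:
        """Call fun(key, value, acc) for each item, one at a time."""
        for key, value in self._data.items():
            acc = fun(key, value, acc)
        return acc


def handoff(source: DictBackend, target: DictBackend) -> int:
    """Copy every item from source to target; return the number sent."""

    def send(key: bytes, value: bytes, sent: int) -> int:
        target.put(key, value)  # forward the item as soon as it is visited
        return sent + 1

    return source.fold(send, 0)


if __name__ == "__main__":
    src, dst = DictBackend(), DictBackend()
    src.put(b"k1", b"v1")
    src.put(b"k2", b"v2")
    print("items handed off:", handoff(src, dst))
```

The only state carried between items is the accumulator (here a counter), so
memory use does not grow with the number of keys in the partition.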

There is a separate "list keys" function for backends to implement
(actually called "list bucket"), which enables the client "list keys"
request.  However, if an application never uses the client "list keys"
request, then the backend chosen for that Riak cluster is not required
to implement it (in contrast to "fold", which is required for a Riak
cluster to work, in order to do hinted handoff).

Hope that helps,
Bryan


Re: Question regarding backends

Ben Browning
Thanks for clearing up the multiple meanings of "list keys".
