Riak Duplicates, storage_backend

idmartin
Is there any way to analyze Riak objects to make sure there aren't duplicates or near duplicates?

also,

I want to switch to riak_kv_eleveldb_backend for secondary indexes, and was wondering whether there are any disadvantages to this backend.

Thx

Re: Riak Duplicates, storage_backend

Samuel Elliott
On Fri, Mar 30, 2012 at 6:17 PM, idmartin <[hidden email]> wrote:
> Is there any way to analyze Riak objects to make sure there aren't duplicates
> or near duplicates?

A MapReduce (M/R) job? An identity map, followed by a reduce that does
the comparison? It won't be super-efficient, but a comparison like that
never is, and you probably want to define what "near duplicates" means
first.
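
To make the shape of that job concrete, here is a minimal local sketch in Python. It does not call Riak at all: the map and reduce phases are plain functions over an in-memory dict, and every name in it (identity_map, reduce_duplicates, shingles, the 0.5 threshold) is an assumption made for illustration. "Near duplicate" is assumed here to mean a Jaccard similarity over three-word shingles at or above the threshold; any other definition could be swapped in.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def identity_map(objects):
    """Map phase: emit every (key, value) pair unchanged."""
    for key, value in objects.items():
        yield key, value


def shingles(text, size=3):
    """Return the set of overlapping word n-grams in text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reduce_duplicates(pairs, threshold=0.5):
    """Reduce phase: find exact duplicates by hash, near duplicates pairwise."""
    pairs = list(pairs)
    values = dict(pairs)

    # Exact duplicates: group keys whose values share a SHA-256 digest.
    by_digest = defaultdict(list)
    for key, value in pairs:
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        by_digest[digest].append(key)
    exact = sorted(sorted(keys) for keys in by_digest.values() if len(keys) > 1)

    # Near duplicates: O(n^2) pairwise comparison, which is why this is slow.
    sets = {key: shingles(value) for key, value in pairs}
    near = []
    for k1, k2 in combinations(sorted(sets), 2):
        if values[k1] == values[k2]:
            continue  # already reported as an exact duplicate
        if jaccard(sets[k1], sets[k2]) >= threshold:
            near.append((k1, k2))
    return exact, near


# Hypothetical sample data standing in for the objects in a bucket.
objects = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox jumps over the lazy dog",
    "c": "the quick brown fox jumps over the lazy cat",
    "d": "completely unrelated text about storage backends",
}
exact, near = reduce_duplicates(identity_map(objects))
print(exact)
print(near)
```

The pairwise loop is the expensive part: it grows quadratically with the number of objects, which matches the "won't be super-efficient" caveat. Hashing first at least makes the exact-duplicate case cheap.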

Sam



--
Samuel Elliott
[hidden email]
http://lenary.co.uk/
+44 (0)7891 993 664

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Riak Duplicates, storage_backend

Kresten Krab Thorup
In reply to this post by idmartin
The LevelDB backend is not as fast as the default (Bitcask), but LevelDB does not keep all keys in memory, so for a very large or unbounded set of keys LevelDB is superior. LevelDB also stores values sorted by key, which lets Riak speed up certain operations, such as listing the keys in a bucket.
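
A toy Python model may help show why sorted storage speeds up listing the keys in a bucket. It is only a sketch under stated assumptions: keys are modelled as (bucket, key) tuples, which is not the real on-disk encoding used by Riak or LevelDB, and both function names are hypothetical. With an unordered keyspace every key has to be inspected; with a sorted one, a binary search finds the first key of the bucket and the scan stops as soon as the bucket changes.

```python
import bisect


def list_keys_unsorted(store, bucket):
    """Unordered keyspace: every (bucket, key) pair has to be inspected."""
    return sorted(key for (b, key) in store if b == bucket)


def list_keys_sorted(sorted_keys, bucket):
    """Sorted keyspace: binary-search to the bucket, then scan until it ends."""
    # (bucket,) sorts immediately before every (bucket, key) tuple.
    start = bisect.bisect_left(sorted_keys, (bucket,))
    keys = []
    for i in range(start, len(sorted_keys)):
        b, key = sorted_keys[i]
        if b != bucket:
            break  # past the last key of this bucket, so stop early
        keys.append(key)
    return keys


# Hypothetical data: four objects spread over three buckets.
store = {
    ("users", "bob"): 1,
    ("orders", "17"): 2,
    ("users", "alice"): 3,
    ("zeta", "z"): 4,
}
sorted_keys = sorted(store)
print(list_keys_unsorted(store, "users"))
print(list_keys_sorted(sorted_keys, "users"))
```

Both functions return the same keys; the difference is how much of the keyspace each one has to touch to get there.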

Kresten



Re: Riak Duplicates, storage_backend

idmartin
Thanks, guys. Much appreciated.