Getting all the Keys

Getting all the Keys

Thomas Burdick
I've been playing around with riak lately as really my first usage of a distributed key/value store. I quite like many of the concepts and possibilities of Riak and what it may deliver, however I'm really stuck on an issue.

Doing the equivalent of a "select * from sometable" in Riak is seemingly slow. As a quick test I tried...

http://localhost:8098/riak/mytable?keys=true

Before even iterating over the keys, this was already unbearably slow. It took almost half a second on my machine, and mytable is completely empty!

I'm a little baffled; I would assume that getting all the keys of a table is an incredibly common task. How do I get all the keys of a table quickly? By quickly I mean a few milliseconds or less, as I would expect of even a "slow" RDBMS with an empty table; even tables with thousands of rows can return all their primary keys in a few milliseconds.

Tom Burdick


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Getting all the Keys

Alexander Sicular
Hi Thomas,

This is a topic that has come up many times. Lemme just hit a couple of high notes in no particular order:

- If you must do a list-keys op on a bucket, you must must must use "?keys=stream". "?keys=true" will block on the coordinating node until all nodes return their keys; "?keys=stream" will start sending keys as soon as the first node returns.

- "list keys" is one of the most expensive native operations you can perform in Riak. Not only does it do a full scan of all the keys in your bucket, but of all the keys in your cluster. It is obnoxiously expensive, and only more so as the number of keys in your cluster grows. There have been discussions about changing this, but everything comes with a cost (more open file descriptors) and I do not believe a decision has been made yet.

-Riak is in no way a relational system. It is, in fact, about as opposite as you can get. Incidentally, "select *" is generally not recommended in the Kingdom of Relations either, and regarded as wasteful. You need a bit of a mind shift from the relational world to have success with NoSQL in general and Riak in particular.

-There are no native indices in Riak. By default Riak uses the Bitcask backend. Bitcask has many advantages, but one disadvantage is that all keys (key length + a bit of overhead) must fit in RAM.

-Do not use "?keys=true". Your computer will melt. And then your face.

-As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. That allows you to avoid scanning every key in your cluster.

-Buckets are analogous to relational tables but only just. In Riak, you can think of a bucket as a namespace holder (it is used as part of the default circular hash function) but primarily as a mechanism to differentiate system settings from one group of keys to the next.

-There is no penalty for unlimited buckets, except when their settings deviate from the system defaults. By settings I mean things like hooks, replication values, and backends, among others.

-One should list keys with "?keys=true" only if one enjoys sitting in a parking lot on the freeway on a scorching summer's day, or perhaps waiting in a TSA line at your nearest international point of embarkation, surrounded by octomom families, all the while juggling between the grope or the pr0n slideshow. If that is for you, use "?keys=true".

-Virtually everything in Riak is transient. Meaning, for the most part (not including the 60 seconds or so of m/r cache), there is no caching going on in Riak outside of the operating system; i.e., your subsequent queries will do more or less the same work as their predecessors. You need to cache your own results if you want to reuse them... quickly.
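That last point is worth making concrete. Here's a minimal sketch of caching a key listing client-side, since Riak itself won't do it for you. The 60-second TTL, the class name, and the fetcher block are all arbitrary choices for the illustration, not anything Riak ships with.

```ruby
# Minimal client-side cache for an expensive key listing.
class KeyListCache
  def initialize(ttl = 60, &fetcher)
    @ttl = ttl
    @fetcher = fetcher      # the expensive list-keys call, wrapped in a block
    @keys = nil
    @fetched_at = nil
  end

  def keys
    if @keys.nil? || Time.now - @fetched_at > @ttl
      @keys = @fetcher.call # only hit the cluster when the cache is stale
      @fetched_at = Time.now
    end
    @keys
  end
end

calls = 0
cache = KeyListCache.new(60) { calls += 1; %w[k1 k2 k3] }
first  = cache.keys  # fetches
second = cache.keys  # served from memory within the TTL
```

Any subsequent call inside the TTL costs nothing; the trade-off, of course, is that the cached list can be stale.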



Oh, there's more but I'm pretty jelloed from last night. Welcome to the fold, Thomas. Can I call you Tom?

Cheers,
-Alexander Sicular

@siculars

On Jan 22, 2011, at 10:19 AM, Thomas Burdick wrote:


Re: Getting all the Keys

Jeremiah Peschka
I was going to respond, but I think Alex answered it well with much more humor than I can muster on a good day.

All I can add is:
  • Make sure you're on Riak 0.14.
  • Take a look at the key-filter documentation and see how you can clean up your queries.
  • When you're structuring data, think in terms of the queries you'll be running. Data duplication is fine.
Riak's design isn't geared towards any kind of complex relational algebra or range scans; we want to pull single keys, or a few keys. MapReduce is more of a batch-processing operation.

Also, Alex you're very coherent for a man who was "jelloed" last night. Bravo.

Jeremiah Peschka
Microsoft SQL Server MVP
MCITP: Database Developer, DBA


On Sat, Jan 22, 2011 at 11:31 AM, Alexander Sicular <[hidden email]> wrote:

Re: Getting all the Keys

Eric Moritz
In reply to this post by Thomas Burdick

I always see problems with getting lists of data. I hardly ever see solutions.

What are some of the solutions people have come up with for listing data?

One solution is Riak Search for sorted lists and filtering.

Another solution is a mega-doc of presorted keys that can be sliced in a map phase. 

Has anyone done something as crazy as a bucket of B-tree nodes with links, for doing inexact range searches?

Eric.


Re: Getting all the Keys

Thomas Burdick
In reply to this post by Alexander Sicular
I guess I'm left even more baffled now. If the keys are all in memory and I only have one real node in my cluster, why would it take half a second to obtain all the keys from a completely empty database? And if it takes half a second just to list the keys, how could a map/reduce ever take less time? Doesn't map/reduce need to go through all the keys? Does streaming the keys really make going through all of them faster, or does it just let you work with them incrementally?
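On the streaming question, here's a toy model of what "?keys=stream" changes. This is not Riak's actual code; the node names and keys are invented for the sketch. The point it illustrates: the total work is the same either way, but streaming lets the caller start processing (or give up early) as soon as the first node answers.

```ruby
# Simulated per-node key listings.
NODE_KEYS = {
  'node1' => %w[a b],
  'node2' => %w[c],
  'node3' => %w[d e f]
}

# keys=true style: nothing reaches the caller until every node has replied.
def list_keys_blocking(nodes)
  nodes.values.flatten
end

# keys=stream style: yield each node's chunk as soon as it arrives.
def list_keys_streaming(nodes)
  return enum_for(:list_keys_streaming, nodes) unless block_given?
  nodes.each_value { |chunk| yield chunk }
end

all_at_once = list_keys_blocking(NODE_KEYS)
first_chunk = list_keys_streaming(NODE_KEYS).first  # available immediately
```

So streaming mostly answers the "incrementally" half of the question: it improves latency to the first key, not the cost of visiting them all.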

There's no obvious way to use meaningful key names in this case; the keys are just random unique identifiers. In PostgreSQL I'd be using the SERIAL type, which clearly would never work in the case of Riak.

So in the case of Riak I've been using UUIDs thus far. In order to get any sort of meaningful speed, I serialize my own Erlang list of binary UUIDs into a single object. That really isn't that fast either; it just happens to be faster than list_keys at the moment.

So really, what's the solution for having a list of, say, 50k keys that can quickly be appended to without taking seconds to retrieve later on? Or is this just not a valid use case for Riak at all? That would suck, because again, I really like the notion of an AP-oriented database!

Tom Burdick



Re: Getting all the Keys

Jeremiah Peschka
If you're looking for a fast, in-memory store with support for ordered lists, you should probably give Redis a look. It's an in-memory key-value store with lists as a native data type: http://redis.io/commands#list You could do the same thing in Riak, but you'd be storing your list as the value: retrieving it by key, deserializing it, adding the new item to the list, and then persisting it back to the database. I say this assuming that you want a true list and not just some messy unordered list of values. Needless to say, that approach is not optimal.
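The read-modify-write cycle just described looks roughly like this. The in-memory `store` hash stands in for the Riak client, and the bucket/key names are made up; a real version would also need to deal with concurrent writers (siblings).

```ruby
require 'json'

# Fake "database": one key holding a JSON-serialized list.
store = { 'uuid-index' => JSON.generate([]) }

def append_to_list(store, key, item)
  list = JSON.parse(store[key])     # 1. fetch and deserialize the whole value
  list << item                      # 2. append in memory
  store[key] = JSON.generate(list)  # 3. persist the entire list back
  list
end

append_to_list(store, 'uuid-index', 'uuid-1')
append_to_list(store, 'uuid-index', 'uuid-2')
```

Every append re-reads and re-writes the full list, which is why this gets painful as the list grows.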

The key-filtering approach that Alex mentioned is an in-memory filter. You don't necessarily have to provide a reduce phase. For example, if you have a bucket that contains stock information and the key is something like 'YYYY-MM-DD-ticker', you could use a key filter to get all the keys for 2010, or combine multiple key filters and get all of the keys for 2010 and MSFT (there's no need, the stock price has been flat for 11 years).
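For that key layout, the MapReduce job body would look something like the sketch below. The bucket name and the built-in map function are placeholders, and I'm writing the filter spec from memory; verify it against the key-filter docs before relying on it.

```ruby
require 'json'

# Keys shaped like "YYYY-MM-DD-ticker"; filter to 2010 MSFT entries so only
# matching keys ever reach the map phase.
job = {
  'inputs' => {
    'bucket'      => 'stocks',
    'key_filters' => [
      ['and',
       [['tokenize', '-', 1], ['eq', '2010']],  # year segment of the key
       [['tokenize', '-', 4], ['eq', 'MSFT']]]  # ticker segment of the key
    ]
  },
  'query' => [
    { 'map' => { 'language' => 'javascript', 'name' => 'Riak.mapValuesJson' } }
  ]
}

payload = JSON.generate(job)  # POST this to /mapred as application/json
```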

I did some quick analysis. I'm not sure why it happens (apart from the time needed to fill a buffer), but here are the results I saw when listing keys without streaming, with streaming, and on an empty bucket.

                  user     system      total        real
list_keys     0.020000   0.000000   0.020000 (  3.254437)
stream_keys   0.030000   0.010000   0.040000 (  0.561119)
empty bucket  0.000000   0.000000   0.000000 (  0.664574)


My test may be completely flawed. Just so you can check it out, here's the Ruby code I used.

require 'benchmark'
require 'riak'

c = Riak::Client.new(:port => 8091, :http_backend => :Excon)
b = c.bucket('stocks')                  # bucket with data in it
fake_bucket = c.bucket('asdfasdfasdf')  # empty bucket

Benchmark.bm(7) do |x|
  # Fetch the entire key list in one response.
  x.report('list_keys') {
    keys = b.keys
  }

  # Stream the keys, receiving each chunk as it arrives.
  x.report('stream_keys') {
    b.keys do |list|
      keys = list
    end
  }

  # List keys on a bucket with nothing in it.
  x.report('empty bucket') {
    keys = fake_bucket.keys
  }
end

Jeremiah Peschka
Microsoft SQL Server MVP
MCITP: Database Developer, DBA


On Sat, Jan 22, 2011 at 12:23 PM, Thomas Burdick <[hidden email]> wrote:

Re: Getting all the Keys

Alexander Staubo
In reply to this post by Thomas Burdick
On Sat, Jan 22, 2011 at 18:23, Thomas Burdick <[hidden email]> wrote:
> So really whats the solution to just having a list of like 50k keys that can
> quickly be appended to without taking seconds to then retrieve later on. Or
> is this just not a valid use case for riak at all? That would suck cause
> again, I really like the notion of an AP oriented database!

I have been struggling with the same issue. You may want to look at
Cassandra, which handles sequential key-range traversal very well.
Riak also has a problem with buckets sharing the same data storage
(buckets are essentially just a way to namespace keys), so if you have
two buckets and fill up one of them, then enumerating the keys of the
empty bucket will take a long time even though it is empty.
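A toy model of that shared-storage behavior, with invented sizes: if the backend keeps one keyspace of [bucket, key] pairs, listing any bucket's keys means walking every key in the store and filtering.

```ruby
# One shared keyspace for all buckets: 10,000 keys in a busy bucket.
all_keys = (1..10_000).map { |i| ['busy_bucket', "key-#{i}"] }

scanned = 0
quiet_keys = all_keys.select do |bucket, _key|
  scanned += 1               # every key in the cluster gets touched...
  bucket == 'quiet_bucket'   # ...just to discover that none match
end
```

The "empty" bucket returns no keys, yet the scan cost is proportional to the whole cluster, not to the bucket.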


Re: Getting all the Keys

Alexander Staubo
On Sat, Jan 22, 2011 at 19:34, Alexander Staubo <[hidden email]> wrote:

> On Sat, Jan 22, 2011 at 18:23, Thomas Burdick
> <[hidden email]> wrote:
>> So really whats the solution to just having a list of like 50k keys that can
>> quickly be appended to without taking seconds to then retrieve later on. Or
>> is this just not a valid use case for riak at all? That would suck cause
>> again, I really like the notion of an AP oriented database!
>
> I have been struggling with the same issue. You may want to look at
> Cassandra, which handles sequential key range traversal very well.
> Riak also has a problem with buckets sharing the same data storage
> (buckets are essentially just a way to namespace keys), so if you have
> two buckets and fill up one of them, then enumerating the keys of the
> empty bucket will take a long time even though it

I accidentally hit "Send". Again: I have been struggling with the same
issue. You may want to look at Cassandra, which handles sequential key
range traversal very well. Riak also has a problem with buckets
sharing the same data storage (buckets are essentially just a way to
namespace keys), so if you have two buckets and fill up one of them,
then enumerating the keys of the empty bucket will take a long time
even though it's empty. Cassandra does not have a problem with this,
since Cassandra's keyspaces are separate data structures. I like Riak,
but it only works well with single-key/linked traversal, not this kind
of bucket-wide processing.
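A toy model of the problem (illustrative Python, not Riak's actual storage code): when every bucket shares one keyspace, stored keys are effectively (bucket, key) pairs, so listing the keys of any one bucket has to walk all of them.

```python
# Minimal model of a shared keyspace: listing ANY bucket scans ALL keys,
# even when the bucket being listed is empty. Illustrative only.
store = {}

def put(bucket, key, value):
    store[(bucket, key)] = value

def list_keys(bucket):
    # O(total keys in the store), regardless of how many belong to `bucket`
    return [k for (b, k) in store if b == bucket]

for i in range(100_000):
    put("full_bucket", f"key{i}", b"...")

# Walks all 100,000 entries just to discover that "empty_bucket" has none.
print(list_keys("empty_bucket"))  # -> []
```

A separate keyspace per bucket (as in Cassandra) avoids this by construction, at the cost of more open files and per-bucket bookkeeping.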


Re: Getting all the Keys

Gary William Flake
This is a really big pain point for me as well and -- at the risk of prematurely being overly critical of Riak's overall design -- I think it points to a major flaw of Riak in its current state.

Let me explain....

Riak is bad at enumerating keys.  We know that. I am happy to manage a list of keys myself.  Fine.  How do I do that in Riak?

Well, the obvious solution is to maintain a special object that is a list of the keys you need.  So, each time you insert a new object, you effectively append its key to the end of a list that is itself the value of a special index key.

But what is an append in Riak?  The only way to implement a list append is to:

1. read in the entire value of your list object.
2. append to this list at the application layer.
3. write the updated list back to the store.

This is a horrible solution for at least three reasons.  First, inserting N new keys while maintaining your own list is now O(N*N) total runtime complexity, because each append does I/O proportional to the current size of the entire list.  Second, this operation should happen entirely at the data layer, not round-trip between the data and application layers.  Third, it introduces write contention: two clients may try to append at approximately the same time, leaving you with a list that is inconsistent.
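The quadratic cost is easy to see in a toy model (an in-memory dict standing in for the store; the names here are illustrative, not a real Riak client):

```python
# Read-modify-write "append": every append re-reads and re-writes the whole
# list, so inserting N keys moves O(N^2) bytes in total. Illustrative only.
import json

kv = {}          # stand-in for a Riak bucket
bytes_moved = 0  # total I/O performed

def append_key(index_key, new_key):
    global bytes_moved
    # 1. read the entire index value
    raw = kv.get(index_key, "[]")
    bytes_moved += len(raw)
    keys = json.loads(raw)
    # 2. append at the application layer
    keys.append(new_key)
    # 3. write the whole list back
    raw = json.dumps(keys)
    bytes_moved += len(raw)
    kv[index_key] = raw

for i in range(1000):
    append_key("my_index", f"key{i}")

# 1000 keys appended; bytes_moved grows as O(N^2), not O(N)
print(len(json.loads(kv["my_index"])), bytes_moved)
```

A server-side append (as Redis provides with RPUSH) would make each step O(1) and remove the client round-trip entirely.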

The conclusion for me is that you can't efficiently enumerate keys in Riak even if you roll your own key index (in anything close to an ideal way).

To overcome this problem, Riak desperately needs either to maintain its own key index efficiently, or to support atomic mutations on values.

For an example of the latter approach, see Redis, which I think handles this beautifully.

In the end, you may need to redesign your data model so that there is never a need to enumerate keys.  I am trying this, using a combination of:

1. Standard KV approaches,
2. Riak search for being able to enumerate some records in order,
3. Transaction logs stored in a special bucket,
4. Batched M/R phases on the transaction logs to avoid write contention, and
5. Batched rebuilding of "views" in a view bucket.

Given that Riak search is loudly proclaimed as being beta, this makes me fairly anxious.

I am very close to not needing to enumerate keys the bad way now.  However, I would have killed for an atomic mutator like Redis.

BTW, I would love for someone from Basho to disabuse me of my conclusions in this note.

-- GWF








Re: Getting all the Keys

Alexander Sicular
I don't think it is a flaw at all. Rather I am of the opinion that riak was never meant to do the things we are all talking about in this thread. 

When I need to do these things I specifically use redis because, as noted, it has tremendous support for specific data structures. When I need to enumerate keys or mutate counters I use redis and periodically dump those values to riak. I'll write a post or somethin about it. Also, use redis if you wanna do these things.   

I'll drop a phat tangent and just mention that I watched @rk's talk at Qcon SF 2010 the other day and am kinda crushing on how they implemented distributed counters in cassandra (mainlined in 0.7.1 me thinks) which, imho, is so choice for a riak implementation it isn't even funny. It was like pow pow in da face and my face got melted.     

@siculars on twitter

Sent from my iPhone


Re: Getting all the Keys

Eric Moritz
In reply to this post by Gary William Flake

I have a pipe dream of doing a distributed B-tree: a bucket whose values are tree nodes, each containing a list of keys into another bucket plus left and right links to child nodes.

It feels right in my head, though it is probably horribly flawed in some way. I'm certain more clever data nerds than me have thought of this and dismissed it for some obvious reason that I'm too green to see.

Eric.
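For what it's worth, the node layout being described might look something like this (purely hypothetical names; nothing here is an actual Riak structure):

```python
# Hypothetical sketch of the proposed index: each tree node would live at its
# own key, hold a batch of keys into the data bucket, and point to its
# children by key. Illustrative only.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class IndexNode:
    keys: List[str]              # keys into the data bucket, kept sorted
    left: Optional[str] = None   # key of the left child node, if any
    right: Optional[str] = None  # key of the right child node, if any

root = IndexNode(keys=["user:100", "user:200"], left="node:a", right="node:b")
```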


Re: Getting all the Keys

Justin Sheehy
In reply to this post by Alexander Sicular
On Sat, Jan 22, 2011 at 3:18 PM, Alexander Sicular <[hidden email]> wrote:

> I'll drop a phat tangent and just mention that I watched @rk's talk at Qcon
> SF 2010 the other day and am kinda crushing on how they implemented
> distributed counters in cassandra (mainlined in 0.7.1 me thinks) which,
> imho, is so choice for a riak implementation it isn't even funny. It was
> like pow pow in da face and my face got melted.

I know that a couple of people have done their own spikes on
distributed counters for Riak and have demonstrated that it's
certainly doable.

The question isn't "can it be done"; we know it can.  The tricky questions are about which tradeoffs to make: write performance, read performance, and so on.

In other words, I am in support of this sort of feature.

-Justin
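One shape such a spike might take (a grow-only "G-counter" CRDT, a standard technique, not anything Riak ships): each replica increments only its own slot, and merging takes the per-slot maximum, so concurrent increments never conflict and all replicas converge.

```python
# Sketch of a grow-only counter (G-counter) CRDT: each replica increments
# only its own entry; merge takes the per-replica maximum, so replicas
# converge regardless of delivery order. Illustrative only.
class GCounter:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> count

    def increment(self, n=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # both converge to 5
```

The tradeoff Justin mentions shows up here too: reads must merge and sum all replicas' slots, while writes stay cheap and conflict-free.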


Re: Getting all the Keys

Les Mikesell
In reply to this post by Gary William Flake
On 1/22/11 1:45 PM, Gary William Flake wrote:
>
> Riak is bad at enumerating keys.

If the key isn't something that you can use to retrieve the items you want,
what's the point of having it?

--
   Les Mikesell
    [hidden email]


Re: Getting all the Keys

Eric Moritz
In reply to this post by Eric Moritz

After thinking about it. A b-tree in a bucket wouldn't provide any functionality that I couldn't get from riak search... so that's probably a bad idea.


Re: Getting all the Keys

Thomas Burdick
In reply to this post by Les Mikesell
No one seems to have really answered either of my questions in any great detail. The answers so far are either "don't do that" or "use redis", which to me just adds another layer of complexity and potential bugginess to my application, or they fail to really describe what the problem is.

So really my questions boil down to:

* Why is key listing so slow?
* Using Riak alone, what do people do when they have a big set of keys to iterate over?

Tom Burdick



Re: Getting all the Keys

Neville Burnell
In reply to this post by Alexander Sicular
> As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. This will allow you to not scan every key in your cluster.

Is the last part true?

I understood that key filtering just means you don't have to fetch the value from the backend (Bitcask or Innostore). How would it help with respect to scanning every key? Without a secondary index/set somewhere, you would still need to scan every key in the cluster to find all the keys that match your filter.

Kind Regards

Nev
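Neville's reading matches a simple model of key filtering (illustrative Python, not Riak internals): the filter saves value fetches, not the key scan itself.

```python
# Model of a key-filtered M/R input: the filter is applied while scanning
# keys, so non-matching keys never trigger a value fetch, but every key
# still gets examined. Illustrative only.
store = {f"user_{i}": b"<big value>" for i in range(1000)}
store.update({f"order_{i}": b"<big value>" for i in range(1000)})

keys_examined = 0
values_fetched = 0

def filtered_inputs(prefix):
    global keys_examined, values_fetched
    results = []
    for key in store:                 # full key scan: unavoidable without an index
        keys_examined += 1
        if key.startswith(prefix):    # key filter: cheap test, no value fetch
            values_fetched += 1
            results.append(store[key])
    return results

filtered_inputs("user_")
print(keys_examined, values_fetched)  # 2000 keys scanned, 1000 values fetched
```

So the win is real (half the value fetches here) but it is a constant-factor win, not an index.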

On 23 January 2011 03:31, Alexander Sicular <[hidden email]> wrote:
Hi Thomas,

This is a topic that has come up many times. Lemme just hit a couple of high notes in no particular order:

- If you must do a list keys op on a bucket, you must must must use "?keys=stream". True will block on the coordinating node until all nodes return their keys. Stream will start sending keys as soon as the first node returns.

- "list keys" is one of the most expensive native operations you can perform in Riak. Not only does it do a full key scan of all the keys in your bucket, but all the keys in your cluster. It is obnoxiously expensive and only more so as the number of keys in your cluster grows. There has been discussions about changing this but everything comes with a cost (more open file descriptors) and I do not believe a decision has been made yet.

-Riak is in no way a relational system. It is, in fact, about as opposite as you can get. Incidentally, "select *" is generally not recommended in the Kingdom of Relations and regarded as wasteful. You need a bit of a mind shift from relational world to have success with nosql in general and Riak in particular.

-There are no native indices in Riak. By default Riak uses the bitcask backend. Bitcask has many advantages but one disadvantage is that all keys (key length + a bit of overhead) must fit in ram.

-Do not use "?keys=true". Your computer will melt. And then your face.

-As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. This will allow you to not scan every key in your cluster.

-Buckets are analogous to relational tables but only just. In Riak, you can think of a bucket as a namespace holder (it is used as part of the default circular hash function) but primarily as a mechanism to differentiate system settings from one group of keys to the next.

-There is no penalty for unlimited buckets except for when their settings deviate from the system defaults. By settings I mean things like hooks, replication values and backends among others.

-One should list keys by truth if one enjoys sitting in parking lots on the freeway on a scorching summers day or perhaps waiting in a TSA line at your nearest international point of embarkation surrounded by octomom families all the while juggling between the grope or the pr0n slideshow. If that is for you, use "?keys=true".

-Virtually everything in Riak is transient. Meaning, for the most part (not including the 60 seconds or so of m/r cache), there is no caching going on in Riak outside of the operating system. Ie. your subsequent queries will do more or less the same work as their predecessors. You need to cache your own results if you want to reuse them... quickly.



Oh, there's more but I'm pretty jelloed from last night. Welcome to the fold, Thomas. Can I call you Tom?

Cheers,
-Alexander Sicular

@siculars




Re: Getting all the Keys

Ryan Zezeski
I think it's worth mentioning that Riak is based on Amazon's Dynamo, and if you read the paper you'll see Dynamo's use case is lookup by primary key.

-Ryan

[Sent from my iPhone]

On Jan 22, 2011, at 5:43 PM, Neville Burnell <[hidden email]> wrote:

> As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. This will allow you to not scan every key in your cluster.

Is this part true?

I understood that key filtering just means you don't have to fetch the 'value' from the backend (bitcask or innostore). How would it help with respect to scanning every key? Without a secondary index/set somewhere, you would still need to scan every key in the cluster to find all the keys that match your filter.

Kind Regards

Nev




Re: Getting all the Keys

Sean Cribbs-2
In reply to this post by Thomas Burdick
On Jan 22, 2011, at 4:15 PM, Thomas Burdick wrote:

> * Why is key listing so slow?

It is slow because, even if the keys are in RAM, you have to scan roughly all of the keys in the cluster to get a listing for a single bucket.  As a certain person is fond of saying, "full table scan is full table scan".  There are ways to improve this, but without single-arbiters of state (and points of failure) it is very costly.

> * What do people do in the context of purely using riak to do what I want, have a big set of keys to iterate over?

As others have said so eloquently, they don't, they use something else. Or they try to minimize how frequently they do it.  Part of the current revolution in data storage is about realizing that no one tool is going to completely fit your needs, and that that's good and right.  Anyone who tells you otherwise is selling you a bill of goods.  

To understand why listing keys is difficult, you have to understand Riak's (and Dynamo's) original design motivations:

* To be basically available at all times for reads and writes, which in turn means to be tolerant of machine and network failures.
* To provide low-latency random access to large data sets. (Note I didn't say an entire data set.)
* To scale linearly with minimal operational complexity.

Everything has tradeoffs - these are the ones we chose with Riak. Now, we (Basho) are actively trying to create ways to make discovering your data easier (key-filters are one of them, as Justin mentioned we're discussing counters and indices), but the majority of people who use Riak have ways of discovering or knowing keys ahead of time.  If that's not your case, you should look into other solutions; some good ones have been mentioned in this thread.  That said, we hear your pain and are working hard to improve usability while maintaining the properties discussed above.

Cheers,

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/



Re: Getting all the Keys

Eric Moritz

This is the best way for me to understand how to model data in Riak: think about the web. You always have a starting point, and that starting point is a URL. A URL is analogous to a key in Riak: a URL gets you a document on the web, a key gets you a document in Riak.

Now, the page addressed by your URL contains other URLs that serve as pointers to other pages.

In Riak, your starting doc(s) need to have references to other documents in order to relate them.

In fact both systems, when described this way, are extremely relational. They diverge from relational databases in that there is tons of redundant data and no built-in integrity checking.

On the web this results in 404s and inconsistent titles in anchor tags. The same problems happen in Riak; deleted keys could still be referenced by other documents. It's all reminiscent of dangling pointers from my CS classes: just as in that old C code I had to write, there is a lot of housekeeping to make sure integrity is preserved.

Unlike the web, with Riak we have complete control over what links to what. Unfortunately it adds complexity to applications that would otherwise be simple in an ACID database. The benefit of this added complexity is availability and partition tolerance (the AP of CAP).

The folks who wrote the Dynamo paper state that, at least for them, this added complexity was negligible, because they were already designing their services to compensate for integrity issues. Unfortunately for most of us, our SQL databases let us ignore those issues.

tl;dr: think of Riak like the web. The web interrelates pages using URLs; we design our app's Riak documents similarly, using key references.
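The dangling-pointer analogy can be sketched with a plain dict standing in for a Riak bucket; every key and field below is made up for illustration:

```python
# Documents refer to each other by key, and nothing stops a reference
# from dangling once its target is deleted. The dict stands in for a
# Riak bucket; all names are hypothetical.
bucket = {
    "user:tom": {"name": "Tom", "posts": ["post:1", "post:2"]},
    "post:1":   {"title": "Hello", "author": "user:tom"},
    # "post:2" was deleted -- "user:tom" now holds a dangling reference.
}

def dangling_refs(bucket, doc_key, ref_field):
    """The housekeeping the application must do itself: find broken links."""
    return [k for k in bucket[doc_key].get(ref_field, []) if k not in bucket]

broken = dangling_refs(bucket, "user:tom", "posts")
print(broken)  # -> ['post:2']
```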


Re: Getting all the Keys

Thomas Burdick
In reply to this post by Sean Cribbs-2
I mistakenly didn't send a reply to the whole list, but given what everyone is saying I think I "get it" now, and the reasoning behind it.

Given all of that, it seems pretty clear that if I wanted to do what I'm talking about purely in the context of Riak, using links might work, or a bucket containing keys and values that represent a data structure like a list or a btree. Either way, I guess it's up to me to build an index or some faster method of traversing keys. That's fine; I accept that's the cost of using a Dynamo-style database for now :-)
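The second idea (a key whose value acts as an index over other keys) might be sketched like this, with a dict standing in for the store; the names are hypothetical, and in a real cluster the read-modify-write on the index object would itself need conflict/sibling handling:

```python
# Keep a separate "index" object whose value is the list of member keys,
# updated on every write. A dict stands in for the store here.
store = {}
INDEX_KEY = "mytable:index"  # hypothetical name

def put(store, key, value):
    store[key] = value
    index = store.get(INDEX_KEY, [])
    if key != INDEX_KEY and key not in index:
        store[INDEX_KEY] = index + [key]

def all_keys(store):
    # One cheap read of a known key, instead of a cluster-wide key scan.
    return store.get(INDEX_KEY, [])

put(store, "row:1", {"a": 1})
put(store, "row:2", {"a": 2})
print(all_keys(store))  # -> ['row:1', 'row:2']
```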

Thanks for all the insights and comments.

Cheers,
Tom Burdick
