Really really basic questions


Really really basic questions

David King
I have some *very* newbie questions; is there a more comprehensive FAQ somewhere than the one on the site?



Re: Really really basic questions

Preston Marshall
That's what the list is for :)  Don't ask to ask, just ASK! Just like on IRC!  I've found that the two NoSQL conference videos are quite informative.
On Feb 28, 2010, at 11:37 PM, David King wrote:

> I have some *very* newbie questions, is there a more comprehensive FAQ somewhere than the one on the site?



Re: Really really basic questions

David King
>> I have some *very* newbie questions, is there a more comprehensive FAQ somewhere than the one on the site?
> Thats what the list is for :)  Don't ask to ask, just ASK! Just like on IRC!  I've found that the two NoSQL conference videos are quite informative.

I just hate it when people ask questions that make it clear they haven't even tried the software or read the FAQ, and I didn't want to be one of those people :)

* dets is generally limited to a 4gb datastore. So would using this backend limit me to 4gb of data per node, or does riak work around this in some way? (The system we'd be migrating is at 6.4gb per node over five nodes atm)

* I don't see much information around about osmos. Is osmos production ready? How does it compare in performance to postgres, or BDB? Does it have size limitations too? What backend do most people use?

* I'm a little confused by the mapreduce query model. Do I specify the map/reduce pairs up-front before adding data (like an index), or every time I run a query (like a SQL SELECT)? Or in pairs (like doing a SELECT against columns that I know have an index)?

* I don't see that the Python library <http://hg.basho.com/riak/src/tip/client_lib/riak.py> has a get_multi-type command, where I fetch several resources in a single HTTP query. Is this missing in the Python library, or in the API?

Re: Really really basic questions

John Lynch
David, I can tackle a few of these...

> * I don't see much information around about osmos. Is osmos production ready? How does it compare in performance to postgres, or BDB? Does it have size limitations too? What backend do most people use?

Basho just released the ability to use InnoDB as a backend. It looks very promising.  
http://blog.basho.com/2010/01/26/innostore----connecting-erlang-to-innodb/
 
> * I'm a little confused by the mapreduce query model. Do I specify the map/reduce pairs up-front before adding data (like an index), or every time I run a query (like a SQL SELECT)? Or in pairs (like doing a SELECT against columns that I know have an index)?

Map/Reduce runs at the time of a query: the map function runs on each node, and the final reduce phase runs on a single node. You don't have to query a whole bucket; you can specify a list of keys to start the search from.
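
For concreteness, a map/reduce job that starts from an explicit list of keys might be POSTed to Riak's HTTP interface roughly like this. This is only a sketch: the /mapred path, the port, and the phase-spec layout are assumptions and may differ by Riak version.

    import json, urllib2

    # Run a map phase over three specific links rather than a whole bucket.
    job = {
        "inputs": [["links", "1"], ["links", "2"], ["links", "3"]],
        "query": [
            {"map": {"language": "javascript",
                     "source": "function(value) { return [JSON.parse(value)]; }"}}
        ]
    }
    req = urllib2.Request("http://127.0.0.1:8098/mapred", json.dumps(job),
                          {"Content-Type": "application/json"})
    print urllib2.urlopen(req).read()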
 

Regards,

John Lynch, CTO
Rigel Group, LLC
[hidden email]
Mobile: 760-515-2653


Re: Really really basic questions

David King
>> * I'm a little confused by the mapreduce query model. Do I specify the map/reduce pairs up-front before adding data (like an index), or every time I run a query (like a SQL SELECT)? Or in pairs (like doing a SELECT against columns that I know have an index)?
> Map/Reduce runs at the time of a query, the map function runs on each node, and the final reduce phase runs on a single node. You dont have to query a whole bucket, you can specify a list of keys to start the search from.

So for instance, I work for reddit. To ask for the top-N scoring Links right now, we'd have to map against every Link object in the store, every time?

Re: Really really basic questions

John Lynch
I just asked that question recently, and got the below answer from Basho:

--------------------------
Hi John,

In the near future, we are planning to add a pre-commit "hook", specified at a bucket level. This would provide the building blocks necessary to keep an index up to date when an object is stored in Riak. Eventually, I expect to see frameworks and other tools use the hook to allow easy indexing. 

Until that is in place, the best approach for querying depends on the shape of your data:

If you are searching for data through relations in a hierarchy, and you know the starting point of that hierarchy, then you should add tagged links to your objects and use linkwalking. If you need to search in a hierarchy, but need more flexibility than links and tags can provide, then you can use map/reduce functionality. By "starting point", I mean that you know the exact object or objects under which you would like to query.

If your queries are not relational/hierarchical and you don't know the starting point in advance, then your best approach would be to mimic the hook feature described above, and build up your index by hand in a separate Riak object. You could do this in your application when an object is stored (which requires extra hops to Riak), or you can use a background process to do this using list-keys, which means there will be some lag between when your data is stored and when the index is updated. (Keep in mind that list-keys can be an expensive operation, which is why it should be a background process.) 

Best,
Rusty
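
To make the "build up your index by hand" approach concrete, here is a minimal sketch of the application-side update over Riak's HTTP interface. The /riak URL prefix, the bucket and key names, and the index layout are all assumptions for illustration, not anything Riak prescribes.

    import json, urllib2

    RIAK = "http://127.0.0.1:8098/riak"   # assumed HTTP prefix; differs by version

    def fetch_json(bucket, key, default):
        try:
            return json.loads(urllib2.urlopen("%s/%s/%s" % (RIAK, bucket, key)).read())
        except urllib2.HTTPError:          # index object doesn't exist yet
            return default

    def store_json(bucket, key, value):
        req = urllib2.Request("%s/%s/%s" % (RIAK, bucket, key),
                              json.dumps(value),
                              {"Content-Type": "application/json"})
        req.get_method = lambda: "PUT"
        urllib2.urlopen(req)

    def store_link_and_index(link):
        # 1) store the object itself
        store_json("links", str(link["id"]), link)
        # 2) mimic the future pre-commit hook: keep a per-subreddit index object
        idx_key = "by_sr_%s" % link["sr_id"]
        idx = fetch_json("indexes", idx_key, [])
        if link["id"] not in idx:
            idx.append(link["id"])
        store_json("indexes", idx_key, idx)

A real version would also have to handle concurrent updates to the index object (with allow_mult on, Riak surfaces those as siblings to resolve), which is exactly the bookkeeping the planned hook should eventually absorb.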



On Fri, Feb 26, 2010 at 4:53 PM, John Lynch <[hidden email]> wrote:
I am preparing to give a talk on Riak next week to a local Ruby user group here in San Diego, and wanted to get your thoughts on the future of Riak.  While in its current form it is awesome for loads of use cases, it falls short in the whole querying department, at least as it relates to building typical web applications. I get the map/reduce features, and the link features, but are there any plans to build in indexed query capabilities, a la MongoDB?  Mongo obviously has an easier time of this since it knows exactly what format your data is in (BSON). Or, is this something you would leave to third-parties to build as part of an ORM framework like Ripple, for example, which would have a better idea of the shape of the data and could create/maintain indexes accordingly...


Regards,

John Lynch, CTO
Rigel Group, LLC
[hidden email]






Re: Really really basic questions

Preston Marshall
Awesome, I love reddit!  Good job at getting the site back and going after the caches failed.

Essentially, Riak runs each item through your map function, and your function chooses to omit it or not.  You're probably more worried about reduces though, unless you plan on manually incrementing the score on each item.  This is the fastest way, if you can work with it.  The maps are run concurrently on each node you have.  This would be particularly useful for you guys, because you could just add more nodes (since you use EC2) automatically when your response times go above a certain level, and then remove them (scale down) just as easily.  This is something unique about Riak that I find quite interesting.

Riak caches the results of maps internally for a limited amount of time.  You would probably have a function to run each type of list for each subreddit (Top, Hot, New, etc. inside of Pics, WTF, etc.).  In addition to the internal caching, since it's HTTP, you could and should easily throw a big HTTP cache in front of it.

Please feel free to ask me anything else. I think it would be awesome for Riak to have reddit as one of its big users; Basho might even be able to improve your search!

Thanks,
Preston Marshall (bbhoss on reddit!)
On Mar 1, 2010, at 7:22 PM, David King wrote:

>>> * I'm a little confused by the mapreduce query model. Do I specify the map/reduce pairs up-front before adding data (like an index), or every time I run a query (like a SQL SELECT)? Or in pairs (like doing a SELECT against columns that I know have an index)?
>> Map/Reduce runs at the time of a query, the map function runs on each node, and the final reduce phase runs on a single node. You dont have to query a whole bucket, you can specify a list of keys to start the search from.
>
> So for instance, I work for reddit. To ask for the top-N scoring Links right now, we'd have to map against every Link object in the store, every time?



Re: Really really basic questions

David King
> Awesome, I love reddit!

Hey good, that will make explaining it easier :)

> Good job at getting the site back and going after the caches failed

Heh, thanks :) That was a hell of a Sunday night. That's what has me here, in fact. We've long been looking at alternative key/value stores, but outgrowing those caches is what has accelerated the search.

> Essentially, a Riak runs each item through your map function, and your function chooses to omit it or not

But that means that to determine, say, whether an item is in the top N, it has to run that mapfun against every item in the store, emitting each item's hotness along with the object proper, coupled with a reducefun that keeps the top N in a running max list? We have 10 million links, so I'd have to run that map function 10 million times every time someone goes to the front page? Even if we had a "starting point" like "links from programming", that really only drops the number to . On our current model, we only look at a thousand links maximum (on average, 37).

> You're probably more worried about reduces though, unless you plan on manually incrementing the score on each item.

Actually, this is what we do. Every Link has 'ups' and 'downs' properties that we incr/decr (along with storing the vote itself so that we can look it up later) every time you vote or change a vote.

> You would probably have a function to run each type of list for each subreddit (Top, Hot, New, etc inside of Pics, WTF, etc).

Can you give me an example of what such a mapreduce pair would look like? For reference, our data right now looks like (a smaller set of the properties for simplification):

    Subreddit:
      int id

    Link:
      int id
      int sr_id
      int ups
      int downs
      int date

To render the front page, we sort by hot, a function that looks like

    def hot(ups, downs, date):
        """The hot formula. Should match the equivalent function in postgres."""
        s = score(ups, downs)
        order = log(max(abs(s), 1), 10)
        sign = 1 if s > 0 else -1 if s < 0 else 0
        seconds = epoch_seconds(date) - 1134028003
        return round(order + sign * seconds / 45000, 7)
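
(score() and epoch_seconds() aren't shown above; stand-in definitions along these lines, assumed here purely for illustration, would make the snippet self-contained:)

    from datetime import datetime
    from math import log   # hot() above also needs log()

    def score(ups, downs):
        # assumed: net vote count
        return ups - downs

    def epoch_seconds(date):
        # assumed: seconds since the Unix epoch, with date a datetime
        td = date - datetime(1970, 1, 1)
        return td.days * 86400 + td.seconds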

But you can imagine that it's just (ups-downs)/epoch_seconds(date) if that helps. The way we used to do this is we had an index on that function in postgres, so a query like (also greatly simplified):

    SELECT   *
    FROM     links
    WHERE    sr_id = <programming's ID>
    ORDER BY hot(ups, downs, date)

That didn't scale terribly well past about a million links, so the way we do it now is with a pickled list stored in memcachedb: every time someone votes, we look up the listing (in a memcachedb key like 'hotness_list_programming'), insert into it the new hotness of this item along with its ID, sort it, and cut it off at the top thousand. So we're keeping the thousand hottest items pre-ordered in this list, and only recalculating it when someone actually changes the hotness of a link (the hotness only changes when (ups, downs) changes). So the list looks like

    [(100, (your link's ID)),
     (99,  (my link's ID))]

And someone submits something with no votes yet, now it looks like:

    [(100, (your link's ID)),
     (99,  (my link's ID)),
     (0,   (their link's ID))]

So we're managing our own indices, basically. This isn't totally optimal, but at least it's fast, since we really aren't ever running SQL against Postgres; we're using it as a key-value store, which it does *very* well.
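
In code, the per-vote listing update amounts to something like this (a simplified sketch rather than our actual implementation; the memcachedb get/pickle/set around it is omitted):

    MAX_ITEMS = 1000

    def update_hot_listing(listing, link_id, new_hotness):
        """listing is the [(hotness, link_id), ...] list, hottest first."""
        # drop any previous entry for this link, then re-insert it at its new hotness
        listing = [(h, lid) for (h, lid) in listing if lid != link_id]
        listing.append((new_hotness, link_id))
        listing.sort(reverse=True)
        return listing[:MAX_ITEMS]   # keep only the thousand hottest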

So if we weren't dealing with a lot of scale (which is what necessitates our managing our own indices like that), what would be the canonical way to make a mapreduce pair to pick up the N hottest links from the programming Subreddit? And would that canonical way require us to do crazy things on every request, like running 10 million copies of the mapfuns, which *wouldn't* deal with a lot of scale? I'm fine with making compromises like denormalising the hotness in a property of the Link or whatever; I just want to know the canonical way it would be done.

> In addition to the internal caching, since it's HTTP, you could and should easily throw a big HTTP cache in front of it.

How would this couple with invalidation of cached queries? (That is, you vote on something, invalidating the queries for the subreddit where it lives, your Liked page, etc.) Would we have to invalidate those ourselves, or is there some GET If-Modified-Since support or something?

To be clear, what we're looking at in the short term is something to replace memcachedb, not postgres. It looks like Riak has the simple key/value stuff down (except for the limitation on disk space per node, which will be a real sticking point since we're already larger than that limit; a half-done but "promising" InnoDB implementation isn't going to cut it), and key/value is all memcachedb does for us, so that's fine. But I want to understand the query model so that I can investigate moving more of our stuff into it if it performs well for those tasks too.



Re: Really really basic questions

Kevin Smith
David --

Here are a few suggestions on how you might map your link hotness process and data model onto Riak:

1) Partition your data into a bucket per subreddit. This will cut down on the scope of the data processed by the M/R jobs.

2) Perform your hotness calculation inside of a map phase. Map phases are highly parallel and run on the node where the data lives. This should give you the best performance.

3) Sorting the links according to their hotness score can be done in a subsequent reduce phase.

4) A final reduce phase, written in Erlang, could store the results of steps 2 & 3 back in Riak.  If you can tolerate some latency in the liveness of rendered data, you could render the page off this data and reduce the load on the Riak cluster. This step has to be written in Erlang since only Erlang map/reduce functions have access to Riak's client API.

4a) Steps 2-4 could run as a scheduled map/reduce job via cron or similar to keep it fresh and updated.

5) Rendering the main home page could be done in a similar fashion using the cached lists created in #4.

Assuming you store your data as JSON -- although that's only one possibility -- the actual code for the m/r could look like this, using a combination of JavaScript and Erlang:

Map function
------------------
function(value, keyData, arg) {
  /* Have to parse since data comes in as strings */
  var data = JSON.parse(value);

  /* Assumption: hotness() is defined somewhere else. Use a different local
     name so this var doesn't shadow the hotness() function itself. */
  var h = hotness(data.ups, data.downs, data.date);

  /* Build a new JSON hash containing just the link id and hotness rating */
  return [{id: data.id, hotness: h}];
}

Sorting reduce function
--------------------------------
function(values, arg) {
  var sorter = function(f, s) {
    if (f.hotness == s.hotness) {
      return 0;
    }
    else if (f.hotness > s.hotness) {
      return 1;
    }
    else {
      return -1;
    }
  }
  return values.sort(sorter);
}

Storing reduce function
-------------------------------
%% Where Arg is the subreddit name
%% Also perform "top N" truncation here
fun(Values, Arg) ->
  {ok, Client} = riak:local_client(),
  Obj = riak_object:new(<<"hot_lists">>, Arg, Values),
  %% Store the list w/2 replicas and call it a day
  Client:put(Obj, 2),
  []
end.
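
Wiring those three phases into a single job, the submission itself might look roughly like this. It's a sketch with a few assumptions: the /mapred HTTP endpoint, the two JavaScript functions above saved to files and inlined as source strings, and the Erlang fun compiled into a hypothetical module reddit_mr as store_hot/2.

    import json, urllib2

    # Placeholders: the two JavaScript functions above, read in as strings.
    HOTNESS_MAP_JS = open("hotness_map.js").read()
    SORT_REDUCE_JS = open("sort_reduce.js").read()

    job = {
        "inputs": "programming_links",              # one bucket per subreddit (step 1)
        "query": [
            {"map":    {"language": "javascript", "source": HOTNESS_MAP_JS}},
            {"reduce": {"language": "javascript", "source": SORT_REDUCE_JS}},
            {"reduce": {"language": "erlang",
                        "module": "reddit_mr",      # hypothetical module holding the fun above
                        "function": "store_hot",
                        "arg": "programming"}}      # subreddit name, passed through as Arg
        ]
    }
    req = urllib2.Request("http://127.0.0.1:8098/mapred", json.dumps(job),
                          {"Content-Type": "application/json"})
    urllib2.urlopen(req)

Run from cron (step 4a), something like this keeps the hot_lists objects reasonably fresh without touching the cluster on every page render.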

There are several performance optimizations that would make this M/R job run faster than what I've specced here. I've left them out to illustrate the shape of M/R processing. Feel free to ping me with any questions.

--Kevin
