Schema Architecture, Map Reduce & Key Lists


Mat Ellis
We are converting a MySQL-based schema to Riak using Ripple. We're tracking a lot of clicks, and each click belongs to a cascade of other objects:

click -> placement -> campaign -> customer

i.e. we do a lot of operations on these clicks grouped by placement or sets of placements.

Reading this http://lists.basho.com/pipermail/riak-users_lists.basho.com/2010-July/001591.html gave me pause for thought. I was hoping the time needed to crunch each day's data would be proportional to the volume of clicks on that day but it seems that it would be proportional to the total number of clicks ever.

What's the best approach here? I can see a number of 'solutions', each of them complicated:

(1) Maintain an index of clicks by day so that we can focus our operations on a time-bound set of clicks (see the sketch after this list)

(2) Delete or archive clicks once they have been processed or after a certain number of days

(3) Add many links to each placement, one per click (millions potentially)
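
For concreteness, here's roughly what I have in mind for (1), sketched against the plain riak-client API (bucket and key names invented; concurrent appends to the same day object would also need allow_mult/sibling handling):

# Rough sketch of option (1): one index object per day, holding that day's click keys.
require 'riak'

client = Riak::Client.new                  # defaults to localhost:8098
index  = client.bucket('clicks_by_day')    # hypothetical bucket

def record_click(index, click_key)
  day   = Time.now.utc.strftime('%Y-%m-%d')
  entry = index.get_or_new(day)            # JSON list of click keys for the day
  entry.data = (entry.data || []) << click_key
  entry.store
end

The daily crunch would then only need the keys listed in that one object instead of a full key listing.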

On a related noob-note, what would be the best way of creating a set of the clicks for a given placement? Map Reduce or Riak Search or some other method?

Thanks in advance.

M.


Re: Schema Architecture, Map Reduce & Key Lists

Jeremiah Peschka
Riak 0.14 brings key filters - it's still going to take time to filter the keys, but it's an in-memory operation. Using 'smart keys' along the lines of UNIXTIMESTAMP:placement:campaign:customer, you can rapidly filter your keys on meaningful criteria and run MapReduce jobs on the results.

Nothing says you can't also store the same data in multiple buckets in multiple formats to make querying easier.

In response to number 2 - there's a way to set Riak to auto-expire data from a bucket. It'll only be removed when compactions occur, but if you're storing clickstream data that should happen often enough.
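
For example, pulling one day of clicks out of keys shaped like that might look roughly like this against the HTTP MapReduce endpoint (bucket name and timestamps are invented, and the filter chain would need tuning to your exact key layout):

# Hedged sketch: key-filter one day of UNIXTIMESTAMP:placement:campaign:customer
# keys and feed the matches into a MapReduce job (requires Riak 0.14+).
require 'net/http'
require 'json'

day_start = 1297296000                    # 2011-02-10 00:00:00 UTC
day_end   = day_start + 86_399

job = {
  'inputs' => {
    'bucket'      => 'clicks',            # hypothetical bucket
    'key_filters' => [
      ['tokenize', ':', 1],               # grab the UNIXTIMESTAMP field
      ['string_to_int'],
      ['between', day_start, day_end]
    ]
  },
  'query' => [
    { 'map' => { 'language' => 'javascript',
                 'name'     => 'Riak.mapValuesJson',
                 'keep'     => true } }
  ]
}

uri = URI('http://127.0.0.1:8098/mapred')
res = Net::HTTP.post(uri, job.to_json, 'Content-Type' => 'application/json')
puts res.body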

-- 
Jeremiah Peschka
Microsoft SQL Server MVP
MCITP: Database Developer, DBA

On Thursday, February 10, 2011 at 9:35 AM, Mat Ellis wrote:

… snip …

Re: Schema Architecture, Map Reduce & Key Lists

Mat Ellis
That's interesting. Unfortunately I think a smart key would simply shift the problem elsewhere, as we also generate a UUID so clicks can be converted when a conversion event occurs. So we either have a UUID as the key (and can quickly look up a click to see if it's valid, and then the associated data if it is) or we have a smart key (and can easily look up the associated data). Either way, the posting I linked to suggests that unless I supply long lists of keys I have to scan *all* keys in *all* buckets, no matter how much I segment my data into separate (i.e. smaller) "sub-" buckets.

The auto-expire feature will definitely be handy. We'll look at that.

Does anyone have:

(a) opinions on addressing this issue using Ripple's "many" and "one" associations? It seems I could end up with a placement record with millions of clicks attached

(b) any good examples of Ripple's associations being used in real code (as opposed to examples here http://seancribbs.github.com/ripple/Ripple/Associations.html) and/or a discussion about Ripple::Document vs Ripple::EmbeddedDocument?

M.



On Feb 10, 2011, at 9:52 AM, Jeremiah Peschka wrote:

… snip …

Re: Schema Architecture, Map Reduce & Key Lists

bryan-basho
In reply to this post by Mat Ellis
On Thu, Feb 10, 2011 at 12:35 PM, Mat Ellis <[hidden email]> wrote:
> We are converting a mysql based schema to Riak using Ripple. We're tracking
> a lot of clicks, and each click belongs to a cascade of other objects:
> click -> placement -> campaign -> customer
> i.e. we do a lot of operations on these clicks grouped by placement or sets
> of placements.
… snip …
> On a related noob-note, what would be the best way of creating a set of the
> clicks for a given placement? Map Reduce or Riak Search or some other
> method?

Hi, Mat.  I have an alternative strategy I think you could try if
you're up for stepping outside of the Ripple interface.  Your incoming
clicks reminded me of other stream data I've processed before, so the
basic idea is to store clicks as a stream, and then process that
stream later.  The tools I'd use to do this are Luwak[1] and
luwak_mr[2].

First, store all clicks, as they arrive, in one Luwak file (or maybe
one Luwak file per host accepting clicks, depending on your service's
arrangement).  Luwak has a streaming interface that's available
natively in distributed Erlang, or over HTTP by exploiting the
"chunked" encoding type.  Roll over to a new file on whatever
convenient trigger you like (time period, timeout, manual
intervention, etc.).
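
As a rough, untested sketch of the HTTP side of that (the /luwak path and the record format are assumptions; the wiki page below has the real details), you could keep a chunked PUT open and stream newline-delimited click records into it:

# Untested sketch: stream clicks into one Luwak file per day over HTTP,
# using chunked transfer encoding.
require 'net/http'

# Minimal IO-like source: Net::HTTP only needs #read(len) returning nil at
# end-of-stream. This one replays a fixed array; a real one would block on
# the incoming click queue.
class ClickSource
  def initialize(lines)
    @lines = lines
  end

  def read(_len = nil)
    @lines.shift
  end
end

filename = "clicks-#{Time.now.utc.strftime('%Y-%m-%d')}"
uri = URI("http://127.0.0.1:8098/luwak/#{filename}")

Net::HTTP.start(uri.host, uri.port) do |http|
  req = Net::HTTP::Put.new(uri.path)
  req['Content-Type']      = 'application/json'
  req['Transfer-Encoding'] = 'chunked'
  req.body_stream = ClickSource.new([
    %({"placement":"p1","campaign":"c9","ts":1297296000}\n),
    %({"placement":"p1","campaign":"c9","ts":1297296042}\n)
  ])
  http.request(req)
end

Rolling over to a new file is then just a matter of changing the filename on whatever trigger you pick.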

Next, use map/reduce to process the stream.  The luwak_mr utility will allow you to specify a Luwak file by name, and it will handle tossing each of the chunks of that file to various cluster nodes for processing.  The first stage of your map/reduce query just needs to be able to handle any single chunk of the file.

I've posted a few examples about how to use the luwak_mr
utility.[3][4][5]  They deal with analyzing events in baseball games
(another sort of stream of events).

Pros:
 - No need to list keys.
 - The time to process a day's data should be proportional to the
number of clicks on that day (i.e. proportional to the size of the
file).

Caveats:
 - Luwak works best with write-once data.  Modifying a block of a
Luwak file after it has been written causes the block to be copied,
and the old version of the block is not deleted.  (Even if some of
your data is modification-heavy, this might work for the non-modified
parts … like the key list for a day's clicks?)
 - I don't have good numbers for Luwak's speed/efficiency.
 - I've only recently started experimenting with Luwak in this
map/reducing manner, so I'm not sure if there are other pitfalls.

[1] http://wiki.basho.com/Luwak.html
[2] http://contrib.basho.com/luwak_mr.html
[3] http://blog.beerriot.com/2011/01/16/mapreducing-luwak/
[4] http://blog.basho.com/2011/01/20/baseball-batting-average%2c-using-riak-map/reduce/
[5] http://blog.basho.com/2011/01/26/fixing-the-count/


Re: Schema Architecture, Map Reduce & Key Lists

Mat Ellis
Thanks Bryan, that certainly looks interesting. The clicks are amended, but only once and only for a tiny percentage of them (when they convert). We're basically doing what you describe: taking a click stream and processing it once into a set of summary tables for reporting & decision making. We'll take a look at it as soon as we've finished getting our head around the Ripple goodness.

Cheers

M.

On Feb 10, 2011, at 11:54 AM, Bryan Fink wrote:

> … snip …

Re: Schema Architecture, Map Reduce & Key Lists

Alexander Sicular
I would change the model and have another stream for "converted" clicks.

-Alexander Sicular

@siculars

On Feb 10, 2011, at 5:58 PM, Mat Ellis wrote:

> … snip …

Re: Schema Architecture, Map Reduce & Key Lists

Mat Ellis
Good idea, thanks.

M.

On Feb 10, 2011, at 4:10 PM, Alexander Sicular wrote:

> … snip …

Re: Schema Architecture, Map Reduce & Key Lists

Nico Meyer
In reply to this post by Jeremiah Peschka
Hi Jeremiah!

Actually there should be no compaction at all if he only ever inserts
new keys, so the expire feature of bitcask won't help in this case.
Compactions/Merges only happen if keys have been updated or deleted.

Cheers,
Nico

On Thursday, 10 Feb 2011 at 09:52 -0800, Jeremiah Peschka wrote:

> … snip …

has_many :through

Mat Ellis
In reply to this post by Mat Ellis
We're still thoroughly in an ActiveRecord mindset over here. Before we go and build some simple collect statements to replace a few has_many :through statements, is there some Ripple or Riak goodness we should be using instead?

Example:

* A Publisher has many Authors who have many Books
* We want to get a set of books for a given publisher
* In ActiveRecord we would say:

# publisher.rb
class Publisher < ActiveRecord::Base
  has_many :authors
  has_many :books, :through => :authors
end

# author.rb
class Author < ActiveRecord::Base
  belongs_to :publisher
  has_many :books
end

# book.rb
class Book < ActiveRecord::Base
  belongs_to :author
end

Thanks

M.



Re: has_many :through

Sean Cribbs-2
Mat,

In Riak you would implement this pattern with links.  Ripple helps you do similar queries to has_many :through, but there is no automatic thing for it yet (it needs some more thought).  In the meantime, do this:

class Publisher
  include Ripple::Document
  many :authors

  def books
    authors.map {|a| a.books }.flatten
  end
end
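
Calling it then looks like this (the key and attribute are placeholders):

publisher = Publisher.find("oreilly")   # hypothetical key
publisher.books.map { |b| b.title }     # assumes Book documents have a title property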

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Feb 11, 2011, at 3:39 PM, Mat Ellis wrote:

> … snip …

Re: has_many :through

Mat Ellis
We're using links and were going to implement pretty much what you recommend. It wouldn't seem too hard to implement this kind of feature via some metaprogramming in Ripple; another one for the feature backlog, maybe.

Thanks

M.

On Feb 11, 2011, at 12:56 PM, Sean Cribbs wrote:

> … snip …