Understanding MapReduce and chunked/streaming results

Blake Schwendiman
Assuming I have keys in one bucket, if I have a reduce function that calculates an average (of some attribute of each key), and I want to use chunking, does each node return the average of all values reduced on that node independently? If so, I really need the node to emit sum and count so I can do the final average calculation in PHP (or whichever language) once I have the total sum and count of items.

Or, since I'm calculating a value from a single bucket of keys, will each node that contains the bucket return the same value from the reduce function, because each node has the same (complete) set of data?

Thanks again for your patience with a complete noob.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Blake Schwendiman

Blog: http://www.thewhyandthehow.com/
Facebook: http://www.facebook.com/blake.schwendiman

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Understanding MapReduce and chunked/streaming results

Sean Cribbs-2
Yes. Because reduce functions can be run multiple times, each time on only part of the input (which may include the output of earlier reduce runs), your best bet is to keep track of the total and the count and then do the final average on the client side.
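
To make the sum-and-count approach concrete, here is a minimal sketch in plain Python. It only simulates the behaviour and does not use the Riak API; the names reduce_sum_count and final_average and the sample numbers are made up for illustration. The assumption is that the reduce step may be handed a mix of raw values and partial results it produced earlier, so it has to accept both.

```python
def reduce_sum_count(values):
    """Fold raw numbers and earlier {sum, count} partials into one partial.

    Hypothetical stand-in for a reduce function: it must tolerate being
    re-run on its own output, so it accepts both kinds of input.
    """
    total = 0.0
    count = 0
    for v in values:
        if isinstance(v, dict):
            # Partial result from an earlier reduce run.
            total += v["sum"]
            count += v["count"]
        else:
            # Raw value coming from the map phase.
            total += v
            count += 1
    return [{"sum": total, "count": count}]


def final_average(partials):
    """Client-side step: combine every received partial, then divide once."""
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    return total / count if count else None


# Two made-up chunks, as a client might receive them when streaming.
chunk_a = reduce_sum_count([10, 20, 30])
chunk_b = reduce_sum_count([40])
average = final_average(chunk_a + chunk_b)
```

With these sample numbers the combined result is 25, the true average of the four values. Averaging the two per-chunk averages (20 and 40) would give 30 instead, which is why the division is left to the client.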

Sean Cribbs <[hidden email]>
Developer Advocate
Basho Technologies, Inc.

On Apr 5, 2010, at 4:39 PM, Blake Schwendiman wrote:

Assuming I have keys in one bucket, if I have a reduce function that calculates an average (of some attribute of each key), and I want to use chunking, does each node return the average of all values reduced on that node independently? If so, I really need the node to emit sum and count so I can do the final average calculation in PHP (or whichever language) once I have the total sum and count of items.

Or, since I'm calculating a value from a single bucket of keys, will each node that contains the bucket return the same value from the reduce function, because each node has the same (complete) set of data?

Thanks again for your patience with a complete noob.
