Structure of Erlang MR Arg

Jeremiah Peschka
Christian brought up something in the "ListKeys or MapReduce" thread that deserves a discrete thread.

In the MapReduce Implementation docs [1], it looks like the Arg parameter of an Erlang MR phase is not a single argument but is, in fact, a proplist of Erlang terms. Am I correct in assuming that this is the case?

And, if this is the case, would the JSON submitted to Riak look something like this:

{"reduce":{"language":"erlang",
           "module":"riak_kv_mapreduce",
           "function":"reduce_count_inputs", 
           "arg":{"do_prereduce":true,
                  "reduce_phase_batch_size": 250
                 }
           }
}

As we're mucking around in the lower levels of Corrugated Iron, I want to make sure we're able to send the appropriate jibber jabber back to Erlang phases.


---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Structure of Erlang MR Arg

bryan-basho
Administrator
Doh … sorry this thread ended up lost in the stack. Response below…

On Feb 14, 2013, at 10:58 AM, Jeremiah Peschka <[hidden email]> wrote:

> In the MapReduce Implementation docs [1], it looks like the Arg parameter of an Erlang MR phase is not a single argument but is, in fact, a proplist of Erlang terms, am I correct in assuming that this is the case?
>
> And, if this is the case, would the JSON submitted to Riak look something like this:
>
> {"reduce":{"language":"erlang",
>            "module":"riak_kv_mapreduce",
>            "function":"reduce_count_inputs",
>            "arg":{"do_prereduce":true,
>                   "reduce_phase_batch_size": 250
>                  }
>            }
> }

Your read of the doc is mostly correct. The arg is whatever you want it to be. However, the MR interface has recently begun to inspect the arg itself to find tuning parameters. What it's looking for is exactly what you suggested: a JSON object with those fields in it. The idea is that the object could also contain other fields, allowing other data to be passed to the function, as usual. Some of the built-in functions in riak_kv_mapreduce.erl need to be updated to support an arg in this format; sorry for the mixed message.
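
To illustrate that shape, here is a minimal Python sketch (not part of the original exchange) that builds a reduce phase whose arg is a JSON object carrying the `reduce_phase_batch_size` tuning field alongside one extra field. The helper name `reduce_phase` and the `my_threshold` field are hypothetical, and nothing here implies that `reduce_count_inputs` actually reads the extra field.

```python
import json


def reduce_phase(module, function, batch_size=None, **extra):
    """Build an Erlang reduce phase spec whose arg is a JSON object.

    `extra` holds any other fields to pass through to the function;
    the tuning field is added only when a batch size is given.
    """
    arg = dict(extra)
    if batch_size is not None:
        arg["reduce_phase_batch_size"] = batch_size
    return {
        "reduce": {
            "language": "erlang",
            "module": module,
            "function": function,
            "arg": arg,
        }
    }


# `my_threshold` is a made-up field standing in for user data.
phase = reduce_phase(
    "riak_kv_mapreduce",
    "reduce_count_inputs",
    batch_size=250,
    my_threshold=10,
)
print(json.dumps(phase, indent=2))
```

Keeping the tuning field and the user data in one object mirrors the description above: the MR interface picks out the fields it knows, and the rest is left for the phase function.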

HTH,
Bryan



Re: Structure of Erlang MR Arg

bryan-basho
Administrator
On Feb 26, 2013, at 9:16 AM, Bryan Fink <[hidden email]> wrote:
>> {"reduce":{"language":"erlang",
...
>>           "arg":{"do_prereduce":true,
>>                  "reduce_phase_batch_size": 250
...

I should have also mentioned that `do_prereduce` is actually a *map* phase argument. It is only applicable when a map phase is followed by a reduce phase.
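
Under that correction, a job that uses both settings would put each one on its own phase. The following Python sketch (an illustration, not from the original posts) assumes the usual `inputs`/`query` layout of a JSON MapReduce request; the bucket name `my_bucket` and the helper `erlang_phase` are hypothetical, and `map_object_value` is used only as an example map function.

```python
import json


def erlang_phase(kind, function, arg=None, module="riak_kv_mapreduce"):
    """Return a {"map": ...} or {"reduce": ...} phase spec for an Erlang function."""
    spec = {"language": "erlang", "module": module, "function": function}
    if arg is not None:
        spec["arg"] = arg
    return {kind: spec}


job = {
    "inputs": "my_bucket",  # hypothetical bucket name
    "query": [
        # do_prereduce goes on the map phase that precedes the reduce.
        erlang_phase("map", "map_object_value", {"do_prereduce": True}),
        # The batch size stays on the reduce phase itself.
        erlang_phase("reduce", "reduce_count_inputs",
                     {"reduce_phase_batch_size": 250}),
    ],
}

body = json.dumps(job)  # request body to submit to Riak
print(body)
```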

-Bryan



Re: Structure of Erlang MR Arg

Jeremiah Peschka
Excellent, thanks much for the clarification.

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop




