mochijson2 error from curl mapreduce


mochijson2 error from curl mapreduce

John Roy
I saw something similar from someone who was using JavaScript (http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-July/004843.html), but I don't think I have the same problem in Erlang. Any help is greatly appreciated.

I'm consistently getting the following error from my curl map reduce calls:

{error,{exit,{json_encode,{bad_term,…..

The output dictionary is printed here, followed by:

        [{mochijson2,json_encode,2},
         {mochijson2,'-json_encode_array/2-fun-0-',3},
         {lists,foldl,3},
         {mochijson2,json_encode_array,2},
         {riak_kv_wm_mapred,pipe_mapred_nonchunked,5},
         {webmachine_resource,resource_call,3},
         {webmachine_resource,do,3},

The term that follows the error appears to be the correct result of the map reduce.

I'm just running the map reduce locally for testing purposes. I'm using the eleveldb backend and Riak (1.1.2 2012-04-17) on OSX x86_64.

Here's the curl command:

curl -v -X POST http://127.0.0.1:8098/mapred -H "Content-Type: application/json" -d @query.js

and the query.js:

{
    "inputs" : {
        "bucket" : "test7",
        "index" : "field1_int",
        "start" : 0,
        "end" : 23
    },
    "query" : [
        { "map" : {"language" : "erlang", "module" : "impressions", "function" : "emitStats", "keep" : false} },
        { "reduce" : {"language" : "erlang", "module" : "impressions", "function" : "reduceStats", "keep" : true} }
    ],
    "timeout" : 600000
}

and finally the associated Erlang functions:

emitStats(G, undefined, none)->
    ObjectJson = riak_object:get_value(G),
    {struct, DocPropList} = mochijson:decode(ObjectJson),
    I = {proplists:get_value("obj_name",DocPropList,"None"),
         proplists:get_value("type",DocPropList,"None"),
         proplists:get_value("obj_url",DocPropList,"None"),
         "hourly"},
    [dict:from_list([{I,1}])].

reduceStats(Gcounts, none)->
    [lists:foldr(fun(G, Acc)->dict:merge(fun(_, X, Y) -> X+Y end,G, Acc) end,
                 dict:new(),
                 Gcounts)].



_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: mochijson2 error from curl mapreduce

Bob Ippolito
On Tue, Jun 5, 2012 at 1:22 PM, John Roy <[hidden email]> wrote:

I don't have any experience with Riak's MR implementation, but mochijson2 has no native support for serializing a dict, and JSON itself doesn't support keys that are not strings. I suspect that one or both of those is your problem. mochijson2 expects to see {struct, [{Key :: binary(), Value :: json_term()}]} for JSON objects.
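
To make that shape concrete, here is a minimal sketch. It is not code from Riak or mochijson2: the module name mr_json_shape and both function names are hypothetical, and it assumes every element of the tuple key is a string, as in the emitStats function from the original post. It only builds the term and leaves out the encoding call itself, so it runs without mochiweb on the code path.

```erlang
-module(mr_json_shape).
-export([key_to_binary/1, dict_to_struct/1]).

%% Join the string fields of a tuple key into one comma-separated binary,
%% because a JSON object key has to be a string.
%% Assumption: every element of the tuple is a string (a list of characters).
key_to_binary(Key) when is_tuple(Key) ->
    unicode:characters_to_binary(string:join(tuple_to_list(Key), ",")).

%% Convert a dict of {TupleKey, Count} entries into the
%% {struct, [{binary(), Value}]} shape described above.
dict_to_struct(Dict) ->
    {struct, [{key_to_binary(K), V} || {K, V} <- dict:to_list(Dict)]}.
```

A term built this way has binary keys and no dict in it, which addresses both of the suspected causes named above.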

-bob
 


Re: mochijson2 error from curl mapreduce

John Roy
Excellent -- that helped a great deal.

For those who may search in the future, the most important thing is to have the output of your final step (a reduce in my case) be a list of {Key :: binary(), Value :: json_term()}, as Bob identified.

In a standard map reduce that means the input to the reduce step and the output of the map step must also be that same data structure. So by replacing the dictionary in my map step with a string key, and doing some from_list, to_list magic in the reduce, I was able to achieve my goal.

Here's my newbie-looking (but working!) code for those interested in the details:

emitStatsFromList(G, undefined, none)->
    ObjectJson = riak_object:get_value(G),
    {array, ListOfStructs} = mochijson:decode(ObjectJson),
    [{getMrKey(I), 1} || I <- ListOfStructs].

getMrKey(StructItem)->
    {struct, DocPropList} = StructItem,
    SList = [proplists:get_value("obj_name",DocPropList,"None"),
             proplists:get_value("type",DocPropList,"None"),
             proplists:get_value("obj_url",DocPropList,"None"),
             "hourly"],
    string:join(SList, ",").

reduceStatsList(Gcounts, none)->
    GC = [dict:from_list(Gcounts)],
    Update = [lists:foldr(fun(G, Acc)->dict:merge(fun(_, X, Y) -> X+Y end,G, Acc) end,
                 dict:new(),
                 GC)],
    [OutputList] = [dict:to_list(I) || I <- Update],
    OutputList.
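
One thing to watch in the reduce above: dict:from_list/1 keeps only the last value stored for a repeated key, so two {Key, 1} pairs with the same key come out as a count of 1, not 2. The following is a hedged sketch of a variant that sums the counts instead. The module name mr_reduce_sketch and the function name reduce_counts are hypothetical, and it assumes the input is a flat list of {Key, Count} pairs like the one emitStatsFromList produces.

```erlang
-module(mr_reduce_sketch).
-export([reduce_counts/2]).

%% Sum the counts for each key.
%% Input and output are both lists of {Key, Count} pairs, so the function
%% can be fed its own output and still return correct totals.
reduce_counts(Pairs, _Arg) ->
    Totals = lists:foldl(
               fun({Key, Count}, Acc) ->
                       %% Adds Count to the stored value, starting from
                       %% Count when the key is not present yet.
                       dict:update_counter(Key, Count, Acc)
               end,
               dict:new(),
               Pairs),
    dict:to_list(Totals).
```

Keeping the same shape on both sides matters because, as I understand it, Riak may run a reduce function more than once and pass earlier reduce output back in as input. The order of the returned list is not guaranteed, so sort it if you need a stable order.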

Thanks again,

John


On Jun 5, 2012, at 2:27 PM, Bob Ippolito wrote:
