MapReduce Weird Issue

Yehuda Zargrov

Hi All,

 

We are seeing very strange MapReduce behavior.

We use Ripple in Ruby; this is the call:

  def yehuda
    query_result = Riak::MapReduce.new(Ripple.client).add('usage-test1')
     .map("function(v) { data = JSON.parse(v.values[0].data).loc; for (var a in data) { for (var b in data[a]) { for (var c in data[a][b]) { return [{date:v.key, sum:data[a][b][c]}]; } } } }")
     .reduce("function(v) {var s={}; for(var i in v)  { var date=''; var sum=0; for(var n in v[i]) { if (n === \"date\") date=v[i][n]; if (n === \"sum\") sum=v[i][n]; } if (date in s) s[date] += sum; else s[date] = sum; } return[v]; }", :keep => true).run
    puts query_result
  end

The map function, though it looks complicated, works fine. It passes the reduce function an array of hashes like this:

{"date"=>"CT16-20110114", "sum"=>1}
{"date"=>"CT18-20101204", "sum"=>1}
{"date"=>"CT19-20110314", "sum"=>1}
{"date"=>"CT116-20100516", "sum"=>1}
{"date"=>"CT17-20110214", "sum"=>1}
{"date"=>"CT19-20100511", "sum"=>1}
{"date"=>"CT18-20100710", "sum"=>1}
{"date"=>"CT19-20110301", "sum"=>1}
{"date"=>"CT17-20110213", "sum"=>1}

There are a lot of items in this array (hundreds).

The reduce function should add up the sums for each date.

For some reason it returns only 13 results:

{""=>0, "CT18-20101222"=>1, "CT116-20101123"=>1, "CT18-20101208"=>1, "CT18-20101110"=>1, "CT116-20101028"=>1, "CT18-20100904"=>1, "CT18-20100820"=>1, "CT18-20100618"=>1, "CT17-20110420"=>1, "CT17-20110412"=>1, "CT17-20110407"=>1, "CT17-20110401"=>1}.

Does anyone have a clue why?

Thanks,

Yehuda Zargarov


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: MapReduce Weird Issue

bryan-basho
Administrator
On Tue, Nov 29, 2011 at 3:56 AM, Yehuda Zargrov <[hidden email]> wrote:
> .reduce("function(v) {var s={}; for(var i in v)  { var date=''; var
> sum=0; for(var n in v[i]) { if (n === \"date\") date=v[i][n]; if (n ===
> \"sum\") sum=v[i][n]; } if (date in s) s[date] += sum; else s[date] = sum; }
> return[v]; }", :keep => true).run

Hi, Yehuda.  I think one of the troubles you're seeing may be because
the last statement in your reduce function is "return[v]".  It looks
like you actually wanted "return[s]".  The function is returning its
inputs instead of its computation result.

As for why you only see 13 entries instead of hundreds, I'd bet that
either an error or timeout occurred, and caused a known bug (where
inputs are returned instead of outputs) to raise its head:
https://issues.basho.com/show_bug.cgi?id=1185

There may be more information in Riak's logs.  Please write back with
what you find, and include the version numbers you're using for Riak
and Ripple.

-Bryan
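
For reference, the fix suggested above (return the accumulated totals rather than the input list) might look roughly like the sketch below. This is not the poster's final code. The function name sumByDate and the sample input are made up for illustration. Emitting {date, sum} records in the same shape as the inputs is an assumption made here, so that the function would still add up correctly if its own output were ever passed back in as input.

```javascript
// Sketch of a reduce step that returns per-date totals instead of its input.
// `sumByDate` is a hypothetical name; only the {date, sum} record shape
// comes from the original post.
function sumByDate(v) {
  var totals = {};
  for (var i = 0; i < v.length; i++) {
    var item = v[i];
    // Skip anything that is not a {date, sum} record, so no ""=>0 style
    // entry is created for unexpected input.
    if (!item || typeof item.date !== 'string' || typeof item.sum !== 'number') {
      continue;
    }
    totals[item.date] = (totals[item.date] || 0) + item.sum;
  }
  // Return the totals (not `v`), one {date, sum} record per date.
  var out = [];
  for (var date in totals) {
    if (Object.prototype.hasOwnProperty.call(totals, date)) {
      out.push({date: date, sum: totals[date]});
    }
  }
  return out;
}

// Made-up sample input in the shape the map phase is described as emitting.
var sample = [
  {date: 'CT16-20110114', sum: 1},
  {date: 'CT18-20101204', sum: 1},
  {date: 'CT16-20110114', sum: 1}
];
console.log(JSON.stringify(sumByDate(sample)));
```

To try it in the original call, the function source would be passed as the string argument to .reduce, in place of the existing one.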
