Riak performance on GET operations (was "LevelDB read performance")


Parnell Springmeyer
Hi everyone!

I figured out what my bottleneck was - HTTP API + sequential (as opposed to concurrent) GET requests.

I wrote a simple Erlang Cowboy handler that uses a worker pool OTP application I built to make concurrent GETs using the PBC API. My Python web app makes a call to the handler, which simulates "batched" requests.
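For readers who want to try the concurrent-GET idea without the Erlang handler, here is a minimal sketch in Python using a thread pool. It is not the handler described above: `batch_get`, `fetch_one`, `fake_fetch` and `_FAKE_STORE` are hypothetical names made up for illustration, and `fake_fetch` only stands in for a real single-key GET against Riak.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def batch_get(
    fetch_one: Callable[[str], Optional[bytes]],
    keys: Iterable[str],
    max_workers: int = 16,
) -> Dict[str, Optional[bytes]]:
    """Fetch every key concurrently and return a key -> value mapping."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each key with its value.
        values = list(pool.map(fetch_one, key_list))
    return dict(zip(key_list, values))


# Hypothetical stand-in for a real single-key GET; swap in a client call.
_FAKE_STORE = {"a": b"1", "b": b"2"}


def fake_fetch(key: str) -> Optional[bytes]:
    """Return the stored value, or None when the key is missing."""
    return _FAKE_STORE.get(key)


if __name__ == "__main__":
    print(batch_get(fake_fetch, ["a", "b", "missing"]))
```

The pool size (16 here) is an arbitrary assumption; the right number depends on the cluster and the client, and would need measuring.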

It would be a great feature in Riak to have batched GETs built in…it shouldn't be too difficult to add to the PBC or HTTP APIs, and it would be enormously useful, particularly when the client KNOWS the key space and wants a batch of results.

I might make an experimental fork of Riak to add that feature, unless Basho is already working on something similar?


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak performance on GET operations (was "LevelDB read performance")

Rapsey
A MapReduce job is a batch request: it takes a list of {Bucket,Key} pairs and returns the results. Writing MapReduce with the Erlang PB client is not as nice as one would think, though. This is the function I use:

% P  = Riak connection (riakc_pb_socket pid)
% B  = bucket
% LI = list of keys
% key/1 is a helper that is not shown in this post.
mget_bin(P, B, LI) ->
    Inputs = [{B, key(Key)} || Key <- LI],
    Query = [{reduce, {modfun, riak_kv_mapreduce, reduce_set_union}, none, false},
             {map, {modfun, riak_kv_mapreduce, map_identity}, none, true}],
    case riakc_pb_socket:mapred(P, Inputs, Query) of
        {ok, [{_, L} | _]} ->
            % Keep only the value of each object that has a single content (no siblings).
            {ok, [Val || {r_object, _Bucket, _Key, [{r_content, _Meta, Val}],
                          _VClock, _UpdateMeta, _UpdateVal} <- L]};
        {error, notfound} ->
            undefined;
        X ->
            X
    end.
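Since the original poster's app is Python talking to the HTTP API, the same batch idea can be sketched there too. The snippet below only builds the JSON body for Riak's HTTP MapReduce endpoint (normally `/mapred`; host and port depend on your setup) and does not send it. `build_mapred_body` is a hypothetical helper name. It also uses the built-in `map_object_value` function instead of `map_identity`, so the job returns values rather than whole objects; that is a substitution for illustration, not what the Erlang function above does, and how missing keys show up in the result is not covered here.

```python
import json
from typing import Iterable, List


def build_mapred_body(bucket: str, keys: Iterable[str]) -> str:
    """Build the JSON body for a MapReduce job over explicit bucket/key pairs."""
    # Each input is a [bucket, key] pair, mirroring the {Bucket,Key} list above.
    inputs: List[List[str]] = [[bucket, key] for key in keys]
    query = [
        {
            "map": {
                "language": "erlang",
                "module": "riak_kv_mapreduce",
                "function": "map_object_value",
                "keep": True,  # return this phase's results to the client
            }
        }
    ]
    return json.dumps({"inputs": inputs, "query": query})


if __name__ == "__main__":
    print(build_mapred_body("users", ["k1", "k2"]))
```

The resulting string would be POSTed with `Content-Type: application/json`, which turns N single-key GETs into one round trip.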



(In reply to Parnell Springmeyer's message of Sat, Jul 28, 2012 at 6:26 PM.)

Re: Riak performance on GET operations (was "LevelDB read performance")

Parnell Springmeyer
I (for some reason) didn't think about this… Thanks for saying so; I'll try implementing that instead.

(In reply to Rapsey's message of Jul 29, 2012, at 1:46 PM.)