Bulk loading data and "Could not contact Riak Server" error

gtuhl
I am currently load testing Riak using riak_0.14.2-1_amd64.deb with fs.file-max set to 503840 for all users.

I have a reasonably large set of data (hundreds of millions of documents, many terabytes in size) that is currently stored in a combination of PostgreSQL+Redis and Disco/DDFS.  The first handles key/value access and the second handles map/reduce; together they satisfy the full set of user requirements.

I am trying to consolidate these data sources, so I am trying out a variety of data stores that could potentially satisfy both usage types.

With Riak, my main challenge is getting this data loaded.  Using the PHP library I am able to push 100-200 documents/sec.  Is there a recommended approach to bulk loading data?  At that pace it would take a couple of months to load everything.  That is not necessarily a deal breaker, but I wanted to sniff around for better options.
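
For a rough sense of scale, the load time at a sustained rate can be estimated as below. This is only a back-of-the-envelope sketch: the document counts are assumed placeholders (the post says only "hundreds of millions"), and it ignores stalls and retries.

```python
# Back-of-the-envelope load time at a sustained insert rate.
# The document counts are assumed placeholders, not figures from the post.
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_load(documents: int, docs_per_sec: float) -> float:
    """Days needed to load `documents` at `docs_per_sec`, ignoring stalls."""
    return documents / docs_per_sec / SECONDS_PER_DAY


if __name__ == "__main__":
    for documents in (300_000_000, 500_000_000):
        for rate in (100, 200):
            print(f"{documents:>11,} docs at {rate}/s: "
                  f"{days_to_load(documents, rate):.1f} days")
```

With these assumed counts, 500 million documents at 100 documents/sec comes out just under two months, which is in line with the estimate above.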

Related to this, I did attempt to break up my records and load them with a bunch of concurrently running loaders.  This actually seems to work fairly well with not much of a penalty in terms of documents/sec on any single loader process.  But, once I reach 4-5 loaders running concurrently I consistently get the "Could not contact Riak Server" error and all of my loader processes die simultaneously.  If I wait a few seconds the Riak server does begin to respond again.
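
Since the server reportedly starts responding again after a few seconds, one way to keep loaders from dying is to retry failed writes with a growing delay. The following is a minimal sketch, not part of any Riak client: `with_retries`, its defaults, and the `op` callback are hypothetical names chosen for illustration. It does not address why the server stalls in the first place.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(op: Callable[[], T],
                 attempts: int = 5,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `op`, retrying on OSError with exponential backoff.

    OSError covers ConnectionError and urllib's URLError. The delay doubles
    after each failure: base_delay, 2 * base_delay, 4 * base_delay, ...
    The last failure is re-raised so the caller can log it and give up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A loader would wrap each write, for example `with_retries(lambda: send(doc))`, where `send` is whatever function performs the actual request.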

Any ideas for approaching this differently?  Is attempting to run many loaders concurrently a bad idea with Riak?

I am running a single server right now while I test, with the bucket's n_val set to 1.
Re: Bulk loading data and "Could not contact Riak Server" error

Jeremiah Peschka
There are a few things you can do to load data into Riak faster. I blogged about it a while back: http://www.brentozar.com/go/riak-writes/

Basically, set w = 0, dw = 0, and return-body = false. This will effectively throw data into your Riak cluster as fast as possible and not care if the write succeeds. If you actually care about the data, you could set w to 1.
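
As a rough illustration, these settings can be passed as query parameters on a write over Riak's HTTP interface. The sketch below only builds the request with Python's standard library. The `/riak/<bucket>/<key>` path, port 8098, and the `returnbody` spelling are assumptions about the HTTP interface of this Riak generation and should be checked against the documentation for your version; `build_put` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request


def build_put(host: str, bucket: str, key: str, doc: dict,
              w: int = 0, dw: int = 0,
              port: int = 8098) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT storing `doc` under bucket/key.

    Assumed, not confirmed by the thread: the /riak/<bucket>/<key> path,
    the default port, and the w / dw / returnbody parameter names.
    """
    query = urllib.parse.urlencode(
        {"w": w, "dw": dw, "returnbody": "false"})
    path = "/riak/{}/{}".format(urllib.parse.quote(bucket, safe=""),
                                urllib.parse.quote(key, safe=""))
    return urllib.request.Request(
        f"http://{host}:{port}{path}?{query}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Sending is then `urllib.request.urlopen(build_put(...))`. Passing `w=1` instead gives the "actually care about the data" variant mentioned above.
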
---
Jeremiah Peschka
Founder, Brent Ozar PLF, LLC

On Aug 1, 2011, at 2:13 PM, gtuhl wrote:

> [snip]


_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Bulk loading data and "Could not contact Riak Server" error

Scott Lystig Fritchie-2
In reply to this post by gtuhl
gtuhl <[hidden email]> wrote:

 > With Riak, my main challenge is getting this data loaded.  Using the
 > PHP library I am able to push 100-200 documents/sec.

A quick grep through the PHP client source suggests that that client
doesn't support the Protocol Buffers interface to Riak.  Depending on
the workload, a PB-based client is anywhere from 20% to several hundred
percent faster than an HTTP-based client.

You'll definitely want to run multiple clients in parallel, especially
if/when your cluster is larger than a single box: pointing those clients
at different cluster members will get you different throughput than
pointing all clients at a single cluster member.

Your message didn't include any messages from the Riak server logs that
might hint at why the HTTP service becomes unavailable, but it's likely
that there's useful info there.

-Scott

Re: Bulk loading data and "Could not contact Riak Server" error

Kev Burns
I've begun adding Protobuf support to the PHP client on a branch of my fork, but it's a ways off:
https://github.com/KevBurnsJr/riak-php-client/tree/protobuf

- Kev
c: +001 (650) 521-7791


On Wed, Aug 3, 2011 at 12:17 PM, Scott Lystig Fritchie <[hidden email]> wrote:

> [snip]


Re: Bulk loading data and "Could not contact Riak Server" error

Kresten Krab Thorup
In reply to this post by gtuhl
In my experience, there is little point in testing with fewer than N physical machines (when using replication factor N) plus a load balancer.  Riak is designed to run on this kind of setup, and performance will be miserable if you try to run on a single machine.

At first we tried running a number of virtual machines, but disk I/O is usually the limiting factor, and Riak is fairly memory hungry in the default setup (and virtual machines are generally bad with memory-hungry apps), so that turned out to be a terrible test setup.  Now we have a stack of Mac minis in the dev team that we can use for running performance tests.  While they're not nearly as fast as the real servers, they are a much better predictor of performance characteristics.

To get loads fast we run many threads [~10 per target machine in our case] in the loader app, and make sure to either use a load balancer or do the load balancing in the client app.
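
A threaded loader with client-side balancing could look roughly like the sketch below. It is an illustration under stated assumptions, not the loader described above: `load_all` and the `store(node, key, value)` callback are hypothetical, and simple round-robin stands in for whatever balancing strategy is actually used.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

THREADS_PER_NODE = 10  # "~10 per target machine", as in the post above


def load_all(records: Iterable[Tuple[str, bytes]],
             nodes: List[str],
             store: Callable[[str, str, bytes], None]) -> int:
    """Write every (key, value) record, spreading writes round-robin.

    `store(node, key, value)` is a hypothetical callback that performs one
    write against `node`. Returns the number of records written.
    """
    targets = itertools.cycle(nodes)  # advanced only from this thread
    workers = THREADS_PER_NODE * len(nodes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(store, next(targets), key, value)
                   for key, value in records]
        for future in futures:
            future.result()  # re-raises the first write error, if any
    return len(futures)
```

This version queues one future per record, so for a very large input it would make sense to feed the records in bounded batches rather than all at once.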

Kresten

On Aug 1, 2011, at 11:13 PM, gtuhl wrote:

> [snip]


Re: Bulk loading data and "Could not contact Riak Server" error

gtuhl
Appreciate the responses.

Since posting I've tried running up to 8 vnodes on this same physical machine.  Even with 8, the machine is under virtually no load when I have as many loaders as possible running, so it could be that this sort of hardware isn't a good candidate for a Riak node.

Adding more vnodes does allow me to fire up more loaders, but as soon as I cross the 4-5 loader mark on any specific node it stops responding for a few seconds.  I think I got it up to about 1600 inserts/sec at peak with 8 nodes and a couple dozen loaders running.

It could also be that I am just trying to fit a workload on Riak that isn't suitable.  On this same machine with a single untuned Cassandra node I can get about 15,000 inserts/sec using the exact same input data, and when that thing runs I can see the disks getting pounded and the CPU spiking.  Cassandra lacks a map/reduce interface, though, so it isn't as appealing.  I may stick with a multiple-datastore approach on this particular project.