Connection Pool with Erlang PB Client Necessary?

Andrew Berman
I know that this subject has been brought up before, but I'm still
wondering what the value of a connection pool is with Riak.  In my
app, I'm using Webmachine resources to talk to a gen_server which in
turn talks to Riak.  So, in other words, the Webmachine resources
never talk to Riak directly, they must always talk to the gen_server
to deal with Riak.  Since Erlang processes are so small and fast to
create, is there really any overhead in having the gen_server create a
new connection (with the same client id) each time it needs to access
Riak?

So the pseudo-code would look like this:

my_webmachine_resource.erl
========================

some_service:persist(MyRecord).

some_service.erl
==============

persist(MyRecord) ->
    riak_repository:load(LoadSomething),
    riak_repository:persist(MyRecord),
    riak_repository:persist(SomethingElse).

riak_repository.erl (this is the gen_server)
================================

persist(...) -> call(...).
load(...) -> call(...).

call(...) ->
      Pid = get_connection(ClientId),
      DoAction(Pid, ...),
      close_connection(Pid). %% Is this even necessary?

Thoughts?

Another approach I thought of was:

some_service.erl
==============

persist(SomeRecord) ->
   riak_repository:execute(fun(Pid) ->
          riak_repository:persist(..., Pid),
          riak_repository:load(...., Pid)
      end).

riak_repository.erl
==============

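execute(Fun) ->
     %% Bind Pid before the try: a variable bound inside the try body
     %% is unsafe to use in the after clause.
     Pid = get_connection(),
     try
           Fun(Pid)
      after
           close_connection(Pid)
      end.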

Is one of these approaches better than the other in dealing with Riak
and vclocks?

Thanks,

Andrew

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Connection Pool with Erlang PB Client Necessary?

Bryan O'Sullivan
On Mon, Jul 25, 2011 at 4:03 PM, Andrew Berman <[hidden email]> wrote:
> I know that this subject has been brought up before, but I'm still
> wondering what the value of a connection pool is with Riak.

It's a big deal:
  • It amortises TCP and PBC connection setup overhead over a number of requests, thereby reducing average query latency.
  • It greatly reduces the likelihood that very busy clients and servers will run out of limited resources that are effectively invisible, e.g. closed TCP connections stuck in TIME_WAIT.
Each of the above is a pretty big deal. Of course, connection pooling isn't free.
  • If you have many clients talking to a server sporadically, you may end up with large numbers of open-and-idle connections on a server, which will both consume resources and increase latency for all other clients. This is usually only a problem with a very large number (many thousands) of clients per server, and it usually only arises with poorly written and tuned connection pooling libraries. But ...
  • ... Most connection pooling libraries are poorly written and tuned, so they'll behave pathologically just when you need them not to.
  • Since you don't set up a connection per request, the requests where you *do* need to set up a connection are going to be more expensive than those where you don't, so you'll see jitter in your latency profile. About 99.9% of users will never, ever care about this. 

> Since Erlang processes are so small and fast to
> create, is there really any overhead in having the gen_server create a
> new connection (with the same client id) each time it needs to access
> Riak?

Of course. The overhead of Erlang processes has nothing to do with the cost of setting up a connection.

Also, you really don't want to be using the same client ID repeatedly across different connections. That's an awesome way to cause bugs with vclock resolution that end up being very very hard to diagnose.
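
A rough way to put a number on the amortisation point in this reply: if one connection setup is shared by N requests, the setup cost is spread across all of them. The sketch below is only arithmetic, not a benchmark; `avg_latency_ms/3` is a hypothetical helper and any millisecond figures passed to it are invented for illustration.

```erlang
-module(pool_latency).
-export([avg_latency_ms/3]).

%% Average per-request latency when a single connection setup is shared
%% by N requests. SetupMs and RequestMs are illustrative inputs, not
%% measurements of Riak.
avg_latency_ms(SetupMs, RequestMs, N) when is_integer(N), N > 0 ->
    (SetupMs + N * RequestMs) / N.
```

With an assumed 10 ms setup and 2 ms per request, `pool_latency:avg_latency_ms(10, 2, 1)` is 12.0 (a new connection per request), while `pool_latency:avg_latency_ms(10, 2, 100)` is 2.1 (one pooled connection reused 100 times).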


Re: Connection Pool with Erlang PB Client Necessary?

Andrew Berman
Thanks for the reply Bryan.  This all makes sense.  I am fairly new to
Erlang and wasn't sure if using a gen_server solved some of the issues
with connections.  From what I've seen a lot of people simply make
calls to Riak directly from a resource and so I thought having a
gen_server in front of Riak would help to manage things better.
Apparently it doesn't.

So, then, two more questions.  I have used connection pools in Java
like C3P0 and they can ramp up connections and then cull connections
when there is a period of inactivity.  The only pooler I've found that
does this is: https://github.com/seth/pooler .  Do you have any other
recommendations on connection poolers?

Second, I'm still a little confused on client ID.  I thought client Id
represented an actual client, not a connection.  So, in my case, the
gen_server is one client which makes multiple connections.  After
seeing what you wrote and reading a bit more on it, it seems like
client Id should just be some random string (base64 encoded) that
should be generated on creating a connection.  Is that right?

Thanks for your help!

Andrew


Re: Connection Pool with Erlang PB Client Necessary?

Bryan O'Sullivan
On Tue, Jul 26, 2011 at 11:35 AM, Andrew Berman <[hidden email]> wrote:
> So, then, two more questions.  I have used connection pools in Java
> like C3P0 and they can ramp up connections and then cull connections
> when there is a period of inactivity.  The only pooler I've found that
> does this is: https://github.com/seth/pooler .  Do you have any other
> recommendations on connection poolers?

Sorry, I'm not an Erlang user, so can't help with that.

> Second, I'm still a little confused on client ID.  I thought client Id
> represented an actual client, not a connection.

It's very hard to find documentation on what a client ID is for and why it matters, so don't blame yourself :-)

> After
> seeing what you wrote and reading a bit more on it, it seems like
> client Id should just be some random string (base64 encoded) that
> should be generated on creating a connection.

That's not a bad plan. It can be helpful to prefix it with a human-readable component so you have some idea which components in your system are participating when something goes wrong. So part fixed, part random.

Also, if you're using a connection pool, it's not a bad idea to change the client ID each time you take a connection from the pool. The downside to this is that it adds a TCP roundtrip, and will thus increase latency. (As a matter of safe interface design, client IDs arguably shouldn't be per-connection entities; it might be better if they were per-request.)
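
The "part fixed, part random" client ID suggested above could be generated along these lines. This is a minimal sketch using only the Erlang standard library; the module name `client_id`, the `-` separator, and the 6-byte suffix length are assumptions, and the length or format your Riak version accepts for a client ID should be checked before use.

```erlang
-module(client_id).
-export([new/1]).

%% Build a client ID from a fixed, human-readable prefix and a random
%% suffix. Prefix is a binary such as <<"webapp">>. The suffix is 6 random
%% bytes, base64-encoded, which yields 8 characters with no padding.
new(Prefix) when is_binary(Prefix) ->
    Random = base64:encode(crypto:strong_rand_bytes(6)),
    <<Prefix/binary, "-", Random/binary>>.
```

The result is a binary, so it can be handed to whatever call the client library uses to set the ID on a connection (in the Erlang PB client that is believed to be `riakc_pb_socket:set_client_id/2`).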


Re: Connection Pool with Erlang PB Client Necessary?

Bob Ippolito
In reply to this post by Andrew Berman
In our case, we have a client id per connection but the connections
are re-used. For a given application request, a connection is checked
out, some work is done, and then it is checked back in to the pool at
the end of the request. Choosing a random client id for every request
would make bigger vector clocks, but it's a reasonable design.

Interleaving operations with the same client id is going to cause
integrity problems (e.g. the single gen_server + connection approach).
You really want to treat a client id like a transaction id.
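
The check-out / check-in cycle described above can be sketched as a very small gen_server. This is an illustration only, not the pool used by anyone in this thread: `tiny_pool` is a hypothetical module, the members are whatever terms the caller supplies (for example already-opened connection pids), and it neither grows, shrinks, nor monitors borrowers.

```erlang
-module(tiny_pool).
-behaviour(gen_server).
-export([start_link/1, checkout/1, checkin/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Members is a list of already-opened connections (any terms).
start_link(Members) ->
    gen_server:start_link(?MODULE, Members, []).

%% Returns {ok, Member}, or {error, empty} when every member is in use.
checkout(Pool) ->
    gen_server:call(Pool, checkout).

%% Hand a member back so that another request can borrow it.
checkin(Pool, Member) ->
    gen_server:cast(Pool, {checkin, Member}).

%% The server state is simply the list of free members.
init(Members) ->
    {ok, Members}.

handle_call(checkout, _From, [Member | Rest]) ->
    {reply, {ok, Member}, Rest};
handle_call(checkout, _From, []) ->
    {reply, {error, empty}, []}.

handle_cast({checkin, Member}, Free) ->
    {noreply, [Member | Free]}.
```

Because a member leaves the free list on checkout, two requests can never hold the same connection at once. A production pool would also need to monitor the borrowing process so that a crash does not leak the member.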


Re: Connection Pool with Erlang PB Client Necessary?

Justin Sheehy
In reply to this post by Bryan O'Sullivan
The simplest guidance on client IDs that I can give:

If two mutation (PUT) operations could occur concurrently or without
awareness of each other, then they should have different client IDs.

As a result of the above: if you are sharing a connection, then you
should use a different client ID for each separate user of that
connection.

-Justin
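
Why the rule above matters can be seen with a toy vector clock keyed by client ID. This is a simplified model written for illustration, not Riak's vclock implementation; the module `vclock_demo` and its function names are assumptions.

```erlang
-module(vclock_demo).
-export([increment/2, descends/2, concurrent/2]).

%% A vector clock here is a list of {ClientId, Counter} pairs.
increment(Id, Clock) ->
    Count = proplists:get_value(Id, Clock, 0),
    lists:keystore(Id, 1, Clock, {Id, Count + 1}).

%% A descends from B when A has seen every update recorded in B.
descends(A, B) ->
    lists:all(fun({Id, N}) -> proplists:get_value(Id, A, 0) >= N end, B).

%% Neither clock descends from the other: a genuine conflict.
concurrent(A, B) ->
    not descends(A, B) andalso not descends(B, A).
```

Suppose two writers both start from `Base = vclock_demo:increment(a, [])`. If each increments under the same ID, the two resulting clocks are identical, so `concurrent/2` returns false and the conflict is invisible. If they increment under different IDs, `concurrent/2` returns true and the store can tell that the writes diverged.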


Re: Connection Pool with Erlang PB Client Necessary?

Andrew Berman
Thanks for all the replies guys!

I just want to make sure I'm totally clear on this.  Bob's solution
would work well with my design.  So basically, this would be the
workflow?

1.  check out connection from the pool
2.  set client id on connection (which would have some static and some
random component)
3.  perform multiple operations (gets, puts, etc.) which would be seen
as a single "transaction"
4.  check in the connection to the pool

This way once the connection is checked out from the pool, if another
user comes along he cannot get that same connection until it has been
checked back in, which would meet Justin's requirements.  However,
each time it's checked out, a new client id is created.
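
The four steps above can be sketched as a single wrapper function. To keep the example self-contained, the pool and client calls are passed in as funs; `pool_txn`, `with_connection/4`, and the argument names are hypothetical, and in a real application the funs would wrap the chosen pool's checkout/checkin and the Riak client's set-client-id call.

```erlang
-module(pool_txn).
-export([with_connection/4]).

%% Checkout, SetId and Checkin are funs supplied by the caller, standing in
%% for whatever pool and Riak client calls the application actually uses.
with_connection(Checkout, SetId, Checkin, Work) ->
    Conn = Checkout(),          % 1. check out a connection
    try
        ok = SetId(Conn),       % 2. set a fresh client ID on it
        Work(Conn)              % 3. run the gets and puts
    after
        Checkin(Conn)           % 4. check in, even if Work raised
    end.
```

Binding `Conn` before the `try` keeps it usable in the `after` clause, so the connection goes back to the pool whether or not the work succeeds.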

Does this sound reasonable and in line with proper client id usage?

Thanks again!

Andrew



Re: Connection Pool with Erlang PB Client Necessary?

Justin Sheehy
Yes, Andrew -- that is a fine approach to using a connection pool.

Go for it.

-Justin




Re: Connection Pool with Erlang PB Client Necessary?

Andrew Berman
Awesome!  Thanks for all your help guys.


Re: Connection Pool with Erlang PB Client Necessary?

Joel Meyer
In reply to this post by Andrew Berman


On Tue, Jul 26, 2011 at 11:35 AM, Andrew Berman <[hidden email]> wrote:
> Thanks for the reply Bryan.  This all makes sense.  I am fairly new to
> Erlang and wasn't sure if using a gen_server solved some of the issues
> with connections.  From what I've seen a lot of people simply make
> calls to Riak directly from a resource and so I thought having a
> gen_server in front of Riak would help to manage things better.
> Apparently it doesn't.
>
> So, then, two more questions.  I have used connection pools in Java
> like C3P0 and they can ramp up connections and then cull connections
> when there is a period of inactivity.  The only pooler I've found that
> does this is: https://github.com/seth/pooler .  Do you have any other
> recommendations on connection poolers?

I'm late to the party, but you could take a look at gen_server_pool (https://github.com/openx/gen_server_pool). It's a pooling library I wrote to provide pooling of gen_servers. I've used it mostly for Thrift clients, but Anthony (also on the list) uses it to pool riak_pb clients in webmachine. The basic idea is that you'd call gen_server_pool:start_link(...) wherever you'd normally call gen_server:start_link(...) and pass in a few extra args that control min and max pool size, as well as idle timeout. You can use the Pid you get back from that the same way you'd use the pid of your gen_server, except that all work gets dispatched to a member of a pool instead of a single gen_server. To be honest, I haven't tested out the open-source version I posted on GitHub (sorry, I've been busy), but it's just a slightly modified version of the internal library that's been used in production for several months with good results.

Cheers,
Joel
 


Re: Connection Pool with Erlang PB Client Necessary?

Andrew Berman
Cool, I'll check it out, though there appears to be something wrong
with your account as when I try to view the source, I get an error
back from GitHub.


Re: Connection Pool with Erlang PB Client Necessary?

Andrew Berman
In reply to this post by Justin Sheehy
So I looked at a bunch of pooling applications and none of them really
have the functionality and flexibility I'm used to with Java
connection pools.  So, I created my own OTP pooling application,
Pooly.  It allows multiple pools to be configured, has flexibility on
configuring the pool (idle timeout, max age of processes, initial
count, acquire increment, max pool size, and min pool size) and
reduces the size of the pool based on the configuration parameters.

Feel free to check it out: https://github.com/aberman/pooly

--Andrew
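
The idle-timeout and minimum-size behaviour mentioned above can be illustrated with a pure function that decides which members to close. This is not Pooly's code; `pool_cull`, `split_idle/4`, and the `{Conn, LastUsedMs}` representation are assumptions made for the sketch.

```erlang
-module(pool_cull).
-export([split_idle/4]).

%% Members is a list of {Conn, LastUsedMs}. Returns {Keep, Cull}: members
%% idle for IdleMs or longer are culled, but the pool is never reduced
%% below MinSize members in total.
split_idle(Members, NowMs, IdleMs, MinSize) ->
    {Idle, Fresh} = lists:partition(
                      fun({_Conn, Last}) -> NowMs - Last >= IdleMs end,
                      Members),
    %% How many members the pool can afford to lose.
    Spare = max(0, length(Members) - MinSize),
    {Cull, Reprieved} = lists:split(min(Spare, length(Idle)), Idle),
    {Fresh ++ Reprieved, Cull}.
```

A pool process could call this on a timer, close every connection in `Cull`, and keep `Keep` as its new member list.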
