ClusterOffline Unable to access functioning Riak node


ClusterOffline Unable to access functioning Riak node

Charles Solar
Hi list - I'm currently running both Riak and RiakTS in a lab environment for testing and my clients writing data get 

"ClusterOffline Unable to access functioning Riak node"

errors fairly often.  I am wondering if this is an indication that I need to add more nodes to increase capacity? Or tune some other settings?

I've looked through the Riak logs and there is no indication of a problem. Are there other diagnostics I can run?


I'm finding that RiakTS commits fail with this error far more often.

I'm using the C# client (nodePollTime 5000, retryWaitTime 100, retryCount 3) with 7 Riak nodes and 3 RiakTS nodes.

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: ClusterOffline Unable to access functioning Riak node

Luke Bakken
Hi Charles -

Extend the read and write timeouts using this setting:

https://github.com/basho/riak-dotnet-client/blob/develop/src/RiakClientTests.Live/App.config#L24

The above example extends it to 60 seconds.

The default is 4 seconds, which may be too short if you are running
long queries. If an operation takes longer than 4 seconds, the socket
read times out and the client assumes that there is an issue with the
node, marking it down. Eventually, all nodes can be marked down, which
is when the client reports ClusterOffline.
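
To make the failure mode above concrete, here is a minimal toy model in Python. It is not the C# client's code: the class and method names (`Node`, `ToyClient`, `execute`) and the `"timeout"` result string are hypothetical, and it assumes each retry goes to the next node the client still considers healthy. It also leaves out the periodic node polling (nodePollTime) that would eventually bring nodes back.

```python
from dataclasses import dataclass

DEFAULT_TIMEOUT_S = 4.0  # default read/write timeout mentioned above


@dataclass
class Node:
    name: str
    healthy: bool = True  # the client's view, not the node's real state


@dataclass
class ToyClient:
    """Hypothetical model of a client that marks nodes down on timeout."""
    nodes: list
    timeout_s: float = DEFAULT_TIMEOUT_S
    retry_count: int = 3

    def execute(self, duration_s: float) -> str:
        """Pretend to run an operation that needs duration_s seconds."""
        for _ in range(self.retry_count + 1):
            node = next((n for n in self.nodes if n.healthy), None)
            if node is None:
                return "ClusterOffline"  # no node left that the client trusts
            if duration_s <= self.timeout_s:
                return "ok"
            # The read timed out: the client blames the node, not the query.
            node.healthy = False
        return "timeout"  # hypothetical result once retries are used up


# 7 nodes, 4 s timeout: two slow operations are enough to take them all out.
client = ToyClient([Node(f"riak{i}") for i in range(7)])
first = client.execute(10.0)   # marks 4 nodes down (1 try + 3 retries)
second = client.execute(10.0)  # marks the last 3 down, then finds none left
third = client.execute(0.1)    # even a fast operation now fails

# With a 30 s timeout the same slow operation succeeds on the first node.
patient = ToyClient([Node(f"riak{i}") for i in range(7)], timeout_s=30.0)
fourth = patient.execute(10.0)
```

In this model the nodes themselves are never unhealthy, which would be consistent with the Riak logs showing no problem while the client still reports the cluster as offline.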
--
Luke Bakken
Engineer
[hidden email]


On Wed, Apr 12, 2017 at 11:03 AM, Charles Solar <[hidden email]> wrote:

> Hi list - I'm currently running both Riak and RiakTS in a lab environment
> for testing and my clients writing data get
>
> "ClusterOffline Unable to access functioning Riak node"
>
> errors fairly often.  I am wondering if this is an indication that I need to
> add more nodes to increase capacity? Or tune some other settings?
>
> I've looked through Riak logs and there is no indication of a problem, are
> there other diagnostics I can do?
>
>
> Im finding RiakTS commits fail with this commit far more often.
>
> I'm using the C# client, nodePollTime 5000, retryWaitTime 100, retryCount 3
>
> with 7 riak nodes and 3 riakts nodes.
>

Re: ClusterOffline Unable to access functioning Riak node

Charles Solar
Thanks for the tip, Luke - I updated those timeouts to 30s each and I'm not seeing any more failures.  Ideally updates should complete in under 4 seconds, though, so I'll have to find out why certain saves are taking so long!
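
One generic way to find the slow saves is to time each call on the client side and record the ones that exceed the 4 second default. The sketch below is in Python for brevity rather than C#; the helper name `timed`, the `slow_log` list and the label text are all hypothetical, and the `pass` stands in for the real save call.

```python
import time
from contextlib import contextmanager

SLOW_THRESHOLD_S = 4.0  # the default timeout discussed in this thread


@contextmanager
def timed(label, slow_log, threshold_s=SLOW_THRESHOLD_S, clock=time.perf_counter):
    """Append (label, elapsed_seconds) to slow_log if the block runs too long."""
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > threshold_s:
            slow_log.append((label, elapsed))


slow_saves = []
with timed("save sensor batch", slow_saves):
    pass  # replace with the real save call

# slow_saves now lists every labelled operation that took longer than 4 s.
```

Labelling each entry with the bucket or table being written should make it easier to see whether the slow saves cluster around particular keys, payload sizes, or the RiakTS nodes.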

Charles

On Fri, Apr 14, 2017 at 12:33 PM, Luke Bakken <[hidden email]> wrote:
> Hi Charles -
>
> Extend the read and write timeouts using this setting:
>
> https://github.com/basho/riak-dotnet-client/blob/develop/src/RiakClientTests.Live/App.config#L24
>
> The above example extends it to 60 seconds.
>
> The default is 4 seconds which may be too short if you are running
> long queries. If 4 seconds is exceeded, the socket read times out and
> the client assumes that there is an issue with the node, marking it
> down. Eventually, all nodes can be marked down.

Re: ClusterOffline Unable to access functioning Riak node

Luke Bakken
Thanks for letting us know the outcome.
--
Luke Bakken
Engineer
[hidden email]


On Fri, Apr 14, 2017 at 12:49 PM, Charles Solar <[hidden email]> wrote:

> Thanks for the tip Luke - I updated those timeouts to 30s each and not
> seeing anymore failures.  I guess ideally updates should happen in under 4
> seconds though so I'll have to find out why certain saves are taking so
> long!
>
> Charles