Pending handoff when node offline

Daniel Iwan
Hi all

Am I right in thinking that when a node goes offline, riak-admin transfers will always show pending transfers? E.g.

riak-admin transfers
Attempting to restart script through sudo -H -u riak
[sudo] password for myuser:
Nodes ['riak@10.173.240.12'] are currently down.
'riak@10.173.240.9' waiting to handoff 18 partitions
'riak@10.173.240.11' waiting to handoff 13 partitions
'riak@10.173.240.10' waiting to handoff 13 partitions

Active Transfers:


Node 'riak@10.173.240.12' could not be contacted


Even when I mark the node as down with riak-admin down, the transfers are still waiting.
Is that normal?

The reason I ask is that our services check, before they start, whether all transfers are complete (normal process during Riak startup). This is because in the past we've had issues with 2i queries while Riak was starting up.

Unfortunately this means that after, e.g., a reboot our services won't start until the timeout expires or the missing node comes back and handoff finishes.

Maybe there is a better way to check whether a Riak cluster is ready for 2i queries?
We are still on Riak 1.3.1

Regards
Daniel
Re: Pending handoff when node offline

Magnus Kessler
On 5 January 2016 at 10:07, Daniel Iwan wrote:

Hi Daniel,

This behaviour is completely normal and expected. As part of Riak's high-availability capabilities, when a target VNode is not available to write data to, other Nodes will spin up fallback VNodes that temporarily store incoming data. These show up in the "riak-admin transfers" output as partitions waiting to be handed off. The presence of pending partition handoffs does not automatically mean that a Node is incapable of handling queries, so it is not a good indicator for your use case.
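For anyone who wants to log this state from a startup script, here is a minimal Python sketch that reads the "riak-admin transfers" text. It assumes the output looks exactly like the sample posted earlier in this thread (Riak 1.3.1); other versions may format it differently. The function name parse_transfers and the SAMPLE constant are made up for illustration and are not part of Riak.

```python
import re

# Sample output copied from the first post in this thread (Riak 1.3.1).
SAMPLE = """\
Nodes ['riak@10.173.240.12'] are currently down.
'riak@10.173.240.9' waiting to handoff 18 partitions
'riak@10.173.240.11' waiting to handoff 13 partitions
'riak@10.173.240.10' waiting to handoff 13 partitions

Active Transfers:

Node 'riak@10.173.240.12' could not be contacted
"""


def parse_transfers(text):
    """Extract down nodes and per-node pending handoff counts.

    Hypothetical helper: the line formats matched here are assumed
    from the sample above, not taken from any Riak specification.
    """
    down = []
    match = re.search(r"^Nodes \[(.*?)\] are currently down", text, re.M)
    if match:
        # The node list is printed as quoted names inside brackets.
        down = re.findall(r"'([^']+)'", match.group(1))
    waiting = {
        node: int(count)
        for node, count in re.findall(
            r"^'([^']+)' waiting to handoff (\d+) partitions?", text, re.M
        )
    }
    return {"down_nodes": down, "waiting": waiting}


summary = parse_transfers(SAMPLE)
print(summary["down_nodes"])
print(sum(summary["waiting"].values()))
```

When the down-node list is non-empty, waiting handoffs are what you would expect to see, which is why the count on its own says little about whether queries can be served.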

You may want to use "riak-admin wait-for-service riak_kv <node-name>" to detect that a restarted node is capable of handling requests again. See http://docs.basho.com/riak/latest/ops/running/tools/riak-admin/#wait-for-service for more details.
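As a rough sketch of how a startup script might call this from Python: the wrapper below shells out to "riak-admin wait-for-service riak_kv <node-name>" with a timeout. The function name wait_for_kv, the 60-second default and the injectable run parameter are illustrative choices, and treating exit status 0 as "service is up" is an assumption to verify against your Riak version.

```python
import subprocess


def wait_for_kv(node, timeout_s=60, run=subprocess.run):
    """Return True if riak_kv became available on `node` within the timeout.

    Hypothetical wrapper. Assumes riak-admin is on PATH, that the caller
    may run it without an interactive sudo prompt, and that exit status 0
    means the service is up.
    """
    cmd = ["riak-admin", "wait-for-service", "riak_kv", node]
    try:
        result = run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The command blocks while the service is down, so a timeout
        # here means riak_kv did not come up in time.
        return False
    return result.returncode == 0
```

A caller would use it as wait_for_kv("riak@10.173.240.12") and continue with service startup only when it returns True.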

Regards,

Magnus
 
--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431

Re: Pending handoff when node offline

Daniel Iwan
Magnus

Thanks for confirming.
We've had issues with 2i (coverage queries) during node startup, where some keys might not appear in the results.
More details here:

http://riak-users.197444.n3.nabble.com/Keys-not-listed-during-vnode-transfers-td4027133.html#a4027139

We've been using wait-for-service in our script, but that was not sufficient to work around the issue with 2i.
I will try to work around it, or check whether a newer version of Riak has solved the 2i problem.

Thanks