Riak nodes constantly crashing

ricardo.ekm
Hi all,
I have a Riak 1.4 cluster where the nodes seem to be constantly crashing. All 5 nodes are affected.

However, Riak seems to manage to bring them back up again.

Any idea what's going on? Error logs below.

Thanks.

error.log
...
2016-10-24 21:57:29.185 [error] <0.24570.1174> CRASH REPORT Process <0.24570.1174> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107
2016-10-24 21:58:29.187 [error] <0.7109.1175> CRASH REPORT Process <0.7109.1175> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107
2016-10-24 21:59:29.228 [error] <0.19612.1175> CRASH REPORT Process <0.19612.1175> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107
2016-10-24 22:00:29.218 [error] <0.1356.1176> CRASH REPORT Process <0.1356.1176> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107
2016-10-24 22:01:29.197 [error] <0.11380.1176> CRASH REPORT Process <0.11380.1176> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107
2016-10-24 22:02:29.231 [error] <0.24279.1176> CRASH REPORT Process <0.24279.1176> with 0 neighbours crashed with reason: no case clause matching {ok,{http_error,"exit\r\n"},<<>>} in mochiweb_http:request/3 line 107

crash.log
2016-10-24 21:51:56 =CRASH REPORT====
  crasher:
    initial call: mochiweb_acceptor:init/3
    pid: <0.28136.1621>
    registered_name: []
    exception error: {{case_clause,{ok,{http_error,"exit\r\n"},<<>>}},[{mochiweb_http,request,3,[{file,"src/mochiweb_http.erl"},{line,107}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
    messages: []
    links: [<0.201.0>,#Port<0.235869290>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 377
    stack_size: 24
    reductions: 423
  neighbours:
2016-10-24 21:52:56 =CRASH REPORT====
  crasher:
    initial call: mochiweb_acceptor:init/3
    pid: <0.7845.1622>
    registered_name: []
    exception error: {{case_clause,{ok,{http_error,"exit\r\n"},<<>>}},[{mochiweb_http,request,3,[{file,"src/mochiweb_http.erl"},{line,107}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
    messages: []
    links: [<0.201.0>,#Port<0.235879110>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 377
    stack_size: 24
    reductions: 406
--
Ricardo Mayerhofer

_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Riak nodes constantly crashing

Alexander Sicular-2
Disk, memory or file descriptors would be my guess. Bitcask?

--


Alexander Sicular
Solutions Architect
Basho Technologies
9175130679
@siculars



Re: Riak nodes constantly crashing

ricardo.ekm
Hi Alexander,
Thanks for your response. We use multi-backend with bitcask and leveldb.

- File descriptors seem to be OK, at least according to the config.

ubuntu@ip-10-2-58-5:/var/log/riak$ sudo su riak 
sudo: unable to resolve host ip-10-2-58-5
riak@ip-10-2-58-5:/var/log/riak$ ulimit -n
65535

- Memory seems to be ok as well:

KiB Mem:  15400916 total, 14493744 used,   907172 free,    36244 buffers

- Disk is ok

/dev/xvda1       20G  4.1G   15G  22% / # root device

/dev/xvdb       148G   69G   72G  49% /mnt/riak-data  # bitcask and riak data disk
/dev/xvdc       296G   23G  258G   8% /mnt/riak-data/leveldb #leveldb disk

Any other ideas? Thanks.

--
Ricardo Mayerhofer


Re: Riak nodes constantly crashing

ricardo.ekm
I'm also pasting the output of free -m:

             total       used       free     shared    buffers     cached
Mem:         15039      14557        482          0         37       4594
-/+ buffers/cache:       9925       5114
Swap:            0          0          0
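
For anyone reading these numbers later, here is a rough sketch of how they can be sanity-checked. It assumes the older `free -m` layout shown above (separate buffers and cached columns); `parse_free` is a hypothetical helper name, not part of any tool.

```python
# Output of `free -m` as pasted above (values in MiB).
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:         15039      14557        482          0         37       4594
-/+ buffers/cache:       9925       5114
Swap:            0          0          0
"""


def parse_free(text):
    """Return the 'Mem:' row of old-style `free -m` output as a dict of ints."""
    lines = text.splitlines()
    columns = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(columns, values))
    raise ValueError("no 'Mem:' row found")


mem = parse_free(FREE_OUTPUT)
# Buffers and page cache can be reclaimed by the kernel under memory pressure.
reclaimable = mem["buffers"] + mem["cached"]
effectively_free = mem["free"] + reclaimable
print(f"free + buffers + cached = {effectively_free} MiB of {mem['total']} MiB")
```

The sum comes out within rounding of the 5114 MiB that `free` itself reports on the "-/+ buffers/cache" line, so roughly a third of RAM is still available to applications.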

--
Ricardo Mayerhofer


Re: Riak nodes constantly crashing

ricardo.ekm
What's weird is that the node crashes every minute at the same second. Is there anything Riak may be running every minute? 
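
The one-minute pattern can be checked against the error.log timestamps from the first post. This is only a minimal sketch: it assumes the timestamp occupies the first 23 characters of each line, as in those logs, `crash_intervals` is a made-up helper name, and the log lines are shortened here to their prefix.

```python
from datetime import datetime

# error.log lines from the first post, shortened to the timestamp, pid and marker.
LOG_LINES = [
    "2016-10-24 21:57:29.185 [error] <0.24570.1174> CRASH REPORT",
    "2016-10-24 21:58:29.187 [error] <0.7109.1175> CRASH REPORT",
    "2016-10-24 21:59:29.228 [error] <0.19612.1175> CRASH REPORT",
    "2016-10-24 22:00:29.218 [error] <0.1356.1176> CRASH REPORT",
    "2016-10-24 22:01:29.197 [error] <0.11380.1176> CRASH REPORT",
    "2016-10-24 22:02:29.231 [error] <0.24279.1176> CRASH REPORT",
]


def crash_intervals(lines):
    """Return the gaps, in seconds, between consecutive CRASH REPORT lines."""
    stamps = [
        datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        for line in lines
        if "CRASH REPORT" in line
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


intervals = crash_intervals(LOG_LINES)
for gap in intervals:
    print(f"{gap:.3f} s")
```

Every gap in this sample is within a few hundredths of a second of 60 s, which is what makes a scheduled job or probe a plausible suspect.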

--
Ricardo Mayerhofer


Re: Riak nodes constantly crashing

Steven Joseph

Hi Ricardo,

If you are using systemd, you might have to check LimitNOFILE for your units. Active anti-entropy (AAE) runs periodically.
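
A limit set by a service manager may differ from what `ulimit -n` prints in an interactive shell, so it can help to read the limit of the running process itself. On Linux that is exposed in `/proc/<pid>/limits`. The sketch below is an illustration only: `max_open_files` and `read_limits` are hypothetical helper names, and the sample values are invented, not taken from the nodes in this thread.

```python
# Made-up excerpt in the layout of /proc/<pid>/limits on Linux.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max processes             59832                59832                processes
Max open files            1024                 4096                 files
"""


def max_open_files(limits_text):
    """Return (soft, hard) for 'Max open files' as strings (may be 'unlimited')."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return soft, hard
    raise ValueError("no 'Max open files' row found")


def read_limits(pid):
    """Read the limits file of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return handle.read()


soft, hard = max_open_files(SAMPLE_LIMITS)
print(f"soft={soft} hard={hard}")
```

On a real node you would call `max_open_files(read_limits(pid))` with the pid of the Riak Erlang VM process and compare the result with the 65535 shown by `ulimit -n`.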

Steven



Re: Riak nodes constantly crashing

ricardo.ekm
Yes, I'll check whether AAE is the problem. I'll disable it and see what happens.

Thanks Steven!

--
Ricardo Mayerhofer


Re: Riak nodes constantly crashing

Steven Joseph

I don't think you should disable AAE; you can tune its frequency instead.

Steven


    ancestors: ['http_0.0.0.0:8098_mochiweb',riak_core_sup,<0.148.0>]
    messages: []
    links: [<0.201.0>,#Port<0.235879110>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 377
    stack_size: 24
    reductions: 406
--
Ricardo Mayerhofer


--


Alexander Sicular
Solutions Architect
Basho Technologies
9175130679
@siculars




--
Ricardo Mayerhofer



--
Ricardo Mayerhofer



--
Ricardo Mayerhofer
_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



--
Ricardo Mayerhofer


Re: Riak nodes constantly crashing

Alexander Sicular-2
Take a look at the AAE settings here:


On Oct 26, 2016, at 16:17, Steven Joseph <[hidden email]> wrote:

I don't think you should disable AAE; you can tune its frequency instead.

Steven


On Thu, 27 Oct 2016 03:50 Ricardo Mayerhofer <[hidden email]> wrote:
Yes, I'll check if the problem is the AAE! I will disable it and see the results.

Thanks Steven!

On Tue, Oct 25, 2016 at 6:54 PM, Steven Joseph <[hidden email]> wrote:

Hi Ricardo,

If you are using systemd, you might have to check LimitNOFILE for your units. Also, active anti-entropy (AAE) runs periodically.

Steven

