Truncated bit-cask files

Truncated bit-cask files

Arun Rajagopalan
Hello Riak Users

We have situations where we don't or can't gracefully stop Riak. When that happens, we occasionally get a truncated last record in the bitcask files.

If I delete the affected bitcask directories and the anti_entropy directory, Riak rebuilds those bitcask files correctly.

Is there a way to rectify those broken bitcask files?

Thanks
Arun



_______________________________________________
riak-users mailing list
[hidden email]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

Re: Truncated bit-cask files

Magnus Kessler
On 13 February 2017 at 19:56, Arun Rajagopalan <[hidden email]> wrote:
Hello Riak Users

We have situations where we don't or can't gracefully stop Riak. When that happens, we occasionally get a truncated last record in the bitcask files.

If I delete the affected bitcask directories and the anti_entropy directory, Riak rebuilds those bitcask files correctly.

Is there a way to rectify those broken bitcask files?

Thanks
Arun



Hi Arun,

There should be no need to remove truncated bitcask files. Any objects up to the point of truncation should still be available to Riak. However, it may take longer for the affected partition to start up, as the corresponding hint file will not match the data file, and Riak will scan the latter to populate the key set it keeps in memory.
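To illustrate why everything before the point of truncation remains usable, here is a minimal Python sketch of scanning an append-only data file and stopping at the first incomplete or corrupt record. The record layout (CRC32, key size, value size, key, value) and the names `encode_record` and `scan_records` are assumptions made for this illustration; this is not Bitcask's actual on-disk format or code.

```python
import struct
import zlib

# Hypothetical, simplified record layout (NOT Bitcask's real format):
#   4-byte CRC32 | 4-byte key size | 4-byte value size | key | value
HEADER = struct.Struct(">III")


def encode_record(key: bytes, value: bytes) -> bytes:
    """Build one record; the CRC covers the sizes, the key and the value."""
    body = struct.pack(">II", len(key), len(value)) + key + value
    return struct.pack(">I", zlib.crc32(body)) + body


def scan_records(data: bytes):
    """Rebuild an in-memory key set from a data file image.

    Returns (keydir, valid_length), where keydir maps each key to
    (value_offset, value_size) and valid_length is the offset at which
    the intact prefix of the file ends.
    """
    keydir = {}
    offset = 0
    while offset + HEADER.size <= len(data):
        crc, key_size, value_size = HEADER.unpack_from(data, offset)
        end = offset + HEADER.size + key_size + value_size
        if end > len(data):
            break  # last record was cut short by the unclean shutdown
        if zlib.crc32(data[offset + 4:end]) != crc:
            break  # record is complete but its contents are damaged
        key_start = offset + HEADER.size
        key = data[key_start:key_start + key_size]
        keydir[key] = (key_start + key_size, value_size)
        offset = end
    return keydir, offset
```

In this sketch a truncated tail only costs the final record: the scan returns every key written before it, which mirrors the behaviour described above when the hint file cannot be used.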

By removing anti_entropy files you are forcing the AAE trees to be rebuilt. Once this has completed, AAE will fill back any objects missing due to the truncated bitcask files. You can also force a full partition repair by following the instructions from the documentation [0].

Can you let me know why you cannot shut down the node gracefully? Unclean shutdowns should be a last resort and not part of normal operating procedures.

Kind Regards,

Magnus


[0]: http://docs.basho.com/riak/kv/2.2.0/using/repair-recovery/repairs/#repairing-partitions

--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431


Re: Truncated bit-cask files

Arun Rajagopalan
Hi Magnus

Riak crashes on startup when I have a truncated bitcask file.

I think it also crashes when the AAE files are bad. Example below:

2017-02-13 21:18:30 =CRASH REPORT====
  crasher:
    initial call: riak_kv_index_hashtree:init/1
    pid: <0.6037.0>
    registered_name: []
    exception exit: {{{badmatch,{error,{db_open,"Corruption: truncated record at end of file"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
    ancestors: [<0.715.0>,riak_core_vnode_sup,riak_core_sup,<0.160.0>]
    messages: []
    links: []
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 889
  neighbours:



Regards
Arun


On Tue, Feb 14, 2017 at 4:48 AM, Magnus Kessler <[hidden email]> wrote:
Hi Arun,

There should be no need to remove truncated bitcask files. Any objects up to the point of truncation should still be available to Riak. However, it may take longer for the affected partition to start up, as the corresponding hint file will not match the data file, and Riak will scan the latter to populate the key set it keeps in memory.

By removing anti_entropy files you are forcing the AAE trees to be rebuilt. Once this has completed, AAE will fill back any objects missing due to the truncated bitcask files. You can also force a full partition repair by following the instructions from the documentation [0].

Can you let me know why you cannot shut down the node gracefully? Unclean shutdowns should be a last resort and not part of normal operating procedures.

Kind Regards,

Magnus


[0]: http://docs.basho.com/riak/kv/2.2.0/using/repair-recovery/repairs/#repairing-partitions

--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431



Re: Truncated bit-cask files

Magnus Kessler
On 14 February 2017 at 14:46, Arun Rajagopalan <[hidden email]> wrote:
Hi Magnus

Riak crashes on startup when I have a truncated bitcask file.

I think it also crashes when the AAE files are bad. Example below:

2017-02-13 21:18:30 =CRASH REPORT====
  crasher:
    initial call: riak_kv_index_hashtree:init/1
    pid: <0.6037.0>
    registered_name: []
    exception exit: {{{badmatch,{error,{db_open,"Corruption: truncated record at end of file"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,675}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,610}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,474}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,268}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
    ancestors: [<0.715.0>,riak_core_vnode_sup,riak_core_sup,<0.160.0>]
    messages: []
    links: []
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 889
  neighbours:



Regards
Arun


Hi Arun,

The crash log you provided shows that there is a corrupted file in the AAE (anti_entropy) backend. Entries in console.log should have more information about which partition is affected. Please post output from the affected node at around 2017-02-13T21:18:30. As this is AAE data, it is safe to remove the directory named after the affected partition from the anti_entropy directory before restarting the node. There may be more than one affected partition, and each further one will only show up at the next attempted restart. If so, identify the next partition in the same way and remove its directory too, repeating until the node starts up successfully.
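One way to pull the relevant console.log entries is to filter by timestamp. The Python sketch below assumes each log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as the crash report above does; the function name `lines_near` and the 60-second default window are choices made for this illustration, not part of Riak.

```python
from datetime import datetime, timedelta


def lines_near(lines, when, window_seconds=60):
    """Return the log lines stamped within window_seconds of `when`.

    Assumes (hypothetically) that each entry begins with a timestamp
    such as "2017-02-13 21:18:30"; lines that do not are skipped.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    window = timedelta(seconds=window_seconds)
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], fmt)
        except ValueError:
            continue  # continuation line or a different format
        if abs(stamp - when) <= window:
            selected.append(line.rstrip("\n"))
    return selected
```

For example, `lines_near(open("console.log"), datetime(2017, 2, 13, 21, 18, 30))` would keep only the entries written within a minute of the crash, which should include the partition being opened at the time.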

Is there a reason why the nodes aren't shut down in the regular way?

Kind Regards,

Magnus



--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431


Re: Truncated bit-cask files

Matthew Von-Maszewski
Arun,

The AAE code uses leveldb for its storage of anti-entropy data, no matter which backend holds the user data.  Therefore the error below suggests corruption within leveldb files (which is not impossible, but becoming really rare except with bad hardware or full disks).

Before wiping out the AAE directory, you should copy the LOG file within it.  There are likely more useful error messages within that file. Perhaps put the file in Dropbox, or zip it and attach it to a reply, for us to review.
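The copy-then-remove order can be sketched in Python as follows. This is only an illustration to run while the node is stopped: the function name, the `LOG*` file pattern and the directory arguments are assumptions, and the real location of the anti_entropy directory depends on the installation.

```python
import shutil
from pathlib import Path


def preserve_log_and_remove(aae_root, partition, backup_root):
    """Copy the LOG files of one AAE partition aside, then delete the partition.

    aae_root, partition and backup_root are supplied by the operator;
    nothing here discovers them automatically.
    """
    partition_dir = Path(aae_root) / str(partition)
    backup_dir = Path(backup_root) / str(partition)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for log in partition_dir.rglob("LOG*"):
        if log.is_file():
            # Flatten any sub-directory into the file name to avoid clashes.
            target = backup_dir / "_".join(log.relative_to(partition_dir).parts)
            shutil.copy2(log, target)
            copied.append(target)

    # Only after the logs are safe is the partition directory removed.
    shutil.rmtree(partition_dir)
    return copied
```

Copying first means the evidence survives even if several partitions have to be removed one restart at a time.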

Matthew

On Feb 14, 2017, at 10:42 AM, Magnus Kessler <[hidden email]> wrote:


Hi Arun,

The crash log you provided shows that there is a corrupted file in the AAE (anti_entropy) backend. Entries in console.log should have more information about which partition is affected. Please post output from the affected node at around 2017-02-13T21:18:30. As this is AAE data, it is safe to remove the directory named after the affected partition from the anti_entropy directory before restarting the node. There may be more than one affected partition, and each further one will only show up at the next attempted restart. If so, identify the next partition in the same way and remove its directory too, repeating until the node starts up successfully.

Is there a reason why the nodes aren't shut down in the regular way?

Kind Regards,

Magnus



--
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431

Re: Truncated bit-cask files

Matthew Von-Maszewski
Arun,

You are running out of RAM for the leveldb AAE.  There are several ways to fix that:

- reduce memory allocated to bitcask
- more memory per server
- more servers of same memory
- reduce the ring size from 64 to 8, and rebuild data within the cluster from scratch
- lie to leveldb and give it a bigger-than-real memory setting in riak.conf:
        leveldb.maximum_memory=8G


The key LOG lines are:

Options.total_leveldb_mem: 2,901,766,963    <-- this is the total memory assigned to ALL of leveldb, but
    only 20% of it goes to AAE vnodes

File cache size: 5833527     <-- the first vnode says: cool, enough memory for me
Block cache size: 7930679    <-- ditto

  ... but as more vnodes start:

File cache size: 0           <-- things are just not going to work well
Block cache size: 0

There are no actual file system error messages in your LOG files.  That supports the view that the real problem is a shortage of memory.
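The same check can be scripted across many LOG files. The Python sketch below relies only on the two line formats quoted above ("File cache size: N" and "Block cache size: N") and on the 20% figure mentioned for AAE vnodes; the function names are made up for this illustration.

```python
import re

# Matches the two LOG line formats quoted in this message.
CACHE_LINE = re.compile(r"(File|Block) cache size:\s*(\d+)")


def cache_sizes(log_text):
    """Return [(kind, size), ...] for every cache-size line in a LOG file."""
    return [(m.group(1), int(m.group(2))) for m in CACHE_LINE.finditer(log_text)]


def has_starved_cache(log_text):
    """True if any vnode reported a zero-sized file or block cache."""
    return any(size == 0 for _, size in cache_sizes(log_text))


def aae_budget(total_leveldb_mem):
    """Share of the leveldb memory that goes to AAE vnodes (20%, per the above)."""
    return total_leveldb_mem * 20 // 100
```

With the total from the LOG above, `aae_budget(2901766963)` comes to 580353392 bytes, roughly 580 MB to be shared by all AAE vnodes on the node.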

Matthew


On Feb 14, 2017, at 3:34 PM, Arun Rajagopalan <[hidden email]> wrote:

Hi Matthew, Magnus

I have attached the log files for your review

Thanks
Arun


On Tue, Feb 14, 2017 at 11:55 AM, Matthew Von-Maszewski <[hidden email]> wrote:
Arun,

The AAE code uses leveldb for its storage of anti-entropy data, no matter which backend holds the user data.  Therefore the error below suggests corruption within leveldb files (which is not impossible, but becoming really rare except with bad hardware or full disks).

Before wiping out the AAE directory, you should copy the LOG file within it.  There are likely more useful error messages within that file. Perhaps put the file in Dropbox, or zip it and attach it to a reply, for us to review.

Matthew


Re: Truncated bit-cask files

Arun Rajagopalan
Thanks Matthew. I will try one of those solutions


On Tue, Feb 14, 2017 at 3:51 PM, Matthew Von-Maszewski <[hidden email]> wrote:
Arun,

You are running out of RAM for the leveldb AAE.  There are several ways to fix that:

- reduce memory allocated to bitcask
- more memory per server
- more servers of same memory
- reduce the ring size from 64 to 8, and rebuild data within the cluster from scratch
- lie to leveldb and give it a bigger-than-real memory setting in riak.conf:
        leveldb.maximum_memory=8G


The key LOG lines are:

Options.total_leveldb_mem: 2,901,766,963    <-- this is the total memory assigned to ALL of leveldb, but
    only 20% of it goes to AAE vnodes

File cache size: 5833527     <-- the first vnode says: cool, enough memory for me
Block cache size: 7930679    <-- ditto

  ... but as more vnodes start:

File cache size: 0           <-- things are just not going to work well
Block cache size: 0

There are no actual file system error messages in your LOG files.  That supports the view that the real problem is a shortage of memory.

Matthew

