• Arun Nukula

!!! Rabbitmq reported unrecoverable state, recovery.dets corrupted !!!


Unable to start RabbitMQ after an outage?

Are you seeing an exception similar to the one below?

2018-07-26T09:39:17.273888+00:00 <<fqdnofvraappliance>> [cluster-rabbitmq-monitor] - ERROR - Rabbitmq reported unrecoverable state: [Error]: {could_not_start,rabbit, {{badmatch, {error, {{{badmatch, {error, {not_a_dets_file, "/var/lib/rabbitmq/mnesia/rabbit@<<fqdnofvraappliance>>/recovery.dets"}}}, [{rabbit_recovery_terms,open_table,0, [{file,"src/rabbit_recovery_terms.erl"},{line,126}]}, {rabbit_recovery_terms,init,1, [{file,"src/rabbit_recovery_terms.erl"},{line,107}]}, {gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]}, {proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},{line,247}]}]}, {child,undefined,rabbit_recovery_terms, {rabbit_recovery_terms,start_link,[]}, transient,30000,worker, [rabbit_recovery_terms]}}}}, [{rabbit_queue_index,start,1, [{file,"src/rabbit_queue_index.erl"},{line,464}]}, {rabbit_variable_queue,start,1, [{file,"src/rabbit_variable_queue.erl"},{line,455}]}, {rabbit_priority_queue,start,1, [{file,"src/rabbit_priority_queue.erl"},{line,92}]}, {rabbit_amqqueue,recover,0, [{file,"src/rabbit_amqqueue.erl"},{line,239}]}, {rabbit,recover,0,[{file,"src/rabbit.erl"},{line,756}]}, {rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1, [{file,"src/rabbit_boot_steps.erl"},{line,49}]}, {rabbit_boot_steps,run_step,2, [{file,"src/rabbit_boot_steps.erl"},{line,49}]}, {rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1, [{file,"src/rabbit_boot_steps.erl"},{line,26}]}]}}
2018-07-26T09:39:17.898241+00:00 vasydp161 su: (to rabbitmq) root on /dev/pts/4

The exception above states that RabbitMQ could not start because it failed to read the recovery.dets file: the not_a_dets_file error means the file is not a valid DETS file.

If you browse to the node directory under /var/lib/rabbitmq/mnesia (the rabbit@<<fqdnofvraappliance>> directory named in the error) and run ls -ltrh, you will see that the recovery.dets file is corrupt or 0 bytes.
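The size check can also be scripted. The following is a minimal sketch, not a definitive diagnostic: the function name check_recovery_dets is hypothetical, and the node directory is passed in as an argument because it contains the appliance FQDN. It only detects a missing or 0-byte file; a non-empty but corrupt file still has to be confirmed from the not_a_dets_file error in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of RabbitMQ or vRealize Automation):
# report whether recovery.dets in the given node directory is
# missing, empty (0 bytes), or non-empty.
check_recovery_dets() {
    node_dir="$1"
    file="$node_dir/recovery.dets"
    if [ ! -e "$file" ]; then
        echo "missing"
    elif [ ! -s "$file" ]; then
        # -s is true only for files larger than 0 bytes
        echo "empty"
    else
        echo "non-empty"
    fi
}

# Example call; the directory name is the placeholder from the error message:
# check_recovery_dets "/var/lib/rabbitmq/mnesia/rabbit@<<fqdnofvraappliance>>"
```

An "empty" result matches the 0-byte case described above.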

The recovery.dets file contains recovery metadata that is written when the node is stopped gracefully. There is a high chance of the file being corrupted if the RabbitMQ node is stopped abruptly.

To remediate, delete this 0-byte file or move it to another location (e.g. /tmp/), and then reboot the node, in this case the vRealize Automation appliance.
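The move step could look like the sketch below. It assumes moving the file aside is preferred over deleting it, so a copy remains for later inspection. The function name move_recovery_dets and the timestamp suffix on the backup name are illustrative choices, not something the appliance requires.

```shell
#!/bin/sh
# Hypothetical helper: move recovery.dets out of the RabbitMQ node directory
# into a backup directory (default /tmp), adding a timestamp to the backup
# name so that an earlier backup is less likely to be overwritten.
move_recovery_dets() {
    node_dir="$1"
    backup_dir="${2:-/tmp}"
    file="$node_dir/recovery.dets"
    if [ ! -e "$file" ]; then
        echo "nothing to move: $file not found" >&2
        return 1
    fi
    mv "$file" "$backup_dir/recovery.dets.$(date +%Y%m%d%H%M%S)"
}

# Example call; the directory name is the placeholder from the error message:
# move_recovery_dets "/var/lib/rabbitmq/mnesia/rabbit@<<fqdnofvraappliance>>" /tmp
```

After the file has been moved, reboot the appliance as described above.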

Once this was done, all services, including RabbitMQ, started successfully during the boot process.

#vRealizeAutomation

Copyright © 2019 nukescloud