SAP HANA on RHEL 8.6 Pacemaker Split Brain Troubleshooting


Due to an internal Azure issue, the VM hosting the primary SAP HANA database shut down abnormally. RHEL Pacemaker triggered a takeover, the database on the secondary VM became the primary, and the SAP applications stayed available, so no outage occurred.

After the original primary VM was restarted from the Azure Portal, we found that Pacemaker on that VM still treated it as the primary database server and would not let HANA start as the standby. As a result, the HANA DB resource in the Pacemaker cluster kept failing.

Run the command below on each site to check the replication state:

Log in to Site A (the earlier primary site)

hdbnsutil -sr_state -> shows Site A as primary

Log in to Site B (the present primary site)

hdbnsutil -sr_state -> shows Site B as primary as well
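
To make the comparison between the two sites less error-prone, the mode can be pulled out of the command output with a small filter. The following is a minimal sketch, assuming the output of hdbnsutil -sr_state contains a line of the form "mode: primary". The helper names sr_mode and is_split_brain are hypothetical and not part of any SAP tool.

```shell
#!/bin/sh
# Sketch: extract the replication mode from `hdbnsutil -sr_state` output.
# Assumption: the output contains a line of the form "mode: primary".

# Read sr_state output on stdin and print the value of the "mode:" line.
# The "operation mode:" line is ignored because its key differs.
sr_mode() {
    awk -F': *' '$1 == "mode" { sub(/[ \t\r]+$/, "", $2); print $2; exit }'
}

# Succeed (exit status 0) when both sites report themselves as primary.
is_split_brain() {
    [ "$1" = "primary" ] && [ "$2" = "primary" ]
}

# Usage on each node, as the <sid>adm user:
#   hdbnsutil -sr_state | sr_mode
```

If both nodes print primary, the two sites disagree about their roles and the former primary has to be re-registered as described in the steps that follow.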

Steps to resolve this issue:

1. Put the RHEL Pacemaker cluster into maintenance mode (maintenance-mode=true).

2. Forcefully disable replication on Site A (Site B is already the primary site):

hdbnsutil -sr_disable --force

3. Once that command has completed successfully, stop the HANA database on Site A:

HDB stop

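4. Re-register Site A as the secondary with the command below, setting --remoteHost and --remoteInstance to the host name and instance number of the current primary (Site B):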

hdbnsutil -sr_register --name=SiteA --remoteHost= --remoteInstance= --replicationMode=sync --operationMode=logreplay

5. Start the HANA DB on Site A as the secondary:

HDB start

6. Validate the replication status in HANA Studio or at OS level (Python script).

7. Take the RHEL Pacemaker cluster out of maintenance mode.

8. Validate the Pacemaker cluster status (Site B should be primary and Site A should be secondary).
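
The steps above can be summarised as a command sequence. The sketch below only prints the commands for review and does not run anything. It assumes the pcs commands are run as root on a cluster node and the hdbnsutil and HDB commands as the <sid>adm user on Site A. The pcs property maintenance-mode and the script name systemReplicationStatus.py are our reading of steps 1, 6 and 7 rather than something the steps spell out, the function name print_recovery_plan is hypothetical, and the host hana-b and instance 00 are placeholder values.

```shell
#!/bin/sh
# Dry-run sketch: print the recovery commands for review; nothing is executed.
# Assumptions: pcs commands run as root on a cluster node, hdbnsutil/HDB
# commands run as the <sid>adm user on the former primary (Site A).

# Arguments: local site name, current primary host, its instance number.
print_recovery_plan() {
    site_name=$1
    remote_host=$2
    remote_instance=$3
    cat <<EOF
pcs property set maintenance-mode=true
hdbnsutil -sr_disable --force
HDB stop
hdbnsutil -sr_register --name=${site_name} --remoteHost=${remote_host} --remoteInstance=${remote_instance} --replicationMode=sync --operationMode=logreplay
HDB start
# check replication status (for example with systemReplicationStatus.py) before continuing
pcs property set maintenance-mode=false
pcs status
EOF
}

# Example with placeholder values: host "hana-b" and instance "00" are hypothetical.
print_recovery_plan SiteA hana-b 00
```

Printing the plan first gives a chance to check the site name, host, and instance number before anything is changed on the cluster, which matters here because registering against the wrong host would leave Site A unable to sync.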
