SAP HANA HSR & Pacemaker — Complete HA/DR Cluster Setup Guide

If you’ve ever been woken up at 3 AM because your SAP HANA production node went down and the “high availability” setup didn’t quite live up to its name, you know why this post exists. Setting up real high availability for SAP HANA isn’t just about flipping a switch — it’s about understanding how System Replication and Pacemaker actually talk to each other, what happens when a network link flaps silently, and why your failover test in the DR lab might not survive contact with production reality.

In my experience helping Basis teams deploy HANA HA/DR clusters, the biggest pain point isn’t the initial setup — it’s making sure failover actually works six months later when you haven’t touched the cluster since go-live. This guide walks you through the entire stack: HSR modes, Pacemaker cluster configuration, fencing, resource agents, automatic failover, and the DR drill procedures your auditors want to see.

Architecture Overview: Primary, Secondary, and Quorum

Before touching a single config file, let’s get the architecture straight. A proper SAP HANA HSR + Pacemaker setup involves at minimum three nodes:

Primary node — handles all read/write HANA workloads and replicates data to the secondary.
Secondary node — receives replication data and stands ready to take over if the primary fails.
Quorum/Witness node — breaks ties in split-brain scenarios. In a two-node setup without a witness, you risk both nodes thinking they’re primary simultaneously — and that’s when your data diverges silently.

The quorum node can be a lightweight VM — it doesn’t run HANA itself. It participates only in the Corosync voting ring. If you’re running on Azure or AWS, place the witness in a different availability zone from your HANA nodes. Think of it like a referee on the sidelines who can see both players — if the two players start arguing about who’s in charge, the referee makes the call.
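To make the referee analogy concrete, here is a minimal sketch of the plain majority rule behind quorum. The function name has_quorum is my own, and the sketch ignores Corosync's two_node special case, which deliberately relaxes this rule for clusters without a witness.

```shell
# has_quorum PRESENT TOTAL
# Succeeds (exit 0) when PRESENT votes form a strict majority of TOTAL votes.
has_quorum() {
    present=$1
    total=$2
    needed=$(( total / 2 + 1 ))   # smallest strict majority
    [ "$present" -ge "$needed" ]
}

# Two HANA nodes plus a witness: 3 votes, so one surviving node plus the witness wins.
if has_quorum 2 3; then echo "2 of 3: quorate"; fi
# Two nodes without a witness: a lone survivor can never hold a majority.
if ! has_quorum 1 2; then echo "1 of 2: not quorate"; fi
```

This is why the third vote matters: with only two votes, losing either node leaves the survivor without a majority.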

The cluster stack has two major components you need to understand: Corosync handles cluster communication and membership (who’s alive), while Pacemaker manages resources (what runs where and in what order). When someone says “Pacemaker failed over,” what actually happened is that Corosync detected the node was gone, the failed node was fenced, and Pacemaker’s policy engine decided to promote the HANA resource on the secondary.

HSR Replication Modes: Choosing the Right One

SAP HANA supports three HSR replication modes, and choosing correctly is the single most important design decision you’ll make. Each mode represents a different tradeoff between zero data loss and performance impact — there’s no universal “best” option.

Mode 1: Synchronous (sync)

Synchronous replication means every transaction on the primary waits for confirmation that its log has been received and persisted to disk on the secondary before committing. As long as the secondary is connected, this guarantees zero data loss (RPO = 0), but it adds latency to every write operation, typically 1-5 milliseconds over a direct network link.

Use synchronous when: Your RPO is zero (no data loss acceptable), your network latency between nodes is under 2ms, and your workload can tolerate the additional commit delay. This is the default recommendation for most production SAP landscapes.

Mode 2: Synchronous in Memory (syncmem)

Syncmem is a middle ground. The secondary acknowledges log pages as soon as they arrive in its memory, before they are persisted to disk there. Transactions commit faster than with pure sync because the primary waits only for the log to land in the secondary’s memory, not for the secondary’s disk I/O.

The catch? Those log pages are not yet durable on the secondary, so if the primary and the secondary fail at the same time, whatever was still only in memory is lost. In practice, the window is tiny (milliseconds), and SAP rates syncmem as near-zero data loss. I recommend this mode when synchronous adds more than 5ms of latency to your workload but you still need near-zero RPO.

Mode 3: Asynchronous (async)

Asynchronous replication fires and forgets. The primary doesn’t wait for any acknowledgment from the secondary. This gives you the lowest latency on the primary side, but during a failover, you lose whatever transactions were in transit — potentially minutes of data.

Use async when: The replication link spans long distances (inter-region DR), network latency exceeds 10ms, or you have a tolerance for some data loss in exchange for performance. It’s common to see sync for intra-data-center HA paired with async for cross-region DR — a tiered approach.
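The thresholds above can be turned into a quick first-pass check. This is only a sketch of my rule of thumb, not an SAP sizing rule: the function name suggest_hsr_mode is made up, and treating everything between 2ms and 10ms as syncmem territory is a simplification of the guidance in this section.

```shell
# suggest_hsr_mode LATENCY_MS
# Prints sync, syncmem, or async based on round-trip latency in milliseconds.
# Thresholds are the rule-of-thumb values from this section (assumption, not SAP guidance).
suggest_hsr_mode() {
    awk -v ms="$1" 'BEGIN {
        if (ms < 2)        print "sync"
        else if (ms <= 10) print "syncmem"
        else               print "async"
    }'
}

# Feed it the average round-trip time you measure between the nodes,
# for example the avg value reported by: ping -c 20 hananode02
suggest_hsr_mode 0.8
suggest_hsr_mode 25
```

Treat the result as a starting point for a conversation about RPO, not as the decision itself.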

Pacemaker Cluster Components

Now let’s talk about what makes the cluster actually work. Understanding these components will save you hours of “why didn’t it failover?” debugging later.

Corosync — The Membership Layer

Corosync is the cluster communication layer. It manages node membership, message passing, and quorum decisions. In a HANA HA setup, Corosync runs on all nodes (including the witness) over a dedicated network interface. And yes, you should use a separate NIC for Corosync traffic.

Key corosync.conf settings you’ll configure:

totem.version: 2 — Ring protocol version
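secauth: on — Encrypt and authenticate cluster communication (enable this in production; Corosync 3.x replaces it with crypto_cipher and crypto_hash)
ring0_addr (one per node in the nodelist) — Bind to your dedicated cluster network
two_node: 1 — Required for a two-node cluster without a witness; leave it unset when a third witness node votes
quorum_votes — Votes per node (default 1). The witness needs its vote to break ties, so do not set it to 0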
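To show how these settings fit together, here is a small sketch that prints a two-node corosync.conf skeleton in Corosync 2.x syntax. The cluster name and the 10.0.2.x addresses are placeholders I picked for illustration, and in practice pcs generates this file for you during cluster setup, so treat the output as something to compare against rather than to copy blindly.

```shell
# render_corosync_conf CLUSTER_NAME NODE1_ADDR NODE2_ADDR
# Prints a minimal two-node corosync.conf (Corosync 2.x style, no witness).
render_corosync_conf() {
cat <<EOF
totem {
    version: 2
    cluster_name: $1
    secauth: on
}

nodelist {
    node {
        ring0_addr: $2
        nodeid: 1
    }
    node {
        ring0_addr: $3
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}
EOF
}

# Placeholder addresses on the dedicated cluster network.
render_corosync_conf hana-ha 10.0.2.11 10.0.2.12
```

With a third witness node you would add another node entry and drop two_node, so that the normal majority rule applies.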

Fencing — STONITH and Why It Matters

Fencing is the mechanism that guarantees only one node can access shared resources at any time. Without proper fencing, you get a split-brain scenario where both nodes think they’re the primary — and each starts accepting writes independently. This corrupts your data silently and is significantly worse than downtime.

Pacemaker uses STONITH (Shoot The Other Node In The Head) fencing agents. Common agents include:

fence_aws / fence_azure — Cloud API-based fencing for virtualized environments
fence_ipmilan — IPMI/BMC-based physical server fencing
fence_sbd — SBD (STONITH Block Device) for shared-disk fencing

Critical rule: Never disable STONITH in production because “it’s causing problems.” STONITH triggering is a symptom, not the cause. If STONITH fires unexpectedly, fix the root cause (network flapping, resource starvation) rather than disabling the guard rail.

Resource Agents and Constraints

Resource agents are scripts that know how to start, stop, and monitor a specific service, in our case the SAP HANA instance. Two agents do the work: SAPHanaTopology runs on every node and collects the replication status and landscape information, while SAPHana manages the instance itself, including start, stop, promote (takeover), and demote.

The key constraint properties you’ll configure:

colocation — HANA primary IP must run on the same node as the HANA primary role
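ordering — Start the SAPHanaTopology clone before the SAPHana resource so the topology attributes exist before any promotion decision. Don’t order the virtual IP before HANA: the IP is colocated with the promoted instance, so it can only start after a promotion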
promotable resources — Define the master/slave relationship managed by HSR

Step-by-Step Cluster Configuration

Here’s the practical sequence I follow when standing up a new HANA HSR + Pacemaker cluster from scratch. Adjust for your specific OS (SLES or RHEL) and HANA version.

Step 1: Prerequisites Checklist

Both HANA nodes installed with identical version, patch level, and parameter files
Network: at least two network paths between nodes (production + replication)
Time synchronized — NTP drift breaks Corosync. I’ve seen 60-second drift cause false fencing events.
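SSH key-based authentication between nodes (used to copy the SSFS key and data files from the primary to the secondary; HSR itself does not replicate over SSH)
SAP Host Agent installed and running on all nodes
Each node has its own data and log volumes with matching sizing (HSR is shared-nothing, so no log volume is shared between the nodes)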
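The time synchronization item is easy to check before you start. Here is a minimal sketch, assuming you can fetch the peer's clock over SSH; the function name clock_drift_ok and the one-second tolerance are my own choices, and the SSH round trip itself adds a little measurement error.

```shell
# clock_drift_ok LOCAL_EPOCH REMOTE_EPOCH MAX_SECONDS
# Succeeds when the two clocks differ by at most MAX_SECONDS.
clock_drift_ok() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))          # absolute value
    fi
    [ "$diff" -le "$3" ]
}

# Real-world call (hostname from this guide):
#   clock_drift_ok "$(date +%s)" "$(ssh hananode02 date +%s)" 1
# Demonstration with fixed values, 60 seconds apart:
if clock_drift_ok 1700000000 1700000060 1; then
    echo "clocks in sync"
else
    echo "clock drift too large"
fi
```

For anything beyond a quick sanity check, rely on what your NTP or chrony client reports instead.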

Step 2: Enable HANA System Replication

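On the primary node, take a full data backup first, because HSR cannot be enabled on a system that has never been backed up. Back up SYSTEMDB and each tenant (the tenant name PRD and the backup prefix initial_hsr are placeholders):

hdbsql -u SYSTEM -p <password> -i 00 -d SYSTEMDB "BACKUP DATA USING FILE ('initial_hsr')"
hdbsql -u SYSTEM -p <password> -i 00 -d SYSTEMDB "BACKUP DATA FOR PRD USING FILE ('initial_hsr')"

Then enable replication as the <sid>adm user. The replication mode (sync, syncmem, or async) is not set in global.ini on the primary; it is chosen when the secondary registers:

hdbnsutil -sr_enable --name=NODE01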

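On the secondary node, with HANA stopped, register for replication from the primary (NODE02 is the site name for the secondary, matching NODE01 above):

hdbnsutil -sr_register --remoteHost=hananode01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --name=NODE02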
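After registering, I like to confirm what each node thinks its role is. The sketch below pulls a single field out of hdbnsutil -sr_state output. The helper name hsr_field is mine, and the sample text is an illustrative assumption about the "key: value" layout, which can differ between HANA revisions, so check it against your own output first.

```shell
# hsr_field NAME
# Reads `hdbnsutil -sr_state` style output on stdin and prints the value of NAME.
hsr_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}

# Illustrative sample only; real output contains more lines.
sample='online: true

mode: sync
operation mode: logreplay
site id: 2
site name: NODE02'

printf '%s\n' "$sample" | hsr_field mode
printf '%s\n' "$sample" | hsr_field "site name"

# On a real node, as the <sid>adm user:
#   hdbnsutil -sr_state | hsr_field mode
```

Matching on the full key avoids confusing "mode" with "operation mode", which is an easy mistake with a plain grep.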

Step 3: Configure the Cluster Infrastructure

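Install cluster packages (RHEL example; on SLES the cluster is managed with crmsh instead of pcs, and the HANA agents ship in the SAPHanaSR package):

yum install -y pcs pacemaker fence-agents-all resource-agents-sap-hana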

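Initialize the cluster from the primary (pcs 0.9 syntax as shipped with RHEL 7; pcs 0.10 and later use pcs host auth and take the cluster name without --name):

pcs cluster auth hananode01 hananode02 -u hacluster -p <password>
pcs cluster setup --name hana-ha hananode01 hananode02 --start --enable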

Set the cluster-wide Pacemaker properties for quorum loss and fencing:

pcs property set no-quorum-policy=freeze
pcs property set stonith-enabled=true

With the freeze policy, a partition that loses quorum keeps the resources it is already running but will not start, promote, or recover anything else until quorum returns. That stops an isolated node from taking over on its own authority while the other side may still be alive.

Step 4: Configure Fencing

For cloud deployments (AWS example). fence_aws identifies each instance through pcmk_host_map, which maps the cluster node name to its EC2 instance ID:

pcs stonith create fence-node1 fence_aws region=us-east-1 pcmk_host_map="hananode01:i-0abc123" op monitor interval=30s
pcs stonith create fence-node2 fence_aws region=us-east-1 pcmk_host_map="hananode02:i-0def456" op monitor interval=30s

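Configure fencing levels. Levels set the order in which devices are tried: level 1 first, level 2 only if level 1 fails. The level 2 devices below are fence_ipmilan resources for bare-metal nodes and must be created before they are referenced; cloud instances have no IPMI, so skip level 2 there:

pcs stonith level add 1 hananode01 fence-node1
pcs stonith level add 1 hananode02 fence-node2
pcs stonith level add 2 hananode01 fence-ipmilan-node1
pcs stonith level add 2 hananode02 fence-ipmilan-node2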
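One common way to tell a fence device which cloud instance belongs to which cluster node is Pacemaker's pcmk_host_map parameter, which takes node:port pairs separated by semicolons. Here is a small sketch that builds that string from name=id pairs; the helper name build_host_map is hypothetical and the instance IDs are the placeholders used above.

```shell
# build_host_map NODE=INSTANCE_ID [NODE=INSTANCE_ID ...]
# Prints a pcmk_host_map value such as "nodeA:id1;nodeB:id2".
build_host_map() {
    map=""
    for pair in "$@"; do
        node=${pair%%=*}                 # text before the first "="
        id=${pair#*=}                    # text after the first "="
        map="${map:+$map;}$node:$id"     # add ";" only between entries
    done
    printf '%s\n' "$map"
}

build_host_map hananode01=i-0abc123 hananode02=i-0def456
# prints: hananode01:i-0abc123;hananode02:i-0def456
```

Generating the string from your inventory keeps the node names in the map identical to the names the cluster uses, which is where hand-typed maps usually go wrong.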

Step 5: Define Cluster Resources

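Create the SAPHanaTopology clone, which collects replication status on every node, and the SAPHana resource with promotable master/slave semantics (pcs 0.9 syntax; hana_SAPHanaTopology_PRD is a name chosen here to match the others):

pcs resource create hana_SAPHanaTopology_PRD SAPHanaTopology SID=PRD InstanceNumber=00 op start timeout=600 op stop timeout=300 op monitor interval=10 timeout=600 clone clone-max=2 clone-node-max=1 interleave=true
pcs resource create hana_SAPHana_PRD SAPHana SID=PRD InstanceNumber=00 PREFER_SITE_TAKEOVER=true AUTOMATED_REGISTER=false op start timeout=3600 op stop timeout=3600 op promote timeout=3600 op demote timeout=3600 op monitor interval=59 role=Master timeout=700 op monitor interval=61 role=Slave timeout=700 master notify=true clone-max=2 clone-node-max=1 interleave=true

Create the virtual IP that follows the primary:

pcs resource create vip_SAPHana_PRD IPaddr2 ip=10.0.1.100 op monitor interval=10s

Set colocation and ordering constraints. The IP is colocated with the promoted instance rather than ordered before HANA, because it cannot start until a promotion has happened:

pcs constraint colocation add vip_SAPHana_PRD with master hana_SAPHana_PRD-master INFINITY
pcs constraint order hana_SAPHanaTopology_PRD-clone then hana_SAPHana_PRD-master symmetrical=false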

Automatic Failover Testing & DR Drills

This is where I see teams cut corners, and it’s exactly where things go wrong in production. Running a proper DR drill means simulating real failure conditions, not just clicking “stop HANA” in HANA Studio.

Test Scenario 1: Graceful Primary Failure

# On primary node
crm_mon -r  # Note current layout
pcs cluster stop hananode01  # Simulates clean shutdown

Expected: Secondary promotes to primary within 30-60 seconds. Verify on hananode02 as the <sid>adm user; the output should now report the node as primary:

hdbnsutil -sr_state

Test Scenario 2: Hard Crash (Kernel Panic)

This is the real test. On the primary node:

echo c > /proc/sysrq-trigger

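This triggers a kernel panic immediately, with no graceful shutdown. The cluster should detect the node loss within the configured Corosync token timeout plus fencing time. The Corosync default token is short (1 second in 2.x, 3 seconds in 3.x), and HANA clusters commonly raise it, so check the totem.token value in your corosync.conf rather than assuming a number. For a clean test, expected total failover time: 15-30 seconds with sync replication.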
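To see where those seconds go, here is a rough back-of-the-envelope sketch. It assumes detection takes the token timeout plus Corosync's consensus timeout (1.2 times the token by default), followed by fencing and the HANA takeover. The function name and the example fencing and takeover durations are my assumptions; measure your own.

```shell
# estimate_failover_s TOKEN_MS FENCE_S TAKEOVER_S
# Prints a rough failover estimate in whole seconds (rounded up).
estimate_failover_s() {
    token_ms=$1
    consensus_ms=$(( token_ms * 12 / 10 ))    # Corosync default: 1.2 x token
    total_ms=$(( token_ms + consensus_ms + ($2 + $3) * 1000 ))
    echo $(( (total_ms + 999) / 1000 ))       # round up to full seconds
}

# 3s token, 5s fencing, 10s takeover (illustrative numbers)
estimate_failover_s 3000 5 10
# prints: 22
```

Raising the token timeout makes the cluster more tolerant of brief network hiccups, but the same arithmetic shows it adds directly to every failover.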

Check the cluster log for the fencing event:

journalctl -u pacemaker --since "5 minutes ago" | grep stonith

Test Scenario 3: Network Partition

Simulate network failure by dropping all traffic to and from the peer (Corosync included) on the primary:

iptables -A INPUT -s hananode02 -j DROP
iptables -A OUTPUT -d hananode02 -j DROP

The quorum mechanism should handle this correctly. If you have a witness node, the side that still forms a majority with the witness keeps quorum and continues operating, while the isolated node is fenced. Remove the iptables rules after testing (repeat each command with -D instead of -A); leaving them in place keeps the nodes partitioned and invites repeated fencing.
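Because forgetting the cleanup is the classic mistake in this test, I prefer to generate the block and unblock commands from one place. This is a sketch only: the function name partition_cmds is made up, and it just prints the iptables commands so you can review them before running anything as root.

```shell
# partition_cmds block|unblock PEER
# Prints the iptables commands that create (-A) or remove (-D) the partition.
partition_cmds() {
    case "$1" in
        block)   flag=-A ;;
        unblock) flag=-D ;;
        *) echo "usage: partition_cmds block|unblock PEER" >&2; return 2 ;;
    esac
    echo "iptables $flag INPUT -s $2 -j DROP"
    echo "iptables $flag OUTPUT -d $2 -j DROP"
}

partition_cmds block hananode02
partition_cmds unblock hananode02

# To actually apply them (as root), pipe the output into a shell:
#   partition_cmds block hananode02 | sh
```

Printing first and executing second also gives you an exact record of what was changed for the drill report.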

Test Scenario 4: DR Failover

For cross-site DR replication, test the full takeover procedure to your DR site. Document the time it takes — your RTO (Recovery Time Objective) number needs to be measured, not guessed. As I covered in my earlier post about SAP HANA Backup & Recovery, recovery time validation is something most teams skip until an auditor asks for evidence.

Monitoring the Cluster

A cluster you can’t monitor is a cluster that fails silently. Set up monitoring from day one — don’t wait until after go-live.

Essential Cluster Health Checks

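crm_mon -1rf — One-shot resource status including inactive resources and fail counts; drop the -1 for a live view. Run the one-shot form as a cron job every minute to catch stop-start loops.
SAPHanaSR-showAttr — Check HSR replication status, sync state, and site-specific attributes.
systemReplicationStatus.py — Built-in SAP script that checks replication health. Its return codes are 10 (no HSR), 11 (error), 12 (unknown), 13 (initializing), 14 (syncing), and 15 (active), so treat anything other than 15 as a condition to alert on.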
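For alerting, it helps to translate the script's numeric return code into something readable. SAP documents return codes 10 through 15 for systemReplicationStatus.py; the sketch below maps them to labels. The function name and label spellings are my own.

```shell
# hsr_rc_to_state RETURN_CODE
# Maps the return code of systemReplicationStatus.py to a readable label.
hsr_rc_to_state() {
    case "$1" in
        10) echo "NO_HSR" ;;
        11) echo "ERROR" ;;
        12) echo "UNKNOWN" ;;
        13) echo "INITIALIZING" ;;
        14) echo "SYNCING" ;;
        15) echo "ACTIVE" ;;
        *)  echo "UNEXPECTED($1)" ;;
    esac
}

# On a HANA node, as the <sid>adm user, from the directory holding the script:
#   python systemReplicationStatus.py >/dev/null; hsr_rc_to_state $?
hsr_rc_to_state 15
hsr_rc_to_state 11
```

In a monitoring wrapper I would alert on anything other than ACTIVE, with SYNCING and INITIALIZING as warnings rather than criticals.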

Replication Lag Monitoring

Watch the log gap between primary and secondary. If it grows beyond 10MB consistently, your network throughput or secondary write performance is the bottleneck. On the primary, compare the last written and last shipped log position times for the indexserver:

SELECT r.HOST, r.PORT, r.SECONDARY_HOST, r.REPLICATION_MODE, r.REPLICATION_STATUS, r.LAST_LOG_POSITION_TIME, r.SHIPPED_LOG_POSITION_TIME FROM M_SERVICE_REPLICATION r JOIN M_SERVICES s ON s.HOST = r.HOST AND s.PORT = r.PORT WHERE s.SERVICE_NAME = 'indexserver' AND r.REPLICATION_STATUS = 'ACTIVE'

Integration with SAP Solution Manager

Register your HANA cluster in Solution Manager’s Technical Monitoring for central alerting. The HDB_ALERT_MONITOR configuration handles critical cluster events including failed failovers, fencing actions, and replication lag exceeding thresholds.

Troubleshooting Common Issues

After building dozens of these clusters, here are the problems I see most often — and the fixes that aren’t in the SAP documentation:

Split-Brain: Both Nodes Primary

Symptoms: Both nodes show PRIMARY role in M_SERVICE_REPLICATION, and applications report inconsistent data reads.

Root cause: Usually network corruption between Corosync ring members, or a STONITH agent that failed to fire. Check for asymmetric routing — packets from node1 to node2 taking a different path than node2 to node1.

Fix sequence:

Identify which node has the most recent data (check M_BACKUP_CATALOG for latest timestamps)
Keep the winning node running as primary, stop HANA on the stale node, then re-register the stale node as a secondary with hdbnsutil -sr_register
Investigate why STONITH didn’t fire — check agent logs, API credentials expiry (common in cloud environments)

Failed Failover: Secondary Won’t Promote

Most common cause: The secondary’s HANA log replay is behind and Pacemaker won’t promote because the resource agent’s promote operation times out. Check:

SAPHanaSR-showAttr --sid=PRD

Look for the secondary’s sync_state showing anything other than SOK (SFAIL means replication is broken). If the secondary is in a corrupt or incomplete state, you may need to re-sync from the current primary before attempting promotion.
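The same check is easy to script so it can gate a planned takeover. This is a minimal sketch that only looks for the SOK and SFAIL tokens in whatever SAPHanaSR-showAttr prints; the function name hsr_sync_ok is mine, and the sample lines are simplified stand-ins for the real table output.

```shell
# hsr_sync_ok
# Reads SAPHanaSR-showAttr output on stdin.
# Succeeds only if SOK is present and SFAIL is absent.
hsr_sync_ok() {
    out=$(cat)
    if printf '%s\n' "$out" | grep -qw SFAIL; then
        return 1
    fi
    printf '%s\n' "$out" | grep -qw SOK
}

# Simplified sample lines (the real output has many more columns).
if printf 'hananode01 PROMOTED PRIM\nhananode02 DEMOTED SOK\n' | hsr_sync_ok; then
    echo "secondary in sync"
fi

# On a cluster node:
#   SAPHanaSR-showAttr --sid=PRD | hsr_sync_ok && echo "safe to test takeover"
```

Requiring SOK to be present, instead of only checking that SFAIL is absent, also catches the case where the command returns nothing at all.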

Inactive Replication After Maintenance

After a planned maintenance cycle where both nodes were briefly stopped, replication often shows “inactive.” This usually means the secondary doesn’t have the correct log position. Resolution:

On the secondary, with HANA stopped (NODE02 is the secondary’s site name):
hdbnsutil -sr_register --remoteHost=hananode01 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay --name=NODE02

Verify sync state returns to ACTIVE before considering the cluster “healthy” again.

Best Practices for Production

After everything above, here are the non-obvious things that separate a production-grade installation from a lab setup. For cloud-specific considerations, see my SAP on Azure Deployment Guide:

Use logreplay rather than delta_datashipping as the operation mode. With logreplay the secondary continuously replays the shipped log, so takeover is faster and less data crosses the replication link than with periodic delta shipping.
Dedicate a network interface for HSR replication. Sharing the HSR NIC with production traffic creates latency spikes during peak workloads.
Set the correct max_concurrency for your replication. Default values assume uniform workload — if you have large batch jobs, increase the parallel log shipping threads.
Test your fencing agents monthly. Cloud API credentials rotate, IPMI firmware gets updated, and shared disk SBD devices get re-partitioned. Test that your fence agent actually works.
Document the runbook and store it offline. When your cluster is down and your wiki is on that same cluster, you can’t access the fix procedure. Keep a printed copy or separate documentation system.
Use the HANA Lifecycle Manager for updates, one node at a time. Put the cluster or the HANA resource into maintenance mode before patching so Pacemaker does not react to the planned stop. The cluster won’t place nodes in maintenance mode for you, so script it.
Monitor disk latency on the secondary. If the secondary’s storage is slower than the primary’s, replication lag will grow during heavy write workloads even if HSR shows “active.”

For the sizing and performance math behind HANA memory and storage planning, I covered the detailed calculations in my SAP HANA DBA Calculations Guide — worth a read if you’re planning capacity.

Conclusion

Setting up SAP HANA HSR with Pacemaker gives you genuine automatic failover capability — but only if you design, configure, and test it properly. The cluster doesn’t maintain itself. Split-brain scenarios, silent replication failures, and expired fencing credentials are all things that look fine in monitoring until the moment you actually need the cluster to save you.

Start with a clear architecture decision (sync for zero data loss, async for DR, syncmem for the middle ground), configure fencing before you configure resources, and run actual failure tests — not just “stop the service” tests. Your future self at 3 AM will thank you.

Have you set up HANA HSR clusters in production? What was your biggest surprise during the first failover test? Drop a comment below or connect with me on LinkedIn — I’d love to hear what worked (or didn’t) in your environment.