Distributed Locking in Horizontally Scaled Systems: Deep Dive into Apache ZooKeeper

Anand Singh

In the world of large-scale distributed systems, one of the most challenging issues we, as developers, face is managing concurrent access to shared resources. I encountered this firsthand while building a horizontally scaled service to process card transactions. Although the service itself was stateless and distributed across multiple nodes, the underlying data, such as user account balances, was centralised, for obvious reasons. Transactions processed in parallel on different nodes could target the same account, leading to the following problems -

  • Race Conditions - Concurrent updates to an account can result in unpredictable final states.
  • Double Spending - Funds may be spent more than once before the system detects and resolves conflicts.
  • Data Inconsistency - Distributed nodes may end up with conflicting views of the same account data.

To tackle these issues, I dove deep into distributed locking, a critical technique for coordinating access to shared resources in distributed environments. This article explores the problem space and compares the main locking approaches along with their pros and cons, with Apache ZooKeeper as the centrepiece of the discussion. So, let’s dive straight into it.

Distributed Locking Mechanisms/Tools

Before zooming in on any specific approach, it’s important to understand the broader landscape of distributed locking options. Each approach has trade-offs in terms of complexity, availability, consistency, and failure handling.

1. Optimistic Locking

  • How it works - We do not take any lock upfront. Instead, we read the version number or timestamp of the data. When updating the record, we check whether the version is unchanged. If it is, the update goes through; otherwise, it fails and we retry.
  • Pros - No external locking infrastructure. It’s simple.
  • Cons - Conflict resolution logic is needed. Many retries may be required under contention, which can degrade performance.
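A minimal in-memory sketch of the version-check idea (the Account type and UpdateBalance helper are illustrative; a real system would typically do this with a conditional UPDATE ... WHERE version = ? in the database):

```go
package main

import (
	"errors"
	"fmt"
)

// Account carries a version that changes on every successful write.
type Account struct {
	Balance int64
	Version int64
}

var ErrConflict = errors.New("version conflict: retry with fresh read")

// UpdateBalance applies the change only if the caller's version still
// matches the stored one; otherwise the caller must re-read and retry.
func UpdateBalance(acc *Account, readVersion int64, delta int64) error {
	if acc.Version != readVersion {
		return ErrConflict
	}
	acc.Balance += delta
	acc.Version++
	return nil
}

func main() {
	acc := &Account{Balance: 100, Version: 1}

	v := acc.Version // read the current version
	if err := UpdateBalance(acc, v, -30); err != nil {
		panic(err)
	}

	// A second writer still holding the stale version 1 is rejected.
	err := UpdateBalance(acc, v, -30)
	fmt.Println(acc.Balance, err) // 70 version conflict: retry with fresh read
}
```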

2. Pessimistic Locking

  • How it works - We lock the resource for exclusive access before any updates happen, preventing others from modifying it. For example, taking database (MySQL/Postgres, etc.) row-level locks (e.g., SELECT ... FOR UPDATE)
  • Pros - Strong consistency. Safer for high-conflict operations
  • Cons - Risk of deadlocks. Not ideal for high-concurrency environments.

3. Redis-based Locking (e.g., Redlock)

  • How it works - We use Redis with key expirations and replication to enforce distributed locks. Redlock improves safety by requiring the lock to be acquired on a majority (n/2 + 1) of n independent Redis instances.
  • Pros - Fast and simple. No major infra changes if Redis is already in use.
  • Cons - Must handle edge cases like clock drift and partial failures. Split-brain risks if not carefully implemented.

NOTE - More detailed resource on Redlock
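The majority requirement can be sketched as a pure calculation (the helper names are hypothetical; a real Redlock client must also account for lock TTLs and the time spent acquiring):

```go
package main

import "fmt"

// quorum returns the minimum number of Redis instances that must grant
// the lock for it to count as acquired under Redlock: a strict majority.
func quorum(instances int) int {
	return instances/2 + 1
}

// acquired reports whether enough instances granted the lock.
func acquired(grants, instances int) bool {
	return grants >= quorum(instances)
}

func main() {
	fmt.Println(quorum(5))      // 3
	fmt.Println(acquired(2, 5)) // false: only a minority granted it
	fmt.Println(acquired(3, 5)) // true: strict majority reached
}
```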

4. ZooKeeper-based Locking

  • How it works - Clients create ephemeral, sequential znodes. The client with the lowest sequence number gets the lock. Others wait for their turn.
  • Pros - Strong consistency. Fault tolerant. Auto cleans locks if a client crashes
  • Cons - Operational overhead as it needs a running ZooKeeper ensemble. Higher latency than Redis

ZooKeeper Deep Dive 

What is ZooKeeper?

ZooKeeper is a distributed coordination service designed to manage and coordinate distributed systems. Its data is organised in a hierarchical namespace, similar to a filesystem, which looks like a tree. Each node in this tree is called a zNode (ZooKeeper node), identified by a slash-separated path (e.g., /app/config). zNodes can be of three types -

  • Persistent zNodes - Remain until explicitly deleted.
  • Ephemeral zNodes - Exist as long as the session that created them is active.
  • Sequential zNodes - Assigned a unique, monotonically incrementing number as a name suffix when created (can be combined with persistent or ephemeral).

These zNodes are used to store data and maintain state information in a distributed system.

Important - A zNode cannot be renamed.

Sessions - When a client connects to a ZooKeeper server, a session is established. If the client fails to send heartbeats (periodic keep-alive pings) within a specified timeout, the session expires, and any ephemeral zNodes created by the client are deleted.

Watches - Clients can set watches on zNodes to get notified of changes. Watches are one-time triggers. After being triggered, they are removed.

Why ZooKeeper?

In our use case, Redis lacked the safety guarantees we needed. We needed a solution that guaranteed mutual exclusion across multiple nodes, survived node crashes, and gave us a consistent state. While ZooKeeper introduces additional complexity, it met these requirements effectively.

How ZooKeeper Locking Works

ZooKeeper clients create an ephemeral sequential znode under a known path (e.g., /locks/lock-). The client that creates the znode with the lowest sequence number owns the lock. Others watch the znode just before theirs, and once that node is deleted (lock released), they try to acquire the lock.

Step-by-Step Implementation

  1. Create Lock Path - On service startup, ensure /locks exists.
  2. Acquire Lock - 
  • First we create an ephemeral sequential znode under /locks, e.g., /locks/lock-0000000012. The sequence number is assigned automatically by ZooKeeper.
  • We should then list znodes under /locks and sort by sequence number.
  • If our znode is the lowest, we own the lock.
  • Otherwise, watch the znode that comes before ours.

3. Release Lock - 

  • Just delete the znode that we have taken.
  • ZooKeeper will notify the next client in line.
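The acquire logic above reduces to a pure decision over the sorted children list. Here is a sketch (the nextStep helper is illustrative, not part of any ZooKeeper client API):

```go
package main

import (
	"fmt"
	"sort"
)

// nextStep decides, given our znode name and the current children of the
// lock path, whether we hold the lock or which predecessor to watch.
func nextStep(myNode string, children []string) (haveLock bool, watch string) {
	sort.Strings(children) // zero-padded sequence suffixes sort lexicographically
	for i, c := range children {
		if c == myNode {
			if i == 0 {
				return true, "" // lowest sequence number: the lock is ours
			}
			return false, children[i-1] // watch only the node just before ours
		}
	}
	return false, "" // our node vanished (e.g., session expired): re-acquire
}

func main() {
	children := []string{"lock-0000000012", "lock-0000000010", "lock-0000000011"}
	ok, watch := nextStep("lock-0000000011", children)
	fmt.Println(ok, watch) // false lock-0000000010
}
```

Watching only the immediate predecessor, rather than the lowest node, avoids a thundering herd where every waiter wakes up on each release.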

Integration in our use case

In our card transaction processing system, we integrated ZooKeeper-based locking as described in the steps below.

Step 1: Lock Acquisition

Each service instance performs the following.

  • Connect to ZooKeeper
// Using the github.com/go-zookeeper/zk client.
conn, _, err := zk.Connect([]string{"localhost:2181"}, time.Second*5)
if err != nil {
    panic(err)
}
  • Ensure the Lock Path Exists
lockBasePath := "/locks/account-123"
exists, _, err := conn.Exists(lockBasePath)
if err != nil {
    panic(err)
}
if !exists {
    _, err := conn.Create(lockBasePath, []byte{}, 0, zk.WorldACL(zk.PermAll))
    if err != nil && err != zk.ErrNodeExists {
        panic(err)
    }
}
  • Create an Ephemeral Sequential Znode - ZooKeeper automatically appends a monotonically increasing sequence number.
lockPathPrefix := lockBasePath + "/lock-"
lockPath, err := conn.Create(lockPathPrefix, []byte{}, zk.FlagEphemeral|zk.FlagSequence, zk.WorldACL(zk.PermAll))
if err != nil {
    panic(err)
}
fmt.Println("Created znode:", lockPath)

Step 2: Determining Lock Ownership

  • List all znodes under the lock path and sort them -
children, _, err := conn.Children(lockBasePath)
if err != nil {
    panic(err)
}
sort.Strings(children)

myNode := strings.TrimPrefix(lockPath, lockBasePath+"/")
index := sort.SearchStrings(children, myNode)
  • If the current node is the smallest, it holds the lock. Else, it sets a watch on its immediate predecessor.
if index == 0 {
    fmt.Println("Acquired lock!")
} else {
    previousNode := children[index-1]
    watchPath := lockBasePath + "/" + previousNode

    exists, _, ch, err := conn.ExistsW(watchPath)
    if err != nil {
        panic(err)
    }
    if !exists {
        // The predecessor vanished between listing and watching;
        // re-list the children and re-check ownership.
        fmt.Println("Previous node already gone, re-checking...")
    } else {
        fmt.Println("Waiting for", watchPath, "to be deleted...")
        // The watch is a one-time trigger and fires on any event on that
        // node; production code should re-list the children after it fires
        // rather than assume the lock is now ours.
        <-ch
        fmt.Println("Acquired lock after previous node deleted!")
    }
}

Step 3: Perform Critical Operations such as balance checks and updates

fmt.Println("Processing transaction: read/write balance, audit logs...")

Step 4: Release the Lock

err = conn.Delete(lockPath, -1)
if err != nil {
    panic(err)
}
fmt.Println("Released lock.")

Crash Safety with Ephemeral Znodes

If a service or node crashes or disconnects, ZooKeeper automatically deletes the ephemeral znode upon session expiry. This ensures that locks held by failed services are released, allowing the next in line to proceed. It prevents deadlock.

ZooKeeper Considerations and Challenges

  • Ephemeral Nodes - All our clients must handle session expiration and reconnection logic to ensure locks are released appropriately.
  • Ensemble Health - We should keep an odd number of ZooKeeper nodes for quorum (typically 3 or 5).
  • Latency - While reads are fast, writes (creating znodes) require consensus, which can introduce latency.
  • Deadlocks - We should avoid long-held locks and make operations short and idempotent.
  • Failover - We should implement graceful handling of ZooKeeper outages, such as retries with exponential backoff.
  • Monitoring - We should also track lock acquisition times and queue lengths to detect contention.

Conclusion

In the chaos of distributed systems, coordination is everything. Each of the approaches above can make a system more reliable; the right choice depends on the consistency and availability trade-offs you can accept. Apache ZooKeeper may not be the lightest or easiest option, but it is graceful where it counts: ensuring order, integrity, and resilience. For distributed systems where precision matters, like payment platforms, it’s not just a tool, it’s insurance!