Building distributed systems is a zoo...

ZooKeeper is a service for coordinating the processes of distributed applications. Historically, distributed processes have been coordinated using group messaging, shared registers, or distributed lock services. ZooKeeper draws on elements of all these services but combines them into a replicated, centralized service. Its interface combines the wait-free aspects of group messaging and shared registers with an eventing mechanism similar to that of lock services, providing a simple yet powerful coordination service.

ZooKeeper is a replicated service and tolerates faults of the servers that run it. Clients can connect to any of the servers that make up the service. Network and machine faults may require a client to switch to a different server, but the client will always see the same view of the service. To guarantee that applications perform correctly despite concurrent access, ZooKeeper implements an efficient replicated state machine and guarantees that updates to nodes are totally ordered. An operation that reads the state of a node simply reads the state of one server replica, which makes read operations significantly more scalable. (More on guarantees: wiki guarantees).
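The two properties described above, totally ordered updates and reads served by a single replica, can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the `Replica` and `Sequencer` classes are hypothetical names invented for illustration, updates are delivered synchronously to every replica, and no faults occur. It is not ZooKeeper's implementation or its client API.

```python
class Replica:
    """One server's copy of the state (toy model, not ZooKeeper code)."""

    def __init__(self):
        self.data = {}         # path -> value
        self.last_applied = 0  # sequence number of the last applied update

    def apply(self, seq, path, value):
        # Every replica applies updates in the same order, with no gaps.
        if seq != self.last_applied + 1:
            raise ValueError(f"out-of-order update {seq}")
        self.data[path] = value
        self.last_applied = seq

    def read(self, path):
        # A read consults only this replica; no other server is involved.
        return self.data.get(path)


class Sequencer:
    """Hypothetical component that assigns one total order to all updates."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.next_seq = 1

    def update(self, path, value):
        seq = self.next_seq
        self.next_seq += 1
        # Assumption: synchronous, fault-free delivery to every replica.
        for replica in self.replicas:
            replica.apply(seq, path, value)
        return seq


replicas = [Replica() for _ in range(3)]
service = Sequencer(replicas)
service.update("/app1/config", "v1")
service.update("/app1/config", "v2")

# Any replica can answer the read, and all of them agree.
values = [r.read("/app1/config") for r in replicas]
```

Because every replica applies the same updates in the same order, a client that switches servers reads the same state, and adding replicas adds read capacity without adding coordination on the read path.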

ZooKeeper exposes to applications a simple interface to a hierarchical namespace of simple data nodes. Through ZooKeeper, an application coordinates the actions of its processes by storing configuration data or by implementing more sophisticated primitives such as locks and barriers. For the target workloads, with an update-to-read ratio of 1:10 to 1:100, ZooKeeper can handle hundreds of thousands of transactions per second. This performance allows ZooKeeper to be used pervasively by client applications.
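To make the idea of a hierarchical namespace of data nodes concrete, here is a minimal in-memory sketch. The `Namespace` class, its method names, the exception types, and the `try_lock` helper are all assumptions made for illustration; they are not the ZooKeeper client API. The sketch stores a piece of configuration data and then builds a very simple lock on top of "create fails if the node already exists".

```python
class NodeExistsError(Exception):
    pass


class NoNodeError(Exception):
    pass


class Namespace:
    """Toy hierarchical namespace: slash-separated paths mapped to bytes."""

    def __init__(self):
        self.nodes = {"/": b""}  # the root node always exists

    @staticmethod
    def _parent(path):
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        if path in self.nodes:
            raise NodeExistsError(path)
        if self._parent(path) not in self.nodes:
            raise NoNodeError(self._parent(path))
        self.nodes[path] = data

    def get(self, path):
        if path not in self.nodes:
            raise NoNodeError(path)
        return self.nodes[path]

    def children(self, path):
        # Names of the direct children of `path`, sorted for stable output.
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self.nodes
            if p != "/" and self._parent(p) == path
        )


def try_lock(ns, path, owner):
    """Hypothetical lock primitive: whoever creates the node holds the lock."""
    try:
        ns.create(path, owner)
        return True
    except NodeExistsError:
        return False


ns = Namespace()
ns.create("/app1")
ns.create("/app1/config", b"max_workers=4")  # configuration data in a node

first = try_lock(ns, "/app1/lock", b"process-a")   # acquires the lock
second = try_lock(ns, "/app1/lock", b"process-b")  # node exists, so it fails
```

The lock here relies only on the node name, not on its contents, which matches the idea that node names alone can carry coordination meaning.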

With ZooKeeper, applications manipulate nodes to coordinate the actions of their processes, much as files and directories are manipulated in a file system. User applications store data directly in nodes, or simply use node names to signal some application event. Take the following simple example:

In this example, we have two applications, "app1" and "app2". The former uses its leaf nodes for failure detection: if a node representing a process exists, then that process is up and running. An important ZooKeeper feature for this case is the ability to make nodes ephemeral. Ephemeral nodes are automatically deleted once the client that created them disconnects. (More on recipes: wiki recipes).
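The failure-detection pattern can be sketched with a toy model in which each ephemeral node remembers the session that created it. Everything here is an assumption for illustration: the `ToyService` class, the session identifiers, and the leaf names `p_1` and `p_2` are hypothetical, and the code does not use the real ZooKeeper client.

```python
class ToyService:
    """Toy model of persistent and ephemeral nodes (not ZooKeeper code)."""

    def __init__(self):
        # path -> owning session id, or None for a persistent node
        self.nodes = {}

    def create(self, path, session=None):
        # Passing a session makes the node ephemeral in this model.
        self.nodes[path] = session

    def exists(self, path):
        return path in self.nodes

    def close_session(self, session):
        # When a client disconnects, all of its ephemeral nodes disappear.
        for path in [p for p, s in self.nodes.items() if s == session]:
            del self.nodes[path]


zk = ToyService()
zk.create("/app1")                            # persistent parent node
zk.create("/app1/p_1", session="session-1")   # ephemeral: process 1 is up
zk.create("/app1/p_2", session="session-2")   # ephemeral: process 2 is up

alive_before = zk.exists("/app1/p_1")
zk.close_session("session-1")                 # process 1 disconnects
alive_after = zk.exists("/app1/p_1")
```

Any other process can then treat the absence of `/app1/p_1` as a sign that process 1 is no longer connected, without process 1 having to report its own failure.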



Recent Talks about ZooKeeper


Current developers

Have an idea to contribute to ZooKeeper? Drop us a message at zookeeper-user@lists.sourceforge.net.


Download

    Download page


last modified: Tuesday, 28-Oct-2008 21:35:25 UTC