Wednesday 13 May 2009 - Self-assembling high availability grids
Currently under testing and development: the amazing "infinite rabbits from a single venti-hat" High Availability Grid. Over the past few weeks we've been exploring ideas triggered by the possibility of storing entire sets of machines, in both virtual machine and fossil vacroot form, in a single set of arenas and isect data for a venti server. Before your eyes glaze over from the seeming over-complexity, let's cut to the chase: the goal for the end user.
The end product is a relatively small file, distributed as a .tgz, that contains a complete set of files for a venti server and a large collection of .vac files. The venti data includes the rootscores of many different machine fossils, and also entire preinstalled qemu virtual machine images that correspond to those rootscores. Using either a native or plan9port venti, the user can choose from a large number of differently configured machines to instantiate on either native hardware or as a virtual machine. The venti has data corresponding to machines ranging from an untouched default install to a large grid of dozens of machines. Because venti stores each unique block only once, addressed by its hash, storing an arbitrarily large number of differently configured virtual machines and fossils has almost no overhead.
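To make the "almost no overhead" point concrete, here is a toy sketch in Python of content-addressed block storage. It is not venti: real venti arranges blocks in a hash tree under a single root score, while this sketch keeps a flat list of scores per image. BlockStore, make_image and BLOCK_SIZE are names invented for the illustration. The only thing carried over from venti is the idea of keying each block by its SHA-1 hash.

```python
import hashlib

BLOCK_SIZE = 8192  # illustrative block size, an assumption for this sketch


class BlockStore:
    """Toy content-addressed store: every block is keyed by its SHA-1 score."""

    def __init__(self):
        self.blocks = {}

    def put(self, data):
        score = hashlib.sha1(data).hexdigest()
        # A block that is already present is not stored a second time.
        self.blocks.setdefault(score, data)
        return score

    def put_image(self, image):
        """Split an image into blocks and return the list of block scores."""
        return [self.put(image[i:i + BLOCK_SIZE])
                for i in range(0, len(image), BLOCK_SIZE)]

    def get_image(self, scores):
        """Reassemble an image from its list of block scores."""
        return b"".join(self.blocks[s] for s in scores)

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


def make_image(n_blocks, changed=()):
    """Build a fake disk image; blocks listed in `changed` get different contents."""
    blocks = []
    for i in range(n_blocks):
        tag = b"X" if i in changed else b"A"
        blocks.append((tag + i.to_bytes(3, "big")) * (BLOCK_SIZE // 4))
    return b"".join(blocks)


store = BlockStore()
base = make_image(100)                  # stand-in for a default install
variant = make_image(100, changed={7})  # same machine with one block reconfigured

base_scores = store.put_image(base)
variant_scores = store.put_image(variant)

print(len(base) + len(variant))  # bytes if both images were stored whole
print(store.stored_bytes())      # bytes actually held by the store
```

The two images total 200 blocks, but the store ends up holding 101: the 100 blocks of the base image plus the one block that differs in the variant. Each additional machine configuration costs only the blocks that differ from what is already stored.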
The system images included are configured to act as an integrated Plan 9 network, with a few tricks designed to provide a high availability 'cloud' environment out of the resources available. This is the system employed on the 9gridchan version of the grid, which will be opened for public use soon. Our current setup uses one master and one backup venti, three fossil file servers, three cpu servers, and an arbitrary number of terminals. A single unified environment for the user, along with failsafe reliability, is created via simple scripted filesystem imports and binds.
Any of the three fossils can boot backed by either of the two ventis, which use venti/rdarena and venti/wrarena to make scores available on both. Each cpu server boots with a tcp root from a different fossil. However, all the fossils are basically identical, and when the user logs into any cpu server, the other fossils are imported with 9fs and the primary fossil's $home is bound over the boot $home by the cpu. A simple backgrounded script then mirrors the active $user's data to the backup fossils.
The user interaction model is to dial each of the three cpu servers from a different window on their terminal. Each window behaves identically and binds $home from the same primary fossil, but uses a different cpu with a different tcp root and venti backup. Once the user has dialed in their three windows, they can work freely in any of them and see the same file data. If the primary fossil fails, one of the backup fossils, no more than 5 minutes out of date, can be bound to $home. A standard usage model is to run rio on one cpu, acme on another, and only an rc shell on the third. The user experience should be identical to working with a single all-in-one machine, with the added benefits of enhanced reliability and the increased performance of truly independent multiple CPUs.
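The mirror-then-failover behavior described above can be modeled in a few lines of Python. This is only a sketch of the logic, not the actual scripts, which work through filesystem imports and binds. The Fossil class, the mirror and choose_home functions, and the fossil names are all hypothetical, made up for the illustration.

```python
class Fossil:
    """Hypothetical stand-in for a fossil file server holding $home."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.files = {}


def mirror(primary, backups):
    """Copy the primary's files to each live backup, as one mirror run would."""
    for backup in backups:
        if backup.alive:
            backup.files = dict(primary.files)


def choose_home(fossils):
    """Return the first live fossil in priority order, or None if all are down."""
    for fossil in fossils:
        if fossil.alive:
            return fossil
    return None


fossils = [Fossil("fossil1"), Fossil("fossil2"), Fossil("fossil3")]
primary, backups = fossils[0], fossils[1:]

primary.files["notes"] = "draft 1"
mirror(primary, backups)            # scheduled mirror run
primary.files["notes"] = "draft 2"  # edit made after the last mirror run
primary.alive = False               # primary fails before the next run

home = choose_home(fossils)
print(home.name, home.files["notes"])
```

After the failure, $home comes from the first backup and holds the state as of the last mirror run, so the edit made in between is lost. That window is what the 5 minute figure bounds.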
A public release of a first version of these tools along with some updates to the g/scripts and preconfigured qemu images should be arriving soon. Feel free to stop in #plan9chan on irc.freenode.net if you want to explore this system in its development state. Standard public resources such as omni are unaffected by the current development environment but will become integrated at the time of an initial public distribution.