Sunday, 3 January 2010 - coming soon, updated Qemu image with many new tools (2010/01/03)
Now that recent projects like writable /proc/pid/ns and rootless post kernel load startup have been announced on 9fans, it's time to make a new version of the Qemu preinstalled image with all the new stuff. It will include the latest updates of the previous grid software as well as hubfs, grio, and the new startup process - all available optionally, along with the traditional default kernels and configs. However, I am going to try to make the customized environment as compelling as possible, because I'd like to make a good case for the idea that using multiple system nodes, virtualized or not, has practical benefits. Some goals:
multi-machine aware startup to bind in fileserver and cpu resources early if available
easier to make use of 9gridchan.org hosted resources - and 9gridchan registry needs more real services
easy to set up highly unconventional namespace structures coexisting on a single machine (rootless boot helps)
simple, scripted data replication and machine backup operations
easy bootstrapping from grid of VMs to physical machines or mixed grid
Release goal is by the end of January, with a rough spin available for interested testers a week before or so.
Wednesday 21 October 2009 - Overdue Hubfs announcement - improved "screen" utility (2009/10/21)
Since this has been up on sources for several months now, I really should have announced this here already. Hubfs is an improved screen-like tool for Plan 9, implemented as a 9p fs with client. It is much more resource efficient than iosrv and, as a 9p fs, much better integrated into the overall system. The user interface is also much more sane. It can be installed from the source tarball contrib/mycroftiv/hubfs.tgz, or with contrib/install mycroftiv/hubfs. Basic usage is just 'hub NAME' to either start a new shell/hub session or connect to an existing one. The manpage explains most of the details. Following completion of a solid version of hubfs I hit the burnout threshold as a result of several months of 16 hours a day of Plan 9 and retreated to a remote corner of namespace for a couple of months. With the arrival of the winter months, however, I have a good excuse to turn on all my computers in the service of home heating, and undoubtedly this will result in a renewed burst of development energy.
Wednesday 8 July 2009 - Announcing Iosrv - "screen without a screen" (2009/07/08)
After several weeks of isolation in their underground lab, the /9/grid's craziest researchers are now unleashing upon the world our latest weapon in the struggle for an open collaborative decentralized worldwide grid:
Iosrv.tgz
Available on Bell Labs sources - contrib/install mycroftiv/iosrv or as a tarball contrib/mycroftiv/iosrv.tgz to mk install at your pleasure, or here - this is what happened when we decided to get serious about screen-like functionality for Plan 9 while also trying to pursue some tantalizing hints about some of Doug McIlroy's original ideas that led to the creation of UNIX pipes. Iosrv (which is controlled via a wrapper script called io for both the clients and server) does not emulate a terminal in the manner GNU screen does - instead, it follows the Plan 9 principle of staying close to file i/o fundamentals and provides multiplexed buffered pipes that can be attached to the three standard file descriptors used by textual environment programs such as the rc shell.
As a consequence of this, we have some new tricks, and here's the best one:
The clients attached to an iosrv can all share locally executing shells with each other in addition to using the shells on the iosrv host
The standard model of usage that corresponds to how GNU screen works is for clients to attach to an existing iosrv in an imported /srv (started with a command like io rcjul15 on the iosrv remote host ) using the io wrapper script:
io SRVNAME
This command connects to an existing /srv/SRVNAME. Once connected, additional commands will be trapped by the client. To create and attach to new shells hosted by the iosrv remote machine, use the command remote #, where '#' is a multiple of 3. The first new shell might be created with:
remote 3
followed by subsequent shells created by replacing '3' with successively larger multiples of 3 as the number of backed shells increases. (Each shell uses 3 file descriptors.) To move between shells after they have been created, attach # will connect to the Hubs of a previously created rc shell.
attach 0
will move back to the set of file descriptors beginning at Hub 0. (More on Hubs in a moment). The new trick is to start a new shell like this:
local 6
which will start a new rc shell on the CLIENT machine, but connect its i/o file descriptors to the remote iosrv HOST. The consequence is that the locally executing rc becomes shared back to the iosrv, and other clients of the remote host can attach to it using
attach 6
and the shell running on your machine is available to them exactly as a shell running on the iosrv host. In other words, iosrv really does 'serve i/o' and doesn't care what's considered the client or server from the viewpoint of the user, it just pipes data back and forth to whatever its pipe Hubs are connected to.
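The numbering used by remote, local, and attach above can be sketched as a toy model. This is Python, not iosrv code, and it rests on an assumption the post does not spell out: that a shell started at base N owns Hubs N, N+1, and N+2, in the usual stdin/stdout/stderr order. The function names are made up for illustration.

```python
# Toy model of the Hub numbering; not part of iosrv.
# Assumption: a shell started at base N owns Hubs N, N+1 and N+2,
# in stdin/stdout/stderr order.

def hubs_for_shell(base):
    """Return the three Hub numbers used by the shell attached at `base`."""
    if base < 0 or base % 3 != 0:
        raise ValueError("shell base must be a non-negative multiple of 3")
    return {"stdin": base, "stdout": base + 1, "stderr": base + 2}


def next_free_base(bases_in_use):
    """Return the smallest multiple of 3 not yet used as a shell base."""
    base = 0
    while base in bases_in_use:
        base += 3
    return base
```

With shells already at 0 and 3, next_free_base({0, 3}) gives 6, which is the number you would hand to local or remote for the next shell.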
Ok, there's that word again - Hub - the Hub is the low level data structure iosrv uses to organize connections. Following Pike's classic dictum that "data dominates", we are offering a new variant of an old abstraction, the pipe. A Hub is a pipe, but it has multiple nozzles on each end. It accepts multiple simultaneous readers and writers and allows them to connect and disconnect independently to and from the flow of data in real time. The core iosrv knows nothing of rc, or shells, it simply manages a set of file descriptors connected to Hubs. A standard shell has 3 open file descriptors, so 3 Hubs allow an arbitrary number of clients to all make use of the same shell. Adding a new shell simply means creating 3 new Hubs and then starting an rc using file descriptors managed by that iosrv. This is why the standard attach, remote, and local commands always take a multiple of 3 as their numeric parameter.
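For readers who think better in code, here is a rough sketch of the "pipe with multiple nozzles" idea. It is a toy in Python, not the iosrv implementation: the Hub class and its method names are invented for this example, and it assumes that a reader who attaches late only sees data written after it attached, which may not match how iosrv buffers.

```python
import queue
import threading


class Hub:
    """Toy multi-reader, multi-writer pipe: each write reaches every attached reader."""

    def __init__(self):
        self._lock = threading.Lock()
        self._readers = []

    def attach(self):
        """Connect a new reader and return its private queue of incoming data."""
        q = queue.Queue()
        with self._lock:
            self._readers.append(q)
        return q

    def detach(self, q):
        """Disconnect one reader without disturbing the others."""
        with self._lock:
            self._readers.remove(q)

    def write(self, data):
        """Any number of writers may call this; the data is copied to all readers."""
        with self._lock:
            readers = list(self._readers)
        for q in readers:
            q.put(data)
```

Three of these, one each for standard input, output, and error, are enough to let any number of clients drive and watch a single shell in this model.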
On the client side, a smaller program called ioshell navigates the forest of piped file descriptors for the user, acting to connect the file descriptors of the initiating client to the Hubs provided by iosrv. It also monitors user input for command strings and then takes actions or passes them to iosrv's ctl file as appropriate.
The iosrv is completely parallel, forking off independent processes for reading and writing each connected file descriptor. It is not uncommon for a busy iosrv to be coordinating 50-100 running processes and holding double that many file descriptors. This will not however clog the host system, because all processes use QLocks and/or blocking reads to control execution, yield execution frequently, and each Hub has individually tunable process cycle sleeptime. Testing shows that even a low resource Qemu based virtual machine can easily handle multiple simultaneous clients and servers to multiple local and remote rc shells.
The iosrv also provides fine-grained control over resources for the user, although the interface to this is a work in progress. Using iosrv for screen-like functionality is its main intended purpose, but the design is deliberately open-ended. It is already possible to use iosrv as a kind of greatly expanded 'tee' for data streams and we will be trying to prototype some multinode applications that pipe data around in constant loops between them.
We have been using iosrv heavily and testing it extensively. It is not perfectly polished but we have found it to be stable, reliable, and functional in its current state. Suggestions for features and code improvements always welcome.
Look for shared public shell exports as services in the 9gridchan.org 9p service registry soon, also.
Thursday 18 June 2009 - screen functionality in a few lines of rc (2009/06/18)
This is an extension of the earlier post - this is a pair of scripts intended to be used on nodes set up similarly to those described in 'reversed remote execution' - it allows shared terminals to have persistent state in the background with the ability to be pushed onto the remote clients (which are providing rio as a service).
#! /bin/rc
# start up an rc with its i/o connected to pipes set
# this monitors and its window should be set to scroll if not backgrounded
mkdir /tmp/wpin
mkdir /tmp/wpinclone
mkdir /tmp/wpout1
mkdir /tmp/wpout2
mkdir /tmp/wpout3
bind '#|' /tmp/wpin
bind '#|' /tmp/wpinclone
bind '#|' /tmp/wpout1
bind '#|' /tmp/wpout2
bind '#|' /tmp/wpout3
window -m
rc <{tee /tmp/wpinclone/data < /tmp/wpin/data1} | tee /tmp/wpout1/data /tmp/wpout2/data /tmp/wpout3/data
#! /bin/rc
# attach to a copy of grc using wpout $1 and push to a second wsys also
mount $wsys /mnt/wsys new
bind -b /mnt/wsys /dev
read -m /tmp/wpout$1/data1 | tee /dev/cons &
read -m /dev/cons | tee /tmp/wpin/data &
cat >> /tmp/wpin/data
The usage model is probably not exactly transparent from these scripts - the first script is named grc and it creates a set of pipes and starts an rc with its input and output redirected to them. This is what creates the 'screen' like functionality - any number of processes can control the shell by writing to /tmp/wpin/data, and as many clients can read the output as there are teed output pipes. Window -m is started after binds to provide a seed window - you will probably want to start more than one additional window -m to preserve access to the pipes in the namespace.
The second script I name getgrc and it assumes that you have set $wsys to a target window system, then starts a new window on it and connects the i/o of both the remote and the local window to the specified output pipe. It should be noted that the quasi-screen functionality is independent of the idea of mounting a remote window system and using i/o from both/either location. If you want a simple screen equivalent, simply making the pipes and starting the rc with i/o redirections allows multiple clients to use it by connecting to the piped files.
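The fan-out that grc gets from tee can be shown in a few lines. This is a Python stand-in for illustration only (the function name and chunk size are arbitrary), and it makes the limitation visible: the number of independent readers is fixed by how many output sinks were created up front.

```python
import io


def tee(source, sinks, chunk_size=4096):
    """Copy everything from `source` to every sink; return the number of bytes read."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        for sink in sinks:
            sink.write(chunk)
        total += len(chunk)
    return total


# Three sinks play the role of wpout1, wpout2 and wpout3.
shell_output = io.BytesIO(b"hi fred\n")
outputs = [io.BytesIO() for _ in range(3)]
tee(shell_output, outputs)
```

Each sink ends up with its own full copy of the shell output, so one reader per sink can follow along without stealing data from the others.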
Tuesday 16 June 2009 - Reversed remote execution connections and shared sub rios (2009/06/17)
An experience I continually have with Plan 9 is discovering that everything I have ever wanted to do with computers but have never been able to effectively before can be done with about 3 lines of rc and the standard Plan 9 toolkit. Right now I am gleefully exploring what I find to be an incredibly rich and useful set of possibilities created by rio-as-fileserver. First, let me describe the setup I'm using so you can follow along - you want two Plan 9 systems for this, with at least one set up as a CPU server. A pair of Qemu VMs will work fine. Let's just assume you have a pair of CPU servers running, machines A and B, each with a graphical display, rio running, and with a standard exportfs listener on port 17007.
You are working using cpu server A, and your friend is using cpu server B. For convenience, we will describe this as if you have full privileges on each other's machines, but these tricks can also work fine between parties with more limited trust with appropriate modifications to how the filesystem exports are set up. Your goal is to use Plan 9 for some lightweight interactive resource sharing for real time communication and collaboration. Each of you opens a subrio within your main rio session. To start off, you need access to some of machine B's resources. From cpu server A, you start with:
import -c cpuB /srv /n/cpuBsrv
Now you want to write a message to your friend that will appear on his screen in a new window briefly.
wsys = /n/cpuBsrv/rio.friend.68962
window -m rc -c 'echo hi fred && sleep 5'
And up on Fred's screen on machine B pops a window saying 'hi fred' that sticks around for 5 seconds, then vanishes. How and why? Because rio is a fileserver, and the window command mounts whatever rio is specified in the $wsys. The window command is actually an rc script that can show us the principles involved for a lot of nice tricks. How about just popping up a window to talk back and forth:
mount /n/cpuBsrv/rio.friend.68962 /mnt/wsys new
bind -b /mnt/wsys /dev
cat /dev/cons &
cat >/dev/cons
And you have created a window on your friend's screen, with each of you able to see whatever the other types. This is a perfectly functional and usable way to communicate, and the ability to initiate the connection by making the window appear on demand in the 'shared' rio is an improvement over other simple equivalents like telnet or netcat based direct chat.
Another very simple and useful application is to allow your friend to run an interactive rc on your machine, with the display on his system, while you watch. In other words, he is using your machine in a manner similar to a telnet/ssh login, even though you 'pushed' the window onto his machine. The first two setup lines are the same as before:
mount /n/cpuBsrv/rio.friend.68962 /mnt/wsys new
bind -b /mnt/wsys /dev
exec rc -v < /dev/cons |tee /dev/cons
Now you can watch in real time as your friend uses your system. You can also easily invert this - and let your friend watch what you are doing. Just change the final line to:
exec rc <{tee /dev/cons} |tee /dev/cons
This changes the source of the controlling input to the shell from the remote /dev/cons to the local standard input. I believe both of these simple interaction models provide useful functionality in very direct ways. The real-time mirroring of the textual input and output in a shell window is provided without any of the usual overhead of a tool like VNC viewing, and the use of a sub rio session as a container for the shared windows makes sandboxing via namespace manipulation very easy - and it should be noted that sharing a rio service is very different from providing full cpu server functionality, as only the functionality of rio is being exposed to the remote client. These simple filesystem and i/o based operations provide fine-grained real time connectivity options with lower overhead and potentially greater security than the predominating methods.
The above commands are very simple and can certainly be improved upon - in particular the standard error is being abused and the prompt and errors will be suppressed for the remote user. I'm certain a lot of Plan 9 users out there have nifty improved versions of these tricks or similar. Using the fileserver properties of rio to create a sandboxed dynamic shared environment is a good fit for the goals of the /9/grid so we'll be posting an example of a scripted 'shared workspace' setup script and commands soon - it's a pretty clear upgrade from the current 'grid graffiti wall' to a real-time 'grid graffiti rio!'
Sunday 7 June 2009 - Planning for demo grid setup (2009/06/07)
We are trying to determine the best way to have a larger set of public services - more nodes, and a demonstration of ideas like the High Availability Grid, an attempt to make a redundant and failproof environment that still maximizes the utility of the attached resources and can be flexibly reconfigured on the fly. Most of the following are things that we have already prototyped and tested, but haven't been available through our public portals. Things we'd like to implement (perhaps not all at once):
A high availability configuration of resources using about 6-8 functional nodes divided between 2-3 physical machines.
On-demand temporary single VMs
"The proving grounds" - a combination of both of the above, but with the intention of stability and stress testing for various multimachine configurations. The only way to test 'failproofing' is to force failures and test the ability of the systems to recover.
We've been doing some testing of VMware server in addition to our normal Qemu/9vx/p9p based setups. We prefer free open source personally, but some people prefer VMware to Qemu so we may begin providing VMware .vmdk images as well as Qemu qcow2s. For anyone who wants to try using VMware now, it is possible to use qemu-img convert to make a vmdk from a .qcow2, but there were a few additional adjustments we needed to make for best results, notably recompiling the included pccpuf kernel.
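As a rough sketch of the conversion step only, the qemu-img invocation can be assembled like this. The helper name and the output file name g9.vmdk are just examples, and this does not cover the kernel recompile or the other adjustments mentioned above.

```python
def vmdk_convert_command(qcow2_path, vmdk_path):
    """Build the qemu-img argument list that converts a qcow2 image to a vmdk."""
    return [
        "qemu-img", "convert",
        "-f", "qcow2",   # format of the input image
        "-O", "vmdk",    # format of the output image
        qcow2_path,
        vmdk_path,
    ]


# Example file names; substitute your own image paths.
command = vmdk_convert_command("g9.qcow2.img", "g9.vmdk")
print(" ".join(command))
```

The list can be passed to subprocess.run(command, check=True) on a host with qemu-img installed, or simply typed at a shell prompt.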
Wednesday 3 June 2009 - Documentation added for new tools (2009/06/03)
Well, some documentation for some of the new tools at least, but it's a start. Writing documentation is hard. The organization of knowledge is the tricky thing. The balance of conceptual explanation and task-oriented how-to is hard to get right. The docs in the docs directory still need to be supplemented with some specific task-oriented recipes for doing things like making a fully customized local grid, which requires a bit of coordination of action at the host os level and the VM interiors. As a warmup, I'll try to summarize the process here:
The base system image is designed to serve as the seed for a multisystem grid. It includes a variety of /cfg directories to use as base roles for new systems, and the confighelper script is designed to help do customization quickly. If you want to create a personalized local grid of machines from the base image, here is how to do it:
Test the base image in its default configuration to verify functionality before customizing.
login as bootes in cpu server mode and start up the confighelper
choose to (r)eset machine key and passwords, and answer the prompts. Your chosen authdom is particularly essential - it can be your domain if you have one, or just an arbitrary name for 'your grid turf'.
reboot the vm to cpu server mode and answer the prompts to create the new machine key. You only get one chance to set bootes' password here, so be careful. Then run auth/changeuser bootes and enter the same password you just did, and then run auth/changeuser for your other users and set their new passwords. Run auth/changeuser -n for each user also to update the netkeys.
at this point your new auth setup should be complete and you should be able to drawterm/cpu in, 9fs if the port 564 listener is running, use netkey auth, etc. Please test whatever functions are important to you.
this is your new seed image. If you use it as a base for cloning your additional grid systems, they will all include copies of the auth information and keys and users.
now you can manufacture as many customized systems as you want using the confighelper tools. to make a new customized system, first create a new blank qemu qcow2 for it. qemu-img create -f qcow2 newsysname.qcow2.img 1G creates a qcow2 blank disk that can expand up to 1 GB of storage. A venti backed fossil is very small so you probably won't use much of that space.
now boot qemu with the blank disk attached as -hdb. in windows for instance, you might do qemu -L . -hda g9.qcow2.img -hdb newcpu.qcow2.img -redir tcp:567::567 -redir tcp:17010::17010 -m 256, then drawterm in as bootes. Run the confighelper script and choose to (a)dd new system to grid. It will prompt you for a system name and other info and a config from /cfg to clone - it will then transform the disk on drive 2 into a customized clone of your seed system on drive 1.
continue this process to make as many customized VMs as you want. A very common customization to apply would be changing the venti server IP so you could boot additional nodes on other machines using your existing venti server.
these cloned VMs are bootable in exactly the same way as the parent images, and can make 'children' of their own.
Making a new blank qcow and booting and running the configscript takes about 1 minute once you are familiar with the process, so you can make two VMs for every computer on your LAN more or less instantaneously. Note that you might want to 'preload' your /lib/ndb/local with the new machines using confighelper's ndb updater prior to starting the cloning process, so all children have all their peers in their ndb.
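The host-side half of the recipe above can be sketched as a small command generator. This is only an illustration in Python: the function names are invented, the qemu flags are copied from the example in the steps (check them against your own Qemu version), and the confighelper part still has to be done by hand inside the VM.

```python
def blank_disk_command(sysname, size="1G"):
    """Command that creates the blank qcow2 disk for a new system."""
    return ["qemu-img", "create", "-f", "qcow2", sysname + ".qcow2.img", size]


def clone_boot_command(seed_image, new_image, memory_mb=256):
    """Command that boots the seed image with the blank disk attached as -hdb."""
    return [
        "qemu", "-L", ".",
        "-hda", seed_image,
        "-hdb", new_image,
        "-redir", "tcp:567::567",
        "-redir", "tcp:17010::17010",
        "-m", str(memory_mb),
    ]


def clone_plan(seed_image, sysnames):
    """List the host commands needed to clone the seed once per new system name."""
    plan = []
    for name in sysnames:
        plan.append(blank_disk_command(name))
        plan.append(clone_boot_command(seed_image, name + ".qcow2.img"))
    return plan


for step in clone_plan("g9.qcow2.img", ["newcpu", "newfs"]):
    print(" ".join(step))
```

After each boot command, drawterm in as bootes and run confighelper's (a)dd new system option before moving on to the next name.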
Monday 1 June 2009 - New Release Cycle begins! (2009/06/01)
Well, we've got a new version of the qemu-based grid tools up - it provides a lot of new functionality, flexibility, and hopefully ease-of-use. We haven't really started documenting everything that's going on with it, and we are going to be rolling out new services related to its design during the upcoming weeks. The image and tools themselves should start getting more frequent updates and we are hoping to leverage Venti and other Plan 9 based tools to do so.
If you already have access to a linux machine, the new base version of the image provides what we hope is pretty neat out-of-the box functionality - you extract the .tgz, cd into your new gridtools directory, and run (as root) a simple iptables script to redirect some low ports to high ports for the qemu VM to access without root privileges. The gridlord script (run as normal user inside the extracted directory, no install required) then provides a simple menu-based interface to control a full distributed plan9 system all on your local machine. Fire up a VM and let it boot in cpu server mode, then drawterm in as bootes, glenda, or gridna with password: gridpass. The VM image itself is meta-pre-configured - check out the /cfg directory, which holds relatively self-descriptive configurations for multiple machine roles. /usr/bootes/bin/rc contains confighelper -- a wrapper for a set of configuration helper scripts designed to let you quickly and easily control some administrative variables. Please note that right now this is an ad-hoc set of recipes based on the base configuration we provide. We hope to develop it into a more general purpose Plan 9 configuration utility, but it is currently not recommended for use with anything other than the 9gridchan.org preconfigured qemu image.
There are several other customizations and extras included in the image - and source code to everything, of course. /usr/grid is a 'non-login' user whose directories contain some of the grid tools. /usr/grid/src and /usr/bootes/lib have most of the additional materials, along with /usr/grid/bin/rc and /usr/grid/bin/386 - these directories are bound in during most users' profiles, except glenda. We left Glenda basically untouched from the Bell Labs setup in this image. Thanks to everyone who helped test and offered suggestions for this version of the tools, and we look forward to improving it further according to user suggestions. As of the time of this posting, the surrounding documentation and web resources are still mostly un-updated so things like the walkthrough screenshots gallery still apply to the old version of the image.
Wednesday 13 May 2009 - Self-assembling high availability grids (2009/05/13)
Currently under testing and development: the amazing "infinite rabbits from a single venti-hat" High Availability Grid. The past few weeks we've been exploring ideas triggered by the possibility of storing entire sets of machines in both virtual machine and fossil vacroot form in a set of arenas and isect data for a venti server. Before your eyes glaze over from the seeming over-complexity, let's cut to the chase, the goal for the end user.
The end product is a relatively small file distributed as a .tgz that contains a complete set of files for a venti server and a large collection of .vac files. The venti data includes the rootscores of many different machine fossils, and also entire preinstalled qemu virtual machine images that correspond to those rootscores. Using either a native or plan9port venti, the user can choose from a large number of differently configured machines to instantiate on either native hardware or as a virtual machine. The venti has data corresponding to machines ranging from an untouched default install to a large grid of dozens of machines. Due to the nature of venti, storing an arbitrarily large number of differently configured virtual machines and fossils has almost no overhead.
The system images included are configured to act as an integrated plan 9 network with a few tricks designed to provide a high availability 'cloud' environment out of the resources available. This is the system employed on the 9gridchan version of the grid which will be opened for public use soon. Our current setup uses one master and one backup venti, three fossil file servers, three cpu servers, and an arbitrary number of terminals. The creation of a single unified environment for the user along with failsafe reliability is accomplished via simple scripted filesystem imports and binds.
Any of the 3 fossils can boot backed by either of the two ventis, which use venti/rdarena and venti/wrarena to make scores available on both. Each CPU server boots with a TCP root from a different fossil. However, all the fossils are basically identical, and when the user logs into any cpu server, the other fossils are imported with 9fs, and the primary fossil's $home is bound over the boot $home by the cpu. A simple backgrounded script then mirrors the data from the active $user to the backup fossils.
The user interaction model is to dial each of the three cpu servers from a different window on their terminal. Each window will behave identically and binds $home from the same primary fossil, but is using a different CPU with a different tcp root and venti backup. Once the user has dialed in their three windows, they can work freely in any of them and see the same file data. If that fossil fails, the backup fossils, no more than 5 minutes out of date, can be bound to $home. A standard usage model is to run rio on one cpu, acme on another, and rc shell only on the third. The user experience should be identical to working with a single all-in-one machine with the difference of enhanced reliability and the increased performance of truly independent multiple CPUs.
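The mirror-and-failover idea can be modeled with ordinary directories standing in for the fossils. This Python sketch is not the grid's actual script (that is done with rc, imports, and binds); the function names are hypothetical, and "reachable" is simplified to "the directory exists".

```python
import os
import shutil


def mirror_once(primary_home, backup_homes):
    """Copy files that are missing or older in each backup; return how many were copied."""
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(primary_home):
        rel = os.path.relpath(dirpath, primary_home)
        for backup in backup_homes:
            target_dir = os.path.normpath(os.path.join(backup, rel))
            os.makedirs(target_dir, exist_ok=True)
            for name in filenames:
                src = os.path.join(dirpath, name)
                dst = os.path.join(target_dir, name)
                if not os.path.exists(dst) or os.path.getmtime(dst) < os.path.getmtime(src):
                    shutil.copy2(src, dst)
                    copied += 1
    return copied


def choose_home(primary_home, backup_homes):
    """Return the first home directory that is currently reachable."""
    for candidate in [primary_home] + list(backup_homes):
        if os.path.isdir(candidate):
            return candidate
    raise RuntimeError("no fossil home is reachable")
```

Calling mirror_once in a background loop every 300 seconds would give backups that are at most about 5 minutes stale, and choose_home plays the part of rebinding $home when the primary goes away.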
A public release of a first version of these tools along with some updates to the g/scripts and preconfigured qemu images should be arriving soon. Feel free to stop in #plan9chan on irc.freenode.net if you want to explore this system in its development state. Standard public resources such as omni are unaffected by the current development environment but will become integrated at the time of an initial public distribution.
Apr 12 2009 - Just checking in (2009/04/12)
No big news on the grid past few weeks - had a service interruption for a few days, but it was resolved. Shockingly, the brief absence of 9gridchan services did not cause any significant rioting or result in the overthrow of any governments. We are greatly relieved.
Feb 19 2009 - Random update (2009/02/19)
Random blog update to make sure people know the project is healthy and ongoing. An actual public announcement of some kind on a known Plan 9 forum such as 9fans has been something we've been waiting a long time to do.
Feb 02 2009 - Explanatory captions for the walkthroughs. (2009/02/02)
Thanks to the ceaseless flow of innovation and progress in free and open source software, the walkthrough screenshots tutorial galleries now feature captions and better navigation! Additionally, we are working to provide new services and content on the dynamic 9p registry system. Nothing too thrilling, but as an example, the omniguest user is now exporting /bin. After connecting to that service, binding the imported directory over the local /bin in an rc window allows testing of any of the installed software on the public cpu server without actually logging in.
Jan 31 2009 - Setup screenshots gallery (2009/01/31)
Created a couple galleries with screenshots of starting up the image, connecting with drawterm, and running configscript. Hopefully having a visual reference will make it easier for people trying plan 9 for the first time to see if they are seeing what they are supposed to see. More screenshot galleries will be added to demonstrate making use of the g/script toolkit and 9gridchan.org resources.
Jan 30 2009 - post #2: more on 1.1 (2009/01/30)
The preinstalled qemu node image has several purposes. The most important is to provide an easy-to-use form of Plan 9 for every platform. Qemu runs on everything and a preinstalled image means that once the download is complete, the user can be inside a fully functional Plan 9 environment instantly. Another goal is to provide easy access to the key component of Plan 9: a CPU server. Another is to provide a user-driven, decentralized platform for grid computing research and development. The latest version of the image makes bootup easier with a single menu for boot choices. There are more variant versions of plan 9 software and some customizations like the default font, taken from quanstro's subpixel fonts. A small sampler collection of .vac scores for downloading additional docs and collections of Plan 9 pictures from venti.9gridchan.org is also included. Thanks as always to the Plan 9 contributors whose work is included such as fgb (contrib and abaco), andrey (irc7 and links port), and others.
Jan 30 2009 - Development blog post #1 (2009/01/30)
A big day of upgrades - along with release 1.1 of the /9/grid qemu image and toolkit, we've upgraded www.9gridchan.org with Uriel's latest Werc + Soul9's image gallery application plugin. Thanks to them for their development efforts. The new 1.1 image features more stuff and a smaller download and installed size, along with default user accounts for instant drawterm/cpu access for testing. (CHANGE THOSE PASSWORDS before allowing access from external networks, everyone!)
Ongoing projects include adding scripts for managing venti servers and databases of .vac files to the g/toolkit and a grid game based on namespace manipulation. Future blog posts will also provide some tips and tricks for using grid resources and tools.