OS Holes

OS Hacking and more.

5.04.2011

Revival: Starting from Embedded Systems

I haven't posted anything in quite a while, but as I wait for stuff to compile here, I think sharing a few observations might not be a bad idea.

I'm currently going through the process of compiling TinyOS and the appropriate ARM toolchain for a Cortex-M0 board I got my hands on. Specifically, I've got an NXP board with an LPC1114, and if you were wondering, it was free. To give a little perspective here, the LPC1114 is an MCU with 32K of flash and 8K of RAM...essentially the same resource class as an MSP430 or ATmega328. The difference here is that this is ARM, so we get 60MHz and a 32-bit core. This doesn't help with the resource constraints, but it should allow for some performance niceties.

Out of the box, the NXP kit has some basic libraries and an IDE integrated with GCC plus proprietary flashing tools. The bundled examples are single-task I/O demos, written in C. Pretty basic, and really these are things I'd be far happier doing on an ATmega328 with a lazy-man Arduino bootloader. I guess this is what they expect applications to look like; nice prototyping or something.

Thing is, this hardware is suited to much more sophisticated applications: it supports I/O at rates the ATmega could only dream of, while providing enough clock cycles and efficiency to do more than one task at once. The thing screams "multi-tasked OS" at me. The step up here is exactly the kind needed to support efficient, concurrent handling of multiple sensors and coordination of outputs (a la mechanical devices--if you don't already see the robotic tendencies here...)

The other neat thing we should be able to do on Cortex devices of this class is support some greater level of abstraction. If we're going to be building multi-component systems, the benefits of thinking about those components at a high level are pretty clear. This should be a language/framework feature, one we have no excuse to avoid.

Ok, so I'm looking to build a parallelized, modular system on this Cortex. I can think of a hundred ways to do this--throw Linux at it and run, slap a JVM on the metal, zoom around in a comfy RTOS. Only one problem...memory footprint. 32K won't fit any sort of JVM, definitely not Linux, and even most respectable RTOSes need more space. Not a happy situation.

Well, out of the box, the folks at NXP have given me an example FreeRTOS implementation on the LPC1114, so that's something. It's pretty straightforward to build and deploy in their tools (which run on Linux and Windows, but not on my Mac--nbd) and the debugger works as it ought to. All well and good here: I can spawn tasks from C function pointers, and they get scheduled. I/O runs. Unfortunately this isn't moving me toward a parallelized system, on two fronts. First, this mode of operation is conducive to spawning a bunch of tasks related to the same component of a project, but not to separate systems. That seems like a trivial distinction from a code perspective, but it's actually a pretty serious abstraction barrier: if I have to mix subsystem code, abstraction suffers, period. Speaking of abstractions, this code is no more high-level than the C that drives the Blinky demo--I'm still calling out to and polling GPIO pins to control a device.

Ugh.

A step in a different direction, then, is to play with the model used in wireless sensor networks--the TinyOS platform and its associated nesC language. Ironically, TinyOS was designed to run on the very ATmegas and MSP430s the LPC is trying to displace, but porting it to the Cortex should give some neat advantages.
The biggest is that on the slower 8- and 16-bit MCUs, even with TinyOS, motes would serve a single purpose or monitor a single system in the physical world. There wasn't really room to be switching around--the most switching you'd do would be between radio operations and sensing. In the Cortex world, we should be able to manage a handful of different real-world systems without introducing serious I/O or processing latency.

So why try TinyOS?

TinyOS's model for application design is pretty neat: it revolves around locally-scoped modules, and wires that link their interfaces together. Over these wires, modules send each other events, and a great amount of logic executes asynchronously, on demand. The system is conducive to reactivity, which is essentially what all robots and sensors do in the physical world. By using modules that operate on their own, TinyOS establishes a neat sort of *implicit* parallelism--nothing forces you to spell out which operations run in series and which in parallel; the development paradigm simply pushes modules toward acting as parallel streams. To be clear: you can have modules for different sensors, for example, and they'll notify controller logic independently of each other--in parallel. This also parallelizes the handling logic for real-world events, accommodating them better. It's neat. Or a starting point for something better, at least ;)

So I'm trying to set everything up to make TinyOS work on this little LPC. I'm starting with the work from this project, which should set up the toolchain nicely. I'll build using the TinyOS system and then use the NXP flasher to move binaries around. For reference, I'm working out of an Ubuntu 11.04 VM, 64-bit, fresh from today.

HowTo Notes/Progress:

Sorry to say it, but TinyOS-on-Cortex is crappily documented. Cortex-M3 ports of TinyOS have been made and rolled into the main TinyOS 2.x stream, leaving the M0 a bastard little brother. Alas.

From my fresh VM, I installed the flasher for the NXP board (part of their LPCXpresso kit, whatever, irrelevant). Then I pulled the scripts from Google Code to set up the tinyos-cortex stuff. There are a few dependencies that aren't stated as part of this--I needed to add the 32-bit support libraries (ia32-libs), Java (sun-java6-jdk), autoconf/automake, and a symlink for the 32-bit C++ stdlib ("ln -s /usr/lib32/libstdc++.so.6 /usr/lib32/libstdc++.so").
^^ simple things you may already have. Just make sure they're there.
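Consolidated, on a fresh 11.04 VM, that amounts to something like:

sudo apt-get install ia32-libs sun-java6-jdk autoconf automake
sudo ln -s /usr/lib32/libstdc++.so.6 /usr/lib32/libstdc++.so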
Given these things, I pulled the scripts into my /opt directory as specified here.
I made a build directory within the tinyos-cortex directory as directed. Before running the script, though, there is a fix to be made. The fetch() function is slightly broken: in main.subr, wget should have a [capital] -O flag rather than a lowercase -o. Subtle.
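If you'd rather script that fix, something like this should do it (the sed pattern is a guess at the exact text in main.subr, so eyeball the result):

sed -i 's/wget -o/wget -O/' main.subr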
Then I ran the tinyos toolchain script, followed by the cortex toolchain script. You have to watch really, really carefully to see if stuff breaks, because failures are non-obvious--the scripts don't exit with an error. It takes a while, too.
After that, you have to add the binaries to your path:

export PATH=/opt/tinyos/bin/:/opt/cortex/bin/:$PATH

and get some symlinks in place

ln -s /opt/tinyos/build/tinyos-2.x /opt/tinyos/tinyos-2.x

Once that's in place, we should be able to start building stuff. And here's the first problem: we don't actually have a Cortex-M0 target. I'm hacking around it by using the M3 target, hoping the M0 compiler is the one getting invoked...otherwise I might need to define a new set of rules here...
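A quick sanity check for that hope (the platform name is a placeholder for whatever the tinyos-cortex tree calls its M3 target):

make [m3 platform] 2>&1 | grep -o 'mcpu=[a-z0-9-]*'

If that prints mcpu=cortex-m0, the right compiler is in play; mcpu=cortex-m3 means I really do need those new rules.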

11.15.2008

Bundle: OpenSolaris Live USB

Up until now I've posted all the instructions and ideas behind my OpenSolaris Live USB system, the sum of which has served as mostly informative reading for all but the most intrepid...
Attached here is the kit of all the tools you need to make a persistent OpenSolaris Live USB drive, from start to finish.
In the zip file are the main conversion script, which will take a standard OpenSolaris install and make it live, and an additional tool for use after you've converted your system.
This kit works with both 2008.05 and 2008.11, and should work with Nevada builds as well.

Now, the easiest howto:

You'll need:
1x OpenSolaris Live CD (Indiana Preferred)
1x USB stick, at least 4GB suggested.
1x Conversion kit convert.zip

1. Boot from the live CD, with the USB drive plugged in. Install OpenSolaris to the USB drive.
2. Reboot and boot from the USB drive for the first time. After logging in, shutdown.
3. Boot from the live CD again.
4. Unzip the conversion kit.
5. Open zfsconverter with a text editor, and change the value of "protodir" to the directory where you extracted the zip, plus /convert/proto. Everything else should be OK for a normal setup.
6. In a superuser shell, run zfsconverter:


bash> zpool import -f rpool
bash> ./zfsconverter
bash> zpool export rpool


7. Reboot, and boot from the usb drive.
That should be all you need. After converting the USB drive to a live USB drive, you can install any packages you like. If a package you install adds kernel modules or boot files, you'll need to run the updatemicroroot script after installation.
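For example (the package name is a placeholder; updatemicroroot lives wherever you unpacked the kit):

bash> pkg install [package that adds drivers]
bash> ./updatemicroroot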

OpenSolaris now can be taken in your pocket and used wherever you roam...

10.27.2008

Making a dd'able OpenSolaris USB image

Recently I was sitting and thinking [deeply..zzz] about how people who don't already have OpenSolaris can try out the OpenSolaris USB-based installer distribution (the one made by distroconstructor & co.). You need to run the usbcopy script just to get the USB image copied to your USB stick, and that script uses all kinds of crazy Solaris-specific features...
Man, I really wanna be able to create this USB distro from linux...
So, instead of trying to rewrite the script and its oodles of sophisticated functions, why not just run it once, and create a disk image of its product--a dd'able OpenSolaris USB image!
That's exactly what I did. Basically, if you want to be able to create OpenSolaris USB distros from linux, all you need to do is get your hands on an image created in this method:
In Solaris, run usbcopy on [name of osol distro].usb, with a USB stick target that is ~1GB in size.
Then, once you've created that stick, dd if=[device path for stick] of=image.img bs=4096 , and you've created the new disk image, image.img. image.img has GRUB and the MBR bits necessary to boot (which the .usb lacks), and can then be dd'ed to any USB stick from any operating system. So only one person needs to build one dd'able OpenSolaris image, and share it.
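As a sketch, with the same placeholders:

In Solaris:
$ usbcopy [name of osol distro].usb
$ dd if=[device path for stick] of=image.img bs=4096

Then, from any OS, onto any fresh stick:
$ dd if=image.img of=[device path for stick] bs=4096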
And the world can mass produce usb install kits for OpenSolaris regardless of a user's current operating system.

Note: if you're looking to carry around OpenSolaris in your pocket, I strongly suggest that you use my persistent USB live boot instead of the above USB distro. See the concept for it here, and then pick up my automated installer. The difference is simple: my version behaves like the version you install on a hard disk, which means you can install applications and keep all your data (oh, and you get ZFS too), where the other version is an installer tool.

I'm working on a less kludgy way to do this, involving modifying usbcopy to operate on disk images as opposed to USB sticks...that would make life nicer...we'll see...

10.15.2008

Workaround for the 1TB disk in OpenSolaris

Recently it came to my attention that OpenSolaris has a bug by which it does not support disks 1TB or greater in size: a label-reading problem causes Solaris to read the disk geometry wrong. This occurs for disks with the SMI labeling scheme, and therefore renders those disks incompatible with OpenSolaris.
I've found a workaround for this problem:
It turns out that EFI labeling doesn't have the same issue in OpenSolaris, i.e. the disks work fine under EFI labeling. So if we use EFI labeling instead of SMI, we can get Solaris to recognize the disk correctly. The only problem with this is that you won't be able to boot off that disk with EFI labeling and a standard OpenSolaris boot (ZFS or UFS).
Got a fix for that too!
Following the steps for my OpenSolaris live boot, you can create a small USB drive/flash memory device with just the kernel, the microroot image, and GRUB, and boot from that drive. That boot will recognize the huge disk and mount it as the root filesystem, allowing the boot to continue normally from there. So there's a bit of hardware involved this way, but you really only need a 128MB flash drive, and you can get one of those for less than a dollar (US).
So, howto?
Follow my procedure for the ZFS live boot, but focus on the creation/alteration of the microroot. Copy the microroot and kernel to the USB drive (UFS or even FAT) and install GRUB. Make sure to have a working menu.lst (it just needs title, kernel, and module -- no ZFS here).
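A minimal entry, modeled on the one from my live USB posts (no bootfs line, since there's no ZFS on this little drive):

title OpenSolaris Boot Drive
kernel$ /platform/i86pc/kernel/$ISADIR/unix
module /boot/x86.microroot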
Then, using a Mac or FreeBSD (or other ZFS-capable OS), format the big disk with an EFI label and ZFS. You can then follow my procedure for the rapid upgrade of an OpenSolaris install to get OpenSolaris onto the big disk. I know this is a bit roundabout, but you wanted that 1-2TB disk, right?
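From a Mac, that might look like the following (the device name is an example; handing zpool the whole disk should leave it EFI-labeled, and the microroot init expects the pool to be named rpool):

$ zpool create rpool disk2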
Now you should be able to boot from the USB drive, and it should then boot and chroot to your super drive...DONE~!

10.05.2008

Hypervisor To Go

Over the last few months I've been able to assemble a Hypervisor "To Go" based on OpenSolaris with either Xen (Solaris xVM) or VirtualBox. There are still a few outstanding issues, but I have completed a fully functional prototype (say, a proof-of-concept) for this project. This presentation, which I gave at Sun Labs in August 2008, discusses the purpose, design, and implementation of Hypervisor "To Go".

Hypervisor "To Go" was an idea that first occurred to me when I visited Sun Labs for their Open House event last year. It was really a simple concept: what if you could have a hypervisor installation on a USB stick, incredibly portable and independent of the host hardware, from which you could boot your favorite operating system with any/all of your applications and customizations?

This concept evolved into something a bit more concrete, with a couple of key components. The "system" could be based on what is arguably the most portable device around -- a USB flash drive. A bootable USB drive can be taken anywhere and used in pretty much any system, so we'd be able to set up the user's world there in a portable fashion. But now, the fun part: there are a limited number of OS distributions that actually support Live USB boot with data persistence (i.e. preserve the modifications and new files across reboots). So in order to let the user run his or her favorite OS, in combination with his or her choice of additional software, on arbitrary hardware, virtualization (a hypervisor) would be the answer. We'd package a hypervisor with a stripped-down base OS as a bootable image for a USB flash drive. The user can boot such an image anywhere, and then choose his or her guest OS and additional applications.

What we ultimately want is a situation where the user can plug his fully configured USB drive into any machine, boot from it, and have a working environment out of the box. The user should be able to copy the preconfigured setup to the machine with the click of a button, deploy his "appliance" instantly, and make use of local hardware resources such as disks and networking. Since the "appliance" is a virtual machine, it can be run with the hypervisor regardless of the host. Such a setup has real-world utility in a data center environment.

Perhaps the greatest hurdle we encountered was creating a USB-bootable distribution of OpenSolaris, which I have detailed in earlier posts. In short, the OpenSolaris boot process is currently tied to the identity of the boot media, which doesn't change on a single system. Across many systems, however, the BIOS or a disk controller may identify a single disk in a multitude of ways, which constrains interoperability. To get a USB live boot, we needed to change the boot process itself.

Our changes to the boot process allow us to boot from a USB flash drive (or a similar device) plugged into a random slot (i.e. without preconceived knowledge of the bootpath), discover storage devices, let ZFS discover and mount the root pools dynamically, and finally chroot to the normally configured root filesystem and run the real /sbin/init. The ramdisk is used as a trampoline only, but having this trampoline allows us to discover the root pool dynamically, without having the Solaris device name hardcoded in menu.lst or in a disk label.

Here we see the code from our boot discovery process. All the essential changes are confined to the /sbin/init script on the ramdisk (miniroot). The disk file system contains the result of a normal installation from the LiveCD, without any additional tricks.

Here we see the OpenSolaris kernel code that normally mounts the rootfs and the associated virtual filesystems.

This is our script, which mimics the original root-mount process for a USB drive.

Finally, the real root device is located and init is started.

The whole concept of booting to a ramdisk first, and then chrooting to a larger file system backed by stable storage, resembles a Linux boot. There are a few differences in the Solaris case, though: some we had to work around, others we tried to exploit to our benefit.

1. The key remaining problem is the size of the ramdisk image. GRUB uses BIOS calls to load it, and many BIOSes are inefficient. It is essential to detect the optimal block size (typically 8KB, sometimes 4KB or 16KB) and use that size; 512B reads are very slow. This is a major issue when the ramdisk is dozens or hundreds of MB.

2. With a Linux ramdisk, the kernel unpacks a cpio archive directly to the ramdisk, and these files are deleted before the real root is chrooted. There is no such option in Solaris, so all the memory used for the ramdisk is gone forever.

3. Most of the ramdisk space is taken by kernel modules. David Minor suggested splitting the ramdisk image into separate pieces. We also looked into detecting the necessary modules and stripping the rest (sound, networking, etc.). We couldn't find anybody who knows/remembers the complete philosophy of how Solaris determines what to load, and without that deeper understanding it is difficult to estimate how much space this could save.

4. There are issues running Solaris in a chrooted environment (see the sketch after this list):
- The default module search path is /kernel, /usr/kernel, /platform, /usr/platform. If you have more modules in the real root file system than on the ramdisk, you have to either loopback mount /newroot/[dir] on /[dir] before chrooting, or alter the kernel search path.
- Even though the content of /etc/mnttab is generated by the kernel, it does not reflect the fact that in a chrooted environment /newroot/proc is /proc. (Workaround: run "chroot /newroot mount" instead of "mount" when mimicking vfs_mountroot in the /sbin/init script.) This may also be the case with the share command.
- devicefs, used for /devices, cannot be remounted a second time. (Workaround: loopback mount as /newroot/devices.)

5. There are advantages to using a ramdisk with Solaris:
- ZFS does not require additional tools to reconstruct its view of storage devices, and the ZFS-enabled GRUB does not depend on the BIOS's view of which disk drive is bootable.
- Solaris can use "reconfigure" during boot, rather than information injected by an installer or Live Boot scripts, to discover all the devices.

6. Other thoughts:
- Debugging stuff in the /sbin/init script is pretty nasty, because the console is normally configured by the real init program.
- If you miss something needed by the paravirtualized kernel, the xpv module does not load, but output goes by default to the paravirtualized console -- i.e. you have a completely black screen.
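For reference, here's roughly what the point-4 workarounds look like inside the ramdisk /sbin/init (a sketch; /newroot is where the real root gets mounted, and the full script appears in my earlier live-boot posts):

# make the real root's extra modules visible before we chroot
mount -F lofs /newroot/kernel /kernel
# mount proc and friends from inside the chroot so /etc/mnttab comes out right
chroot /newroot mount -F proc /proc /proc
# devicefs can't be mounted a second time; loopback the existing one instead
mount -F lofs /devices /newroot/devices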


9.12.2008

OpenSolaris Kernel Modules: Load'em later

One of the biggest problems with the current OpenSolaris ZFS live boot is the size of the "minimal" microroot image used to boot the system to a usable state (i.e. to mount the root filesystem). It can be brought down to 170MB decompressed, which is not too bad, except:
Loading the ramdisk image with GRUB is incredibly slow over USB,
and
when we chroot to the ZFS root filesystem, we abandon the ramdisk, which still hogs that much RAM and does no good.

A close examination of the microroot reveals that it is primarily filled by the contents of /kernel/* and /platform/*, especially the files in */drv/. These are kernel modules and device drivers, and not all of them are necessary for finding the rootfs. The questions are: which of these are actually necessary for the primitive boot process? And how can we convince the kernel to fetch the rest of the modules from the ZFS root?
That would allow us to store only boot-essential modules in the ramdisk and keep it small.

And I have answers:
OpenSolaris' kernel loads modules and drivers on demand, as they are called for by system services or applications. That means it will take only the modules it needs, leaving others untouched until late in the boot. Moreover, OpenSolaris loads kernel modules from three distinct sources: /platform/kernel/, /kernel/, and /usr/kernel/. It is important to note that the modules are loaded through the filesystem, which means the kernel also follows filesystem links such as loopback mounts (lofs). Finally, there is a distinct set of modules that the kernel loads before reaching /sbin/init, and these are for the most part boot-essential.

Which means,
We CAN separate "non-essential" kernel modules from the ramdisk image loaded in a live USB boot, and let the kernel load them from the ZFS filesystem later in the boot, when it is ready and the modules are actually needed. Telling the kernel to load modules from the ZFS filesystem would be easy, except that the kernel still thinks the root filesystem is the ramdisk once we are chrooted. In other words, userland thinks the ZFS filesystem is /, and kernel-land thinks the ramdisk is /. This means the kernel will search in [ramdisk]/kernel for modules, rather than our /kernel, when it wants them. However, by experimentation I discovered that the kernel will follow loopback mounts to load modules, so to fool it into loading modules from [zfs]/kernel, all we need to do is loopback mount /mnt2/kernel on /kernel BEFORE we switch root.
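In /sbin/init terms (with the ZFS root mounted at /mnt2, as in my earlier scripts), the trick is a few lofs mounts before the chroot -- one per module source we want redirected; a sketch:

mount -F lofs /mnt2/kernel /kernel
mount -F lofs /mnt2/usr/kernel /usr/kernel
mount -F lofs /mnt2/platform /platform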

End Result:
We save ramdisk image space by storing more kernel modules on the zfs root and not loading them until we mount it.

I'll make the necessary procedure edits soon.

8.30.2008

Automated Installer for OpenSolaris Live USB

It's here...
I've created a program that will convert a standard OpenSolaris install into a portable, ZFS-based OpenSolaris live install. It automates the steps described in my previous posts for making an OpenSolaris install a live USB with ZFS, and can be applied to a generic OpenSolaris install on either USB or fixed disk.

Simply put, it creates the modified ramdisk image necessary to boot OpenSolaris regardless of its root device, and injects that image into the installation correctly.

The automated installer can be run in one of two ways. When supplied only the modified /sbin/init script, it will assemble the rest of the necessary components for live boot and install them accordingly.


Usage


This program should be run from an OpenSolaris [Indiana] LiveCD or DVD.
Your target disk must have more than 4GB of space. It may be a USB disk or otherwise.



0) Install OpenSolaris to the target disk using the installer, and boot from it for the first time. Do not move the drive to a different port or computer.
[You may skip this step for an existing install.]
1) Boot from the LiveCD and open a superuser shell.
2) Import the zfs pool that contains OpenSolaris: zpool import -f [poolname]
3) Prepare the autoinstaller:
a) Create the following directory structure (or extract it from the zip archive):
./convert
./convert/proto
./convert/proto/sbin
b) Place zfsconverter in ./convert and init (the script) in ./convert/proto/sbin/.
c) Fix the permissions: chmod 555 ./convert/proto/sbin/init && chown root:sys ./convert/proto/sbin/init;
also, chmod +x ./convert/zfsconverter
d) Check that the "protodir" in zfsconverter is set to /[path to .]/convert/proto
4) Execute zfsconverter


Note that if you wish to customize your installation with special components (scripts, kernel modules, etc.), you may place those files inside the protodir, from where they will be copied into the microroot image.


---
The Script


#!/bin/bash

#########################################################
# Author: Anand Gupta
# 8/29/2008 - Sun Microsystems Labs
# zfsconverter
#########################################################

PATH=/bin/:/sbin/:/usr/bin/:/usr/sbin/

##Target settings
bootfs=rpool/ROOT/opensolaris
ramdisk=/media/OpenSolaris-2008-05/boot
zfsroot=/rpool

##Options
tempdir=/rpool

protodir=/root/convert/proto

##populate generic protodir from LiveCD
popproto()
{
mkdir -p ${protodir}/kernel
mkdir -p ${protodir}/mnt2
##copy the non-essential kernel modules from the LiveCD into the proto /kernel
(cd /usr/kernel/ && tar -cf - .) | (cd ${protodir}/kernel/ && tar -xf -)

##copy the filesystem-type support binaries needed before the real root is mounted
for fs in cachefs ctfs fd dev lofs mntfs proc sharefs tmpfs zfs
do
mkdir -p ${protodir}/usr/lib/fs/${fs}
(cd /usr/lib/fs/${fs} && tar -cf - .) | (cd ${protodir}/usr/lib/fs/${fs} && tar -xf -)
done

# sh ${protodir}/additions.sh
}

##Preparation of filesystems
preparemounts()
{
mkdir /mntzfs
mount -F zfs ${bootfs} /mntzfs
mkdir /mntram
gzip -dc ${ramdisk}/x86.microroot > ${tempdir}/microroot.img
mount `lofiadm -a ${tempdir}/microroot.img` /mntram
}

##Unmount filesystems
umountall()
{
umount /mntram
lofiadm -d ${tempdir}/microroot.img
gzip -c1 ${tempdir}/microroot.img > ${tempdir}/x86.microroot
rm -f /mntzfs/boot/x86.microroot
mv ${tempdir}/x86.microroot /mntzfs/boot/
umount /mntzfs
echo "scrub zpool...if any errors arise, fix them and rerun this program"
zpool status
echo "if there are no errors, the installation process has suceeded"
}

##Fix Menu.lst
fixmenulst(){
echo "####"
echo "title Opensolaris Portable" >> ${zfsroot}/boot/grub/menu.lst
echo "bootfs ${bootfs}" >> ${zfsroot}/boot/grub/menu.lst
echo "kernel$ /platform/i86pc/kernel/\$ISADIR/unix " >> ${zfsroot}/boot/grub/menu.lst
echo "module /boot/x86.microroot " >> ${zfsroot}/boot/grub/menu.lst
echo "####"
}

echo Initializing installation, ensure that all prerequisites are fulfilled
echo "mounting filesystems"
preparemounts

echo "preparing protodir"
popproto

echo "copying files"
(cd ${protodir} && tar -cf - .) | (cd /mntram && tar -xf -)

echo "umounting filesystems"
umountall

echo "fixing menu.lst"
fixmenulst

echo "Done...Reboot to continue"



I'll attach the self-contained zip and an example microroot soon.

8.22.2008

Rapid Upgrade for OpenSolaris with ZFS

When upgrading an OpenSolaris install, we would like to preserve personal files and information with as little hassle as possible. This is particularly easy when we have multiple disks among which to separate OS and personalization. Unfortunately, it is not trivial to upgrade a single disk install of OpenSolaris while preserving one's files. I think I can change this though.

Currently, there are a limited number of options for updating an existing OpenSolaris install.
We can reinstall with a newer version of the OS, using the installer. The problem with this is that the installer requires you to reformat and create the zfs pool and volume to be installed on. It is possible to have another partition and/or zfs pool on that disk to preserve your files, but this is contrary to the purpose of ZFS...and if you have 50GB of files, you don't want to spend hours backing them up and restoring.
We can use the package manager to upgrade the distribution and associated packages, but this can be a painfully slow process, downloading and installing thousands of packages...
We would use LiveUpgrade, except for the fact that a ZFS version doesn't exist atm...

So...we try a different approach, tailored to ZFS...
Simply put, we will install the system on another drive, then clone the boot environment created by the installer to our host/favorite machine via zfs snapshots...Lost? Here are the instructions; it's easier than it sounds:

You will need,
-A USB hard disk or external drive big enough to fit your Solaris install (~4GB or more). Remember that you can add packages later, so a small drive may be OK.
-Solaris install media or a Jumpstart server
-Your target disk (what you want to upgrade), with more than enough space for a new install (>4GB)

1) With the external disk plugged in, boot from your install media. You may want to disconnect your working drive for good measure.

2) Use your installer to install to the external disk. DO NOT REBOOT when it finishes, or at least don't boot off USB.

3) Reconnect your target drive, and boot from a LiveCD distribution. We'll need to transfer the new boot environment to the old drive without having either running.

4) In a superuser shell (su):
There are two "rpool"s connected, so importing the correct zfs pool will be interesting. We need to find the vdev associated with the target drive so we can find its UUID.

$ format -e



This will print out the disk devices present. Find the cXtXdX associated with your target drive.


$ zpool import



This will print out the zfs pools present. Find the rpool that contains the cXtXdX you noted above. Note its UUID (a long string of numbers).


$ zpool import [UUID]



5) [Optional] Preserve your old boot environment; this will allow you to roll back the changes if necessary. This can be done as follows:


$ zfs snapshot rpool/ROOT/opensolaris@itwasworking
$ zfs send rpool/ROOT/opensolaris@itwasworking > /rpool/oldbe.snapshot



We won't be touching your export/home, because that's where your home directory is...we wanna save that, right?

6) Kill the old BE...


$ zfs destroy rpool/ROOT/opensolaris




7) Import the USB drive...
Once again, find its UUID by calling "zpool import" (see step 4). Then:


$ zpool import [UUID] xpool



This imports the pool and renames it xpool...

8) Copy over the new OpenSolaris Install


$ zfs snapshot xpool/ROOT/opensolaris@iamnew
$ zfs send xpool/ROOT/opensolaris@iamnew > /rpool/iamnew.snapshot
$ zfs recv rpool/ROOT/opensolaris@iamnew < /rpool/iamnew.snapshot



9) When that's finished...we're done. Make sure to scrub and export rpool, then eject your live media and external drive, and boot from the target disk!
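That wrap-up, concretely (watch zpool status until the scrub completes):

$ zpool scrub rpool
$ zpool status
$ zpool export rpool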

Good luck

7.22.2008

ZFS on Disk Images: The Ultimate Simulation

It appears that ZFS can be installed on a series of disk images in a manner akin to installing it on real devices...and it works flawlessly! Because disk images are virtual block devices, essentially sets of blocks in a file, we would expect them to work like normal hard disks with ZFS. Recalling that ZFS is pretty much hardware independent and managed by software frameworks, there shouldn't be a need for hard devices, and in fact there isn't, which opens the door to a host of possibilities. Now you can test ZFS raid setups and their fault tolerance virtually, without trashing any hardware. As a simulation tool, this gives system administrators an environment in which to test and debug potential troubles in their larger-scale systems, which need to remain stable at all times.

Pretty much everything worked as expected...the only changes that needed to be made were to the locations of the virtual devices I specified: instead of /dev/disksomething, I just specified the disk image path. Importing and exporting the pool was possible as long as I specified the device directory during import.

The process:
Use a disk utility to create 3 (or more) disk images;
let their location be /ZFS for the purpose of this trial.
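mkfile can stand in for a disk utility here (the sizes are arbitrary; just keep them roughly equal):

$ mkfile 256m /ZFS/zdisk1.img /ZFS/zdisk2.img /ZFS/zdisk3.img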


create pool and volume(s)
sudo -s or su, then


$ zpool create virtpool raidz /ZFS/zdisk1.img /ZFS/zdisk2.img /ZFS/zdisk3.img
$ zpool status
$ zfs create virtpool/zvol


do whatever you need to do in zvol
cp...touch...mkdir...whatever

now you can try a zfs test procedure
all zfs functions should work fine...note:
zpool import virtpool will be different because we don't want to search /dev


$ zpool import -d /ZFS/ virtpool



7.20.2008

ZFS Volume Copy and OS imaging reborn


dd has been around since the dawn of time, using its sheer muscle to move entire filesystems from disk to disk without defect. If you want to make sure nothing changes when imaging and cloning an OS, you turn to dd. The problem with dd is that it is too comprehensive: it copies the content of a disk block by block, ignorant of the actual files contained in those blocks. dd'ing to a disk or partition also means that disk is entirely erased, and the new blocks are locked in place. If you're dd'ing a disk image, the filesystem on the receiving disk stays locked to the image's original size...dd is just too crude sometimes...

Enter ZFS

ZFS is abstracted away from the physical disk blocks, dealing instead in files and directories. Further, ZFS deals in dynamically-sized volumes instead of partitions; each is literally a collection of files that is highly portable yet cloneable.
Interestingly enough, there is no built-in function to copy an entire ZFS volume from one pool to another, as one would do when preparing many disks with a particular OS/application setup...but all the pieces are there to produce one, so why not?
ZFS currently supports the creation and copying of snapshots, which on their own contain no files, but can be used as precursors to a cloned filesystem. What we will do is create a snapshot of the zfs volume in question, send it from that pool to the target pool, and restore it as a filesystem, with all the files migrated as a part of that process.

let zfpool be the source zfs pool with the volume zvol to be copied.
let rpool be the target pool where a new zfs volume will be created in the image of zvol.

First, let's snapshot zfpool/zvol; this will record its current state without copying any files.


$ zfs snapshot zfpool/zvol@20080720



Next, we need to move that snapshot from zfpool to rpool...


$ zfs send zfpool/zvol@20080720 > /tmp/temp.snapshot && zfs receive rpool/zvol < /tmp/temp.snapshot



Now if we zfs list, we'll see that there is a snapshot in both zfpool and rpool...this is quite good...and then you see the new, populated volume there as well...
And we're done...we've cloned zvol and moved it to another disk/set of disks!
So imagine we have an OpenSolaris install on a volume like zvol: it's now been imaged to another disk...ready to use!

One thing: this works pretty well with clones of volumes, but cloning the root of a pool (zfpool alone) is not as clear-cut, and I do not suggest it for exactly that reason...I'm not sure it works.
These steps are summarized below.
Note that there are segments in comments that can be used instead of the prompts for target and source, for complete automation of this task.


Script:


#!/bin/bash
sudo zpool list

#prespecify source and target (uncomment for complete automation)
#SNAPSHOT=20080714
#POOL=zfpool
#LOCATION=zvol
#TARGET=rpool
#---

echo "Enter snapshot name: "
read SNAPSHOT
echo "Enter FS/pool"
read POOL
echo "Enter volume"
read LOCATION
echo "Enter target pool/volume"
read TARGET
if ["$LOCATION" = ""]
then
FROMSNAPSHOT="$POOL@$SNAPSHOT"
else
FROMSNAPSHOT="$POOL/$LOCATION@$SNAPSHOT"
fi
#auto generate a new snapshot
sudo zfs snapshot $FROMSNAPSHOT
if ["$LOCATION" = ""]
then
TOSNAPSHOT="$TARGET@$SNAPSHOT"
else
TOSNAPSHOT="$TARGET/$LOCATION@$SNAPSHOT"
fi
sudo zfs send $SNAPSHOT > /tmp/temp.snapshot && zfs receive $TOSNAPSHOT < /tmp/temp.snapshot

7.14.2008

ZFS and OS X: Does it work?


With the recent announcement of the next iteration of Mac OS, 10.6 Snow Leopard, Apple has taken a giant leap of faith, providing full ZFS support for use in its server OS. In anticipation of this functionality, I tested the most recent ZFS support infrastructure for OSX, with some surprising results.

In 10.5, Apple provided read-only support for ZFS, allowing users to import and mount zfs pools and volumes, but not modify them. Apple developers are offered a bleeding-edge patch to enable read-write support; this patch, however, is only for 10.5.1 and will not run on any other system (newer or older). Fortunately, developers at zfs.macosforge.org have persisted in improving ZFS support on OSX. Their system certainly provides greater functionality than Apple's, but the question is, how much functionality is that?

First, the setup.
Using the latest binary (build 117), installation is straightforward and relatively simple.
I'm using 10.5.2.
I use two USB drives for a zfs raid volume...one is 4.1GB, the other 4.2GB.
I performed Sun's classic ZFS function test: does it smash?
From start to finish:
1) Create a zfs pool with raidz across the drives:


$ sudo -s
$ zpool create zfpool raidz disk[X] disk[Y]



2) Create a zfs volume inside


$ zfs create zfpool/zvol



3) Make some files


$ cd /Volumes/zfpool/zvol
$ touch qq
$ mkdir direct
$ echo " a lot of text" > gibberish.txt


Perhaps copy some larger files here as well...movies are good...You can play them while doing damage
4) Begin chaos: pull out one of the drives
5) Check the status:
You will notice that the status has not changed; zfs in Solaris has daemons updating it constantly, whereas in OSX it needs to be updated manually.


$ zpool status
$ zpool clear zfpool
$ zpool scrub zfpool
$ zpool status


Now one of the drives will show Removed and the health will be Degraded; shouldn't be a problem for raidz.
6) Damage Assessment


$ cd /Volumes/zfpool/zvol
$ ls



files should be intact
#Assess other file function: play a movie!
7) Return to calm for a moment: reconnect the other drive and


$ zpool clear zfpool
$ zpool scrub zfpool
$ zpool status



#everything should be online and healthy
#you can do the same test with the other drive
8) More Damage! Unplug both drives and plug them into different ports...
Check the status (5) and assess damage again (6).
Stuff should still work...note how the device IDs change but the health doesn't.
-----------------------------------------
All in all, OSX did pretty well with ZFS in these trials, which are analogous in many ways to the sledgehammer trials that Sun does: setup ZFS on stage and destroy the harddisks sequentially while watching the filesystem repair itself.

More specifically, the data on the ZFS volume survived the various trials, though the fact that the zpool status had to be updated manually was a pain. Still, it looks like the ZFS implementation is functional enough to be used on a small scale. ZFS in Snow Leopard will probably be safe for enterprise use, seeing as current support passes basic data safety tests: videos played fine, and the files never vanished through thick and thin.
------------------------------------------
Now for the bad news:
I was able to confuse ZFS at some point such that it would not reattach a removed volume...this is where OSX is still sticky...
When reconnected, the drive would claim to be busy, and I couldn't perform a zpool replace...
So while my data was intact, I couldn't restore the raid to its state of happiness from OSX--I had to use Solaris.
Also, the ZFS implementation for OSX is not the latest version of ZFS, so ZFS volumes created in newer Solaris builds aren't importable or usable. This was sad :(

7.02.2008

Setting up ZFS and opensolaris on USB


As we prepare the microroot to pass control to ZFS boot, we should probably prepare a functional ZFS boot volume and full opensolaris install. What we will do is quite simple--we'll use a standard ZFS boot setup as provided by 2008.05, and simply slip the microroot inside of it in such a way that GRUB can load the microroot and kernel. Recall that we'll be following the LiveCD boot model up until /sbin/init is first called, so this system must live alongside the ZFS setup.

Booting off the 2008.05 liveCD, run the installer application straight off the desktop. It will prompt you to select a volume to format and install to. Obviously this is the USB drive, and the whole drive should be used--partitioning and ZFS don't make good friends. Once this is done, it is essential that the USB drive is left in place, as a change in the device location would destroy the install as is.

MAKE SURE you have followed the steps in the previous 3 posts before doing this one.

We need to boot Solaris once from the drive to initialize the various services, but that first boot is all that should be done to the fresh installation.

Next, we need to install x86.microroot. Booting off the LiveCD once more,

we import the zfs pool (rpool is the default boot pool name and shouldn't vary between installs)
and mount the opensolaris install volume (this is legacy-mounted and should have the same name):


$ zpool import -f rpool
$ mount -F zfs rpool/ROOT/opensolaris /mnt2


Once that is done, copy the modified x86.microroot (from the previous steps) to /mnt2/boot/x86.microroot, where the ZFS volume is mounted.


$ cp /[Path to microroot]/x86.microroot /mnt2/boot/



Finally, GRUB's menu.lst needs to be modified to boot off the microroot; add an entry reading exactly as follows:


title OpenSolaris 2008.05 Live USB
bootfs rpool/ROOT/opensolaris
kernel$ /platform/i86pc/kernel/$ISADIR/unix
module /boot/x86.microroot


Make sure to umount the ZFS volume and export the pool before shutting down.
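With the mounts from above, that's:

$ umount /mnt2
$ zpool export rpool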



Previous Step

Altering /sbin/init


In x86.microroot, we need to alter /sbin/init to prepare the ZFS boot volume and call up the proper init. This appears simple, but a few adjustments need to be made to ensure that when we hand control over to init, it has all the environment characteristics it needs. After conclusive tests, I have come up with the following replacement /sbin/init, to be inserted into x86.microroot.
Basically, what we do is import the real root filesystem (ZFS) from the barebones (but functional) ramdisk image...Once that is done, we need to move/remount the special filesystems that the kernel has set up in /devices (and other places) onto our new root. This is the bulk of the script, as these needed to be tweaked in just the right way...
Before you implement this, ensure that the following modifications have been made to x86.microroot:
- /mnt2 exists inside x86.microroot AND it is empty.
Further, you may want to copy the kernel modules present in /usr/kernel to the microroot's /kernel; even though they are non-essential, they will initialize devices like /dev/pts, which are essential to the *correct* function of certain programs (e.g. terminal/gterm).
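With the microroot image mounted at /mnt2, as in the microroot-altering step, that copy would look something like:

$ (cd /usr/kernel && tar -cf - .) | (cd /mnt2/kernel && tar -xf -)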



#!/bin/bash -x

export PATH=/bin:/usr/bin:/sbin:/usr/sbin

exec >/dev/msglog 2>&1
#exec /bin/bash  # uncomment to drop to a debug shell at this point

echo "initializing devices"
echo "remounting root"
mount -F ufs -o rw,remount /devices/ramdisk:a /

echo "preparing devfsadm"
devfsadm
devfsadm -I
devfsadm -P
#/usr/lib/devfsadm/devfsadmd -v
echo "finding devices"
ls /dev/dsk

echo "Importing rpool..."
zpool import -f rpool
zfs umount /opt
zfs umount -f /export/home
zfs umount -f /export
echo "Mounting root file system..."
mount -F zfs rpool/ROOT/opensolaris /mnt2
ls /mnt2

echo "mounting /proc, dev, etc.."
chroot /mnt2 /sbin/mount -F mntfs mnttab /etc/mnttab
chroot /mnt2 /sbin/mount -F proc /proc /proc

chroot /mnt2 /sbin/mount -F tmpfs /etc/svc/volatile /etc/svc/volatile

mount -F lofs /devices /mnt2/devices
PATH=/mnt2/bin:/mnt2/usr/bin:/mnt2/sbin:/mnt2/usr/sbin
(cd /dev && tar -cf - .) | (cd /mnt2/dev && tar -xf -)
PATH=/bin:/usr/bin:/sbin:/usr/sbin
chroot /mnt2 /sbin/mount -F dev /dev /dev

chroot /mnt2 /sbin/mount -F ctfs ctfs /system/contract
chroot /mnt2 /sbin/mount -F objfs objfs /system/object

chroot /mnt2 /sbin/mount -F sharefs sharefs /etc/dfs/sharetab
echo "autopush and soconfig..." >/dev/msglog
chroot /mnt2 /sbin/autopush -f /etc/iu.ap
chroot /mnt2 /sbin/soconfig -f /etc/sock2path

#/bin/bash
echo "Executing chroot /sbin/init"
exec chroot /mnt2 /sbin/init "$@"



Save this as /sbin/init. Make sure it has the same permissions as the /sbin/init originally had. You may need to change the owner as well:
in a superuser shell (su)


$ ls -l /sbin/init
-r-xr-xr-x 1 root sys 58044 Apr 26 18:42 /sbin/init



if it doesn't look like that (not counting file size or timestamp):


$ chown root:sys /sbin/init
$ chmod 555 /sbin/init



Good Hunting!

Previous Step
Next Step

Altering a microroot image

In order to create a proxy /sbin/init in x86.microroot, we'll need to unpack the microroot image. x86.microroot happens to be a gzipped ufs disk image, so this process is pretty easy.
From Solaris:


$ gzip -dc x86.microroot > microroot.img
$ lofiadm -a /[Wherever the image is]/microroot.img
/dev/lofi/3
$ mount /dev/lofi/3 /mnt2



from Linux it's read-only and therefore more troublesome, so I'll neglect that for now.
We now have x86.microroot mounted at /mnt2, where we can edit it.

When we're done, all that needs to be done to repack it is:


$ umount /mnt2
$ lofiadm -d /[Path to microroot]/microroot.img
$ gzip -c1 microroot.img > x86.microroot



This is a good place to perform other space-saving alterations to x86.microroot...check out solaristhings.blogspot.com/. Make sure your microroot is still able to import zpools and mount zfs volumes after any and all alterations are made.



Previous Step
Next Step

7.01.2008

Summary of Method


Having successfully achieved the live ZFS boot, I summarize here my process and logic--the details for each step will follow. In all steps I use OpenSolaris 2008.05. It is suggested these steps be performed from the LiveCD or USB, and you will need 200+ MB of free storage on some media for temporary files.

I used a hybrid between the LiveCD boot and the standard ZFS boot strategies to produce my live boot. The concept is as follows:
1) Initiate live boot procedures from the x86.microroot ramdisk image, loading the kernel and following the standard boot up until /sbin/init is called. See Setting Up the Live USB.
2) Using a custom /sbin/init, import and mount the ZFS boot volume as installed by the standard installer. Note that this is on the same disk as x86.microroot. Then remount the necessary virtual filesystems (/devices, /dev, /proc, etc.). See Altering /sbin/init.
3) The /sbin/init in the microroot is actually a proxy that calls the /sbin/init from the zfs volume once it has been discovered and prepared, passing along all the parameters handed from the kernel to the microroot init. The boot continues as a ZFS boot normally does...

I will elaborate on all these steps, as well as the necessary infrastructure to prepare them.



Previous Step
Next Step

<<<<>>>>

6.27.2008

Now: ZFS Boot in Opensolaris

The Heart of the Beast: BOOTFS



OpenSolaris currently provides many different boot options, varying from version to version.
To summarize:
2008.05
Live CD/Live USB: mounts a ramdisk (/devices/ramdisk:a on /, read-write) and extracts the rootfs files (x86.microroot) from an iso image to the ramdisk fs. Two other iso images with non-essential files are loopback mounted read-only later in the boot.

Installed (HD): boots and finds ZFS root among /devices as specified by zpool.cache. This is mounted as / and normal boot proceeds

OpenSolaris SXCE b9X
UFS boot: works like a conventional Unix boot...
ZFS boot: in GRUB it finds the ZFS root by a signature using findroot, then extracts the device id from the ZFS headers on disk. These are passed on to the kernel, instead of potentially expired/damaged zpool.cache on disk. Boot proceeds normally.

These are our options in selecting a host to modify so that we can get a mobile ZFS boot. We can either take an existing ZFS boot setup and add in some component to get around its hardware id ties, or we can take a non-hardware tied boot (liveCD, UFS) and add a ZFS setup following the normal boot.



Previous Step

Next Step

<<<<>>>>

Problem of the ZFS live boot

The Problem:
We want a live ZFS boot that can be installed on a usb drive and booted on any machine.


Target Selection:

Opensolaris is the only OS that really can harness ZFS in a stable manner, so it has to be a target...The problem is that the Opensolaris ZFS boot processes listed above are tied to the boot hardware. Let me explain--

The ZFS boot from 2008.05 uses the file zpool.cache to find where the boot device is, and zpool.cache stores a record of where the device was plugged in as of the last boot. Reading from zpool.cache, the kernel is able to find out what device contains the zfs-bootfs, as well as what pool it is a part of. This method relies on whatever was true (with regard to device locations) remaining true the next time you start the computer. That's great if you know your HD won't be moved, but a USB drive moved to a different port (on the same or a different machine) will hold a different device number, and the kernel will try to find the drive at the previous port during boot, only to fail horribly.
I tried this, and the verdict is that no boot from this method could ever be considered live...

The ZFS boot from SXCE b9X sounds a great deal better; it should, in theory, discover your ZFS pool, volume, etc. and then boot off of it regardless of its location. GRUB has actually been modified in this case to be able to read ZFS volume data from a drive, and thereby find the boot device. Unfortunately, this mechanism has some evil dependencies. While GRUB's findroot() finds the zfs volume to be booted from, it doesn't know that device's location, so it queries the volume for that information. Because the kernel has not started yet, and devices have not really been set up, the zfs volume simply recalls where it was last connected, and returns that to GRUB. GRUB passes this potentially invalid information on to the kernel, and in the case of a live or moved disk, the kernel vomits it right back out. No live boot here...

The Live CD/USB doesn't boot off ZFS, but it provides an interesting model for a boot; load a self-contained ramdisk image, start kernel, and mount the USB drive's filesystem and nonessential files shortly thereafter. This means we have a running system before any media has been mounted, and we can mount things from there.



Previous Step

Next Step

<<<<>>>>

Intro ZFS live boot


Hacking Project: Create the first live ZFS boot setup using Opensolaris.


Background:

Linux users are well aware of the great number of distributions available in live CD form; since Knoppix first demonstrated its practicality, the Linux live distro has become the quickest way to avoid using Windows on any platform. Today, Linux live distros are cleverly set up with read-write capable filesystems and on-the-fly hardware recognition. Suffice it to say that ramdisks and creative volume mounts/symlinks no longer rule the business.

OpenSolaris is also available in a live cd form: 2008.05
While the 2008.05 live cd is not as sophisticated as the Ubuntu or Slax live distros, it is smooth enough to use.

Linux Live USB distros have allowed users a mobile, persistent OS in a form that has become even more useful than live CD; usb media is cheap enough that pocketing one's desktop AND files is practical.

OpenSolaris has a live USB distro based on 2008.05 (developer preview), as well as tools to build one from an ISO (the distroconstructor project).

ZFS is one of the most amazing filesystems in existence. It is powerful in its ability to separate the FS from hardware, and thereby expand or contract to accommodate varying numbers of storage devices with redundancy and decent performance. Sun does a better job of explaining it than I will here. OpenSolaris has been tailored to live and breathe ZFS, and is the only OS with fully functioning ZFS support.

Continue to Next Step

Hello World

I'm hoping this will be useful...someday.

This blog will address various OS modification and installation projects that I am working through or have completed, in such a way that the useful/semi-useful/useless products can be recreated.

Among the random things I'll address are
Live OS distributions
Multi-boot setups never attempted before (again)
Mobile/Embedded
Make it Damn Small!
Put it where it doesn't belong!
Hacking project not listed here...

Code snippets are my favorite, yours or mine.

Some OS's will show up more than others,
[cough] OSX86 [cough] Opensolaris
because they just are sooo untouched.
And some will appear cause "geez, didn't know that could be done"
[cough] linux [cough]

I'm hoping that the simple projects posted here can at least be a springboard for more projects.
You won't find support here unless you're building something. This is The Braindump.

I'll occasionally post other software and hardware projects...We'll see.
Oh, and please, I am not responsible for broken stuff. What works for me may cripple your machine. I'll post the hardware I use and hope yours works too.