Debian SPARC in a T1000 Logical Domain
Last updated: Wed, 22 Apr 2009 19:37:00 GMT
It can be done, with a little bit of jiggery here, and some pokery there. Here are a few notes that might save you some time, couched in a verbose and conversational tone that's guaranteed to waste you some.
It's a bit of a strange corner to find yourself in, but here I am. I've got a fair bit of Solaris behind me but I'm looking to keep my hand in with various other operating systems. I have a T1000 to play with, and I'm nowhere near utilising the many-threaded grunt it has. Logical Domains look like a great solution to both requirements.
I was aiming for a couple of guest operating systems: a Linux and a BSD. I'd prefer FreeBSD, but the intarnets tell me that FreeBSD isn't a go, lacking drivers to support the virtual network interfaces and virtual disc controllers that the LDom host presents to guests. The latest snapshot of OpenBSD might be worth a try. That'll be Phase Two.
So, to Linux. I prefer something Debian-ish, simply because I see no reason to suffer anything other than apt-get for package management. I have painful and thankfully distant memories of RPM. With no other advice, the Debian SPARC port was the obvious place to start.
Set Up Your Domains
First things first, patch the Solaris instance on the T1000 up to a reasonable level. I used PCA, obviously. If you're not using PCA, you really should.
Next, install the LDom 1.1 software, as obtained from Sun. I chose not to fart around with their JASS stuff -- I'll secure this myself when I'm comfortable with it, but for now I don't want their package doing anything to my server that's not related directly to LDoms.
My T1000 has 8GiB of memory, 32 virtual CPUs and one 80GB SATA drive. I was aiming for a 2:1:1 split on just about everything: 4GiB RAM and 16 vCPUs for Solaris, 2GiB RAM and 8 vCPUs for each of two guest domains. The LDom Administration Guide I was using was pretty straightforward and helpful, especially Chapter 4: "Setting Up Service and Logical Domains".
Already having all four onboard NICs configured as two bonded channels, I simply replaced my primary aggregate (aggr1) with my virtual switch (vsw0). Painless.
Division of Memory didn't quite work as I'd expected. With the primary domain allotted 4GiB, I can't give 2GiB each to two guest domains. Obviously, I'm using a certain amount of memory already, but... isn't that coming out of the primary domain's share? I've obviously skimmed over something important somewhere. For now, the two guest domains get 1.5GiB each. Probably more than enough anyway, especially with no TMPFS to suck it up.
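For the record, the sizing I settled on boils down to two ldm calls per domain. What follows is a sketch rather than a transcript of what I typed: size_domain is a helper name made up for illustration, it assumes bash (or any POSIX-ish sh) on the primary domain, and it assumes the guest domains have already been created with ldm add-domain. As far as I understand it, resizing the primary domain goes through a delayed reconfiguration and only takes effect after a reboot.

```shell
# size_domain NAME VCPUS MEMORY
# Hypothetical helper: apply a vCPU count and a memory size to one domain.
# Stops at the first ldm call that fails.
size_domain() {
    ldm set-vcpu "$2" "$1" &&
    ldm set-memory "$3" "$1"
}

# The split described above, left commented out so nothing runs by accident:
# size_domain primary 16 4G
# size_domain ldg0     8 1536M
# size_domain ldg1     8 1536M
```

Keeping the numbers in one place made it easier to fiddle with the memory figures until the guests would actually bind.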
On the subject of filesystems, the only aspect of guest domain setup that caused me any grief was supplying virtual disc for storage. I had planned to keep everything neatly separate by following the LDom documentation's example and supplying each guest domain with a slice of the only disc to hand. I spent a few hours shuffling stuff around so that I could destroy my existing ZFS pool and create two big slices at the end of the disc.
I never got it to work. I don't know why. With problems ranging from the Debian installer just not being able to see any storage, to it hanging when the partitioner tried to write a disc label, I just couldn't find the magic combination of vdsdev configuration options.
Avoid.
Instead, read the later example showing use of ZFS zvols, 6.11: "Using ZFS with Virtual Disks". That works a treat.
Here's my ZFS config:
trevor_concat_p0/vols/ldg0_root  type            volume                  -
trevor_concat_p0/vols/ldg0_root  creation        Wed Apr 22 10:08 2009   -
trevor_concat_p0/vols/ldg0_root  used            12G                     -
trevor_concat_p0/vols/ldg0_root  available       36.3G                   -
trevor_concat_p0/vols/ldg0_root  referenced      2.23G                   -
trevor_concat_p0/vols/ldg0_root  compressratio   1.00x                   -
trevor_concat_p0/vols/ldg0_root  reservation     12G                     local
trevor_concat_p0/vols/ldg0_root  volsize         12G                     -
trevor_concat_p0/vols/ldg0_root  volblocksize    8K                      -
trevor_concat_p0/vols/ldg0_root  checksum        on                      default
trevor_concat_p0/vols/ldg0_root  compression     off                     inherited from trevor_concat_p0/vols
trevor_concat_p0/vols/ldg0_root  readonly        off                     default
trevor_concat_p0/vols/ldg0_root  shareiscsi      off                     default
trevor_concat_p0/vols/ldg0_root  copies          1                       default
trevor_concat_p0/vols/ldg0_root  refreservation  12G                     local
You can see that I've fully-reserved the root volume for ldg0. That was superstition. As is the disabling of compression, which I otherwise leave on, all other things being equal. I had some trouble later with EXT2 filesystems over sunvdc devices. More later on that.
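Getting from nothing to a zvol that a guest can see takes three steps: create the volume, export it through the virtual disk service, and attach it to the domain. Here is a rough sketch of those steps. The function name make_guest_disk is my own invention for illustration; it assumes that the parent dataset exists, that a disk service called primary-vds0 has already been added to the primary domain, and that the guest is not yet bound.

```shell
# make_guest_disk DATASET SIZE VOLUME VDISK DOMAIN
# Hypothetical helper: create a zvol, export it via primary-vds0 with the
# excl option, then hand it to a guest domain as a virtual disk.
make_guest_disk() {
    zfs create -V "$2" "$1" &&
    ldm add-vdsdev options=excl "/dev/zvol/dsk/$1" "$3@primary-vds0" &&
    ldm add-vdisk "$4" "$3@primary-vds0" "$5"
}

# Matching the names used in this write-up (commented out on purpose):
# make_guest_disk trevor_concat_p0/vols/ldg0_root 12G zdisk0 vdisk0 ldg0
```

The reservation and compression settings shown in the listing above would be applied separately with zfs set, if you share my superstitions.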
This, finally, is my working domain configuration:
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv- SP 16 4G 0.2% 21h 1m
SOFTSTATE
Solaris running
MAC
00:14:4f:be:ef:ba
HOSTID
0x4fbeefba
VCPU
VID PID UTIL STRAND
0 0 0.8% 100%
1 1 0.1% 100%
2 2 0.2% 100%
3 3 0.0% 100%
4 4 0.0% 100%
5 5 0.0% 100%
6 6 0.0% 100%
7 7 0.0% 100%
8 8 0.1% 100%
9 9 0.0% 100%
10 10 0.0% 100%
11 11 0.0% 100%
12 12 0.8% 100%
13 13 0.4% 100%
14 14 0.0% 100%
15 15 0.0% 100%
MAU
ID CPUSET
0 (0, 1, 2, 3)
1 (4, 5, 6, 7)
2 (8, 9, 10, 11)
3 (12, 13, 14, 15)
MEMORY
RA PA SIZE
0x8000000 0x8000000 4G
VARIABLES
keyboard-layout=US-English
nvramrc=." ChassisSerialNumber 4FBEEFBA " cr
security-#badlogins=4294967295
IO
DEVICE PSEUDONYM OPTIONS
pci@780 bus_a
pci@7c0 bus_b
VCC
NAME PORT-RANGE
primary-vcc0 6969-6996
VSW
NAME MAC NET-DEV DEVICE DEFAULT-VLAN-ID PVID VID MODE
primary-vsw0 00:14:4f:be:ef:de aggr1 switch@0 1 1
VDS
NAME VOLUME OPTIONS MPGROUP DEVICE
primary-vds0 zdisk0 excl /dev/zvol/dsk/trevor_concat_p0/vols/ldg0_root
zdisk1 excl /dev/zvol/dsk/trevor_concat_p0/vols/ldg1_root
VLDC
NAME
primary-vldc0
primary-vldc3
VLDCC
NAME SERVICE DESC
ds primary-vldc0@primary domain-services
vldcc1 primary-vldc0@primary ldmfma
vldcc2 SP spfma
VCONS
NAME SERVICE PORT
SP
------------------------------------------------------------------------------
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
ldg0 active -n---- 6969 8 1536M 0.0% 2h 24m
SOFTSTATE
Linux running
MAC
00:14:4f:be:ef:d7
HOSTID
0x4fbeefd7
VCPU
VID PID UTIL STRAND
0 16 0.0% 100%
1 17 0.0% 100%
2 18 0.0% 100%
3 19 0.0% 100%
4 20 0.0% 100%
5 21 0.0% 100%
6 22 0.0% 100%
7 23 0.0% 100%
MAU
ID CPUSET
4 (16, 17, 18, 19)
5 (20, 21, 22, 23)
MEMORY
RA PA SIZE
0x8000000 0x108000000 1536M
VARIABLES
auto-boot?=true
boot-device=disk
NETWORK
NAME SERVICE DEVICE MAC MODE PVID VID
vnet0 primary-vsw0@primary network@0 00:14:4f:be:ef:62 1
DISK
NAME VOLUME TOUT DEVICE SERVER MPGROUP
vdisk0 zdisk0@primary-vds0 disk@0 primary
VLDCC
NAME SERVICE DESC
ds primary-vldc0@primary domain-services
VCONS
NAME SERVICE PORT
ldg0 primary-vcc0@primary 6969
------------------------------------------------------------------------------
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
ldg1 active -t---- 6996 8 1536M 12% 7h 7m
SOFTSTATE
OpenBoot Primary Boot Loader
MAC
00:14:4f:be:ef:a1
HOSTID
0x4fbeefa1
VCPU
VID PID UTIL STRAND
0 24 100% 100%
1 25 0.0% 100%
2 26 0.0% 100%
3 27 0.0% 100%
4 28 0.0% 100%
5 29 0.0% 100%
6 30 0.0% 100%
7 31 0.0% 100%
MAU
ID CPUSET
6 (24, 25, 26, 27)
7 (28, 29, 30, 31)
MEMORY
RA PA SIZE
0x8000000 0x168000000 1536M
VARIABLES
auto-boot?=true
boot-device=disk
NETWORK
NAME SERVICE DEVICE MAC MODE PVID VID
vnet1 primary-vsw0@primary network@0 00:14:4f:be:ef:c2 1
DISK
NAME VOLUME TOUT DEVICE SERVER MPGROUP
vdisk1 zdisk1@primary-vds0 disk@0 primary
VLDCC
NAME SERVICE DESC
ds primary-vldc0@primary domain-services
VCONS
NAME SERVICE PORT
ldg1 primary-vcc0@primary 6996
Of note, here, I've set each guest domain's boot-device to disk. This works. The LDom docs suggest that this should be set to vdisk. This does not work.
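Setting those variables and bringing a guest up can be sketched like so. Again, this is an illustration under assumptions rather than gospel: prepare_guest is a made-up helper name, and it assumes the domain already has its CPUs, memory, network and disc attached.

```shell
# prepare_guest DOMAIN
# Hypothetical helper: set the OpenBoot variables that worked for me,
# then bind the domain's resources and start it.
prepare_guest() {
    ldm set-variable 'auto-boot?=true' "$1" &&
    ldm set-variable boot-device=disk "$1" &&
    ldm bind-domain "$1" &&
    ldm start-domain "$1"
}

# prepare_guest ldg0
```

The quotes around auto-boot?=true are only there to stop the shell from treating the question mark as a wildcard.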
Netboot the Debian Installer
The Debian SPARC installation documentation is very helpful. I already have a JumpStart/PXEboot setup at home, for various platforms I have that don't have any other practical way to install an OS. So I skimmed much of the netboot installation, simply sticking the boot.img on my boot server and amending ISC-DHCPd's config accordingly.
host terry {
hardware ethernet 00:14:4f:be:ef:62;
fixed-address 192.168.0.8;
filename "debian_sparc_boot.img";
next-server 192.168.0.200;
}
Then, having made sure that the guest domain was bound and started, I connected to its PROM by telnet from the primary domain, on the appropriate local port. Then DHCP boot:
-bash-3.00$ telnet 0 6969
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
Connecting to console "ldg0" in group "ldg0" ....
Press ~? for control options ..
{0} ok boot net:dhcp
Aside, if you run into trouble later, you can perform a rescue boot with the following, from the fake PROM:
{0} ok boot net:dhcp - rescue/enable=true
But you won't be getting into any trouble later.
The installation should proceed as you'd expect. Knock yourself out. Luckily, due to recent changes, the installer does have drivers that are happy to chat to the virtual network and disc that are attached to your domain. Unfortunately, though, the kernel doesn't load them automatically. Consequently, it'll fail to find an ethernet interface; select sunvnet from the list. It'll then fail to find a disc controller; select sunvdc from the list.
I had a number of difficulties trying to lay an EXT2 filesystem out in the root partition. I've no idea why. In the end I partitioned my drive with an EXT2 /boot partition at 1, an EXT3 / partition at 2, and swap at 4.
At the task selection phase, I just picked "Standard System". I wish it came with sshd installed. I heartily support their decision not to have sshd running by default, but not to install it seems a little strange to me. No matter.
The End?
Reboot to Breakage
Once the installation is complete the domain will reset. If you get the chance, pass the kernel a rootdelay=2 parameter. This will save you some time shortly, cutting the timeout drastically. The booting process will start, tear along noisily, and then come grinding to an unceremonious halt, like this:
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Running /scripts/local-top ... done.
Begin: Waiting for root file system ... done.
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/vdiska2 does not exist. Dropping to a shell!

BusyBox v1.10.2 (Debian 1:1.10.2-2) built-in shell (ash)
Enter 'help' for a list of built-in commands.

/bin/sh: can't access tty; job control turned off
(initramfs)
Fix Breakage
When I got this far, I led myself down a lot of dead-ends. It's not silo at fault. It's not a missing bootable partition flag. There are no parameters that you can pass to the kernel to make this any better. Not that I could find, at least.
The problem is exactly the same as the problem we came across earlier with the installer -- the kernel hasn't loaded the sunvnet and sunvdc drivers for us. It's easy to fix, so we can be on our way:
(initramfs) modprobe sunvnet
[324164.637414] eth0: Sun LDOM vnet 00:14:4f:be:ef:62 h0: Sun LDOM vnet
[324164.646112] eth0: PORT ( remote-mac 00:14:4f:be:ef:de switch-port )
[324164.651743] eth0: PORT ( remote-mac 00:14:4f:be:ef:c2 )
(initramfs) modprobe sunvdc
[324208.648989] sunvdc.c:v1.0 (June 25, 2007)
[324208.653931] sunvdc: vdiska: 25164600 sectors (12287 MB)
[324208.657314] vdiska: vdiska1 vdiska2 vdiska3 vdiska4 vdiska4
With this done, we could continue to boot. The problem with that would be that the default install doesn't spawn a working getty on our fake serial console. It doesn't start sshd, either. We should fix both of those now. Start by mounting our filesystems:
(initramfs) mkdir /target
(initramfs) modprobe ext3
(initramfs) mount /dev/vdiska2 /target
[324285.663774] EXT2-fs warning (device vdiska2): ext2_fill_super: mounting ext3 filesystem as ext2
(initramfs) mount /dev/vdiska1 /target/boot
Note that, even though we've loaded the EXT3 driver, the root filesystem is still mounted as EXT2. This time. Sometimes, it mounts as EXT3. I'm not going to call it non-deterministic, because that's just fancy-talk for "I don't know enough about what's going on to predict how this will behave." It's not a problem either way, just remember to unmount it cleanly when you're done.
In order to get a login prompt on our console, we need to edit /target/etc/inittab and uncomment the following line:
T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100
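If you'd rather not wrestle with an editor in a cut-down shell, the same change can be made with sed. A minimal sketch, assuming the line is present and commented out with a single leading # as it is in a stock Debian inittab; enable_serial_getty is a name I've made up, and I haven't checked whether every initramfs BusyBox ships sed, so you may prefer to run it after the chroot in the next step.

```shell
# enable_serial_getty FILE
# Hypothetical helper: uncomment the T0 serial getty entry in an inittab.
# Writes through a temporary copy so the original keeps its ownership
# and permissions. All other lines are left untouched.
enable_serial_getty() {
    inittab=$1
    sed 's/^#\(T0:23:respawn:\)/\1/' "$inittab" > "$inittab.new" &&
    cat "$inittab.new" > "$inittab" &&
    rm -f "$inittab.new"
}

# From the initramfs shell, before the chroot:
# enable_serial_getty /target/etc/inittab
```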
And now we need to install sshd. Start by configuring the network, then chroot to our target filesystem, before using apt-get to install openssh-server:
(initramfs) ifconfig eth0 inet 192.168.0.8 netmask 255.255.255.0 up
(initramfs) route add default gw 192.168.0.1
(initramfs) chroot /target
(initramfs) apt-get install openssh-server
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  libx11-6 libx11-data libxau6 libxcb-xlib0 libxcb1 libxdmcp6 libxext6 libxmuu1 openssh-blacklist openssh-blacklist-extra x11-common xauth
Suggested packages:
  ssh-askpass rssh molly-guard
The following NEW packages will be installed:
  libx11-6 libx11-data libxau6 libxcb-xlib0 libxcb1 libxdmcp6 libxext6 libxmuu1 openssh-blacklist openssh-blacklist-extra openssh-server x11-common xauth
0 upgraded, 13 newly installed, 0 to remove and 0 not upgraded.
Need to get 5844kB of archives.
You'll probably see a bunch of errors like this:
Setting up x11-common (1:7.3+18) ...
Can not write log, openpty() failed (/dev/pts not mounted?)
but these don't matter, because we end up getting to here:
Setting up openssh-server (1:5.1p1-5) ...
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Restarting OpenBSD Secure Shell server: sshd.
Now, having installed and started sshd, it's time to kill it. If you don't, you'll find sshd camping with its cwd on your chrooted filesystem, and you won't be able to unmount. Once killed, exit chroot. Now you can unmount /target/boot and /target. Exit the ash shell.
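Spelled out, that sequence looks something like the sketch below. The helper name stop_by_pidfile is hypothetical, and the default path assumes sshd wrote its PID to /var/run/sshd.pid inside the chroot, which is where Debian's package puts it as far as I know. If the file isn't there, find the PID with ps and kill it by hand.

```shell
# stop_by_pidfile [PIDFILE]
# Hypothetical helper: send SIGTERM to the process whose PID is recorded
# in PIDFILE. Returns non-zero if the file is missing or unreadable.
stop_by_pidfile() {
    pidfile=${1:-/var/run/sshd.pid}
    [ -r "$pidfile" ] || return 1
    kill "$(cat "$pidfile")"
}

# Still inside the chroot:
#   stop_by_pidfile
#   exit
# Back in the initramfs shell:
#   umount /target/boot
#   umount /target
#   exit
```

Unmounting /target/boot before /target matters, since the former is mounted inside the latter.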
The boot process should pick up where it left off and continue all the way to a login prompt on the fake serial interface at ttyS0. Log in as root and fix up the initrd archive so that, next time we reboot, it contains instructions to load the sunvnet and sunvdc drivers:
Debian GNU/Linux 5.0 terry.local ttyS0

terry.local login: root
Password:
Linux terry.local 2.6.26-2-sparc64-smp #1 SMP Sat Mar 28 12:03:31 UTC 2009 sparc64

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law.
terry:~# echo sunvnet >> /etc/initramfs-tools/modules
terry:~# echo sunvdc >> /etc/initramfs-tools/modules
terry:~# update-initramfs -u
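One caveat with the plain echo approach: run it twice and you get duplicate entries in the modules file. That's probably harmless, but a slightly more careful version only appends a module name if it isn't already listed. This is a sketch, with add_initramfs_module being a name of my own choosing and the second argument existing purely so the function can be tried against a scratch file.

```shell
# add_initramfs_module MODULE [FILE]
# Hypothetical helper: append MODULE to the initramfs-tools modules list
# unless a line consisting of exactly that name is already present.
add_initramfs_module() {
    module=$1
    modules_file=${2:-/etc/initramfs-tools/modules}
    grep -qx "$module" "$modules_file" 2>/dev/null ||
        echo "$module" >> "$modules_file"
}

# As root on the freshly booted guest:
# add_initramfs_module sunvnet
# add_initramfs_module sunvdc
# update-initramfs -u
```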
We're done. Reboot, and watch in amazement as your domain boots cleanly, all the way to the login prompt. It should have started sshd, too.
Enjoy!