Friday, November 14, 2014

Kernel Zones: root disk recovery

I've been playing with Solaris Kernel Zones recently. I installed a zone named kz1 on an iscsi device so I could test Live Migration as well. At some point I wanted to modify contents of the kz1 root disk outside of the zone, so I imported the pool in a global zone, of course while kz1 was shut down. This worked fine (zpool import -t ...). However the zone crashed on boot as it couldn't import the pool. The reason is that when a pool is imported on another system it updates phys_path in its ZFS label and when kernel zone boots it tries to import its root pool based on phys_path which now might be invalid, which was exactly the case for me. The root disk of kz1 zone was configured as: 

add device
set storage=iscsi://iscsi1.test/lunme.naa.600144f0a613c900000054521d550001
set bootpri=0
set id=0
end

This results in phys_path: /zvnex/zvblk@0:b as a disk driver is virtualized in Kernel Zones. After the pool was imported in a global zone the phys_path was updated to: /scsi_vhci/disk@g600144f0a613c900000054521d550001:b which won't work in the kernel zone. One way to workaround the issue is to create another "rescue" zone with its root disk's id set to 1. Then add the root disk of kz1 to it as an additional non-bootable disk with id=0.
 
add device
set storage=dev:zvol/dsk/rpool/kz2
set bootpri=0
set id=1
end

add device
set storage=iscsi://iscsi1.test/lunme.naa.600144f0a613c900000054521d550001
set id=0
end

Now in order to update the phys_path to the correct one, the pool needs to be imported and exported in the rescue zone:
 
# zpool import -R /mnt/1 -Nf -t kz1 rpool
# zpool export kz1

Notice that the rescue zone doesn't even need to have a network configured - in fact all you need is a minimal OS installation in it with almost everything disabled and you login to it via zlogin.

The kz1 zone will now boot just fine. In most cases you don't need to use the above procedure.You should be able to do all the customizations via AI manifests and system profiles and SMF. But in some cases it is useful to be able to manipulate root disk contents of a kernel zone without actually booting it. Or perhaps you need to recover rpool after it became unbootable.

No comments: