Automatic configuration of Windows instances in OpenStack, part 1
This is the first of two articles in which I discuss my work in getting some Windows instances up and running in our OpenStack environment. This article is primarily about problems I encountered along the way.
Motivations
Like many organizations, we have a mix of Linux and Windows in our environment. Some folks in my group felt that it would be nice to let our Windows admins take advantage of OpenStack for prototyping and sandboxing in the same ways our Linux admins can use it.
While it is trivial to get Linux instances running in OpenStack (there are downloadable images from several distributions that will magically configure themselves on first boot), getting Windows systems set up is a little trickier. There are no pre-configured images to download, and it looks as if not many people are trying to run Windows under OpenStack right now, so there is a lot less shared experience to draw on.
Like the cool kids do it
My first approach to this situation was to set up our Windows instances to act just like our Linux instances:
- Install Cygwin.
- Run an SSH server.
- Have the system pull down an SSH public key on first boot and use this for administrative access.
This worked reasonably well, but many people felt that this wasn’t a great solution because it wouldn’t feel natural to a typical Windows administrator. It also required a full Cygwin install to drive things, which isn’t terrible but still feels like a pretty big hammer.
As an alternative, we decided we needed some way to either (a) allow the user to pass a password into the instance environment, or (b) provide some way for the instance to communicate a generated password back to the user.
How about user-data?
One of my colleagues suggested that we could allow people to pass an
administrative password into the environment via the user-data
attribute available from the metadata service. While this sounds
like a reasonable idea at first, it has one major flaw: data from the
metadata service is available to anyone on the system who is able to
retrieve a URL. This would make it trivial for anyone on the instance
to retrieve the administrator password.
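To make the exposure concrete, here is a minimal sketch of what any local user could run from inside an instance. It assumes the EC2-compatible metadata endpoint at 169.254.169.254 is reachable from the instance; the function names and the `opener` parameter are my own, added only so the fetch can be exercised without a network.

```python
from urllib.request import urlopen

# Assumption: the EC2-compatible metadata endpoint exposed to instances.
# Adjust the base URL if your deployment differs.
METADATA_BASE = "http://169.254.169.254/latest"


def user_data_url(base=METADATA_BASE):
    """Build the URL that serves the instance's user-data."""
    return base.rstrip("/") + "/user-data"


def fetch_user_data(opener=urlopen, base=METADATA_BASE, timeout=5):
    """Return the user-data as text.

    Note that no credentials are supplied anywhere: being able to make
    an HTTP request from the instance is all that is needed.
    """
    with opener(user_data_url(base), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_user_data()` from an unprivileged account on the instance would return whatever was passed as user-data, password included.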
How about adminPass?
When you boot an instance using the nova command line tools…
nova boot ...
You get back a chunk of metadata, including an adminPass key, which
is a password randomly generated by OpenStack and available during the
instance provisioning process:
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
...
| adminPass              | RBiWrSNYqK5R                         |
...
+------------------------+--------------------------------------+
This would be an ideal solution, if only I were able to figure out how
OpenStack made this value available to the instance. After asking
around on #openstack, it turns out that not many people were even aware
this feature exists, so information was hard to come by. I ran across
some documentation that mentioned the libvirt_inject_password option in
nova.conf with the following description:
(BoolOpt) Inject the admin password at boot time, without an agent.
…but that still didn’t actually explain how it worked, so I went
diving through the code. The libvirt_inject_password option appears
in only a single file, nova/virt/libvirt/connection.py, so I knew
where to start. This led me to the _create_image method, which
grabs the admin_pass generated by OpenStack:
if FLAGS.libvirt_inject_password:
    admin_pass = instance.get('admin_pass')
else:
    admin_pass = None
And then passes it to the inject_data method:
disk.inject_data(injection_path,
                 key, net, metadata, admin_pass, files,
                 partition=target_partition,
                 use_cow=FLAGS.use_cow_images,
                 config_drive=config_drive)
The inject_data method comes from nova/virt/disk/api.py, which is
where things get interesting: it turns out that the injection
mechanism works by:
- Mounting the root filesystem,
- Copying out /etc/passwd and /etc/shadow,
- Modifying them, and
- Copying them back.
Like this:
passwd_path = _join_and_check_path_within_fs(fs, 'etc', 'passwd')
shadow_path = _join_and_check_path_within_fs(fs, 'etc', 'shadow')
utils.execute('cp', passwd_path, tmp_passwd, run_as_root=True)
utils.execute('cp', shadow_path, tmp_shadow, run_as_root=True)
_set_passwd(admin_user, admin_passwd, tmp_passwd, tmp_shadow)
utils.execute('cp', tmp_passwd, passwd_path, run_as_root=True)
os.unlink(tmp_passwd)
utils.execute('cp', tmp_shadow, shadow_path, run_as_root=True)
os.unlink(tmp_shadow)
Do you see a problem here, given that I’m working with a Windows
instance? First, it’s possible that the host will be unable to mount
the NTFS filesystem, and second, there are no passwd or shadow
files of any use on the target.
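To show how Linux-specific the "modify" step is, here is a rough sketch of the kind of edit it implies for a shadow file. This is not nova's _set_passwd, which I haven't reproduced here; set_shadow_hash is a hypothetical helper, and it expects an already-hashed password rather than computing one.

```python
def set_shadow_hash(shadow_text, user, new_hash):
    """Return shadow_text with the password field for `user` replaced.

    Each shadow(5) line is colon-separated: field 0 is the login name
    and field 1 is the hashed password. All other fields are preserved.
    """
    out = []
    found = False
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) > 1 and fields[0] == user:
            fields[1] = new_hash
            found = True
        out.append(":".join(fields))
    if not found:
        # Hypothetical choice: treat a missing account as an error.
        raise KeyError(user)
    return "\n".join(out) + "\n"
```

Nothing in this kind of edit has an equivalent on a Windows image, where account credentials are not stored in colon-separated text files.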
You can pass --config-drive=True to nova boot and it will use a
configuration drive (a whole-disk FAT filesystem) for configuration
data (and make this available as a block device when the system
boots), but this fails, hard: most of the code treats this as being
identical to the original root filesystem, so it still tries to
perform the modifications to /etc/passwd and /etc/shadow which, of
course, don’t exist.
I whipped up some quick patches that would write the configuration
data (such as admin_pass) to simple files at the root of the
configuration drive…but then I ran into a new problem:
Windows doesn’t know how to deal with whole-disk filesystems (nor, apparently, do many Windows admins). In the absence of a partition map, Windows assumes that the device is empty.
Oops. At this point it was obvious I was treading on ground best left undisturbed.
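For anyone wanting to check whether a device or image is laid out this way, here is a small heuristic sketch that inspects the first sector. It rests on assumptions: that sectors are 512 bytes, and that the FAT boot sector carries the usual filesystem-type label (offset 54 for FAT12/16, offset 82 for FAT32), which is informational and not guaranteed to be populated. The function name is my own.

```python
def looks_like_whole_disk_fat(sector0):
    """Heuristic: does the first sector look like a FAT boot sector
    rather than an MBR partition table?"""
    if len(sector0) < 512:
        raise ValueError("need at least the first 512 bytes of the device")
    # Both an MBR and a FAT boot sector end with the 0x55 0xAA signature,
    # so the signature alone cannot tell them apart.
    if sector0[510:512] != b"\x55\xaa":
        return False
    # A FAT boot sector starts with an x86 jump (EB xx 90 or E9 xx xx).
    starts_with_jump = (
        (sector0[0] == 0xEB and sector0[2] == 0x90) or sector0[0] == 0xE9
    )
    # Filesystem-type label: offset 54 for FAT12/16, offset 82 for FAT32.
    has_fat_label = sector0[54:57] == b"FAT" or sector0[82:87] == b"FAT32"
    return starts_with_jump and has_fat_label
```

Reading the first 512 bytes of the configuration drive image in binary mode and passing them to this function would give a quick, if imperfect, indication that there is no partition map for Windows to find.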