# Fedora Atomic, OpenStack, and Kubernetes (oh my)
While experimenting with Fedora Atomic, I was looking for an elegant way to automatically deploy Atomic into an OpenStack environment and then schedule some Docker containers on the Atomic host. This post describes my solution.
Like many other cloud-targeted distributions, Fedora Atomic runs cloud-init when the system boots. We can take advantage of this to configure the system at first boot by providing a `user-data` blob to Nova when we boot the instance. A `user-data` blob can be as simple as a shell script, and while we could arguably mash everything into a single script, it wouldn't be particularly maintainable or flexible in the face of different pod/service/etc. descriptions.
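For example, a complete `user-data` blob could be nothing more than:

```sh
#!/bin/sh
# cloud-init recognizes the "#!" sentinel and runs this as a
# script once, at first boot
echo "Hello from cloud-init" > /root/hello.txt
```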
In order to build a more flexible solution, we're going to take advantage of the following cloud-init features:
- **Support for multipart MIME archives.** Cloud-init allows you to pass in multiple files via `user-data` by encoding them as a multipart MIME archive.
- **Support for a custom part handler.** Cloud-init recognizes a number of specific MIME types (such as `text/cloud-config` or `text/x-shellscript`). We can provide a custom part handler that will be used to handle MIME types not intrinsically supported by `cloud-init`.
## A custom part handler for Kubernetes configurations
I have written a custom part handler that knows about the following MIME types:

- `text/x-kube-pod`
- `text/x-kube-service`
- `text/x-kube-replica`

When the part handler is first initialized, it will ensure that Kubernetes is started. If it is provided with a document matching one of the above MIME types, it will pass it to the appropriate `kubecfg` command to create the corresponding objects in Kubernetes.
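The shape of a cloud-init part handler is simple: a Python module that exports `list_types()` and `handle_part()` functions. Here is a minimal sketch of the idea; the `kubecfg` invocation and the systemd unit names below are illustrative, not necessarily verbatim from my script:

```python
#part-handler
# A minimal sketch of the part handler. The "#part-handler" sentinel
# on the first line is what identifies this file as a
# text/part-handler part.
import subprocess
import tempfile

# map our custom MIME types onto kubecfg object types
TYPE_MAP = {
    'text/x-kube-pod': 'pods',
    'text/x-kube-service': 'services',
    'text/x-kube-replica': 'replicationControllers',
}


def list_types():
    # tell cloud-init which MIME types this handler accepts
    return list(TYPE_MAP.keys())


def handle_part(data, ctype, filename, payload):
    # cloud-init brackets the document parts with __begin__/__end__ calls
    if ctype == '__begin__':
        # make sure Kubernetes is running before we submit anything
        # (unit names here are an assumption)
        subprocess.check_call(['systemctl', 'start',
                               'etcd', 'kube-apiserver', 'kubelet'])
        return
    if ctype == '__end__':
        return

    # write the document to a temporary file and hand it to kubecfg
    with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml') as fd:
        fd.write(payload)
        fd.flush()
        subprocess.check_call(['kubecfg', '-c', fd.name,
                               'create', TYPE_MAP[ctype]])
```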
## Creating multipart MIME archives
I have also created a modified version of the standard `write-mime-multipart.py` Python script. This script inspects the first line of each file to determine its content type; in addition to the standard cloud-init types (like `#cloud-config` for a `text/cloud-config` file), this script recognizes:

- `#kube-pod` for `text/x-kube-pod`
- `#kube-service` for `text/x-kube-service`
- `#kube-replica` for `text/x-kube-replica`
For example, a simple pod description might look something like this:
```yaml
#kube-pod
id: dbserver
desiredState:
  manifest:
    version: v1beta1
    id: dbserver
    containers:
      - image: mysql
        name: dbserver
        env:
          - name: MYSQL_ROOT_PASSWORD
            value: secret
```
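A service description works the same way, using the `#kube-service` sentinel. A minimal v1beta1 example might look something like this (the field values here are illustrative, and the selector only matches if the pod also carries a corresponding `name` label):

```yaml
#kube-service
id: dbservice
port: 3306
containerPort: 3306
selector:
  name: dbserver
```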
## Putting it all together
Assuming that the pod description presented in the previous section is stored in a file named `dbserver.yaml`, we can bundle that file up with our custom part handler like this:

```
$ write-mime-multipart.py \
    kube-part-handler.py dbserver.yaml > userdata
```
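The generated `userdata` file is just an ordinary multipart MIME document. Abbreviated, it looks something like this (the boundary string and exact headers will vary; note that the part handler itself travels as a `text/part-handler` part):

```
Content-Type: multipart/mixed; boundary="===============0590299110002194347=="
MIME-Version: 1.0

--===============0590299110002194347==
Content-Type: text/part-handler; charset="us-ascii"
Content-Disposition: attachment; filename="kube-part-handler.py"

[contents of kube-part-handler.py]
--===============0590299110002194347==
Content-Type: text/x-kube-pod; charset="us-ascii"
Content-Disposition: attachment; filename="dbserver.yaml"

[contents of dbserver.yaml]
--===============0590299110002194347==--
```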
We would then launch a Nova instance using the `nova boot` command, providing the generated `userdata` file as an argument to the `--user-data` option:

```
$ nova boot --image fedora-atomic --key-name mykey \
    --flavor m1.small --user-data userdata my-atomic-server
```
You would obviously need to substitute values for `--image` and `--key-name` that are appropriate for your environment.
## Details, details
If you are experimenting with Fedora Atomic 21, you may find that the above example doesn't work: the official `mysql` image generates an SELinux error. We can switch SELinux to permissive mode by putting the following into a file called `disable-selinux.sh`:

```sh
#!/bin/sh
setenforce 0
sed -i '/^SELINUX=/ s/=.*/=permissive/' /etc/selinux/config
```
And then including that script in our MIME archive:

```
$ write-mime-multipart.py \
    kube-part-handler.py disable-selinux.sh dbserver.yaml > userdata
```
## A brief demonstration
If we launch an instance as described in the previous section and then log in, we should find that the pod has already been scheduled:

```
# kubecfg list pods
ID                  Image(s)            Host                Labels              Status
----------          ----------          ----------          ----------          ----------
dbserver            mysql               /                                       Waiting
```
At this point, `docker` needs to pull the `mysql` image locally, so this step can take a while depending on the state of your local internet connection. In the meantime, running `docker ps` will yield:

```
# docker ps
CONTAINER ID        IMAGE                     COMMAND             CREATED             STATUS              PORTS               NAMES
3561e39f198c        kubernetes/pause:latest   "/pause"            46 seconds ago      Up 43 seconds                           k8s--net.d96a64a9--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--d872be51
```
The `pause` image here is a Kubernetes detail that is used to configure the networking for a pod (in the Kubernetes world, a pod is a group of linked containers that share a common network namespace).
After a few minutes, you should eventually see:

```
# docker ps
CONTAINER ID        IMAGE                     COMMAND                CREATED             STATUS              PORTS               NAMES
644c8fc5a79c        mysql:latest              "/entrypoint.sh mysq   3 minutes ago       Up 3 minutes                            k8s--dbserver.fd48803d--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--58794467
3561e39f198c        kubernetes/pause:latest   "/pause"               5 minutes ago       Up 5 minutes                            k8s--net.d96a64a9--dbserver.etcd--3d30eac0_-_745c_-_11e4_-_b32a_-_fa163e6e92ce--d872be51
```
And `kubecfg` should show the pod as running:

```
# kubecfg list pods
ID                  Image(s)            Host                Labels              Status
----------          ----------          ----------          ----------          ----------
dbserver            mysql               127.0.0.1/                              Running
```
## Problems, problems
This works, and I think it is a relatively elegant solution. However, there are some drawbacks. In particular, the custom part handler runs fairly early in the `cloud-init` process, which means that it cannot depend on changes implemented by `user-data` scripts (because those run much later).
A better solution might be to have the custom part handler simply write the Kubernetes configs into a directory somewhere, and then install a service that launches after Kubernetes and (a) watches that directory for new files, then (b) passes each configuration to Kubernetes and deletes (or relocates) the file.
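One rough sketch of that approach would use a systemd path unit to watch the drop directory; the unit names, the directory, and the helper script here are all hypothetical:

```ini
# /etc/systemd/system/kube-submit.path (hypothetical)
[Unit]
Description=Watch for pending Kubernetes configurations

[Path]
# activate kube-submit.service whenever a file lands in the drop directory
DirectoryNotEmpty=/var/lib/kube-pending

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/kube-submit.service (hypothetical)
[Unit]
Description=Submit pending configurations to Kubernetes
After=kube-apiserver.service

[Service]
Type=oneshot
# the helper would pass each pending file to kubecfg, then delete
# (or relocate) it
ExecStart=/usr/local/bin/kube-submit
```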