Fun with devicemapper snapshots
I find myself working with Raspbian disk images fairly often. A typical workflow is:
- Download the disk image.
- Mount the filesystem somewhere to check something.
- Make some changes or install packages just to check something else.
- Crap I’ve made changes.
…at which point I need to fetch a new copy of the image next time I want to start fresh.
Sure, I could just make a copy of the image and work from there, but what fun is that? This seemed like a perfect opportunity to learn more about the device mapper and in particular how the snapshot target works.
Making sure we have a block device
The device mapper only operates on block devices, so the first thing we need to do is to make the source image available as a block device. We can do that with the losetup command, which maps a file to a virtual block device (or loop device).
I run something like this:
losetup --find --show --read-only 2017-11-29-raspbian-stretch-lite.img
This will find the first available loop device and use it to make my disk image available in read-only mode, printing the name of the device it selected. Those of you who are familiar with losetup may be thinking, “you know, losetup knows how to handle partitioned devices”, but I am ignoring that for the purpose of using device mapper to solve things.
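If you’re scripting this, it’s convenient to capture that name in a variable. A minimal sketch, assuming the same image file and a root shell:
loopdev=$(losetup --find --show --read-only 2017-11-29-raspbian-stretch-lite.img)
echo "image is available as ${loopdev}"
On my system the image ended up on /dev/loop0, which is the device I’ll use in the examples below.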
Mapping a partition
We’ve just mapped the entire disk image to a block device. We can’t use this directly because the image has multiple partitions:
# sfdisk -l /dev/loop0
Disk /dev/loop0: 1.7 GiB, 1858076672 bytes, 3629056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x37665771
Device       Boot Start     End Sectors  Size Id Type
/dev/loop0p1       8192   93236   85045 41.5M  c W95 FAT32 (LBA)
/dev/loop0p2      94208 3629055 3534848  1.7G 83 Linux
We want to expose partition 2, which contains the root filesystem. We can see from the above output that partition 2 starts at sector 94208 and extends for 3534848 sectors (where a sector is, for our purposes, 512 bytes). If you need to get at this information programmatically, sfdisk has a --json option that can be useful; for example:
# p_start=$(sfdisk --json /dev/loop0 |
jq ".partitiontable.partitions[1].start")
# echo $p_start
94208
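The same approach works for the partition size, since the JSON output also includes a size field (in sectors):
# p_size=$(sfdisk --json /dev/loop0 |
jq ".partitiontable.partitions[1].size")
# echo $p_size
3534848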
We want to expose this partition as a distinct block device. We’re going to do this by creating a device mapper linear target. To create device mapper devices, we use the dmsetup command; the basic syntax is:
dmsetup create <name>
By default, this expects to read a table describing the device on stdin, although it is also possible to specify one on the command line. A table consists of one or more lines of the format:
<base> <length> <target> [<options>]
Where <base> is the starting offset in sectors for this particular segment, <length> is the length in sectors, <target> is the target type (linear, snapshot, zero, etc.), and the options are specific to the particular target in use.
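As an aside, if your table is a single line you can skip the heredoc and pass it directly with dmsetup’s --table option. A throwaway example using the zero target:
# dmsetup create example --table "0 1024 zero"
# dmsetup remove example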
To create a device exposing partition 2 of our image, we run:
# dmsetup create base <<EOF
0 3534848 linear /dev/loop0 94208
EOF
This creates a device named /dev/mapper/base. The first 3534848 sectors of this device will be provided by /dev/loop0, starting at offset 94208. At this point, we can actually mount the filesystem:
# mount -o ro /dev/mapper/base /mnt
# ls /mnt
bin dev home lost+found mnt proc run srv tmp var
boot etc lib media opt root sbin sys usr
# umount /mnt
But wait, there’s a problem! These disk images usually have very little free space. We’re going to want to extend the length of our base device by some amount so that we have room for new packages and so forth. Fortunately, since all writes are ultimately going to land in a snapshot, we don’t need to back this extra room with real space: we can add another segment to our table that uses the zero target.
Let’s first get rid of the device we just created:
# dmsetup remove base
And create a new one:
# dmsetup create base <<EOF
0 3534848 linear /dev/loop0 94208
3534848 6950912 zero
EOF
This extends our base device out to 5G (a total of 10485760 sectors), although attempting to read anything beyond sector 3534848 will return zeros, and writes there will be discarded. But that’s okay, because the space available for writes is going to come from a COW (“copy-on-write”) device associated with our snapshot: in other words, the capacity of our snapshot will be linked to the size of our COW device, rather than the size of the underlying base image.
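We can confirm the new size with blockdev; the figure follows directly from the table we just loaded (10485760 sectors of 512 bytes is 5G):
# blockdev --getsz /dev/mapper/base
10485760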
Creating a snapshot
Now that we’ve sorted out our base device, it’s time to create the snapshot device. According to the documentation, the table entry for a snapshot looks like:
snapshot <origin> <COW device> <persistent?> <chunksize>
We have our <origin> (that’s the base image we created in the previous step), but what are we going to use as our <COW device>? This is a chunk of storage that will receive any writes to the snapshot device. This could be any block device (another loop device, an LVM volume, a spare disk partition), but for my purposes it seemed convenient to use a RAM disk, since I had no need for persistent changes. We can use the zram kernel module for that. Let’s start by loading the module:
# modprobe zram
Without any additional parameters this will create a single RAM disk, /dev/zram0. Initially, it’s not very big:
# blockdev --getsz /dev/zram0
0
But we can change that using the sysfs interface provided in /sys/block/zram0/. The disksize option controls the size of the disk. Let’s say we want to handle up to 512M of writes; that means we need to write the value 512M to /sys/block/zram0/disksize:
echo 512M > /sys/block/zram0/disksize
And now the device reports the expected size (512M is 1048576 512-byte sectors):
# blockdev --getsz /dev/zram0
1048576
We now have all the requirements to create our snapshot device:
# dmsetup create snap <<EOF
0 10485760 snapshot /dev/mapper/base /dev/zram0 N 16
EOF
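In this table, N marks the snapshot as non-persistent (the COW metadata doesn’t need to survive a reboot, which suits our RAM-backed device), and 16 is the chunk size in 512-byte sectors. We can read the table back to double-check (the major:minor numbers below are illustrative and will vary):
# dmsetup table snap
0 10485760 snapshot 253:0 252:0 N 16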
This creates a device named /dev/mapper/snap. It is a 5G block device backed by /dev/mapper/base, with changes written to /dev/zram0. We can mount it:
# mount /dev/mapper/snap /mnt
# df -h /mnt
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/snap  1.7G  943M  623M  61% /mnt
And we can resize it:
# resize2fs !$
resize2fs /dev/mapper/snap
resize2fs 1.43.3 (04-Sep-2016)
Filesystem at /dev/mapper/snap is mounted on /mnt; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/mapper/snap is now 1310720 (4k) blocks long.
# df -h /mnt
Filesystem        Size  Used Avail Use% Mounted on
/dev/mapper/snap  4.9G  944M  3.8G  20% /mnt
You’ll note here that it looks like we have a 5G device, because that’s the size of our extended base device. But because we’ve only allocated 512M to our COW device, we can actually only handle up to 512M of writes before we invalidate the snapshot.
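If we do exceed that limit, the kernel invalidates the snapshot and I/O to it starts failing; per the dm-snapshot documentation (this is a sketch, not output captured from this session), dmsetup status then reports Invalid in place of its usual usage figures:
# dmsetup status snap
0 10485760 snapshot Invalid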
Short of that, we can watch how much of our COW device has been consumed by changes using dmsetup status:
# dmsetup status snap
0 10485760 snapshot 107392/1048576 0
This tells us that 107392 sectors of 1048576 total have been consumed so far (in other words, about 52M out of 512M). We can get similar information from the perspective of the zram module using zramctl:
# zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 lzo           512M 52.4M   34K   72K       4
This information is also available in /sys/block/zram0/mm_stat, but without any labels.
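When you’re done experimenting, everything tears down in the reverse of the order it was built; a sketch, assuming the device names used above:
# umount /mnt
# dmsetup remove snap
# dmsetup remove base
# losetup --detach /dev/loop0
# echo 1 > /sys/block/zram0/reset
Writing 1 to /sys/block/zram0/reset discards the RAM disk’s contents and frees its memory; alternatively, modprobe -r zram removes the module entirely.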