libvirt_cloud/README.md

Spin up libvirt instances from a cloud-init capable base that can be
maintained using chroot on qcow2, and by ansible using libvirt_qemu.
This builds on two projects:
1) bash scripts to take a Gentoo qcow2 nightly box and use it as a base layer
for another qcow image that you spin up in libvirt, overlaying the base layer
for testing. You can then throw away that image for each test (if you like);
it takes less than a minute to make the base qcow2 layer, and another minute
to make the overlay layer and spin it up.
https://github.com/earlruby/create-vm/
2) python scripts to take a Gentoo stage3 and portage to build a qcow2 box and
use it as a base layer for a cloud-init capable qcow image.
https://github.com/NucleaPeon/gentooimgr/
The huge advantage of this is that you can maintain the base image
under chroot, so you can build test instances that have no incoming
network and still maintain them using tools like ansible, which
support maintaining chroot instances. The problem with using 2) gentooimgr
is that the configuration code of the install step is written in
Python, rather than handled by a more capable tool like ansible.
The problem with using 1) create-vm is that the base box may be old (it
is) and you may want to use an existing gentoo kernel and initramfs,
and the base box may be missing qemu-guest-agent (it is). *Much* worse
is that to use cloud-init you have to load rust and 344 crates from
God only knows where that you'll never audit, just to get oauth: no
thanks Google.
You can use ansible to maintain the base layer using chroot, and use
it again to create and spin up the test instance. And as we install
qemu-guest-agent in the base layer, you can manage the test instance
with ansible using libvirt guest-agent, even if the test instance
allows no incoming network.
For now, the code is written to build Gentoo base images, but it could
be extended to other bases. It can build a Gentoo base image from a
Gentoo system, or from another system with a Gentoo system mounted on some
directory that can be chrooted into and have files copied from it. It may be
able to build from a Debian system without a mounted Gentoo filesystem, but this
is currently untested.
## Workflow
1) We build the qcow2 base image that we can maintain by chroot
mounting the disk, so we can sort out problems conveniently. We started
doing that by using EarlRuby's python approach, but later rewrote an
old ansible role by https://github.com/agaffney/ansible-gentoo_install/
That ansible role is in roles/ansible-gentoo_install/tasks/ which is
executed as an included role by the toxcore role.
It's a very basic piece of coding that works on a local connection
and is run on the host by the build_base Makefile target. It starts
out as a local connection play, and runs chroot internally when it needs it.
You must set these variables on a host in hosts.yml in the linux_chroot_group:

    BOX_NBD_DEV: nbd1
    BOX_NBD_MP: /mnt/gentoo
    BOX_NBD_FILES: "/i/data/Agile/tmp/Topics/GentooImgr"
    BOX_NBD_BASE_QCOW: "/g/Agile/tmp/Topics/GentooImgr/gentoo.qcow2"

This role is slow and may take an hour or more;
it will build the BOX_NBD_BASE_QCOW.
As a safety feature you must create and open the qcow base image before
running the roles: by design, the roles do not run qemu-nbd -c or
qemu-nbd -d. You may also choose to download the latest Gentoo stage3 and
portage files to the directory specified in hosts.yml as BOX_NBD_FILES.
These
2) We build the qcow2 overlay image that we can maintain by libvirt.
It is run on the host by the build_overlay Makefile target which runs
/usr/local/bin/toxcore_build_overlay_qcow.bash. It gets its parameters from
the hosts.yml file from the host called gentoo1 in the linux_libvirt_group.
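
The overlay step relies on qcow2 backing files: the overlay records only the blocks that differ from the base. The following is a minimal sketch of that idea, not the contents of toxcore_build_overlay_qcow.bash. The helper name `overlay_cmd` and the example paths are hypothetical, and the command is only printed so it can be reviewed before it is run.

```shell
# Sketch only: print the qemu-img command that would create an overlay
# image backed by a base image. overlay_cmd and the paths are hypothetical.
overlay_cmd() {
    local base=$1      # existing base image, e.g. the BOX_NBD_BASE_QCOW value
    local overlay=$2   # new overlay image to create
    # -b names the backing file, -F declares the backing file's format.
    printf 'qemu-img create -f qcow2 -F qcow2 -b %s %s\n' "$base" "$overlay"
}

overlay_cmd /path/to/gentoo.qcow2 /path/to/gentoo1.qcow2
```

Throwing the overlay away and re-running the command gives a fresh test image without rebuilding the base.
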
## Roles
There are 3 ansible roles:
1. base : The base role sets up the basics and is required to be run.
It sets up the essential parameters to run roles on the host or client.
Check the settings in roles/base/defaults/main.yml before running the role.
2. proxy : The proxy role sets up the networking with proxies,
and is required to be run, even if you don't use a proxy.
It sets proxying and installs basic packages on the host or client.
Check the settings in roles/proxy/defaults/main.yml before running the role.
3. toxcore : This role sets up the software to run libvirt on the host.
Check the settings in roles/toxcore/defaults/main.yml before running the role.
The host creates the base qcow2 image and then creates the overlay
image. When both are created, it installs Tox software on the host and
client.

In addition, toxcore calls an included role ansible-gentoo_install.
This is an updated version of the abandoned
https://github.com/agaffney/ansible-gentoo_install/ This role,
when run on the host, builds the Gentoo base qcow image. As a safety
feature, you must create the qcow2 image and activate it with:

    modprobe nbd
    qemu-img create -f qcow2 $BOX_NBD_BASE_QCOW 20G
    qemu-nbd -c /dev/$BOX_NBD_DEV $BOX_NBD_BASE_QCOW

and put these values into the hosts.yml file in the pentoo or devuan
target, depending on your host operating system. The filesystem that
holds the base qcow2 $BOX_NBD_BASE_QCOW must have at least 12G available,
and may grow to almost 20G.
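
Because the image can grow well past its initial size, it may be worth checking free space before creating it. This is a minimal sketch under the assumption that `df -P` reports the target filesystem on a single line; `has_free_gib` is a hypothetical helper and is not part of the repository.

```shell
# Sketch only: check free space before creating the base image.
# has_free_gib is a hypothetical helper, not part of the repository.
has_free_gib() {
    local dir=$1        # directory that will hold the image
    local need_gib=$2   # required free space in GiB
    local avail_kib
    # df -P prints one portable line per filesystem; column 4 is available KiB.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((need_gib * 1024 * 1024)) ]
}

# Falls back to /tmp when BOX_NBD_BASE_QCOW is not set in the environment.
if has_free_gib "$(dirname "${BOX_NBD_BASE_QCOW:-/tmp/gentoo.qcow2}")" 12; then
    echo "at least 12G free for the base image"
else
    echo "less than 12G free: the base image may not fit" >&2
fi
```
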
After you have finished building the base qcow2 image, you will want
to dismount it with qemu-nbd -d /dev/$BOX_NBD_DEV. Be careful and look
in /proc/partitions to see if it is still there after you dismount it:
if the partition is busy the dismount will fail silently, and you can
get into trouble if partprobe complains. You may have to reboot,
or it may resolve itself. Weird.
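
The /proc/partitions check can be scripted. This is a minimal sketch; `nbd_still_listed` is a hypothetical helper, and its second argument exists only so the check can be tried against a saved copy of the table.

```shell
# Sketch only: report whether a block device is still listed in a
# partitions table. nbd_still_listed is a hypothetical helper.
nbd_still_listed() {
    local dev=$1                         # device name without /dev/, e.g. nbd1
    local table=${2:-/proc/partitions}   # overridable, to test against a copy
    # Column 4 is the device name; match nbd1 itself and nbd1p1, nbd1p2, ...
    awk -v d="$dev" '
        $4 == d || index($4, d "p") == 1 { found = 1 }
        END { exit !found }
    ' "$table"
}

if nbd_still_listed "${BOX_NBD_DEV:-nbd1}"; then
    echo "device is still listed: the dismount may not have completed" >&2
else
    echo "device is no longer listed"
fi
```
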
Each role has been conditionalized to run with different connections.
## Connection Types
There are 3 ansible connection types:
1. localhost : ( ansible_connection == local )
Running the roles with a local connection will set up the host to
be able to run the software. The toxcore role will build the
toxcore software on the localhost and run tests on it. It will also
build the base qcow2 image that will underlie the overlay box.
2. chroot : ( ansible_connection == chroot )
When you have built the base box, you can chroot mount the qcow2 image
with qemu-nbd, so all of the configuring of the base can be done without
a network. A local bash script then builds the overlay qcow2 instance,
in less than a minute.
3. remote : ( ansible_connection == libvirt_qemu )
The base box provides the libvirt_qemu connection that is used
to run ansible roles on the overlay image. The toxcore role will
build the toxcore software in the overlay and run tests on it.

All 3 roles can be run with all 3 connection types.
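
For the chroot connection, entering the attached base image by hand looks roughly like the following. This is a sketch that only prints the commands: `chroot_cmds` is a hypothetical helper, and the partition number (1 by default) is an assumption, so check /proc/partitions for the real layout of your image.

```shell
# Sketch only: print the commands for entering the base image by hand.
# chroot_cmds is hypothetical; the default partition number is a guess.
chroot_cmds() {
    local dev=$1         # nbd device name, e.g. the BOX_NBD_DEV value
    local mp=$2          # mount point, e.g. the BOX_NBD_MP value
    local part=${3:-1}   # root partition number (assumption)
    printf 'mount /dev/%sp%s %s\n' "$dev" "$part" "$mp"
    # The usual pseudo-filesystems a Gentoo chroot expects.
    printf 'mount --types proc /proc %s/proc\n' "$mp"
    printf 'mount --rbind /sys %s/sys\n' "$mp"
    printf 'mount --rbind /dev %s/dev\n' "$mp"
    printf 'chroot %s /bin/bash\n' "$mp"
}

chroot_cmds "${BOX_NBD_DEV:-nbd1}" "${BOX_NBD_MP:-/mnt/gentoo}"
```
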
## Stages
There are 4 stages to building an instance:
1. Set up the localhost :
set up the host with the software and settings needed to build boxes.
2. Build the qcow2 base box :
3. Build the overlay qcow2 instance :
4. Test the overlay instance :
## Hosts.yml targets
## Makefile targets
all: install lint build check run test
1. install
2. lint
3. build
4. check
5. run
6. test
## Simplest usage
On Ansible 2.10 and later, you will need the community plugins installed.
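
A minimal sketch of installing them follows. The chroot connection plugin ships in community.general and the libvirt_qemu connection plugin in community.libvirt; whether the playbook needs further collections is not stated here, so treat the list as an assumption. `install_collections` is a hypothetical helper that prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch only: install the collections that provide the chroot and
# libvirt_qemu connection plugins. The collection list is an assumption.
install_collections() {
    local coll
    for coll in community.general community.libvirt; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Default: show what would be run.
            echo "ansible-galaxy collection install $coll"
        else
            ansible-galaxy collection install "$coll"
        fi
    done
}

install_collections
```
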
### Downloaded base qcow
### Created and copied base qcow on a Gentoo system
### Created and copied base qcow on a non-Gentoo system with a Gentoo mounted
## Advanced Usage
### GentooImgr: Gentoo Image Builder for Cloud and Turnkey ISO installers
There is a modified version of https://github.com/NucleaPeon/gentooimgr/
where we've modified the code a little to use Python logging. We can
still use it for the build stage, but we think the install stage is better
done using ansible, hence the libvirt_cloud playbook.
The code is in src/ansible_gentooimgr. The code is being supported as
an ansible module using library/ansible_gentooimgr.py which is a work
in progress; the idea is to use it for the build and status actions,
but handle the install tasks using ansible.
### ansible_local.bash
We have a script that calls ansible to run our play, ansible_local.yml,
but sets some defaults:

    [ -l limit ]
    [ -c connection ]
    [ --skip comma,separated ]
    [ --tags comma,separated ]
    [ --check ]
    [ --diff ]
    [ --step ]
    [ --verbose 0-3 ]  higher number, more debugging
    roles - base and proxy roles will always be run.
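
As an illustration of how the options combine, the sketch below assembles a command line and prints it. `local_cmd` is a hypothetical helper, and the limit and connection values passed to it are examples rather than tested settings; check the script itself for the values it accepts.

```shell
# Sketch only: assemble an ansible_local.bash command line from the
# options listed above and print it. local_cmd is a hypothetical helper.
local_cmd() {
    local limit=$1
    local connection=$2
    shift 2
    # Remaining arguments are role names; base and proxy always run.
    echo "ansible_local.bash -l $limit -c $connection --check --diff --verbose 1 $*"
}

local_cmd gentoo1 libvirt_qemu toxcore
```

Using --check and --diff together gives a preview of what a run would change before anything is applied.
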