A starting guide to Kernel Development

Preamble

This document has passed through many reviewers with many perspectives, each asking for conflicting changes. Its aims differ from those of other Kernel documentation: it is a teaching tool, serving either as a concise starting point for a curious reader or as a dissected script for setting up a quick development environment. It is aimed at readers with at least a middling skill level in C and Linux, and so should be suitable for upper-level university courses. Parts a reader does not understand are either readily searchable or pitched at a level that builds the research ability Kernel development requires. It does not explain everything; it gives only a high-level description that keeps readers on track with what is happening, since losing that thread has been the deficiency reviewers raised most consistently.

Introduction

This guide intends to serve as an adaptable example of how to set up an environment for Kernel Development. As its example, it uses a development pattern for the fanotify(7) system that does not depend on specific hardware. It consists of the following high-level steps:

  • Configure the shell environment, folders, and scripts

  • Gather a suitable Virtual Machine image and Linux repository

  • Configure the Virtual Machine image

  • Obtain a Kernel configuration

  • Compile the Kernel

  • Use gdb to show basic interaction and debugging steps with the Kernel

C development past that point is considered out of scope, as is developing for any part of the Kernel proper since there are better sources for each component.

Requirements

  • git

  • wget

  • qemu

  • gdb

  • g++ supporting C++23

  • make

  • bison

  • flex

  • guestmount

  • expect
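Before going further, it can help to confirm that each tool in the list above is on the PATH. The following is a minimal sketch and not part of the original procedure: missing_tools is a hypothetical helper name, and the binary names (for example qemu-system-x86_64 for qemu) are assumptions taken from the commands used later in this guide. It does not check that g++ actually supports C++23.

```shell
# Print each command from the argument list that is not found on PATH.
# "missing_tools" is a hypothetical helper name, not an existing utility.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Binary names are assumed from the commands used in this guide.
missing_tools git wget qemu-system-x86_64 gdb g++ make bison flex guestmount expect
```

No output means every tool was found. Package names for the missing ones differ between distributions.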

NOTE: On some distributions, such as Ubuntu, kernel images (/boot/vmlinuz*) are not world-readable, and this causes problems for guestmount. Administrator permissions are required to make the kernel chosen by guestmount readable. The effectiveness of this security restriction is debated; on a one-off development system, making the file world-readable is considered safe.
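Whether this applies to a given system can be checked before running guestmount. The sketch below is an illustration only: unreadable_kernels is a hypothetical helper name, and the chmod shown in the comment is one possible fix, not a recommendation for shared or production machines.

```shell
# List kernel images in a directory (default /boot) that the current user
# cannot read. "unreadable_kernels" is a hypothetical helper name.
unreadable_kernels() {
    dir="${1:-/boot}"
    for k in "$dir"/vmlinuz*; do
        [ -e "$k" ] || continue            # the glob matched nothing
        [ -r "$k" ] || printf '%s\n' "$k"
    done
}

unreadable_kernels /boot
# For each path printed, an administrator could run, for example:
#   sudo chmod +r <path printed above>
```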

Setup

Configure Environment

This step will set up some environment variables which will be used in this guide, as well as variables used for guestmount and to allow gdb to use the debugging scripts provided by the Linux Kernel source repository.

mkdir -p "$HOME/Documents/linux-workspace/kernel-dev"
cd "$HOME/Documents/linux-workspace/kernel-dev"
export LINUX_REPO_PATH="$(pwd)/linux"

echo 'qemu-system-x86_64 -s -cpu host -nographic -accel kvm -m 1G Arch-Linux-x86_64-basic.qcow2' > run_vm.sh
chmod +x run_vm.sh

echo 'qemu-system-x86_64 -s -cpu host -nographic -accel kvm -m 1G -kernel linux/vmlinux -append "root=/dev/sda3 rw console=ttyS0,115200n8 nokaslr" Arch-Linux-x86_64-basic.qcow2 -S' > debug_vm.sh
chmod +x debug_vm.sh

echo 'guestmount -w -a Arch-Linux-x86_64-basic.qcow2 -m /dev/sda3 mnt/' > mount_vm.sh
chmod +x mount_vm.sh

echo "export LINUX_REPO_PATH=\"$LINUX_REPO_PATH\"" >> ~/.bashrc

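# gdb does not expand shell variables in gdbinit, so the path is expanded here
mkdir -p "$HOME/.config/gdb"
echo "add-auto-load-safe-path $LINUX_REPO_PATH/scripts/gdb/vmlinux-gdb.py" >> "$HOME/.config/gdb/gdbinit"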

echo 'export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1' >> $HOME/.bashrc
echo 'export USERCFLAGS=" -ggdb "' >> $HOME/.bashrc
source "$HOME/.bashrc"
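The QEMU command written into debug_vm.sh packs several decisions into one line. The sketch below rebuilds the same command one flag at a time so that each part can carry a comment. debug_vm_cmd is a hypothetical helper name; it only prints the command and does not start QEMU.

```shell
# Print the debug_vm.sh command line, built one commented flag at a time.
# "debug_vm_cmd" is a hypothetical helper; nothing is executed.
debug_vm_cmd() {
    set -- qemu-system-x86_64
    set -- "$@" -s                      # gdbserver on TCP port 1234 (short for -gdb tcp::1234)
    set -- "$@" -S                      # keep the CPU stopped until gdb sends "continue"
    set -- "$@" -cpu host -accel kvm    # use KVM and expose the host CPU model
    set -- "$@" -nographic              # no display window; serial console on this terminal
    set -- "$@" -m 1G                   # guest memory
    set -- "$@" -kernel linux/vmlinux   # boot the kernel built on the host
    set -- "$@" -append '"root=/dev/sda3 rw console=ttyS0,115200n8 nokaslr"'
    set -- "$@" Arch-Linux-x86_64-basic.qcow2
    printf '%s\n' "$*"
}

debug_vm_cmd
```

On the kernel command line, console=ttyS0,115200n8 sends kernel messages to the serial port, and nokaslr disables address randomization so that the addresses gdb reads from vmlinux match the running kernel. run_vm.sh is the same command without -S, -kernel, and -append, so it boots the image's own kernel immediately.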

Obtain Resources

Here we need to obtain the basic materials to work with. These are a Virtual Machine image and a copy of the Linux Kernel source code itself.

NOTE: While this example uses the main Linux trunk, you should look up the most specific repo and branch for your development case.

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
wget https://geo.mirror.pkgbuild.com/images/latest/Arch-Linux-x86_64-basic.qcow2

Modify the Virtual Machine to be scriptable

This step performs several actions and is slow. First, it copies the Linux Kernel source into the VM. Next, it changes the default GRUB settings in the VM such that on each GRUB update, the VM will be set up to use a serial console. After that, since we require a serial console to script VM interaction, an unfortunately complex pipeline is used to augment the grub.cfg file so that both GRUB and Linux use a configured serial console to interact.

mkdir -p mnt
./mount_vm.sh
cp -r linux mnt/home/arch/

pushd mnt/
echo 'GRUB_TERMINAL_INPUT="console serial"' >> etc/default/grub
echo 'GRUB_TERMINAL_OUTPUT="gfxterm serial"' >> etc/default/grub
echo 'GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200"' >> etc/default/grub

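# grub.cfg lives in boot/grub inside the image (assumed: the default Arch layout),
# so the remaining commands run from that directory
cd boot/grub
echo 'serial --unit=0 --speed=115200' > tmp
cat grub.cfg >> tmp
mv -f tmp grub.cfg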

sed 's/terminal_input console/terminal_input serial/' < grub.cfg | sed 's/terminal_output gfxterm/terminal_output serial/' > tmp && mv -f tmp grub.cfg

BOOT_LINES="$(grep '/boot/vmlinuz-linux' < grub.cfg | sort -u)"
echo "$BOOT_LINES" | while IFS= read -r l ;\
    do l_escaped="$(printf '%s' "$l" | sed -e 's/[\/&]/\\&/g')" ;\
    l_escaped="$(printf '%s' "$l_escaped" | sed -e 's/\t/\\t/g')" ;\
    sed "s/$l_escaped/$l_escaped console=ttyS0,115200n8/g" < grub.cfg | sed 's/console=ttyS0,115200n8 console=ttyS0,115200n8/console=ttyS0,115200n8/g' > tmp ;\
    mv -f tmp grub.cfg ;\
    chmod 664 grub.cfg ;\
done

popd

# guestunmount is the libguestfs counterpart to guestmount
guestunmount mnt
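To see what the grub.cfg edits above amount to, the following reduced sketch applies them to a throwaway sample file rather than to the real grub.cfg. It is a simplification and not the guide's pipeline: one sed invocation replaces the per-line loop, add_serial_console is a hypothetical helper name, and the sample lines are made up for illustration.

```shell
# Work on a throwaway sample; these lines are illustrative, not copied
# from the Arch image.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
terminal_input console
terminal_output gfxterm
    linux /boot/vmlinuz-linux root=UUID=0000 rw
    initrd /boot/initramfs-linux.img
EOF

# Switch GRUB to the serial terminal, then append the console argument to
# every kernel line that does not already carry it.
add_serial_console() {
    sed -e 's/terminal_input console/terminal_input serial/' \
        -e 's/terminal_output gfxterm/terminal_output serial/' \
        -e '/\/boot\/vmlinuz-linux/{/console=ttyS0,115200n8/!s/$/ console=ttyS0,115200n8/;}' \
        "$1"
}

add_serial_console "$sample"
```

Because the last expression skips lines that already contain the console argument, running it twice leaves the file unchanged, which is the same property the original pipeline gets from its de-duplicating sed.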

Obtain applicable Kernel configuration

All of the preceding work exists so that the following script can run. It is an expect script: it waits for specific output from the VM and sends the matching input in response. It configures the VM and generates the critical .config file needed to compile the Linux Kernel.

The .config is generated inside the VM, but the Kernel itself is not compiled there, purely for reasons of resource and time efficiency.

As a script:

#!/usr/bin/expect -f

set timeout -1

set log [lindex $argv 0]

spawn qemu-system-x86_64 -s -cpu host -nographic -accel kvm -m 1G Arch-Linux-x86_64-basic.qcow2

expect "login: "
send "arch\r"
expect "Password: "
send "arch\r"

expect "$ "
send "sudo pacman -Syu --noconfirm --needed base-devel xmlto inetutils bc git cpio perl\r"


expect "$ "
send "cd linux\r"

expect "$ "
send "sudo make localyesconfig\r"

expect {
    "(NEW)" {
        send "\r"
        exp_continue
    }
    "$ " {
        send "sudo poweroff\r"
        expect eof
        exit
    }
}

Finally, we copy out the .config from the VM to the host’s Linux Kernel repo. However, a few tweaks are known to be needed, and these are performed below. Further changes may be accomplished through the use of commands like make xconfig.

./mount_vm.sh
cp mnt/home/arch/linux/.config linux/.config
sync
guestunmount mnt

pushd "$LINUX_REPO_PATH"

# Drop any existing setting for each option, in either form
# ("CONFIG_X=..." or "# CONFIG_X is not set"), before appending the new value.
for opt in SATA_AHCI DEBUG_INFO DEBUG_INFO_DWARF5 GDB_SCRIPTS \
           DEBUG_INFO_REDUCED KGDB FRAME_POINTER KVM_GUEST RANDOMIZE_BASE
do
    sed -i -E "/^(# )?CONFIG_${opt}( is not set|=)/d" .config
done

echo "CONFIG_SATA_AHCI=y" >> .config
echo "CONFIG_DEBUG_INFO=y" >> .config
echo "CONFIG_DEBUG_INFO_DWARF5=y" >> .config
echo "CONFIG_GDB_SCRIPTS=y" >> .config
echo "CONFIG_DEBUG_INFO_REDUCED=n" >> .config
echo "CONFIG_KGDB=y" >> .config
echo "CONFIG_FRAME_POINTER=y" >> .config
echo "CONFIG_KVM_GUEST=y" >> .config
echo "CONFIG_RANDOMIZE_BASE=n" >> .config
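After editing .config by hand, it is worth confirming which value each option ends up with. The sketch below is an illustration under two assumptions: config_value is a hypothetical helper name, and it relies on Kconfig taking the last assignment it reads when an option appears more than once. It runs against a small made-up sample so that it does not depend on a real .config.

```shell
# Print the value an option ends up with in a .config-style file.
# "config_value" is a hypothetical helper name.
config_value() {
    file="$1"; opt="$2"
    # Match both "CONFIG_X=value" and "# CONFIG_X is not set"; keep the last.
    line="$(grep -E "^(# )?CONFIG_${opt}( is not set|=)" "$file" | tail -n 1)"
    case "$line" in
        "# "*) echo n ;;
        *=*)   echo "${line#*=}" ;;
        *)     echo unset ;;
    esac
}

# Made-up sample with duplicate entries, as appending to a .config produces.
demo="$(mktemp)"
printf '%s\n' '# CONFIG_KGDB is not set' 'CONFIG_RANDOMIZE_BASE=y' \
              'CONFIG_KGDB=y' 'CONFIG_RANDOMIZE_BASE=n' > "$demo"

for opt in KGDB RANDOMIZE_BASE GDB_SCRIPTS; do
    printf '%s=%s\n' "$opt" "$(config_value "$demo" "$opt")"
done
```

Against the real file, the call would be config_value .config KGDB from inside the Linux repository. The build may still change a value whose dependencies are not met, so it is worth checking again after compiling.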

Compile Linux

The last setup step is to compile the kernel, as shown below. This is CPU and memory intensive, and it is notorious for failing for odd reasons: the existence of a configuration does not guarantee that it builds or boots. If you have difficulty here, it is best to reach out to the #kernel IRC channel for advice.

NOTE: Don’t forget to adjust the “-j” flag for your system. Many users will need to reduce the number of parallel jobs running.

NOTE: This step may require manual intervention. When prompted, simply accept the default option for [Y]es or [N]o. Avoid [M]odule: because QEMU loads the kernel externally, kernel modules will not be available to load in later steps.

# One job per CPU; a bare -j places no limit on the number of parallel jobs
make -j"$(nproc)"

popd
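One way to choose a job count is to cap it by both CPU count and available memory. The sketch below is a rough heuristic, not a kernel requirement: pick_jobs is a hypothetical helper name, and the figure of 2 GiB of RAM per job is an assumed rule of thumb that may be too cautious or too generous for a given configuration.

```shell
# Pick a job count: no more than the CPU count, and no more than one job
# per 2 GiB of RAM (an assumed rule of thumb). Always at least 1.
# "pick_jobs" is a hypothetical helper name.
pick_jobs() {
    cpus="$1"; mem_kib="$2"
    by_mem=$(( mem_kib / (2 * 1024 * 1024) ))
    n="$cpus"
    [ "$by_mem" -lt "$n" ] && n="$by_mem"
    [ "$n" -lt 1 ] && n=1
    echo "$n"
}

pick_jobs 8 4194304    # 8 CPUs, 4 GiB of RAM: prints 2
```

On a Linux host the inputs could come from nproc and from the MemTotal line of /proc/meminfo, for example: make -j"$(pick_jobs "$(nproc)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)")"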

Develop and debug

At this juncture, the raw resources are present for development. Now, the rest needs to be shown through an example. This guide uses an example for fanotify(7) which is simple and easily adapted for other use cases.

Term 1

./debug_vm.sh
# <Go to Term 2>
# <login with username and password "arch">
git clone https://gitlab.com/anadon/monitor-for-free-space
cd monitor-for-free-space
make
./mnt_monitor / 5000
# <press Ctrl-Z to suspend it>
bg 1
touch ~/_1
# <Go to Term 2>

Term 2

gdb "$LINUX_REPO_PATH/vmlinux"
# Lines beginning with ":" are entered at the gdb prompt
: target remote :1234
: continue
# <Go to Term 1, wait for login>
: ^C
: break fs/notify/fanotify/fanotify_user.c:process_access_response
: continue
: bt
: step
: continue
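The part of the gdb session that follows the interrupt can be saved to a command file so it does not have to be retyped on every run. This is a sketch only: fanotify.gdb is a hypothetical file name, and the commands are the same ones entered by hand above. The breakpoint is still set after the VM has booted, as in the manual session.

```shell
# Write the repeatable part of the gdb session to a command file.
# "fanotify.gdb" is a hypothetical file name chosen for this sketch.
cat > fanotify.gdb <<'EOF'
break fs/notify/fanotify/fanotify_user.c:process_access_response
continue
bt
EOF

echo "In gdb, after interrupting the booted VM: source fanotify.gdb"
```

When sourced, gdb sets the breakpoint, resumes the VM, and prints the backtrace once the breakpoint is hit. Stepping from there remains interactive.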

And that’s it!

Special Thanks

  • ngn