Homework 1
In this homework you will create a minimal Linux distribution from scratch using a compiler toolchain, the Linux kernel sources, the busybox sources, and some tools to glue these parts together. Finally, qemu, a virtual machine emulator, will be used to test the resulting system components.
Overview
- Pre-Requisites
- Skills You Will Acquire
- Documentation (Human Language)
- Sysinfo Application
- Linux Kernel
- Run Linux in a Virtual Machine (with qemu)
- InitRamDisk #1
- Busybox
- InitRamDisk #2
- Test-Suite and Continuous Integration
- Build Instructions (hw1/hw1.sh)
- Build Artifacts (Binary Files)
- Other Files and Git
- Directory structure
- Enable System Shutdown
- Parse cmdline options
Pre-Requisites
- Complete Homework 0
- Have SSH/X2Go access to your group's syslab container
Skills You Will Acquire
During this assignment you'll gain experience in the following activities:
- Configuring and compiling the Linux Kernel for the x86_64 Architecture
- Configuring and compiling the busybox utility
- Using the Linux system call interface as a program developer
- Developing a small program that reads system information with the C language
- Assembling a Linux InitRamDisk file from scratch
- Using qemu to run Linux in a virtual machine
Preparation
Before you proceed, please research the following topics. There is no need to go into great detail at this point; simply get an overview of what they are and how they are related.
- Linux
- GCC - GNU Compiler Collection
- Qemu
- InitRamDisk/InitRamFS
- Linux System Calls
- Linux Filesystem Hierarchy Standard 3.0
Assignments
Not all assignments depend on each other, so some of them can be worked on independently. The table below lists the dependencies.
Assignment | Dependencies |
---|---|
Sysinfo Application | - |
Linux Kernel | - |
InitRamDisk #1 | Linux Kernel, Sysinfo Application |
Busybox | - |
InitRamDisk #2 | Linux Kernel, Sysinfo Application, Busybox |
Documentation (Human Language)
A non-technical assignment is to write documentation and answer questions. You can use either English or German while working on technical assignments. The suggestion is to use English for practice and to avoid awkward mixtures of English technical terms with German ;-)
hw1/QnA.md Question Boxes
Throughout the document you will find several question boxes.
These questions are meant to help you think through what you did and how you can solve the current part of the assignment.
Please keep a protocol for answering these questions in your project repository at $REPO_DIR/hw1/QnA.md.
The file must contain the questions with brief answers written in your own words.
hw1/README.md (optional)
You can use this document to make the following notes:
- difficulties throughout the homework
- design decisions that you think are important to emphasize
Sysinfo Application
Start by writing a little program to display some information about a running Linux system in the following format:
    Hostname: <hostname>
    Uptime: <uptime in seconds>
    Process count: <number of processes>
    Total RAM: <RAM size in bytes> Byte
    Free RAM: <Free RAM size in bytes> Byte
    Page size: <memory unit size in bytes> Byte
Your program can retrieve all this information using the Linux system calls mentioned in the preparation section.
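The C program remains the actual deliverable. As a rough cross-check of its output, the same values can be read from procfs with a short shell sketch. It assumes a Linux host (such as the syslab container) with procfs mounted at /proc and with awk, cut and getconf available; the function name print_sysinfo is made up for this example. The process count is taken from the numeric directories under /proc and may differ slightly from what a system call reports.

```shell
#!/bin/sh
# Cross-check sketch: print system information from procfs in the required format.
# Assumption: procfs is mounted at /proc and getconf is installed.
print_sysinfo() {
    hostname=$(cat /proc/sys/kernel/hostname)
    # /proc/uptime holds "<uptime> <idle>" in fractional seconds; keep the integer part.
    uptime=$(cut -d. -f1 /proc/uptime)
    # Every process has a numeric directory under /proc.
    procs=$(( $(ls -d /proc/[0-9]* 2>/dev/null | wc -l) ))
    # /proc/meminfo reports kB; convert to bytes.
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
    page=$(getconf PAGESIZE)

    echo "Hostname: $hostname"
    echo "Uptime: $uptime"
    echo "Process count: $procs"
    echo "Total RAM: $((total_kb * 1024)) Byte"
    echo "Free RAM: $((free_kb * 1024)) Byte"
    echo "Page size: $page Byte"
}

print_sysinfo
```

Comparing this output with that of your compiled binary on the same machine is a quick plausibility test; small differences in uptime and free RAM are expected because the values change between the two runs.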
Compilation
Use gcc to compile your program.
[warning] Produce a statically linked binary of your program code. You can verify that your binary doesn't have any dynamic dependencies using the file utility.
After compilation you can run the compiled program on the syslab container directly. This is possible because it has the same architecture as the virtual machine you will be using later on.
[info] To display the architecture on a Linux system, use the uname -m command.
[success] Questions
- What other information can uname tell about a system?
- How can you instruct gcc to produce statically linked binaries?
- Which tools are able to display the dynamic dependencies for binaries on Linux?
- What is the practical problem with dynamic linking when you want to install your program on a different system, e.g. a Virtual Machine?
Linux Kernel
The Linux Kernel is the core component of every Linux Distribution.
Sources
Everything starts with the Linux sources which can be downloaded from the official Linux Kernel website.
Use the following release for this assignment:
Kernel Configuration
This assignment will guide you through configuring a kernel that has only the options required for the rest of the assignment.
Change into the directory with the unpacked kernel sources. Guide:
- Run make allnoconfig to deselect all options
- Run make menuconfig to open an ncurses menu. This menu allows you to comfortably configure the kernel options and modules. The / key allows you to search through all options. Search and select the following options for the bare minimum operation within the virtual machine later on:
    [*] 64-Bit kernel
    General setup:
      -> Kernel compression mode (GZIP)
      [*] Initial RAM filesystem and RAM disk (initramfs/initrd) support
      Compiler optimization level (Optimize for size) --->
      [*] Configure standard kernel features (expert users)
        ->[*] Enable support for printk
      [*] Embedded system
    Executable file formats / Emulations:
      [*] Kernel support for elf binaries
      [*] Kernel support for scripts starting with #!
    Device Drivers:
      -> Generic Driver options
        [*] Maintain a devtmpfs filesystem to mount at /dev
      -> Character devices:
        [*] Enable TTY
        [*] Virtual terminal
        [*] Enable character translations in console
        [*] Support for console on virtual terminal
        [*] Support for binding and unbinding console drivers
        [*] Unix98 PTY support
        -> Serial Drivers
          ->[*] 8250/16550 and compatible serial support
          ->[*] Console on 8250/16550 and compatible serial port
    File systems
      -> Pseudo Filesystems:
        [*] /proc file system support
        [*] Sysctl support (/proc/sys)
        [*] sysfs file system support
- Look for an option to hardcode the system's hostname and change it to grp${N}, where N is your group number
- Save and exit the menuconfig. Your configuration is now stored at $KERNEL_SRC/.config!
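After saving, it can be useful to confirm that the selections actually ended up in the .config file. The following is a minimal sketch, not part of the official tooling: missing_config is a made-up helper that prints every given symbol that is not set to "y". The symbol names in the commented usage line are a partial set that, to my understanding, corresponds to some of the menu entries above; confirm them with the help text in menuconfig.

```shell
#!/bin/sh
# missing_config <config-file> <CONFIG_SYMBOL>...
# Prints each symbol that is not enabled as "=y" in the given kernel config file.
# Hypothetical helper for illustration only.
missing_config() {
    config=$1
    shift
    for opt in "$@"; do
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Example call (assumes KERNEL_SRC points at the unpacked kernel sources):
# missing_config "$KERNEL_SRC/.config" CONFIG_64BIT CONFIG_BLK_DEV_INITRD \
#     CONFIG_BINFMT_ELF CONFIG_DEVTMPFS CONFIG_SERIAL_8250_CONSOLE CONFIG_PROC_FS
```

An empty output means every listed symbol is enabled. Options that are stored as strings or numbers rather than y/n need a different pattern.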
[success] Questions
- How can you get more information about an item in the menuconfig?
- What is the relation between the menuconfig and the .config file in the kernel source directory?
- Which CONFIG_* options belong to the menu entries that are provided above?
Compilation
The kernel configuration can now be used by the build system to build only the code that has been configured.
time make -j5
This command builds the kernel, using 5 parallel jobs!
The time program measures the duration of the make execution.
[success] Questions
- What are the different times displayed by time?
- Where and which are the output binary files produced by the compilation?
- Which binary represents the bootable kernel image?
Run Linux in a Virtual Machine (with qemu)
The qemu-system-x86_64 emulator can be used to start a virtual machine on the running system.
Passing --help to this program shows all the options it accepts.
Qemu arguments
We want to run the virtual machine with specific arguments:
- -m <RAM size in meg>: we start using 64
- -nographic: we run without graphic support and provide a kernel file directly.
  [info] When your VM gets stuck, press ctrl+a followed by x to shut it down
- -kernel <path to the kernel image file>: this file is produced by the kernel compilation!
- -append "<kernel parameters>": With no graphic support you need to tell the kernel to use a text console by passing console=<tty-device>.
- -initrd <path to the initrd archive>: this file will be built later on
[success] Questions
- Which TTY device do you need to pass to the kernel for console input/output?
[info] In the previous configuration step you enabled a specific serial device that can be used for the kernel console.
Boot up the VM!
Boot your kernel and see how far your virtual machine boot process will go. Since you have not built the initrd file yet, omit this argument from the command.
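One way to keep the invocation manageable is to assemble it in a small function that prints the command before you run it. The sketch below uses only the options listed above, with -m 64 as suggested; the function name qemu_cmd and its argument order are made up for this example, and the console device is deliberately left as a placeholder for you to fill in.

```shell
#!/bin/sh
# qemu_cmd <kernel-image> <kernel-parameters> [initrd]
# Prints the qemu command line instead of running it, so it can be reviewed first.
# Hypothetical helper; paths without spaces are assumed.
qemu_cmd() {
    kernel=$1
    append=$2
    initrd=${3:-}    # optional: leave empty until the initrd has been built
    cmd="qemu-system-x86_64 -m 64 -nographic -kernel $kernel -append \"$append\""
    if [ -n "$initrd" ]; then
        cmd="$cmd -initrd $initrd"
    fi
    printf '%s\n' "$cmd"
}

# Without an initrd, as in this step:
qemu_cmd artifacts/bzImage "console=<tty-device>"
```

Once the printed line looks right, it can be copied to the shell or run with eval. The optional third argument adds -initrd for the later steps.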
[success] Questions
- Is the system in a usable state, e.g. can you use a shell to execute commands on it?
- If not, what is missing?
InitRamDisk #1
At this point, your Linux system in the virtual machine has no userland programs to execute, so it will simply stop after booting up. The next step is therefore to build a Linux userland file structure and pack it as a CPIO archive. The kernel unpacks this archive at boot time and can then execute the program(s) within.
The first initial ramdisk will contain only the static binary of your sysinfo application.
File Layout
Please create your first initial ramdisk with the following file layout.
    └── bin
        └── sysinfo
Produce initrd-sysinfo.cpio
Using find and cpio you can create an uncompressed CPIO archive.
As an example, the following command creates a cpio archive with the newc format, containing all files in the current directory.
find | cpio -L -v -o -H newc > initrd.cpio
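The example command archives whatever is in the current directory, so the file layout has to be staged first. A minimal sketch of both steps follows; the function names stage_initrd and pack_initrd and the directory names in the usage lines are made up for this example.

```shell
#!/bin/sh
# stage_initrd <binary> <staging-dir>
# Creates the bin/sysinfo layout below the staging directory.
stage_initrd() {
    mkdir -p "$2/bin"
    cp "$1" "$2/bin/sysinfo"
    chmod 755 "$2/bin/sysinfo"
}

# pack_initrd <staging-dir> <archive>
# Archives the staging directory in newc format. The subshell keeps the caller's
# working directory unchanged, and the paths stored in the archive are relative
# to the staging directory. The archive must live outside the staging directory,
# otherwise it would be packed into itself.
pack_initrd() {
    ( cd "$1" && find . | cpio -L -o -H newc ) > "$2"
}

# Example calls (hypothetical paths):
# stage_initrd artifacts/sysinfo build/initrd-sysinfo
# pack_initrd build/initrd-sysinfo artifacts/initrd-sysinfo.cpio
```

Running find from inside the staging directory matters: the kernel unpacks the archive into its root filesystem, so the stored paths must start at bin/ rather than at a host-side prefix.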
[success] Questions
- What could the -L parameter be useful for?
- How can you list the contents of a CPIO archive?
- What is the path of the program that the kernel can execute after unpacking the archive?
Qemu arguments and the init process
You will use the previous qemu command and extend it with the respective -initrd <file> and -append "<arg1> <arg2> ..." arguments.
[success] Questions
- What needs to be passed to the kernel within append in order to tell it what binary to execute?
- What is the default executable path of the kernel in case nothing is passed to change it?
- What is the complete qemu command line to run your sysinfo application as the init process?
Busybox
To prepare a more versatile userland you can leverage the busybox utility. Use the following stable release of busybox for this assignment:
Busybox Configuration
Change into the directory with the unpacked busybox sources. Then:
- Run make allnoconfig to deselect all options
- Run make menuconfig to open an ncurses menu. This menu lets you configure all the applets that will be available from the single multi-call binary produced in the compilation step.
[success] Questions
- How do multi-call binaries work?
- What applets are needed to allow us to interact with the system?
- How are symlinks to these binaries interpreted?
Compilation
Analogous to the kernel compilation, you can use the same command with the busybox sources:
make -j5
This should result in a binary named busybox in the build directory.
Please inspect this binary using the file utility.
[success] Questions
- Does your busybox file have dynamic link dependencies against libraries installed on the build host?
- How can you verify this on the command line?
[info] Busybox lets you configure the build system via menuconfig. Look for the term static.
InitRamDisk #2
For the second initial ramdisk you create a system that contains busybox and your sysinfo application. Busybox can be configured to provide an interactive shell interpreter and other utilities like cp, cat, etc.
[success] Questions
- What busybox options have you chosen for your new initrd, and why?
On-Disk-Filesystem Layout
An (ASCII) image says more than words:
    ├── bin
    │   ├── busybox
    │   └── sysinfo
    └── init
[success] Questions
- Which additional directories are required to make the system work at runtime?
- Which directories need to be created to be compliant with the Linux FHS 3.0?
The init file
The init file shall be an executable shell script - yet to be written by you! - that is responsible for setting up the userland at runtime.
It has to perform the following tasks on system boot:
- Set up the directory layout
- Install busybox applets as symlinks
- Mount the devtmpfs at /dev
- Mount the sysfs at /sys
- Mount the procfs at /proc
- Mount a tmpfs at /tmp
- Run the sysinfo application
- Present the user with an interactive shell
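The task list above could be sketched roughly as follows. This is one possible shape, not the reference solution, and it rests on several assumptions that the assignment does not state: busybox is installed at /bin/busybox, it was built with the sh, mkdir and mount applets and with support for --install, and applets are called through the busybox binary because no symlinks exist yet when init starts. The run helper and the DRY_RUN switch are made up so that the steps can be printed on the build host without mounting anything.

```shell
#!/bin/busybox sh
# Sketch of an init script. Assumptions: busybox lives at /bin/busybox and
# provides the sh, mkdir and mount applets plus the --install option.
BB=/bin/busybox
DRY_RUN=${DRY_RUN-1}    # default: only print the commands; set DRY_RUN= to execute

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

init_steps() {
    # Directory layout: mount points plus the applet install directories
    run "$BB" mkdir -p /dev /sys /proc /tmp /sbin /usr/bin /usr/sbin
    # Install busybox applets as symlinks
    run "$BB" --install -s
    # Pseudo filesystems
    run "$BB" mount -t devtmpfs devtmpfs /dev
    run "$BB" mount -t sysfs sysfs /sys
    run "$BB" mount -t proc proc /proc
    run "$BB" mount -t tmpfs tmpfs /tmp
    # Show the system information, then hand over to an interactive shell
    run /bin/sysinfo
    run "$BB" sh
}

init_steps
```

With the default setting the script only lists the eight commands, which makes it easy to review the order. In the initrd itself, DRY_RUN would be set to the empty string. Which further directories are needed, for example for FHS compliance, is left open here.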
[success] Questions
- What are all the mounts/filesystems that are active after the system has booted?
- How does the file tree under / look after the system is booted? (Please don't include all device, sys, and proc entries in your answer)
Result To Be Submitted
This section specifies exactly which files are part of the submission for this homework. Results for bonus assignments are fully covered within this section.
Test-Suite and Continuous Integration
Merge and run the continuous integration test suite.
Build Instructions (hw1/hw1.sh)
A shell script that reproduces the final result of your homework. This shell script will be used to verify your results, and does not need to include commands that run the interactive menuconfig. However, you may implement such functionality for working conveniently within your homework repository.
Arguments and Script behavior
Arguments | Function |
---|---|
(called without any arguments) | Build all artifacts starting with just the files that are checked in to git |
qemu_sysinfo | Run qemu-system-x86_64 , booting your system with the initrd-sysinfo |
qemu_busybox | Run qemu-system-x86_64 , booting your system with the initrd-busybox |
clean | Remove all files not tracked by git |
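The argument handling in the table maps naturally onto a case statement. The sketch below shows only the dispatch; the four action functions are placeholders that print what they stand for, and the name hw1_main is made up. For clean, git clean -fdx is one command that removes all untracked files, including ignored ones.

```shell
#!/bin/sh
# Placeholder actions: each one only names the work it stands for.
build_all()    { echo "build: kernel, busybox, sysinfo, initrd archives"; }
qemu_sysinfo() { echo "run: qemu with initrd-sysinfo.cpio"; }
qemu_busybox() { echo "run: qemu with initrd-busybox.cpio"; }
clean()        { echo "clean: git clean -fdx"; }

# Dispatch on the first argument as described in the table.
hw1_main() {
    case "${1:-}" in
        "")           build_all ;;
        qemu_sysinfo) qemu_sysinfo ;;
        qemu_busybox) qemu_busybox ;;
        clean)        clean ;;
        *)
            echo "usage: hw1.sh [qemu_sysinfo|qemu_busybox|clean]" >&2
            return 1
            ;;
    esac
}

# In hw1.sh the last line would forward the script arguments:
# hw1_main "$@"
```

Rejecting unknown arguments with a non-zero status keeps typos from silently triggering a full build.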
Build Artifacts (Binary Files)
The following files must be present in hw1/artifacts/ after the build is complete, but must not be tracked by git!
File | Target Architecture | Purpose |
---|---|---|
bzImage | x86_64 | Kernel binary used for qemu |
sysinfo | x86_64 | Statically linked binary of your little C program |
initrd-sysinfo.cpio | x86_64 | Initial RamDisk file in the form of a cpio archive, containing the sysinfo executable |
initrd-busybox.cpio | x86_64 | Initial RamDisk file in the form of a cpio archive, containing a statically linked busybox binary and an init executable |
Other Files and Git
[warning] Please do not add binary files to your git repository. Only add files that represent configuration, source code and build commands.
Directory structure
This is an example directory structure with all files listed that are tracked by git. It shows the directory for this homework and the top-level files for continuous integration.
    .
    ├── .travis.yml
    ├── ci
    │   └── ...
    └── hw1
        ├── busybox
        │   └── config
        ├── hw1.sh
        ├── initrd-busybox
        │   ├── bin
        │   │   ├── busybox -> ../../artifacts/busybox
        │   │   └── sysinfo -> ../../artifacts/sysinfo
        │   └── init
        ├── initrd-sysinfo
        │   └── bin
        │       └── sysinfo -> ../../artifacts/sysinfo
        ├── kernel
        │   └── config
        ├── QnA.md
        ├── README.md
        └── sysinfo
            └── src
                ├── Makefile
                └── sysinfo.c
Bonus Assignments
These optional assignments allow you to dig in a little deeper! They don't depend on each other so you can cherry-pick the ones you are interested in.
Enable System Shutdown
Configure the kernel and busybox to allow a clean shutdown via the poweroff command.
Parse cmdline options
If your kernel is passed hostname=<hostname> through its cmdline, your init script should set the hostname accordingly.
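A possible starting point is a small lookup function over the kernel command line, which Linux exposes in /proc/cmdline once procfs is mounted. The sketch below is an assumption-laden illustration: cmdline_get is a made-up name, the optional second argument exists only so the function can be tried against a regular file, and values containing spaces or shell wildcard characters are not handled.

```shell
#!/bin/sh
# cmdline_get <key> [file]
# Prints the value of the first "<key>=<value>" word found in the file
# (default: /proc/cmdline) and returns non-zero if the key is absent.
cmdline_get() {
    key=$1
    file=${2:-/proc/cmdline}
    for arg in $(cat "$file"); do
        case "$arg" in
            "$key"=*)
                printf '%s\n' "${arg#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible use inside init, after /proc has been mounted
# (assumes the busybox hostname applet is enabled):
# name=$(cmdline_get hostname) && hostname "$name"
```

Returning a non-zero status for a missing key lets the init script keep the hostname compiled into the kernel when no override is passed.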