Homework 1

During this homework you will create a minimal Linux distribution from scratch using a Compiler Toolchain, the Linux Kernel sources and the busybox sources, and some tools to glue these parts together. Finally qemu, a virtual machine emulator, will be used to test the resulting system components.

Overview

Pre-Requisites

  • Complete Homework 0
  • Have SSH/X2Go access to your group's syslab container

Skills You Will Acquire

During this assignment you'll gain experience in the following activities:

  • Configuring and compiling the Linux Kernel for the x86_64 Architecture
  • Configuring and compiling the busybox utility
  • Using the Linux system call interface as a program developer
    • Developing a small program that reads system information with the C language
  • Assembling a Linux InitRamDisk file from scratch
  • Using qemu to run Linux in a virtual machine

Preparation

Before you proceed, please research the following topics. There is no need to go into great detail at this point; simply get an overview of what they are and how they are related.

Assignments

The assignments can be worked on independently since they are not all necessarily dependent upon each other.

Assignment             Dependencies
Sysinfo Application    -
Linux Kernel           -
InitRamDisk #1         Linux Kernel, Sysinfo Application
Busybox                -
InitRamDisk #2         Linux Kernel, Sysinfo Application, Busybox

Documentation (Human Language)

A non-technical assignment is to write documentation and answer questions. You can use either English or German while working on technical assignments. We suggest English, both for practice and to avoid awkward mixtures of English technical terms with German ;-)

hw1/QnA.md Question Boxes

Throughout the document you will find several question boxes. These questions are meant to help you think through what you did and how you can solve the current part of the assignment. Please record your answers to these questions in your project repository at $REPO_DIR/hw1/QnA.md. The file must contain the questions with brief answers written in your own words.

hw1/README.md (optional)

You can use this document to make the following notes:

  • difficulties throughout the homework
  • design decisions that you think are important to emphasize

Sysinfo Application

Start by writing a little program to display some information about a running Linux system in the following format:

Hostname: <hostname>
Uptime: <uptime in seconds>
Process count: <number of processes>
Total RAM: <RAM size in bytes> Byte
Free RAM: <Free RAM size in bytes> Byte
Page size: <memory unit size in bytes> Byte

Your program can retrieve all this information using the system calls mentioned in the preparation section.
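
To cross-check the values your C program prints, the same fields can also be read from the shell. The following is only a reference aid under stated assumptions and not a substitute for the program the assignment asks for: it assumes that /proc is mounted and that getconf is installed. The function names are made up for this example. The process count here is the number of numeric entries in /proc, which may differ from what your program reports, depending on which kernel counter it reads.

```shell
#!/bin/sh
# Reference output for comparison with the sysinfo program. This is not a
# solution: the assignment asks for a C program that uses system calls.

meminfo_bytes() {
    # /proc/meminfo reports sizes in kB; convert the requested field to bytes.
    kb=$(awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo)
    echo $((kb * 1024))
}

print_reference() {
    echo "Hostname: $(uname -n)"
    echo "Uptime: $(cut -d. -f1 /proc/uptime)"
    echo "Process count: $(ls /proc | grep -c '^[0-9][0-9]*$')"
    echo "Total RAM: $(meminfo_bytes MemTotal) Byte"
    echo "Free RAM: $(meminfo_bytes MemFree) Byte"
    echo "Page size: $(getconf PAGESIZE) Byte"
}

print_reference
```

Small differences in uptime and free RAM between two runs are expected, since both values change constantly.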

Compilation

Use gcc to compile your program.

[warning] Produce a statically linked binary of your program code. You can verify that your binary doesn't have any dynamic dependencies using the file utility.

After compilation you can run the compiled program on the syslab container directly. This is possible because it has the same architecture as the virtual machine you will be using later on.

[info] To display the architecture on a Linux system use the uname -m command
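
The check with the file utility can be wrapped in a small helper. This is a sketch, assuming that file describes a static ELF executable with the words "statically linked"; some toolchains and file versions may word this differently, in which case the pattern needs adjusting. The function names and the ./sysinfo path are placeholders.

```shell
#!/bin/sh
# Sketch: report whether a binary is statically linked, judged from the
# description printed by the file utility.

is_static() {
    # -b prints the description without the leading file name.
    file -b "$1" 2>/dev/null | grep -q 'statically linked'
}

check_binary() {
    if is_static "$1"; then
        echo "$1: static, host architecture $(uname -m)"
    else
        echo "$1: not static (or not an ELF binary)"
    fi
}

# Example call, assuming ./sysinfo has been built:
# check_binary ./sysinfo
```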


[success] Questions

  • What other information can uname tell about a system?
  • How can you instruct gcc to produce statically linked binaries?
  • Which tools are able to display the dynamic dependencies for binaries on Linux?
  • What is the practical problem with dynamic linking when you want to install your program on a different system, e.g. a Virtual Machine?

Linux Kernel

The Linux Kernel is the core component of every Linux Distribution.

Sources

Everything starts with the Linux sources which can be downloaded from the official Linux Kernel website.

Use the following release for this assignment:

Kernel Configuration

This assignment will guide you through configuring a kernel that has only the options required for the rest of the assignment.

Change your shell to the directory with the unpacked kernel sources. Guide:

  1. Run make allnoconfig to deselect all options
  2. Run make menuconfig to open an ncurses menu. This menu allows you to comfortably configure the kernel options and modules. The / key allows you to search through all options.
  3. Search and select the following options for the bare minimum operation within the virtual machine later on:

     [*] 64-Bit kernel
     General setup:
     -> Kernel compression mode (GZIP)
     [*] Initial RAM filesystem and RAM disk (initramfs/initrd) support
     Compiler optimization level (Optimize for size)  --->
     [*] Configure standard kernel features (expert users)
         ->[*] Enable support for printk
     [*] Embedded system
    
     Executable file formats / Emulations:
     [*] Kernel support for elf binaries
     [*] Kernel support for scripts starting with #!
    
     Device Drivers:
     -> Generic Driver options
         [*] Maintain a devtmpfs filesystem to mount at /dev
    
     -> Character devices:
         [*] Enable TTY
         [*]   Virtual terminal
         [*]     Enable character translations in console
         [*]     Support for console on virtual terminal
         [*]     Support for binding and unbinding console drivers
         [*]   Unix98 PTY support
         -> Serial Drivers
             ->[*] 8250/16550 and compatible serial support    
             ->[*] Console on 8250/16550 and compatible serial port
    
     File systems -> Pseudo Filesystems:
     [*] /proc file system support
     [*] Sysctl support (/proc/sys) 
     [*] sysfs file system support 
    
  4. Look for an option to hardcode the system's hostname and change it to grp${N}, where N is your group number
  5. Save and exit the menuconfig. Your configuration is now stored at $KERNEL_SRC/.config!
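
Once the configuration is saved, it can be inspected without opening the menu again. The sketch below checks whether given options are built in; the helper names are made up for this example, and the two option names in the example call are only a starting point, since mapping the remaining menu entries to their CONFIG_* names is part of the questions below.

```shell
#!/bin/sh
# Sketch: check whether options are built in (=y) in a kernel .config file.

config_enabled() {
    # $1: path to the .config file, $2: option name including the CONFIG_ prefix
    grep -q "^$2=y\$" "$1"
}

report_options() {
    # $1: path to the .config file, remaining arguments: option names
    cfg=$1
    shift
    for opt in "$@"; do
        if config_enabled "$cfg" "$opt"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}

# Example call from the kernel source directory:
# report_options .config CONFIG_64BIT CONFIG_PRINTK
```

A check like this is also useful in hw1.sh, where the configuration is copied from the repository rather than created interactively.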

[success] Questions

  • How can you get more information about an item in the menuconfig?
  • What is the relation between the menuconfig and the .config file in the kernel source directory?
  • Which CONFIG_* options belong to the menu entries that are provided above?

Compilation

The kernel configuration can now be used by the build system to build only the code that has been configured.

time make -j5

This command builds the kernel, using 5 parallel jobs! The time program measures the duration of the make execution.


[success] Questions

  • What are the different times displayed by time?
  • Where and which are the output binary files produced by the compilation?
  • Which is the binary that represents the bootable kernel image?

Run Linux in a Virtual Machine (with qemu)

The qemu-system-x86_64 emulator can be used to start a hypervised virtual machine on the running system. Providing --help to this program will show you all the options you can provide.

Qemu arguments

We want to run the Virtual Machine with the following arguments:

  1. -m <RAM size in meg>: we start using 64
  2. -nographic: disable graphic support, so that the VM's serial console is attached to your terminal

    [info] When your VM gets stuck, press ctrl+a followed by x to shut it down

  3. -kernel <path to the kernel image file>: this file is produced by the kernel compilation!
  4. -append "<kernel parameters>": With no graphic support you need to tell the kernel to use a text console by passing console=<tty-device>.
  5. -initrd <path to the initrd archive>: this file will be built later on
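
The arguments above can be assembled as follows. This is a sketch that only prints the command so it can be reviewed before running it. The function name and the KERNEL, CONSOLE and INITRD variables are placeholders introduced for this example; the actual values are for you to work out.

```shell
#!/bin/sh
# Sketch: assemble the qemu command line from the arguments listed above.
# KERNEL, CONSOLE and INITRD are placeholders that you have to fill in yourself.

qemu_cmdline() {
    # $1: kernel image, $2: console device name, $3: optional initrd archive
    cmd="qemu-system-x86_64 -m 64 -nographic -kernel $1 -append \"console=$2\""
    if [ -n "${3:-}" ]; then
        cmd="$cmd -initrd $3"
    fi
    echo "$cmd"
}

# Print the command instead of running it, so it can be checked first.
qemu_cmdline "${KERNEL:-path/to/kernel-image}" "${CONSOLE:-<tty-device>}" "${INITRD:-}"
```

Leaving the third argument empty matches the first boot attempt, where no initrd exists yet.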

[success] Questions

  • Which TTY device do you need to pass to the kernel for console input/output?

  • [info] In the previous configuration step you enabled a specific serial device that can be used for the kernel console.

Boot up the VM!

Boot your kernel and see how far your virtual machine boot process will go. Since you have not built the initrd file yet, omit this argument from the command.


[success] Questions

  • Is the system in a usable state, e.g. can you use a shell to execute commands on it?
  • If not, what is missing?

InitRamDisk #1

At this point, your Linux system in the Virtual Machine has no userland programs to execute, so it will simply stop after booting up. Therefore the next step is to build a Linux userland file structure and pack it as a CPIO archive. The kernel unpacks this archive at runtime and can then execute the program(s) within.

The first initial ramdisk will contain only the static binary of your sysinfo application.

File Layout

Please create your first initial ramdisk with the following file layout.

└── bin
    └── sysinfo

Produce initrd-sysinfo.cpio

Using find and cpio you can create an uncompressed CPIO archive.

As an example, the following command creates a cpio archive with the newc format, containing all files in the current directory.

find | cpio -L -v -o -H newc > initrd.cpio
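
When this command is run from a script, it helps to keep the archive outside the directory being packed, otherwise find may pick up the half-written archive itself. The sketch below wraps the command accordingly. It assumes GNU cpio, and the function name and example paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: pack a staging directory into a newc cpio archive.
# Assumes GNU cpio; the paths in the example call are hypothetical.

pack_initrd() {
    # $1: staging directory, $2: output archive (must lie outside the staging dir)
    out=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    # The subshell keeps the caller's working directory unchanged, and the
    # archive then contains paths relative to the staging directory.
    (cd "$1" && find . | cpio -L -o -H newc > "$out")
}

# Example call from the hw1 directory:
# pack_initrd initrd-sysinfo artifacts/initrd-sysinfo.cpio
```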

[success] Questions

  • What could the -L parameter be useful for?
  • How can you list the contents of a CPIO archive?
  • What is the path of the program that the kernel can execute after unpacking it?

Qemu arguments and the init process

You will use the previous qemu command and extend it with the respective -initrd <file> and -append "<arg1> <arg2> ..." arguments.

[success] Questions

  • What needs to be passed to the kernel within append in order to tell it what binary to execute?
  • What is the default executable path of the kernel in case nothing is passed to change it?
  • What is the complete qemu command line to run your sysinfo application as the init process?

Busybox

To prepare a more versatile userland you can leverage the busybox utility. Use the following stable release of busybox for this assignment:

Busybox Configuration

Change your shell to the directory with the unpacked busybox sources. Then:

  1. Run make allnoconfig to deselect all options
  2. Run make menuconfig to open an ncurses menu. This menu lets you configure all the applets that will be available from the single multi-call binary produced in the compilation step.
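
The idea of a single program that behaves differently depending on the name it is started under can be tried out with a toy script. This is not how busybox itself is implemented; it is only an illustration, and the names multi, hello and bye are made up for this example.

```shell
#!/bin/sh
# Sketch: a toy multi-call program written as a shell script. It only
# illustrates dispatching on the name the program was invoked with.

demo_dir=$(mktemp -d)

cat > "$demo_dir/multi" <<'EOF'
#!/bin/sh
# Pick the applet from the name this script was started under.
case "$(basename "$0")" in
    hello) echo "hello applet" ;;
    bye)   echo "bye applet" ;;
    *)     echo "usage: call me through a link named hello or bye" ;;
esac
EOF

# Each symlink makes the same file reachable under a different name.
ln -s multi "$demo_dir/hello"
ln -s multi "$demo_dir/bye"

# Started through sh so the demo also works on a temp directory mounted noexec.
sh "$demo_dir/hello"
sh "$demo_dir/bye"
```

Both calls run the same file; only the name used to reach it differs.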

[success] Questions

  • How do multi-call binaries work?
  • What applets are needed to allow us to interact with the system?
  • How are symlinks to these binaries interpreted?

Compilation

Analogous to the kernel compilation, you can use the same command with the busybox sources:

make -j5

This should result in a binary named busybox in the build directory. Please inspect this binary using the file utility.

[success] Questions

  • Does your busybox file have dynamic link dependencies against libraries installed on the build host?
  • How can you verify this on the command line?

[info] Busybox allows you to configure the build system via the menuconfig. Look for the term static.

InitRamDisk #2

For the second initial ramdisk you create a system that contains busybox and your sysinfo application. Busybox can be configured to provide an interactive shell interpreter and other utilities like cp, cat, etc.

[success] Questions

  • What busybox options have you chosen for your new initrd, and why?

On-Disk-Filesystem Layout

An (ASCII) image says more than words:

├── bin
│   ├── busybox
│   └── sysinfo
└── init

[success] Questions

  • Which additional directories are required to make the system work at runtime?
  • Which directories need to be created to be compliant with the Linux FHS 3.0?

The init file

The init file shall be an executable shell script - yet to be written by you! - that is responsible for setting up the userland at runtime.

Your init file is responsible for the following tasks on system boot:

  1. Set up the directory layout
  2. Install busybox applets as symlinks
  3. Mount the devtmpfs at /dev
  4. Mount the sysfs at /sys
  5. Mount the procfs at /proc
  6. Mount a tmpfs at /tmp
  7. Run the sysinfo application
  8. Present the user with an interactive shell
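
The order of these steps can be rehearsed on the build host before anything is booted. The sketch below prints the commands instead of running them while DRY_RUN is 1, which is the default here. Several details are assumptions rather than requirements: the directory list, the use of busybox's --install -s option (which needs the installer feature enabled in your busybox configuration), and the helper names run and setup_userland. In the real initrd the interpreter line must also point at a shell that exists there.

```shell
#!/bin/sh
# Sketch of the boot-time steps listed above. With DRY_RUN=1 the commands are
# only printed, so the sequence can be reviewed without touching the host.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_userland() {
    # Assumed directory list; adjust it to what your system needs.
    run /bin/busybox mkdir -p /dev /sys /proc /tmp /sbin /usr/bin /usr/sbin
    # Create one symlink per configured applet.
    run /bin/busybox --install -s
    run mount -t devtmpfs devtmpfs /dev
    run mount -t sysfs sysfs /sys
    run mount -t proc proc /proc
    run mount -t tmpfs tmpfs /tmp
    run /bin/sysinfo
}

setup_userland
# Last step: hand control to an interactive shell (exec replaces this script).
run exec /bin/sh
```

The first command calls mkdir through the busybox binary itself because the applet symlinks do not exist yet at that point.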

[success] Questions

  • What are all the mounts/filesystems that are active after the system has booted?
  • How does the filetree under / look after the system is booted? (Please don't include all device, sys, and proc entries in your answer)

Result To Be Submitted

This section gives you accurate information about which files are part of the submission for this homework. Results for bonus assignments are fully covered within this section.

Test-Suite and Continuous Integration

Merge and run the Continuous Integration test suite.

Build Instructions (hw1/hw1.sh)

A shell script that reproduces the final result of your homework. This shell script will be used to verify your results, and does not need to include commands that run the interactive menuconfig. However, you may implement such functionality for working conveniently within your homework repository.

Arguments and Script behavior

Arguments                         Function
(called without any arguments)    Build all artifacts starting with just the files that are checked in to git
qemu_sysinfo                      Run qemu-system-x86_64, booting your system with the initrd-sysinfo
qemu_busybox                      Run qemu-system-x86_64, booting your system with the initrd-busybox
clean                             Remove all files not tracked by git
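
One way to map these arguments to actions is a case statement. The sketch below shows only the dispatching; the function bodies are placeholders that announce what they would do, and build_all and main are names made up for this example.

```shell
#!/bin/sh
# Sketch of the argument handling for hw1/hw1.sh. The function bodies are
# placeholders that only announce what the real script would do.

build_all()    { echo "build: kernel, sysinfo, busybox, initrds"; }
qemu_sysinfo() { echo "run: qemu with initrd-sysinfo"; }
qemu_busybox() { echo "run: qemu with initrd-busybox"; }
clean()        { echo "clean: remove files not tracked by git"; }

main() {
    case "${1:-}" in
        "")           build_all ;;
        qemu_sysinfo) qemu_sysinfo ;;
        qemu_busybox) qemu_busybox ;;
        clean)        clean ;;
        *)            echo "unknown argument: $1" >&2; return 1 ;;
    esac
}

main "$@"
```

Returning a non-zero status for unknown arguments lets the test suite notice typos in the call.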

Build Artifacts (Binary Files)

The following files must be present at hw1/artifacts/ after the build is complete, but must not be tracked by git!

File                   Target Architecture    Purpose
bzImage                x86_64                 Kernel binary used for qemu
sysinfo                x86_64                 Statically linked binary of your little C program
initrd-sysinfo.cpio    x86_64                 Initial RamDisk file in the form of a cpio archive, containing the sysinfo executable
initrd-busybox.cpio    x86_64                 Initial RamDisk file in the form of a cpio archive, containing a statically linked busybox binary and an init executable

Other Files and Git

[warning] Please do not add binary files to your git repository. Only add files that represent configuration, source code and build commands.

Directory structure

This is an example directory structure with all files listed that are tracked by git. It shows the directory for this homework and the top-level files for continuous integration.

.
├──.travis.yml
├──ci
│  └── ...
└──hw1
   ├── busybox
   │   └── config
   ├── hw1.sh
   ├── initrd-busybox
   │   ├── bin
   │   │   ├── busybox -> ../../artifacts/busybox
   │   │   └── sysinfo -> ../../artifacts/sysinfo
   │   └── init
   ├── initrd-sysinfo
   │   └── bin
   │       └── sysinfo -> ../../artifacts/sysinfo
   ├── kernel
   │   └── config
   ├── QnA.md
   ├── README.md
   └── sysinfo
       └── src
           ├── Makefile
           └── sysinfo.c

Bonus Assignments

These optional assignments allow you to dig in a little deeper! They don't depend on each other so you can cherry-pick the ones you are interested in.

Enable System Shutdown

Configure the kernel and busybox to allow a clean shutdown via the poweroff command.

Parse cmdline options

If your kernel is passed hostname=<hostname> through its cmdline, your init script should set the hostname accordingly.
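
Extracting the value can be done with plain shell pattern matching. This is a sketch: the function name cmdline_hostname is made up, and the command line in the example call (including the console=ttyX placeholder) is hypothetical. Applying the value additionally assumes that /proc is mounted and that a hostname applet is available in your busybox.

```shell
#!/bin/sh
# Sketch: extract the value of hostname=<value> from a kernel command line.

cmdline_hostname() {
    # $1: the command line text, e.g. the content of /proc/cmdline
    for arg in $1; do
        case "$arg" in
            hostname=*) echo "${arg#hostname=}"; return 0 ;;
        esac
    done
    # No hostname= argument found.
    return 1
}

# In an init script this could be used as:
# name=$(cmdline_hostname "$(cat /proc/cmdline)") && hostname "$name"
cmdline_hostname "console=ttyX hostname=grp0 quiet"
```

The non-zero return value makes it easy to keep the compiled-in hostname when the argument is absent.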
