Homework 3

This homework will introduce you to compiling binaries for a different architecture, which is called cross compiling. It is based on Homework 2, with the exception of targeting an ARM virtual machine instead of x86_64.

Overview

Documentation (Human Language)

A non-technical assignment is to write documentation and answer questions. You can use either English or German while working on technical assignments. We suggest English, both for practice and to avoid awkward mixtures of English technical terms with German ;-)

hw3/QnA.md Question Boxes

Throughout the document you will find several question boxes. These questions are meant to help you think through what you did and how you can solve the current part of the assignment. Please keep a record of your answers in your project repository at $REPO_DIR/hw3/QnA.md. The file must contain the questions with brief answers written in your own words.

hw3/README.md (optional)

You can use this document to make the following notes:

  • difficulties throughout the homework
  • design decisions that need explanation or that you consider important to emphasize

Preparation

  • Complete Homework 2
  • Have SSH/X2Go access to your group's syslab container

Skills You Will Acquire

During this assignment you'll gain experience in the following activities:

  • Strengthen the skills learned in the previous homeworks
  • Configuring the Linux Kernel for the qemu-system-aarch64 (arm64) virtual machine architecture
  • Cross Compiling the Linux Kernel, the sysinfo application, busybox, dropbear
  • Using qemu to emulate the aarch64 architecture as a virtual machine and userspace emulator

Bonus only:

  • Learn more about dynamic linking with the GCC toolchain
  • Learn how to deploy an init system instead of a script based init procedure

[warning] Bonus Assignment Information

The first bonus assignment contains significant changes for this homework, as it switches from static to dynamic linking of libraries which need to be copied to the InitRamDisk. It is suggested that you decide upfront if you want to do the first bonus assignment, as it affects every component of this homework except for the Linux kernel.

Pre-Requisites

Before you proceed, please research the following topics. There is no need to get into great detail at this point; simply get an overview of what they are and how they are related.

Bonus only:

Inspecting the toolchains

Please take a closer look at the main components of the toolchain that humans might interact with directly: gcc and ld.

[success] Questions

  • What is the difference between the commands gcc and aarch64-linux-gnu-gcc?
  • What is the difference between the commands ld and aarch64-linux-gnu-ld?
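A minimal sketch for starting the inspection, assuming the tools are on your PATH: the gcc option -dumpmachine prints the target triplet a compiler driver was built for. The helper name describe_compiler is hypothetical and only serves the illustration.

```shell
# describe_compiler is a hypothetical helper: it prints the target triplet
# a compiler driver was built for, or a note if the tool is not installed.
describe_compiler() {
    cc=$1
    if command -v "$cc" >/dev/null 2>&1; then
        printf '%s targets %s\n' "$cc" "$("$cc" -dumpmachine)"
    else
        printf '%s is not installed\n' "$cc"
    fi
}

describe_compiler gcc
describe_compiler aarch64-linux-gnu-gcc
```

Comparing the two lines of output is a reasonable first step towards answering the questions above in your own words.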

Assignments

The following table lists the assignments and their interdependencies.

Assignment                           Dependencies
Linux Kernel                         -
Busybox                              -
Dropbear                             -
InitRamDisk                          Busybox, Dropbear
qemu for aarch64 (network enabled)   Kernel, InitRamDisk

The goals for this homework are the same as for homework 2 with the following exceptions:

  • The sysinfo receives a slight modification
  • The compilation and emulation target platform is aarch64

Bonus only:

  • All binaries are produced with dynamic linking, which requires you to install the libraries on the target system

Linux Kernel

Even though the configuration from homework 2 is for the x86_64 architecture, it can be re-used in this assignment. It will be transformed into an aarch64 configuration (arm64 in kernel build system slang) when menuconfig is called with the environment variables described below.

Configure Linux for a different architecture

To change the architecture of the kernel menuconfig, you need to supply the ARCH and CROSS_COMPILE environment variables accordingly.

The link to the Gentoo wiki in the Pre-Requisites demonstrates the procedure to set these variables.

[info] Use the linked wiki as an information source only, not as a tutorial!

  • The target platform for this assignment is not the Raspberry Pi 3, because it cannot be emulated by Qemu, and you will choose a machine and cpu type used by Qemu
  • You will use a different toolchain prefix and obviously won't install Gentoo in the VM ;-)
  • Take note of the unintuitive naming scheme (aarch64/arm64/armv8)

To start, place the homework 2 kernel config in the kernel source directory and run menuconfig. Alternatively, you can use the load config option in menuconfig.

Don't change the configuration manually, just exit and confirm the save.
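The step above can be sketched as a small wrapper, under the following assumptions: the kernel sources are unpacked in ./linux (override with KERNEL_SRC), and the toolchain prefix is aarch64-linux-gnu-, the same prefix used for the Dropbear --host option later in this document. The wrapper name kernel_make is hypothetical.

```shell
# Assumption: the kernel tree lives in $KERNEL_SRC (default: ./linux).
KERNEL_SRC=${KERNEL_SRC:-linux}

# kernel_make is a hypothetical wrapper that pins the architecture and the
# toolchain prefix, so every kernel make target sees the same settings.
kernel_make() {
    make -C "$KERNEL_SRC" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- "$@"
}
```

With the homework 2 config copied to "$KERNEL_SRC/.config", calling kernel_make menuconfig opens the menu for arm64. Passing the variables on the make command line has the same effect as exporting them in the environment, and using one wrapper for all targets avoids accidentally mixing architectures between the configuration and the build.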

[success] Questions

Please look at the diff between the newly saved kernel config and the one from homework 2.

  • Which configure option(s) reflect(s) the architecture change?
  • Which ARM platform has been activated by default?
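A full diff of two kernel configs is long, so it can help to show only the option lines that changed. The following is a sketch, not a required tool; config_diff is a hypothetical name, and the two paths in the usage note are assumptions based on the directory layout of this homework.

```shell
# config_diff is a hypothetical helper: it lists only the option lines that
# differ between two kernel configs, hiding the diff headers and context.
config_diff() {
    # The pipeline's exit status is grep's; "|| true" keeps it at zero
    # when no option line differs.
    diff -u "$1" "$2" | grep -E '^[+-](CONFIG_|# CONFIG_)' || true
}
```

A call such as config_diff hw2/kernel/config hw3/kernel/config prints removed options with a leading - and added options with a leading +.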

Enable The Correct Serial Device For Console output

qemu-system-aarch64 emulates a different hardware architecture and also provides different peripheral components than the emulator for x86_64. These peripherals are not automatically enabled by changing the config's architecture.

One of the important devices that your kernel needs to support is the PrimeCell PL011 UART controller, which you will use for serial console output.

Enable the Generic PCI Host controller

On aarch64, the option CONFIG_PCI_HOST_GENERIC is required for qemu's virtio-net to work.
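After saving the configuration you can check that the options are really set. The sketch below assumes that the PL011 driver and its console support are controlled by CONFIG_SERIAL_AMBA_PL011 and CONFIG_SERIAL_AMBA_PL011_CONSOLE; these symbol names are not given in this document, so please confirm them in menuconfig. The helper name check_kernel_options is hypothetical.

```shell
# check_kernel_options is a hypothetical helper: it reports whether each
# required option is built in (=y) in the given kernel config file.
check_kernel_options() {
    config_file=$1
    missing=0
    for opt in CONFIG_SERIAL_AMBA_PL011 \
               CONFIG_SERIAL_AMBA_PL011_CONSOLE \
               CONFIG_PCI_HOST_GENERIC; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    # Non-zero status if at least one option is not enabled
    return "$missing"
}
```

Calling check_kernel_options hw3/kernel/config from your build script lets the build stop early instead of producing a kernel that boots without console output.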

[success] Questions

  • What is the name of the serial port that you need to pass to console=... in order to get console output on the PL011 device?

Busybox

For this assignment you don't need any new busybox applets.

Dropbear SSH

The config.h file for dropbear can remain the same for this assignment.

Configuration

In addition to the configuration from homework 2, the following option needs to be passed to ./configure

        --host=aarch64-linux-gnu
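A sketch of how the option could fit into the build, under these assumptions: DROPBEAR_SRC points at the unpacked Dropbear sources, HW2_CONFIGURE_ARGS holds whatever options you passed in homework 2, and the PROGRAMS list is only an example selection. The function name build_dropbear is hypothetical.

```shell
# Assumptions: source location and homework 2 options are placeholders.
DROPBEAR_SRC=${DROPBEAR_SRC:-dropbear}
HW2_CONFIGURE_ARGS=${HW2_CONFIGURE_ARGS:-}

# build_dropbear runs in a subshell (note the parentheses), so the cd does
# not change the working directory of the calling script.
build_dropbear() (
    cd "$DROPBEAR_SRC" || exit 1
    # HW2_CONFIGURE_ARGS is left unquoted on purpose so that it splits
    # into separate options.
    ./configure --host=aarch64-linux-gnu $HW2_CONFIGURE_ARGS &&
        make PROGRAMS="dropbear dbclient dropbearkey scp" MULTI=1 STATIC=1
)
```

MULTI=1 requests the multi-call binary (dropbearmulti) and STATIC=1 requests static linking; adjust PROGRAMS to the tools you actually need.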

[success] Questions

  • What is the meaning of the host option for this configure script?

Sysinfo Application Extended

Please extend your sysinfo application from homework 1 to print out the following information:

---- gethostname Information ----
Hostname: grp0

---- sysinfo Information ----
Uptime: 2 seconds
Process count: 20
Total RAM: 46292992 Byte
Free RAM: 39907328 Byte
Page size: 1 Byte

---- utsname Information ----
system name: Linux
node name: grp0
release: 4.11.0
version: #1 SMP Sat May 6 11:04:10 UTC 2017
machine: aarch64

The information from utsname is retrieved by the uname() library function which is used by the uname command line tool you already learned about.

InitRamDisk

The initial ramdisk for this assignment will include the same components as in Homework 2, this time compiled for the aarch64 architecture.

Dropbear dlopen()'s some libraries (Reminder)

At runtime, the following libraries will be needed once you try to login to the SSH server:

  • ld-linux-aarch64.so.1
  • libc.so.6
  • libnss_files.so.2

If these libraries are not available at runtime, the login to the SSH server cannot work. As in homework 2, you can get the absolute path for each library within the development environment using the command aarch64-linux-gnu-gcc -print-file-name=$library_name.
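The lookup can be combined with the copy step. This is a minimal sketch, assuming CC names the cross compiler and that the libraries go into the lib directory of your initrd; copy_runtime_libs is a hypothetical helper name.

```shell
# Assumption: CC names the cross compiler driver.
CC=${CC:-aarch64-linux-gnu-gcc}

# copy_runtime_libs DEST LIB... copies each named library into DEST.
copy_runtime_libs() {
    dest=$1
    shift
    mkdir -p "$dest"
    for lib in "$@"; do
        path=$("$CC" -print-file-name="$lib")
        # gcc prints the bare name back when it cannot find the file
        if [ "$path" = "$lib" ] || [ ! -e "$path" ]; then
            echo "not found: $lib" >&2
            return 1
        fi
        # Without -R, cp follows symlinks and copies the real file
        cp "$path" "$dest/$lib"
    done
}
```

A call could look like copy_runtime_libs hw3/initrd/lib ld-linux-aarch64.so.1 libc.so.6 libnss_files.so.2. The explicit check matters because gcc exits successfully even when the library is missing.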

The init file

The init file in the initrd can be identical to the one in homework 2.

Run Linux in a Virtual Machine (with qemu)

The qemu command line arguments differ from homework 2 in the following ways:

  • You invoke the aarch64 system emulator now
  • You need to provide a machine type and a CPU model. It's recommended to use virt as the machine type.
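The resulting command line could be sketched as follows. Everything beyond the virt machine type is an assumption: the memory size, the forwarded host port 2222, and the artifact location are placeholders, and the CPU model and console device are left as parameters because choosing them is part of this assignment. The function name run_qemu is hypothetical.

```shell
# Assumptions: emulator binary name and artifact directory are defaults
# that can be overridden from the environment.
QEMU=${QEMU:-qemu-system-aarch64}
ARTIFACTS=${ARTIFACTS:-hw3/artifacts}

run_qemu() {
    cpu=$1       # CPU model of your choice
    console=$2   # serial device name for console=...
    "$QEMU" \
        -machine virt \
        -cpu "$cpu" \
        -m 64M \
        -nographic \
        -kernel "$ARTIFACTS/Image.gz" \
        -initrd "$ARTIFACTS/initrd.cpio" \
        -append "console=$console" \
        -netdev user,id=net0,hostfwd=tcp::2222-:22 \
        -device virtio-net-pci,netdev=net0
}
```

The -netdev/-device pair attaches a virtio network card on the PCI bus, which is why the generic PCI host controller must be enabled in the kernel. The hostfwd rule forwards a host port to the SSH server in the VM.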

[success] Questions

  • Please explain your choice for the machine and CPU types.

Result To Be Submitted (tracked by Git)

This section specifies exactly which files are part of the submission for this homework. Results for bonus assignments are not covered within this section.

Test-Suite and Continuous Integration

Merge and run the Continuous Integration test suite.

Directory Structure

This is an example of how your submitted (tracked by git) files could be structured:

.
├──.travis.yml
├──ci
│  └── ...
└──hw3
   ├── busybox
   │   └── config
   ├── dropbear
   │   └── options.h
   ├── hw3.sh
   ├── initrd
   │   ├── bin
   │   │   ├── busybox -> ../../artifacts/busybox
   │   │   ├── dropbearmulti -> ../../artifacts/dropbearmulti
   │   │   └── sysinfo -> ../../artifacts/sysinfo
   │   ├── etc
   │   │   ├── group
   │   │   └── passwd
   │   ├── init
   │   └── lib
   ├── kernel
   │   └── config
   └── sysinfo
       └── src
           ├── Makefile
           └── sysinfo.c

Documentation files

(not shown in the above tree) Please include the documentation files that are described at the beginning.

Build Instructions (hw3/hw3.sh)

A shell script that reproduces the final result of your homework. This shell script will be used to verify your results, and does not need to include commands that run the interactive menuconfig. However, you may implement such functionality for working conveniently within your homework repository.

Arguments and Script behavior

Arguments                        Function
(called without any arguments)   Build all artifacts starting with just the files that are checked in to git
qemu                             Run qemu-system-aarch64, booting your system with the initrd and network
clean                            Remove all files not tracked by git
ssh_cmd cmd [args...]            Establish a connection to the VM's SSH server via authorized_key authentication. It runs the specified command with all arguments inside the VM. An example would be ssh_cmd "echo Hello, World", as found in the CI scripts.
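The argument handling described in the table can be sketched as a dispatcher. This is only a skeleton: the bodies of build_all and run_qemu are placeholders, and the key path, the port 2222 and the user@host string are assumptions that you need to replace with your own values.

```shell
#!/bin/sh
# Skeleton for hw3/hw3.sh. SSH_KEY is a hypothetical default path.
SSH_KEY=${SSH_KEY:-hw3/id_rsa}

build_all() { echo "TODO: build kernel, busybox, dropbear, sysinfo, initrd"; }
run_qemu()  { echo "TODO: start qemu-system-aarch64"; }
# -d includes directories, -x also removes files ignored via .gitignore
clean()     { git clean -dfx; }
ssh_cmd()   { ssh -i "$SSH_KEY" -p 2222 root@localhost "$@"; }

main() {
    case ${1:-} in
        "")      build_all ;;
        qemu)    run_qemu ;;
        clean)   clean ;;
        ssh_cmd) shift; ssh_cmd "$@" ;;
        *)       echo "usage: $0 [qemu|clean|ssh_cmd cmd [args...]]" >&2
                 return 2 ;;
    esac
}

main "$@"
```

Keeping each action in its own function makes it easy to add convenience commands, such as one for menuconfig, without touching the behavior that the verification relies on.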

SSH Key For Authorization

(not shown in the above tree)

The SSH key file that can be used to authenticate as root in your virtual machine.

Build Artifacts (Binary Files)

The following files must be present in hw3/artifacts/ after the build is complete, but must not be tracked by git!

File            Target Architecture   Purpose
Image.gz        aarch64 (arm64)       Cross-compiled kernel binary used for qemu
sysinfo         aarch64               Cross-compiled, statically linked binary of your little C program
dropbearmulti   aarch64               Cross-compiled, statically linked multi-call binary for Dropbear
initrd.cpio     aarch64               Initial RamDisk file in the form of a cpio archive

Other Files and Git

[warning] Please do not add binary files to your git repository. Only add files that represent configuration, source code and build commands.

Bonus Assignments

These optional assignments allow you to dig in a little deeper! They don't depend on each other so you can cherry-pick the ones you are interested in.

Bonus 1: Use dynamic linking for all binaries

Instead of statically linking the libraries into each binary, and thereby duplicating binary code on the target system, the libraries can be linked dynamically and stored as shared libraries. In this bonus assignment you will learn a lot about the internals of the ELF binary format and how dynamic linking works on Linux.

Sysinfo Application Extended

Please omit the -static argument for the compiler to turn off static linking.

Busybox

The difference here is that you have to disable the setting that causes a static build.

Please consult the busybox Makefile to see how cross compilation works with the busybox buildsystem.

Dropbear

Please compile dropbear as a dynamically linked multi-call binary this time by omitting STATIC=1 from the make call.

InitRamDisk

The initial ramdisk for this assignment will include all dynamically linked libraries for all contained binaries. The most difficult part is to find these libraries because they are scattered all over the filesystem.

[success] Questions

By now you should have dynamically linked and cross-compiled binaries for busybox, sysinfo, and dropbearmulti in your artifacts directory.

  • Why does running ldd against these binaries not tell you that they are dynamically linked?
  • How else can you verify this, and in addition find out the architecture they are built for?

Locating and copying the required shared objects

If you use the file program on the binaries, you can see the path of their dynamic interpreter, which is the program that locates the required libraries at runtime. The dynamic interpreter itself can also be used to display information about dynamically linked binaries. This method allows you to find the absolute paths for libraries you need to install in your initrd.

The method's rationale is quite simple:

  1. Get the dynamic linker path used by the binary
  2. Use the dynamic linker to print information about the binary

Example: Inspecting the bash executable

$ file $(type -P bash)
/bin/bash: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4be0cc32aba02ec4e0f010047be5ae9dee756960, stripped

$ ldd $(which bash)
    linux-vdso.so.1 (0x00007ffc01133000)
    libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007f4f8a559000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4f8a355000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4f89fb6000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f4f8a783000)

The first command displays the interpreter, which is the dynamic linker for x86_64 binaries in this case. The second command uses said dynamic linker to resolve and list all dependencies. The interpreter and all dynamically linked libraries need to be available at runtime on the target system, in order for the executable to work. If you just copy the executable and try to run it there will be an error.

Based on this method you can retrieve the list of shared object files for the binaries in your initrd.
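As a cross-check, the same information can be read from the ELF headers with readelf from binutils. This is a different technique from the file/dynamic linker method described above: readelf only parses the file and never executes it, so it also works on binaries built for a foreign architecture. The helper names elf_interpreter and elf_needed are hypothetical, and the sed patterns assume the usual GNU readelf output format.

```shell
# Print the interpreter path stored in the program headers of an ELF file.
elf_interpreter() {
    readelf -l "$1" |
        sed -n 's/.*\[Requesting program interpreter: \(.*\)\]/\1/p'
}

# Print the names of the shared libraries the ELF file links against
# directly (the NEEDED entries of its dynamic section).
elf_needed() {
    readelf -d "$1" | sed -n 's/.*(NEEDED).*\[\(.*\)\]/\1/p'
}
```

Note that elf_needed lists only direct dependencies as bare names, without paths. Libraries required by those libraries are not included, so the sketch is a starting point for the list rather than the complete answer.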

[success] Questions

  • What error message do you receive when you try to run an executable but forgot to copy the executable's interpreter into the InitRamDisk?
  • What is the interpreter for your cross compiled binary files?
  • Why can you not run the dynamic linker for those binaries directly in your container?
  • How can qemu's userspace emulator help you out?

Dropbear

At runtime, the following libraries will be needed once you try to log in to the SSH server:

  • libnss_files.so.2

Compared to the static compilation scenario, this list is shorter: the other libraries are now properly linked, so they become visible dependencies and are handled by the mechanism that copies the required shared libraries.

You will not use the environment variable LD_LIBRARY_PATH in this case, but rely on the rpath which is stored in the executable file itself. The dynamic linker will search all paths in rpath automatically when a dynamically linked binary is invoked.

Bonus 2: Use busybox's init as PID1 in your system

The regular assignments instruct you to use a simple shell script as the program that controls the init procedure. On production systems this is handled by a process supervisor that is configurable rather than scriptable. Activate busybox's init system and integrate it into your initrd. This includes setting up an inittab. Please configure this so that it automatically spawns a (login) shell on the serial port.
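A minimal sketch of generating such an inittab from your build script. The helper name write_inittab and the /etc/init.d/rcS script (which would hold the setup steps from your current init script) are assumptions. The serial device is passed as a parameter because finding its name is one of the questions in this homework.

```shell
# write_inittab ROOT TTY writes ROOT/etc/inittab for busybox init.
# TTY is the serial device name without the /dev/ prefix.
write_inittab() {
    root=$1
    tty=$2
    mkdir -p "$root/etc"
    cat > "$root/etc/inittab" <<EOF
# Format: <id>:<runlevels>:<action>:<process>
# busybox init uses <id> as the controlling tty and ignores <runlevels>.
::sysinit:/etc/init.d/rcS
# The leading "-" asks busybox init to start the shell as a login shell;
# respawn restarts it whenever it exits.
$tty::respawn:-/bin/sh
EOF
}
```

A call could look like write_inittab hw3/initrd followed by the name of your serial device. Busybox init reads /etc/inittab when it starts as PID 1, so the init entry in your initrd needs to point at busybox for this to take effect.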
