.. include::

#############################
ADDENDUM: FAST FORWARD - 2020
#############################

As already discussed, *compilers* take semi-readable, higher level code as
input, and produce machine language as output.

Microcontroller boards such as the Arduino are small enough and single-purpose
enough that they neither want nor need an operating system. However, the
Raspberry Pi is sophisticated and capable enough to want one.

When you have an operating system, code needs to negotiate with the operating
system for most tasks. The operating system is a traffic cop, hotel booking
agent, and lots more. Among other things, the operating system has to:

* tell programs when to pause and when to resume, in order to allow other
  programs to run,
* keep an inventory of available disk space and allocate it appropriately,
* watch for service requests from programs and determine which programs
  should answer those requests,
* protect programs and data, and more.

What appears to be simple input from a keyboard or output to a screen is more
complex when there's an operating system between the code and the hardware.

So, in order to keep the generated machine code somewhat readable, the
following ``C`` program does not actually attempt input or output, and thus
does not add a lot of operating system-dependent code. Instead, it mimics the
first program shown in the *Altair 8800* Operator's Manual.

.. code-block:: C

    void main()
    {
        char a = 5;
        char b = 10;
        a = a + b;
    }

By using ``char`` instead of ``int`` the values are constrained to 8 bits, just
as on the *Altair 8800*.

If the above program is saved as ``add.c`` it can be compiled, and an assembler
language / machine language listing can be produced from the object code file
produced by the compiler.

**IMPORTANT DETAILS**: To understand the assembler language and machine code
produced, you need to have an idea of the hardware that the compiler is
targeting.
Typically, this will be the hardware that is running the compiler, but not
always. For example, the **Arduino IDE** (Integrated Development Environment)
runs on non-Arduino processors, but produces Arduino machine code. To a lesser
degree, it is also useful to know the version of the operating system kernel,
the version of the compiler, and the "dialect" of the assembler language being
used. In the following example, those details are:

.. container:: center

    .. table::

        +---------------------+--------------------------------------------------------------+
        | **Processors**:     | 8 |times| Intel\ |reg| Core\ |trade| i7-2960XM CPU @ 2.70GHz |
        +---------------------+--------------------------------------------------------------+
        | **Memory**:         | 31.3 GiB of RAM                                              |
        +---------------------+--------------------------------------------------------------+
        | **OS Type**:        | Linux 64-bit                                                 |
        +---------------------+--------------------------------------------------------------+
        | **Kernel Version**: | 5.4.0-48-lowlatency                                          |
        +---------------------+--------------------------------------------------------------+
        | **Compiler**:       | GNU C Compiler (gcc) version 9.3.0                           |
        +---------------------+--------------------------------------------------------------+

From the ``Bash`` prompt:

.. code-block:: bash

    $ gcc -g -c add.c
    $ objdump -drwC -Mintel add.o

(There are other ways to produce an assembler language / machine language
listing. For example, you can ask GCC to generate it during the initial
compilation. However, the ``objdump`` method produces a more compact, less
verbose output.)

Compilers take source code and produce *object* files. Depending upon both the
source code and the way that the source code is compiled, the object file may
be a non-executable library of functions and subroutines that are *linked* to
one or more main programs, or it may be a complete, executable program.
*Shared object libraries* (``.so``) on Linux and *dynamically linked libraries*
(``.dll``) on Windows are examples of the former. For example, such a library
may contain efficient implementations of math algorithms, or special graphics
functions.

Object files can, and on most systems do, contain more than just the machine
language. They can contain a "data" segment that holds all of the constants
used in a program, as well as information on where in memory to put both the
data and the executable code. On Linux (and other Unix-like systems) object
files are stored in `Executable and Linkable Format`_, more commonly referred
to by its acronym **ELF**.

``objdump`` reverse-engineers the binary object file, attempting to change it
back from unprintable gobbledygook into a human-readable form, though some
information, like the original variable names, can be lost.

By default, ``objdump`` on Linux prints x86 instructions in AT&T syntax. The
option ``-Mintel`` in the command asks it to use Intel syntax instead, so that
the machine language is translated back into Intel `X86 architecture`_
mnemonics.

The output of ``objdump`` is:

.. code-block:: text

    add.o:     file format elf64-x86-64


    Disassembly of section .text:

    0000000000000000 <main>:
       0:   f3 0f 1e fa             endbr64
       4:   55                      push   rbp
       5:   48 89 e5                mov    rbp,rsp
       8:   c6 45 fe 05             mov    BYTE PTR [rbp-0x2],0x5
       c:   c6 45 ff 0a             mov    BYTE PTR [rbp-0x1],0xa
      10:   0f b6 55 fe             movzx  edx,BYTE PTR [rbp-0x2]
      14:   0f b6 45 ff             movzx  eax,BYTE PTR [rbp-0x1]
      18:   01 d0                   add    eax,edx
      1a:   88 45 fe                mov    BYTE PTR [rbp-0x2],al
      1d:   90                      nop
      1e:   5d                      pop    rbp
      1f:   c3                      ret

Using the `X86 architecture`_ entry, the above can be interpreted as:

.. container:: center

    .. table:: Registers used

        +----------+--------------------------------+
        | Mnemonic | Name                           |
        +==========+================================+
        | ``rbp``  | Base Pointer                   |
        +----------+--------------------------------+
        | ``rsp``  | Stack Pointer                  |
        +----------+--------------------------------+
        | ``eax``  | Accumulator (low 32 bits only) |
        +----------+--------------------------------+
        | ``edx``  | Data (low 32 bits only)        |
        +----------+--------------------------------+
        | ``al``   | Accumulator (low 8 bits only)  |
        +----------+--------------------------------+

0. The ``endbr64`` is beyond the scope of this document. The curious are
   referred to the StackOverflow question
   `What does endbr64 instruction actually do?`_ For all intents and purposes,
   it is a ``nop`` in this code and can be safely ignored.

1. Save the caller's *Base Pointer* (``rbp``) by pushing it onto the stack.
   The Base Pointer marks the base of the current function's *stack frame*,
   the region of the stack where that function keeps its local data.

2. Copy the value of the *Stack Pointer* (``rsp``) into the Base Pointer
   (``rbp``). Together with step 1, this is the standard function prologue: it
   sets up a stack frame for ``main``. Local variables such as ``a`` and ``b``
   are stored on the stack and addressed relative to ``rbp``; the separate data
   area of a program holds global and static data instead. No optimization is
   involved here, since the program was compiled without any ``-O`` option.

3. Move the value 5 (``0x5``) to the memory location pointed to by
   ``rbp - 2``. This is the ``char a = 5;`` from the C program.

4. Move the value 10 (``0xa``) to the location pointed to by ``rbp - 1``. This
   corresponds to ``char b = 10;`` in the C program.

5. Load ``a`` into the *Data Register* (``edx``) and ``b`` into the
   *Accumulator* (``eax``). ``movzx`` ("move with zero-extend") copies each
   8-bit value into the low 8 bits of a 32-bit register and clears the
   remaining bits. Even though the original code specifies 8-bit quantities,
   the C language promotes ``char`` operands to ``int`` (32 bits on this
   platform) before doing arithmetic on them. (Suppose, for example, instead
   of adding 10 to 5, the original code raised 5 to the 10th power. Both
   quantities, 5 and 10, would still fit in 8 bits, but the result would not.)

6. Add the contents of the Data Register (``edx``) to the contents of the
   Accumulator (``eax``). This is the addition ``a + b``. The result lives in
   the Accumulator but has not yet been stored in memory, so the assignment
   hasn't been made yet.

7. Move only the low 8 bits of the Accumulator (``al``) to ``rbp - 2``, which
   is where ``a`` was stored in step 3. Now the assignment has been made
   (``a = ...``). Together with the previous step, ``a = a + b;`` has now been
   completed.

8. The ``nop`` ("no operation") does nothing. GCC often emits one at the end
   of a function that returns no value when compiling without optimization, as
   was done here. It is probably not there for alignment: x86-64 instructions
   vary in length and do not need to start on even-numbered addresses. On
   other architectures, however, careful byte-alignment of code is often more
   efficient, and sometimes absolutely necessary.

9. The ``pop`` restores ``rbp`` to the value that was saved in step 1, so the
   caller finds its own stack frame intact.

10. ``ret``. When you start a program by typing its name at the Bash prompt,
    you are, in effect, calling the program as you would a subroutine. At this
    point in the code, the ``ret`` returns control to whatever called
    ``main``. Strictly speaking, that is the C runtime startup code, which
    then ends the process and hands control back to Bash (a constantly running
    program which waits to do your bidding and dispatches tasks that it cannot
    handle to other programs).

A much more thorough coverage can be found in the
`official Intel reference documents`_. However, ten volumes is a bit much to
tackle.
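
Steps 5 through 7 can also be observed from C itself. The following is a
minimal sketch, not part of ``add.c``: the helper names ``promoted_sum`` and
``low_byte`` are hypothetical, and ``unsigned char`` is used instead of plain
``char`` (an assumption made here so that discarding the upper bits is fully
defined by the language). It shows that the sum is computed as an ``int`` and
that only the low 8 bits survive when the result is stored back into a byte.

.. code-block:: C

    #include <stdio.h>

    /* Hypothetical helper (not part of add.c): the sum as C computes it.
       Both operands are promoted to int before the addition (steps 5, 6). */
    static int promoted_sum(unsigned char a, unsigned char b)
    {
        return a + b;
    }

    /* Hypothetical helper: keep only the low 8 bits of a value, which is
       what the store from ``al`` does in step 7. */
    static unsigned char low_byte(int value)
    {
        return (unsigned char)value;
    }

    int main(void)
    {
        unsigned char a = 5, b = 10;

        /* sizeof (a + b) is the size of the promoted type, not of a char */
        printf("sizeof a = %zu, sizeof (a + b) = %zu\n",
               sizeof a, sizeof(a + b));

        printf("5 + 10:    promoted sum %d, stored byte %d\n",
               promoted_sum(a, b), low_byte(promoted_sum(a, b)));

        /* 300 does not fit in 8 bits, so the stored byte differs */
        printf("200 + 100: promoted sum %d, stored byte %d\n",
               promoted_sum(200, 100), low_byte(promoted_sum(200, 100)));

        return 0;
    }

On the system described above, where an ``int`` is 32 bits, the first line
should report sizes of 1 and 4. In the ``200 + 100`` case the promoted sum is
300, but the stored byte is 44 (300 modulo 256), because the upper bits are
discarded in the same way that step 7 stores only ``al``.
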
There are other ways to produce assembly language / machine language listings,
but the above was the least wordy method I could find.

The two most popular assemblers on Linux are the `GNU Assembler`_ (``gas``
a.k.a. ``as``) and the `Netwide Assembler`_ (``nasm``). Both use many of the
same mnemonics, and produce the same machine code, but offer variations in
syntax. For example, ``gas`` uses ``#`` as the comment delimiter, while
``nasm`` uses ``;``. By default, ``gas`` also expects AT&T syntax, which
reverses the operand order relative to the Intel syntax used by ``nasm``.

``gas`` is tightly woven into the fabric of Linux: it is the assembler used by
the `GNU Compiler Collection`_ (``gcc``, née the GNU C Compiler), which
includes compilers for ``C``, ``FORTRAN``, ``C++``, ``Go``, and, in older
releases, ``Java``, among others.

On Debian-like systems, ``gas`` is part of the ``binutils`` package. I would
also suggest installing the ``gcc`` package if you do not already have it.
``nasm`` is its own package.

You can assemble and run a "Hello World" in the dialect of your choice (after
ensuring the appropriate packages are installed) by following the comments in
the two examples below:

.. container:: center

    .. table:: "Hello World" examples
        :header-alignment: center center
        :column-alignment: center left
        :column-dividers: single single single

        +-----------+----------------------------------+
        | Assembler | Source code                      |
        +-----------+----------------------------------+
        | ``nasm``  | `hello.asm <_static/hello.asm>`_ |
        +-----------+----------------------------------+
        | ``gas``   | `hello.s <_static/hello.s>`_     |
        +-----------+----------------------------------+

See also:

* `Resources`_
* the `Comparison of assemblers`_

----

.. _X86 architecture: https://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture
.. _Executable and Linkable Format: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
.. _official Intel reference documents: https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html
.. _comparison of assemblers: https://en.wikipedia.org/wiki/Comparison_of_assemblers
.. _what does endbr64 instruction actually do?: https://stackoverflow.com/a/56910435/447830
.. _GNU Assembler: https://sourceware.org/binutils/docs-2.35/as/
.. _Netwide Assembler: https://www.nasm.us/docs.php
.. _GNU Compiler Collection: https://en.wikipedia.org/wiki/GNU_Compiler_Collection
.. _Resources: resources.html